Your city sucks! (And so does mine)

It seems the latest craze amongst entrepreneurs and, in particular, tech “hubs” is to pull out the ruler and compare penises. As someone who’s slept with three of those technology hubs I’m going to tell you that each penis has its own merits and its own disadvantages. Allow me to explain.

For the record I’ve lived in three legitimate technology hubs: Seattle, San Francisco, and Boulder. Additionally, I’ve spent a considerable amount of time in Portland. Each one of them had its benefits and detractions. Ultimately, I’m settling down in San Francisco (after moving to it, away from it, and subsequently back 3 times now). Why? It’s simple: despite all of the bullshit that is involved in this incestuous, crowded, echo chamber of a dirty ass city, it is Mecca for nerds. Period. All of the pieces of the proverbial startup pie are here: money, history, talent, schools, partners, clients, press, etc. I don’t have to wait for any part of that ecosystem to grow or blossom. It’s already here and I’m far too lazy to grow or foster any part of those for an entire city.

And guess what? The real Mecca hasn’t moved and neither will the technology Mecca. That does not mean that you have to go to San Francisco to be a real geek or to have a real startup. Anyone who tells you differently is lying to you. Please read that again, they are lying to you. Just ask Michael Dell, Jeff Bezos, and Bill Gates. I hear Dennis Crowley is pretty happy in New York City and Steven Frank is happy in Portland, as well.

So what do I think of each of these hubs?

Seattle

I lived in Seattle for 3.5 years and dated a girl in Portland for some of that so I’ve spent a considerable amount of time in both cities. I love the Pacific Northwest. The Pacific Northwest is, hands down, the best kept secret in North America. Seattle has a pretty rad technology scene with Microsoft, Amazon, Expedia, Boeing, and Lockheed all calling the area home in one way or another. I’d say, outside of San Francisco, it has the most mature startup/technology scene. New York City might have a legitimate claim to #2 here, but I honestly think Seattle gets left off the map too often in these discussions. It is, as the kids say, legit.

Seattle is seriously where my soul resides. It’s just a fantastic city. If you’re considering Seattle, you should definitely check out Portland as well. Particularly if you like your cities a bit smaller, more bike friendly, and laid back.

Pros

  • I cannot stress how cool it is to have a 14,000 foot mountain be an integral part of your skyline.
  • Mild climate where it snows rarely.
  • Close to some of the best skiing in the world. Crystal Mountain, Stevens Pass, and Snoqualmie are all within an hour. Whistler is about 5 hours away and Baker is 3 hours away.
  • Near a big body of water.
  • Lots of established technology companies.
  • You will likely never want for things to do outdoors.
  • You will never want for a tasty beer. Nor a tasty wine. I’d say Portland, Seattle, and San Francisco are all equally, per capita, wine crazy with the local vineyards and wineries to back it up.
  • Centrally located between two other fantastic cities: Vancouver, BC and Portland, OR.
  • University of Washington is right in the city and has a good CS department from which to poach.
  • Cost of living is relatively low. A quick look at Craigslist shows 2 bedroom apartments in the city averaging $1700 a month, which is about 30-40% cheaper than San Francisco.
  • This city basically invented the grunge/alternative rock scene. You will never want for awesome shows to watch.

Cons

  • I’m not going to lie; it’s gray, in the 40’s and 50’s, and drizzling 4-6 months a year. When I lived there we broke the record by having 38 days straight of rain. The irony here is that Chicago, Boston, and Atlanta all get more precipitation per year.
  • I’m not going to lie; 6 months out of the year it’s sunny and 70. If you like mild weather with no snow and can put up with gloom in the winter, you’ll be just fine here.
  • The food scene is pretty poor. If you’re used to San Francisco, Portland, or New York City you’re going to be pretty bummed out here.
  • The technology scene is largely driven by Microsoft. I hope you’ve sharpened your .NET skills.
  • Public transportation has been pretty crappy for years. It’s getting better, but don’t expect it to be on par with San Francisco, Portland, or New York City anytime soon.

Portland

Portland is about 2/3 the size of Seattle and doesn’t have any, that I know of, anchors in the technology scene. My other concerns about Portland, in particular, are that I don’t know what the investing side looks like and that there are no remotely top CS schools nearby from which to poach young talent. Seattle, on the other hand, has University of Washington right downtown and has a pretty well regarded CS program (compliments of Microsoft).

All this being said, Portland is a fucking fantastic city with a lot of great benefits. I think if you’re considering Seattle you have to consider Portland.

Pros

  • This is a lovely 580,000’ish person city which is exceptionally bike friendly. You can bike across the main city center in 15 minutes, tops, which is pretty great.
  • The food scene here is, per capita, top notch.
  • BEER! BEER! BEER! Oh man, with McMenamins, Deschutes, Full Sail, etc. all calling this place home you’ll basically never run out of new brews to taste.
  • Not only is it less than an hour from the Goonies rock and the Pacific Ocean, but a couple of giant rivers flow through it.
  • Portland didn’t want to be left out of having a mountain be in its skyline either. Mt. Hood is much closer to the city than Mt. Rainier so, despite being about 3,000 feet shorter, it is much more imposing on the skyline.
  • All of the things I said about the great outdoors in Seattle apply here.
  • Same goes for skiing. If you like boarding and skiing, you’ll find plenty of friends in Portland.
  • Great public transportation.
  • Rent on a two bedroom apartment will come with about a 20-30% discount over Seattle as well.
  • Also has a great music scene.

Cons

  • Everything I just said about Seattle’s weather can be equally applied here.
  • Portland has a higher chance of snow in the winter. It’s not surrounded on two sides by mountains and doesn’t have the same water barrier that Seattle has. As a result, Seattle gets snow once every 3-4 years and Portland gets a mild flurry or two every year or two.
  • No decent CS departments to poach talent from.
  • No real anchor technology companies feeding the ecosystem and I don’t believe there are any big players in the investment scene there.

Boulder

I moved to Boulder to start SimpleGeo. Mainly because I could move and Matt couldn’t. There’s really no better way to describe Boulder than as a laid back college town nestled in the foothills of the Rocky Mountains. This is not a real city by any means, but does have some of the big city amenities thanks to all the rich white people that live there.

Pros

  • The technology community there is growing like a weed. While I can’t really point to a single anchor company, there is a pretty impressive investment ecosystem with Foundry Group and TechStars there.
  • If you like hiking, biking, climbing, kayaking, etc., then you’re gonna love Boulder. The great outdoors isn’t just close by to Boulder, it’s literally at your doorstep. I very much loved opening my back patio door to hear the Boulder Creek rushing by in the spring.
  • Speaking of biking, I can’t imagine any other city of 90,000 people in the US being more bike friendly than Boulder. My leisurely bike rides from my apartment along the Boulder Creek bike path were just stunning.
  • Despite being such a small town there’s some pretty spectacular food.
  • If you do NOT like skiing or snowboarding you probably shouldn’t even consider Boulder. Seriously. Vail, Breckenridge, and Beaver Creek are all within 3 hours.
  • University of Colorado is a legitimate CS school. Tons of eager talent lurking around. The drawback is many of them, such as Dave Morin, get lured to other technology hubs.

Cons

  • Due to zoning weirdness, rent and real estate in Boulder are actually pretty expensive for a city its size. I’d make a wager that it’s as expensive as Seattle, if not more so. Particularly when it comes to purchasing. Rent is about on par with Portland and Seattle.
  • I really yearned for hardcore technology while I was in Boulder. A lot of the technology scene was very much focused on creating consumer applications. I’m a big infrastructure, big data, big scale kind of guy and often felt out of place in that scene.
  • It’s fucking cold. Don’t believe one ounce of that bullshit Brad Feld or Andrew Hyde tell you about mild winters and 300 days of sunshine a year. The sun doesn’t help one bit when it’s -10 and there’s two feet of snow on the ground.
  • It’s tiny. 90,000 people does not make a real city.
  • It’s predominantly filled with rich white people. The kinds of people who have $1m condos in Boulder as their second or third homes. If you like your home to be a melting pot, Boulder is not for you.
  • Music scene is kind of bunk. Good shows swing through Denver on occasion, but Denver isn’t exactly easy to get to from Boulder.

San Francisco

All told I’ve spent almost four years living in San Francisco. I’ve moved here three times (2000, 2007, & 2010) and have finally given up on attempting further departures. What I find so amazing about San Francisco is that I simultaneously love it to death and want to strangle it at the same time.

Besides all of this technology talk, the reason I love San Francisco so much comes down to two things: the weather and how outright fucking weird this place is.

Pros

  • The weather is dramatically better than Boulder, Seattle, or Portland. No hints of snow and, outside the actual city, it’s basically 60-70 and sunny year round. During the winter months, and near the water, San Francisco has a tendency to be on the cool side and rainy at times. All-in-all, though, the weather is pretty great. Of course, it’s a hell of a lot better in LA as Chris Lea constantly reminds me.
  • Basically every single major venture capitalist in technology in the US is either headquartered here or has major operations out here. The place is literally awash in capital. The flip side of this is that almost any idiot can get his stupid idea funded.
  • Two of the most prestigious universities in the world, UC Berkeley and Stanford, are within 30 minute drives of the city.
  • San Francisco, and the valley, are home to basically every major consumer internet company in the world. The area has been completely overrun by nerds for the most part.
  • The art scene in San Francisco is amazing. It runs the gamut of photography, installation art, the Burning Man scene, Maker scene, etc.
  • Possibly the second or third most famous wine country, outside of France and Italy, in the world.
  • There are sandy beaches on the Pacific Ocean within the city limits and easily within biking distance.
  • The food scene here is fantastic. Outside of LA and San Diego, I’ve never found a city with burritos like this.
  • Pretty fantastic music scene.
  • It’s a major hub for many international carriers. If you want to get to Asia quickly and cheaply, it’s a great option. So is, of course, LAX.
  • Pretty fantastic, yet dysfunctional, public transportation.
  • San Francisco is filled with some of the most wonderfully weird, eccentric, accepting, creative, intelligent people I’ve ever met in the world. There is no other city I’ve visited that has such a high per capita of weird. I love this.
  • I’m not sure if you’ll find a better place to get a tattoo. So many fantastic artists to choose from. I mean, fucking Ed Hardy helms the scene here and many of his proteges are following in his footsteps.
  • Have you driven down Highway 1? Have you seen the Redwoods? Stunning.
  • You can surf inside the city limits.

Cons

  • This is, hands down, the dirtiest fucking city I’ve lived in. Nobody else is even close. You can walk down the sidewalks of the nicest neighborhoods in the city, the ones with million dollar mansions, and there will be human feces on the sidewalk. The flip side to this is that they’re creative poops, such as the one my girlfriend recently spotted inside of a banana peel.
  • It’s an excruciatingly expensive city. Decent studios start at $1500 a month. Two bedroom apartments, which you’ll share with a roommate, will likely start in the $2800 to $3000 a month range.
  • San Francisco, being the epicenter of technology, is just as equally filled with snake oil salesmen and wannabes. In fact, there’s a term for the wannabes: wantrepreneurs.
  • This is an echo chamber. There is no counter balance to what you’re working on really. Everyone thinks everything is fucking amazing or a pile of shit because Arrington or Kevin Rose said so. This leads to a lot of self-masturbation.
  • Seattle, Boulder, and Portland make San Francisco’s beer scene look paltry at best.
  • I honestly can’t describe how terrible the traffic is here. 280, 80, 101, 880, 580, HWY 1, and 680 are all four lane highways and all are completely gridlocked during rush hour. It’s fucking terrible.
  • You literally can’t escape technology. There are fucking Farmville signs along the major highways. I find it extremely difficult to get away from the tech scene without physically leaving the city.
  • Despite what some people might tell you, this is not a very bike friendly city. The cabs and MUNI drivers have been in a decades long competition to see who can kill and/or maim the most bikers.
  • It’s an absolute nightmare to get to anything that resembles the mountains you find in Boulder, Seattle, or Portland. During peak season it’s not unheard of to spend 5 hours in the car getting to Tahoe.
  • The city government is terrible on so many levels.

Conclusion

I very much enjoyed my time in the Pacific Northwest and would recommend checking out both Portland and Seattle. I’m slightly biased towards Seattle because I prefer bigger, denser cities. I didn’t like Boulder at all due to the cold climate and small size of the city.

As a result, I’m sticking with San Francisco, despite poop filled bananas, because it’s a big, dense city filled with a bunch of weirdos who love building great technology.

HOWTO: Travel more intelligently

This year I’ve taken 23 trips, travelled just shy of 106,000 miles, visited 28 cities, 4 continents, and 9 countries. Tomorrow I’m heading to London for my 29th city and my 10th country. It will also put me within striking distance of United 1k status. Over the course of such a travel schedule one tends to learn the tricks of the travel trade. I thought I’d share a few of mine.

  • Remember to download enough TV, movie, music, and game content before you embark on your journey. Network connections on the road, particularly in hotels and in foreign countries, can be extremely poor. I have more music than I know what to do with and tend to download enough video content to cover 150% of my travel time as I tend to wind down at night with a little TV. In order to take a gargantuan amount of rich media with me, I’ve purchased a 1TB Western Digital Passport, which I used to store my iTunes and iPhoto libraries.
  • Buy an iPad or iPhone or Android or some other device that will play music and video. I’d recommend one that allows you to install lots of cheap casual games as well. Both the iPad and iPhone 4 have 10 hours of video playback, which should comfortably get you through many long flights.
  • That being said, the longest flight I’ve ever flown was from Chicago to Hong Kong and took 15 hours. My iPad or iPhone 4 would have run out of battery 2/3 of the way through. To rectify this situation I’ve purchased the Zagg Sparq 2.0, which adds between 5 and 6 hours of battery life to your iPad. Additionally, I ensure my laptop is fully charged so I can use that as a glorified battery charger in the air.
  • Ever try and find an outlet at an airport only to find them all taken by people charging their laptops and their cell phones? Yeah, me too. Go and buy a Belkin mini 3 port power strip and don’t worry about that problem anymore. In addition to turning a single outlet into three, it has two USB 2.0 powered plugs. Beware though! The USB plugs will not charge an iPad. Also great for conferences.
  • If you’re traveling with a friend or your significant other, you might consider buying a headphone splitter. When Diana and I travel, she can plug her headphones into this and we can both comfortably watch my iPad.
  • Speaking of headphones, do yourself a favor and buy some nice over-the-ear, noise canceling ones. I have the classic Bose around-the-ear headphones, which are comfortable and block out the jet engines. Even without any sound they do a great job of blocking out the sound of crying babies during takeoff and landing.
  • Get yourself a 22” carry-on. It’s not really well known, but domestic carry-on limitations are usually more liberal than international ones. 22” is the international standard.
  • You don’t need to pack 14 distinct outfits for a two week vacation. I promise that they have washers and dryers wherever you are going that operate, more or less, how the ones at your home do. I don’t think I’ve been to a hotel yet that didn’t offer wash and fold service either.
  • When you’re picking your seat find a seat that’s as close to the front of the plane as possible. This allows for a quicker exit when deboarding and usually keeps you away from toilet seats. Be sure to check out SeatGuru before you choose your seat.
  • Avoid any line containing one of the following: families with children, elderly people, someone in a wheelchair. Instead, look for people with roller laptop bags in suits; they’re the ones who will navigate lines the most quickly.
  • Stand on the right and walk on the left.

The last note I’ll mention is to find a decent worldwide carrier and stick with them until you hit status. Things get a whole lot better once you hit status on a single airline. I love United, which is #1 among the top carriers in on-time departures. I’ve heard others who enjoy Delta.

“Travel is fatal to prejudice, bigotry, and narrow-mindedness.” –Mark Twain

Why you should probably have an LLC for side projects

I recently posted an illustration I was getting made for a holding company I’m creating for my side projects. One of the comments asked me what the benefits were for creating a holding company for side projects. It’s a great question and I thought would make a good post for other nerds to think about.

I’m operating under two assumptions:

  1. You, the creator of said side projects, have assets you wish to protect. Houses, laptops, cars, bank accounts, other side projects, etc.
  2. You have side projects which are actually used by other people. A lot of people have side projects that only they use or a small group of friends engage with. For instance, my tweetimag.es project services over 10m Twitter avatars in a single day.

When I was in business school, one thing was hammered into my head: limit your “exposure.” Exposure is a business term that basically encompasses the things you do that could get you sued. Unfortunately, the US is a litigious country and any idiot can sue you for anything. Getting sued costs a lot of money, even if you’re totally innocent.

In the US, and I assume other countries, we have LLCs. LLC stands for Limited Liability Company. It’s what business people call a “passthrough” entity. This means that you don’t get double taxed like you do with corporations. In other words, it’s like a fake version of yourself as far as the IRS is concerned. It can also own property and enter into contracts on its own.

That last tidbit is what you should be most concerned about. If you create a side project that gets used by lots of people, you are at a risk for getting sued directly. If your LLC owned said side project, you are not personally at risk of anything. The LLC could get sued and all associated assets could be taken, but your car, house, etc. should be fine … unless they’re owned by the LLC.

And this, my friends, is why I’m creating an LLC for my side projects. I have assets, outside of tshirts featuring the word fuck, that I’d like to protect. By creating an LLC for my side projects, having the LLC engage consultants on its own, and having the LLC own all of the IP for said projects, I’m greatly limiting my personal exposure.

NOTE: I am not a lawyer or a tax advisor. I think having an LLC is a good thing for me personally, but it could be a really dumb idea for you. Everyone’s mileage may vary.

UPDATE: Marco Tabini has a great followup noting that an LLC is not necessarily a silver bullet. A must read if you found this post interesting.

Why I’ll never own another server

When Matt and I started SimpleGeo I made a decision early on to use Amazon’s AWS services to run our infrastructure. A lot of people basically think I’m nuts for a lot of reasons for this, but I generally get two major questions/concerns when I mention that we run on AWS/EC2.

  1. AWS is slow!
  2. AWS is expensive!

I’ve covered IO performance on EC2 in-depth before and have compared the IO benchmarks, favorably, against numbers from Digg and Media Temple’s systems engineers. The notion that AWS is too slow for your application is, largely, not supported by the numbers and comparisons. The second point I often make with regards to performance on AWS is that Amazon uses this to run large portions of their own infrastructure. Trust me, if it’s good enough for the largest online retailer in the world, it’s good enough for you.

The second point is a bit harder to defend sometimes. Amazon’s AWS can be cheaper than running your own hardware and vice versa. If you run huge amounts of servers AWS can be a few hundred thousand dollars more by comparison on raw numbers that compare cost of your own hardware to cost of AWS. The problem with this vanilla comparison is it forgets one extremely important cost for startups – opportunity cost.

I have a few rhetorical questions as to why people are not using AWS.

  • How many people does it take to maintain your own DC? People have to wrangle hardware, travel around to various DCs, RMA hardware, etc. If they weren’t doing those things, or you didn’t need those people, what could you be doing with those resources if they weren’t wiring your DC?
  • How much time, money, effort, and overhead is it going to take to create multiple data centers? Have you negotiated bandwidth contracts before? Do they have power from multiple providers? Do they have power and bandwidth failover? Amazon has amazing economies of scale and has spent thousands of man hours (years?) preparing for power/bandwidth failover, floods/natural disasters, etc.
  • Managing multiple data centers requires a small army of highly trained network operations people. Have you built DC failover before? Have you implemented load balancing across multiple DCs? It took me about 30 minutes to set up an Elastic Load Balancer that spread traffic across three Availability Zones (Amazon’s term for DCs).
  • Have you thought about building your own automation and self-service APIs for the DC you want to build? Fabric/Chef/Puppet/Capistrano combined with AWS’s automation API is an extremely potent combination for automating large clusters. For instance, we use Fabric and Boto to automate the creation of all nodes in our cluster. I can run a command in Fabric that creates an API server out of thin air, bootstraps it, and puts it into our ELB. This takes about five or so minutes.
  • Have you ever set up a DC in Europe? What about Asia? Would you even know where to start? I can spin up a server in Europe in a matter of seconds. How much might you spend on flying your network operations folks to and fro all of these DCs you plan on building?
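To make the "server out of thin air" step a bit more concrete, here is a minimal sketch of what such a command might do. It is not our actual Fabric task. It assumes boto3-style clients (the newer library, not the original Boto), and the function name and all of its parameters are hypothetical. The clients are passed in so the logic can be exercised without touching AWS.

```python
def launch_api_server(ec2, elb, ami_id, instance_type, load_balancer_name):
    """Launch one instance, wait for it to run, and register it with an ELB.

    `ec2` and `elb` are assumed to behave like boto3's EC2 and classic ELB
    clients, e.g. boto3.client("ec2") and boto3.client("elb").
    """
    # Ask EC2 for exactly one new instance of the given image and size.
    reservation = ec2.run_instances(
        ImageId=ami_id,
        InstanceType=instance_type,
        MinCount=1,
        MaxCount=1,
    )
    instance_id = reservation["Instances"][0]["InstanceId"]

    # Block until EC2 reports the instance as running.
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])

    # Bootstrapping (installing packages, deploying code) would happen here,
    # for example via Fabric or Chef; it is left out of this sketch.

    # Put the new node behind the load balancer.
    elb.register_instances_with_load_balancer(
        LoadBalancerName=load_balancer_name,
        Instances=[{"InstanceId": instance_id}],
    )
    return instance_id
```

With real credentials you would call it with `boto3.client("ec2")` and `boto3.client("elb")`; wrapping that call in a Fabric task is what turns it into a one-line command.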

These are just a few of the nooks and crannies, which I think are extremely important, that people often forget when comparing AWS against running their own data centers. The two biggest costs, in my opinion, that people forget are opportunity cost and cost of creating automation systems.
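If you want to see why the raw hardware comparison misleads, a rough sketch like the one below helps. It treats the cost of the people tied up running infrastructure as a crude stand-in for opportunity cost, which is an assumption on my part. Every number in it is a made-up placeholder, not a real quote or a SimpleGeo figure, and the function names are hypothetical.

```python
def total_monthly_cost(infrastructure, staff_count, monthly_cost_per_person):
    """Infrastructure spend plus the people needed to keep it running."""
    return infrastructure + staff_count * monthly_cost_per_person


def cost_gap(own_dc, cloud):
    """Positive result: the cloud option is cheaper once staff is counted."""
    return total_monthly_cost(**own_dc) - total_monthly_cost(**cloud)


# Placeholder inputs, invented purely to exercise the functions.
own_dc = {"infrastructure": 8_000, "staff_count": 2, "monthly_cost_per_person": 10_000}
cloud = {"infrastructure": 10_000, "staff_count": 0, "monthly_cost_per_person": 10_000}

gap = cost_gap(own_dc, cloud)
print(f"Monthly gap in favor of the cloud option: ${gap:,}")
```

In this toy case the owned hardware is cheaper on paper, yet the total flips as soon as the people wrangling it are counted. Plug in your own numbers; the point is only that staff belongs in the comparison.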

To expound a bit on opportunity cost, I’d like to quote the ever-thoughtful Joi Ito.

“If you want to increase the pace of innovation, you need to lower the cost of failure.” — Joi Ito

I can fire up an entire DC for SimpleGeo with a 20-30 node cluster with a few commands, totally automated, run load/consumption/system tests against it, find flaws in my system, and iterate in a matter of hours at a cost of a few hundred dollars.

The simple fact is, SimpleGeo wouldn’t be anywhere near as robust as it is, and indeed might not even exist, without leveraging the cloud.

HOWTO: Maintain a Rock Star Culture

In a recent post I described a few bullet points on how I’ve managed to go about recruiting great engineers. That’s all well and good, but what happens when you manage to land a few of these people? A common retort to my recruiting efforts at SimpleGeo has been, “How are you going to get them to work together without killing each other?” It’s a common misperception that all top engineers are also prima donnas who are impossible to work with. I’d argue that the truly talented engineers are far from prima donnas.

So here is a small list of things you’re going to have to prepare yourself to do if you want to hire, encourage, and foster a team of top engineers.

  • Don’t skimp on making them comfortable. You’re paying this person over $100,000 a year so why cheap out on their hardware, desk, chair, snacks, etc.? What we do at SimpleGeo is we tell them they have about a $3000 budget to purchase an Apple MacBook Pro of their choice and whatever monitor setup they desire. We make sure there’s plenty of water in the water cooler, soda in the fridge, nice chairs, whiteboards, etc. in the office as well.
  • You need to be prepared to fire a great engineer. There are going to be great engineers on your team who, for one reason or another, are not going to work out. They’re causing more harm than good – act quickly and decisively when you identify this. Yes, I just told you to fire great engineers.
  • Do not allow language and other technology zealotry to take hold amongst your engineers. Truly great engineers will run benchmarks, be pragmatic in their technology choices, and abide by bake-offs. I hate Java with a passion, but if it will pull five times more jobs a second off of RabbitMQ than Python then we’re using Java for our queue processors.
  • Allow ample time for proper development and, if that’s not possible due to arbitrary deadlines, allow developers to circle back after the fact to clean up. Great developers loathe writing shitty code; at least give them the opportunity to revisit quickly/poorly written code. They’ve got a reputation to uphold you know!
  • Institute and promote proper development cycles and practices. Great engineers will naturally do this on their own and, when they do, get out of their way or figure out how to help with the process.
  • Hire or promote a great engineer to manage them. Great engineers, for lots of reasons, tend not to trust non-engineers to ensure their voices are heard within the company. If you think about it, engineering is one of the only disciplines where people who were not trained in the vocation are allowed to manage a trained team. You wouldn’t put a marketing person in charge of accounting would you?
  • This is a personal preference of mine, but I don’t think you should hire or have any dedicated product/project managers.
  • Your engineers should, for the most part, drive new feature development. You’ve just hired a small army of the best technologists on the planet. Nobody in your company knows technology trends better. This is where it becomes crucial that you’re adequately communicating your vision for the product and company. If your engineers buy into that vision they’ll dream up and program things you never thought possible. If they don’t they’ll tell you they don’t and why; listen to them carefully as your business’s success might depend on it.
  • Within reason, give them flexibility in where, when, and how they work. It’s true that face-to-face time and pair programming lead to better code, but sometimes an engineer just wants to walk to his favorite cafe and work with IM turned off. Let them.
  • The phrase is “Work hard. Play hard.” not “Work hard and then work hard some more.” This applies generally to the entire team, but especially so with highly cognitive vocations such as accounting, engineering, etc. Give your team ample vacation time and encourage them to use it. I’ve gone so far as to enforce reasonable hours at the office. Your engineers simply aren’t producing great code if you’re forcing them to work 12 hours a day, 7 days a week. Some engineers will resist this; if they do, read Alex Payne’s great post on not being a hero.
  • Promote a culture amongst engineers where their first response to a technical question is, “I don’t know.” A good way to tell a good engineer from a great engineer after you’ve hired them is a willingness to quickly admit when they don’t know something.

The above is just a small list of things I’m actively doing in an attempt to maintain a team solely built of great engineers that are producing great products at a very rapid rate. The usual caveat being that your mileage may vary when you attempt to do this yourself.

NoSQL vs. RDBMS: Let the flames begin!

I’ve been getting solidly flamed recently, as have my former coworkers at Digg, my friends at Twitter, etc. about our adoption and promotion of various NoSQL storage systems. It seems that some DBAs are very, very upset that us internet kids are considering abandoning SQL’s ship. I’m not here to throw out a bunch of insane numbers, benchmarks, or flame back, but I did want to point out why SimpleGeo and others are jumping onto the NoSQL bandwagon.

First, and foremost, I haven’t heard of anyone saying MySQL or PostgreSQL on comparable hardware is faster than NoSQL options. The best I’ve heard is that MS SQL setups on SSD drives with lots of RAM could do 6,100 result sets a second. I guess, based on these posts, I’d like to ask a few questions to the people who honestly think RDBMSs can compete with NoSQL solutions at large scale.

  • Do you honestly think that the PhDs at Google, Amazon, Twitter, Digg, and Facebook created Cassandra, BigTable, Dynamo, etc. when they could have just used a RDBMS instead?
  • Has anyone run RDBMS benchmarks with highly heterogeneous datasets with lots of varying indexes on them? At Digg we had probably a hundred or so tables, each table had varying indexes (a char here, an integer there, a date+time here). Disk IO becomes a serious problem when indexes for different tables are stored on different parts of disks and you have concurrent reads/writes. I know that people have found ways around this, such as 37Signals’ systems guy putting 15 x 15k RPM drives on his DB server. Assuming $500 a disk (15k disks range from $300 to $800 on Newegg) that’s $7,500 just for disks.
  • Anyone out there running an EC2 large instance with a RDBMS on it that’s doing 1,800 reads/second? I’ve got a Cassandra node that was getting hammered with a load of 6 serving that much traffic without falling over, which I think is pretty decent when you consider each node could easily do that and adding more nodes to handle more load is trivial.
  • How much are you spending on those MS SQL servers with SSD drives that serve up 6,100 results a second? MS SQL is $5,999 per processor. Windows Server 2008 is another $1029. Decent 128GB SSDs appear to cost around $450 each. You see where I’m going with this. Nobody is arguing you can’t get RDBMSs to scale up to a few thousand reads/writes a second if you can afford to spend $50,000 or $100,000 per server. The problem is that very few startups can spend that much money on a single server.
  • How much time are your DBAs spending administering your RDBMSs? How much time are they in the data centers? How much do those data centers cost? How much do DBAs cost a year? Let’s say you have 10 monster DB servers and 1 DBA; you’re looking at about $500,000 in database costs.
  • How easy is it to add a new server to your cluster? If we identify a hot spot in our Cassandra cluster, we can have a new node bootstrapped into our cluster in about five minutes. And I mean it’s in production taking writes and serving reads.
  • Does your RDBMS automatically rebalance the entire cluster when a new node is bootstrapped into it?
  • I’m running a 50 node cluster, which spans three data centers, on Amazon’s EC2 service for about $10,000 a month. Furthermore, this is an operational expense as opposed to a capital expense, which is a bit nicer on the books. In order to scale an RDBMS to 6,000 reads/second I’d need to spend roughly what five months of running my 50 node cluster costs.
  • Has anyone run benchmarks with MySQL or PostgreSQL in an environment that sees 35,000 requests a second? IO contention becomes a huge issue when your stack needs to serve that many requests simultaneously. I know of one company that’s managing to scale portions of their PostgreSQL servers by purchasing $250,000 servers. One of those would cover my 50 node EC2 cluster for two years.
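
To make the arithmetic in the list above concrete, here’s a quick back-of-the-envelope sketch in Python. The dollar figures are the ones quoted above; the variable and function names are made up for illustration, and the per-node cost is simply the monthly bill divided evenly across 50 nodes, which is an assumption rather than a quoted price.

```python
# Figures quoted in the list above; everything else is derived from them.
CLUSTER_NODES = 50
CLUSTER_COST_PER_MONTH = 10_000   # USD for the whole 50 node EC2 cluster
DISK_COUNT, DISK_PRICE = 15, 500  # 15 x 15k RPM drives at roughly $500 each
LOW_END_RDBMS_SERVER = 50_000     # USD, low end of the per-server range
BIG_POSTGRES_SERVER = 250_000     # USD, the $250,000 PostgreSQL server


def months_of_cluster(capital_cost, monthly_cost=CLUSTER_COST_PER_MONTH):
    """How many months of the EC2 cluster a one-off purchase would pay for."""
    return capital_cost / monthly_cost


disk_bill = DISK_COUNT * DISK_PRICE
# Assumption: the monthly bill is spread evenly across all nodes.
cost_per_node = CLUSTER_COST_PER_MONTH / CLUSTER_NODES

print(f"disks alone: ${disk_bill:,}")
print(f"cost per node per month: ${cost_per_node:,.0f}")
print(f"$50k server = {months_of_cluster(LOW_END_RDBMS_SERVER):.0f} months of cluster")
print(f"$250k server = {months_of_cluster(BIG_POSTGRES_SERVER):.0f} months of cluster")
```

Swap in your own hardware quotes and monthly bill to see where the break-even point lands for your setup.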

I guess what I’m saying is that my decision to use NoSQL, and I’m guessing others’ decisions to do so, has less to do with the fact that we can’t squeeze a few thousand writes a second out of MySQL and more to do with management and cost overhead. NoSQL solutions allow us to serve absurd amounts of data for a really, really low price. I’m happy to put my $/write, $/read, and $/GB numbers for my NoSQL setup against anyone’s RDBMS numbers.

We’re not nearly as dumb as everyone thinks we are; I promise.

HOWTO: Recruit Rock Stars

Yesterday, SimpleGeo announced the hiring of five new employees. Four will be engineers and one is an engineer moonlighting as a developer advocate for us. The feedback I’ve received about the team we’ve managed to put together in such a short period of time usually involves two statements:

  1. How on earth did you manage to build a team like this in such a short period of time?
  2. Will you please leave some engineers for the rest of us?

The answer to #2 is easy: no. The answer to #1 is somewhat funny: I’m a better recruiter than I am a coder or architect. No, it’s true. Ask Jay if he was more sorry to see his lead architect or his top recruiter leave Digg. My guess is he’s more upset his top recruiter left (I recruited about 40% of engineering and over 10% of the company by the time I left). The simple fact is that I would have made more money as a recruiter for Digg than as their lead architect.

But, how do I do it? Good question. I honestly don’t know, but I am going to share some insights that might help you land your own rock star. Here’s an arbitrarily ranked list of dos and don’ts for finding, recruiting, and hiring great engineers.

  • Make sure your ego is in check. My only rule in hiring is to hire people that are smarter than me. Top engineers like this will be smarter than you and most likely command a higher salary than you. You need to be fully prepared to pay handsomely for someone who will likely make you look and feel like a fool on a daily basis.
  • If you take nothing else away from this post, please remember this: amazing engineers are not perusing job boards for their next job. Do you honestly think Alex Payne or Ian Eure actively seek out employment?
  • Great engineers generally seek out two things when looking for new employment: interesting problems and awesome people. If your startup is “like Twitter plus blah”, you’re not likely going to be able to recruit top engineers.
  • Get involved in local meetups, bleeding edge protocols, open source, etc. I’ve found, and recruited, two of the best engineers I know via meetups (Arin Sarkissian) and open source projects (Chris Goffinet).
  • Do not, under any circumstances, send a recruiter after an engineer you covet. Send your most senior technologist after them. Pitch them on your team. Pitch them on your problem sets. Tell them about your strict adherence to TDD and Agile. My point on this is you need to send in someone who can speak engineering and sell that aspect of your company.
  • Pay them what they’re worth. If you don’t, someone else will. You aren’t going to lure these guys in with dreams of IPO or M&A riches. They’re smart enough to know that 1% of your company won’t lead to “fuck you money.” So don’t bicker over paying them an extra $10,000.

I’ve got a lot more ideas on how to manage and foster such a team once it’s been built, but those will have to wait for another blog post.

Why I hate the AGPL

I’ve been, shall we say, debating the AGPL with the Neo4j guys for a few weeks now. I’d originally reviewed Neo4j for some uses at SimpleGeo, but ended up excluding it as a possibility for three reasons.

  1. It didn’t support replication.
  2. There wasn’t any inherent support for partitioning. This means we’d have to use Memcached’esque hashing to partition our data, which might work for many, but won’t work for our use cases.
  3. It was licensed under the AGPL.

The first two ruled it out on purely technical grounds, which weren’t huge hurdles because I was looking for a foundation to build on and would have gladly built those features into Neo4j and released them back. The AGPL, however, was a deal breaker.

For those who don’t know, the AGPL is an incredibly viral license that says that not only do you have to redistribute your code changes, as the GPL states, but you must also publicly release any code that connects to an AGPL piece of software over a network. I fear the day that an HTTP server is built using AGPL. Think about it.

I’m extremely opposed to viral software. Since when is the term “viral” a good thing? Ultimately, though, my biggest complaint is the blatant hypocrisy of companies using AGPL. They’re basically stating that they’re “open” and “free”, when in reality they’re outsourcing free labor while retaining the right to charge for a proprietary license. The hypocrisy comes from companies trumpeting this so-called freedom when, in reality, they’re forcing certain behavior on their users. Reminds me of pious people saying, “You can live your life however you want! As long as you live it according to our rules.”

You want real freedom in software? Then release your code into the public domain, use CC0, or a BSD/Apache/MIT license. That’s true freedom. Anyone who tells you otherwise is either lying or delusional.

So do I have an issue with proprietary software? No, because Microsoft isn’t trying to play me for a fool. They’re up front saying, “Hey, it’s $199 to install Windows.” I can respect that. They’ve spent a lot of time building a product. They think it’s worth something. Consumers think it’s worth something. Simple economics come into effect after that.

What about services like SimpleGeo and AWS? No, because it’s not that you’re buying software. You’re buying a service. The crucial differentiator here is if your data is free. You’re not paying for AWS software or SimpleGeo software, you’re paying for service, support, and knowledge. What you pay SimpleGeo for is our knowledge in scaling, managing large server clusters, our data agreements with our various partners, and ease-of-use. It’s the same reason you go to the restaurant for a dish you could make at home.

What about Twitter or Digg? No, because those are communities. The notion that the software that runs Digg or Twitter is what makes them so special is ludicrous. The reason people keep going back to those sites is to participate with like-minded people in an environment they find fun and interesting. It’s like going to the bar.

So, in conclusion, if you’re going to release software then grow a pair and truly set your code free. If not, that’s fine too. Just don’t release your code using some zealot’s license and pretend your code is truly free when it’s not.

Fail fast and often

I’m saddened to hear the news that EventVue has closed up shop. It’s never fun watching other entrepreneurs have an exit like this knowing full well how much of themselves they put into it. The good news is the EventVue guys clearly have learned huge lessons and will be in a much better position to win in round two.

The one point they brought up as a lesson learned was that they didn’t focus on learning and failing fast until it was too late, which reminded me of a quote from Joi Ito I recently heard at a conference.

“Want to increase innovation? Lower the cost of failure.” — Joi Ito

This is absolutely crucial. As a startup you need to fail and fail often. Not a lot of people know that SimpleGeo was, in fact, born of failure. Matt and I originally started the company as a location-based gaming company. Due to market realities (investors rarely invest in gaming companies in the seed stage), personnel realities (neither Matt nor I had built games), and other factors we had to have what I call our “come to Jesus” moment. Since then we’ve focused on small, quick iterations and quickly scrapping or moving on from our failures (or in many cases simply adjusting our assumptions).

Easy to say, but how are we doing it?

  1. We use a loose approach to Agile development with two-week sprints. This means that, worst case scenario, we’ve wasted two weeks of effort in order to find out if a tweak, product, etc. is going to fail. I can’t emphasize enough how important this is. If you push your team for 6 months and release a giant product only to find out users hate it or don’t use it, you’re screwed. 6 months is a lot of opportunity cost for a startup.
  2. We actively engage with user feedback. If users don’t like something we abandon it. If users are demanding something we move it up the priority list. For instance, we thought for sure that people wanted location-based search first and foremost after storage. Turns out, users actually wanted better tools for accessing and managing the data they’re sending us. So we’ve focused on churning out more SDKs and tools for managing data.
  3. We use Amazon EC2. Specifically, for testing and prototyping, we use spot instances. It’s an extremely cost efficient way to destroy servers during testing.
  4. We start small. Our current product offering is probably only 10% of what users expect from us long-term as far as pure number of features goes. That being said, we picked the initial use-case and, I think, nailed it. The reality is that 60% or more of our users’ use cases are simply storing millions of points and running radius queries on those points quickly. It’s also the foundation that all of our other products will be built on. So we’ve spent time focusing on building and iterating on that use-case. As a result, we’re multi-homed in three datacenters (actively serving data from all three), redundant across all three (meaning your data is stored in every datacenter), searches are extremely fast (under 50ms in some cases if you’re on EC2), and all points are backed up to S3 in YOUR account (it’s your data after all).
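
For anyone wondering what a “radius query” from item 4 actually is, here’s a minimal sketch in Python. It is a brute-force scan using the haversine great-circle formula, which is an assumption made purely for illustration and not a description of how SimpleGeo implements it; a real service serving millions of points would use a spatial index. The function names and the sample coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def radius_query(points, lat, lon, radius_km):
    """Brute-force scan: return the points within radius_km of (lat, lon)."""
    return [p for p in points
            if haversine_km(lat, lon, p[0], p[1]) <= radius_km]


# Sample (lat, lon, label) records; approximate city-centre coordinates.
points = [
    (40.0150, -105.2705, "Boulder"),
    (39.7392, -104.9903, "Denver"),
    (37.7749, -122.4194, "San Francisco"),
]

nearby = radius_query(points, 40.0150, -105.2705, radius_km=50)
print([label for _, _, label in nearby])
```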

We’re not perfect by any means, but I think Matt and I realized early on that we need to be most efficient at recognizing and owning our failures and deficiencies. If you focus on minimally viable products, small iterations, and prioritizing user feedback you, too, can fail fast and often.

“Failure is useful.” — Larry Page

Creating Nagios Plugins using Python

Like just about everyone else on the planet, we use Nagios to monitor our servers. We needed to create a few plugins for Nagios to monitor some services that Nagios didn’t have plugins for; namely Cassandra and Gearman. I wanted to be able to easily create plugins and have them installed with setuptools.
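
A plugin base class for this can be quite small. What follows is a minimal sketch, assuming Python’s standard argparse module and the standard Nagios exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The class name, option names, and method names are hypothetical and chosen for illustration.

```python
import argparse
import sys

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATUS_NAMES = {OK: "OK", WARNING: "WARNING",
                CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


class NagiosPlugin:
    """Hypothetical base class: parses options, runs check(), reports status."""

    name = "CHECK"  # label printed at the start of the status line

    def build_parser(self):
        # Option names here are illustrative; add whatever your checks need.
        parser = argparse.ArgumentParser(description=self.__doc__)
        parser.add_argument("-H", "--host", default="localhost")
        parser.add_argument("-p", "--port", type=int, default=None)
        parser.add_argument("-t", "--timeout", type=float, default=10.0)
        return parser

    def check(self, options):
        """Subclasses return a (status_code, message) tuple."""
        raise NotImplementedError

    def run(self, argv=None):
        # argv=None makes argparse read the real command line.
        options = self.build_parser().parse_args(argv)
        try:
            status, message = self.check(options)
        except Exception as exc:
            # Anything unexpected is reported to Nagios as UNKNOWN.
            status, message = UNKNOWN, str(exc)
        print("%s %s - %s" % (self.name, STATUS_NAMES[status], message))
        return status

    @classmethod
    def main(cls):
        # Suitable as a console_scripts target: exits with the Nagios code.
        sys.exit(cls().run())
```

A concrete plugin then only has to subclass this, set `name`, and implement `check()`.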

The code above is a simple class that will implement all of the option parsing and such to run a Nagios plugin from the command line. From there you need to implement a plugin. Below is the Gearman plugin that I created. It runs a simple job I created that sums numbers and returns the results. If all goes well then the job should run and the correct sum of the randomly selected numbers should be returned.
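
The Gearman check described above can be sketched roughly as follows. Note that `submit_sum_job` is a hypothetical stand-in for whatever call your Gearman client library uses to submit a job and wait for its result; it’s stubbed with a local function here so the sketch runs without a Gearman server. The function names are assumptions for illustration.

```python
import random

OK, CRITICAL = 0, 2  # standard Nagios exit codes


def check_sum_job(submit_sum_job, count=5):
    """Send random numbers to a 'sum' job and verify the answer.

    submit_sum_job is a hypothetical callable standing in for a real
    Gearman client call: it takes a list of ints and returns the
    worker's result.
    """
    numbers = [random.randint(1, 100) for _ in range(count)]
    expected = sum(numbers)
    try:
        result = int(submit_sum_job(numbers))
    except Exception as exc:
        # Connection errors, timeouts, and bad payloads all end up here.
        return CRITICAL, "job failed: %s" % exc
    if result != expected:
        return CRITICAL, "expected %d, got %d" % (expected, result)
    return OK, "sum job returned %d" % result


def fake_worker(numbers):
    # Stand-in for a Gearman worker so the example runs locally.
    return str(sum(numbers))


status, message = check_sum_job(fake_worker)
print("GEARMAN %s - %s" % ("OK" if status == OK else "CRITICAL", message))
```

In a real plugin the status code would be passed to `sys.exit()` so Nagios can read it.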

Once you’ve got all of that working it’s pretty simple to then add the following snippet of code to your setup.py file.


      entry_points = {
          'console_scripts': [
              'check_gearmand = path.to.nagios.plugins:check_gearmand',
          ]
      },