By Teri Evans
When David Karp created Tumblr in 2007, he saw it as a fun side project to his tech consulting. But the blogging platform gained traction among creative, techie early adopters, and he ultimately gave up consulting. Still, Karp never imagined that a few years later Tumblr's growth trajectory would shoot up with the force and shape of a hockey stick.
The New York-based company hosts more than 19 million bloggers across the globe, drawing about 7 billion page views per month, according to web-analytics service Quantcast.
While Tumblr has become an investor darling -- having partnered with three institutional investors and a half-dozen angels -- the fast-growing company has had its share of growing pains. Notably, one weekend in December last year, the entire site was knocked offline, along with the millions of blogs it hosted.
"It was a cascading failure. The wrong server in the wrong place failed, and it took out a whole cluster of servers with it," recalls Karp, 24. "We were back after 16 hours in certain countries, but we weren't able to bring America back all at once. We brought back clusters [of data] at a time."
It was the startup's most challenging time, one that Karp says he will never forget as it's a case study of the critical steps needed to recover from a major network failure. Ironically, Karp says, Tumblr's traffic actually increased after the crash -- and it has continued on a sharp upswing since. Here, Karp shares his experience of recovering from that crisis. The lessons he learned offer takeaways for other business owners facing a similar challenge.
No. 1: Don't let a stubborn streak stand in the way.
In the first few years, even as signs the company was taking off became abundant, Karp thought he could handle the growth with a team of just two engineers. The need to forecast growth was never as obvious to him as it was to investors, he says.
"Investors wanted to know: 'Are you hiring as many people as you should? And, if you had more people, could you move faster?' They asked me that question all the time," says Karp, noting his initial resistance to hire more engineers. The massive network failure in 2010 was a wakeup call. "I was such a stubborn perfectionist until it became really clear we had to bring other people in, because having it all fall on me was really slowing us down."
No. 2: Take responsibility and act fast.
When Tumblr's network went down, the company tweeted developments throughout the recovery. After saying it was "incredibly sorry" for the outage, the company reassured users their blogs were safe and would be back online as quickly as possible. Within about 24 hours, service was restored. Karp went back on Twitter to explain what happened, linking to a post on Tumblr's own blog.
"The follow-up was not the techie postmortem, which I find kind of disingenuous and insulting to mainstream users," says Karp, noting that an overly technical explanation can be distracting and even add to the confusion. "We opted to give some technical detail as to what happened, but really make it all about how we messed up, we owe them better and here's what we're doing.'"
By the end of the month, Tumblr's engineering team quadrupled from two to eight, as part of the effort to manage hypergrowth more seamlessly.
No. 3: Communicate clearly to your team and investors.
Striving to be the voice of calm during the crisis, Karp sat down with his team to explain what happened and come up with a strategic plan to fix it.
"For example, bringing blogs back online first was way more important than bringing our dashboard back online," Karp recalls. "Over the next few days, we were also vetting the rest of our infrastructure to make sure nothing like that [large-scale outage] could happen again."
Of course, investors at the next board meeting wanted to know what allowed the network to fail and what could be done to prevent a future tumble. As a result, Karp's team reached out to the tech teams of other fast-growth companies that had experienced similar challenges and could offer advice and feedback.
Today Tumblr has 33 employees, nearly half of whom were hired within the past six months.
"We're constantly adding servers and we're catching up to [our growth] really quickly, and pretty much ahead of it now," Karp says. "We haven't been able to push product out as fast as we used to, so I'm excited to get back to that. It's been the fire under us more than anything else."