
Google DRAM Study Offers Promises, Problems to Industry

The DRAM industry has long depended on major technical advances to increase capacity and lower manufacturing costs. It is a long-term strategy of making money when it can and gritting its teeth through the down cycles, when the red ink flows like the Nile in flood. Now a multi-year study done by Google in its own data centers has upended one of the fundamental assumptions about DRAM: that errors are rare and aren't affected by time. And that could set up one of the biggest good news-bad news scenarios the industry has seen.

The Google study is impressive in scope and duration. Over two-and-a-half years, the company conducted what is reportedly the largest real-world-get-me-out-of-this-dang-lab examination of how DRAM actually functions. Given the size of Google's data centers, you can understand how this became more than a passing curiosity. And the results were surprising:

The headline conclusion in the study is that DRAM errors are vastly more common than is typically assumed. Nearly one-third of the individual machines in the study saw at least one error per year, a rate that's orders of magnitude higher than previous research had indicated. To give some hard numbers, previous studies report 200 to 5,000 failures in time per billion hours of operation (FIT) per Mbit; Google found that their numbers were between 25,000 and 75,000 FIT per Mbit. On the bright side, most of these errors are the result of a few bad apples. About 8 percent of DIMMs were responsible for over 90 percent of the errors, as DIMMs that produced one error were hundreds of times more likely to produce another error in the same month.
In addition, failures seemed to really kick in at about 20 months, suggesting that companies should start swapping out RAM a lot sooner than the habitual 36-month cycle that many IT shops use. That's the backhanded plus side for DRAM vendors: they get to sell more product to the companies that depend on it (read that as everyone) because it has a miserably short lifespan.
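
To put the FIT figures in more familiar terms, here is a rough back-of-the-envelope conversion into expected errors per year for a single memory module. It is only a sketch: the 1 GB module size (8,192 Mbit) and the function name `errors_per_year` are assumptions chosen for illustration, not anything taken from the study, and the calculation treats the rate as a uniform average even though the errors were concentrated in a small share of DIMMs.

```python
# Back-of-the-envelope conversion of FIT-per-Mbit figures into an
# average number of errors per year for one memory module.
# Assumption: a hypothetical 1 GB module, i.e. 8 * 1024 = 8,192 Mbit.

HOURS_PER_YEAR = 24 * 365       # 8,760 hours, ignoring leap years
FIT_HOURS = 1_000_000_000       # FIT counts failures per billion hours


def errors_per_year(fit_per_mbit, capacity_mbit):
    """Average errors per year implied by a FIT-per-Mbit rate."""
    return fit_per_mbit * capacity_mbit * HOURS_PER_YEAR / FIT_HOURS


if __name__ == "__main__":
    module_mbit = 8 * 1024  # assumed 1 GB module
    # The four rates quoted above: earlier studies, then Google's range.
    for fit in (200, 5_000, 25_000, 75_000):
        rate = errors_per_year(fit, module_mbit)
        print(f"{fit:>6} FIT/Mbit -> about {rate:,.0f} errors per year")
```

Because this is an average, it overstates what a typical module would see and understates what a bad one would: most modules in the study logged few or no errors, while a small minority logged the bulk of them.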

However, the negative cast doesn't go away. A product failure rate of 8 percent at least raises the question of whether anyone is bothering with sufficient quality control. Luckily for the vendors, there is no easy replacement for their products. Still, to say, "You're going to have to replace hardware far more frequently than you ever thought," is tantamount to handing large companies huge bargaining chips in negotiating for lower prices. After all, if you have to swap out roughly two-thirds of the way through what you thought was the normal product life, you may very well want to pay only two-thirds of what you were paying before. If there's anything the DRAM industry could do without, it's downward pressure on prices. And it doesn't do any favors for the rest of the computer industry. When customers have to take $X out of their budgets for additional memory purchases, it means that much less for other purposes in this decidedly zero-sum game.

Image via stock.xchng user dimshik, site standard license.
