
Computer Reliability Study Offers Unreliable, Bogus Data

I just came across a "2010 computer reliability report" from Rescuecom Corporation. It's getting some high-profile play in the press, in places such as Fortune's Apple 2.0 blog, and I'm shaking my head. This is just another example of a high tech company issuing an invalid study for the benefit of credulous reporters with a poor understanding of statistics. The reporters eat it up, to the detriment of the public.

Rescuecom issued the release today, saying that it gave "factual, unbiased data to determine the reliability of today's personal computers," and claiming that the top five computer manufacturers for reliability were:

  1. Apple (AAPL)
  2. Asus (AKCIF)
  3. IBM/Lenovo (LNVGY)
  4. Toshiba (TOSBF)
  5. HP/Compaq (HPQ)
So what's the problem? The results are meaningless, given the methodology. According to Rescuecom president Josh Kaplan, the company looked at a sample of 69,900 support calls it received from its clients in 2009. It then looked at the machines that were the subject of those calls, and compared the percentage breakdown to U.S. personal computer market share data (percentage share of computers shipped) from market researcher IDC. However, there are a few major problems:
  • The company doesn't have support contracts with users. It simply provides support for anyone who calls.
  • Rescuecom assumes that the calls come in a breakdown proportionate to the computer-buying public as a whole.
  • Rescuecom compares its numbers to market share numbers for people who bought computers in the country last year.
  • It assumes that every call for support indicates a problem with the computer, even when the software and hardware are functioning as designed and the user simply misunderstood how to do something.
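To make the methodology concrete, here is a minimal sketch of the comparison the release implies: a brand looks "reliable" when its share of support calls is smaller than its share of the market. Rescuecom has not published an exact formula, and the brand names and numbers below are invented for illustration.

```python
# Hypothetical sketch of the call-share vs. market-share comparison
# (Rescuecom's exact scoring formula isn't published; all numbers invented).
# A score above 1 means fewer calls than market share would predict.

market_share = {"BrandA": 0.40, "BrandB": 0.10}   # share of PCs shipped (IDC-style)
call_share   = {"BrandA": 0.30, "BrandB": 0.20}   # share of Rescuecom support calls

for brand in market_share:
    score = market_share[brand] / call_share[brand]
    print(brand, round(score, 2))
```

With these made-up inputs, BrandA scores 1.33 and BrandB scores 0.5, so BrandA would be ranked "more reliable" — which is exactly the kind of conclusion the rest of this article argues the data cannot support.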
One of the biggest problems in bad statistics is the non-representative sample. For statistics to work, you need responses from a randomly picked group of people out of the population you're studying. Taking a random sampling of its own support calls means that Rescuecom can only describe the makeup of its support calls. It assumes that the calls are representative of the general computer-buying populace, but it has no way to know that. It's essentially the same as saying that if you looked at people who bought a sandwich at McDonald's for a year, you could determine the chicken-versus-beef-versus-fish preferences of the American public as a whole.

Therefore, they can't reasonably compare their support calls to the populace of people who are having problems with a computer. Because reliability is essentially the ratio of products with problems to the total number of products, their numbers cannot even address the issue of reliability. To do that, they'd need to have support contracts with people, know the vendors of the machines in advance, and then see what percentage of each vendor ended up with problems. That still wouldn't be representative of the populace as a whole, but at least it would let the company accurately note the reliability that its own customers experienced.
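The sampling problem above can be shown with a few lines of arithmetic: two brands with identical true failure rates produce very different call shares if their owners differ in how likely they are to call Rescuecom rather than the manufacturer, a friend, or nobody. All numbers here are invented for illustration.

```python
# Two hypothetical brands with IDENTICAL true failure rates, whose owners
# differ only in their propensity to call Rescuecom (all numbers invented).

installed_base = {"BrandA": 1_000_000, "BrandB": 1_000_000}
failure_rate   = {"BrandA": 0.05,      "BrandB": 0.05}  # equally reliable
call_rate      = {"BrandA": 0.10,      "BrandB": 0.40}  # likelihood of calling Rescuecom

calls = {b: installed_base[b] * failure_rate[b] * call_rate[b]
         for b in installed_base}
total = sum(calls.values())
for brand, n in calls.items():
    print(brand, f"{n / total:.0%} of calls")
```

BrandA generates 20% of the calls and BrandB 80%, despite identical reliability — so comparing call share to market share measures who calls Rescuecom, not whose computers break.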

Finally, the company doesn't address when the people who called for support actually purchased their computers. For all it knew, some larger percentage of the support calls were for computers that were over a year old and out of warranty. Yet the company was comparing its support calls to IDC market share numbers explicitly for last year, making the two sets of numbers even more of a mismatch.

I spoke with not only the company president, but also its marketing director, Eric Fontaine. Both insisted that the methodology was reasonable. As Kaplan said, "We are brand agnostic. Every single brand, every single customer, is just as able to call us for support." In other words, because every computer buyer could in theory call them, the company assumes that every buyer is equally likely to call.

I also called an old friend and colleague, Jeffrey Henning, a founder and current VP of strategy at Vovici, which makes survey software. Henning has worked in the survey business for years and has written software packages for the practice. Here are three quotes of his that sum up his reaction:

  • "Have they heard of an installed base [of product users]?"
  • "A random sampling of their calls means it's a random sampling of their calls, and nothing else."
  • "Wow, that's pretty meaningless."
It's not that the Rescuecom people are trying to pull one over on the public. I think they're sincere. Unfortunately, misunderstandings of statistics are as rampant in the high tech industry as they are anywhere, and journalists should get a lot smarter about what they read in press releases.

By the way, if you want to see the lengths to which some companies will go to game reports, read this piece by ZDNet's Larry Dignan about Randall C. Kennedy and Devil Mountain Software. It demonstrates how thoroughly high tech journalists are often conned by people who want to issue reports for notoriety and profit.

Image via user lusi, site standard license.
