
Stats

They say that 83% of all statistics are lies. Think about that for a second.

Statistics are one of the main things media outlets use to back their claims. It is remarkable how often a media outlet will quote a statistic without crediting where it came from. Further, they are fond of taking creative liberties with how they quote the article to suit their needs.

These statistics cover damage to systems, percentage of intrusions, virus infections and everything else related to security. There are simply too many instances of suspect statistics relating to the computer security industry to read, match and analyze them all. Most of the statistics here are simply referenced, and it is left to the keen reader to draw their own conclusions. Analysis may be provided for articles and reports that are widely quoted or otherwise interesting. Use the feedback link at the bottom of the page if you wish to recommend an article or report for analysis, and please include why you feel it is important.

Due to the number of articles with statistics and the time it takes to analyze them, this page serves only as a very primitive repository for quotes and statistics about security. It is intended to be searched using the 'find' feature in your web browser while viewing the Statistics or Archive pages. As time permits, we will try to group similar statistics together.


Background

Statistics involving computer security suffer from many problems. As with any topic, it is trivial to learn how to lie with statistics and how to lie with charts. Depending on how a figure is worded, it is easy for the audience to misread or misinterpret it, and easy for the author to mislead them. Thomas Gilovich goes into great detail on how human nature leads us to interpret and misinterpret facts, based on how they are presented to us, in his book How We Know What Isn't So: The Fallibility of Human Reason in Everyday Life. This book should be essential reading in high school.

The problem with damage figures related to security gets even more complicated. Not only are complex technical terms and situations often presented to an audience unfamiliar with technology, but journalists and researchers also liberally interchange terms that have distinct meanings to security professionals. When the media reports on damage figures such as telephone fraud and software piracy, they largely ignore the fact that people will readily take something that is 'free' while being unwilling to pay for the very same thing under the same circumstances. Someone may be willing to download a movie to watch with passing interest while they work, but would be unwilling to pay $10.00 to see it in a theatre. Does that mean the movie industry suffered $10.00 in damages, when the person never would have paid for it in the first place? The movie industry says so, and reflects it in its damage figures and statistics.
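The arithmetic behind such a damage figure can be sketched as follows. This is only an illustration, not any industry's actual method: the download count and the share of downloaders who would otherwise have paid are invented for the example, and only the $10.00 ticket price comes from the text above.

```python
# Hypothetical illustration: every number except the $10.00 ticket price is invented.

def claimed_damages(downloads: int, price: float) -> float:
    """Damage figure that counts every download as a lost sale."""
    return downloads * price


def adjusted_damages(downloads: int, price: float, would_have_paid: float) -> float:
    """Damage figure that counts only downloaders who would otherwise have paid."""
    if not 0.0 <= would_have_paid <= 1.0:
        raise ValueError("would_have_paid must be a fraction between 0 and 1")
    return downloads * price * would_have_paid


if __name__ == "__main__":
    downloads = 1_000     # assumed number of downloads
    ticket_price = 10.00  # theatre price from the example above
    for fraction in (1.0, 0.5, 0.1):  # assumed share who would have bought a ticket
        loss = adjusted_damages(downloads, ticket_price, fraction)
        print(f"{fraction:.0%} would have paid: ${loss:,.2f}")
    print(f"Every download counted: ${claimed_damages(downloads, ticket_price):,.2f}")
```

The reported figure corresponds to assuming that 100% of downloaders would have paid. Any smaller assumption shrinks the "damage" proportionally, and that assumption is rarely stated alongside the number.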

Important notes about context

For anyone who hasn't taken a course in statistics or read the books above, here is an example of how statistics can be manipulated or misinterpreted:

"75% of respondents indicated that they felt spyware wasn't an issue for them or anyone they knew."

75% is a nice, believable number, well above a majority. With the type of statistics most often used in news reporting, the idea is that because sampling an entire population (e.g., Republicans or the French) is infeasible, a much smaller group is sampled as a representative subset of the larger population. This is the type of statistic used when election polls come out or when opinion surveys are done. If you hear a news anchor rattling off a number like the President's approval rating, they didn't ask everyone in the country what their opinion of the President was (you didn't get a call, did you?).

Using our example, "75% of respondents" seems like a statement that gives you a lot of information: we took a survey and 75% of people said the same thing.

By itself, this statement is essentially worthless. There are too many questions that go unanswered that would put the statement in an understandable context. For example:
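One such question is how many people were actually asked. As a rough sketch, the standard normal-approximation margin of error for a proportion shows how much the same "75%" can vary with sample size. This assumes a simple random sample and a 95% confidence level (z = 1.96). The sample sizes are made up for illustration, since the quote never gives one, and the formula says nothing about other problems such as who was sampled or how the question was worded.

```python
import math


def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Normal-approximation margin of error for a sample proportion.

    p: observed proportion (0.75 for "75% of respondents")
    n: number of respondents (hypothetical here; the quote never says)
    z: critical value, 1.96 for roughly 95% confidence
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    return z * math.sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    p = 0.75
    for n in (20, 100, 1000):  # assumed sample sizes
        moe = margin_of_error(p, n)
        print(f"n={n:>5}: 75% plus or minus {moe:.1%} "
              f"(roughly {p - moe:.1%} to {p + moe:.1%})")
```

Under these assumptions, a survey of only 20 people leaves the true figure anywhere from the mid-50s to the mid-90s percent, which is a very different claim from a confident "75%".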

Statistics can be an excellent way to summarize a large amount of data. However, they can also be an excellent tool for obscuring the truth if the reader doesn't ask the proper questions. Keep this in mind when you see a statistic that purports to report the answer to a complex question for a large number of people. Unless that question is quantitative ("How many children do you have?"), it probably deserves a closer look.
