Statistics and Scientific Research

All measurements contain some uncertainty and error, and statistical methods help us quantify and characterize this uncertainty. This helps explain why scientists often speak in qualified statements. For example, no seismologist would be willing to tell you exactly when an earthquake is going to occur; instead, the US Geological Survey issues statements like this: “There is … a 62% probability of at least one magnitude 6.7 or greater earthquake in the 3-decade interval 2003-2032 within the San Francisco Bay Region” (USGS, 2007). This may sound ambiguous, but it is in fact a very precise, mathematically derived description of how confident seismologists are that a major earthquake will occur. Open reporting of error and uncertainty is a hallmark of quality scientific research.
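To see how a probability statement like the USGS forecast can be handled quantitatively, here is a minimal Python sketch. It rests on a strong simplifying assumption that is not part of the USGS statement: that qualifying earthquakes occur at a constant average rate, independently of one another (a Poisson process). The function names are hypothetical and chosen only for this illustration. The actual USGS forecast comes from more detailed models, so the result below is a rough illustration, not a USGS figure.

```python
import math


def constant_rate_from_window(p_window, years):
    """Average events per year implied by P(at least one event) over a window.

    Assumes a constant-rate Poisson process (an illustrative simplification).
    """
    return -math.log(1.0 - p_window) / years


def prob_at_least_one(rate, years):
    """P(at least one event) in a window of the given length, same assumption."""
    return 1.0 - math.exp(-rate * years)


# 62% over 30 years is the figure quoted from the USGS statement.
p_30yr = 0.62
rate = constant_rate_from_window(p_30yr, 30)

# Probability of at least one such earthquake in any single year,
# under the constant-rate assumption.
p_1yr = prob_at_least_one(rate, 1)
print(f"Implied 1-year probability: {p_1yr:.1%}")
```

Under this assumption, the 30-year figure corresponds to a chance of a little over 3% in any single year. The same number describes the same uncertainty over a different time window, which is what makes a probabilistic statement precise rather than vague.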

Today, science and statistical analyses have become so intertwined that many scientific disciplines have developed their own subsets of statistical techniques and terminology. For example, the field of biostatistics (sometimes referred to as biometry) involves the application of specific statistical techniques to disciplines in biology such as population genetics, epidemiology, and public health. The field of geostatistics has evolved to develop specialized spatial analysis techniques that help geologists map the location of petroleum and mineral deposits; these spatial analysis techniques have also helped Starbucks® determine the ideal distribution of coffee shops by maximizing the number of customers visiting each store. Used correctly, statistical analysis goes well beyond finding the next oil field or cup of coffee to illuminating scientific data in a way that helps validate scientific knowledge.

Scientific research rarely leads to absolute certainty. There is some degree of uncertainty in all conclusions, and statistics allow us to discuss that uncertainty. Statistical methods are used in all areas of science. Understanding statistics in research means understanding the difference between (a) proving that something is true and (b) measuring the probability of getting a certain result. It also means recognizing that common words like “significant,” “control,” and “random” have different meanings in the field of statistics than in everyday life.

Key Concepts

Statistics are used to describe the variability inherent in data in a quantitative fashion, and to quantify relationships between variables.

Statistical analysis is used in designing scientific studies to increase consistency, measure uncertainty, and produce robust datasets.

A number of misconceptions surround statistics, including confusion between statistical terms and the everyday use of similar words, and confusion about the role that statistics play in data analysis.