What is FDR in statistics?

Last updated: April 1, 2026

Quick Answer: The false discovery rate (FDR) is the expected proportion of false positives among all results declared significant. Controlling the FDR is a standard way to handle multiple comparisons in hypothesis testing, and it retains more statistical power than stricter corrections such as Bonferroni.

Understanding False Discovery Rate

The False Discovery Rate (FDR) is a fundamental concept in statistics that helps researchers manage the problem of multiple comparisons. When many statistical tests are run simultaneously, the chance of obtaining at least one false positive (Type I error) rises sharply, and the number of false positives grows with the number of tests. For example, if you conduct 1,000 independent tests at a significance level of 0.05 and every null hypothesis is actually true, you would expect about 50 false positives by chance alone. FDR control provides a principled way to limit this inflation while maintaining reasonable statistical power.
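The arithmetic behind that example can be checked with a minimal sketch. The function names are illustrative, and both calculations assume that every null hypothesis is true and that the tests are independent.

```python
def expected_false_positives(num_tests, alpha):
    """Expected number of false positives when every null hypothesis is true."""
    return num_tests * alpha


def family_wise_error_rate(num_tests, alpha):
    """Probability of at least one false positive across independent tests,
    again assuming every null hypothesis is true."""
    return 1 - (1 - alpha) ** num_tests


print(expected_false_positives(1000, 0.05))  # about 50
print(family_wise_error_rate(10, 0.05))      # roughly 0.40
print(family_wise_error_rate(1000, 0.05))    # indistinguishable from 1
```

Even with only 10 tests, the chance of at least one false positive is already around 40%, which is why uncorrected testing breaks down quickly.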

The Multiple Comparisons Problem

In modern research, scientists often test thousands of hypotheses simultaneously. In genomics, researchers might test whether each of thousands of genes is associated with a disease. In neuroimaging, researchers test associations across millions of brain voxels. Traditional methods like the Bonferroni correction, which divides the significance level by the number of tests, are often overly conservative in these settings. The Bonferroni approach controls the family-wise error rate (the probability of making even one false positive across all tests), but it becomes so stringent in large-scale testing that many true discoveries are missed.
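As a rough illustration of how strict Bonferroni is, the sketch below applies it to ten made-up p-values. The function name and the data are assumptions chosen for the example, not taken from any real study.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject a hypothesis only if its p-value is at most alpha / m,
    where m is the number of tests."""
    m = len(p_values)
    if m == 0:
        return []
    threshold = alpha / m
    return [p <= threshold for p in p_values]


# Hypothetical p-values from ten tests (illustrative only).
p_values = [0.030, 0.001, 0.60, 0.012, 0.20, 0.004, 0.90, 0.019, 0.35, 0.045]

uncorrected = sum(p <= 0.05 for p in p_values)
corrected = sum(bonferroni_reject(p_values, alpha=0.05))
print(uncorrected)  # 6 p-values fall below 0.05 without correction
print(corrected)    # 2 survive the Bonferroni threshold of 0.005
```

With 1,000 tests the same rule would require p-values at or below 0.00005, which is the loss of power the paragraph above describes.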

How FDR Works

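FDR is the expected proportion of false discoveries among all rejected hypotheses, with the proportion counted as zero when nothing is rejected. If you control the FDR at 0.05, then among all the tests you call significant, no more than about 5% are expected to be false positives on average. This is fundamentally different from a traditional per-test significance level, which limits the probability of a false positive in each individual test, and from family-wise corrections, which limit the probability of any false positive at all. The Benjamini-Hochberg procedure implements FDR control as follows: sort the m p-values from smallest to largest, find the largest rank k for which the k-th smallest p-value is at most (k/m) * q, where q is the target FDR, and reject the k hypotheses with the smallest p-values. Its guarantee was originally proved for independent tests.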
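The Benjamini-Hochberg step-up rule can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: the function name, the parameter q (the target FDR), and the p-values are assumptions made for the example.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans, True where the hypothesis is rejected."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values from ten tests (illustrative only).
p_values = [0.030, 0.001, 0.60, 0.012, 0.20, 0.004, 0.90, 0.019, 0.35, 0.045]

decisions = benjamini_hochberg(p_values, q=0.05)
print(sum(decisions))  # 4 hypotheses are rejected
```

On these values the procedure rejects four hypotheses (p = 0.001, 0.004, 0.012, 0.019), whereas a Bonferroni threshold of 0.05 / 10 = 0.005 would reject only two. In practice, an established library routine is usually preferable to a hand-written loop; statsmodels, for example, offers `multipletests` with `method='fdr_bh'`.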

FDR vs. Traditional Methods

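Compared to stricter corrections such as Bonferroni, which control the family-wise error rate, FDR control accepts a small expected fraction of false discoveries in exchange for greater statistical power.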

Applications and Advantages

FDR control has become standard practice in genomics and gene expression studies, where thousands of genes are tested simultaneously. It is equally valuable in neuroimaging analysis, microarray experiments, and psychological research involving multiple tests. The primary advantage is that it preserves statistical power while keeping the share of false discoveries bounded, so researchers can find real effects in high-dimensional data without being overwhelmed by false positives.

Related Questions

What is the difference between FDR and p-value?

A p-value is the probability, assuming the null hypothesis is true, of observing results at least as extreme as those actually observed in a single test. FDR, by contrast, is a property of a whole set of tests: it is the expected proportion of false discoveries among all rejected hypotheses, which makes it the relevant quantity when many tests are conducted simultaneously.
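One common way to connect the two ideas is the Benjamini-Hochberg adjusted p-value: each raw p-value is rescaled so that it can be compared directly against a target FDR. The sketch below is illustrative only; the function name and data are assumptions, and the adjustment shown is the standard one (the running minimum of m * p / rank, taken from the largest p-value downward).

```python
def bh_adjusted_p_values(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so each adjusted value is the
    # minimum of m * p_(j) / j over all ranks j >= the current rank,
    # capped at 1.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values from ten tests (illustrative only).
p_values = [0.030, 0.001, 0.60, 0.012, 0.20, 0.004, 0.90, 0.019, 0.35, 0.045]

adjusted = bh_adjusted_p_values(p_values)
significant = [a <= 0.05 for a in adjusted]
print(sum(significant))  # 4 tests remain significant at an FDR of 0.05
```

A raw p-value of 0.030 looks significant on its own, but its adjusted value is about 0.06, so it is not declared a discovery at an FDR of 0.05.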

What does an FDR of 0.05 mean?

Controlling the FDR at 0.05 means that among all the tests you declare significant, no more than about 5% are expected to be false positives on average. This is different from a p-value threshold of 0.05, which limits the false positive probability for a single test and says nothing about the full set of comparisons.

Why is FDR important in genomics?

In genomics, researchers test thousands of genes simultaneously. FDR control keeps the proportion of false positives among the reported genes low while retaining enough statistical power to detect true genetic associations, many of which would be missed under stricter corrections like Bonferroni.

Sources

  1. Wikipedia - False Discovery Rate (CC-BY-SA-4.0)
  2. Benjamini & Hochberg (1995) - Controlling the False Discovery Rate (Academic)