Optimizing Sequencing Depth in Microbiome Studies: Food For Thought

Introduction

Choosing a sequencing depth is a key design decision in microbiome research, especially for studies based on 16S rRNA, ITS, and other marker genes. The depth required depends on the target gene region, sample complexity, study design, and bioinformatics strategy. No universal threshold exists, but empirical and modeling-based studies offer workable starting ranges for common sample types.

1. 16S rRNA Sequencing in Human Gut Microbiome

In relatively low-complexity environments like the human gut, 10,000–50,000 reads per sample are typically sufficient to capture most genus-level diversity. For finer, species-level resolution using denoising algorithms such as DADA2, 50,000–100,000 reads per sample are often recommended, although how far short amplicons can resolve species also depends on the 16S region targeted.

Example: The American Gut Project showed that rarefaction curves plateaued at ~25,000 reads per sample for genus-level richness, indicating adequate sequencing depth for many gut samples (McDonald et al., 2018).
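
A rarefaction curve like the one described above can be approximated by repeatedly subsampling a sample's reads at increasing depths and counting the taxa observed. The following is a minimal sketch, not the American Gut Project's pipeline: the function name, the toy count vector, and the chosen depths are all illustrative assumptions, and only the Python standard library is used.

```python
import random

def rarefaction_curve(counts, depths, n_reps=10, seed=0):
    """Mean number of observed taxa at each subsampling depth.

    counts: per-taxon read counts for one sample (hypothetical input).
    depths: subsampling depths; reads are drawn without replacement.
    """
    rng = random.Random(seed)
    # Expand the count vector into one taxon label per read.
    reads = [taxon for taxon, c in enumerate(counts) for _ in range(c)]
    curve = []
    for depth in depths:
        if depth > len(reads):
            raise ValueError("depth exceeds total reads in sample")
        # Richness = number of distinct taxa seen in each random subsample.
        richness = [len(set(rng.sample(reads, depth))) for _ in range(n_reps)]
        curve.append(sum(richness) / n_reps)
    return curve

# Toy sample (made-up numbers): three dominant taxa plus many rare ones.
toy_counts = [5000, 3000, 1000] + [10] * 50 + [1] * 50
depths = [100, 1000, 5000, sum(toy_counts)]
curve = rarefaction_curve(toy_counts, depths)

for depth, richness in zip(depths, curve):
    print(f"{depth:>6} reads -> {richness:.1f} taxa observed on average")
```

A curve that is still climbing at the deepest subsample suggests that additional reads would keep revealing new taxa. A curve that has flattened suggests the depth is adequate for richness at that taxonomic level.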

2. Soil or Marine Microbiomes

Highly diverse environments like soils or oceans require deeper sequencing to recover rare taxa and enable robust beta diversity comparisons. In such cases, 100,000–500,000 reads per sample are commonly used.

Example: In Lundberg et al. (2012), ~200,000 reads per sample were needed to capture rare taxa in rhizosphere soil microbiomes.
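
The link between depth and rare-taxon recovery can be made concrete with a simple sampling model. Assuming each read is an independent draw and a taxon has relative abundance p, the chance of seeing it at least once in N reads is 1 - (1 - p)^N. The sketch below is a simplification: it ignores PCR and primer bias, sequencing error, and read filtering, and the function names and example abundance are hypothetical.

```python
import math

def detection_probability(rel_abundance, depth):
    """Probability of drawing at least one read from a taxon.

    Assumes independent reads (a simple binomial model).
    """
    return 1.0 - (1.0 - rel_abundance) ** depth

def depth_for_detection(rel_abundance, target_prob=0.95):
    """Smallest depth giving at least target_prob of detection."""
    return math.ceil(math.log(1.0 - target_prob) / math.log(1.0 - rel_abundance))

# Illustrative case: a taxon making up 0.01% of the community.
rare = 1e-4
needed = depth_for_detection(rare, target_prob=0.95)
print(f"Reads needed for 95% detection at p={rare}: {needed}")
for depth in (10_000, 50_000, 200_000):
    print(f"{depth:>7} reads -> P(detect) = {detection_probability(rare, depth):.3f}")
```

Under this model the required depth scales roughly with 1/p, which is one reason communities with long tails of rare taxa call for much deeper sequencing than communities dominated by a few taxa.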

3. Fungal ITS Sequencing

Fungal ITS regions, such as ITS1 or ITS2, vary in length and copy number across taxa, so read counts do not map cleanly onto true abundances. To avoid underrepresenting rare taxa, depths of 30,000–100,000 reads per sample are typical.

Reference: Nilsson et al. (2019) noted that insufficient read depth can distort community composition and hinder accurate identification of low-abundance fungi.

4. Other Marker Genes (e.g., COI, rbcL)

Marker genes such as COI, used in metazoan profiling, or rbcL, used for plants, often require more depth because the sequences are highly variable and no highly conserved universal primers exist, which makes amplification uneven across taxa. For environmental samples, 50,000–150,000 reads per sample are recommended.

Reference: Leray et al. (2013) developed a versatile COI primer set and recommended high read counts to ensure detection of rare species across taxa.

Conclusion

Selecting the appropriate sequencing depth depends on the sample type, target region, taxonomic resolution required, and downstream statistical goals. Over-sequencing wastes resources, while under-sequencing may miss rare taxa or reduce statistical power. Empirical and model-based guidelines, such as those from the American Gut Project, rhizosphere soil studies, and fungal ITS research, provide a starting point for tailoring depth to the study context.
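
One quick way to check whether an existing sample looks under-sequenced is Good's coverage, which estimates the fraction of the community represented in the data as 1 - F1/N, where F1 is the number of taxa seen exactly once and N is the total read count. The sketch below is illustrative only: the count vector is made up, and any cutoff for "adequate" coverage is a study-specific choice rather than a standard. Denoising pipelines that discard singleton sequences will inflate this estimate.

```python
def goods_coverage(counts):
    """Good's coverage estimate: 1 - (singleton taxa / total reads)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    singletons = sum(1 for c in counts if c == 1)
    return 1.0 - singletons / total

# Made-up per-taxon read counts for one sample.
example_counts = [50, 30, 15, 1, 1, 1, 1, 1]
coverage = goods_coverage(example_counts)
print(f"Good's coverage: {coverage:.3f}")
```

Values close to 1 suggest that few additional taxa would be found by sequencing deeper. Lower values point to a sample where rare taxa are still being missed.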