Amplicon sequence variant
An amplicon sequence variant (ASV) is a single DNA sequence recovered from a high-throughput marker gene analysis. ASVs are inferred after erroneous sequences generated during PCR and sequencing have been removed, which allows them to distinguish sequence variation down to a single nucleotide change. ASVs are used to classify groups of species based on DNA sequences, to find biological and environmental variation, and to determine ecological patterns.

For many years the standard unit for marker gene analysis was the operational taxonomic unit (OTU), which is generated by clustering sequences based on a shared similarity threshold. These molecular taxonomic units are constructed either by clustering sequencing reads according to their similarity to one another (de novo OTUs) or by clustering reads against a reference database that defines and labels each OTU (closed-reference OTUs). Rather than exact sequence variants, OTUs are delimited by a dissimilarity threshold, most commonly 3%, so sequences grouped into the same OTU share at least 97% of their DNA sequence. ASV methods, by contrast, resolve sequence differences as small as a single nucleotide and so avoid similarity-based clustering altogether. ASVs therefore provide a more precise measurement of sequence variation, because they reflect differences in the DNA itself rather than user-defined clustering thresholds. ASVs are also referred to as exact sequence variants (ESVs), zero-radius OTUs (zOTUs), sub-OTUs (sOTUs), haplotypes, or oligotypes.[1] [2]
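The contrast between a 97% similarity threshold and exact sequence variants can be illustrated with a minimal Python sketch. It is not the procedure of any particular OTU-picking tool: the function names (`identity`, `greedy_otus`, `exact_variants`), the greedy centroid strategy, and the three 100-nucleotide reads are all assumptions made for illustration, and real pipelines use alignment-based identity rather than position-by-position comparison.

```python
from collections import Counter


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this toy comparison assumes equal-length reads")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(reads, threshold=0.97):
    """Toy greedy clustering: the most abundant sequences seed OTUs first."""
    otus = {}  # centroid sequence -> total reads assigned to that OTU
    for seq, n in Counter(reads).most_common():
        for centroid in otus:
            if identity(seq, centroid) >= threshold:
                otus[centroid] += n  # absorbed into an existing OTU
                break
        else:
            otus[seq] = n  # no centroid is similar enough: new OTU
    return otus


def exact_variants(reads):
    """Every distinct sequence is kept as its own unit."""
    return dict(Counter(reads))


# Hypothetical reads (assumed data, not from any real study).
base = "ACGT" * 25                      # 100 nt
variant = base[:50] + "A" + base[51:]   # one substitution, 99% identical
distant = "T" * 10 + base[10:]          # eight substitutions, 92% identical
reads = [base] * 60 + [variant] * 30 + [distant] * 10

otus = greedy_otus(reads)
variants = exact_variants(reads)
print(len(otus), "OTUs;", len(variants), "exact variants")
```

With these hypothetical reads, the single-nucleotide variant is merged into the same OTU as the most abundant sequence, whereas the exact-variant table keeps all three sequences separate.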

Benefits of OTUs
Although ASVs allow a more precise measurement of sequence variation, OTUs remain an acceptable and valuable approach. Glassman and Martiny tested the validity of OTUs in broad-scale diversity analyses and concluded that OTUs and ASVs produced similar ecological results, with ASVs enabling slightly stronger detection of fungal and bacterial diversity. Their study suggests that, although ASVs measure sequence diversity more precisely, well-constructed studies that used OTUs to demonstrate broad-scale diversity patterns remain valid.[3]
Benefits of ASVs
The introduction of ASV methods spurred a debate among researchers regarding their utility. Some have argued that ASVs should replace OTUs in marker gene analysis. Arguments in favor of ASVs focus on the precision, tractability, reproducibility, and comprehensiveness that they bring to marker gene analysis. Proponents point to finer sequence resolution (precision) and to the ease of comparing sequences between different studies (tractability and reproducibility). OTUs are operational units: their membership depends on the similarity threshold, the clustering method, and the dataset or reference database used, so they can change between researchers, experiments, and databases. ASVs, by contrast, are exact nucleotide sequences, so differences between past and present experiments can be traced more readily to biological differences than to differences in clustering. Because ASVs do not depend on a reference database or on researcher-specific clustering choices, they serve as consistent labels across datasets, and researchers can directly compare new results with their own data from years earlier. For the same reason, ASV tables capture sequence variation more precisely and comprehensively than OTU tables. Although OTUs remain valid for many purposes, proponents consider ASVs more precise, reusable, comprehensive, and reproducible for marker gene sequencing.[4] [5]
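The point about consistent labeling can be sketched in Python. Because an ASV is identified by its sequence, tables from separate studies can be combined by using the sequence itself as the key, with no re-clustering step. The helper `merge_by_sequence`, the study names, and the counts below are hypothetical and exist only for illustration.

```python
def merge_by_sequence(tables):
    """Combine per-study ASV tables, using the exact sequence as the shared label.

    `tables` maps a study name to a dict of {sequence: read count}.
    """
    merged = {}
    for study, table in tables.items():
        for seq, count in table.items():
            # Start every study at zero so absent sequences are explicit.
            row = merged.setdefault(seq, {name: 0 for name in tables})
            row[study] = count
    return merged


# Hypothetical ASV tables from two independent studies (assumed data).
tables = {
    "study_2019": {"ACGTACGTAC": 120, "ACGTACGTAT": 15},
    "study_2021": {"ACGTACGTAC": 80, "TTGTACGTAC": 40},
}

merged = merge_by_sequence(tables)
# Sequences observed in every study can be compared directly.
shared = [seq for seq, row in merged.items() if all(row.values())]
print(shared)
```

De novo OTU identifiers, by contrast, are assigned within a single clustering run, so the same identifier in two studies need not refer to the same group of sequences.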
ASV methods
Popular methods for resolving ASVs include DADA2,[6] Deblur,[7] MED,[8] and UNOISE.[9] These methods broadly work by generating an error model tailored to an individual sequencing run and then using that model to distinguish true biological sequences from those generated by error.
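The general idea of using an error model to separate true sequences from erroneous ones can be illustrated with a toy Python sketch. It is not the algorithm of DADA2, Deblur, MED, or UNOISE. It assumes a fixed, uniform per-base substitution rate (`error_rate`) instead of a model fitted to a sequencing run, equal-length reads, and a Poisson test on abundance; the function names, the threshold `alpha`, and the reads are likewise hypothetical.

```python
import math
from collections import Counter


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this toy model assumes equal-length reads")
    return sum(x != y for x, y in zip(a, b))


def poisson_tail(lam: float, c: int) -> float:
    """P(X >= c) for X ~ Poisson(lam), summed term by term."""
    term = math.exp(-lam)  # P(X = 0)
    below = 0.0
    for k in range(c):
        below += term
        term *= lam / (k + 1)  # P(X = k + 1) from P(X = k)
    return max(0.0, 1.0 - below)


def denoise(reads, error_rate=0.01, alpha=1e-3):
    """Keep a sequence only if it is too abundant to be explained by errors."""
    accepted = {}  # sequence -> reads attributed to it
    for seq, count in Counter(reads).most_common():
        if not accepted:
            accepted[seq] = count  # the most abundant sequence is trusted
            continue
        parent = min(accepted, key=lambda s: hamming(s, seq))
        d = hamming(parent, seq)
        # Expected number of parent reads miscopied into exactly this sequence.
        lam = (accepted[parent] * (error_rate / 3) ** d
               * (1 - error_rate) ** (len(seq) - d))
        if poisson_tail(lam, count) < alpha:
            accepted[seq] = count      # unlikely to be error: keep as a variant
        else:
            accepted[parent] += count  # consistent with error: fold into parent
    return accepted


# Hypothetical reads (assumed data).
base = "ACGT" * 25
true_variant = base[:10] + "T" + base[11:]  # one substitution, 50 reads
error_read = base[:70] + "C" + base[71:]    # one substitution, 2 reads
reads = [base] * 1000 + [true_variant] * 50 + [error_read] * 2

result = denoise(reads)
print(len(result), "sequences retained")
```

Under these assumptions, the variant seen 50 times is retained because that abundance is very unlikely to arise from errors alone, while the variant seen twice is folded into its more abundant neighbor. Both differ from the dominant sequence by a single nucleotide, so a similarity threshold could not tell them apart.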
References
- Porter, Teresita M.; Hajibabaei, Mehrdad (2018). "Scaling up: A guide to high-throughput genomic approaches for biodiversity analysis". Molecular Ecology. 27 (2): 313–338. doi:10.1111/mec.14478. ISSN 1365-294X. PMID 29292539.
- Callahan, Benjamin J.; McMurdie, Paul J.; Holmes, Susan P. (December 2017). "Exact sequence variants should replace operational taxonomic units in marker-gene data analysis". The ISME Journal. 11 (12): 2639–2643. doi:10.1038/ismej.2017.119. ISSN 1751-7370.
- Glassman, Sydney I.; Martiny, Jennifer B. H. (29 August 2018). "Broadscale Ecological Patterns Are Robust to Use of Exact Sequence Variants versus Operational Taxonomic Units". mSphere. 3 (4). doi:10.1128/mSphere.00148-18. ISSN 2379-5042.
- Callahan, Benjamin J; McMurdie, Paul J; Holmes, Susan P (2017-07-21). "Exact sequence variants should replace operational taxonomic units in marker gene data analysis". The ISME Journal. 11 (12): 2639–2643. doi:10.1038/ismej.2017.119. PMC 5702726.
- Callahan, Benjamin J.; McMurdie, Paul J.; Holmes, Susan P. (December 2017). "Exact sequence variants should replace operational taxonomic units in marker-gene data analysis". The ISME Journal. 11 (12): 2639–2643. doi:10.1038/ismej.2017.119. ISSN 1751-7370.
- Callahan, Benjamin J; McMurdie, Paul J; Rosen, Michael J; Han, Andrew W; Johnson, Amy J; Holmes, Susan P (2015-08-06). "DADA2: High resolution sample inference from amplicon data". doi:10.1101/024034.
- Amir, Amnon; McDonald, Daniel; Navas-Molina, Jose A.; Kopylova, Evguenia; Morton, James T.; Zech Xu, Zhenjiang; Kightley, Eric P.; Thompson, Luke R.; Hyde, Embriette R. (2017-04-25). Gilbert, Jack A. (ed.). "Deblur Rapidly Resolves Single-Nucleotide Community Sequence Patterns". mSystems. 2 (2). doi:10.1128/mSystems.00191-16. ISSN 2379-5077. PMC 5340863. PMID 28289731.
- Eren, A Murat; Morrison, Hilary G; Lescault, Pamela J; Reveillaud, Julie; Vineis, Joseph H; Sogin, Mitchell L (2014-10-17). "Minimum entropy decomposition: Unsupervised oligotyping for sensitive partitioning of high-throughput marker gene sequences". The ISME Journal. 9 (4): 968–979. doi:10.1038/ismej.2014.195. ISSN 1751-7362. PMC 4817710. PMID 25325381.
- Edgar, Robert C (2016-10-15). "UNOISE2: improved error-correction for Illumina 16S and ITS amplicon sequencing". doi:10.1101/081257.