TY - JOUR
T1 - Advancing yeast identification using high-throughput DNA barcode data from a curated culture collection
AU - Vu, Duong
AU - de Vries, Michel
AU - Gerrits van den Ende, Bert
AU - Houbraken, Jos
AU - Nilsson, R. Henrik
AU - Brankovics, Balazs
AU - Hernandez Restrepo, Margarita
AU - Groenewald, Johannes Z.
AU - Crous, Pedro W.
AU - Hagen, Ferry
AU - Meyer, Wieland
AU - Verkley, Gerard J.M.
AU - Groenewald, Marizeth
PY - 2026/1
Y1 - 2026/1
AB - Yeast identification is essential in fields ranging from microbiology and biotechnology to food science and medicine. While DNA barcoding has become the standard for identifying cultured strains, environmental DNA (eDNA) metabarcoding has revolutionised microbial community profiling, providing deeper insights into yeast communities across diverse ecosystems. A major challenge in DNA (meta)barcoding remains the limited availability of high-quality reference sequences, which are critical for accurate species identification and comprehensive taxonomic profiling of both environmental and clinical samples. To address this gap, the Westerdijk Fungal Biodiversity Institute (WI) launched a DNA barcoding initiative in 2006 to generate high-quality, often type-derived ITS and LSU barcodes for all ~100,000 fungal strains preserved in the CBS culture collection, including approximately 15,000 yeasts. Building on the yeast barcode dataset released in 2016, we now present an expanded set of 2856 ITS and 3815 LSU sequences, representing 911 and 1137 yeast species, respectively. Notably, 27%-29% of these sequences are derived from ex-type cultures. Using both newly generated and previously published barcodes, we assess the taxonomic resolution of commonly used yeast metabarcoding markers (ITS, ITS1, ITS2 and LSU) and propose marker-specific similarity cutoffs for different yeast taxonomic groups. These results provide actionable guidance for marker selection and improve the interpretation of metabarcoding data. We further demonstrate the impact of well-curated reference databases with up-to-date taxonomy by reanalyzing Human Microbiome Project data, revealing how diet and environment shape the gut mycobiota.
KW - DNA barcoding
KW - eDNA metabarcoding
KW - similarity cutoff
KW - yeast identification
U2 - 10.1111/1755-0998.70082
DO - 10.1111/1755-0998.70082
M3 - Article
SN - 1755-098X
VL - 26
JO - Molecular Ecology Resources
JF - Molecular Ecology Resources
M1 - e70082
ER -