Abstract
Introduction: Children with early childhood caries (ECC) show different caries severities and different susceptibility across tooth types and locations in the oral cavity. The study aimed to investigate differences in the oral microbiome of ECC subjects stratified by caries severity, and between more and less caries-prone teeth within the same subjects. Methods: Supragingival plaque was collected from the upper and lower anterior regions of the oral cavity of subjects in 3 groups of increasing caries severity: G1 – molar (M) caries only; G2 – molar and upper anterior (UA) caries; and G3 – M + UA + lower anterior (LA) caries, followed by microbiome analysis. Results: Alpha-diversity analyses showed inter- but no intra-individual statistically significant differences between the UA and LA (p < 0.001, LA > UA) and a significant difference among the microbiomes of the three caries groups (p < 0.001). Alpha diversity differed significantly between G1 and G2 (p < 0.05), and beta-diversity analysis showed significant differences in composition and diversity among the three groups (p < 0.001). Actinomyces, Saccharibacteria_genera_incertae_sedis, and Eikenella had increased differential abundance in G1 versus G3, and Fusobacterium was less abundant in G2 compared to the other groups. Conclusions: There were clear differences in tooth-site-specific and caries-severity-specific microbiome diversity patterns and bacterial abundance profiles in S-ECC children.
Introduction
Dental caries is a highly prevalent biofilm-mediated and diet-modulated disease affecting approximately 600 million children globally [1]. Early childhood caries (ECC) is defined as the presence of one or more decayed (non-cavitated/cavitated lesions), missing (due to caries), or filled tooth surfaces in any primary tooth in a child 71 months of age or younger. Caries can affect multiple primary teeth in a condition known as severe early childhood caries (S-ECC) [2]. Interestingly, different teeth in ECC children have different caries susceptibility: a declining caries susceptibility from maxillary anterior to mandibular anterior teeth has been observed [3, 4]. Some reasons cited for the reduced susceptibility of lower anterior (LA) teeth include protective tongue covering, the location and position of the salivary glands, and salivary flow rate [5]. Additionally, caries susceptibility varies between tooth surfaces [6]. The proximal surfaces, compared to the occlusal surfaces, have a greater rate of caries progression, likely owing to increased plaque retention [7]. Any microbial associations that may account for these site differences, however, have not yet been explored.
The tooth-associated microbiome develops in stages, defining either a healthy (caries-free) or diseased (caries-active) clinical status of the tooth. In healthy subjects, supra- and subgingival plaques resemble one another, exhibit a balanced pH and a flora suitable for a favorable demineralization-remineralization dynamic, and consist predominantly of members of the Firmicutes, Proteobacteria, Bacteroidetes, and Fusobacteria phyla. In contrast, caries is characterized by a dysbiosis of acidogenic and acid-tolerant bacteria that, in the presence of a sugary diet, produce and release a wide range of acidic end products, such as lactic acid. Caries is thus not the outcome of a single bacterial species but, rather, of an increased acid challenge to the tooth biofilm that favors the succession of aciduric bacteria while inhibiting beneficial organisms.
The diversity in the microbiome differs within the oral cavity and at different stages of the caries process [8]. Differences in microbiotas have been demonstrated based on tooth site, extent of carious lesions, and rate of disease progression. For example, the bacterial compositions in enamel and dentin caries differ significantly. The dysbiotic microbiome of enamel caries, although less diverse than the caries-free subjects, is enriched in highly acidogenic bacteria. In contrast, dentin caries generally possesses a more diverse microbiota with moderately acidogenic organisms and an increase in bacterial taxa with proteolytic activity during caries progression into dentin.
A major goal of caries microbiome research from a clinical perspective is to re-establish eubiosis in the plaque to improve health and prevent tooth decay. A recent meta-analysis of caries microbiome studies concluded that diversity, richness, and abundance differed significantly between health and caries [9]. However, the authors also noted the wide variation in the metrics used and the lack of uniformity in methods across studies in the field. For example, studies differed in the severity of the caries phenotype of their subjects, the use of saliva versus plaque sampling, longitudinal versus cross-sectional design, and the chronological and dental ages of subjects [9].
In the present study, we explored two specific questions related to the plaque microbiome of S-ECC children. Were there differences in the plaque microbiome of children’s teeth that are more prone to caries compared to less caries-prone teeth? Were there differences in the microbiome of S-ECC children with different severities of caries attack?
Methods
Subjects and Sample Collection
For this cross-sectional study, S-ECC subjects were recruited from the SurgiCentre, Faculty of Dentistry, University of Toronto (University of Toronto Health Sciences Research Ethics Board, Human Protocol Number #39516). Inclusion criteria were an age between 46 and 60 months and the presence of one or more cavitated, missing (due to caries), or filled tooth surfaces to fit the definition of S-ECC [10]. Exclusion criteria were any visible permanent teeth in the oral cavity, fewer than 18 teeth, systemic conditions, use of antibiotics in the last 6 months, use of probiotics, synbiotics, or fluorides in the last 3 months, or apparent viral or bacterial infections. Prior to plaque sample collection, written informed consent was obtained from the parent/guardian of the child after verbal review of the rationale, objectives, and protocol. The participants were then assigned, and subsequently identified by, a unique 4-digit identifier. The personal, medical, and dental information collected was analyzed in accordance with the University of Toronto, Faculty of Dentistry’s Privacy Policy.
Bitewing, periapical, and occlusal radiographs of each subject were reviewed by two examiners (J.S. and S.M.H.N.) and only those subjects accepted by both examiners were included in the study. Subjects were divided into three groups in order of increasing caries severity: G1 (molar caries), G2 (caries in upper anterior [UA] and molar), and G3 (caries in upper/lower anterior and molar). G1 was further divided into two sub-groups based on the extent of caries and surfaces involved: G1-P with molar proximal caries and G1-O with molar occlusal caries.
Collection of Plaque Samples and DNA Processing
Supragingival plaque samples were harvested from the labial surfaces of the 4 upper and 4 lower anterior teeth, suspended in sterile saline, and transferred to a PowerBead (Qiagen) tube for mechanical bacterial lysis. Plaque samples from the upper and lower incisors of each subject were kept separate in subsequent experimental manipulations and analyses; the total number of plaque samples was therefore 19 subjects × 2 = 38 samples. The supernatant was next treated with Inhibitor Removal Technology reagent and eluted through an MB spin column. Total genomic DNA was extracted using the DNeasy Power Soil Pro Kit (Qiagen) according to the manufacturer’s recommendations and quantitated by spectrophotometry (NanoDrop).
Microbiome Analysis
Sequencing and microbiome analyses were performed at the Centre for the Analysis of Genome Evolution and Function (CAGEF), University of Toronto. Illumina sequencing was performed on the V4 hypervariable region of the 16S rRNA gene. Details of the statistical analyses are found in the online supplemental material (for all online suppl. material, see https://doi.org/10.1159/000543421).
Results
Study Population
A total of 19 subjects between the ages of 48 and 60 months were recruited and categorized into three groups (G1, G2, G3) based on radiographic and clinical determination of caries severity (representative radiographs of each group in Fig. 1A). The mildest of the three groups, G1, had a dmft of 6.2 ± 1.4 with caries only in the M teeth, and was further subdivided into two sub-groups based on caries location: interproximal in G1-P (dmft scores of 6.0 ± 2.0) and occlusal in G1-O (dmft scores of 7.0 ± 1.1). G2 had dmft scores of 12.0 ± 1.6, with caries in the UA and M teeth. The most severe group, G3, had dmft scores of 16.0 ± 3.0 with caries in UA, LA, and M teeth (Fig. 1B). There were about twice as many males as females in all groups, with a total of 13 males and 6 females in the study.
Clinical radiographs of study subjects and caries phenotype presentation. A Representative radiographs (a, b, c Periapical images. d, e, f Occlusal radiographs. g Posterior bitewings) of G1 (G1-O, G1-P), G2, and G3. Red arrows indicate some of the caries identified on the radiographs. B Left: demographic information of subjects in study; middle: box plot representing the age distribution of the subjects in the study; right: box plot representing the dmft distribution of the subjects in the study.
A total of 287 operational taxonomic units (OTUs) were obtained and subjected to prevalence and abundance analyses after filtering out 80 OTUs in the lowest quartile (25%). For alpha diversity analysis, the data were rarefied to 13,057 sequences per sample, matching the lowest abundance sample. No OTUs were lost from the rarefaction procedure.
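The rarefaction step described above can be illustrated with a minimal sketch. It assumes subsampling without replacement, which the text does not specify; the function name `rarefy`, the seed, and the toy counts are hypothetical and are not taken from the study pipeline.

```python
import random
from collections import Counter

def rarefy(counts, depth, seed=0):
    """Subsample a list of per-OTU read counts to `depth` reads without replacement."""
    total = sum(counts)
    if total < depth:
        raise ValueError("sample has fewer reads than the rarefaction depth")
    # One list entry per read, labelled by the index of its OTU.
    reads = [otu for otu, n in enumerate(counts) for _ in range(n)]
    picked = Counter(random.Random(seed).sample(reads, depth))
    return [picked.get(otu, 0) for otu in range(len(counts))]

# Toy example with invented counts (100 reads across 4 OTUs).
toy = [50, 30, 15, 5]
rarefied = rarefy(toy, depth=40)
```

In the study the depth was 13,057 reads, the size of the smallest sample, so that sample keeps every read and all others are subsampled down to it.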
The most abundant phyla were Firmicutes, Actinobacteria, Proteobacteria, Fusobacteria, and Bacteroidetes, which altogether comprised more than 99% in each group (Fig. 2a, top). Firmicutes dominated in G1 and G3, representing 42.8% and 37.4%, respectively, of total abundance across all samples in the groups, followed by Actinobacteria at 26.3% and 24.0% in G1 and G3, respectively. Actinobacteria predominated in G2 at 57.3% of total abundance, followed by Firmicutes at 22.5%. Proteobacteria, Fusobacteria, and Bacteroidetes together formed 30.7%, 19.8%, and 38.4% of the total phyla observed in G1, G2, and G3, respectively.
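As an illustration of how group-level percentages of this kind can be computed, the sketch below pools reads across all samples in a group before dividing by the group total. Pooling (rather than averaging per-sample proportions) is an assumption based on the phrase "total abundance across all samples in the groups"; the function name and the read counts are invented.

```python
def pooled_relative_abundance(samples):
    """Percent of all reads in a group assigned to each taxon, pooling every sample."""
    totals = {}
    for sample in samples:
        for taxon, reads in sample.items():
            totals[taxon] = totals.get(taxon, 0) + reads
    grand_total = sum(totals.values())
    return {taxon: 100.0 * reads / grand_total for taxon, reads in totals.items()}

# Invented read counts for two samples in one hypothetical group.
group = [
    {"Firmicutes": 400, "Actinobacteria": 300, "Proteobacteria": 300},
    {"Firmicutes": 100, "Actinobacteria": 700, "Proteobacteria": 200},
]
percentages = pooled_relative_abundance(group)
```

With pooling, samples with more reads weigh more heavily; after rarefaction to a common depth the pooled and per-sample-averaged values coincide.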
a Relative abundance at the phylum level (top) and genus level (bottom). b Genus level of G1, G2, and G3 categorized at three levels of abundance: high = >30%, moderate = ∼5 to 15%, and low = 1–3% shown graphically (top) and in tabular (bottom) form with specific genera and the percentages of relative abundance of each genus within each group.
We observed eight dominant genera present at ≥1% in all three groups (Fig. 2a, bottom). To examine the differences and similarities of genera in G1, G2, and G3, we divided the genera into three categories: highly abundant (>30%), moderately abundant (∼5–15%), and low-abundance (∼1–3%) genera (Fig. 2b). Streptococcus spp. and Actinomyces spp. represented the genera of high abundance in G1 and G2, corresponding to ∼45% of the total biofilm biomass. Interestingly, no genera with relative abundance higher than 20% were detected in G3, and this group displayed greater variability compared with G1 and G2. Among the moderately abundant genera, the differences in Leptotrichia spp., Veillonella spp., and Corynebacterium spp. between G1 and G2 were small in each case but potentially biologically significant given their high abundance in the plaque biofilm. Among the low-abundance genera, G1 and G2 included Capnocytophaga spp., Fusobacterium spp., and Prevotella spp. at similar abundance. In G3, however, Capnocytophaga spp. and Prevotella spp. were found at higher relative abundance.
Microbiome Diversity
Analyses of the plaque microbial diversity within the UA and LA regions of each subject revealed no statistically significant differences in the alpha diversity within the same subject. In contrast, the alpha diversity of the LA and UA was significantly different between individuals, irrespective of their caries type/grouping (observed diversity p = 0.00004; Chao1 p = 0.00003), with a greater diversity in the LA compared to the UA. Significant differences were found in the alpha diversity between the microbiomes of the three caries groups (observed diversity p = 0.0008; Chao1 p = 0.0018). The highest diversity overall was found in the LA region of G1 and the lowest in the UA region of G2.
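For readers unfamiliar with the two alpha-diversity metrics reported above, the following sketch computes observed richness and Chao1 from a vector of OTU counts. It uses the bias-corrected Chao1 form, which is an assumption: the text does not state which variant the analysis pipeline applied. The counts are invented.

```python
def observed_richness(counts):
    """Number of OTUs with at least one read."""
    return sum(1 for n in counts if n > 0)

def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 and F2 are the numbers of OTUs seen exactly once and exactly twice.
    """
    f1 = sum(1 for n in counts if n == 1)
    f2 = sum(1 for n in counts if n == 2)
    return observed_richness(counts) + f1 * (f1 - 1) / (2 * (f2 + 1))

# Invented counts: 6 observed OTUs, 2 singletons, 2 doubletons.
sample = [10, 1, 1, 2, 2, 0, 5]
richness = observed_richness(sample)
estimate = chao1(sample)
```

Chao1 adds to the observed richness an estimate of the OTUs that were present but unsampled, so it is never lower than the observed value.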
Pairwise comparisons of alpha diversity among the three groups showed significant differences between G1 and G2 (observed diversity p < 0.05; Chao1 p < 0.05; Fig. 3a). Principal coordinate analysis of beta diversity showed similar distributions of the three groups, with significant differences in their centroids (Fig. 3b). A PERMANOVA analysis further confirmed that the composition and diversity of the microbiome differed significantly among the three groups (p = 0.001).
a Alpha diversity boxplots for observed diversity among G1, G2, G3, showing statistically significant differences between G1 and G2 (*p < 0.05). b Beta diversity – principal coordinate analysis (PCoA) plot of Bray-Curtis dissimilarity for the three caries groups. Each colored circle represents one sample: brown for G1, blue for G2, and green for G3, with the larger black-circled color dots representing the arbitrary centroids of each group.
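The Bray-Curtis dissimilarity underlying the PCoA plot can be sketched as follows. This is only the pairwise distance between two samples, not the ordination or the PERMANOVA test; the function name and the count vectors are invented for illustration.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two equally long OTU count vectors.

    0 means identical composition; 1 means no shared OTUs.
    """
    if len(a) != len(b):
        raise ValueError("count vectors must have the same length")
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("both samples are empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / total

# Invented count vectors over the same three OTUs.
d_same = bray_curtis([5, 3, 2], [5, 3, 2])
d_disjoint = bray_curtis([4, 0, 0], [0, 2, 2])
d_partial = bray_curtis([1, 2, 3], [3, 2, 1])
```

Computing this value for every pair of samples yields the distance matrix on which a PCoA and a permutation test such as PERMANOVA operate.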
Microbiome Differential Abundance between Groups
Comparisons were next made to determine the differential abundance of the OTUs in the three groups using DESeq2 analysis. Within the same subject, the UA and LA regions showed no differentially abundant OTUs. In contrast, statistically significant differences in the abundance of several OTUs were found in the two-group comparisons (Fig. 4). A total of 8 genera were identified with significantly different abundance between groups (Fig. 4a). The most compelling observation when comparing G1 to G3 and G2 to G3 was that nearly two-thirds of the observed genera were overrepresented in G3. Treponema spp., Streptococcus spp., Veillonella spp., and Neisseria spp. exhibited higher levels in G3. Conversely, Saccharibacteria_genera_incertae_sedis and Eikenella spp. were decreased in G3. Interestingly, Saccharibacteria_genera_incertae_sedis was overrepresented in G1 and G2 and was the only genus enriched in G2. When comparing G1 to the other two groups, Actinomyces spp. was the only genus showing a notable increase in abundance in G1. Fusobacterium spp. and Neisseria spp. were also found to be overrepresented in G1, but only when compared to G2.
Differential abundance (DESeq2) of significantly (p adj <0.01) differentially abundant (|LFC| >1) OTUs among G1–G3. a Two-group comparisons of G1 versus G2 (light blue bars), G1 versus G3 (yellow bars), and G2 versus G3 (brown bars). Horizontal bars to the right of 0 log2 fold change show positive log fold changes of the first relative to the second group and bars to the left showing higher values in the second relative to the first groups. The vertical axis represents the genera. b Three-way comparison of differentially abundant specific OTUs representing the indicated genera. c Comparison of log fold changes between G1-O and G1-P, where all OTUs were more highly abundant in the G1-O compared to G1-P.
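The thresholds stated in the Fig. 4 caption (adjusted p < 0.01 and |LFC| > 1) amount to a simple filter over a DESeq2-style results table. The sketch below assumes each row carries the `log2FoldChange` and `padj` fields that DESeq2 reports; the OTU identifiers and values are invented, and the function name is hypothetical.

```python
def significant_otus(results, padj_cutoff=0.01, lfc_cutoff=1.0):
    """Keep OTUs passing both the adjusted p-value and log2 fold change cutoffs."""
    hits = []
    for row in results:
        padj = row["padj"]
        if padj is None:  # DESeq2 can report NA for the adjusted p-value
            continue
        lfc = row["log2FoldChange"]
        if padj < padj_cutoff and abs(lfc) > lfc_cutoff:
            # Positive LFC: higher in the first group of the comparison.
            direction = "higher in first group" if lfc > 0 else "higher in second group"
            hits.append((row["otu"], direction))
    return hits

# Invented rows for a hypothetical G1 versus G3 comparison.
rows = [
    {"otu": "OTU_a", "log2FoldChange": 2.4, "padj": 0.0004},
    {"otu": "OTU_b", "log2FoldChange": -3.1, "padj": 0.002},
    {"otu": "OTU_c", "log2FoldChange": 0.6, "padj": 0.0001},
    {"otu": "OTU_d", "log2FoldChange": 1.8, "padj": 0.04},
    {"otu": "OTU_e", "log2FoldChange": 5.0, "padj": None},
]
hits = significant_otus(rows)
```

Here `OTU_c` fails the effect-size cutoff and `OTU_d` fails the significance cutoff, which mirrors why a genus can differ visibly between groups in Fig. 2 yet be absent from Fig. 4.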
Collectively, group-specific differential abundance levels of some genera were observed when comparing the three groups together (Fig. 4b). Streptococcus spp. and Veillonella spp. showed higher levels in G3 compared to both G1 and G2, whereas Saccharibacteria_genera_incertae_sedis was lowest in G3 relative to the other two groups. Veillonella spp. increased from G2 to G3, Actinomyces spp. appeared to be highest in G1, and Fusobacterium spp. was lowest in G2 relative to the other groups.
Five genera, represented by 9 OTUs, showed statistically significant higher log fold differences in differential abundance in the occlusal caries (G1-O) versus proximal caries (G1-P) subgroups of G1 (Fig. 4c). The OTUs with increased differential abundance in G1-O versus G1-P included 3 different OTUs of Saccharibacteria_genera_incertae_sedis; 2 OTUs of Actinomyces spp.; 2 OTUs of Leptotrichia spp.; and 1 OTU each of Fusobacterium spp. and Neisseria spp.
Discussion
Early caries can have long-lasting effects on oral health, impacting both primary and permanent dentitions. Traditionally, Streptococcus mutans was considered the primary cause of dental caries. However, next-generation sequencing and innovative imaging tools have revealed a more complex picture. In the current study, the segregation of subjects into three separate and distinct categories of caries severity in S-ECC allowed us to identify severity-specific microbiome differences that would otherwise be missed when subjects are treated homogeneously as one group. That is, careful attention was given to ensuring the recruitment of distinct groups of S-ECC children representing three sharply delineated and well-defined severities of caries profile. Thus, the study differs notably from the existing scientific literature, almost all of which did not differentiate between ECC severities. The sample size in the study could arguably be increased (i.e., >19 subjects). However, this would have significantly lengthened the recruitment period, and hence the overall study, since very distinct inclusion criteria for each group were implemented with regard to clinical presentation and chronological and dental ages. For example, the recruitment of the 19 subjects in the study took about a year and a half, even at a busy clinical center focusing on care of ECC patients. Importantly and interestingly, with the sample size of 19, our study yielded clear and statistically significant differences among the three stratified groups in the diversities and differential abundance of taxa. The current data overall may offer insights into documented differences between different ECC microbiome studies. For example, we observed that Neisseria spp. was overrepresented in G1 and G3 but strongly decreased in G2. In contrast, a previous study found decreased Neisseria spp. levels in their ECC group, which upon closer examination had a dmft of 4–5 [11], a group that was more similar to G1 and G2 in the current study. Therefore, stratification of caries according to severity can affect the final microbiome data, and its absence can result in interpretations that cannot be compared between studies. Our findings underscore the dynamic nature of the oral microbiome and its potential impact on caries susceptibility.
Another strength of the study was the use of supragingival plaque samples from specific areas of the oral cavity, e.g., upper versus lower anterior arches, to identify microbial differences between different sites within the same individual. The lack of difference in the relative or differential abundance of bacterial groups between the UA and LA within the individual indicates a lack of microbial site specificity between a caries-susceptible site and a less susceptible site in the same individual. This reinforces the hypothesis that physiological and metabolic factors such as the protective position of the tongue, proximity to salivary glands, and salivary flow rate may play a major role in decreasing the susceptibility of the LA teeth to caries. Our finding of the uniqueness of an individual’s oral microbiome is consistent with other studies showing that the plaque microbiome is highly individual-specific, like “fingerprints” [12], and stable for a period of time. The greater microbial diversity of the supragingival plaque in the LA compared with the UA across subjects was expected and is in line with the prevailing view that carious regions have a less diverse microbiome than caries-free regions [13]. Interestingly, we showed that the alpha diversity decreased only from G1 to G2, and not to G3, suggesting a microbial shift from caries formation in the molars (G1) to subsequent caries in the UA (G2). This finding may suggest that the microbiome subsequently stabilizes, with the diversity in G3 matching that of G1. It may also suggest that the progression of lesions to less caries-prone sites (G3) is accompanied by a more diverse microbiota including more proteolytic bacteria such as Treponema spp. and Fusobacterium spp., as observed in this study. In that case, caries progression may include additional mechanisms besides acid stress, highlighting the multifactorial nature of dental caries.
The bacterial phyla represented in the three subject groups appeared representative of those identified in the dental plaque community of S-ECC children. Due to the careful stratification of the severity of caries in our S-ECC subjects, a more detailed distinction of the bacterial groups that predominate in early to later stages of caries severity was obtained. For example, Fusobacterium spp. was decreased in G2, and Actinomyces, Saccharibacteria_genera_incertae_sedis, and Eikenella had increased differential abundance in G1. In G3, Neisseria, Streptococcus, Treponema, and Veillonella were more abundant and Saccharibacteria_genera_incertae_sedis was less abundant (summarized in Fig. 5). Caries is a biofilm-induced acid demineralization process. Thus, it is not surprising to find the Streptococcus-Veillonella association at the highest level in the most severe group [14]. It is likely that not just one or two bacterial genera but groups of genera could be used as predictors of caries progression. Indeed, attempts to define a core microbiome associated with caries are challenging, as caries microbiomes are more variable in community structure than the healthy microbiome. Nevertheless, we observed a significant shift in both the composition and genus abundance of the plaque microbiome for all caries groups. For instance, the increased abundance of Neisseria spp. in G3 fits with what has been documented in other studies of S-ECC subjects. Members of the Neisseria genus are known as late colonizers that proliferate with progression of caries [15]. Treponema was also identified among the dominant genera in G3. Treponema spp. have shown higher frequencies in periodontal pockets, in agreement with their known role in periodontal diseases. Interestingly, there is evidence suggesting a positive correlation between dental caries and periodontitis [16]. It has been suggested that patients with three or more untreated caries lesions were more likely to develop periodontal disease.
Summary of the differentially abundant genera in each group. Diagrammatic representation of the clinical caries phenotype of the 20 primary teeth of subjects in G1, G2, and G3. Black areas drawn on teeth represent caries in primary molars only (G1), caries in molars and upper incisors (G2), and caries in molars and upper and lower incisors (G3). Listed under each group are the genera, as represented by specific OTUs, that are more (in black) or less (in red) abundant in each group.
It was interesting to note the high differential abundance of Saccharibacteria_genera_incertae_sedis in G1-O compared to G1-P. Saccharibacteria_genera_incertae_sedis belongs to the Saccharibacteria phylum (formerly known as TM7), a group of ultrasmall bacteria that is particularly prevalent in the oral cavity. Saccharibacteria have recently garnered interest for their role in caries. A recent study in a South China population found that the Saccharibacteria phylum was associated with ECC, regardless of age [17]. Some Saccharibacteria strains have been shown to be capable of using the arginine deiminase system, which allows them to maintain higher viability and infectivity [18]. It is therefore possible that in G1, and within G1-O, different species of Saccharibacteria_genera_incertae_sedis interact with Actinomyces spp., which is also highly abundant in these groups, in a host-parasitic relationship [19]. Actinomyces spp. has been shown to be one of the early colonizers of the dental biofilm [15]. Actinomyces spp. have been detected at high levels in white spot lesions and may be involved in caries initiation [20]. Interestingly, we found a dominance of anaerobes in subjects with occlusal molar caries as compared to proximal surface caries. Leptotrichia could be one of the key pathogens in occlusal caries because of its reported ability to metabolize sucrose and create an acidic niche, even in the absence of S. mutans, thereby inviting other acid-tolerant bacteria [21], such as Neisseria spp. This may then result in increased acidic conditions within the retentive occlusal pits and fissures as opposed to the proximal sites, which are more protected from masticatory movements and from the flushing action of saliva and tongue.
Altogether, the findings strongly support the need for stratification of caries severity in any microbiome study of S-ECC children. We propose that there are some “core genera” significantly associated with clinically measured childhood caries experience. The core genera may play a fundamental role in shaping the overall microbiome function. In addition, longitudinal studies could be conducted to test whether some of the bacterial groups identified in the current study can serve as markers for the prediction of caries progression. Our study thus provides a framework for subsequent studies examining the microbiome with regard to site specificity and caries severity.
Acknowledgments
The authors wish to acknowledge the contributions of Donny Chan from University of Toronto Centre for the Analysis of Genome Evolution and Function in the bioinformatic analyses of the study.
Statement of Ethics
This study was performed in line with the principles of the Declaration of Helsinki. Approval was granted by the research Ethics Committee of the University of Toronto Health Sciences Research Ethics Board, Human Protocol Number #39516. Informed written consent was obtained from the guardians/parents of all individual participants included in the study.
Conflict of Interest Statement
The authors have no conflicts of interest to declare.
Funding Sources
This research was financially supported by Align Technology 2020 Align Research Award (Gong) and Canadian Institutes of Health Research PJT-183893 (Gong). Neither funder had any role in the design, data collection, data analysis, and reporting of this study.
Author Contributions
S.-G.G. contributed to conception, design, and data analysis and interpretation and drafted and critically revised the manuscript. J.S. contributed to conception and data acquisition and interpretation and drafted the manuscript. S.M.H.N. contributed to conception, design, and data acquisition, analysis, and interpretation and critically reviewed the manuscript. C.M.L. contributed to conception and design and data analysis and interpretation and critically reviewed the manuscript. All authors gave their final approval and agreed to be accountable for all aspects of the work.
Data Availability Statement
The data that support the findings of this study have been deposited with links to BioProject accession number PRJNA1194511 in the NCBI BioProject database (https://www.ncbi.nlm.nih.gov/bioproject/).