Introduction

Metagenomic analysis was conducted on our oral biofilm model to assess the viability of the model as a proxy for dental calculus.

Materials

A total of 35 biofilm model samples were taken over the course of the experiment, including the saliva inoculum, media from the wells, and the biofilm end-product. DNA extraction was performed at the archaeogenetic facility at the Max Planck Institute for the Science of Human History (Jena, Germany).

Extractions were performed in duplicate using the DNeasy PowerSoil Kit (QIAGEN). The C2 inhibitor removal step was skipped, going directly to the C3 step.

DNA extracts were paired-end sequenced on a NextSeq (2-color chemistry) to 150 bp.

Comparative samples

Methods

Preprocessing

Processing of the raw DNA reads was conducted using the nf-core/eager pipeline, v2.4.4 [@yatesEAGER2020]. Adapter removal and read merging were performed using AdapterRemoval, v2.3.2 [@AdapterRemovalv2]. Merged reads were mapped to the human reference genome, GRCh38, using BWA, v0.7.17-r1188 [@BWA] with default settings (-n 0.01; -l 32), and unmapped reads were extracted using Samtools, v1.12.

Metagenomic classification was conducted using Kraken 2, v2.1.2, with the Standard database (https://benlangmead.github.io/aws-indexes/k2).

Kraken output reports were combined and converted to OTU tables, and only species-level assignments were selected for downstream analysis (see 02-scripts/00-comb-kraken-reports.R). The OTU tables were further filtered by removing species with a relative abundance lower than 0.001%.
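The abundance filter can be illustrated with a minimal Python sketch. It is not the code in 02-scripts/00-comb-kraken-reports.R; the toy table, the function name `filter_low_abundance`, and the choice to keep a species if it reaches the threshold in at least one sample are all assumptions made for illustration.

```python
# Hypothetical toy OTU table: species -> read counts per sample (two samples).
otu_table = {
    "Streptococcus oralis": [700000, 500000],
    "Rothia dentocariosa": [300000, 500000],
    "Rare species": [1, 0],
}

THRESHOLD_PCT = 0.001  # relative abundance threshold, in percent


def filter_low_abundance(table, threshold_pct):
    """Drop species whose relative abundance is below threshold_pct in every sample.

    Assumption: relative abundance is computed per sample, and a species is kept
    if it reaches the threshold in at least one sample.
    """
    if not table:
        return {}
    n_samples = len(next(iter(table.values())))
    # Total read count per sample, used as the denominator.
    totals = [sum(counts[i] for counts in table.values()) for i in range(n_samples)]
    kept = {}
    for species, counts in table.items():
        rel_pct = [100 * c / t if t > 0 else 0.0 for c, t in zip(counts, totals)]
        if max(rel_pct) >= threshold_pct:
            kept[species] = counts
    return kept


filtered = filter_low_abundance(otu_table, THRESHOLD_PCT)
```

Whether the threshold is applied per sample or to abundance pooled across samples changes which species survive, so the rule should match the one used in the original script.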

Authentication

SourceTracker [@knightsSourceTracker2011] was used to estimate source composition of the oral biofilm model samples using a Bayesian framework. Samples were compared with oral and environmental controls to detect potential external contamination.

Potential contaminants were identified using the frequency and prevalence methods in the R package decontam v1.16.0 [@R-decontam] (see 02-scripts/02-authentication.R). Samples from indoor_air, skin, and sediment were used as negative controls for the prevalence method, with a probability threshold of 0.01. DNA concentrations, along with the negative controls, were used for the frequency method, with a probability threshold of 0.99.

Putative contaminants were filtered out of the OTU tables for all downstream analyses.

See 06-reports/metagen-authentication.Rmd for more details.

Community composition

Genus- and species-level OTU tables were prepared from the decontaminated OTU table, and relative abundance of species in a sample was calculated as recommended for compositional data [@gloorMicrobiomeDatasets2017].

Information on the oxygen tolerance of bacterial species was downloaded from BacDive on 2022-05-27 and 2022-06-29. A list of amylase-binding streptococci (ABS) was created based on [@nikitkovaStarchBiofilms2013].

Alpha-diversity, specifically the Shannon Index, was calculated to compare species richness and diversity across experimental and comparative oral samples. The Shannon Index was calculated using the R package vegan v2.6.2 [@Rvegan].
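For reference, the Shannon Index of a sample is H = -Σ p_i ln(p_i), where p_i is the proportion of species i. The following Python sketch shows the calculation with the natural logarithm; it is an illustration only, not the vegan code used in the analysis, and the function name `shannon_index` is hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon Index H = -sum(p_i * ln(p_i)) over species with non-zero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    # Species with zero counts contribute nothing to the sum.
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Four equally abundant species give the maximum for four species, ln(4).
even_sample = shannon_index([10, 10, 10, 10])
# A single species gives no diversity.
single_species = shannon_index([5, 0, 0])
```

The index rises with both the number of species and the evenness of their abundances, which is why it is read here as a combined measure of richness and diversity.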

Sparse principal components analysis (sPCA) was conducted on centered-log-ratio (CLR) transformed species-level counts using the R package mixOmics v6.20.0 [@RmixOmics]. The mixOmics implementation of sPCA uses LASSO penalisation to eliminate unimportant variables. Two separate analyses were conducted: 1) on experiment samples, to assess differences between sample types and biofilm ages; and 2) on oral comparative samples and biofilm model end-products, to explore community differences between in vitro and in vivo biofilms.
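The CLR transformation expresses each count relative to the geometric mean of its sample: clr(x)_i = ln(x_i) - mean(ln(x)). A minimal Python sketch follows. It is an illustration rather than the mixOmics code, and the offset of 1 added to every count to handle zeros is an assumption, not necessarily the value used in the analysis.

```python
import math


def clr(counts, offset=1.0):
    """Centered log-ratio transform of one sample's counts.

    Assumption: `offset` is added to every count so that zeros can be
    log-transformed; the value 1.0 is illustrative only.
    """
    if not counts:
        return []
    logs = [math.log(c + offset) for c in counts]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [value - mean_log for value in logs]


# Hypothetical counts for three species in one sample.
transformed = clr([0, 9, 99])
```

CLR values within a sample sum to zero, so each value reads as the log abundance of a species relative to the sample's average species.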

Loadings obtained from the sPCA analyses were used in combination with CLR-transformed counts to create heatmaps of species- and genus-level counts within the experiment and across comparative samples.

Beta-diversity was assessed using Bray-Curtis dissimilarity.
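For two samples with counts x and y over the same species, Bray-Curtis dissimilarity is Σ|x_i - y_i| / Σ(x_i + y_i). The Python sketch below illustrates the formula; the function name `bray_curtis` and the example counts are hypothetical, and it is not the code used in the analysis.

```python
def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two count vectors over the same species."""
    if len(x) != len(y):
        raise ValueError("both samples must cover the same species")
    denominator = sum(x) + sum(y)
    if denominator == 0:
        return 0.0  # two empty samples are treated as identical
    return sum(abs(a - b) for a, b in zip(x, y)) / denominator


identical = bray_curtis([1, 2, 3], [1, 2, 3])
disjoint = bray_curtis([5, 0], [0, 5])
partial = bray_curtis([6, 4], [4, 6])
```

The value is 0 for identical samples and 1 for samples that share no species.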

Differential abundance analysis

Results

Authentication

SourceTracker

Results from SourceTracker indicated that a large proportion of species in most samples are of oral origin (plaque, saliva, or calculus). Some samples contained a large proportion of species from potential contaminants (indoor air) and of unknown origin. Potential contaminants were compared to a database of oral bacteria to determine what proportion of them were known oral species. Many of the species assigned to indoor_air are known oral species (Figure @ref(fig:st-plot)).

Based on these results, samples SYN015.F0101, SYN015.G0101, SYN015.H0101, SYN017.F0101, SYN017.G0101, SYN018.H0101, SYN013.I0101, and SYN016.I0101 were removed from further analysis. All of the removed samples were collected late in the experiment (day 18+).

decontam

A total of 5424 potential contaminants were removed. The number of species remaining in the biofilm samples ranged from 88 to 284, with a mean of 181.59.

Community composition

For the alpha-diversity analysis, sampling days were grouped to increase sample sizes:

  • Inoculation (inoc) = days 0, 3, 5
  • Treatment (treatm) = days 7, 9, 12, 15
  • End-product (final) = day 24

Alpha-diversity decreased slightly over the course of the experiment. The model calculus is less variable than the initial and middle biofilm samples. Species diversity and richness are lower in the medium and model calculus than in the other oral samples.

