Predicting the HMA-LMA status in marine sponges by machine learning

Moitinho-Silva, Lucas, Steinert, Georg, Nielsen, Shaun, Hardoim, Cristiane C., Wu, Yu-Chen, McCormack, Grace P., Lopez-Legentil, Susanna, Marchant, Roman, Webster, Nicole, Thomas, Torsten and Hentschel, Ute (2017) Predicting the HMA-LMA status in marine sponges by machine learning. Frontiers in Microbiology, 8 (Art. No. 752). DOI 10.3389/fmicb.2017.00752.

258275_Moitinho-Silva_ProvisionalPDF.pdf - Accepted Version
Available under License CC BY 4.0.

The dichotomy between high microbial abundance (HMA) and low microbial abundance (LMA) sponges has been observed in sponge-microbe symbiosis, although the extent of this pattern remains poorly understood. We characterized the differences between the microbiomes of HMA (n=19) and LMA (n=17) sponges (575 specimens) present in the Sponge Microbiome Project. HMA sponges were associated with richer and more diverse microbiomes than LMA sponges, as indicated by the comparison of alpha diversity metrics. Microbial community structures differed between HMA and LMA sponges considering Operational Taxonomic Unit (OTU) abundances and across microbial taxonomic levels, from phylum to species. The largest proportion of microbiome variation was explained by the host identity. Several phyla, classes, and OTUs were found to be differentially abundant in either group and were considered “HMA indicators” and “LMA indicators”. Machine learning algorithms (classifiers) were trained to predict the HMA-LMA status of sponges. Among nine different classifiers, higher performances were achieved by Random Forest trained with phylum and class abundances. Random Forest with optimized parameters predicted the HMA-LMA status of an additional 135 sponge species (1,232 specimens) without a priori knowledge. These sponges were grouped in four clusters, of which the largest two were composed of species consistently predicted as HMA (n=44) and LMA (n=74). In summary, our analyses show distinct features of the microbial communities associated with HMA and LMA sponges. The prediction of the HMA-LMA status based on the microbiome profiles of sponges demonstrates the application of machine learning to explore patterns of host-associated microbial communities.
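
As an illustration of what "richer and more diverse" means for alpha diversity, the following minimal sketch computes two widely used metrics, observed richness and the Shannon index, from a vector of OTU counts. The abstract does not state which metrics the authors used, so the choice of metrics is an assumption, and the function names and count vectors are hypothetical and invented for illustration only.

```python
import math


def richness(counts):
    """Observed richness: number of OTUs with at least one read."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln(p_i)) over OTUs with nonzero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical OTU count vectors (not data from the study).
even_sample = [25, 25, 25, 25]     # four equally abundant OTUs
skewed_sample = [97, 1, 1, 1, 0]   # same richness, one dominant OTU

for name, sample in [("even", even_sample), ("skewed", skewed_sample)]:
    print(name, richness(sample), round(shannon(sample), 3))
```

Both hypothetical samples have the same richness, but the evenly distributed one has the higher Shannon index, which is why richness and diversity are usually reported as separate metrics.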

Document Type: Article
Additional Information: All amplicon data and metadata are public at the European Nucleotide Archive (accession number: ERP020690). Quality-filtered, demultiplexed fastq files are available at (Study ID: 10793). The OTU abundance matrix and OTU taxonomic information are available in Moitinho-Silva et al. (in preparation).
Keywords: marine sponges, microbiome, 16S rRNA gene, Microbial Diversity, Symbiosis, random forest
Research affiliation: OceanRep > GEOMAR > FB3 Marine Ecology > FB3-MI Marine Microbiology
Kiel University
Refereed: Yes
DOI etc.: 10.3389/fmicb.2017.00752
ISSN: 1664-302X
Date Deposited: 28 Apr 2017 11:51
Last Modified: 12 Jun 2017 13:45
