Chemically informed analyses of metabolomics mass spectrometry data with Qemistree.

Affiliation

Tripathi A(#)(1)(2)(3), Vázquez-Baeza Y(#)(4)(5), Gauglitz JM(3)(6), Wang M(3), Dührkop K(7), Nothias-Esposito M(3), Acharya DD(3)(8), Ernst M(3)(6)(9), van der Hooft JJJ(10), Zhu Q(2), McDonald D(2), Brejnrod AD(3), Gonzalez A(2), Handelsman J(8), Fleischauer M(7), Ludwig M(7), Böcker S(7), Nothias LF(3), Knight R(2)(4)(5)(11), Dorrestein PC(12)(13)(14).
Author information:
(1)Division of Biological Sciences, University of California San Diego, La Jolla, CA, USA.
(2)Department of Pediatrics, University of California San Diego, La Jolla, CA, USA.
(3)Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA.
(4)Department of Bioengineering, University of California San Diego, La Jolla, CA, USA.
(5)Center for Microbiome Innovation, University of California San Diego, La Jolla, CA, USA.
(6)Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA.
(7)Chair for Bioinformatics, Friedrich-Schiller-University, Jena, Germany.
(8)Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI, USA.
(9)Section for Clinical Mass Spectrometry, Department of Congenital Disorders, Danish Center for Neonatal Screening, Statens Serum Institut, Copenhagen, Denmark.
(10)Bioinformatics Group, Wageningen University, Wageningen, The Netherlands.
(11)Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA.
(12)Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA.
(13)Center for Microbiome Innovation, University of California San Diego, La Jolla, CA, USA.
(14)Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, USA.
(#)Contributed equally

Abstract

Untargeted mass spectrometry is employed to detect small molecules in complex biospecimens, generating data that are difficult to interpret. We developed Qemistree, a data exploration strategy based on the hierarchical organization of molecular fingerprints predicted from fragmentation spectra. Qemistree allows mass spectrometry data to be represented in the context of sample metadata and chemical ontologies. By expressing molecular relationships as a tree, we can apply to metabolomics data the ecological tools that were designed to analyze and visualize the relatedness of DNA sequences. Here, we demonstrate the use of tree-guided data exploration tools to compare metabolomics samples across different experimental conditions, such as chromatographic shifts. Additionally, we leverage a tree representation to visualize chemical diversity in a heterogeneous collection of samples. The Qemistree software pipeline is freely available to the microbiome and metabolomics communities in the form of a QIIME 2 plugin and a Global Natural Products Social Molecular Networking (GNPS) workflow.
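
The idea of organizing molecular fingerprints hierarchically can be illustrated with a minimal sketch. This is not the Qemistree implementation: the binary fingerprints below are invented toy data, the names `jaccard_distance`, `build_tree`, and `toy_fingerprints` are hypothetical, and the choice of Jaccard distance with average-linkage clustering is an assumption made only for illustration. The sketch shows how pairwise fingerprint distances can be turned into a tree written in Newick format, the kind of structure that phylogeny-aware tools consume.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Jaccard distance between two equal-length binary fingerprints."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either


def build_tree(fingerprints):
    """Average-linkage clustering of fingerprints; returns a Newick topology."""
    # Map each cluster label to the original feature names it contains.
    clusters = {name: [name] for name in fingerprints}
    # Precompute distances between all pairs of original features.
    dist = {
        frozenset((a, b)): jaccard_distance(fingerprints[a], fingerprints[b])
        for a, b in combinations(fingerprints, 2)
    }

    def cluster_distance(members_a, members_b):
        # Average distance over all cross-cluster pairs of features.
        pairs = [(a, b) for a in members_a for b in members_b]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two closest clusters (ties broken by sorted label order).
        key_a, key_b = min(
            combinations(sorted(clusters), 2),
            key=lambda pair: cluster_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        members = clusters.pop(key_a) + clusters.pop(key_b)
        clusters[f"({key_a},{key_b})"] = members

    return next(iter(clusters)) + ";"


# Invented toy fingerprints: two pairs of features sharing substructure bits.
toy_fingerprints = {
    "feature_1": [1, 1, 1, 0, 0, 0],
    "feature_2": [1, 1, 0, 0, 0, 0],
    "feature_3": [0, 0, 0, 1, 1, 1],
    "feature_4": [0, 0, 0, 1, 1, 0],
}

tree = build_tree(toy_fingerprints)
print(tree)
```

In this toy case, features whose fingerprints share set bits are grouped under a common internal node before the two groups are joined at the root. Branch lengths are omitted for brevity; a fuller sketch would record the merge distances as well.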