Systematic Feature Filtering in Exploratory Metabolomics: Application toward Biomarker Discovery


This publication doesn't include Faculty of Medicine. It includes Faculty of Science. Official publication website can be found on

GADARA Darshak Chandulal, COUFALÍKOVÁ Kateřina, BOSÁK Juraj, ŠMAJS David, SPÁČIL Zdeněk

Year of publication 2021
Type Article in Periodical
Magazine / Source Analytical Chemistry
MU Faculty or unit Faculty of Science

Description Exploratory mass spectrometry-based metabolomics generates a plethora of features in a single analysis. However, >85% of detected features are typically false positives due to inefficient elimination of chimeric signals and chemical noise not relevant for biological and clinical data interpretation. Data processing is considered a bottleneck in realizing the translational potential of metabolomics. Here, we describe a systematic workflow to refine exploratory metabolomics data and reduce reported false positives. We applied the feature filtering workflow in a case/control study exploring common variable immunodeficiency (CVID). In the first stage, features were detected from raw liquid chromatography-mass spectrometry data by XCMS Online processing, blank subtraction, and reproducibility assessment. Detected features were annotated against metabolomics databases to produce a list of tentative identifications. We scrutinized the physicochemical properties of the tentative identifications by comparing predicted and experimental reversed-phase liquid chromatography (LC) retention times. The prediction model used a linear regression of 42 retention indices against cLogP values ranging from -6 to 11. The LC retention time probes the physicochemical properties and effectively reduces the number of tentatively identified metabolites, which are then submitted to statistical analysis. We applied the retention time-based analytical feature filtering workflow to datasets from the Metabolomics Workbench (www. ), demonstrating its broad applicability. A subset of tentatively identified metabolites that differed significantly in CVID patients was validated by MS/MS acquisition to confirm the structures of potential CVID biomarkers and virtually eliminate false positives. Our exploratory metabolomics data processing workflow effectively removes false positives caused by the chemical background and chimeric signals inherent to the analytical technique. It reduced the number of candidates by 88%, from 6940 features initially detected in XCMS to 839 tentative identifications, and streamlined subsequent statistical analysis and data interpretation.
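
The retention time-based filtering step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the calibration points, candidate names, and the tolerance window are hypothetical values chosen for illustration, and the sketch regresses retention time directly on cLogP, whereas the abstract describes a regression built on 42 retention indices. The idea is only to show how a linear model fitted on standards can discard tentative identifications whose observed retention time disagrees with the prediction.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def rt_filter(candidates, slope, intercept, tolerance_min):
    """Keep candidates whose observed RT lies within the tolerance of the predicted RT."""
    kept = []
    for name, clogp, observed_rt in candidates:
        predicted_rt = slope * clogp + intercept
        if abs(observed_rt - predicted_rt) <= tolerance_min:
            kept.append(name)
    return kept


# Hypothetical calibration standards: cLogP values and retention times (min).
calib_clogp = [-6.0, -2.0, 0.0, 3.0, 7.0, 11.0]
calib_rt = [1.0, 3.0, 4.0, 5.5, 7.5, 9.5]
slope, intercept = fit_line(calib_clogp, calib_rt)

# Hypothetical tentative identifications: (name, cLogP, observed RT in min).
candidates = [
    ("candidate_A", 2.0, 5.2),   # close to the predicted RT
    ("candidate_B", -3.0, 8.0),  # polar compound eluting far too late
    ("candidate_C", 8.0, 7.6),   # close to the predicted RT
]

# The 1.0 min tolerance is an assumed value, not taken from the publication.
kept = rt_filter(candidates, slope, intercept, tolerance_min=1.0)
print(kept)
```

In practice the tolerance would be derived from the prediction error observed on the calibration standards rather than fixed by hand.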