Discovery of food identity markers by metabolomics and machine learning technology

Abstract Verification of food authenticity establishes consumer trust in food ingredients and components of processed food. Next to genetic or protein markers, chemicals are unique identifiers of food components. Non-targeted metabolomics is ideally suited to screen food markers when coupled to effi...

Bibliographic Details
Main Authors: Alexander Erban, Ines Fehrle, Federico Martinez-Seidel, Federico Brigante, Agustín Lucini Más, Veronica Baroni, Daniel Wunderlin, Joachim Kopka
Format: Article
Language: English
Published: Nature Publishing Group 2019-07-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-019-46113-y
id doaj-e43a47c8e9e649ffb640ab938f2e05ad
record_format Article
spelling doaj-e43a47c8e9e649ffb640ab938f2e05ad
2020-12-08T09:41:05Z
eng
Nature Publishing Group
Scientific Reports
2045-2322
2019-07-01
91119
10.1038/s41598-019-46113-y
Discovery of food identity markers by metabolomics and machine learning technology
Alexander Erban: Max-Planck-Institute of Molecular Plant Physiology, Department of Molecular Physiology: Applied Metabolome Analysis
Ines Fehrle: Max-Planck-Institute of Molecular Plant Physiology, Department of Molecular Physiology: Applied Metabolome Analysis
Federico Martinez-Seidel: Max-Planck-Institute of Molecular Plant Physiology, Department of Molecular Physiology: Applied Metabolome Analysis
Federico Brigante: Universidad Nacional de Córdoba, Facultad de Ciencias Químicas, Dpto. Química Orgánica
Agustín Lucini Más: Universidad Nacional de Córdoba, Facultad de Ciencias Químicas, Dpto. Química Orgánica
Veronica Baroni: Universidad Nacional de Córdoba, Facultad de Ciencias Químicas, Dpto. Química Orgánica
Daniel Wunderlin: Universidad Nacional de Córdoba, Facultad de Ciencias Químicas, Dpto. Química Orgánica
Joachim Kopka: Max-Planck-Institute of Molecular Plant Physiology, Department of Molecular Physiology: Applied Metabolome Analysis
collection DOAJ
language English
format Article
sources DOAJ
author Alexander Erban
Ines Fehrle
Federico Martinez-Seidel
Federico Brigante
Agustín Lucini Más
Veronica Baroni
Daniel Wunderlin
Joachim Kopka
title Discovery of food identity markers by metabolomics and machine learning technology
publisher Nature Publishing Group
series Scientific Reports
issn 2045-2322
publishDate 2019-07-01
description Abstract: Verification of food authenticity establishes consumer trust in food ingredients and components of processed food. Next to genetic or protein markers, chemicals are unique identifiers of food components. Non-targeted metabolomics is ideally suited to screen food markers when coupled to efficient data analysis. This study explored the feasibility of random forest (RF) machine learning, specifically its inherent feature extraction, for non-targeted metabolic marker discovery. The distinction of chia, linseed, and sesame, which have gained attention as “superfoods”, served as a test case. Chemical fractions of non-processed seeds and of wheat cookies with seed ingredients were profiled. RF technology classified original seeds unambiguously but appeared overdesigned for material with unique secondary metabolites, like sesamol, or rosmarinic acid in chia, a member of the Lamiaceae. Most unique metabolites were diluted or lost during cookie production, but RF technology classified the presence of the seed ingredients in cookies with 6.7% overall error and revealed food processing markers, like 4-hydroxybenzaldehyde for chia and succinic acid monomethylester for linseed additions. RF-based feature extraction was adequate for difficult classifications, but marker selection should not proceed without human supervision. Combination with alternative data analysis technologies is advised, as is further testing of a wide range of seeds and food processing methods.
url https://doi.org/10.1038/s41598-019-46113-y