The taxonomic name resolution service: an online tool for automated standardization of plant names

Abstract. Background: The digitization of biodiversity data is leading to the widespread application of taxon names that are superfluous, ambiguous or incorrect, resulting in mismatched records and inflated species numbers. The ultimate consequences of mi...

Bibliographic Details
Main Authors: Boyle Brad, Hopkins Nicole, Lu Zhenyuan, Raygoza Garay Juan Antonio, Mozzherin Dmitry, Rees Tony, Matasci Naim, Narro Martha L, Piel William H, Mckay Sheldon J, Lowry Sonya, Freeland Chris, Peet Robert K, Enquist Brian J
Format: Article
Language: English
Published: BMC 2013-01-01
Series: BMC Bioinformatics
Subjects: Biodiversity informatics, Database integration, Taxonomy, Plants
Online Access: http://www.biomedcentral.com/1471-2105/14/16
id doaj-de967cc4c1e040da93f1a94cf1f9cb5e
record_format Article
spelling doaj-de967cc4c1e040da93f1a94cf1f9cb5e
2020-11-25T02:21:55Z
eng
BMC
BMC Bioinformatics
1471-2105
2013-01-01
14
1
16
10.1186/1471-2105-14-16
The taxonomic name resolution service: an online tool for automated standardization of plant names
Boyle Brad
Hopkins Nicole
Lu Zhenyuan
Raygoza Garay Juan Antonio
Mozzherin Dmitry
Rees Tony
Matasci Naim
Narro Martha L
Piel William H
Mckay Sheldon J
Lowry Sonya
Freeland Chris
Peet Robert K
Enquist Brian J
Abstract
Background: The digitization of biodiversity data is leading to the widespread application of taxon names that are superfluous, ambiguous or incorrect, resulting in mismatched records and inflated species numbers. The ultimate consequences of misspelled names and bad taxonomy are erroneous scientific conclusions and faulty policy decisions. The lack of tools for correcting this ‘names problem’ has become a fundamental obstacle to integrating disparate data sources and advancing the progress of biodiversity science.
Results: The TNRS, or Taxonomic Name Resolution Service, is an online application for automated and user-supervised standardization of plant scientific names. The TNRS builds upon and extends existing open-source applications for name parsing and fuzzy matching. Names are standardized against multiple reference taxonomies, including the Missouri Botanical Garden's Tropicos database. Capable of processing thousands of names in a single operation, the TNRS parses and corrects misspelled names and authorities, standardizes variant spellings, and converts nomenclatural synonyms to accepted names. Family names can be included to increase match accuracy and resolve many types of homonyms. Partial matching of higher taxa combined with extraction of annotations, accession numbers and morphospecies allows the TNRS to standardize taxonomy across a broad range of active and legacy datasets.
Conclusions: We show how the TNRS can resolve many forms of taxonomic semantic heterogeneity, correct spelling errors and eliminate spurious names. As a result, the TNRS can aid the integration of disparate biological datasets. Although the TNRS was developed to aid in standardizing plant names, its underlying algorithms and design can be extended to all organisms and nomenclatural codes. The TNRS is accessible via a web interface at http://tnrs.iplantcollaborative.org/ and as a RESTful web service and application programming interface. Source code is available at https://github.com/iPlantCollaborativeOpenSource/TNRS/.
http://www.biomedcentral.com/1471-2105/14/16
Biodiversity informatics
Database integration
Taxonomy
Plants
collection DOAJ
language English
format Article
sources DOAJ
author Boyle Brad
Hopkins Nicole
Lu Zhenyuan
Raygoza Garay Juan Antonio
Mozzherin Dmitry
Rees Tony
Matasci Naim
Narro Martha L
Piel William H
Mckay Sheldon J
Lowry Sonya
Freeland Chris
Peet Robert K
Enquist Brian J
title The taxonomic name resolution service: an online tool for automated standardization of plant names
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2013-01-01
description Abstract
Background: The digitization of biodiversity data is leading to the widespread application of taxon names that are superfluous, ambiguous or incorrect, resulting in mismatched records and inflated species numbers. The ultimate consequences of misspelled names and bad taxonomy are erroneous scientific conclusions and faulty policy decisions. The lack of tools for correcting this ‘names problem’ has become a fundamental obstacle to integrating disparate data sources and advancing the progress of biodiversity science.
Results: The TNRS, or Taxonomic Name Resolution Service, is an online application for automated and user-supervised standardization of plant scientific names. The TNRS builds upon and extends existing open-source applications for name parsing and fuzzy matching. Names are standardized against multiple reference taxonomies, including the Missouri Botanical Garden's Tropicos database. Capable of processing thousands of names in a single operation, the TNRS parses and corrects misspelled names and authorities, standardizes variant spellings, and converts nomenclatural synonyms to accepted names. Family names can be included to increase match accuracy and resolve many types of homonyms. Partial matching of higher taxa combined with extraction of annotations, accession numbers and morphospecies allows the TNRS to standardize taxonomy across a broad range of active and legacy datasets.
Conclusions: We show how the TNRS can resolve many forms of taxonomic semantic heterogeneity, correct spelling errors and eliminate spurious names. As a result, the TNRS can aid the integration of disparate biological datasets. Although the TNRS was developed to aid in standardizing plant names, its underlying algorithms and design can be extended to all organisms and nomenclatural codes. The TNRS is accessible via a web interface at http://tnrs.iplantcollaborative.org/ and as a RESTful web service and application programming interface. Source code is available at https://github.com/iPlantCollaborativeOpenSource/TNRS/.
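The workflow summarized in the abstract (correct a misspelled name, optionally narrow the candidates by family, then convert a synonym to its accepted name) can be illustrated with a small sketch. This is not the TNRS code or its matching algorithm: the fuzzy matching here is swapped in from Python's standard `difflib` module, and the `REFERENCE` table, the `resolve` function and the `cutoff` value are hypothetical names and data chosen only for illustration.

```python
from difflib import get_close_matches

# Hypothetical miniature reference taxonomy: name -> (family, accepted name).
# A name that maps to itself is accepted; any other entry is a synonym.
REFERENCE = {
    "Quercus alba": ("Fagaceae", "Quercus alba"),
    "Acer saccharum": ("Sapindaceae", "Acer saccharum"),
    "Acer saccharophorum": ("Sapindaceae", "Acer saccharum"),
}


def resolve(name, family=None, cutoff=0.8):
    """Return (matched_name, accepted_name), or None if no name is close enough."""
    # Restrict candidates to one family when the caller supplies it.
    candidates = [
        ref for ref, (fam, _) in REFERENCE.items()
        if family is None or fam == family
    ]
    # Normalize whitespace and capitalization before comparing.
    cleaned = " ".join(name.split()).capitalize()
    # difflib similarity stands in for a real taxonomic fuzzy matcher (assumption).
    hits = get_close_matches(cleaned, candidates, n=1, cutoff=cutoff)
    if not hits:
        return None
    matched = hits[0]
    return matched, REFERENCE[matched][1]


if __name__ == "__main__":
    for raw in ["Quercus albba", "acer  saccharophorum", "Zzzz yyyy"]:
        print(raw, "->", resolve(raw))
```

A production resolver would also need to parse authorities, annotations and infraspecific ranks, which this sketch leaves out.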
topic Biodiversity informatics
Database integration
Taxonomy
Plants
url http://www.biomedcentral.com/1471-2105/14/16