High‐throughput methods for efficiently building massive phylogenies from natural history collections

Bibliographic Details
Main Authors: Ryan A. Folk, Heather R. Kates, Raphael LaFrance, Douglas E. Soltis, Pamela S. Soltis, Robert P. Guralnick
Format: Article
Language: English
Published: Wiley 2021-02-01
Series: Applications in Plant Sciences
Online Access: https://doi.org/10.1002/aps3.11410
Description
Summary:
Premise: Large phylogenetic data sets have often been restricted to small numbers of loci from GenBank, and a vetted sampling‐to‐sequencing phylogenomic protocol scaling to thousands of species is not yet available. Here, we report a high‐throughput collections‐based approach that empowers researchers to explore more branches of the tree of life with numerous loci.
Methods: We developed an integrated Specimen‐to‐Laboratory Information Management System (SLIMS), connecting sampling and wet lab efforts with progress tracking at each stage. Using unique identifiers encoded in QR codes and a taxonomic database, a research team can sample herbarium specimens, efficiently record the sampling event, and capture specimen images. After sampling in herbaria, images are uploaded to a citizen science platform for metadata generation, and tissue samples are moved through a simple, high‐throughput, plate‐based herbarium DNA extraction and sequencing protocol.
Results: We applied this sampling‐to‐sequencing workflow to ~15,000 species, producing for the first time a data set with ~50% taxonomic representation of the “nitrogen‐fixing clade” of angiosperms.
Discussion: The approach we present is appropriate at any taxonomic scale and is extensible to other collection types. The widespread use of large‐scale sampling strategies repositions herbaria as accessible but largely untapped resources for broad taxonomic sampling with thousands of species.
ISSN: 2168-0450