Clustering and curation of electropherograms: an efficient method for analyzing large cohorts of capillary electrophoresis glycomic profiles for bioprocessing operations

The accurate assessment of antibody glycosylation during bioprocessing requires the high-throughput generation of large amounts of glycomics data. This allows bioprocess engineers to identify critical process parameters that control the glycosylation critical quality attributes. The advances made in...

Bibliographic Details
Main Authors: Ian Walsh, Matthew S. F. Choo, Sim Lyn Chiin, Amelia Mak, Shi Jie Tay, Pauline M. Rudd, Yang Yuansheng, Andre Choo, Ho Ying Swan, Terry Nguyen-Khuong
Format: Article
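Language: English
Published: Beilstein-Institut 2020-08-01
Series: Beilstein Journal of Organic Chemistry
Subjects: capillary electrophoresis; clustering; data analysis; electropherogram; glycosylation; monoclonal antibodies; peak picking; process development
Online Access: https://doi.org/10.3762/bjoc.16.176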
id doaj-3a87638c992641a195d5dc6c94c19a48
record_format Article
spelling doaj-3a87638c992641a195d5dc6c94c19a48 (record updated 2021-04-02T12:22:25Z; language: eng)
Beilstein-Institut, Beilstein Journal of Organic Chemistry, ISSN 1860-5397, 2020-08-01, volume 16, issue 1, pages 2087-2099, DOI 10.3762/bjoc.16.176, article ID 1860-5397-16-176
affiliations Ian Walsh, Matthew S. F. Choo, Sim Lyn Chiin, Amelia Mak, Shi Jie Tay, Pauline M. Rudd, Terry Nguyen-Khuong: Analytics Group, Bioprocessing Technology Institute, Agency for Science Technology and Research, Singapore 138668
Yang Yuansheng: Animal Cell Technology Group, Bioprocessing Technology Institute, Agency for Science Technology and Research, Singapore 138668
Andre Choo: Stem Cells 1 Group, Bioprocessing Technology Institute, Agency for Science Technology and Research, Singapore 138668
Ho Ying Swan: Department of Biomedical Engineering, Faculty of Engineering, National University of Singapore (NUS), Singapore 117575
collection DOAJ
language English
format Article
sources DOAJ
author Ian Walsh
Matthew S. F. Choo
Sim Lyn Chiin
Amelia Mak
Shi Jie Tay
Pauline M. Rudd
Yang Yuansheng
Andre Choo
Ho Ying Swan
Terry Nguyen-Khuong
title Clustering and curation of electropherograms: an efficient method for analyzing large cohorts of capillary electrophoresis glycomic profiles for bioprocessing operations
publisher Beilstein-Institut
series Beilstein Journal of Organic Chemistry
issn 1860-5397
publishDate 2020-08-01
description The accurate assessment of antibody glycosylation during bioprocessing requires the high-throughput generation of large amounts of glycomics data. These data allow bioprocess engineers to identify the critical process parameters that control the glycosylation critical quality attributes. Advances in protocols for capillary electrophoresis-laser-induced fluorescence (CE-LIF) measurements of antibody N-glycans have increased the potential for generating large datasets of N-glycosylation values for assessment. With large cohorts of CE-LIF data, peak picking and peak area calculation remain obstacles to fast and accurate quantitation, despite the use of internal and external standards to reduce misalignment for the qualitative analysis. These problems are often due to fluctuations introduced by varying process conditions, which result in heterogeneous peak shapes. Additionally, co-eluting glycans can produce non-Gaussian peaks under some process conditions and not under others. Here, we describe an approach to quantitatively and qualitatively curate large-cohort CE-LIF glycomics data. For glycan identification, a previously reported method based on internal triple standards is used. For determining the relative glycan quantities, our method uses a clustering algorithm to ‘divide and conquer’ highly heterogeneous electropherograms into similar groups, making it easier to define peaks manually. Open-source software is then used to determine the areas of the manually defined peaks. We successfully applied this semi-automated method to a dataset of 391 glycoprofiles of monoclonal antibody biosimilars from a bioreactor optimization study. The key advantage of this computational approach is that all runs can be analyzed simultaneously with high accuracy in glycan identification and quantitation, and there is no theoretical limit to the scale of the method.
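The two quantitation steps named in the description (grouping similar electropherograms, then integrating manually defined peaks) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the record does not name the clustering algorithm or the open-source software the authors used, so a simple greedy grouping by Pearson correlation and trapezoidal integration stand in for them, and the traces, peak windows, and function names are hypothetical.

```python
import math

def gaussian_trace(centers, heights, width=0.15, n=200, t_max=10.0):
    """Build a synthetic electropherogram as a sum of Gaussian peaks (illustrative data only)."""
    times = [t_max * i / (n - 1) for i in range(n)]
    signal = [
        sum(h * math.exp(-0.5 * ((t - c) / width) ** 2) for c, h in zip(centers, heights))
        for t in times
    ]
    return times, signal

def pearson(a, b):
    """Pearson correlation between two equally sampled traces."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def cluster_traces(signals, min_corr=0.9):
    """Greedy grouping (a stand-in, not the authors' algorithm): a trace joins the
    first cluster whose seed trace it correlates with at >= min_corr, otherwise it
    starts a new cluster."""
    clusters = []
    for i, signal in enumerate(signals):
        for members in clusters:
            if pearson(signals[members[0]], signal) >= min_corr:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

def peak_area(times, signal, start, end):
    """Trapezoidal area over the grid intervals lying fully inside [start, end]."""
    area = 0.0
    for k in range(len(times) - 1):
        if times[k] >= start and times[k + 1] <= end:
            area += 0.5 * (signal[k] + signal[k + 1]) * (times[k + 1] - times[k])
    return area

def relative_quantities(times, signal, windows):
    """Relative peak areas (percent of the summed areas) for manually defined windows."""
    areas = {name: peak_area(times, signal, lo, hi) for name, (lo, hi) in windows.items()}
    total = sum(areas.values())
    return {name: 100.0 * a / total for name, a in areas.items()}

# Four hypothetical runs: two profile shapes, each with slightly different peak heights.
runs = [
    gaussian_trace([3.0, 5.0], [1.0, 2.0]),
    gaussian_trace([4.0, 7.0], [2.0, 1.0]),
    gaussian_trace([3.0, 5.0], [1.2, 1.8]),
    gaussian_trace([4.0, 7.0], [1.8, 1.1]),
]
times = runs[0][0]
signals = [signal for _, signal in runs]

clusters = cluster_traces(signals)

# One set of hand-drawn peak windows is shared by every run in the first cluster.
windows = {"peak_1": (2.5, 3.5), "peak_2": (4.5, 5.5)}
quantities = {i: relative_quantities(times, signals[i], windows) for i in clusters[0]}
print(clusters)
print(quantities)
```

The point of the grouping is that peak windows only need to be drawn once per cluster rather than once per run; with real CE-LIF data the windows would come from manual inspection of each cluster's aligned traces.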
topic capillary electrophoresis
clustering
data analysis
electropherogram
glycosylation
monoclonal antibodies
peak picking
process development
url https://doi.org/10.3762/bjoc.16.176