Published on 16 June 2025

Version v1.0.0

MS1 feature library-based virtual match-between-runs quantification improves site-specific glycan identification and occupancy ratio analysis

Zhu, He; Fang, Zheng; Ye, Mingliang

Description

Glycosylation changes are closely related to various diseases, including cancer. Quantitative analysis of site-specific glycans at the proteomics scale remains challenging because of the low interpretation rate of glycopeptide spectra. Here, we present GlyPep-Quant, a tool for sensitive identification and quantification of site-specific glycans. Using a well-trained machine learning model, GlyPep-Quant quantified 25.1%-178.9% more site-specific glycans without missing values than pGlycoQuant, MSFragger-Glyco, and Skyline. To make use of identifications from previous large-scale datasets, an MS1 feature library-based “virtual match-between-runs” quantification scheme was developed, enabling over 8-fold more site-specific glycan identifications/quantifications than conventional MS2-based methods. The enhanced coverage prompted the development of a glycoproteomic biomarker discovery method that calculates abundance ratios between site-specific glycans at the same glycosylation site, minimizing variability from individual expression levels and experimental conditions. Two pairs of site-specific glycan ratios on sites P01011-N127 and P08185-N96 were selected as high-performance biomarkers to distinguish gastric cancer (GC) from healthy controls (AUC > 0.95). Moreover, the two ratios performed well in distinguishing GC in an independent cohort quantified with the library-based strategy, with diagnostic accuracy up to 85%. GlyPep-Quant is poised for broader glycoproteomic applications.
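The ratio-based biomarker idea in the description can be illustrated with a minimal sketch. This is not GlyPep-Quant code: the function names, the choice of a log2 ratio, the rank-based AUC, and the toy intensities are all assumptions made for illustration, and the numbers are not taken from the dataset.

```python
from math import log2


def site_glycan_log_ratio(abundance_a, abundance_b):
    """log2 ratio of two glycan abundances measured at the same site in one sample.

    Hypothetical helper: the actual ratio definition used by the authors
    is not given in the description.
    """
    if abundance_a <= 0 or abundance_b <= 0:
        raise ValueError("abundances must be positive")
    return log2(abundance_a / abundance_b)


def auc(scores_pos, scores_neg):
    """Rank-based AUC: fraction of (positive, negative) pairs in which the
    positive sample scores higher; ties count as half a win."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Toy intensities, made up for illustration only:
# each tuple is (glycan A, glycan B) at one glycosylation site in one sample.
cases = [(800.0, 100.0), (600.0, 150.0), (400.0, 200.0)]
controls = [(300.0, 300.0), (200.0, 400.0), (500.0, 250.0)]

case_ratios = [site_glycan_log_ratio(a, b) for a, b in cases]
control_ratios = [site_glycan_log_ratio(a, b) for a, b in controls]
toy_auc = auc(case_ratios, control_ratios)
print(f"toy AUC = {toy_auc:.3f}")
```

Because both glycans sit on the same site of the same protein, a sample-wide factor such as protein expression level or loading amount scales both abundances equally and cancels in the ratio; for example, `site_glycan_log_ratio(800.0, 100.0)` equals `site_glycan_log_ratio(1600.0, 200.0)`.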

Metrics

Dataset Index

0.8

FAIR Score

73%

Citations

0

Mentions

0

Publication Details

DOI

Publisher

Zenodo

Assigned Domain

Subfield

Molecular Biology

Field

Biochemistry, Genetics and Molecular Biology

Domain

Life Sciences

Confidence Score

60%

Source

Scholar Data Model

Normalization Factors

FT

30.77

CTw

1.00

MTw

1.00