MS-PyCloud: A Cloud Computing-Based Pipeline for Proteomic and Glycoproteomic Data Analyses

Yingwei Hu, Michael Schnaubelt, Li Chen, Bai Zhang, Trung Hoang, T. Mamie Lih, Zhen Zhang, Hui Zhang

Research output: Contribution to journalArticlepeer-review

Abstract

Rapid development and wide adoption of mass spectrometry-based glycoproteomic technologies have empowered scientists to study proteins and protein glycosylation in complex samples on a large scale. This progress has also created unprecedented challenges for individual laboratories to store, manage, and analyze proteomic and glycoproteomic data, both in the cost for proprietary software and high-performance computing and in the long processing time that discourages on-the-fly changes of data processing settings required in explorative and discovery analysis. We developed an open-source, cloud computing-based pipeline, MS-PyCloud, with graphical user interface (GUI), for proteomic and glycoproteomic data analysis. The major components of this pipeline include data file integrity validation, MS/MS database search for spectral assignments to peptide sequences, false discovery rate estimation, protein inference, quantitation of global protein levels, and specific glycan-modified glycopeptides as well as other modification-specific peptides such as phosphorylation, acetylation, and ubiquitination. To ensure the transparency and reproducibility of data analysis, MS-PyCloud includes open-source software tools with comprehensive testing and versioning for spectrum assignments. Leveraging public cloud computing infrastructure via Amazon Web Services (AWS), MS-PyCloud scales seamlessly based on analysis demand to achieve fast and efficient performance. Application of the pipeline to the analysis of large-scale LC-MS/MS data sets demonstrated the effectiveness and high performance of MS-PyCloud. The software can be downloaded at https://github.com/huizhanglab-jhu/ms-pycloud.

Original languageEnglish (US)
Pages (from-to)10145-10151
Number of pages7
JournalAnalytical Chemistry
Volume96
Issue number25
DOIs
StatePublished - Jun 25 2024

ASJC Scopus subject areas

  • Analytical Chemistry

Fingerprint

Dive into the research topics of 'MS-PyCloud: A Cloud Computing-Based Pipeline for Proteomic and Glycoproteomic Data Analyses'. Together they form a unique fingerprint.

Cite this