Volume 8 Issue 5-6
Dec. 2022
Junjie Hou, Jifeng Wang, Fuquan Yang, Tao Xu. DIA-MS2pep: a library-free framework for comprehensive peptide identification from data-independent acquisition data[J]. Biophysics Reports, 2022, 8(5-6): 253-268. doi: 10.52601/bpr.2022.220011

DIA-MS2pep: a library-free framework for comprehensive peptide identification from data-independent acquisition data

doi: 10.52601/bpr.2022.220011
  • Corresponding author: houjunjie@ibp.ac.cn (J. Hou); xutao@ibp.ac.cn (T. Xu)
  • Received Date: 28 May 2022
  • Accepted Date: 06 June 2022
  • Available Online: 25 July 2022
  • Publish Date: 31 December 2022
  • Identifying peptides directly from data-independent acquisition (DIA) data remains challenging because the MS/MS spectra are highly multiplexed. Spectral library-based peptide detection is sensitive, but it is limited by the depth of the library and mutes the discovery potential of DIA data. Here we present DIA-MS2pep, a library-free framework for comprehensive peptide identification from DIA data. DIA-MS2pep uses a data-driven algorithm to demultiplex MS/MS spectra from the fragment data alone, without requiring a precursor. Combined with a database search that uses a large precursor mass tolerance, DIA-MS2pep can identify peptides and their modified forms. We demonstrate its performance by comparing it with conventional library-free tools in terms of the accuracy and sensitivity of peptide identification, using publicly available DIA datasets from varied samples, including HeLa cell lysates, phosphopeptides, and plasma. Compared with data-dependent acquisition-based spectral libraries, spectral libraries built directly from DIA data with DIA-MS2pep improve the accuracy and reproducibility of proteome quantification.
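
    The idea of a large precursor mass tolerance search can be illustrated with a minimal sketch. This is not the DIA-MS2pep implementation: the function names, the candidate list, and the 500 Da tolerance are assumptions chosen for illustration. A candidate peptide is retained when its theoretical mass lies within a wide window of the observed mass, and the remaining mass difference is reported as a possible modification.

```python
# Illustrative sketch only; names and the tolerance value are hypothetical,
# not taken from DIA-MS2pep.

# Monoisotopic residue masses (Da) for a small subset of amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "K": 128.09496,
    "E": 129.04259, "R": 156.10111,
}
WATER = 18.01056          # mass of H2O added to the residue sum
PHOSPHO = 79.96633        # monoisotopic mass shift of phosphorylation


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def open_search(observed_mass, candidates, tolerance_da=500.0):
    """Return {sequence: mass delta} for candidates inside the wide window.

    The delta (observed minus theoretical mass) is what a later step would
    try to explain as a modification.
    """
    hits = {}
    for seq in candidates:
        delta = observed_mass - peptide_mass(seq)
        if abs(delta) <= tolerance_da:
            hits[seq] = round(delta, 4)
    return hits


# A phosphorylated form of SAPK is matched to the unmodified sequence,
# and the mass difference points to the modification.
observed = peptide_mass("SAPK") + PHOSPHO
hits = open_search(observed, ["SAPK", "GAVLK", "TEEEERTEEEER"])
```

    With a conventional narrow tolerance (for example a few ppm) the unmodified sequence SAPK would be rejected outright; the wide window keeps it as a candidate, and the delta of about 79.966 Da matches the phosphorylation shift.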

  • Junjie Hou, Jifeng Wang, Fuquan Yang and Tao Xu declare that they have no conflict of interest.
    This article does not contain any studies with human or animal subjects performed by any of the authors.

  • Supplementary material.pdf