Citation: Jiangping He, Lihui Lin, Jiekai Chen. Practical bioinformatics pipelines for single-cell RNA-seq data analysis[J]. Biophysics Reports, 2022, 8(3): 158-169. doi: 10.52601/bpr.2022.210041