SCP2019
11 – 12 June 2019 | Boston, USA
Program
Location: All talks will be at the Cabral Auditorium in the John D. O’Bryant building, 1st floor (40 Leon Street)
Hashtag: #SCP2019
Monday June 10th: Workshop
- Registration at 1pm -- 2pm
- Nikolai Slavov: Opening remarks: Welcome to the second single-cell proteomics conference
- Harrison Specht: Design of single-cell proteomics experiments
Specht H, Emmott E, Koller T., and Slavov N. We will discuss how to design SCoPE2 (Single-Cell ProteOmics by Mass Spectrometry 2) experiments such that data quality can be rigorously assessed and experimental failure can be specifically diagnosed. This includes built-in quality control metrics for establishing background noise, evaluating measurement consistency, randomizing samples, and benchmarking quantification by using fluorescent proteins or bulk proteomics measurements.
- Edward Emmott: Sample preparation for single-cell MS analysis
Specht H, Emmott E, Koller T, and Slavov N. A major limitation to applying quantitative LC-MS/MS proteomics to small samples, such as single cells, is the loss incurred during sample cleanup. We discuss our revised SCoPE2 pipeline for processing single-cell samples, building on the SCoPE-MS method using mPOP lysis (Specht et al. 2018, bioRxiv). This involves using only mass spectrometry-compatible reagents, as well as a move to freeze-heat lysis of samples. This removes the need for sample cleanup and permits automation and higher-throughput sample processing. We will explain how to proceed from FACS-sorted single cells in multi-well plates to semi-automated SCoPE2 sample preparation.
- Coffee Break 4pm -- 4:30pm
- Gray Huffman: Optimizing LC-MS/MS analysis with DO-MS
Gray Huffman, Harrison Specht, Albert Chen, Nikolai Slavov. We will discuss experimental and computational methods for optimizing single-cell mass-spec analysis. The emphasis will be on methods to establish the optimal settings for any specific experiment rather than on reporting a set of settings optimal for all cases. On the experimental side, we will discuss experimental standards (samples) and methods that we use to maintain and evaluate nLC cleanliness and performance. On the computational side, we will describe how we use DO-MS to optimize our methods. The performance of ultrasensitive LC-MS/MS methods, such as Single-Cell Proteomics by Mass Spectrometry (SCoPE-MS), depends on multiple interdependent parameters. This interdependence makes it challenging to specifically pinpoint bottlenecks in the LC-MS/MS methods and approaches for resolving them. For example, low signal at MS2 level can be due to poor LC separation, ionization, apex targeting, ion transfer, or ion detection. We sought to specifically diagnose such bottlenecks by interactively visualizing data from all levels of bottom-up LC-MS/MS analysis. Many search engines, such as MaxQuant, already provide such data, and we developed an open source platform for their interactive visualization and analysis: Data-driven Optimization of MS (DO-MS). We found that in many cases DO-MS not only specifically diagnosed bottlenecks but also enabled us to rationally optimize them. For example, we used DO-MS to diagnose poor sampling of the elution peak apex and to optimize it, which increased the efficiency of delivering ions for MS2 analysis by 370%. DO-MS is easy to install and use, and its GUI allows for interactive data subsetting and high-quality figure generation. The modular design of DO-MS facilitates customization and expansion. DO-MS is available for download from GitHub: https://github.com/SlavovLab/DO-MS
- Sung-Huan Yu: The analysis of single cell proteomics in MaxQuant and Perseus
Sung-Huan Yu and Juergen Cox. Single-cell technology is having a major impact on biology. It can reveal mechanisms that are not visible in studies of bulk cell populations. To support single-cell proteomics studies, we developed several new methods and integrated them into MaxQuant and Perseus, which are widely used platforms for the analysis of proteomics data. These methods significantly improve TMT quantification and data normalization. In this presentation, we will introduce these new functions of MaxQuant and Perseus and show how to use them to analyze single-cell proteomics data.
- Nikolai Slavov: Data integration and analysis. Standards for benchmarking quantification.
I will discuss our efforts and progress on miniaturizing and automating sample preparation for single-cell proteomics, with special emphasis on key challenges and the solutions that enabled us to increase the quality of our data and the throughput of our analysis. I will present single-cell protein measurements from monocytes differentiating to macrophages in the context of many standards and benchmarks that give us confidence in the data. Some of the benchmarks suggest that single-cell protein measurements by mass-spec can be much more accurate than single-cell RNA measurements by RNA-seq. The last part of the talk will be devoted to community standards that I believe are essential for the healthy growth of this emerging field. Many of these standards relate to data analysis, so I will share approaches that have proven useful to us and ideas for future approaches that I believe will transform our understanding of biological systems.
Tuesday June 11th
- Registration and breakfast at 8:30am -- 9am
- Nikolai Slavov: High-throughput single-cell proteomics quantifies the emergence of macrophage heterogeneity
Specht H, Perlman DH, Emmott E, Harmange G, Koller T., and Slavov N. The fate and physiology of individual cells are controlled by networks of proteins. Yet, our ability to quantitatively analyze protein networks in single cells has remained limited. To overcome this barrier, we developed SCoPE2. It integrates concepts from Single-Cell ProtEomics by Mass Spectrometry (SCoPE-MS) with automated and miniaturized sample preparation, substantially lowering cost and hands-on time. SCoPE2 uses data-driven analytics to optimize instrument parameters for sampling more ion copies per protein, thus supporting quantification with improved count statistics. These advances enabled us to analyze the emergence of cellular heterogeneity as homogeneous monocytes differentiated into macrophage-like cells in the absence of polarizing cytokines. We used SCoPE2 to quantify over 2,000 proteins in 356 single monocytes and macrophages in about 85 hours of instrument time, and the quantified proteins allowed us to discern single cells by cell type. Furthermore, the data uncovered a continuous gradient of proteome states for the macrophage-like cells, suggesting that macrophage heterogeneity may emerge even in the absence of polarizing cytokines. Our methodology lays the foundation for quantitative analysis of protein networks at single-cell resolution.
- Ruedi Aebersold: Towards single-cell proteomics: Challenges and possible solutions
Biological or clinical phenotypes arise from the biochemical state of a cell or tissue which, in turn, is the result of the composition of biomolecules and their organization in the cell. The biochemical state is largely defined by proteins. The systematic analysis of proteins has therefore been demonstrated to be highly informative. Over recent years mass spectrometry based proteomic analyses have significantly advanced with respect to proteome coverage, reproducibility and accuracy of quantitative proteome maps, sample throughput and amount of sample consumed per analysis. The technology has achieved a state where the analysis of low cell numbers to possibly single cells becomes plausible. However, a number of challenges to single cell proteomics remain. They can broadly be grouped into challenges with sample preparation and workup, mass spectrometric data acquisition and data analysis. In this presentation we will discuss a systematic assessment of these issues. We will discuss advanced sample processing methods to minimize sample losses during sample workup, the optimization of data acquisition in SWATH/DIA mode for small sample sizes and the optimization of data analysis strategies for low S/N data.
- Coffee Break 10:45 -- 11am
- Alex Kentsis: Ultrasensitive pathway-scale quantitative functional proteomics using the MSK Quantitative Cell Proteomics Atlas
The advent of molecular biology and molecular profiling in clinical medicine has transformed our understanding of the molecular basis of human cancer. As a result, we are increasingly improving the classification of human tumors based on their specific genetic and molecular mechanisms of pathogenesis. However, currently only a small number of mutant alleles guide treatment decisions, while most observed mutations remain of unknown pathologic and clinical significance. In addition, even for recently approved drugs, such as those targeting activated kinase signaling, clinical efficacy is highly varied, with no currently satisfactory means to identify molecular markers of response and resistance. Quantitative measurements of the abundance of proteins and stoichiometry of their regulatory post-translational modifications can be used to determine activation states of pathways and cells. However, current quantitative mass spectrometry techniques are limited by peptide ion fragmentation, duty cycles that restrict assays to about 100 proteins, and limited scalability to permit high throughput clinical applications. To address this need, we have recently developed a new method with 3 orders of magnitude improvement in sensitivity, termed accumulated ion monitoring (AIM). Using AIM, we developed the Quantitative Cell Proteomics Atlas (http://qcpa.mskcc.org) for functional profiling of biochemical processes mediating normal and pathologic cell functions. We will describe how this technology permits highly multiplexed, quantitative analysis of the expression and biochemical activity of thousands of proteins, covering most recurrently mutated and known pathogenic pathways in cancer cells, and designed to be applied to clinically-accessible, microgram patient specimens and rare populations of as few as thousands of cells.
- Sue Abbatiello: Development and Evaluation of FAIMS for Digging Deeper into the Human Proteome
High-Field Asymmetric Waveform Ion Mobility Spectrometry (FAIMS) is a technology that was developed for selectively passing ions of interest into the mass spectrometer, effectively improving the signal-to-noise of analytes. FAIMS sits between the ion source and the mass spectrometer to separate ions in the gas phase based on their mobility in an oscillating electric field at atmospheric conditions. Over the past decade, improvements to the technology have been made to minimize losses in ion transmission and to make execution much easier and more readily applied to nanoflow proteomics applications. This presentation will cover the evolution of FAIMS and its evaluation for bottom-up and top-down proteomics applications, highlighting its benefits for detecting lower-abundant species in complex samples from cells and plasma.
- Lunch and Poster Session 12:30 -- 2pm
- Sedide Ozturk: High Throughput Single Cell Analysis of Proteins and RNA via Quantum Barcoding
- Alexander Ivanov: High sensitivity proteomic, phosphoproteomic, and glycomic profiling of limited samples using ultra-low flow separations coupled to mass spectrometry
- Purushottam Dixit: Maximum Entropy Framework for Inference of Cell Population Heterogeneity in Signaling Networks
- Ákos Végvári: How Quantitative Is Single Cell Proteomics?
- Coffee Break 4pm -- 4:30pm
- Konrad Loehr: Towards quantitative high throughput single cell LA-ICP-TOF-MS
- Camille Lombard-Banek: Microcapillary Sampling of Cells to Study In-vivo Proteomic Cell-to-Cell Heterogeneity in Embryos
Cell-to-cell heterogeneity is critical for proper embryonic development and brain function. Understanding how proteins differ from one cell to another opens new frontiers to understand the biochemistry of heterogeneous systems like embryos or the brain. However, to characterize proteins in single cells, new analytical tools are needed for the reproducible sampling of identified cells and sensitive proteomic measurements. To address this technological gap, we have developed a strategy to microsample cells and hyphenated the approach to our custom-built capillary electrophoresis-electrospray ionization high resolution mass spectrometer (CE-ESI-HRMS). Using a pulled borosilicate capillary, whose geometry is optimized for the type of cell sampled, we have collected protein contents from embryonic cells of different animal models. We first applied the approach to decipher spatial and temporal protein differences in live developing frog embryos. Using this approach, we found differences between the animal and vegetal poles of the embryos. We extended the sampling method to uncover protein differences in clones of the neural-fated cells in developing embryos. From the quantification of ~450 protein groups, we identified protein trends across clones at four different stages: 16-, 32-, 64-, and 128-cell stages. Moreover, we applied the approach to sample cells in the 2-cell zebrafish embryos, which are morphologically very different. We identified ~400 protein groups from ~30 pg of material measured. In conclusion, this approach is widely applicable to many cell types and opens new opportunities for cell and developmental biology.
- Luca Gerosa: Single-cell ERK signaling dynamics drive adaptive drug resistance of BRAF V600E cancers
- Bogdan Budnik: SCoPED-MS novel method to detect cell populations based on proteome level changes on single cell level
- Dinner for all attendees 6:30 -- 9:30 pm | Alumni Center (building 64)
Maeve O’Huallachain, Mary Shen, Simun Xu, Garry P. Nolan, Carolina Dallet, Sri Paladagu, Jan Berka, Sedide Ozturk
Introduction
Single cell analysis can resolve differences between cells within heterogeneous populations (i.e. most clinical samples) which are otherwise masked in bulk analysis. Cancer immunotherapy and hematologic oncology are a few examples where single cell information gave remarkable insights towards effective personalized therapies. Here, we describe Quantum Barcoding (QBC) Technology which enables simultaneous, high throughput single-cell analysis of proteins and RNA.
Methods
The method is based on vastly parallel cell barcoding via stepwise combinatorial tag generation. The cell-specific DNA barcodes are then linked to multiple biomarkers on or within a given cell and read using high-throughput DNA sequencing. For the protein expression assay, antibodies are labeled with oligonucleotides containing unique barcode identifiers which later act as “recall” sequences when they are bound to cells. For RNA measurements, the sequence of mRNA is its own tag.
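To give a sense of scale for stepwise combinatorial tagging, the following is a minimal sketch and not the QBC protocol itself. It assumes that each step appends one tag drawn uniformly at random from a fixed set. The tag-set size (96) and the number of steps (4) are hypothetical values chosen for illustration, since the abstract does not state them.

```python
import math


def barcode_space(tags_per_step: int, steps: int) -> int:
    """Number of distinct barcodes after `steps` rounds of tagging."""
    return tags_per_step ** steps


def shared_barcode_probability(n_cells: int, n_barcodes: int) -> float:
    """Probability that a given cell shares its barcode with at least one
    other cell, assuming barcodes are assigned uniformly at random."""
    if n_cells < 2:
        return 0.0
    if n_barcodes <= 1:
        return 1.0
    # 1 - (1 - 1/N)^(M - 1), computed in a numerically stable way
    return -math.expm1((n_cells - 1) * math.log1p(-1.0 / n_barcodes))


if __name__ == "__main__":
    space = barcode_space(96, 4)  # hypothetical: 96 tags, 4 steps
    print(f"barcode space: {space}")
    for n_cells in (10_000, 1_000_000, 5_000_000):
        p = shared_barcode_probability(n_cells, space)
        print(f"{n_cells} cells: P(shared barcode) = {p:.4f}")
```

Under these assumptions, the barcode space grows exponentially with the number of steps, which is why a modest tag set can label millions of cells with few barcode collisions.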
Results
Roche Sequencing Solutions (RSS) demonstrated simultaneous single-cell tagging of several million cells with accurate measurements of mRNA and protein. Approximately 50,000 cells were analyzed during each sequencing run to achieve adequate sequencing depth per cell. Protein markers were measured in mouse spleen cells, mouse bone marrow cells and peripheral blood mononuclear cells (PBMCs) from healthy donors. The results matched the expected expression levels in these diverse immune cell populations. Sixteen mRNA targets were measured in a human T-cell line and a human pre-B cell line, Jurkat and Nalm-6, respectively. RNA targets specific to each of the cell lines were accurately measured in only the appropriate cell type.
Conclusion
Oligonucleotide labeling of antibodies not only circumvents the multiplexing limitations of cytometry based single cell analysis platforms but also enables the combination of highly multiplex protein marker detection with transcriptome profiling in single cells. Together with its high throughput quantitative single cell barcoding methodology, QBC emerges as a cost-effective method for analyzing multiple cellular markers in millions of cells.
Informative proteomic characterization of limited samples (e.g., small populations of rare cells, microneedle biopsies, extracellular vesicles (EVs) isolated from minute volumes of physiological fluids, or even single cells) and especially, profiling of post-translational modifications, e.g., glycosylation and phosphorylation, of such specimens have been a major challenge because of very low abundance and high heterogeneity in biological matrices. With the advent of more powerful separation techniques coupled to more sensitive, higher duty cycle mass spectrometers, and more advanced data processing platforms, analysis of such limited samples is getting more feasible. In this study, we explored several electric field- and pressure-driven ultra-low flow separation approaches coupled to mass spectrometry to enhance the sensitivity and depth of proteomic, glycomic, and phosphoproteomic profiling of several types of limited biological specimens. The acquired results demonstrate the potential applicability of the developed techniques in single cell proteomic studies.
Purushottam Dixit, Eugenia Lyashenko, Mario Niepel, Dennis Vitkup
Predictive models of signaling networks are essential tools for understanding cell population heterogeneity and designing rational interventions in disease. However, using network models to predict signaling dynamics heterogeneity is often challenging due to the extensive variability of network parameters across cell populations. Here, we describe a Maximum Entropy-based fRamework for Inference of heterogeneity in Dynamics of sIgnAling Networks (MERIDIAN). MERIDIAN allows us to estimate the joint probability distribution over network parameters that is consistent with experimentally observed cell-to-cell variability in abundances of network species. We apply the developed approach to investigate the heterogeneity in the signaling network activated by the epidermal growth factor (EGF) and leading to phosphorylation of protein kinase B (Akt). Using the inferred parameter distribution, we also predict heterogeneity of phosphorylated Akt levels and the distribution of EGF receptor abundance hours after EGF stimulation. We discuss how MERIDIAN can be generalized and applied to problems beyond modeling of heterogeneous signaling dynamics.
Ákos Végvári, Jin Wang, Roman A. Zubarev
Today’s proteomics affords identification and quantification of ≥10,000 proteins in bulk biological samples. However, the large dynamic range of protein concentrations together with the limited sensitivity of current mass spectrometers do not permit the analysis of full cellular proteomes in mammalian samples. Additionally, capturing even the abundant part of the proteome of a single mammalian cell (ca 0.2 ng of protein) turned out to be quite challenging.
The single cell proteomic strategy recently introduced by the Boston group is based on the use of isobaric tandem mass tag (TMT) together with the “carrier proteome” (CP). This strategy seems to have for the first time allowed one to identify and quantify hundreds of proteins obtained from isolated cells and probe the heterogeneity of their proteomes. However, the magic of this approach remained largely unexplained; in particular, the CP’s role in boosting the sensitivity. Questions also remain about the quantitative nature of single cell proteomics.
Hence, we have designed experiments to investigate the correlations of the measured fold changes (FCs) of proteins in single cells with the corresponding FCs in bulk samples. Two TMT-based methods, with and without CP, were compared. The samples were obtained in four replicates from RKO cancer cells treated with methotrexate (MTX) at IC50 level for 48 h. Experiments were performed on a brand-new Orbitrap Lumos using a range of protein concentrations obtained by serial dilution.
In the CP-free method, 8, 40, 200 and 1000 ng samples were injected on column (corresponding to 5, 25, 125 and 625 cells in each TMT channel, respectively), resulting in linearly decreased numbers of identified proteins and peptides from 200 ng to 8 ng loaded, in proportion with the number of acquired MS2 spectra. The quantitative similarity with the bulk sample (1 µg loaded), measured as Pearson’s correlation between the FCs, was also decreased, as expected, but maintained significance to the lowest loaded level for both protein and peptide FCs. In addition, the significantly upregulated MTX target protein (DHFR) used as an indicator of the analytical performance, showed a gradually decreasing rank with lowering of the amount injected (ranked 8 out of 3728 in 1000 ng and 20 out of 3045 in 200 ng loaded), not being identified for 40 ng and 8 ng sample loads (1347 and 199 protein IDs, respectively).
For the CP-based method with a 200-fold enhanced load in the TMT-Zero channel, the numbers of identified proteins were always higher for the same sample amounts in the TMT-10 channels (“single-cell channels”) in comparison with the CP-free method. At a single cell level, the CP-based method quantified 774 proteins, while the CP-free method produced no identifications. On average, it was found that the 200-fold CP offers an order of magnitude improvement in detection threshold for “single-cell channels”. The quantitative aspect, i.e. statistically significant correlation with bulk proteome FCs, was preserved in the CP-based method down to the level of single cells.
Our results indicate that the FC-based approach has a great analytical potential. However, further improvements in methodology are desirable to obtain reliable quantification for >1000 proteins at a single cell level.
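The comparison described above, Pearson's correlation between fold changes measured at low input and in bulk, can be sketched as follows. This is an illustrative outline only: the function names are hypothetical, the intensity values are made-up toy numbers rather than data from the study, and the use of log2 fold changes is an assumption.

```python
import math


def log2_fold_changes(treated, control):
    """Per-protein log2(treated / control) from paired intensity lists."""
    return [math.log2(t / c) for t, c in zip(treated, control)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Toy intensities for four proteins (not real measurements).
    bulk_fc = log2_fold_changes([400.0, 100.0, 50.0, 120.0],
                                [100.0, 100.0, 100.0, 100.0])
    low_input_fc = log2_fold_changes([30.0, 11.0, 7.0, 10.0],
                                     [10.0, 10.0, 10.0, 10.0])
    print(f"Pearson r between FC profiles: {pearson(bulk_fc, low_input_fc):.3f}")
```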
Konrad Löhr, Olga Borovinskaya, Guilhem Tourniaire, Ulrich Panne, Norbert Jakubowski
Analysis of single cells via LA-ICP-TOF-MS is a technique with great potential for multidimensional analysis of a cell’s metallome and also its proteome using elemental markers. However, widespread use of this technique is hampered by its relatively low sample throughput due to laborious manual cell targeting. To circumvent these limitations, cell microarraying approaches were previously demonstrated. Indeed, if one aims to create a microarray of single cells via spotting a suitably diluted cell suspension, one will observe a Poisson-distributed cell number per spot. In this work, we investigated the use of a commercial non-contact piezo dispenser system (sciFLEXARRAYER S3, Scienion AG, Berlin), equipped with a novel technology for accurate single-cell isolation called cellenONE (Cellenion, Lyon). The latter overcomes Poisson distribution using optical monitoring of cells inside the piezo dispense capillary (PDC) and automated selection and dispensing of droplets containing only one cell to obtain true single cell arrays. In order to demonstrate the benefits of this new platform, THP-1 cells were stained with two elemental dyes, mDOTA-Ho (CheMatech, Dijon), and Ir-DNA intercalator (Fluidigm, San Francisco) which were subsequently quantified at single cell resolution via LA-ICP-TOF-MS (Analyte G2, Teledyne CETAC Technologies; icpTOF, TOFWERK). This novel approach allowed efficient and automated quantitative single cell analysis by LA-ICP-TOF-MS.
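The Poisson limitation mentioned above can be made concrete with a short sketch. It assumes that cells land in spots independently with a mean of `lam` cells per spot; the function names and the example values of `lam` are illustrative and do not come from the abstract. Under this model the fraction of single-cell spots is lam * exp(-lam), which peaks at lam = 1 and therefore cannot exceed 1/e, roughly 37%.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k cells in a spot with mean occupancy lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def spot_occupancy(lam: float) -> dict:
    """Fractions of empty, single-cell, and multi-cell spots."""
    empty = poisson_pmf(0, lam)
    single = poisson_pmf(1, lam)
    return {"empty": empty, "single": single, "multiple": 1.0 - empty - single}


if __name__ == "__main__":
    for lam in (0.1, 0.5, 1.0, 2.0):  # illustrative mean occupancies
        occ = spot_occupancy(lam)
        print(f"lam={lam}: empty={occ['empty']:.3f}, "
              f"single={occ['single']:.3f}, multiple={occ['multiple']:.3f}")
```

Diluting the suspension reduces multi-cell spots but leaves most spots empty, which is the trade-off that image-based droplet selection avoids.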
Luca Gerosa, Christopher Chidley, Fabian Froehlich, Gabriela Sanchez Figueroa, Sang Kyun Lim, H. Steven Wiley, Peter K. Sorger
Cancer cells treated with targeted inhibitors of oncogenic pathways can escape treatment through homeostatic adaptation of their signaling networks, a phenomenon termed ‘adaptive resistance’. Our limited ability to predict the response of signaling pathways to drug perturbations is a key obstacle to designing drug strategies that can prevent adaptive resistance. Here, we use experiments and computational modeling to build predictive models of drug adaptation in colorectal, thyroid and skin cancers bearing BRAF V600E, a mutation that is present in up to 50% of these cancers and is responsible for hyper-activation of the pro-growth RAF/MEK/ERK signaling pathway. We hypothesize that adaptive resistance to targeted kinase inhibitors in these cancers is governed by their lineage-specific receptor dynamics and feedback regulation strengths. By incorporating the biochemistry of ERK signaling and the mechanisms of action of targeted drugs into an Ordinary Differential Equation model, we reproduced the adaptive response of these cancers to targeted inhibitors. To validate and extend the model, we generated time-course, single-cell data using multiplexed immunofluorescence and live-cell imaging and discovered that single-cell ERK signaling dynamics determine the adaptive drug resistance of these cancers.
Wednesday June 12th
- Registration and breakfast at 8:30am -- 9am
- Peter Kharchenko: Joint analysis of heterogeneous single-cell dataset collections
Single-cell RNA-seq assays are being increasingly applied in complex study designs, which involve measurements of many samples, commonly spanning multiple individuals, conditions, or tissue compartments. Joint analysis of such extensive, and often heterogeneous, sample collections requires a way of identifying and tracking recurrent cell subpopulations across the entire collection. We describe a flexible approach, called Conos (Clustering On Network Of Samples), that relies on multiple plausible inter-sample mappings to construct a global graph connecting all measured cells. The graph can then be used to propagate information between samples and to identify cell communities that show consistent grouping across broad subsets of the collected samples. Conos results enable investigators to balance between resolution and breadth of the detected subpopulations. In this way, it is possible to focus on the fine-grained clusters appearing within more similar subsets of samples, or analyze coarser clusters spanning broader sets of samples in the collection. We show its applications to integrated analysis of clinically-oriented single-cell transcriptional panels, timeseries, atlas-like collections, and integration across different molecular modalities.
- Savas Tay: Proximity Ligation Sequencing for Single Cell Proteomics
There is a great demand for a proteomic counterpart to RNA sequencing for high-throughput single cell studies. Proximity ligation assay (PLA) allows simultaneous detection of single proteins and protein complexes both in solution and in solid phase. We use DNA barcoded PLA probes to detect the abundance of proteins, protein complexes as well as protein modifications in single cells. The main advantages of this method are its almost unlimited multiplexing potential, the ability to detect protein complexes, and seamless integration with existing sequencing pipelines, allowing quantification of both proteins and nucleic acids in the same single cells.
- Coffee Break 10:45 -- 11am
- Jürgen Cox: Support for single cell analysis in the MaxQuant and Perseus software platforms
Jürgen Cox. MaxQuant is a popular software platform for the analysis of shotgun proteomics data. Recently, it has been demonstrated that mass spectrometry-based single cell proteomics is feasible and will hopefully become a scalable technology in the future. We are planning to extend the MaxQuant and Perseus platforms in order to support single cell studies. Since the biggest challenge for single cell proteomics is to provide sufficient sensitivity, we offer new functionalities in MaxQuant to address this problem. These include improved TMT quantification making use of reporter ions in unidentified MS/MS spectra and a new version of the Andromeda search engine which utilizes MS/MS fragment intensity prediction to increase the number of identified spectra. New plugins are developed for the Perseus platform in order to enable the downstream analysis of single cell data, both for proteomics and transcriptomics.
- Rune Linding: Probability-based detection of phosphoproteomic uncertainty reveals rare signaling events driven by oncogenic kinase gene fusion
Xavier Robin, Franziska Voellmy, Jesper Ferkinghoff-Borg, Conor Howard, Tom Altenburg, Mathias Engel, Craig D. Simpson, Gaye Saginc, Simon Koplev, Edda Klipp, James Longden, Rune Linding. We describe a novel Bayesian method for estimating protein concentration and phosphorylation site occupancy ratios from mass spectrometry experiments. Our variance model assigns standard deviations to all quantitative ratios, even when only a single peptide is observed, increasing the number of quantifiable observations in a sample compared to conventional methods. We further demonstrate the application of this method using a dataset investigating the impact of the PRKAR1A-RET gene fusion in immortalized thyroid cells.
- Lunch and Poster Session 12:30 -- 2pm
- Yuval Kluger: High dimensional approaches for analyzing single cells datasets
- Sara Rouhanifard: Building a toolbox of single-cell technologies for nucleic acid detection
- Albert Chen: DART-ID increases single-cell proteome coverage
- Coffee Break 4pm -- 4:30pm
- Evan Macosko: Revealing new cell types and states in the brain with scalable, single-cell genomics
- All: Breakout discussions 5:30 pm -- 6 pm
- Summary from the breakout discussions 6 pm -- 6:30 pm
- Closing remarks
Uri Shaham, George Linderman, Ofir Lindenbaum, Boaz Nadler, Ariel Jaffe and Yuval Kluger
High throughput single cell techniques introduce new challenges such as dimensionality reduction and visualization of datasets with millions of cells, batch effects, missing values etc. We provide several algorithmic solutions for efficient linear and nonlinear dimensionality reduction as well as visualization techniques. We also provide a nonlinear multivariate deep learning technique for removal of batch effects in scRNA-seq and mass cytometry data.
A key challenge in bioinformatics is how to rank and combine the possibly conflicting predictions of several algorithms of unknown reliability. We provide new mathematical insights of striking conceptual simplicity that explain mutual relationships between independent classifiers/algorithms. These insights enable the design of efficient, robust and reliable methods to rank the classifiers' performance and construct improved predictions in the absence of ground truth.
Sara H Rouhanifard, Ian A Mellis, Margaret Dunagin, Sareh Bayatpour, Connie L Jiang, Ian Dardani, Orsolya Symmons, Benjamin Emert, Eduardo Torre, Allison Cote, Alessandra Sullivan, John A Stamatoyannopoulos & Arjun Raj
Single-molecule RNA fluorescence in situ hybridization (RNA FISH), which enables the direct detection of individual RNA molecules, has emerged as a powerful technique for measuring both RNA abundance and localization in single cells. Yet, while single molecule RNA FISH is simple and robust, the total signal generated by single molecule RNA FISH probes is low, thus requiring high-powered microscopy for detection. This keeps throughput relatively low and precludes the ability to use downstream detection methods such as flow cytometry. As such, high efficiency, high gain amplification methods for single molecule FISH signal could enable a host of new applications. I will present click-amplifying fluorescent in situ hybridization (clampFISH), which allows high specificity and high-gain signal amplification of fluorescent signal with single-molecule resolution. I will demonstrate that clampFISH enables a broad range of basic and translational applications, and creates a unique opportunity to pursue new research avenues for gene expression in disease and development.
Albert T. Chen, Alexander Franks, Nikolai Slavov
Analysis by liquid chromatography and tandem mass spectrometry can identify and quantify thousands of proteins in microgram-level samples, such as those comprised of thousands of cells. This process, however, remains challenging for smaller samples, such as the proteomes of single mammalian cells, because reduced protein levels reduce the number of confidently sequenced peptides. To alleviate this reduction, we developed Data-driven Alignment of Retention Times for IDentification (DART-ID). DART-ID implements principled Bayesian frameworks for global retention time (RT) alignment and for incorporating RT estimates towards improved confidence estimates of peptide-spectrum-matches. When applied to bulk or to single-cell samples, DART-ID increased the number of data points by 30 – 50% at 1% FDR, and thus decreased missing data. Benchmarks indicate excellent quantification of peptides upgraded by DART-ID and support their utility for quantitative analysis, such as identifying cell types and cell-type specific proteins. The additional datapoints provided by DART-ID boost the statistical power and double the number of proteins identified as differentially abundant in monocytes and T-cells. DART-ID can be applied to diverse experimental designs and is freely available at http://github.com/SlavovLab/DART-ID.
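The idea of using retention time as additional evidence for a peptide-spectrum match can be sketched with Bayes' rule. This is a simplified illustration and not the DART-ID implementation: it assumes a normal RT density around the aligned expected RT for correct matches and a uniform density over the run for incorrect matches, and all function and parameter names are hypothetical.

```python
import math


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of a normal distribution with mean mu and std dev sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def updated_pep(pep: float, observed_rt: float, expected_rt: float,
                rt_sd: float, run_length: float) -> float:
    """Posterior error probability of a PSM after observing its RT.

    pep is the prior probability that the match is incorrect (from the
    spectrum alone). RTs, rt_sd and run_length share the same time unit.
    """
    like_correct = normal_pdf(observed_rt, expected_rt, rt_sd)
    like_incorrect = 1.0 / run_length  # assumed uniform null over the run
    numerator = like_incorrect * pep
    denominator = numerator + like_correct * (1.0 - pep)
    if denominator == 0.0:
        return pep  # no usable RT evidence; keep the prior
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical PSM with a spectral PEP of 0.2 in a 60 min run.
    print(updated_pep(0.2, observed_rt=30.1, expected_rt=30.0,
                      rt_sd=0.5, run_length=60.0))
    print(updated_pep(0.2, observed_rt=38.0, expected_rt=30.0,
                      rt_sd=0.5, run_length=60.0))
```

Under these assumptions, a match eluting close to its expected RT gains confidence and can pass an FDR threshold it previously missed, while a match eluting far from the expected RT is penalized.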
Samuel G. Rodriques, Robert R. Stickels, Aleksandrina Goeva, Carly A. Martin, Evan Murray, Charles R. Vanderburg, Joshua Welch, Linlin M. Chen, Fei Chen, Evan Z. Macosko
Exciting developments in next generation sequencing, microfluidics, and microscopy have spurred an era of new technologies to measure gene expression in individual cells and in tissues. I will discuss our technological contributions in the space of single-cell gene expression analysis, as well as a new technology we developed, in collaboration with Fei Chen’s lab, called Slide-seq, which quantifies genome-wide expression at 10 micron spatial resolution. I’ll also highlight some areas of biology in which we are particularly focused on deploying these new tools.