Congratulations to Shila Ghazanfar for leading our team to victory at the inaugural “Oz Single Cells 2018” data analysis challenge.

Utilising ambient RNA and damaged cell profiles for appropriate cell selection in droplet-based single cell transcriptomics

Shila Ghazanfar, Ellis Patrick, Jean Yang

High-throughput droplet technology has facilitated simultaneous profiling of the entire transcriptomes of thousands of single cells. An important analysis step is to distinguish droplets containing cells of interest from those containing only ambient RNA or heavily damaged cells. Inappropriate selection of cells could result in inconsistencies across multiple datasets and introduce cell-type-specific biases, e.g. towards smaller cells with less total RNA. Recent efforts include the emptyDroplets method, which distinguishes ‘empty’ droplets from cell-containing droplets [1] but retains damaged cells characterised by a high proportion of mitochondrial genes. This necessitates post-hoc removal of the damaged cells, leading to computational costs and significant changes in the interpretation of results.

To this end, we propose an extension to the emptyDroplets approach that uses damaged cell profiles to select cell barcodes that are distinct from both ambient RNA and damaged cells. We applied our method to the Nguyen et al (2018) dataset and found that the approaches differ in the number of cells selected, their characteristics, and downstream analysis results. A visual summary suggests that a non-linear approach to selecting cells is more appropriate, and that techniques utilising information from all cell barcodes hold promise for quality assessment.
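
The idea of testing each barcode against both an ambient profile and a damaged-cell profile can be illustrated with a small sketch. This is not the authors' implementation, which the abstract does not describe in detail. It assumes a plain multinomial model, a Monte Carlo p-value in the spirit of [1], and hypothetical names (`profile`, `monte_carlo_pvalue`, `select_cell`) and toy counts in which gene 0 stands in for a mitochondrial gene.

```python
import math
import random


def profile(count_vectors, pseudocount=1.0):
    """Pool counts over a set of barcodes into gene proportions.

    The pseudocount (an assumption of this sketch) keeps every
    proportion strictly positive so that log() is always defined.
    """
    n_genes = len(count_vectors[0])
    totals = [pseudocount] * n_genes
    for vec in count_vectors:
        for g, c in enumerate(vec):
            totals[g] += c
    s = sum(totals)
    return [t / s for t in totals]


def log_multinomial(counts, props):
    """Log-probability of a count vector under a multinomial model."""
    out = math.lgamma(sum(counts) + 1)
    for c, p in zip(counts, props):
        out += c * math.log(p) - math.lgamma(c + 1)
    return out


def monte_carlo_pvalue(counts, props, n_sim=200, seed=0):
    """Share of simulated barcodes that are no more likely than the observed one."""
    rng = random.Random(seed)
    genes = list(range(len(props)))
    total = sum(counts)
    observed = log_multinomial(counts, props)
    hits = 0
    for _ in range(n_sim):
        sim = [0] * len(props)
        for g in rng.choices(genes, weights=props, k=total):
            sim[g] += 1
        if log_multinomial(sim, props) <= observed:
            hits += 1
    return (hits + 1) / (n_sim + 1)


def select_cell(counts, ambient, damaged, alpha=0.05):
    """Keep a barcode only if it deviates from BOTH reference profiles."""
    return (monte_carlo_pvalue(counts, ambient) <= alpha
            and monte_carlo_pvalue(counts, damaged) <= alpha)


# Toy reference profiles (hypothetical data, four genes).
ambient = profile([[2, 3, 2, 3], [3, 2, 3, 2]])
damaged = profile([[40, 3, 4, 3], [38, 4, 3, 5]])

healthy_like = [2, 10, 8, 80]   # dominated by a non-mitochondrial gene
damaged_like = [75, 8, 8, 9]    # dominated by the mitochondrial stand-in
ambient_like = [3, 2, 2, 3]     # low counts, close to the ambient mix

for name, vec in [("healthy", healthy_like),
                  ("damaged", damaged_like),
                  ("ambient", ambient_like)]:
    print(name, select_cell(vec, ambient, damaged))
```

In this sketch a barcode that resembles either reference profile is discarded in a single pass, so no separate post-hoc filter on mitochondrial proportion is needed. How the damaged-cell profile is estimated in practice is left open here.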

[1] Distinguishing cells from empty droplets in droplet-based single-cell RNA sequencing data. Aaron Lun, Samantha Riesenfeld, Tallulah Andrews, The Phuong Dao, Tomas Gomes, participants in the 1st Human Cell Atlas Jamboree, John Marioni. bioRxiv 234872; doi: