Mellon, a single-cell analysis tool, identifies transitory cells and differentiation mechanisms

From the Setty Lab, Basic Sciences Division

As stem cells decide who they want to become, they embark on a journey of gradual differentiation into their final cell fate. This transformation is traditionally depicted by the Waddington landscape, conceived by Conrad Waddington in 1957 to illustrate cell-fate commitment. Here, stem cells are metaphorically depicted as a ball rolling down a mountainous landscape marked by hills and valleys, symbolizing the progressive restriction of cellular differentiation. The landscape as a whole represents the cell’s entire differentiation process, the valleys are possible differentiation stages, either intermediate or final, and the ridges keep cells from switching between cell fates.

With the advances in single-cell technologies, we’re now able to view cells at various states along their differentiation trajectory, which can help us understand the molecular mechanisms underlying these cell-fate decisions. But as we’re looking at our UMAPs of Leiden clusters, we’re mostly capturing cells that are in their valleys, those intermediate cell fates which are the easiest places for them to hang out. But what about the transitory cells that are climbing that hill into the next valley? Dr. Manu Setty, an Assistant Professor in Fred Hutch’s Basic Sciences Division, explains that classical analyses often miss these transient cells that are in the process of differentiating, reprogramming or even metastasizing. Dr. Dominik Otto, a postdoctoral researcher in the Setty lab, sought to develop a new way of analyzing single-cell data to uncover the mechanisms driving these critical transitions.

The problem with single-cell datasets, like single-cell RNA-seq, which examines gene expression levels of single cells, is that they are inherently noisy, sparse and high-dimensional. Current methods have failed to represent these complex datasets in a biologically meaningful way, limiting the identification and interpretation of these transitory cells. To overcome these challenges, the Setty lab developed Mellon, a novel computational algorithm that estimates cell density from single-cell data. This study, led by Dr. Otto, is currently available on bioRxiv and has been accepted for publication in Nature Methods.

Illustrative depiction of Mellon. A. Abstract depiction of cellular differentiation landscape where biological processes impact cell-state density. B. Density function visualized as a 3D landscape. C. Single-cell density depiction of high- and low-density regions along differentiation paths. Image taken from original article.

“When we look at a single-cell dataset, we’re looking at a distribution of possible cell states, where each state has a probability of being observed in a biological system,” says Dr. Otto. From this perspective, “one cell is just a sample from the distribution of possible cell states in a tissue,” he adds. Because a dataset never captures every cell in a tissue, it carries inherent uncertainty that is hard to incorporate into more complex analyses. For example, uncertainty might look like two cells in a tight cluster expressing a specific gene at different levels. It is likely that a cell with an intermediate expression level of the same gene also exists in the tissue, even if it wasn’t captured in the dataset. Mellon addresses this quickly with an approach based on the intrinsic relationship between the distance from each cell to its nearest neighbor and the density of cells in that region: where cell states are dense, nearest neighbors are close, and where they are sparse, nearest neighbors are far apart. The distribution of these nearest-neighbor distances is linked with cell-state density using a Poisson distribution. While other methods interpret single-cell datasets as a collection of discrete cell states, Mellon infers a continuous distribution of cell states to represent the underlying biological system.
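To make the link between nearest-neighbor distance and density concrete, here is a minimal Python sketch. It is not Mellon's implementation and does not use the Mellon package. It assumes that cells are scattered locally like a Poisson point process with rate lam in d dimensions, so the probability of finding no neighbor within distance r is exp(-lam * V_d * r**d), where V_d is the volume of the unit ball. Under that assumption, the most likely density given a single nearest-neighbor distance is 1 / (V_d * r**d). The function names and the toy data are hypothetical and chosen only for illustration.

```python
import math

import numpy as np
from scipy.spatial import cKDTree


def unit_ball_volume(d):
    """Volume of the unit ball in d dimensions."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


def nn_log_density(points):
    """Crude per-cell log-density from nearest-neighbor distances.

    Assumes a locally homogeneous Poisson point process, for which the
    single-observation maximum-likelihood density is 1 / (V_d * r**d).
    Exact duplicate points give r = 0 and therefore an infinite estimate.
    """
    points = np.asarray(points, dtype=float)
    d = points.shape[1]
    # k=2 because the closest hit for each query point is the point itself.
    dist, _ = cKDTree(points).query(points, k=2)
    r = dist[:, 1]
    return -(math.log(unit_ball_volume(d)) + d * np.log(r))


# Toy data (hypothetical): a tightly packed "valley" and a spread-out region.
rng = np.random.default_rng(0)
dense = rng.normal(loc=0.0, scale=0.1, size=(500, 2))
sparse = rng.normal(loc=10.0, scale=2.0, size=(500, 2))
log_density = nn_log_density(np.vstack([dense, sparse]))

print("mean log-density, dense region: ", log_density[:500].mean())
print("mean log-density, sparse region:", log_density[500:].mean())
```

Each per-cell value here rests on a single distance, so it is very noisy, and it is defined only at the sampled cells. The article describes Mellon as inferring a continuous distribution over cell states, which is a step this sketch deliberately leaves out.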

The authors first applied Mellon to human blood stem-cell differentiation, known as hematopoiesis. Here they discovered a strong correlation between low-density regions, cell-fate specification, and the rapid transcriptional changes that drive lineage specification. The researchers then used Mellon to investigate the dynamic process of mouse embryo gastrulation and early organ development, where they observed that new cell types or lineages typically emerged as transitory low-density cell populations. The authors then applied this method to various single-cell modalities, including scATAC-seq and single-cell chromatin modification datasets. Their work uncovered important biological processes, such as enhancer priming, that drive the rapid transcriptional changes in low-density states, while demonstrating the versatility of Mellon with diverse single-cell datasets.

One distinguishing feature of Mellon is its robustness and scalability: Mellon is capable of handling millions of cells without compromising accuracy. Drs. Otto and Setty agree that one of the biggest challenges of this study was ensuring that Mellon was not only accurate but also computationally efficient. Dr. Otto describes one of the early iterations of this algorithm as being extremely time intensive, with one particular step taking more than 24 hours. Dr. Setty explains that “Dominik took to deriving equations with pencil and paper to make this work.” He adds that these “density equations have to naturally be able to scale to larger cell counts and dimensions,” something that Dr. Otto refers to as “the curse of dimensionality.” In the end, the solution to this time-intensive step did not involve dimensionality at all. “We identified one parameter that was crucial, but was computationally intensive,” Dr. Setty says. Because the result changes very little across different values of this parameter, the authors stopped having the algorithm search for its precise value and instead set it to a fixed value that is automatically adapted to any scenario. This adjustment reduced the required computing power so drastically that the step, which had initially taken more than 24 hours, came down to just five minutes!

Mellon is the “first computational tool to create a reliable and robust way to compute cell-state density in high dimensional spaces,” states Dr. Setty. He adds that even though this tool was designed, with efficiency and robustness in mind, to analyze low-density cell states using single-cell data, this program “makes very few assumptions about the data and there is nothing about it that restricts it to single-cell analysis or even biology in general.” In terms of biology, however, Mellon revealed that cell states are not uniformly distributed across a differentiation trajectory, and that the ridges and valleys of the Waddington landscape carry clear biological significance. This computational tool provides a robust way to focus on the mechanistic processes underpinning those cell-fate decisions and a new way to understand the biology of these rare transitory cells. While this research has spurred a variety of other projects in the Setty lab, Dr. Otto emphasizes that “many diseases are a disruption to the distributions of cell states and Mellon will be an excellent tool to identify the significance of those differences. This could have important translational applications.”

Mellon is a user-friendly open-source software package available at github.com/settylab/Mellon with complete documentation and tutorials.


This work was supported by the National Institutes of Health and the Brotman Baty Institute.

Fred Hutch/UW/Seattle Children’s Cancer Consortium member Dr. Manu Setty contributed to this work.

Otto D, Jordan C, Dury B, Dien C, Setty M. Quantifying Cell-State Densities in Single-Cell Phenotypic Landscapes using Mellon. bioRxiv [Preprint]. 2023.