Weekly reads 1/12/25
Context is king: from polyclonal beginnings to spatial rules and single-cell phenomics
This week’s papers collectively challenge long-held assumptions about where disease begins, how cells communicate, and what we can learn from spatial and single-cell data. Van Egeren et al. overturn the classic monoclonal-origin model of early colorectal neoplasia, showing that premalignant lesions can emerge from cooperative, genetically diverse clones. Bues et al. introduce IRIS, a breakthrough deterministic platform that unifies high-resolution morphology with full single-cell transcriptomes, revealing hidden phenotypic states and pushing “single-cell phenomics” into practice. Lisek et al. map the spatiotemporal choreography of tumor–stromal interactions in triple-negative breast cancer, highlighting myCAFs as early architects of invasive niches. Cerezo-Wallis et al. build NeuMap, a global atlas that reorganizes neutrophil heterogeneity into conserved functional hubs programmed by cytokines and tissue cues. Finally, Panwar et al. present ClustSIGNAL, an adaptive spatial clustering method that harnesses local neighborhood entropy to better classify cell types in high-resolution spatial transcriptomics.
Preprints/articles that I managed to read this week
Polyclonal origins of human premalignant colorectal lesions
Van Egeren et al. Nature (2025). https://doi.org/10.1038/s41586-025-09930-y
The paper in one sentence
Many pre-cancerous colon polyps in individuals with a hereditary condition arise not from a single mutant cell, but from the collective expansion of multiple, genetically distinct cell populations—challenging the long-standing “monoclonal origin” theory of cancer.
Summary
The prevailing model of cancer initiation holds that a single cell acquires a “driver” mutation (like in the APC gene for colon cancer) and clonally expands to form a tumor. This study provides compelling evidence that this is not always the case. By analyzing 123 tissue samples, including normal colon, benign polyps, dysplastic (pre-cancerous) polyps, and full cancers, from six patients with Familial Adenomatous Polyposis (FAP), the authors discovered that a significant fraction of polyps (40% of benign, 28% of dysplastic) are polyclonal. Using whole-genome sequencing of bulk tissue and single colonic crypts, they showed these lesions are composed of multiple lineages that diverged genetically very early in life, long before the polyp formed. Critically, in these polyclonal polyps, classic driver mutations (like the second APC “hit”) were often found at low, subclonal frequencies, suggesting they were not the initial spark for growth. Instead, the data point toward a model where groups of genetically diverse cells are somehow cooperatively recruited or expanded by tissue-level or microenvironmental signals to initiate a pre-cancerous lesion.
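To make the "low, subclonal frequencies" point concrete, here is a minimal sketch of how a variant allele frequency (VAF) translates into the fraction of cells carrying a mutation. It assumes a heterozygous mutation in a copy-neutral diploid region; the function names, the 0.8 cutoff, and the example numbers are my own illustrative choices, not values or methods from the paper.

```python
def mutant_cell_fraction(vaf: float, purity: float = 1.0) -> float:
    """Estimate the fraction of lesion cells carrying a mutation.

    Assumes a heterozygous mutation in a copy-neutral diploid region, so a
    mutation present in every lesion cell has an expected VAF of purity / 2.
    """
    if not 0.0 <= vaf <= 1.0:
        raise ValueError("vaf must be in [0, 1]")
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    return min(1.0, 2.0 * vaf / purity)


def is_subclonal(vaf: float, purity: float = 1.0, threshold: float = 0.8) -> bool:
    """Flag a mutation carried by clearly fewer than all lesion cells.

    The 0.8 cutoff is an arbitrary illustrative choice.
    """
    return mutant_cell_fraction(vaf, purity) < threshold


# Hypothetical example: two mutations in a sample with 90% lesion cells.
for name, vaf in [("mutation_A", 0.45), ("mutation_B", 0.10)]:
    print(name, mutant_cell_fraction(vaf, purity=0.9), is_subclonal(vaf, purity=0.9))
```

Under these assumptions, a driver sitting at a VAF of 0.10 would be present in only about a fifth of the lesion's cells, which is hard to reconcile with it being the event that founded the whole polyp.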
Personal highlights
Why should we care?
This work fundamentally challenges a central dogma in cancer biology: that tumors are born from a single renegade cell. By demonstrating that pre-cancerous lesions often have a polyclonal origin, it forces a paradigm shift in how we think about the earliest stages of cancer. It suggests that the “soil” (the tissue microenvironment and cell-cell interactions) may be as important as the “seed” (a cell with a driver mutation) in initiating growth. For cancer biologists, this opens new avenues of research into cooperative mechanisms between premalignant clones and the role of non-cell-autonomous signaling. For clinicians, it complicates the picture of cancer prevention and early detection, as a genetically diverse starting lesion may be more adaptable and resilient. Ultimately, understanding polyclonal initiation could reveal new, tissue-level vulnerabilities to target for interception before a true, aggressive monoclonal cancer emerges.
IRIS: The Microscope That Can Read a Cell’s Genes
Bues et al. bioRxiv (2025). https://doi.org/10.1101/2025.11.28.690954
The paper in one sentence
IRIS is a new deterministic platform that for the first time seamlessly couples high-resolution, multi-channel microscopy with droplet-based single-cell RNA sequencing, enabling a true one-to-one link between a cell’s detailed morphology and its full transcriptome.
Summary
This preprint introduces IRIS (Interconnected Robotic Imaging and Sequencing), a transformative platform designed to close the long-standing gap between high-resolution cellular imaging and deep molecular profiling. IRIS uses deterministic microfluidics: it optically detects single cells, uses machine vision to stop and precisely position them for high-resolution imaging (brightfield and up to four fluorescence channels), and then encapsulates each cell in a nanoliter droplet containing a unique molecular barcode. Critically, by depositing these barcoded droplets into designated wells of a plate and using a dual-indexing (Cell Code + Well Code) strategy, IRIS guarantees a perfect, traceable link between every captured image and its corresponding transcriptome. The platform operates at a throughput of ~1,150 cells per hour with transcriptome sensitivity rivaling leading commercial methods. The authors demonstrate its power by resolving continuous cell cycle dynamics in FUCCI reporter cells, linking subtle fluorescent reporter intensities to DREAM complex activity and cell cycle speed. In a key discovery application, they molecularly profile two distinct nuclear-ER architectural classes within naïve CD8+ T cells (“stripy” TØ vs. “conventional” TØ), uncovering distinct transcriptional programs linked to activation propensity and validating associated protein expression and surface markers. IRIS establishes a foundational framework for “single-cell phenomics,” where form and molecular function are directly and quantitatively connected.
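The dual-indexing idea is easy to picture as a lookup table: a small set of Cell Codes is reused in every well, and the Well Code makes each pair unique. The sketch below is only my illustration of that bookkeeping; the class name, the integer codes, and the image filenames are hypothetical, and the real platform works with sequence barcodes rather than integers.

```python
from dataclasses import dataclass, field


@dataclass
class DualIndexRegistry:
    """Maps (well_code, cell_code) pairs to image identifiers."""

    n_cell_codes: int  # number of distinct Cell Codes reused in each well
    _images: dict = field(default_factory=dict)
    _count: int = 0

    def register(self, image_id: str) -> tuple[int, int]:
        """Assign the next free (well_code, cell_code) pair to an imaged cell."""
        well_code, cell_code = divmod(self._count, self.n_cell_codes)
        self._images[(well_code, cell_code)] = image_id
        self._count += 1
        return well_code, cell_code

    def image_for(self, well_code: int, cell_code: int) -> str:
        """Look up the image that belongs to a sequenced dual index."""
        return self._images[(well_code, cell_code)]


# Hypothetical run: 3 Cell Codes per well, 5 imaged cells.
registry = DualIndexRegistry(n_cell_codes=3)
pairs = [registry.register(f"cell_{i:04d}.tiff") for i in range(5)]
print(pairs)                     # [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1)]
print(registry.image_for(1, 0))  # cell_0003.tiff
```

Because the assignment is decided before sequencing, no probabilistic matching between images and transcriptomes is needed afterwards.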
Personal highlights
Deterministic droplet consortia for scalable, traceable integration: IRIS overcomes the stochastic barcoding limitation of modern high-throughput scRNA-seq by using a deterministic microfluidic workflow. It iteratively encapsulates cells with distinct “Cell Codes” into droplets that are deposited into pre-assigned wells, creating “droplet consortia.” The subsequent addition of a “Well Code” during PCR creates a unique dual-index for each cell, ensuring flawless, end-to-end traceability between a cell’s image and its transcriptome at scale.
Machine vision-guided single-cell manipulation and high-resolution z-stack imaging: the platform uses real-time image subtraction and object detection to identify, decelerate, and precisely position individual cells in a microfluidic channel for imaging. A piezoelectric stage acquires z-stacks across multiple fluorescence channels, providing submicron-resolution 3D morphological data (e.g., nucleus, ER, actin) that is automatically cropped and segmented using YOLO and DeepLabV3 models.
Continuous phenomic dissection of cell cycle dynamics beyond phase annotation: by applying IRIS to FUCCI reporter cells, the authors move beyond discrete cell cycle phases. They compute a continuous “cell cycle angle” from fluorescence intensities, revealing how FUCCI signal strength itself correlates with DREAM complex repression activity and predicts cell cycle progression speed in live-cell tracking, linking a dynamic phenotypic reporter directly to underlying transcriptional regulation.
Morphology-based stratification reveals hidden molecular programs within a single cell type: IRIS’s unique pairing capability allows stratification of cells by morphology before molecular analysis. In naïve CD8+ T cells, this identified the “stripy” TØ architecture (with ER enriched in nuclear invaginations) and linked it to a distinct transcriptional program enriched for TCR signaling modulators, chromatin remodelers, and activation markers—a heterogeneity invisible to transcriptomics alone and validated at the protein level.
Bridging modalities with interpretable machine learning: using simple, interpretable models (ResNet18 CNNs), the authors demonstrate that cell morphology (nuclear staining or whole-cell shape) can predict a subset of gene expression. This analysis not only recapitulated cell cycle genes but also uniquely identified a myofibroblast signature, showing that morphology encodes biological information beyond what is captured by transcriptional proxies alone.
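For the continuous "cell cycle angle" mentioned in the highlights, one plausible way to get such a quantity from two reporter channels is a polar transform. This is a sketch under my own assumptions (log transform, per-channel z-scoring, `arctan2`); the preprint's exact definition may differ, and the function and argument names are hypothetical.

```python
import numpy as np


def cell_cycle_polar(red, green):
    """Return (angle, radius) per cell from two reporter channels.

    Intensities are log-transformed and z-scored per channel (an assumption),
    then each cell's position in that 2D plane is converted to polar
    coordinates: the angle orders cells along the cycle, and the radius is a
    simple proxy for overall reporter signal strength.
    """
    x = np.log1p(np.asarray(red, dtype=float))
    y = np.log1p(np.asarray(green, dtype=float))
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    angle = np.mod(np.arctan2(y, x), 2 * np.pi)  # mapped to [0, 2*pi)
    radius = np.hypot(x, y)
    return angle, radius


# Hypothetical intensities for four cells (arbitrary units).
red = np.array([100.0, 1.0, 1.0, 100.0])
green = np.array([1.0, 1.0, 100.0, 100.0])
angle, radius = cell_cycle_polar(red, green)
print(np.degrees(angle))
print(radius)
```

Keeping the radius alongside the angle matters here, since the highlight above is precisely that signal strength carries information beyond phase.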
Why should we care?
IRIS addresses a core limitation of modern biology: we can list all the genes a cell expresses or take a high-resolution picture of it, but we rarely know how its specific physical structure directly relates to its molecular machinery. It provides a direct method to move from observing a curious cellular shape to immediately reading out the molecular program that accompanies it, generating testable hypotheses about function. Ultimately, by tightly coupling form and function, IRIS pushes us toward a more complete understanding of the “cellular dogma,” where phenotype and molecular state are inseparable. This has profound implications for discovering new biomarkers, understanding drug effects, and diagnosing disease based on a cell’s physical appearance and its molecular story, told simultaneously.
Spatiotemporal dynamics of tumor microenvironment remodeling
Lisek et al. bioRxiv (2025). https://doi.org/10.1101/2025.07.15.662972
The paper in one sentence
By combining high-resolution spatial transcriptomics with a novel triple-negative breast cancer mouse model, this study tracks, in space and time, how early interactions between tumor and stromal cells remodel the tumor microenvironment to drive invasion.
Summary
The study introduces a new triple-negative breast cancer (TNBC) mouse model that develops multifocal, asynchronous tumors along a consistent luminal-to-basal transdifferentiation path. Using spatial transcriptomics and single-nucleus RNA-seq across over 100 mammary ducts, the authors reconstruct the spatiotemporal dynamics of tumor microenvironment (TME) remodeling from pre-invasive to invasive stages. They identify cancer-associated myofibroblasts (myCAFs) as central organizers of pro-invasive extracellular matrix (ECM) remodeling at the tumor–stromal interface and show that myCAFs can steer tumor progression toward aggressive, invasive phenotypes in transplantation experiments. Key signaling axes, including early TGF-β activity and later tenascin-C deposition, orchestrate sequential stromal recruitment and ECM reorganization. The findings are conserved in patient-derived xenografts, suggesting broad relevance for TNBC TME-targeted therapies.
Personal highlights
Spatiotemporal ordering of disease progression: by aligning over 100 ductal lesions along a shared luminal-to-basal trajectory, the study reconstructs TME remodeling as a continuous, stage-resolved process, moving beyond static snapshots to capture dynamic cellular and molecular transitions.
MyCAFs as architects of invasive niches: myCAFs are shown to wrap tightly around advanced tumors, depositing a collagen- and tenascin-C-rich ECM that physically encapsulates invasive EMT-like tumor cells and promotes local invasion.
Early TGF-β signaling initiates stromal reprogramming: TGF-β emerges as a key early signal from tumor cells, driving the conversion of resident fibroblasts into inflammatory CAFs (iCAFs) and recruiting macrophages, a transient signaling axis that is rapidly downregulated as tumors progress.
Stromal zonation at the tumor–stromal interface: spatial mapping reveals a reproducible gradient of stromal phenotypes: iCAFs and macrophages populate distal periductal regions, while myCAFs and TAMs accumulate at the immediate tumor border, creating a spatially organized “hot fibrosis” niche.
Functional validation through co-transplantation: injecting tumor cells together with myCAFs into immunocompetent mice shifts tumor phenotype from keratinizing to EMT-like, enhances ECM remodeling, and accelerates tumor growth, directly demonstrating the tumor-promoting capacity of myCAFs.
Why should we care?
This work provides a dynamic, spatially resolved blueprint of how tumors co-opt their microenvironment from the earliest stages, a view usually missing in clinical samples. By identifying myCAFs as key enablers of invasion and ECM remodeling, it highlights stromal targeting as a promising strategy to slow or prevent tumor progression, especially in aggressive cancers like TNBC.
NeuMap: Mapping the Global Neutrophil Compartment Across Tissues and Diseases
Cerezo-Wallis et al. Nature (2025). https://doi.org/10.1038/s41586-025-09807-0
The paper in one sentence
NeuMap is a single-cell transcriptional atlas that organizes neutrophil diversity into a conserved set of functional hubs across tissues, diseases, and species, revealing how neutrophils are programmed by local cues to adopt specific roles in health and disease.
Summary
This study integrates single-cell RNA-seq data from neutrophils across 47 anatomical, developmental, and pathological conditions in mice, and validates findings in humans, to construct a unified transcriptional map called NeuMap. NeuMap reveals that neutrophils exist in a limited number of functional states (hubs) that are conserved across tissues and species. These hubs are shaped by distinct cytokine signals (e.g., TGFβ, IFNβ, GM-CSF) and transcription factors (e.g., JUNB), which drive neutrophils toward roles in angiogenesis, immunosuppression, antiviral response, or tissue maturation. The study further shows that NeuMap can be used to infer host physiological states from blood neutrophil profiles, highlighting its potential for diagnostic and therapeutic applications.
Personal highlights
Unified atlas of neutrophil transcriptional states: NeuMap integrates 47 biological scenarios to define seven conserved neutrophil hubs, including immunosuppressive, angiogenic, interferon-responsive, and silent states, across tissues, diseases, and species.
Spatial and functional validation of transcriptional hubs: using spatial transcriptomics and cyclic immunofluorescence, the study links specific neutrophil hubs to microanatomical niches: e.g., IS-II neutrophils in tumor borders and IFN-response neutrophils around infected bronchioles.
JUNB as a central regulator of neutrophil programming: neutrophil-specific deletion of Junb disrupts angiogenic and immunosuppressive functions, impairing tissue revascularization and tumor progression, revealing AP-1–driven control of neutrophil polarization.
Deterministic cytokine-driven trajectories: in silico screening and in vitro modeling show that TGFβ, IFNβ, and GM-CSF push neutrophils along distinct transcriptional paths, mirroring in vivo maturation dynamics in health, cancer, and inflammation.
Diagnostic potential via blood neutrophil mapping: by projecting blood neutrophil transcriptomes onto NeuMap, the authors generate disease-specific “barcodes” that can distinguish between infections, cancers, and physiological states like aging or pregnancy.
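The "barcode" idea from the last highlight can be sketched as a hub-composition vector. Below I approximate projection by nearest-centroid assignment in a shared embedding and define the barcode as the fraction of cells per hub; this is a simplification of my own, not the authors' projection method, and the centroids and cells are made-up 2D points.

```python
import numpy as np


def hub_barcode(cells: np.ndarray, hub_centroids: np.ndarray) -> np.ndarray:
    """Fraction of cells assigned to each reference hub (nearest centroid).

    cells:         (n_cells, n_dims) query cells in the reference embedding
    hub_centroids: (n_hubs, n_dims) one centroid per reference hub
    """
    # Pairwise squared Euclidean distances, shape (n_cells, n_hubs).
    diffs = cells[:, None, :] - hub_centroids[None, :, :]
    dists = np.einsum("chd,chd->ch", diffs, diffs)
    nearest = dists.argmin(axis=1)
    counts = np.bincount(nearest, minlength=hub_centroids.shape[0])
    return counts / counts.sum()


# Hypothetical embedding with three hubs and four blood neutrophils.
centroids = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
cells = np.array([[0.5, 0.2], [9.0, 1.0], [8.5, -0.5], [1.0, 9.0]])
print(hub_barcode(cells, centroids))  # [0.25 0.5  0.25]
```

Two samples can then be compared by the distance between their barcodes, which is the intuition behind distinguishing conditions from blood alone.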
Why should we care?
Neutrophils are no longer just frontline immune soldiers: they are plastic, multifunctional cells that play decisive roles in cancer, tissue repair, infection, and autoimmunity. NeuMap provides the first global framework to understand how neutrophils adopt these roles. By decoding the “neutrophil language” of cytokines, transcription factors, and tissue cues, this work shifts neutrophils from passive responders to programmable therapeutic agents.
ClustSIGNAL: Adaptive Neighborhood Smoothing for Scalable Spatial Cell-Type Clustering
Panwar et al. bioRxiv (2025). https://doi.org/10.64898/2025.11.30.691081
The paper in one sentence
ClustSIGNAL is a novel spatial clustering method that adaptively smooths gene expression based on local neighborhood heterogeneity to accurately identify cell types and subtypes in high-resolution spatial transcriptomics data.
Summary
ClustSIGNAL addresses the challenges of data sparsity and noise in high-resolution spatial transcriptomics by introducing an adaptive smoothing approach that uses neighborhood entropy to guide gene expression imputation. Unlike existing methods, it tailors the degree of smoothing for each cell based on the heterogeneity of its local environment, smoothing more in homogeneous regions and preserving distinct expression in heterogeneous areas. The method was validated on simulated and real-world datasets, demonstrating superior accuracy, scalability, and robustness to segmentation errors compared to tools like BANKSY, BASS, and SpatialPCA.
Personal highlights
Adaptive neighborhood smoothing via entropy: ClustSIGNAL uses Shannon’s entropy calculated from initial subclusters to measure local heterogeneity and assign cell-specific smoothing weights—enabling flexible, context-aware imputation without over-smoothing.
Balances spatial coherence and transcriptional distinction: in homogeneous regions, smoothing is broad to stabilize expression; in mixed neighborhoods, smoothing is limited to preserve cell identity, effectively embedding spatial context without blurring biological signals.
Robust to sparsity and segmentation errors: maintains high clustering accuracy even under severe data dropouts (up to 90% sparsity) and simulated segmentation errors, outperforming both non-spatial and uniform smoothing approaches.
Scalable multi-sample clustering with batch correction: handles atlas-scale datasets (up to ~900k cells) across multiple samples with integrated Harmony batch correction, producing consistent labels without post-hoc matching.
Identifies biologically meaningful subtypes: successfully recapitulates and refines known cell types in complex tissues (e.g., forebrain subregions in mouse embryo, inhibitory/excitatory neuron subsets in hypothalamus), revealing subtle but distinct subpopulations.
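The entropy-guided smoothing described above can be sketched in a few lines. This is a toy version under my own assumptions: weights decay exponentially with neighbour rank, and the decay gets steeper as neighbourhood entropy rises. The function names and the `rate` parameter are hypothetical and do not mirror the ClustSIGNAL package API or its actual kernels.

```python
import numpy as np


def neighbourhood_entropy(labels: np.ndarray) -> float:
    """Shannon entropy (bits) of the subcluster labels in one neighbourhood."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


def adaptive_smooth(expr, neighbours, subclusters, rate=2.0):
    """Smooth each cell's expression over its neighbours.

    expr:        (n_cells, n_genes) expression matrix
    neighbours:  (n_cells, k) neighbour indices sorted by distance, with
                 column 0 being the cell itself
    subclusters: (n_cells,) initial subcluster labels
    rate:        illustrative scaling between entropy and weight decay
    """
    expr = np.asarray(expr, dtype=float)
    neighbours = np.asarray(neighbours)
    subclusters = np.asarray(subclusters)
    ranks = np.arange(neighbours.shape[1])
    smoothed = np.empty_like(expr)
    for i, nbrs in enumerate(neighbours):
        h = neighbourhood_entropy(subclusters[nbrs])
        # Zero entropy gives uniform weights (broad smoothing); higher entropy
        # gives a steeper decay, so weight concentrates on the cell itself.
        w = np.exp(-rate * h * ranks)
        w /= w.sum()
        smoothed[i] = w @ expr[nbrs]
    return smoothed


# Hypothetical toy data: three similar cells and one distinct cell, one gene.
expr = np.array([[0.0], [2.0], [4.0], [100.0]])
neighbours = np.array([[0, 1, 2], [1, 0, 2], [2, 1, 3], [3, 2, 1]])
subclusters = np.array([0, 0, 0, 1])
print(adaptive_smooth(expr, neighbours, subclusters))
```

In this toy case the cell in the homogeneous neighbourhood is averaged with its neighbours, while the distinct cell in a mixed neighbourhood keeps most of its own signal, which is the behaviour the highlights describe.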
Why should we care?
ClustSIGNAL provides a principled and scalable solution to one of the main challenges in spatial omics: how to leverage spatial context without sacrificing transcriptional resolution. By adaptively integrating neighborhood information, it enables more accurate and interpretable cell typing in tissues with complex architectures.
Other papers that piqued my interest and were added to the purgatory of my “to read” pile
PatchDNA: A Flexible and Biologically-Informed Alternative to Tokenization for DNA
High-confidence structural predictions of extrachromosomal DNA with ecDNAInspector
Genetic barcoding of individual cells links cancer evolutionary trajectories and prognostic outcomes
SpatialProp: tissue perturbation modeling with spatially resolved single-cell transcriptomics
Mapping single-cell diploid chromatin fiber architectures using DAF-seq
Joint imputation and deconvolution of gene expression across spatial transcriptomics platforms
Unified integration of spatial transcriptomics across platforms with LLOKI
Label-free selection of marker genes in single-cell and spatial transcriptomics with geneCover
Accurate predictions on small data with a tabular foundation model
Decay of driver mutations shapes the landscape of intestinal transformation
Thanks for reading.
Cheers,
Seb.


