Weekly reads 9/3/26
How tumors adapt and how new tools are helping us see it
This week’s reads span a wide range, from metabolic tricks that boost immunotherapy to new computational tools reshaping single-cell analysis and foundation models that simulate cellular futures. A clinically viable 16-hour fasting regimen surprisingly increases CD8⁺ T-cell-mediated tumor cell killing by rewiring tumor metabolism through the amino acid isoleucine. On the computational front, CellSweep provides a fast and interpretable method for cleaning single-cell data, while PerturbGen, trained on over 100 million cells, predicts the effects of genetic perturbations on developmental trajectories. Other papers provide new insights into tumor cell plasticity and microenvironmental support, such as how lung cancer cells offload damaged mitochondria to fibroblasts to survive targeted therapy, and how head and neck cancer cells decouple differentiation from loss of self-renewal, which has implications for differentiation therapy. Finally, a systematic benchmark reveals the potential and pitfalls of measuring the activities of transposable elements in single-cell RNA-seq.
Preprints/articles that I managed to read this week
16-h fasting optimizes cancer immunotherapy in mice and humans
Chen, S. et al. Cell Metabolism (2026). https://doi.org/10.1016/j.cmet.2026.01.015
The paper in one sentence
A clinically feasible 16-hour fasting regimen reshapes the tumor microenvironment by causing cancer cells to release isoleucine, which fuels CD8+ T cell cytotoxicity via acetyl-CoA-driven epigenetic and metabolic remodeling, enhancing immunotherapy efficacy in both mice and patients with colorectal cancer.
Summary
Dietary interventions can influence cancer therapy, but prolonged fasting is poorly tolerated in patients who may already be malnourished. Chen and colleagues test a brief, overnight 16-hour fasting regimen, already standard preoperative practice, in mouse tumor models and a prospective cohort of colorectal cancer (CRC) patients. In B16 and MC38 tumour-bearing mice, 16-hour fasting remodelled the tumour immune microenvironment. Single-cell RNA-seq revealed enhanced CD8+ T cell cytotoxicity (increased IFNγ, GZMB) and reduced exhaustion markers (PD1, TIM3, TIGIT), without altering circadian rhythms. In a pilot human study (12 CRC patients), those undergoing preoperative fasting showed expansion of cytotoxic Temra (terminally differentiated effector memory) cells and reduced exhaustion trajectories compared to fed controls. The mechanism centres on the branched-chain amino acid isoleucine. Untargeted metabolomics of tumour interstitial fluid (TIF) identified isoleucine as the most significantly upregulated metabolite after 16-hour fasting—specifically at the 16-hour time point, not earlier. This accumulation correlated with fasting duration in a second patient cohort. Isoleucine proved essential for CD8+ T cell function. Depletion impaired proliferation and effector function; supplementation in nutrient-deprived tumour-conditioned medium (TCM) restored both. In vivo, isoleucine administration slowed tumour growth in a CD8+ T cell-dependent manner and synergized with anti-PD1 therapy. Mechanistically, isoleucine enters CD8+ T cells via the LAT1 transporter, is catabolized by BCAT2, and fuels the acetyl-CoA pool. Isotope tracing confirmed conversion of ¹³C-isoleucine into acetyl-CoA and TCA intermediates. This acetyl-CoA drives two critical processes: (1) histone acetylation at effector gene loci (Ifng, Gzmb, Tbx21), increasing chromatin accessibility, and (2) phospholipid synthesis, supporting membrane integrity and cytotoxic morphology. 
BCAT2-deficient T cells lacked these responses and failed to mediate anti-tumour effects. Crucially, the isoleucine originates from tumour cells, not serum. Under fasting-induced glutamine deprivation, tumour cells upregulate antiporter activity (SLC3A2/SLC7A5) that exchanges intracellular isoleucine for extracellular glutamine—a metabolic trade-off that releases isoleucine into the TME. Knockout of Slc3a2 in tumour cells abolished isoleucine accumulation and abrogated fasting-induced T cell enhancement. In a prospective neoadjuvant immunotherapy trial (NCT05731726), CRC patients who fasted for 16 hours before anti-PD1 infusion showed expanded Temra populations, enhanced cytotoxic signatures, and reduced tumour size compared to non-fasted controls.
Personal highlights
16-hour fasting opens a metabolic window for immunotherapy: unlike prolonged fasting regimens that are poorly tolerated, a single overnight 16-hour fast, already clinically routine before surgery, is sufficient to remodel the tumour immune microenvironment. In mice, this brief fast enhanced CD8+ T cell cytotoxicity and reduced exhaustion; in a pilot human study, it expanded cytotoxic Temra cells and improved anti-PD1 responses.
Isoleucine emerges as the critical fasting-induced metabolite: untargeted metabolomics of tumour interstitial fluid identified isoleucine as the most significantly upregulated metabolite after 16-hour fasting, specifically at the 16-hour time point. This accumulation correlated with fasting duration in CRC patients and was not observed for other branched-chain amino acids (leucine, valine) or immune-relevant amino acids (arginine, serine).
Isoleucine fuels CD8+ T cell effector function via acetyl-CoA: depletion of isoleucine impaired CD8+ T cell proliferation and IFNγ/GZMB production; supplementation in nutrient-deprived medium restored them. Isotope tracing showed conversion of ¹³C-isoleucine into acetyl-CoA and TCA intermediates. This acetyl-CoA drives histone acetylation at effector gene loci (Ifng, Gzmb, Tbx21) and supports phospholipid synthesis for membrane integrity, dual mechanisms linking a single amino acid to both epigenetic and metabolic control of cytotoxicity.
Tumour cells release isoleucine via a glutamine-exchange trade-off: under fasting-induced glutamine deprivation, tumour cells upregulate the antiporter SLC3A2/SLC7A5, exchanging intracellular isoleucine for extracellular glutamine. CRISPR screening with an isoleucine FRET sensor (OLIVE) identified SLC3A2 as the key efflux transporter. Slc3a2 knockout abolished isoleucine accumulation in TIF and abrogated fasting-induced T cell enhancement, confirming that tumour cells are the source.
Clinical proof-of-concept in neoadjuvant immunotherapy: in a prospective trial, pMMR/MSS rectal cancer patients who fasted for 16 hours before anti-PD1 infusion showed expanded Temra populations, enhanced cytotoxic signatures, and reduced tumour size compared to non-fasted controls. This demonstrates that a simple, well-tolerated dietary intervention can meaningfully improve immunotherapy outcomes.
Why should we care?
Immunotherapy has transformed cancer treatment, but most patients still don’t respond, and resistance remains a major challenge. Meanwhile, dietary interventions have shown promise but require prolonged, poorly tolerated regimens, a non-starter for patients already at risk of malnutrition and cachexia. This study flips that narrative. A single overnight 16-hour fast, already standard practice before surgery, is sufficient to reshape the tumour microenvironment in a way that enhances immunotherapy. The mechanism is elegant: fasting creates a metabolic tug-of-war where tumour cells, starved of glutamine, release isoleucine as a trade-off. CD8+ T cells capture this isoleucine and use it to fuel the very programs (epigenetic remodeling, lipid synthesis, mitochondrial respiration) that underlie effective killing. The clinical proof-of-concept, though small, is striking. Patients who fasted showed expanded cytotoxic T cell populations and smaller tumours after neoadjuvant anti-PD1 therapy.
Single-cell genomics decontamination with CellSweep
Caskey, M. et al. bioRxiv (2026). https://doi.org/10.64898/2026.03.04.709349
The paper in one sentence
CellSweep is a fast, interpretable probabilistic model that removes ambient and bulk contamination from single-cell genomics data using an expectation-maximization algorithm, outperforming existing methods across multiple benchmarks while running in under a minute.
Summary
Caskey and colleagues introduce CellSweep, a generative model that decomposes observed counts for each barcode into three interpretable components: cell-type expression, ambient contamination, and global bulk contamination. The model assumes a multinomial distribution conditional on total UMI counts, with the expected expression profile for each cell as a convex combination of these sources. A key innovation is the use of non-cellular barcodes (empty droplets) to obtain an empirical estimate of the ambient RNA profile, stable and unbiased due to the large number of empty droplets typical in droplet-based assays. Bulk contamination is initialized from the global mean expression across all droplets. Cell-type labels are provided upfront (e.g., from CellTypist), and parameters are inferred via a closed-form expectation-maximization (EM) algorithm that parallelizes perfectly across cells. When non-cellular barcodes are unavailable (e.g., in well-based protocols like Smart-seq2), CellSweep offers an alternative model where ambient RNA is modeled as a mixture of cell-type profiles, with mixture weights updated via a nested EM procedure.
The authors benchmark CellSweep against SoupX, CellBender, DecontX, and scAR across multiple datasets and modalities. In a human-mouse mixture 10x dataset, CellSweep removes >98% of cross-species contamination while retaining >97% of true-species counts—substantially better than competitors. It performs similarly well on Smart-seq2 and ATAC-seq data. In a Visium HD spatial dataset with human cancer cells grafted in mouse, CellSweep not only removes cross-species contamination but also reveals spatial patterns of ambient noise: cells with high predicted ambient fractions (αᵢ) localize to tissue edges, consistent with edge artifacts. On a PBMC 8k dataset, CellSweep cleans up marker gene expression (e.g., removing neutrophil markers from non-neutrophil clusters) while preserving pan-leukocyte markers like PTPRC. It achieves this with mean removal of 668 counts per cell, less aggressive than scAR (2,647) but more effective than CellBender (121), which left substantial contamination. CellSweep is idempotent: reapplying it to already-cleaned data produces minimal additional changes, unlike CellBender and scAR, which continue to remove counts. It is also fast: on a PBMC 8k dataset, CellSweep runs in 25 seconds on 16 CPU threads, 10× faster than DecontX and SoupX, and orders of magnitude faster than neural-network-based methods requiring GPUs. In simulations with ground truth, CellSweep achieves near-perfect positive predictive value (0.981), matching DecontX and SoupX, while scAR performs poorly (0.686).
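To make the mixture idea concrete, here is a minimal sketch of an EM update for a single cell. It is a simplification under stated assumptions, not the authors' implementation: the three profiles (cell type, ambient, bulk) are treated as fixed and known, only the per-cell mixture weights are fitted, and the function names `fit_mixture_weights` and `decontaminate` are made up for illustration. In the paper's model the bulk fraction is global and the profiles are themselves estimated.

```python
import numpy as np


def fit_mixture_weights(counts, profiles, n_iter=500, tol=1e-10):
    """Fit convex mixture weights for one cell by EM (illustrative sketch).

    counts   : (G,) observed counts per gene for one barcode
    profiles : (K, G) fixed source profiles, each row summing to 1
               (assumed order: cell type, ambient, bulk)
    Returns a (K,) weight vector.
    """
    counts = np.asarray(counts, dtype=float)
    profiles = np.asarray(profiles, dtype=float)
    n_sources = profiles.shape[0]
    weights = np.full(n_sources, 1.0 / n_sources)
    total = counts.sum()
    for _ in range(n_iter):
        mix = weights @ profiles                      # expected profile, shape (G,)
        safe_mix = np.where(mix > 0, mix, 1.0)        # avoid division by zero
        # E-step: probability that a count of gene g came from source k
        resp = weights[:, None] * profiles / safe_mix
        # M-step: new weight = share of all counts attributed to source k
        new_weights = (resp * counts).sum(axis=1) / total
        converged = np.abs(new_weights - weights).max() < tol
        weights = new_weights
        if converged:
            break
    return weights


def decontaminate(counts, profiles, weights):
    """Expected (non-integer) counts attributable to source 0, the cell type."""
    counts = np.asarray(counts, dtype=float)
    profiles = np.asarray(profiles, dtype=float)
    mix = weights @ profiles
    safe_mix = np.where(mix > 0, mix, 1.0)
    return counts * weights[0] * profiles[0] / safe_mix
```

Both steps are closed-form and touch only one cell's counts, which is why this kind of update can run independently across cells. In this toy version the second weight plays the role of the per-cell ambient fraction (αᵢ).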
Personal highlights
Interpretable three-component mixture model: CellSweep explicitly models observed counts as a convex combination of cell-type expression, ambient contamination (from lysed cells), and global bulk contamination (from library prep). This decomposition, unlike black-box neural approaches, provides biologically meaningful parameters: αᵢ (per-cell ambient fraction) and β (global bulk fraction), enabling interpretability and quality control.
Empirical ambient estimation from empty droplets: by leveraging the large number of non-cellular barcodes typical in droplet-based assays, CellSweep obtains a stable, unbiased estimate of the ambient RNA profile via simple averaging. This avoids the need to infer ambient noise from cellular data alone, a key advantage over methods that must estimate everything simultaneously.
Closed-form EM with perfect parallelization: unlike variational inference or deep generative models, CellSweep’s EM algorithm has closed-form E- and M-steps that decompose independently across cells. This enables near-perfect parallelization and yields runtimes of seconds to minutes on a CPU, orders of magnitude faster than CellBender or scAR, which require GPUs and hours of compute.
Spatial mapping of ambient noise reveals edge artifacts: applying CellSweep to a Visium HD xenograft dataset, the authors show that cells with high predicted ambient fractions (αᵢ) localize to tissue edges, a striking spatial pattern that validates the model and provides a diagnostic for edge artifacts in spatial transcriptomics.
Idempotency ensures stable output: CellSweep, SoupX, and DecontX are nearly idempotent: repeated application produces minimal changes. In contrast, CellBender and scAR continue to remove counts across iterations, indicating instability and risking over-cleaning.
Versatility across technologies and modalities: CellSweep works on droplet-based (10x), combinatorial barcoding (Parse), well-based (Smart-seq2), and spatial (Visium HD) data, as well as ATAC-seq. The alternative model handles cases without empty droplets, broadening applicability.
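The idempotency point is easy to check for any decontamination routine. The sketch below measures how much a second pass changes an already-cleaned matrix. The two cleaners are toy stand-ins invented for illustration (they are not CellSweep, CellBender, or any real tool), and `idempotency_gap` is a hypothetical helper name.

```python
import numpy as np


def idempotency_gap(clean, counts):
    """Fraction of once-cleaned counts that a second pass still changes.

    clean  : any callable mapping a count matrix to a cleaned count matrix
    counts : array of raw counts
    Returns 0.0 for a perfectly idempotent cleaner.
    """
    once = clean(counts)
    twice = clean(once)
    denom = once.sum()
    if denom == 0:
        return 0.0
    return float(np.abs(twice - once).sum() / denom)


def zero_low_counts(counts, min_count=2):
    """Toy cleaner: zero out entries below a threshold (idempotent)."""
    return np.where(counts >= min_count, counts, 0)


def subtract_one(counts):
    """Toy cleaner: remove one count per entry on every pass (not idempotent)."""
    return np.maximum(counts - 1, 0)
```

A gap near zero means the method has settled on a stable answer; a gap that stays large is the "keeps removing counts" behaviour described above.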
title
Chi Hao, L. et al. bioRxiv (2026). https://doi.org/10.64898/2026.03.04.709254
The paper in one sentence
PerturbGen is a generative foundation model trained on 107 million single-cell transcriptomes that predicts how genetic perturbations introduced at one point along a cellular trajectory (differentiation, development, or immune activation) reshape downstream cell states and fate decisions.
Summary
Existing approaches for predicting perturbation responses operate within fixed cellular states; they cannot model how an intervention applied early (e.g., in a stem cell) propagates to alter later differentiated states. This limits their utility for understanding development, disease progression, and therapeutic timing. The authors of this manuscript develop PerturbGen, an encoder-decoder transformer that explicitly models state-to-state transitions. Cells are represented as ranked tokenized gene expression sequences (following Geneformer). During training, the model learns to predict gene expression at a target state (e.g., day 10 of differentiation) conditioned on source and intermediate states (e.g., days 0, 3, 7). This trajectory-aware architecture enables in silico perturbation: modify the source state representation (e.g., knock out a gene) and predict how that change propagates downstream. PerturbGen is pre-trained on ~107 million single-cell transcriptomes spanning embryonic, fetal, and postnatal stages, capturing diverse developmental transitions. It is then fine-tuned on task-specific time-resolved datasets.
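For readers unfamiliar with rank-based tokenization, the following is a minimal sketch of the general Geneformer-style idea: scale each gene by a corpus-wide reference value, drop unexpressed genes, and order the rest from highest to lowest. The function names, the toy medians, and the representation of a knockout as removal of the gene's token are assumptions for illustration; the preprint's actual preprocessing and perturbation encoding may differ.

```python
import numpy as np


def rank_tokenize(expression, gene_medians, gene_ids, max_len=2048):
    """Turn one cell's expression vector into a ranked list of gene tokens.

    expression   : (G,) expression values for one cell
    gene_medians : (G,) per-gene reference values (assumed: nonzero medians
                   across a training corpus), used to down-weight genes
                   that are high everywhere
    gene_ids     : list of G gene identifiers
    max_len      : maximum sequence length kept (assumed value)
    """
    expression = np.asarray(expression, dtype=float)
    scaled = expression / np.asarray(gene_medians, dtype=float)
    expressed = np.flatnonzero(expression > 0)        # drop unexpressed genes
    # Stable sort keeps the original gene order among ties
    order = expressed[np.argsort(-scaled[expressed], kind="stable")]
    return [gene_ids[i] for i in order[:max_len]]


def knock_out(tokens, gene):
    """Assumed in silico knockout: remove the gene's token from the sequence."""
    return [t for t in tokens if t != gene]
```

In a trajectory-aware setup, the edited token list for the source state would be fed to the model in place of the original, and the predicted target state compared against the unperturbed prediction.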
Personal highlights
Trajectory-aware perturbation prediction: unlike prior models that predict effects within a single state, PerturbGen explicitly models state-to-state transitions. By conditioning target-state generation on source and intermediate states, it enables prediction of how early perturbations propagate to reshape downstream transcriptional programs, a critical capability for development, differentiation, and disease progression.
Massive pre-training on developmental transitions: pre-training on 107 million cells, including underrepresented embryonic and fetal datasets, exposes the model to diverse, densely sampled state changes. This improves generalization across tissues and contexts, as demonstrated by accurate prediction of unseen time points across three independent time-resolved datasets.
In silico perturbation atlases reveal perturbation-induced programs (PIPs): scaling in silico perturbations to 3,108 genes in hematopoiesis and 5,050 in skin organoids yields perturbation maps where genes with similar downstream effects cluster. These PIPs capture age- and lineage-specific regulatory programs (e.g., “postnatal lymphoid differentiation,” “fetal hematopoietic progenitor proliferation”) and are enriched for blood-trait-associated genes and monogenic disorder genes, demonstrating biological coherence and translational relevance.
Recapitulation of monogenic disease phenotypes: in silico ETV6 knockout in megakaryocyte progenitors predicted transcriptional changes that closely matched those observed in ETV6-related thrombocytopenia patients (81% pathway concordance). This included upregulation of MHC class II genes and downregulation of platelet programs, effects validated across patients and not driven by compositional shifts. This establishes a framework for modeling rare diseases where patient samples are limited.
Functional validation in skin organoids: PerturbGen prioritized GSK3B/Wnt activation as a candidate to promote fibroblast maturation. Experimental Wnt activation with CHIR99021 at day 6 phenocopied the predicted stromal shift, increasing transcriptional similarity to fetal skin fibroblasts. This demonstrates that trajectory-aware predictions can guide experimental optimization of complex multicellular systems.
Why should we care?
Biology is not static; it unfolds along trajectories. A stem cell today becomes a differentiated cell tomorrow; an immune cell at 90 minutes post-stimulus is not the same as at 6 hours. Yet most perturbation models treat cells as snapshots, asking: what happens if I perturb this cell in this state? They cannot ask: what happens if I perturb this cell now and look at its descendants later? PerturbGen bridges this gap. By learning how states transition across time, from 107 million cells spanning development, homeostasis, and disease, it can simulate the downstream consequences of early interventions. This is not just a technical advance; it reframes the question we can ask.
Transfer of Damaged Mitochondria from Cancer Cells to Cancer-Associated Fibroblasts Promotes Tyrosine Kinase Inhibitor Tolerance in EGFR-Mutant Lung Cancer
Liu, T. et al. Cancer Research (2025). https://doi.org/10.1158/0008-5472.CAN-25-0433
The paper in one sentence
EGFR-mutant lung cancer cells under tyrosine kinase inhibitor stress transfer damaged mitochondria via tunneling nanotubes to a specific fibroblast subset (RGS5+MYL9+ CAFs), which act as "metabolic sinks" to reduce oxidative stress and promote drug-tolerant persister cell survival, a process that can be blocked by the clinically approved Rho kinase inhibitor fasudil.
Summary
Liu and colleagues use single-cell RNA sequencing of treatment-naive EGFR-mutant lung adenocarcinomas to map the fibroblast landscape, identifying five distinct CAF subsets. Among these, a previously unrecognized myofibroblast population marked by RGS5 and MYL9 stood out. In patient-derived organoid co-cultures, RGS5+MYL9+ CAFs, but not other CAF subsets, significantly attenuated osimertinib-induced cell death and promoted tumor regrowth after drug withdrawal. Higher infiltration of these CAFs correlated with advanced stage and poor prognosis in TCGA data. Mechanistically, osimertinib treatment generates mitochondrial reactive oxygen species (mtROS) in cancer cells. This triggers two parallel responses: (1) upregulation of CCL11, which recruits RGS5+MYL9+ CAFs to the drug-tolerant persister (DTP) niche, and (2) activation of Miro1 (mitochondrial Rho GTPase 1) and RhoA, which drive F-actin polymerization and the formation of tunneling nanotubes (TNTs), long membrane protrusions that connect cancer cells to adjacent CAFs. Through these nanotubes, damaged, ROS-producing mitochondria are transferred from stressed cancer cells to RGS5+MYL9+ CAFs. The CAFs accept this “toxic cargo,” thereby reducing mitochondrial burden and oxidative stress in the cancer cells and promoting DTP survival. The transferred mitochondria in CAFs show elevated mtROS and dysfunction, confirming they are indeed damaged. In vivo, xenografts containing RGS5+MYL9+ CAFs showed reduced tumor regression on osimertinib and accelerated regrowth after withdrawal, with evidence of mitochondrial transfer detectable by flow cytometry and confocal imaging. Blocking CCL11 with a neutralizing antibody reduced CAF recruitment, delayed relapse, and improved survival. Critically, the Rho kinase inhibitor fasudil, already approved in Japan for cerebral vasospasm, blocked TNT formation by inhibiting RhoA activity.
In xenograft models, combining osimertinib with fasudil significantly delayed tumor relapse and extended survival, even when treatment was initiated after MRD establishment. Human specimens from neoadjuvant osimertinib-treated patients showed increased RGS5+MYL9+ CAF infiltration and closer proximity to residual tumor cells, confirming clinical relevance.
Personal highlights
scRNA-seq identifies RGS5+MYL9+ CAFs as a clinically relevant subset: unbiased profiling of treatment-naive EGFR-mutant lung adenocarcinomas revealed five CAF subsets, including a novel myofibroblast population co-expressing RGS5 and MYL9. In patient-derived organoid co-cultures, only this subset conferred osimertinib resistance and promoted tumor regrowth after drug withdrawal. Higher RGS5+MYL9+ CAF infiltration correlated with advanced stage and poor prognosis in TCGA, and was enriched in post-treatment residual tumors from patients.
Damaged mitochondria are transferred from cancer cells to CAFs via tunneling nanotubes: under osimertinib stress, cancer cells form F-actin-rich membrane protrusions (tunneling nanotubes) that connect to adjacent RGS5+MYL9+ CAFs. Through these nanotubes, damaged, ROS-producing mitochondria are transferred from cancer cells to CAFs, visualized by mitoDsRed labeling, confocal imaging, and flow cytometry. This is not a one-way transfer of healthy mitochondria to cancer cells (as previously described), but rather a disposal mechanism where cancer cells offload damaged organelles to stromal “sinks.”
Miro1 and RhoA mediate nanotube formation and mitochondrial trafficking: Osimertinib-induced mtROS upregulates Miro1 (mitochondrial Rho GTPase 1), which moves damaged mitochondria toward the cell periphery, and activates RhoA, which drives F-actin polymerization to form nanotubes. Miro1 knockdown or RhoA inhibition (with fasudil) abrogates mitochondrial transfer and restores drug sensitivity. This establishes a molecular pathway linking oxidative stress to intercellular organelle transfer.
CCL11 recruits RGS5+MYL9+ CAFs to the DTP niche: DTP cells secrete CCL11, which acts as a chemoattractant specifically for RGS5+MYL9+ CAFs (not other subsets). Neutralizing CCL11 reduces CAF accumulation around stressed cancer cells, decreases mitochondrial transfer, and delays tumor relapse in vivo. This reveals a two-step mechanism: recruitment followed by nanotube-mediated transfer.
Fasudil, a clinically approved drug, blocks TNT formation and prevents relapse: the Rho kinase inhibitor fasudil, already used clinically for cerebral vasospasm, effectively blocks TNT formation by inhibiting RhoA. In xenograft models, combining osimertinib with fasudil, even when started after MRD establishment, significantly delayed tumor relapse and extended survival. This offers an immediately translatable strategy to overcome TKI tolerance.
Why should we care?
Drug-tolerant persister cells are the hidden seeds of relapse in EGFR-mutant lung cancer: they survive initial therapy through non-genetic adaptations, then eventually regrow as fully resistant tumors. For years, we’ve known they exist, but we haven’t known how the microenvironment supports them. This work reveals a remarkable mechanism: stressed cancer cells don’t just suffer in silence. They actively recruit specific fibroblasts, hand off their damaged mitochondria like toxic waste, and thereby reduce their own oxidative burden to survive. The fibroblast acts as a “metabolic sink,” accepting damage to protect the cancer cell. This flips the conventional narrative of mitochondria transfer (healthy mitochondria moving into cancer cells) on its head. The molecular pathway is unusually complete: from the initial ROS signal, to Miro1-mediated mitochondrial positioning, to RhoA-driven nanotube formation, to CCL11-mediated recruitment. And crucially, each node is targetable.
Plasticity of squamous differentiation drives drug resistance in HNSCC
Sipilä, K. et al. bioRxiv (2026). https://doi.org/10.64898/2026.03.09.710514
The paper in one sentence
A subset of head and neck squamous cell carcinoma cells resists differentiation-inducing signals, including the clinically used ErbB inhibitor afatinib, retaining clonogenic and tumorigenic potential despite expressing differentiation markers, revealing that differentiation and loss of self-renewal are uncoupled in these cells.
Summary
Differentiation therapy, forcing cancer cells to terminally differentiate and lose self-renewal, has transformed outcomes in acute promyelocytic leukaemia, but has shown limited success in solid tumours. Why? Sipilä and colleagues address this question using patient-derived head and neck squamous cell carcinoma (HNSCC) lines (SJGs) cultured on feeder layers, a system that preserves the mutational heterogeneity of primary tumours. When transplanted orthotopically into immunocompromised mice, these lines recapitulate the histological diversity of human HNSCC, including variable differentiation status, stromal desmoplasia, and perineural invasion. In normal keratinocytes, detachment from the basement membrane (methylcellulose suspension) triggers terminal differentiation and irreversible loss of clonogenic potential. HNSCC cells also upregulate differentiation markers (IVL, TGM1) in suspension, but they do not lose clonogenic capacity to the same extent. Immunostaining revealed a heterogeneous response: some cells became Ki67⁻ and expressed differentiation markers, but a subset remained Ki67⁺ or failed to upregulate IVL/TGM1 entirely. To track the fate of clonogenic cells in vivo, the authors used lentiviral fluorescent barcoding (mRuby2, mTagBFP2, acGFP). Cells pre-treated with methylcellulose suspension for 20h showed only a minor delay in tumour growth and no significant change in clonal density or clone size, indicating that the cells responsible for tumour formation are largely resistant to transient differentiation signals. A small-molecule screen targeting pathways known to regulate keratinocyte differentiation identified ErbB-MEK1/2-ERK1/2 inhibition (afatinib, PD0325901, VX-11e) as the most effective at inducing IVL expression. Afatinib, already clinically used in HNSCC, increased differentiation marker expression but, like methylcellulose, left a substantial fraction of cells undifferentiated. 
Fluorescent barcoding after afatinib pre-treatment showed no reduction in tumour growth and no change in clonal architecture: the tumorigenic cells were unaffected by drug-induced differentiation. Using an IVL promoter-driven mCherry reporter, the authors sorted cells by differentiation status after afatinib treatment. IVL-high cells formed markedly smaller tumours than IVL-low cells, but some IVL-high cells still generated progeny, and tumours derived from IVL-low cells remained capable of producing differentiated cells upon re-challenge. Even at supra-clinical concentrations, an afatinib-resistant subpopulation persisted. The key finding: differentiation and loss of self-renewal are partially uncoupled in HNSCC. Cells can express differentiation markers while retaining clonogenic potential, and the most tumorigenic cells are those that resist differentiation cues, not because they cannot differentiate, but because they can escape the irreversible cell-cycle exit that normally accompanies it.
Personal highlights
Patient-derived models preserve heterogeneity: SJG lines cultured on feeder layers retain the mutational landscape of primary tumours (TP53, PIK3CA, FAT1, NOTCH1, CDKN2A). Orthotopic xenografts recapitulate key histopathological features: differentiation status, desmoplasia, perineural invasion, providing a clinically relevant platform to study differentiation dynamics.
Differentiation and self-renewal are uncoupled in HNSCC: in normal keratinocytes, detachment induces terminal differentiation and irreversible loss of clonogenicity. HNSCC cells upregulate differentiation markers (IVL, TGM1) in suspension but retain colony-forming ability. A subset of cells remains Ki67⁺ or fails to express differentiation markers entirely, revealing intrinsic heterogeneity in the response.
Clonogenic tumour-initiating cells resist differentiation signals: fluorescent barcoding enabled lineage tracing of individual clones in vivo. Pre-treatment with methylcellulose or afatinib did not reduce tumour growth, clonal density, or clone size. The cells that drive tumour formation are largely unaffected by differentiation-inducing stimuli.
ErbB-MAPK inhibition promotes differentiation but spares tumorigenic cells: a focused screen identified afatinib, MEKi (PD0325901), and ERKi (VX-11e) as the most potent inducers of IVL expression. Yet, even at supra-clinical concentrations, a fraction of cells remained undifferentiated, and these corresponded to the most clonogenic population in vivo.
IVL reporter reveals graded differentiation states: cells sorted by IVL-mCherry intensity after afatinib treatment showed an inverse relationship between IVL expression and tumorigenic potential. However, some IVL-high cells still formed tumours, and IVL-low cells remained capable of producing differentiated progeny upon re-challenge. Differentiation status is not a binary switch but a spectrum, and cells can move along it without losing self-renewal.
Why should we care?
Differentiation therapy transformed acute promyelocytic leukaemia from a deadly disease to one with >90% cure rates. The idea is elegant: instead of killing cancer cells, force them to mature into harmless, post-mitotic cells. But for solid tumours, this strategy has repeatedly failed. This work explains why. In HNSCC, differentiation and loss of self-renewal are not tightly coupled. Cancer cells can express differentiation markers (they look like they’re maturing) while retaining the ability to divide and form tumours. The cells that actually sustain tumour growth are precisely those that resist differentiation cues, not because they can’t differentiate, but because they can escape the irreversible cell-cycle exit that normally accompanies it. The clinical implications are sobering. Afatinib, already used in HNSCC, does induce differentiation, but it doesn’t eliminate the tumorigenic cells. Even at concentrations exceeding those achieved in patients, a resistant subpopulation persists. This suggests that simply measuring differentiation markers in response to therapy may overestimate efficacy; what matters is whether the clonogenic cells are eliminated.
Benchmarking computational tools for locus-specific analysis of transposable elements in single-cell RNA-seq datasets
Finazzi, V. et al. bioRxiv (2026). https://doi.org/10.64898/2026.02.26.708244
The paper in one sentence
This systematic benchmark evaluates computational tools for locus-specific transposable element quantification in short-read scRNA-seq, revealing that while older elements are reliably quantified, young repetitive TEs remain intrinsically difficult to resolve, and gene-TE misassignment is a pervasive, underappreciated challenge.
Summary
Transposable elements (TEs) are increasingly recognized as regulators of gene expression and cellular identity, but their repetitive nature makes them difficult to quantify, especially at single-locus resolution in sparse, 3’-biased single-cell RNA-seq data. Several tools have been developed, but their relative performance has not been systematically evaluated against ground truth. The authors present a comprehensive benchmarking framework combining real datasets (mouse ESCs, olfactory mucosa, human PBMCs) with controlled simulations that provide read-level ground truth. They evaluate three tools capable of locus-specific quantification: SoloTE, Stellarscope, and STARsolo (with and without EM-based multimapper handling). First, they show that TE-derived reads constitute a substantial fraction of scRNA-seq data (>24% across datasets) and that TE expression profiles alone can resolve cell types, often revealing substructure not apparent in gene-based clustering. However, the proportion of multimapping TE reads varies dramatically by cell state (e.g., highest in 2-cell-like cells expressing young TEs). The simulations, stratified by TE age (old vs. young), mixing, and inclusion of genes, reveal sharp performance contrasts. For old TEs, all tools achieve near-perfect detection and quantification. For young TEs, detection is plagued by false positives across all methods, with limited tool agreement. Including multimappers (via EM or threshold lowering) increases false positives without consistently improving accuracy. Stellarscope’s EM algorithm partially mitigates noise but at the cost of sensitivity; its posterior probability thresholds can be tuned, but the optimal trade-off depends on the analysis goal. Family-level analysis shows striking heterogeneity: L1 and ERVL elements are hardest to resolve accurately, while SINEs (Alu, B2) perform better. 
Aggregating to the subfamily level dramatically improves precision, confirming that the core challenge is locus-specific assignment, not family-level detection. Critically, gene-TE misassignment is a major, bidirectional problem. Reads from expressed genes are frequently misassigned to overlapping TE loci, and vice versa. Stellarscope, which does not filter gene-overlapping reads, is most affected, but all tools show some degree of cross-assignment. This confound can strongly bias biological interpretation. The authors distill their findings into practical recommendations: (i) use locus-level quantification confidently for older elements, but interpret young-locus calls with caution; (ii) prefer unique-mapper strategies (SoloTE default) when precision is paramount; (iii) for discovery-scale surveys, aggregate to subfamily level for robustness; (iv) explicitly check and report gene-TE overlaps.
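To make the locus-versus-subfamily point concrete, here is a minimal sketch in plain Python. It is not the authors' benchmarking code: the locus names, the locus-to-subfamily mapping, and the toy counts are all made up for illustration, and "detected" is simply taken to mean a count above zero. It shows how reads spread across near-identical copies of a young subfamily create false-positive loci, and how summing to the subfamily level removes them.

```python
from collections import defaultdict


def aggregate_to_subfamily(locus_counts, locus_to_subfamily):
    """Sum locus-level counts into subfamily-level counts."""
    totals = defaultdict(int)
    for locus, count in locus_counts.items():
        totals[locus_to_subfamily[locus]] += count
    return dict(totals)


def precision_recall(truth, observed):
    """Detection precision and recall; a feature is 'detected' if its count is > 0."""
    truth_set = {name for name, count in truth.items() if count > 0}
    observed_set = {name for name, count in observed.items() if count > 0}
    true_positives = len(truth_set & observed_set)
    precision = true_positives / len(observed_set) if observed_set else 0.0
    recall = true_positives / len(truth_set) if truth_set else 0.0
    return precision, recall


# Hypothetical annotation: three copies of a young L1 subfamily, one B2 copy.
locus_to_subfamily = {
    "L1_a": "L1Md_T",
    "L1_b": "L1Md_T",
    "L1_c": "L1Md_T",
    "B2_a": "B2_Mm2",
}

# Toy ground truth: only one L1 copy is actually expressed.
truth_locus = {"L1_a": 10, "L1_b": 0, "L1_c": 0, "B2_a": 5}

# Toy tool output: the L1 reads are smeared across all three similar copies.
observed_locus = {"L1_a": 4, "L1_b": 3, "L1_c": 3, "B2_a": 5}

locus_pr = precision_recall(truth_locus, observed_locus)
subfamily_pr = precision_recall(
    aggregate_to_subfamily(truth_locus, locus_to_subfamily),
    aggregate_to_subfamily(observed_locus, locus_to_subfamily),
)

print("locus level (precision, recall):", locus_pr)
print("subfamily level (precision, recall):", subfamily_pr)
```

In this toy case the two falsely detected L1 copies halve locus-level precision, while the subfamily totals match the truth exactly. Real data will be messier, but the direction of the effect is the one the benchmark reports.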
Personal highlights
TE-derived reads are abundant and biologically informative in scRNA-seq: across three diverse datasets, >24% of reads mapped to TE loci, reads typically discarded in standard pipelines. TE expression profiles alone resolved major cell types and, in some cases, revealed substructure not visible in gene-based clustering, demonstrating that TEs encode meaningful biological signal.
Age matters: old TEs are reliable, young TEs are problematic: evolutionary age is the dominant predictor of quantification accuracy. Old elements (>2 million years) were detected and quantified with near-perfect precision across tools. Young elements, by contrast, generated pervasive false positives regardless of method, with limited tool agreement. This reflects fundamental sequence-level constraints: young TEs are too similar to resolve with short reads.
Multimapper handling offers limited gains, at a cost: including multimapped reads, via EM algorithms (Stellarscope, STARsolo) or threshold lowering (SoloTE), increased false positives without consistently improving accuracy. EM improved precision modestly but reduced sensitivity. For most applications, unique-mapper strategies (SoloTE default) performed comparably while producing fewer false positives, suggesting that aggressive multimapper inclusion may do more harm than good.
Gene-TE misassignment is pervasive and bidirectional: reads from expressed genes were frequently misassigned to overlapping TE loci, and TE-derived reads were misassigned to genes. Stellarscope was most affected (it does not filter gene-overlapping reads), but all tools showed cross-assignment. This confound can severely bias interpretation, for example by inflating apparent TE activity in gene-rich regions or by masking genuine TE signals.
Family-specific performance guides tool choice: performance varied dramatically by TE family. L1 and ERVL elements (long, homogeneous) were hardest to resolve accurately; SINEs (Alu, B2) performed better. This suggests that optimal tool selection may depend on which families are expected to be active in a given biological system, and that family-aware quality control is essential.
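The recommendation to explicitly check gene-TE overlaps can be sketched as a simple interval test. This is an illustrative assumption, not something taken from the paper or from any of the benchmarked tools: the coordinates, locus IDs, and gene names below are invented, intervals are treated as half-open, and strand is ignored for simplicity.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """Half-open intervals [start, end) overlap if each starts before the other ends."""
    return a_start < b_end and b_start < a_end


def flag_gene_overlapping_tes(te_loci, genes):
    """Return the IDs of TE loci that overlap any gene on the same chromosome."""
    flagged = []
    for te_id, (chrom, start, end) in te_loci.items():
        for gene_chrom, gene_start, gene_end in genes.values():
            if chrom == gene_chrom and overlaps(start, end, gene_start, gene_end):
                flagged.append(te_id)
                break  # one overlapping gene is enough to flag this locus
    return flagged


# Hypothetical TE loci: ID -> (chromosome, start, end)
te_loci = {
    "L1_dup1": ("chr1", 100, 600),
    "B2_dup7": ("chr1", 5000, 5200),
    "ERVL_dup3": ("chr2", 150, 400),
}

# Hypothetical gene annotations: name -> (chromosome, start, end)
genes = {
    "GeneA": ("chr1", 500, 2000),
    "GeneB": ("chr2", 1000, 3000),
}

flagged = flag_gene_overlapping_tes(te_loci, genes)
print("TE loci overlapping genes:", flagged)
```

Loci flagged this way are the ones where apparent TE signal deserves a second look before it is interpreted as genuine TE activity. For genome-scale annotations, an interval-tree or a dedicated tool would be a better fit than this nested loop.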
Other papers that piqued my interest and were added to the purgatory of my “to read” pile
GPU-accelerated single-cell analysis at scale with rapids-singlecell
Reciprocal regulation of fibroblast–macrophage equilibrium governs skin integrity
CCL3 is produced by aged neutrophils across cancers and promotes tumor growth
Bringing the genetically minimal cell to life on a computer in 4D
Ageing promotes metastasis via activation of the integrated stress response
Continuous Diffusion Transformers for Designing Synthetic Regulatory Elements
Intestinal interoceptive dysfunction drives age-associated cognitive decline
Thanks for reading.
Cheers,
Seb.


