Weekly reads 11/8/25

Reprogramming biology: CAFs, motifs, metabolism, and more...

Aug 17, 2025

This week’s round of preprints and papers spans everything from stromal immunology to deep learning interpretability, single-cell generative modeling, cancer metabolism, synthetic biology, and spatial epigenomics. The highlights include: targeting NNMT in fibroblasts to restore antitumor immunity, dissecting how neural networks learn DNA motifs, a generative model for derailed cell states, metabolic constraints on metastasis, a phage-based hypermutation engine for rapid evolution, and the surprising adaptability of spatial transcriptomics methods to chromatin accessibility data. Together, these works showcase how experimental biology, computation, and engineering are converging to decode and reprogram complex biological systems.

Preprints/articles that I managed to read this week

NNMT Inhibition in Cancer-Associated Fibroblasts Restores Antitumor Immunity

Heide et al. (2025). Nature. DOI: 10.1038/s41586-025-09303-5

The paper in one sentence

Targeting the enzyme NNMT in cancer-associated fibroblasts (CAFs) disrupts immunosuppressive signaling, revitalizes CD8+ T cell activity, and enhances immunotherapy efficacy across multiple cancer types.

Summary

This study identifies nicotinamide N-methyltransferase (NNMT) as a master epigenetic regulator in CAFs that drives immune evasion by recruiting myeloid-derived suppressor cells (MDSCs) via complement secretion. Using spatial transcriptomics, single-cell RNA-seq, and a novel NNMT inhibitor (NNMTi), the authors demonstrate that NNMT blockade reprograms the tumor microenvironment, reduces metastasis, and synergizes with immune checkpoint therapy in ovarian, breast, and colon cancer models.

Personal highlights

CAF-immune crosstalk decoded: NNMT-expressing CAFs recruit immunosuppressive MDSCs through H3K27me3 hypomethylation and secretion of complement factors (C3/C5a), creating a T cell-excluded niche that the authors mapped spatially in human ovarian tumors.
Epigenetic-metabolic switch: NNMT depletion in fibroblasts reverses histone hypomethylation, shrinking CAFs and collapsing their protumorigenic subtypes (myCAFs, iCAFs, apCAFs) without direct cancer cell targeting.
Precision inhibitor development: a stereospecific NNMTi (IC50 <10 nM) was engineered through high-throughput screening, with the inactive distomer (NNMTi-D) serving as a rigorous negative control to validate on-target effects.
CD8+ T cell resurrection: NNMT inhibition reduces PD-L1+ monocytes, triples IFNγ+ CD8+ T cells, and restores immune checkpoint efficacy—even in resistant models like E0771-LMB breast cancer.
Pan-cancer stromal vulnerability: NNMT is ubiquitously expressed across CAF subtypes in 10+ malignancies, positioning it as a universal stromal target to break immunotherapy resistance.

Why should we care?

This work shifts the paradigm of stromal targeting from "depleting CAFs" to "reprogramming CAFs" by exposing NNMT as a linchpin of immunosuppression. For clinicians, NNMTi offers a strategy to rescue checkpoint blockade non-responders. For drug developers, it validates stromal epigenetics as a druggable axis. And for patients, it opens avenues to combine NNMTi with existing immunotherapies—potentially extending survival in aggressive cancers like ovarian or triple-negative breast cancer where stromal barriers limit treatment success. The NNMTi’s ability to work via local (intratumoral) delivery could also minimize systemic toxicity, a critical advantage for metastatic disease.

Decoding Genomic Neural Networks: When and How They Learn Motifs and Their Interactions

Thompson, M., & Lehner, B. bioRxiv (2025). https://doi.org/10.1101/2025.07.25.666754

The paper in one sentence

This study systematically evaluates how different neural network architectures discover DNA motifs and their interactions under varying genomic contexts, revealing that convolutions regularize gradients for motif learning and attention excels at modeling interactions, whereas LSTMs and dilated CNNs perform better in complex, non-motif scenarios.

Summary

The paper investigates the interpretability of neural networks in genomics by simulating over 1,000 motif-based genetic architectures. It demonstrates that convolutional layers inherently enable motif discovery by sharing gradients locally, while attention-based models excel when phenotypes are driven by motif interactions. However, LSTMs and dilated convolutions outperform attention in the presence of non-motif sequence effects. The study also highlights gaps in motif extraction methods, showing modest correlations between predictive performance and motif discovery power, especially in larger datasets.

Personal highlights

Convolutions as gradient regularizers for motif discovery: a single convolutional layer with exponential activation suffices to learn DNA motifs by uniformly distributing gradients across motif nucleotides, unlike dense layers that overfit to single nucleotides.
Attention dominates interaction modeling, but falters in non-motif contexts: attention-based models outperform others when phenotypes are purely motif-driven, but LSTMs and dilated CNNs take the lead when sequence-level (non-motif) effects are present.
Mis-specified motifs still get discovered: longer motifs (>7bp) are recoverable even with suboptimal filter widths, while short, low-information motifs prove challenging—highlighting the role of motif characteristics in interpretability.
TF-MoDISco beats first-layer filters for precision: deep attribution methods (e.g., TF-MoDISco) reduce false positives and redundancy compared to visualizing first-layer filters, though the latter captures broader sequence features.
Predictive performance ≠ motif discovery: correlations between accuracy and motif recovery weaken with larger datasets and simpler architectures, suggesting benign overfitting or limitations in current xAI methods.
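
To make the first highlight more concrete, here is a minimal Python sketch of the forward pass of a single convolutional filter with an exponential activation scanning a one-hot encoded sequence. The filter weights, the GATA motif, and the example sequence are illustrative assumptions, not taken from the paper, and no training or gradient computation is shown.

```python
import math

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a list of 4-element one-hot rows (A, C, G, T)."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq]

def conv_exp_scan(seq, filt):
    """Slide one filter along the sequence and apply an exponential activation."""
    x = one_hot(seq)
    width = len(filt)
    activations = []
    for start in range(len(x) - width + 1):
        score = sum(
            x[start + i][j] * filt[i][j]
            for i in range(width)
            for j in range(4)
        )
        activations.append(math.exp(score))
    return activations

# Hypothetical filter rewarding the motif "GATA": +1 for a match, -1 for a mismatch.
MOTIF = "GATA"
gata_filter = [[1.0 if b == base else -1.0 for b in BASES] for base in MOTIF]

acts = conv_exp_scan("TTGATACC", gata_filter)
best = max(range(len(acts)), key=acts.__getitem__)
print("strongest activation at window start", best)
```

In this toy case the full match at window start 2 scores e^4, while the best partial match scores only e^-2, which shows how the exponential activation separates true motif hits from near misses.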

Why should we care?

This work bridges the gap between "black-box" genomic neural networks and actionable biological insights. By clarifying when and how models learn motifs, it empowers researchers to:

Choose architectures wisely: use attention for clean motif-interaction tasks, but switch to LSTMs/dilated CNNs for noisy, non-motif contexts.
Improve interpretability pipelines: combine shallow (first-layer) and deep (attribution-based) methods to capture complementary aspects of motif biology.
Design better synthetic assays: simulated random sequences, as used here, help benchmark models for real-world applications like synthetic biology or disease variant prediction.

Decipher: A Deep Generative Model for Joint Representation and Visualization of Derailed Cell States

Nazaret et al. Genome Biology (2025) 26:219. https://doi.org/10.1186/s13059-025-03682-8

The paper in one sentence

Decipher is a hierarchical deep generative model that jointly models and visualizes single-cell RNA-seq data across normal and perturbed conditions, enabling accurate reconstruction of cell-state trajectories and identification of disrupted biological mechanisms.

Summary

Decipher addresses the challenge of comparing single-cell genomics data across conditions (e.g., healthy vs. disease) by learning a joint latent representation that preserves both shared and disrupted cell-state dynamics. Unlike existing methods, Decipher employs a two-level hierarchical architecture: a 2D "Decipher space" for visualization and a higher-dimensional latent space for refined cell-state characterization. This design captures dependencies between latent factors, avoids artificial mixing of biological differences, and enables direct interpretation of gene expression patterns along trajectories. The method is validated in pancreatitis, AML, and gastric cancer, revealing derailed developmental pathways and dysregulated transcriptional programs.

Personal highlights

Hierarchical latent spaces for interpretable visualization and analysis: Decipher uniquely combines a 2D visualization space (Decipher components) with a higher-dimensional latent space, enabling simultaneous global visualization and fine-grained cell-state characterization without distortion.
Dependency-aware latent factors: unlike traditional VAEs, Decipher’s latent factors are correlated, capturing overlapping biological processes and shared mechanisms across trajectories—critical for comparing normal and perturbed conditions.
Direct trajectory alignment and gene pattern reconstruction: Decipher’s generative framework allows seamless interpolation of gene expression along inferred trajectories (Decipher time), eliminating the need for post-hoc alignment and enabling comparison of conserved vs. disrupted genes.
Basis decomposition for quantifying disruption: a novel probabilistic basis model decomposes gene expression dynamics into interpretable patterns, quantifying how perturbations alter transcriptional programs in shape, scale, or both.
Benchmarked superiority in sparse trajectories: Decipher outperforms existing methods (e.g., scVI, UMAP) in preserving global cell-state order, especially in low-density transitional regions, as demonstrated in synthetic and real-world datasets.
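
The two-level idea can be illustrated with a toy ancestral-sampling sketch in Python: a 2D point is drawn first, mapped to a higher-dimensional latent state, and then mapped to gene proportions. Everything here is an assumption for illustration (the linear maps, random weights, noise scale, and dimensions); the actual model is learned from data and is considerably richer.

```python
import math
import random

random.seed(0)

N_COMPONENTS = 2   # the 2D space used for visualization
N_LATENT = 5       # higher-dimensional latent space (size chosen arbitrarily)
N_GENES = 8        # toy number of genes

def random_matrix(rows, cols):
    """Return a rows x cols matrix of standard normal draws."""
    return [[random.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]

# Hypothetical decoder weights; a real model would learn these.
W_latent = random_matrix(N_LATENT, N_COMPONENTS)
W_genes = random_matrix(N_GENES, N_LATENT)

def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def softmax(values):
    """Convert arbitrary scores into proportions that sum to 1."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def sample_cell():
    """Ancestral sampling: 2D components -> latent state -> gene proportions."""
    components = [random.gauss(0, 1) for _ in range(N_COMPONENTS)]
    latent = [m + random.gauss(0, 0.1) for m in matvec(W_latent, components)]
    gene_proportions = softmax(matvec(W_genes, latent))
    return components, latent, gene_proportions

components, latent, gene_proportions = sample_cell()
print(len(components), len(latent), len(gene_proportions))
```

Because all five latent dimensions in this sketch are driven by the same two components, they end up correlated across cells, which is the property the second highlight contrasts with traditional VAEs.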

Why should we care?

Decipher bridges a gap in single-cell analysis by not just identifying where cells deviate from normal states but how—revealing the mechanistic drivers of disease progression. For biologists, it offers a tool to dissect derailed differentiation (e.g., in cancer or developmental disorders) with unprecedented clarity, linking mutations to transcriptional cascades. For computational researchers, its hierarchical, dependency-aware architecture sets a new standard for trajectory inference, combining interpretability with scalability. By transforming sparse, noisy single-cell data into coherent narratives of cellular dysfunction, Decipher empowers both discovery and translational applications—from pinpointing therapeutic targets to understanding resistance mechanisms.

Fig 1. from Nazaret et al. - Overview of the Decipher method

Cancer Tissue of Origin Constrains Metastatic Growth and Metabolism

Sivanand et al., Nature Metabolism (2024). DOI: 10.1038/s42255-024-01105-9

The paper in one sentence

Metastatic cancer cells retain metabolic traits of their tissue of origin, which limits their ability to thrive in distant organs, explaining why certain cancers preferentially metastasize to specific sites.

Summary

This study reveals that the metabolic programming of cancer cells is strongly influenced by their tissue of origin, even after they metastasize. Using pancreatic, lung, and liver cancer models, the authors show that metastatic cells grow better in their primary tissue than in secondary sites, despite adapting to new environments. Isotope tracing and metabolomics demonstrate that primary and metastatic tumors share metabolic similarities, suggesting that the original tissue’s nutrient milieu shapes cancer cell fitness. The findings challenge the assumption that metastatic cells fully reprogram their metabolism to colonize new organs, proposing instead that their growth is constrained by retained metabolic dependencies.

Personal highlights

Metabolic retention in metastasis: pancreatic cancer metastases in the liver or lung retain metabolic features of the primary tumor, exhibiting similar glucose and glutamine utilization patterns despite differing tissue environments.
Primary site growth advantage: both primary and metastatic-derived cancer cells form larger tumors when re-implanted in their original tissue (e.g., pancreatic cells grow better in the pancreas than in the liver or lung), suggesting a persistent "home-field advantage."
Limited metabolic plasticity: repeated passaging of metastatic cells in secondary sites (e.g., lung or liver) does not erase their preference for the primary site, indicating that metabolic adaptation is constrained by tissue-of-origin imprinting.
Nutrient environment as a barrier: media mimicking primary tissue interstitial fluid better supports cancer cell proliferation than media matching metastatic sites, implicating nutrient availability in metastatic tropism.
Conserved regulatory programs: single-cell RNA-seq reveals that metabolic gene expression in metastases clusters with the primary tumor, not the metastatic tissue, underscoring the dominance of origin-derived regulation.

Why should we care?

This work redefines how we think about metastasis: it’s not just about cancer cells adapting to new environments but also about their inability to fully escape their metabolic roots. For clinicians, this means that therapies targeting tissue-of-origin metabolic pathways (e.g., pancreatic cancer’s reliance on specific nutrients) could remain effective against metastases. For researchers, it highlights the need to study metastasis through the lens of retained dependencies rather than just acquired plasticity. By uncovering why certain cancers metastasize to predictable sites, the study opens doors to blocking metastasis by exploiting these metabolic constraints—potentially making tumors less "mobile" and more treatable.

An Orthogonal T7 Replisome for Continuous Hypermutation and Accelerated Evolution in E. coli

Diercks et al., Science (2025). DOI: 10.1126/science.adp9583

The paper in one sentence

Researchers engineered an orthogonal T7 phage replisome in E. coli to enable ultra-fast, targeted mutagenesis of plasmids, achieved mutation rates 100,000× higher than that of the host genome, and used it to evolve antibiotic resistance in just one week.

Summary

This study introduces T7-ORACLE, a synthetic biology tool that hijacks the bacteriophage T7 replication machinery to create a hypermutagenic "evolution engine" in E. coli. By optimizing T7 DNA polymerase variants (e.g., exonuclease-deficient mutants like Δ28 and active-site destabilizers like N520M), the team achieved staggering mutation rates (1.7 × 10⁻⁵ substitutions/base) on circular plasmids while leaving the host genome intact. The system’s utility was demonstrated by evolving TEM-1 β-lactamase to resist monobactam and cephalosporin antibiotics 5,000-fold better in under a week, recapitulating decades of clinical resistance mutations. Key innovations include:

Orthogonal replication: T7 replisome proteins (gp1, gp2.5, gp4, gp5) are expressed in E. coli to replicate plasmids with T7 origins, avoiding genomic interference.
Hypermutagenesis: Engineered polymerase variants mutate the target plasmids at rates 100,000× higher than natural E. coli mutation rates.
Scalability: Circular plasmids enable high transformation efficiency (2.4 × 10¹⁰ CFU/µg) and compatibility with standard lab workflows.

Personal highlights

T7 replisome orthogonality: by fusing hydrolase-deficient T7 lysozyme to T7 RNA polymerase, the team solved the initiation problem for plasmid replication, achieving stable, high-copy-number mutagenesis without host toxicity.
Directed evolution of mutagenesis: rational engineering of T7 DNA polymerase (e.g., Δ28 + N520M + P560V + V443K) pushed mutation rates to 1.7 × 10⁻⁵ substitutions/base—far surpassing prior systems like OrthoRep (10⁻⁶) or EcoRep (10⁻⁷).
Antibiotic resistance in a week: continuous passaging under antibiotic pressure evolved TEM-1 β-lactamase to clinically relevant resistance (e.g., G238S + E104K for cefotaxime), mirroring natural evolution but 1,000× faster.
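
A quick back-of-the-envelope calculation in Python helps put the quoted rates in perspective. It rests on assumptions that are not from the paper: the rate is treated as per base per replication, the target is taken to be an 861 bp gene (roughly TEM-1 sized), and substitutions are assumed to follow a Poisson distribution.

```python
import math

# Figures quoted in the summary and highlights above.
PLASMID_RATE = 1.7e-5      # substitutions per base on the T7-replicated plasmid
FOLD_OVER_HOST = 100_000   # reported fold increase over the host genome

# Assumption for illustration only: an 861 bp target gene.
GENE_LENGTH_BP = 861

# Host rate implied by the two quoted numbers.
implied_host_rate = PLASMID_RATE / FOLD_OVER_HOST

# Expected substitutions in one replicated copy of the gene.
expected_subs_per_copy = PLASMID_RATE * GENE_LENGTH_BP

# Poisson assumption: chance that one copy carries at least one substitution.
p_at_least_one = 1 - math.exp(-expected_subs_per_copy)

# Fold difference relative to the approximate rates quoted for earlier systems.
fold_vs_orthorep = PLASMID_RATE / 1e-6
fold_vs_ecorep = PLASMID_RATE / 1e-7

print(f"implied host rate: {implied_host_rate:.1e} substitutions/base")
print(f"expected substitutions per gene copy: {expected_subs_per_copy:.4f}")
print(f"P(at least one substitution per copy): {p_at_least_one:.4f}")
print(f"fold vs OrthoRep: {fold_vs_orthorep:.0f}, vs EcoRep: {fold_vs_ecorep:.0f}")
```

Under these assumptions a single copy picks up about 0.015 substitutions per replication, so most of the diversity comes from the sheer number of plasmid copies and generations. The quoted figures also put the system roughly 17-fold above OrthoRep and 170-fold above EcoRep.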

Why should we care?

T7-ORACLE transforms protein evolution from a slow, labor-intensive process into a rapid, automated one. For biotech, it accelerates enzyme engineering (e.g., for greener chemistry or drug synthesis). For medicine, it models antibiotic resistance pathways in days, aiding drug design. For synthetic biology, it’s a modular tool to evolve genetic parts (promoters, ribosomes) or even entire pathways.

Unlike phage-assisted continuous evolution (PACE), T7-ORACLE requires no specialized equipment, democratizing hypermutation for any lab. By decoupling mutagenesis from host viability, it also sidesteps the trade-offs of traditional mutagenesis (e.g., chemical mutagens’ toxicity). For a post-antibiotic era, tools like this could help us stay ahead of evolution by predicting and countering resistance before it emerges in clinics.

Spatial Transcriptomics Deconvolution Methods Adapt Well to Chromatin Accessibility Data

Ouologuem et al. Bioinformatics (2025). https://doi.org/10.1093/bioinformatics/btaf288

The paper in one sentence

This study demonstrates that existing spatial transcriptomics deconvolution methods, particularly Cell2location and RCTD, can effectively be applied to spatial chromatin accessibility (ATAC-seq) data, providing a framework for analyzing epigenetic regulation in tissues while highlighting areas for future improvement.

Summary

The paper evaluates five leading spatial transcriptomics deconvolution methods (Cell2location, RCTD, Tangram, SpatialDWLS, and DestVI) on their ability to deconvolve spatial chromatin accessibility data, which measures open chromatin regions rather than gene expression. By developing a simulation framework that generates both RNA and ATAC spot-based data from single-cell multiome references, the authors benchmark method performance across modalities. Key findings include:

Cell2location and RCTD perform robustly on ATAC-seq data, rivaling their RNA-seq accuracy.
Feature selection matters: highly variable peaks outperform highly accessible peaks for deconvolution.
Tangram’s performance is reference-dependent, excelling only when reference and spatial data compositions closely match.
RNA-based deconvolution still edges out ATAC in resolving rare cell types, suggesting room for modality-specific optimizations.
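
For readers new to deconvolution, the core idea is that each spot is treated as a non-negative mixture of reference cell-type profiles. The Python sketch below recovers mixture proportions with plain non-negative least squares via projected gradient descent. This is a deliberately simplified stand-in, not the probabilistic count models used by Cell2location or RCTD, and the profiles and mixture are made-up numbers.

```python
def deconvolve(spot, profiles, n_iter=500):
    """Estimate non-negative cell-type proportions for one spot.

    spot: list of feature values (genes or peaks) for the spot.
    profiles: one feature vector per cell type, same length as spot.
    Minimizes squared error by projected gradient descent (generic NNLS).
    """
    n_types = len(profiles)
    n_feat = len(spot)
    # 1 / squared Frobenius norm is a safe step size for this objective.
    step = 1.0 / sum(v * v for p in profiles for v in p)
    w = [1.0 / n_types] * n_types
    for _ in range(n_iter):
        pred = [sum(w[k] * profiles[k][f] for k in range(n_types))
                for f in range(n_feat)]
        resid = [pred[f] - spot[f] for f in range(n_feat)]
        grad = [sum(resid[f] * profiles[k][f] for f in range(n_feat))
                for k in range(n_types)]
        # Gradient step followed by projection onto non-negative weights.
        w = [max(0.0, w[k] - step * grad[k]) for k in range(n_types)]
    total = sum(w) or 1.0
    return [v / total for v in w]

# Hypothetical reference profiles: three cell types, four features.
profiles = [
    [10.0, 1.0, 1.0, 5.0],
    [1.0, 10.0, 1.0, 5.0],
    [1.0, 1.0, 10.0, 5.0],
]
true_mix = [0.5, 0.3, 0.2]

# Simulate a noise-free spot as a weighted sum of the reference profiles.
spot = [sum(true_mix[k] * profiles[k][f] for k in range(3)) for f in range(4)]

estimate = deconvolve(spot, profiles)
print([round(v, 3) for v in estimate])
```

The same recipe applies whether the features are genes or chromatin peaks, which is one intuition for why RNA-oriented tools can transfer to ATAC data. Real spots are noisy and sparse, so the benchmarked methods rely on count-based likelihoods instead of squared error.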

Personal highlights

Cross-modal adaptability of deconvolution tools: Cell2location and RCTD, designed for RNA, generalize well to ATAC-seq data, thanks to shared count-distribution assumptions (e.g., negative binomial/Poisson models). This suggests a unified framework for spatial multiomics.
Feature selection impacts performance: highly variable peaks (informative for cell identity) beat highly accessible peaks (common but noisy), underscoring the need for careful feature curation in epigenomic deconvolution.
Tangram’s caveat—reference sensitivity: Tangram’s cell-mode excels only when reference and spatial data align perfectly, warning users to validate reference compatibility before trusting results.
ATAC’s sparsity challenge: while methods work, RNA still outperforms ATAC in rare-cell detection, hinting at opportunities for epigenetic-specific algorithm refinements (e.g., modeling zero-inflation).
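
The difference between the two feature-selection strategies can be shown with a small Python sketch. The peak names and accessibility values are invented, and variance is computed across cell-type averages as a simplification; real pipelines typically compute variability across individual cells.

```python
from statistics import mean, pvariance

# Hypothetical mean accessibility per peak across three cell types.
peaks = {
    "peak_housekeeping": [0.90, 0.92, 0.91],  # open everywhere
    "peak_tcell": [0.80, 0.05, 0.04],         # open mainly in one cell type
    "peak_bcell": [0.03, 0.70, 0.05],         # open mainly in another
    "peak_closed": [0.01, 0.02, 0.01],        # closed everywhere
}

def top_k(scores, k):
    """Return the k peak names with the highest score."""
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Strategy 1: rank peaks by overall accessibility.
most_accessible = top_k({p: mean(v) for p, v in peaks.items()}, 2)

# Strategy 2: rank peaks by how much accessibility varies between cell types.
most_variable = top_k({p: pvariance(v) for p, v in peaks.items()}, 2)

print("highly accessible:", most_accessible)
print("highly variable:", most_variable)
```

Ranking by accessibility pulls in the peak that is open everywhere and therefore says nothing about cell identity, while ranking by variability keeps the two cell-type-specific peaks.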

Other papers that piqued my interest and were added to the purgatory of my “to read” pile

Thanks for reading.

Cheers,

Seb.

Sebcentrism
