Platform Demo
Integrated Modeling of ATRA-Driven Core Network Reprogramming to Reverse Malignant States in HCC
Abstract
- Background: Hepatocellular carcinoma (HCC) remains a leading cause of cancer mortality. All-trans retinoic acid (ATRA) shows promise for forcing malignant cells toward differentiated, less aggressive states.
- Methods: Multi-omics integration (bulk & single-cell), computational modeling, and targeted experimental validation.
- Findings: Identification of key ATRA targets and regulatory circuits underpinning state reversal and selective vulnerability.
- Innovation: A mechanism-to-therapy framework spanning molecular logic, delivery design, and combination strategies.
1.1 Current Status & Challenges in Liver Cancer Therapy
Globally, HCC is among the most prevalent and lethal malignancies. Late diagnosis, high heterogeneity, and limited durability of current treatments (resection/ablation, TACE, TKIs, immunotherapy) leave substantial unmet needs. Therapeutic reprogramming aims to shift cell states toward less aggressive, more differentiated phenotypes.
HCC burden clusters in regions with chronic hepatitis prevalence and metabolic risk. The integrated atlas allows us to compare tumor vs. normal hepatocytes across donors and cell lines, improving generalization.
Surgery and locoregional therapies suffer from recurrence; TKIs and immunotherapy face resistance, toxicity, and patient-stratification challenges. These shortcomings are largely rooted in intratumoral heterogeneity.
| Modality | Strength | Limitation | Implication |
|---|---|---|---|
| Resection/Ablation | Local control | High recurrence | Microresidual disease |
| TACE | Bridging/palliative | Hypoxia-driven escape | Adaptive stress responses |
| TKIs | Pathway blockade | On-target/off-target AEs | Limited durability |
| Immunotherapy | Long tails | Variable response | Biomarker need |
The reprogramming paradigm aims to stabilize malignant ecosystems by nudging cells toward differentiated fates, thereby reducing proliferation, invasion, and immune-evasion pressures. ATRA is a prime candidate for achieving this.
Why reprogramming may generalize better than cytotoxicity
Cytotoxic approaches select for resistant clones. Reprogramming reduces selective pressure by redirecting the state rather than eliminating cells—potentially improving durability and synergy with standard of care.
Interactive “What-if” · Recurrence Reduction with ATRA-guided Reprogramming
Use the slider to hypothesize an average 20% recurrence reduction among high-risk subclones. The narrative updates to illustrate expected benefits and where combination therapy could help.
A 20% reduction could shift early recurrence curves downward, easing pressure on salvage therapy and enabling longer windows for immune engagement. Clusters with high cell-cycle scores would benefit most.
1.2 Biological Basis of ATRA
Retinoic acid (RA) signaling through RAR/RXR modulates chromatin accessibility and transcriptional programs governing differentiation, apoptosis, and cell-cycle control. In acute promyelocytic leukemia (APL), ATRA induces lineage maturation and durable remission, which shows that transcriptional rewiring can be therapeutic. In solid tumors, context-dependent efficacy implies that core gene regulatory networks (GRNs) determine responsiveness, motivating our network-first analysis.
1.3 Objectives & Scientific Questions
Core hypothesis: ATRA can reverse malignant phenotypes in HCC by reprogramming core regulatory networks that govern differentiation and survival, yielding tractable design rules for combinations and delivery.
| Question | Operationalization | Readout |
|---|---|---|
| Q1: Who are the actionable targets? | Infer TF→target circuits, score ATRA receptors & downstream modules | Ranked targets / modules; testable hypotheses |
| Q2: Can we predict responsiveness? | Integrate single-cell states with graph-aware predictors | Response scores; state-transition likelihoods |
| Q3: How to design therapy? | Combine with SoC; optimize delivery (e.g., Lipo-ATRA) | In-silico prioritization; validation plan |
Model · Technical Roadmap (PDF)
Chapter 2 · Materials & Methods
2.1.1 Data Sources & Characteristics
We assembled a comprehensive liver single-cell compendium by integrating authoritative resources: LaminDB ARC virtual atlas (multi-omics reference), hepatoma cell lines (HepG2, Huh7, PLC/PRF/5, Hep3B), normal liver references (THLE-2 and HHSteC) and the Tabula Sapiens liver atlas. This mixture provides tumor vs. normal contrasts and population diversity necessary for robust modeling.
| Dataset | Cells | Genes | Batches | Source |
|---|---|---|---|---|
| LaminDB ARC | ~800,000 | ~20,000 | Multiple | Integrated database |
| Hepatoma cell lines | ~400,000 | ~18,000 | 4 lines | Lab-curated |
| Tabula Sapiens (liver) | ~300,000 | ~19,500 | Multi-donor | Public |
2.1.2 Quality Control Models
To ensure reliable downstream analysis, we applied distribution-aware QC. Genes were kept if expressed in at least:
\[ \max\left\{100,\;0.001\times N_{\text{total}}\right\} \tag{2.1} \]
The mitochondrial fraction identifies stressed/low-quality cells:
\[ \mathrm{MT}_{\text{ratio}} \;=\; \frac{\sum_{i=1}^{n_{\text{mt}}} C_{\text{mt},i}}{\sum_{j=1}^{n_{\text{total}}} C_{\text{total},j}} \times 100\% \tag{2.2} \]
QC (a) Gene count histogram
QC (c) Library size distribution
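The two QC rules above (Eqs. 2.1 and 2.2) can be sketched in a few lines of plain Python. This is a minimal illustration, not the pipeline's actual code: the function names are hypothetical, and identifying mitochondrial genes by the `MT-` symbol prefix is an assumption based on the usual human gene-naming convention.

```python
def min_cells_threshold(n_total_cells):
    """Eq. 2.1: minimum number of cells a gene must be expressed in."""
    return max(100, 0.001 * n_total_cells)


def mt_ratio(counts, gene_names, mt_prefix="MT-"):
    """Eq. 2.2: percentage of one cell's counts that come from mitochondrial genes.

    The "MT-" prefix is an assumed naming convention, not stated in the text.
    """
    total = sum(counts)
    if total == 0:
        return 0.0  # empty cell: define the ratio as zero
    mt = sum(c for c, g in zip(counts, gene_names) if g.startswith(mt_prefix))
    return 100.0 * mt / total


# Toy cell with four genes (illustrative counts only).
genes = ["MT-CO1", "MT-ND1", "ALB", "EPCAM"]
cell = [30, 20, 140, 10]
print(min_cells_threshold(50_000))  # small datasets fall back to the floor of 100
print(mt_ratio(cell, genes))        # 50 of 200 counts are mitochondrial
```

For the larger atlases in the table above, the 0.1% term dominates the floor of 100 cells, so the threshold scales with dataset size.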
We normalize library depth and stabilize variance with the following:
\[ \hat{C}_{ij}=\frac{C_{ij}}{\sum_{k=1}^{m} C_{ik}}\times 10^{4} \tag{2.3} \]
\[ X_{ij}=\ln\!\left(\hat{C}_{ij}+1\right) \tag{2.4} \]
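As a small worked sketch of Eqs. 2.3 and 2.4, the function below normalizes one cell's counts to a library size of 10^4 and applies the natural-log transform. The function name and the handling of empty cells are assumptions made for illustration.

```python
import math


def normalize_log1p(cell_counts, scale=1e4):
    """Eqs. 2.3-2.4: depth-normalize one cell to `scale` counts, then ln(x + 1)."""
    total = sum(cell_counts)
    if total == 0:
        return [0.0 for _ in cell_counts]  # assumed convention for empty cells
    return [math.log(c / total * scale + 1.0) for c in cell_counts]


# A cell with 20 total counts across three genes.
x = normalize_log1p([0, 5, 15])
print(x)
```

Genes with zero counts stay at exactly zero after the transform, which preserves sparsity.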
2.1.4 Highly Variable Gene (HVG) Selection
Per-gene mean and dispersion are estimated and filtered as:
\[ \sigma_g^2 = \frac{1}{n}\sum_{i=1}^{n} \big(x_{g i}-\mu_g\big)^2 \quad\text{with}\quad 0.0125\le \mu_g\le 3\;\; \text{and}\;\; \sigma_g^2>0.5 \tag{2.5} \]
HVG (a) Mean–dispersion scatter
HVG (b) HVG count across datasets — filmstrip (28 panels)
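The filter in Eq. 2.5 can be illustrated directly: compute the per-gene mean and the 1/n variance, then keep genes inside the stated bounds. This is a sketch under the assumption that the thresholds apply to the variance exactly as written; the gene names and values are toy data.

```python
def is_hvg(values, mu_min=0.0125, mu_max=3.0, var_min=0.5):
    """Eq. 2.5: keep a gene if its mean is within bounds and its variance is high."""
    n = len(values)
    mu = sum(values) / n
    var = sum((v - mu) ** 2 for v in values) / n  # population variance (1/n)
    return mu_min <= mu <= mu_max and var > var_min


# Toy normalized expression per gene across four cells.
expr = {
    "geneA": [0.0, 2.0, 0.0, 2.0],  # mean 1.0, variance 1.0
    "geneB": [1.0, 1.0, 1.0, 1.0],  # no variance
    "geneC": [5.0, 6.0, 7.0, 6.0],  # mean above the upper bound
}
hvgs = [g for g, v in expr.items() if is_hvg(v)]
print(hvgs)
```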
2.1.5 Batch-Effect Correction
We constructed batch-balanced k-nearest-neighbor graphs in PCA space (BBKNN) and also applied Harmony, which iteratively realigns the PCA embedding across batches. Together they remove technical offsets while preserving biological structure.
Before / After UMAP (colored by batch)


Batch mixing score
2.1.6 Cell-Cycle Scoring
Using curated S and G2/M gene sets, each cell receives standardized stage scores:
\[ S_{\text{phase}} \;=\; \frac{1}{|G_s|}\sum_{g\in G_s}\frac{E_g-\mu_g}{\sigma_g} \tag{2.6} \]
Cell-cycle stages on embeddings


S and G2/M score distributions


Cell-type specific cycle features
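The phase score of Eq. 2.6 is a mean z-score over a gene set. The sketch below assumes that the mean and standard deviation are computed per gene across all cells (population SD); the two-gene set and the expression values are illustrative only, and skipping zero-variance genes is an added safeguard that Eq. 2.6 does not mention.

```python
import statistics


def phase_score(cell_expr, gene_set, mean, sd):
    """Eq. 2.6: average z-score of one cell over a phase gene set."""
    # Genes with zero spread are skipped (an assumption; Eq. 2.6 averages over |G_s|).
    z = [(cell_expr[g] - mean[g]) / sd[g] for g in gene_set if sd[g] > 0]
    return sum(z) / len(z) if z else 0.0


s_genes = ["MCM2", "PCNA"]  # illustrative subset of an S-phase gene set
cells = {
    "c1": {"MCM2": 3.0, "PCNA": 4.0},
    "c2": {"MCM2": 1.0, "PCNA": 2.0},
}
mean = {g: statistics.mean([c[g] for c in cells.values()]) for g in s_genes}
sd = {g: statistics.pstdev([c[g] for c in cells.values()]) for g in s_genes}
scores = {name: phase_score(c, s_genes, mean, sd) for name, c in cells.items()}
print(scores)
```

The same function applies to the G2/M set by swapping the gene list.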


2.1.8 Output Validation
Each preprocessing step produces QC reports and figures for auditability; processed objects (e.g., analyzed_data.h5ad) are persisted for reproducibility.
Preprocessing overview
2.2.2 ATRA Receptor Expression Landscape
Composite receptor score per cell:
\[ E_{r,c}=\frac{1}{n_r}\sum_{g\in\{RARA,RARB,RARG,RXRA,RXRB,RXRG\}} \frac{X_{g,c}-\mu_g}{\sigma_g} \tag{2.7} \]
ATRA receptor expression profiles
2.2.3 Differential Expression & Pathway Enrichment
Ranking and significance:
\[ \log_2\!\left(\frac{\mu_{g,\text{hepatoma}}+\epsilon}{\mu_{g,\text{normal}}+\epsilon}\right), \qquad p_{\text{adj}}<0.05 \tag{2.8} \]
Volcano gallery, filmstrip (30 panels)
Top KEGG pathways
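The ranking in Eq. 2.8 combines a pseudocount-stabilized log2 fold change with an adjusted p-value cutoff. The sketch below is illustrative: the value of ε and the use of the Benjamini-Hochberg procedure for the adjustment are assumptions, since the text specifies neither.

```python
import math


def log2_fold_change(mean_hepatoma, mean_normal, eps=1e-9):
    """Eq. 2.8: log2 ratio of group means; eps (assumed value) avoids division by zero."""
    return math.log2((mean_hepatoma + eps) / (mean_normal + eps))


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values (assumed adjustment method)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adj = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adj[i] = running_min
    return adj


pvals = [0.01, 0.04, 0.03, 0.5]  # toy raw p-values for four genes
p_adj = benjamini_hochberg(pvals)
significant = [i for i, p in enumerate(p_adj) if p < 0.05]
print(log2_fold_change(4.0, 1.0), p_adj, significant)
```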
2.2.4 Random-Forest Targeted Delivery Model
Classifier and aggregation:
\[ X\in\mathbb{R}^{n\times p},\quad y\in\{\text{hepatoma},\text{normal}\},\qquad \hat y = \mathrm{mode}\left\{T_b(X)\right\}_{b=1}^{B},\; B=50 \tag{2.9} \]
Gini-based importance ranks candidate targets for Lipo-ATRA.
Feature importance ranking
Model performance


ROC curve
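The aggregation step in Eq. 2.9 is a majority vote over the B trees. The sketch below shows only that step: the three "trees" are hypothetical single-threshold rules standing in for fitted trees, and the thresholds and expression values are invented for illustration.

```python
from collections import Counter


def forest_predict(trees, x):
    """Eq. 2.9: the predicted label is the mode of the individual tree votes."""
    votes = [tree(x) for tree in trees]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical stand-ins for fitted trees (the text uses B = 50; three are shown).
trees = [
    lambda x: "hepatoma" if x["AKR1B10"] > 1.0 else "normal",
    lambda x: "hepatoma" if x["ALB"] < 2.0 else "normal",
    lambda x: "hepatoma" if x["EPCAM"] > 0.5 else "normal",
]
cell = {"AKR1B10": 2.0, "ALB": 3.0, "EPCAM": 1.0}
label = forest_predict(trees, cell)
print(label)
```

With an even number of trees, as with B = 50, ties are possible; here `Counter.most_common` resolves them in favor of the label seen first, and a real implementation would need an explicit tie rule.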
2.2.5 Lipo-ATRA Treatment Simulation
Target scoring and tumor-selective downregulation:
\[ S_g = w_1\,\mathrm{Importance}_g + w_2\,E_{g,\text{hepatoma}} - w_3\,E_{g,\text{normal}} \tag{2.10} \]
\[ X^{\text{post}}_{g,c} = \begin{cases} 0.5\,X^{\text{pre}}_{g,c}, & c \in \text{hepatoma}\ \land\ g=g_{\text{target}} \\ X^{\text{pre}}_{g,c}, & \text{otherwise} \end{cases} \tag{2.11} \]
Per-gene expression change, filmstrip (28 panels)
Box/Violin summaries — filmstrip (28 panels)
Response statistics
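Eqs. 2.10 and 2.11 translate into two small functions: one scores a candidate target, the other halves the target gene's expression in hepatoma cells only. This is a sketch; the weights default to 1 because the text does not give values for w1 to w3, and the gene and cell names are placeholders.

```python
def target_score(importance, expr_hepatoma, expr_normal, w=(1.0, 1.0, 1.0)):
    """Eq. 2.10: reward importance and tumor expression, penalize normal expression."""
    w1, w2, w3 = w  # weights are assumed; the text leaves them unspecified
    return w1 * importance + w2 * expr_hepatoma - w3 * expr_normal


def simulate_lipo_atra(expr, cell_labels, target_gene, factor=0.5):
    """Eq. 2.11: scale the target gene by `factor` in hepatoma cells, copy the rest."""
    post = {}
    for cell, genes in expr.items():
        post[cell] = dict(genes)  # copy so the pre-treatment matrix is untouched
        if cell_labels[cell] == "hepatoma" and target_gene in genes:
            post[cell][target_gene] = factor * genes[target_gene]
    return post


expr = {"h1": {"G1": 4.0, "G2": 2.0}, "n1": {"G1": 4.0, "G2": 2.0}}
labels = {"h1": "hepatoma", "n1": "normal"}
post = simulate_lipo_atra(expr, labels, "G1")
print(target_score(0.5, 2.0, 0.5), post)
```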
2.2.6 Mechanism-of-Action Network
Post-treatment DEG thresholds and effect size:
\[ p_{\text{adj}}<0.01,\qquad \left|\log_2\mathrm{FC}\right|>1.5 \tag{2.12} \]
Post-treatment DEGs are mapped to liver/cancer-relevant pathways to generate a hypothesis network for ATRA-mediated reprogramming.
(a) Heatmap of responsive genes
(b) Interaction networks


(c) Mechanistic schematic
2.4.6 Integrated Validation Metrics
Integrated validation dashboard
2.4.7 Reproducibility & Robustness
Robustness analyses
2.5.1 Quantifying Cell Behavior
Trajectory definition (centroid linkage):
\[ \mathbf{p}_i(t)=(x_i(t),y_i(t)) \tag{2.13} \]
We compare constant-velocity, quadratic, and AR(1) extrapolations for smoothing and short-horizon prediction.
Trajectory & extrapolation
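Two of the extrapolation schemes named above can be sketched on a centroid track as defined in Eq. 2.13. The AR(1) variant here is one possible reading, assumed for illustration: it models successive displacement steps as s(t+1) = φ·s(t) with a single least-squares φ shared by both coordinates. The quadratic scheme is omitted for brevity.

```python
def constant_velocity_next(track):
    """Predict the next position by repeating the last displacement."""
    (x0, y0), (x1, y1) = track[-2], track[-1]
    return (2 * x1 - x0, 2 * y1 - y0)


def ar1_next(track):
    """Predict the next position with an AR(1) model on displacement steps (assumed form)."""
    steps = [(b[0] - a[0], b[1] - a[1]) for a, b in zip(track, track[1:])]
    # Least-squares estimate of phi from consecutive step pairs.
    num = sum(s1[0] * s0[0] + s1[1] * s0[1] for s0, s1 in zip(steps, steps[1:]))
    den = sum(s0[0] ** 2 + s0[1] ** 2 for s0 in steps[:-1])
    phi = num / den if den > 0 else 0.0
    x, y = track[-1]
    return (x + phi * steps[-1][0], y + phi * steps[-1][1])


# A decelerating cell: steps of 4, 2, then 1 along x.
track = [(0.0, 0.0), (4.0, 0.0), (6.0, 0.0), (7.0, 0.0)]
print(constant_velocity_next(track))  # repeats the last step of 1
print(ar1_next(track))                # shrinks the last step by the fitted phi
```

On this toy track the AR(1) prediction undershoots the constant-velocity one, which is the behavior expected when migration persistence decays.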
MSD curves & anomalous diffusion




Behavior clusters


Cluster trajectory exemplars


Parameter correlations & network


Population velocity fields & density maps




2.6 Engineering Design
ATRA biosynthetic pathway
Pareto front for plasmid optimization
Dose–response system curves




Restriction site density map
Protein physicochemical properties


System parameter sensitivity
Chapter 3 · Results
3.1 Heterogeneity Atlas of Liver Cancer Cells
Faceted visualizations summarize cellular diversity, malignant program activation, and cycling states.
3.1.1 Cell-State Diversity
We identify major subpopulations using graph-based clustering on integrated scRNA-seq embeddings. Subcluster-specific marker genes (e.g., AKR1B10, EPCAM, ALB) delineate malignant, progenitor-like, and hepatocyte-like states. Heterogeneity is quantified via intra-/inter-cluster dispersion and silhouette indices.
3.1.2 Malignant-State Features
Transcriptome contrasts between HCC and normal hepatocytes highlight up-regulated cell-cycle/DNA-repair programs and reduced hepatocyte metabolism. Cell-cycle staging reveals proliferative bias in specific subclusters.
3.2 ATRA Target Expression
Retinoic acid receptor isoforms (RAR/RXR) show enriched co-expression in malignant subclusters with high cycling activity, motivating receptor-guided delivery and rational combinations.
3.3 Core Regulatory Network Identification
Using feature selection and graph metrics over the integrated atlas, we prioritize hepatoma-specific regulators and modules.
3.4 Mechanistic Validation
3.4.1 Molecular Docking (narrative)
Prior literature supports plausible ATRA binding within RAR/RXR pockets (conserved residues, pocket geometry). Without a provided docking figure in the current assets, conclusions here remain text-only and consistent with receptor expression maps in Section 3.2.
3.4.2 Virtual Intervention (Lipo-ATRA)
We simulate receptor-informed down-regulation and pathway inactivation; higher receptor scores predict stronger reprogramming toward less aggressive states.
3.5 Experimental Validation
Behavior assays reveal reduced migration persistence and altered trajectory patterns after ATRA exposure.
3.6 Therapeutic System Design
Plasmid and delivery engineering focus on maximizing on-target reprogramming in receptor-high malignant niches while sparing receptor-low hepatocytes.
3.6 Plasmid ORF / Protein Analysis — Visual Summary
Interactive exploration of predicted ORFs and protein properties derived from the plasmid file. Hover points to inspect details; click table headers to sort. Filters update all charts.
MW vs pI (dot size = length aa)
Length Distribution (aa)
Top ORFs by Codon Adaptation Index (CAI)
ORF Table
| ORF | Start | End | Strand | Length (aa) | MW (Da) | pI | Instability | Stability | CAI |
|---|---|---|---|---|---|---|---|---|---|
Chapter 4 · Discussion
4.1 Integrated View of Major Findings
We synthesize evidence from expression mapping, network modeling, virtual intervention, and cell-behavior validation to provide a unified interpretation of how ATRA drives malignant-to-differentiated transitions in hepatocellular carcinoma.
\[ \mathcal{W}_{\text{ATRA}} \;=\; \sum_{\ell \in \{\text{scRNA, net, ML, sim, beh}\}} w_\ell \, z_\ell \quad\text{with}\quad \sum_\ell w_\ell = 1 \tag{4.1} \]
| Conclusion | scRNA-seq | Network | ML | Simulation | Behavior |
|---|---|---|---|---|---|
| ATRA-responsive subclusters align with high RAR/RXR expression. | ✓ | ✓ | ✓ | ✓ | — |
| Core circuits controlling cycle-to-differentiation shifts can be perturbed. | — | ✓ | ✓ | ✓ | — |
| Virtual knockdowns predict reduced proliferation, matching behavior assays. | — | — | — | ✓ | ✓ |
| Hepatocyte-like programs re-emerge under receptor-guided delivery. | ✓ | ✓ | — | ✓ | — |
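As a worked illustration of Eq. 4.1, the sketch below scores the first row of the table. Encoding ✓ as 1 and — as 0 for each evidence layer, and using equal weights, are assumptions made here; the text defines neither the scale of z nor the weight values.

```python
def integrated_score(z, w):
    """Eq. 4.1: weighted sum of per-layer evidence, with weights normalized to sum to 1."""
    total = sum(w.values())
    return sum(w[layer] / total * z[layer] for layer in z)


# First table row: supported by scRNA, network, ML, and simulation, not by behavior.
z_row1 = {"scRNA": 1, "net": 1, "ML": 1, "sim": 1, "beh": 0}
w_equal = {"scRNA": 1, "net": 1, "ML": 1, "sim": 1, "beh": 1}
score = integrated_score(z_row1, w_equal)
print(score)
```

Under these assumptions the first conclusion scores four fifths of the maximum; raising the behavior weight would lower it, which makes the dependence on weight choice explicit.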
Compared to traditional views of ATRA as a differentiation agent primarily in hematologic malignancies, our integrated evidence suggests that solid-tumor contexts like HCC can also exhibit receptor-guided responsiveness—provided that the delivery and combination design respects the tumor’s heterogeneous microenvironments.
4.2 Biological Significance
The reprogramming effect arises when receptor activation suppresses proliferative circuits and relieves brakes on hepatocyte-like functions. Microenvironmental cues (hypoxia, stromal interactions, cytokines) modulate response heterogeneity.
\[ \text{Shift}_{\text{state}} \;=\; \beta_0 \;-\; \beta_1 \,\Pi_{\text{prolif}}(p) \;+\; \beta_2 \,\Phi_{\text{hep}}(p) \;-\; \beta_3 \,\Sigma_{\text{stress}}(p) \tag{4.2} \]
Interactive microenvironment model: drag the slider to vary microenvironmental pressure and preview expected shifts.
With moderate pressure (35%), proliferation remains dominant but starts to decline as receptor signaling competes with cytokine-driven growth.
4.3 Methodological Innovations
Multi-omics integration strategy
We align bulk and single-cell layers, unify gene identifiers, and infer cross-modal modules to expose circuits otherwise obscured by batch and platform differences. This provides a stable substrate for downstream causal modeling.
Closed-loop validation between computation and experiment
Predictions (target sets, directionality of change) are iteratively tested in behavior assays; discrepancies feed back into model refinement (feature selection, priors, constraints), tightening inference over time.
Systematic engineering design
Delivery and combination choices are mapped to receptor landscapes and safety constraints, yielding actionable, modular design rules.
4.4 Limitations and Outlook
Our analyses are bounded by data scale/coverage and by the depth of mechanistic probing achievable in current assays. Translational gaps remain between simulated reprogramming and durable clinical responses. We outline risks and mitigation below.
\[ \mathcal{R}_k \;=\; \text{Impact}_k \times \text{Likelihood}_k \quad\text{and}\quad \text{Priority} \;=\; \operatorname*{arg\,max}_k \mathcal{R}_k \tag{4.3} \]
Pick your near-term priority (updates the action list):
- Profile receptor landscapes at single-cell resolution in additional HCC cohorts.
- Benchmark off-target uptake in hepatocyte-like neighborhoods with low receptor scores.
- Stress-test dose scheduling to maximize on-target reprogramming.
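The prioritization rule in Eq. 4.3 can be sketched as follows. The risk names loosely mirror the action list above, and the impact and likelihood values are hypothetical placeholders, not estimates from the study.

```python
def prioritize(risks):
    """Eq. 4.3: score each risk as impact x likelihood and return the arg max."""
    scores = {name: impact * likelihood for name, (impact, likelihood) in risks.items()}
    top = max(scores, key=scores.get)
    return top, scores


# Hypothetical (impact, likelihood) pairs for illustration only.
risks = {
    "cohort coverage": (3, 0.5),
    "off-target uptake": (5, 0.4),
    "dose scheduling": (4, 0.3),
}
top, scores = prioritize(risks)
print(top, scores)
```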
Chapter 5 · Conclusions & Outlook
5.1 Main Conclusions
- Reprogramming model established: a structured framework that links receptor landscapes to state transitions in HCC.
- Key regulatory nodes identified: features and modules prioritized by network-aware ML strategies.
- Effects validated from molecules to cells: simulations and behavior assays converge on cycle↓ / differentiation↑.
- Feasible delivery system designed: receptor-guided Lipo-ATRA with rules for dosing and combinations.
Interactive confidence dashboard (adjust the sliders; the overall readiness updates):
5.2 Scientific Contributions
| Contribution | Highlights | Utility |
|---|---|---|
| Theory | Mechanism of state conversion via receptor-guided reprogramming | Clarifies how differentiation signals counteract malignant circuits |
| Method | Integration → network inference → simulation → closed-loop validation | Reusable blueprint for solid-tumor reprogramming studies |
| Application | Lipo-ATRA design rules; receptor-aware dosing/combination | Translational path for HCC with safety-aware targeting |
5.3 Future Directions
We prioritize preclinical validation and personalization. Early animal studies define efficacy/toxicity envelopes, while patient-stratification rules refine delivery and combinations to match receptor landscapes and microenvironmental pressures.
\[ \max_{\;\text{schedule}}\; \mathbb{E}[\Delta\text{Diff}] \;-\; \lambda_1 \mathbb{E}[\text{Toxicity}] \;-\; \lambda_2 \mathbb{E}[\text{Off\text{-}target}] \]
Personalization sandbox: toggle receptor landscape and microenvironment pressure; the recommended strategy updates.
- Receptor-guided Lipo-ATRA dosing; monitor off-target uptake.
- Combine with cycle dampeners if proliferative niches persist.
- Schedule pulses to consolidate differentiation gains.
Mixed receptors & moderate pressure: balance efficacy with safety; consider adaptive scheduling.
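The scheduling objective above can be evaluated over a finite set of candidate schedules. In the sketch below, the schedule names, the expected-value triples, and the penalty weights λ1 and λ2 are all hypothetical; in practice the expectations would come from simulation or preclinical data.

```python
def best_schedule(candidates, lam1=1.0, lam2=0.5):
    """Pick the schedule maximizing E[dDiff] - lam1 * E[Toxicity] - lam2 * E[Off-target]."""
    def objective(values):
        d_diff, toxicity, off_target = values
        return d_diff - lam1 * toxicity - lam2 * off_target

    utilities = {name: objective(v) for name, v in candidates.items()}
    return max(utilities, key=utilities.get), utilities


# Hypothetical (E[dDiff], E[Toxicity], E[Off-target]) per schedule.
candidates = {
    "continuous": (0.60, 0.30, 0.20),
    "pulsed": (0.50, 0.10, 0.10),
    "low-dose": (0.30, 0.05, 0.05),
}
choice, utilities = best_schedule(candidates)
print(choice, utilities)
```

With these placeholder numbers the pulsed schedule wins because its lower toxicity outweighs the smaller differentiation gain; different λ values can change the ranking.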
Appendix · Data & Model Gallery
A. Data Source & Contents
| Item | Content | Notes | Source |
|---|---|---|---|
| Original model images | UMAP / t-SNE / QC / mechanism & engineering schematics | Base visuals used across Chapters 2–3 | DOI: 10.5281/zenodo.17259332 |
| Analysis artifacts | HVG, volcano sets, simulated interventions, behavior panels | Intermediate products (non-personal) | Same as above |
| Plasmid dataset | Fusion ATRA plasmid key elements & constraints | See Chapter 3, Section 3.6 | Same as above |