Phase 1 aimed to use genomic bioinformatics methods to identify two protein receptors that are upregulated in gemcitabine-resistant and trastuzumab-resistant tumor cells and are closely associated with tumor proliferation and drug resistance mechanisms. This analysis identified HER2 and CD47 as the key receptor molecules for our subsequent design.
Phase 2 applied our active antibody design and screening pipeline to develop and select active antibody fragments targeting HER2 and CD47. This stage yielded the functional antibody sequences used for bispecific antibody assembly in Phase 3.
Phase 3 was designed to construct an active bispecific antibody by connecting the previously screened antibody fragments via a linker. We used molecular dynamics simulations to evaluate candidate linkage strategies between the two binding units and to select the most rational one.
To compare HER2 and CD47 expression between normal breast tissue and breast cancer tissue, we obtained transcriptomic gene expression data for normal breast tissue from healthy donors from the GTEx (Genotype-Tissue Expression) database, a gold-standard resource for studying gene expression in normal tissues. For breast cancer tissue, we acquired transcriptomic expression profiles of patients with Breast Invasive Carcinoma (BRCA) from TCGA (The Cancer Genome Atlas), one of the largest and most comprehensive cancer genomic databases.
To mitigate technical variations (i.e., batch effects) arising from different projects (GTEx and TCGA), we did not use the raw data directly from their respective sources. Instead, we utilized an integrated dataset processed through a unified pipeline. The UCSC Xena platform provides TCGA and GTEx data processed with the same bioinformatic workflow, with expression values measured in TPM (Transcripts Per Million). We downloaded the integrated TCGA-BRCA and GTEx breast tissue expression matrix from UCSC Xena.
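For readers unfamiliar with the unit, the standard definition of TPM is given below as a reminder. The symbols are introduced here for illustration only: r_i is the number of reads mapped to gene i and \ell_i is the effective length of that gene.

```latex
\mathrm{TPM}_i \;=\; \frac{r_i / \ell_i}{\sum_{j} r_j / \ell_j} \times 10^{6}
```

Because the values within each sample sum to one million, TPM values can be compared across samples more directly than raw read counts.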
From this integrated matrix, we filtered normal breast tissue samples (from GTEx) and primary breast tumor samples (from TCGA). We then extracted TPM expression values for the genes ERBB2 (i.e., HER2) and CD47 across all selected samples. A Wilcoxon rank-sum test was applied to assess whether the expression distributions of these two genes differed significantly between the normal and cancer groups. This non-parametric test was chosen due to the typically non-normal distribution of gene expression data, for which the Wilcoxon test offers greater robustness.
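The rank-sum comparison described above can be sketched in a few lines of standard-library Python. This is an illustrative implementation of the two-sided Wilcoxon rank-sum test using a tie-corrected normal approximation, not the code used in the project. The function names and the TPM values are made-up placeholders; in practice a library routine such as `scipy.stats.mannwhitneyu` would normally be used.

```python
import math


def rank_with_ties(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def wilcoxon_rank_sum(x, y):
    """Two-sided rank-sum test; returns (U statistic of x, z, p-value)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = rank_with_ties(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    # Tie correction: sum of (t^3 - t) over groups of t tied values.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # every value identical: no evidence of a difference
        return u1, 0.0, 1.0
    z = (u1 - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return u1, z, p


# Placeholder TPM values for one gene (NOT data from the study).
normal_tpm = [12.1, 15.3, 9.8, 14.0, 11.2]
tumor_tpm = [30.5, 22.7, 41.9, 18.4, 27.3]
u, z, p = wilcoxon_rank_sum(normal_tpm, tumor_tpm)
print(f"U = {u}, z = {z:.3f}, p = {p:.4g}")
```

The normal approximation is reasonable for group sizes like those reported here; for very small groups an exact test is preferable.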
The results demonstrated that both CD47 and HER2 were significantly overexpressed in breast tumor tissues (n = 199) compared to normal breast tissues (n = 146), with extremely high statistical significance (P = 5.39e-22 for CD47 and P = 6.44e-43 for HER2; Fig. 1a-1b).
Fig. 1 Expression of CD47 (a) and HER2 (b) in normal and breast cancer tissues. Cancer Cells: breast cancer.
To investigate the role of CD47 and HER2 in gemcitabine resistance, we compared their expression in a non-resistant cancer cell line (BxPC-3) and a gemcitabine-resistant cell line (PANC-1). mRNA sequencing profiles of both cell lines were obtained from the GEO database (accession GSE140077), and a t-test was applied to compare the expression levels of CD47 and HER2 between the two lines. The results (Fig. 2) showed that both HER2 and CD47 were significantly upregulated in the resistant cells (PANC-1).
Fig. 2 Expression of CD47 and HER2 in non-resistant versus gemcitabine-resistant cancer cells.
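The two-group comparison used for the cell-line data can be illustrated with a short sketch. The text does not state which variant of the t-test was applied, so the example below assumes Welch's unequal-variance form; the function name and the expression values are hypothetical placeholders.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)  # sample variances (n - 1 denominator)
    se2 = va / na + vb / nb            # squared standard error of the difference
    t = (mean(a) - mean(b)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


# Placeholder expression values (NOT data from GSE140077).
sensitive = [5.1, 4.8, 5.4]
resistant = [7.9, 8.3, 7.6]
t_stat, dof = welch_t(resistant, sensitive)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

The p-value follows from the t distribution with `df` degrees of freedom; `scipy.stats.ttest_ind(a, b, equal_var=False)` returns it directly.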
To investigate the role of CD47 and HER2 in trastuzumab resistance, we compared their expression in a non-resistant cancer cell line (SK-BR-3) and a trastuzumab-resistant cell line (HCC1954). mRNA sequencing profiles of both cell lines were obtained from the GEO database (accession GSE140077), and a t-test was used to compare the expression levels of CD47 and HER2 (Fig. 3). To validate expression at the protein level, we performed flow cytometry (FCM) to measure surface expression of HER2 and CD47 on the resistant cells (HCC1954). Both analyses consistently showed high expression of CD47 and HER2 in trastuzumab-resistant cells; in the flow cytometry histograms (Fig. 4), this appears as a rightward shift of the red peaks relative to the white control peaks. This supports their suitability as protein targets for our bispecific antibody design.
Fig. 3 Expression of CD47 and HER2 in non-resistant versus trastuzumab-resistant cancer cells.
Fig. 4 Flow cytometry assay.
In summary, bioinformatic analysis and preliminary experimental data indicate that both CD47 and HER2 are overexpressed in tumor cells. Our further analysis shows that overexpression of CD47 and HER2 is closely associated with gemcitabine resistance and trastuzumab resistance in tumors. These findings provide a rationale for simultaneously targeting CD47 and HER2 to overcome resistance to either gemcitabine or trastuzumab.
The second phase of our project focused on designing and screening antibody fragments targeting HER2 and CD47. We used the following workflow to design and select active antibody fragments for subsequent bispecific antibody development.
Fig. 5 Overall Screening Workflow
We used RFdiffusion (fine-tuned for antibodies), RoseTTAFold2 (fine-tuned for antibodies), and ProteinMPNN to design active antibody fragments against defined binding epitopes, producing a total of 10 sequences. For the relevant code, please refer to the GitHub repositories of the David Baker group.
Two strategies were used for selecting hotspot residues to guide active antibody design:
Active antibody design was guided by analyzing the interaction between Lemzoparlimab (TJC4) and CD47. Hotspot residues involving unique antigenic epitopes, such as Glu106 and Gly107, were selected.
Following conformational design, ProteinMPNN was used for sequence design and optimization. Some of the designed and optimized antibody fragments are shown in Fig. 6. By specifying hotspot residues, we directed the designed antibody fragments to bind the intended epitopes.
Fig. 6 AI-generated structural designs of antibody fragments. (A) Designed structure of the active antibody fragment targeting CD47, with the fragment shown in yellow. (B) Designed structure of the active antibody fragment targeting HER2, with the fragment chains shown in green and blue.
We retrieved and collected known active antibody fragments targeting HER2 and CD47 to construct a comprehensive database of active antibody fragments.
The AI-designed active antibody fragments and those from the collected database were screened according to the following workflow. Protein-protein docking was performed using the Schrödinger Protein-Protein Docking Panel. The top 10% of fragments ranked by Glide Score (3 for HER2 and 2 for CD47) were selected for further evaluation. The specific docking parameters are illustrated in Fig. 7.
Fig. 7 Docking parameters and setup
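The top-10% cut described above amounts to a simple ranking step, sketched below. The fragment names and scores are hypothetical, and the sketch assumes that lower (more negative) docking scores indicate better predicted binding; set `lower_is_better=False` if the scoring convention is reversed.

```python
import math


def select_top_fraction(scores, fraction=0.10, lower_is_better=True):
    """Return the names of the best-scoring `fraction` of candidates.

    scores: dict mapping candidate name -> docking score.
    At least one candidate is returned when `scores` is non-empty.
    """
    if not scores:
        return []
    # round() guards against floating-point noise such as 0.1 * 3 = 0.30000000000000004
    k = max(1, math.ceil(round(len(scores) * fraction, 9)))
    ranked = sorted(scores, key=scores.get, reverse=not lower_is_better)
    return ranked[:k]


# Hypothetical docking scores for 20 candidate fragments (NOT project data).
example_scores = {f"frag_{i:02d}": -float(i) for i in range(20)}
print(select_top_fraction(example_scores))
```

With 20 candidates this keeps 2 fragments and with 30 candidates it keeps 3, the same counts reported for CD47 and HER2 respectively, although the actual pool sizes are not stated in the text.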
For the antibody fragments ranked in the top 10% by Glide Score, the complexes formed with their respective receptors were first subjected to structure prediction using AlphaFold 3. The predicted structures were then used in molecular dynamics (MD) simulations to evaluate the stability of antibody-receptor binding. Equilibrium conformations from the MD trajectories were selected for MM-GBSA binding free energy calculations, and the fragment with the most favorable MM-GBSA binding free energy was chosen for bispecific antibody design.
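For reference, the general form of an MM-GBSA binding free energy estimate is summarized below. This is the textbook decomposition rather than the exact protocol used here; the "ligand" is the antibody fragment, angle brackets denote averages over the selected equilibrium frames, and the entropy term is often omitted in practice.

```latex
\Delta G_{\mathrm{bind}} = \langle G_{\mathrm{complex}} \rangle - \langle G_{\mathrm{receptor}} \rangle - \langle G_{\mathrm{ligand}} \rangle,
\qquad
G = E_{\mathrm{MM}} + G_{\mathrm{GB}} + G_{\mathrm{SA}} - T S
```

A more negative \Delta G_{\mathrm{bind}} corresponds to stronger predicted binding, so candidates are compared by how negative their estimate is.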
For HER2, fragments from both hotspot residue strategies, HER2_001 (Strategy 2) and HER2_002 (Strategy 1), ranked in the top 10% in docking and had nearly identical MM-GBSA scores. To determine which fragment bound more stably to the HER2 receptor, we performed 500 ns MD simulations of each HER2-fragment complex and computed their free energy landscapes. A more stable complex typically exhibits a narrower and deeper energy well in its free energy landscape. As shown in Fig. 9, HER2_002 demonstrated superior binding stability compared to HER2_001 and was therefore selected for subsequent bispecific antibody development. For CD47, CD47_001 was identified as the optimal active fragment based on MM-GBSA scores and was chosen for further design.
As illustrated in Fig. 8, both CD47_001 and HER2_002 stably bound to the surface of their respective target proteins and reached equilibrium rapidly during simulations. These results confirm that, at the computational level, the designed active antibody fragments effectively bind to the intended epitopes on the receptors.
Fig. 8 RMSD analysis from MD simulations
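The RMSD traces in Fig. 8 come from a standard calculation that can be sketched as follows. The sketch assumes the frames have already been superimposed on the reference structure (it does not perform the alignment), and the plateau check with its window and tolerance is a hypothetical heuristic for "reached equilibrium", not the criterion used in the project.

```python
import math
from statistics import pstdev


def rmsd(frame, reference):
    """RMSD between two pre-aligned coordinate sets of (x, y, z) tuples."""
    if len(frame) != len(reference) or not frame:
        raise ValueError("coordinate sets must be non-empty and equally sized")
    squared = sum(
        (a - b) ** 2
        for atom, ref_atom in zip(frame, reference)
        for a, b in zip(atom, ref_atom)
    )
    return math.sqrt(squared / len(frame))


def has_plateaued(rmsd_series, window, tolerance):
    """Crude equilibrium check: spread of the last `window` values <= tolerance."""
    if len(rmsd_series) < window:
        return False
    return pstdev(rmsd_series[-window:]) <= tolerance


# Toy two-atom example (coordinates in angstroms, NOT simulation data).
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(rmsd(frame, reference))
```

Trajectory analysis packages handle the superposition step and atom selection; the function above shows only the distance measure itself.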
Fig. 9 Free energy landscape
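A free energy landscape of the kind shown in Fig. 9 is commonly obtained by Boltzmann inversion of a two-dimensional histogram, F = -k_B T ln(P / P_max). The sketch below illustrates that calculation under stated assumptions: the text does not name the two collective variables (RMSD and radius of gyration are a common choice), and the 300 K temperature, bin count, and function name are placeholders.

```python
import math
from collections import Counter

KB_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def free_energy_landscape(cv1, cv2, n_bins=20, temperature=300.0):
    """Map each occupied (bin1, bin2) cell to a free energy in kcal/mol.

    The most populated cell is the zero of the energy scale.
    """
    lo1, hi1 = min(cv1), max(cv1)
    lo2, hi2 = min(cv2), max(cv2)

    def bin_index(value, lo, hi):
        if hi == lo:
            return 0
        # Clamp so the maximum value falls in the last bin.
        return min(int((value - lo) / (hi - lo) * n_bins), n_bins - 1)

    counts = Counter(
        (bin_index(a, lo1, hi1), bin_index(b, lo2, hi2)) for a, b in zip(cv1, cv2)
    )
    max_count = max(counts.values())
    kt = KB_KCAL * temperature
    return {cell: -kt * math.log(n / max_count) for cell, n in counts.items()}


# Toy collective-variable series (NOT simulation data).
fel = free_energy_landscape([0.0, 0.0, 0.0, 1.0], [0.0, 0.0, 0.0, 1.0], n_bins=2)
print(fel)
```

On such a map, a narrow and deep basin means the trajectory spends most of its time in a small region of conformational space, which is the stability criterion applied when comparing HER2_001 and HER2_002.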
This phase aimed to design a novel bispecific antibody, NanosphinX, by integrating the previously selected active antibody fragments (structures shown in Fig. 10) to enable simultaneous binding to both CD47 and HER2. We initially attempted two construction strategies. The first approach involved connecting the CD47-targeting active fragment to the heavy chain of the HER2-targeting antibody fragment using a flexible linker.
Fig. 10 AlphaFold 3-predicted structures of the active antibody fragments used for subsequent bispecific antibody design. (A) Structure of the CD47-targeting active antibody fragment. (B) Structure of the HER2-targeting active antibody fragment (single-chain).
However, subsequent experiments revealed that linking to the heavy chain frequently resulted in mispairing between heavy and light chains, preventing successful production of the intended bispecific antibody, NanosphinX. To address this, we further optimized the construct by instead connecting the CD47-targeting active fragment to the light chain of the HER2-targeting antibody fragment via a flexible linker (structure shown in Fig. 11). This approach resolved the issue, enabling successful expression and purification of the bispecific antibody protein.
Fig. 11 Predicted structure of the bispecific antibody formed by connecting the CD47-targeting active fragment to the light chain of the HER2-targeting antibody fragment via a flexible linker. Red: heavy chain of Her2nb. Green: light chain of Her2nb. Purple: CD47nb. Blue: linker.
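At the sequence level, the light-chain fusion described above is a concatenation of two domains through a linker, which the following sketch illustrates. Everything specific in it is an assumption: the text does not give the linker sequence (a (GGGGS)3 repeat is used here only because it is a common flexible linker), the domain order is arbitrary, and the sequences are short placeholders rather than the real fragments.

```python
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acids


def build_fusion(n_terminal, c_terminal, linker_unit="GGGGS", repeats=3):
    """Join two protein sequences through a repeated flexible linker unit."""
    for name, seq in (
        ("n_terminal", n_terminal),
        ("c_terminal", c_terminal),
        ("linker_unit", linker_unit),
    ):
        unknown = set(seq.upper()) - VALID_RESIDUES
        if not seq or unknown:
            raise ValueError(f"{name} is empty or has invalid residues: {sorted(unknown)}")
    return n_terminal.upper() + linker_unit.upper() * repeats + c_terminal.upper()


# Placeholder sequences (NOT the real light chain or CD47-targeting fragment).
light_chain = "DIQMTQSPSS"
cd47_fragment = "QVQLVESGGG"
print(build_fusion(light_chain, cd47_fragment))
```

Varying `repeats` gives linkers of different lengths, which is one way candidate linkage strategies could be enumerated before evaluating them by MD simulation.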