Design–Build–Test–Learn Overview
In developing our broad-spectrum influenza vaccine, we systematically applied the engineering Design-Build-Test-Learn (DBTL) cycle across three key stages—peptide prediction, antigen expression, and ferritin-based protection—while continuously integrating insights from our Human Practices (HP) work.
In the peptide prediction cycle, we designed by analyzing influenza genome data to identify conserved, antigenic HA2 regions, built and modeled candidate peptide sequences, tested their conservation and immunogenicity using bioinformatics tools, and learned from both computational results and HP feedback—which emphasized the need for broad protection and public trust—to select optimal epitopes.
During the antigen expression cycle, we designed optimized gene constructs for bacterial expression, built them in E. coli systems, tested expression and solubility through SDS-PAGE and purification, and learned from technical challenges (such as inclusion body formation) and HP findings, which highlighted the importance of reliable, scalable production and vaccine quality.
In the ferritin protection cycle, we designed nanoparticle constructs to enhance antigen stability and storage, built and purified ferritin-peptide nanoparticles, tested their physical properties and digestion resistance, and learned from both experimental data and HP outcomes—particularly the need for vaccines that are easy to store and distribute without cold-chain requirements.
At every stage, our Human Practices work—through community surveys, expert interviews, and stakeholder engagement—directly influenced our engineering decisions, ensuring our technical progress was continually aligned with real-world needs, public concerns about safety, and the practicalities of vaccine deployment.
Peptide Prediction Cycle
Digging for Conserved Regions in Influenza Virus Genome Sequences
Our project began with a fundamental question: how can we design a broadly protective influenza immunogen? To answer this, we targeted the most conserved regions of the influenza virus genome, focusing on the HA2 domain of hemagglutinin (HA) proteins from both H1N1 and B-type viruses. These regions are less subject to antigenic drift and are crucial for viral fusion, making them promising universal vaccine targets.
We systematically collected all available HA2 protein sequences from the NCBI Influenza Virus Sequence Database, covering isolates from 1934 to 2020 for H1N1 and from 1940 to 2020 for B-type influenza.
Fig1. Sequence Collection from NCBI (link)
Using MEGA10, we performed multiple sequence alignments, which allowed us to visualize amino acid conservation across decades and global strains. Conservation scoring, performed with Excel and BioEdit, enabled us to generate conservation matrices and sequence logos.
Fig2. Sequence Logo Depicting Conservation of H1 Hemagglutinin (HA) Proteins from 1934 to 2020. Amino acid sequence conservation of influenza A H1 hemagglutinin (HA) proteins was analyzed using sequences collected from 1934 to 2020. Multiple sequence alignment was performed and visualized as a sequence logo using MEGA10 and the WebLogo tool. In this plot, the overall height of each stack indicates the sequence conservation at that position, while the height of individual letters represents the relative frequency of each amino acid. Amino acids are color-coded by chemical properties: polar residues (green), basic residues (blue), acidic residues (red), and hydrophobic residues (black). The red box highlights a region of interest with particularly high conservation, selected for further analysis in this study.
Fig3. Sequence Logo Depicting Conservation of Influenza B Virus Hemagglutinin (HA) Proteins from 1940 to 2020. Amino acid sequence conservation of influenza B virus HA proteins was assessed using sequences collected between 1940 and 2020. Multiple sequence alignment was performed, and the degree of conservation at each residue was visualized as a sequence logo using MEGA10 and the WebLogo tool. In the sequence logo, the total height of each stack reflects the conservation level at that position, while the height of individual amino acid letters indicates their relative frequency. Amino acids are color-coded according to their chemical properties: polar residues (green), basic residues (blue), acidic residues (red), and hydrophobic residues (black). The red box highlights a highly conserved motif identified for further analysis in this study.
These analyses highlighted several highly conserved regions, from which we selected the most conserved 16-amino-acid stretch for each of the H1 and B types. This approach helps ensure that our design targets sequences that remain stable across numerous influenza lineages.
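The conservation scoring and region selection described above can be sketched in a few lines of Python. This is a minimal illustration, not the Excel and BioEdit workflow used in the project: it assumes the alignment is already available as equal-length strings, scores each column as one minus its normalised Shannon entropy (a common choice, assumed here), ignores gap characters, and picks the contiguous window with the highest mean score. The function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_conservation(aligned_seqs):
    """Score each alignment column as 1 - normalised Shannon entropy.

    1.0 means a single residue in every sequence; lower values mean
    more variation. Gap characters ('-') are ignored (an assumption).
    """
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to equal length")

    max_entropy = math.log2(20)  # 20 standard amino acids
    scores = []
    for column in zip(*aligned_seqs):
        residues = [res for res in column if res != "-"]
        if not residues:
            scores.append(0.0)  # all-gap column carries no signal
            continue
        total = len(residues)
        counts = Counter(residues)
        entropy = -sum(
            (n / total) * math.log2(n / total) for n in counts.values()
        )
        scores.append(1.0 - entropy / max_entropy)
    return scores


def best_window(scores, width=16):
    """Return (start index, mean score) of the most conserved window."""
    if len(scores) < width:
        raise ValueError("alignment is shorter than the window width")
    best_start, best_mean = 0, -1.0
    for start in range(len(scores) - width + 1):
        mean = sum(scores[start:start + width]) / width
        if mean > best_mean:
            best_start, best_mean = start, mean
    return best_start, best_mean


# Toy alignment (hypothetical data): only the third column varies.
toy_alignment = ["ACDA", "ACEA", "ACFA"]
toy_scores = column_conservation(toy_alignment)
toy_start, toy_mean = best_window(toy_scores, width=2)
```

With real data, the same two calls would run on the full HA2 alignment with `width=16`, and the returned start index (0-based) would mark the candidate region to inspect in the sequence logo.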
Antigenic potential of conserved peptide regions
Identifying conserved sequences was only the first step. Next, we needed to ensure that these regions had strong antigenic potential and could be recognized effectively by the immune system. We employed the Kolaskar & Tongaonkar antigenicity prediction method via the IEDB webserver (http://tools.immuneepitope.org/), which analyzes peptide sequences for properties such as accessibility, flexibility, and hydrophilicity—key factors for B cell epitope recognition.
Through this analysis, we identified two highly promising candidate epitopes:
- Influenza B: TISSQIELAVLLSNEC
- H1N1: DIWTYNAELLVLLENE
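The general shape of a window-based propensity prediction can be sketched as follows. This is only an illustration of the sliding-window idea, not the IEDB implementation: the per-residue scale is supplied by the caller, the window width of 7 and threshold of 1.0 are defaults assumed here, and the two-letter toy scale below is made-up data that must not be read as published propensity values.

```python
def window_scores(sequence, scale, window=7):
    """Average a per-residue scale over a centred sliding window.

    Returns a list of (0-based centre position, mean score) pairs.
    Positions too close to either end to fill a window are skipped.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    values = [scale[res] for res in sequence]
    scores = []
    for centre in range(half, len(values) - half):
        chunk = values[centre - half:centre + half + 1]
        scores.append((centre, sum(chunk) / window))
    return scores


def segments_above(scores, threshold=1.0):
    """Group consecutive positions whose score exceeds the threshold."""
    segments = []
    start = prev = None
    for pos, score in scores:
        if score > threshold:
            if start is None:
                start = pos
            prev = pos
        elif start is not None:
            segments.append((start, prev))
            start = None
    if start is not None:
        segments.append((start, prev))
    return segments


# Made-up scale and sequence, for demonstration only.
toy_scale = {"A": 0.5, "V": 1.5}
toy_profile = window_scores("AAAVVVAAA", toy_scale, window=3)
toy_segments = segments_above(toy_profile, threshold=1.0)
```

To approximate the analysis described above, `scale` would need to hold the published propensity value for each of the 20 amino acids, and the resulting segments would correspond to the stretches plotted above the threshold line.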
Fig4. Kolaskar & Tongaonkar Antigenicity Prediction Plot for a Conserved Influenza B HA Peptide. Antigenic propensity of a conserved peptide region from influenza B virus hemagglutinin (HA) was predicted using the Kolaskar & Tongaonkar method. The y-axis represents antigenic propensity, while the x-axis denotes sequence position. The red line indicates the antigenicity threshold (set at 1.0). Regions above the threshold (shaded yellow) are predicted to be antigenic and potential B cell epitopes, while regions below the threshold (shaded green) are considered non-antigenic. This analysis was used to identify peptide segments suitable for immunogen design.
Fig5. Structural Extraction and Modeling of Influenza HA2 Conserved Region. The schematic illustrates the stepwise extraction and modeling of the conserved region from the influenza hemagglutinin (HA) trimer. The left panel shows the full HA trimer structure, with the conserved HA2 domain highlighted by a black box. The middle panel represents the isolated HA2 subunit, and the right panel shows the further refined model of the conserved region, highlighting the secondary structure elements. This process was used to identify and model the highly conserved HA2 regions for downstream epitope and immunogen design.
Molecular Cloning of Antigen Expression Vector
With our candidate epitopes selected, we moved to the design and build phase of our engineering cycle. The gene sequences encoding these conserved peptides were codon-optimized for E. coli expression and synthesized by a commercial provider. We utilized the pET28a-sumo vector for high-level bacterial expression, ensuring robust transcription and translation.
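As a rough illustration of what codon optimization involves, the sketch below back-translates a peptide using one high-usage E. coli codon per residue. The codon table is an assumption written out for illustration and should be checked against a current E. coli codon usage table. It is not the algorithm used by the synthesis provider, which typically also balances GC content, avoids unwanted restriction sites, and limits mRNA secondary structure. Function names are hypothetical.

```python
# One frequently used E. coli codon per amino acid (illustrative
# assumption; verify against a codon usage table before real use).
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def back_translate(peptide, table=PREFERRED_CODON, add_stop=False):
    """Return a DNA sequence encoding the peptide, one codon per residue."""
    try:
        codons = [table[res] for res in peptide.upper()]
    except KeyError as err:
        raise ValueError(f"unsupported residue: {err.args[0]}") from None
    if add_stop:
        codons.append("TAA")  # a standard stop codon
    return "".join(codons)


def gc_content(dna):
    """Fraction of G and C bases in the sequence."""
    return (dna.count("G") + dna.count("C")) / len(dna)


# The H1N1 candidate epitope from this page, back-translated.
h1_dna = back_translate("DIWTYNAELLVLLENE")

# Round-trip check: decode with the inverse of the same table.
inverse = {codon: res for res, codon in PREFERRED_CODON.items()}
h1_decoded = "".join(
    inverse[h1_dna[i:i + 3]] for i in range(0, len(h1_dna), 3)
)
```

Picking only the single most frequent codon is the simplest possible strategy. It is useful for seeing the idea, but a real construct would be designed with a dedicated optimization tool and checked in the context of the full fusion protein.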
Fig6. Schematic Design of Recombinant Ferritin Nanoparticle Immunogen Displaying Conserved Influenza HA Epitopes.
Fig7. Schematic Diagram of Recombinant Ferritin Nanoparticle Immunogen Constructs and Expression Vector.
The cloning process involved transformation into competent E. coli BL21(DE3) cells. Positive colonies were screened by colony PCR and confirmed by Sanger sequencing to ensure sequence fidelity. We constructed several plasmid variants, including single-epitope and multi-epitope combinations, to enable flexibility in downstream immunogen design.
Fig8. Transformed colony in BL21(DE3)
Fig9. Plasmids B-(16)₄-F, H3-H1-B-F, and H1-(16)₄-F were verified by restriction digest
Expression of Antigen
Following successful plasmid construction, we expressed our recombinant immunogen proteins in E. coli BL21(DE3) cells. We systematically optimized induction conditions, testing different IPTG concentrations and induction times. After induction, protein expression was analyzed by SDS-PAGE.
Fig10. SDS-PAGE Analysis of Soluble Expression of Recombinant Ferritin-Based Immunogens. SDS-PAGE analysis of soluble protein expression for three recombinant ferritin nanoparticle immunogens (B-(16)₄-F, H1-(16)₄-F, and H3-H1-B-F) in E. coli under different IPTG induction concentrations. Lanes 1–3: B-(16)₄-F with 0, 0.2, and 0.5 mM IPTG; lanes 4–6: H1-(16)₄-F with 0, 0.2, and 0.5 mM IPTG; lanes 7–10: H3-H1-B-F with 0, 0.2, 0.5, and 1 mM IPTG. The molecular weight markers (Marker) are shown on the left. No significant soluble expression of the target proteins was detected in the supernatant under the tested induction conditions.
It became clear that most of the recombinant antigens accumulated as insoluble inclusion bodies rather than as soluble proteins in the cytosol, a common challenge for complex or synthetic proteins in prokaryotic hosts. To overcome this, we extracted the proteins from the inclusion bodies, then purified and refolded them in vitro to restore correct structure and function. The entire process was carried out on an AKTA10 automated protein purification system.
Fig11. Raw AKTA protein purification result
Fig13. Western blot detection of the in vitro refolding product using an anti-His-tag primary antibody
SDS-PAGE and Western blot analysis confirmed that the refolded proteins matched the expected molecular weights.
Improving soluble expression of recombinant proteins is essential for efficient downstream processing, as soluble proteins are more readily purified and maintain their native, functional conformation. Insoluble proteins often aggregate into inclusion bodies, requiring complex and time-consuming refolding procedures that can reduce overall yield and bioactivity. By increasing the proportion of protein expressed in the soluble fraction, we can greatly enhance the scalability and cost-effectiveness of our vaccine production, while also ensuring higher quality and reproducibility of the final immunogen product.
To address the challenge of limited soluble expression, our next steps will focus on optimizing both the production and purification of our recombinant proteins. We plan to explore advanced phase separation purification techniques, which can selectively isolate soluble proteins from cell lysates with higher efficiency. Additionally, we will consider co-expressing molecular chaperones or using solubility-enhancing fusion tags to assist in proper protein folding during expression. Alternative approaches such as lowering induction temperatures, modifying growth media, or testing different E. coli strains may also be evaluated. These strategies aim to increase the yield of functional, soluble antigen, streamlining both purification and downstream applications for our vaccine platform.
Characterization of Ferritin Particles
To enhance the immunogenicity and stability of our selected epitopes, we displayed them on a ferritin nanoparticle scaffold, leveraging ferritin’s ability to self-assemble into uniform nanostructures. This design mimics the repetitive, multivalent display of viral antigens that is known to boost immune responses.
We characterized the resulting nanoparticles using two main techniques: nano flow cytometry (NanoFCM) and transmission electron microscopy (TEM). NanoFCM measurements revealed a consistent particle size distribution, while TEM imaging confirmed that the nanoparticles were uniform and spherical, with diameters ranging from 12 to 16 nm—matching our design expectations.
Fig12. Nano Flow cytometry measurement of ferritin particles
Fig13. Transmission Electron Microscopy (TEM) Analysis of Recombinant Ferritin Nanoparticles. Representative TEM images showing the morphology of recombinant ferritin-based nanoparticles. The particles exhibit uniform, spherical structures with diameters ranging from 12 to 16 nm, consistent with the expected theoretical size for ferritin nanoparticles.
These results demonstrate the successful assembly and structural fidelity of our ferritin-based immunogen platform.
Digestion protection function of ferritin
A major challenge for protein-based vaccines, particularly those intended for oral or ambient delivery, is their susceptibility to degradation by proteases and harsh environmental conditions. To test the stability of our ferritin-based immunogens, we incubated the nanoparticles at room temperature overnight and analyzed them by SDS-PAGE. Although some partial degradation occurred, the main ferritin-immunogen bands remained intact, suggesting that the ferritin scaffold helps protect the immunogen against degradation under these conditions.
Fig14. Assessment of Degradation Stability of Ferritin-Based Protein by SDS-PAGE. SDS-PAGE analysis of ferritin-based protein samples after overnight incubation at room temperature. M: protein marker; lanes 1–3: ferritin-based protein samples collected from elution fractions. Target protein bands are still detected in the elution fractions after incubation, although partial degradation is observed. The main protein band remains present, indicating that the ferritin-based protein retains stability under these conditions.
Importantly, this stability could relax the storage requirements for our vaccine candidates. If the ferritin scaffold shields the epitopes from rapid degradation, the immunogens may be stored for longer periods at room temperature without significant loss of integrity. This property would not only simplify distribution and reduce reliance on cold-chain logistics but also make our platform more suitable for deployment in low-resource or remote settings where refrigeration may not be available. Ultimately, the digestion protection and improved storage stability of our ferritin-based vaccines would enhance their practicality and accessibility for global public health initiatives.