Parts | TJI-Seoul - iGEM 2025

Background on MHC and HLA Diversity

The Major Histocompatibility Complex (MHC) is one of the most polymorphic systems in the human genome. In humans, MHC proteins are encoded by Human Leukocyte Antigen (HLA) genes.

What is HLA?

HLA proteins are found on the surface of most human cells and play a central role in regulating the immune system. They act as display platforms, presenting small fragments of proteins (antigens) to immune cells. In this way, HLA enables the immune system to distinguish between the body's own proteins and potentially harmful invaders, such as viruses, bacteria, or abnormal cancer cells. HLA is essentially the human version of the MHC, which is present in many animals (1).

HLA Diversity and Its Importance

One defining feature of HLA genes is their extraordinary diversity. HLA genes are highly polymorphic, with thousands of different variants existing among humans (4). This genetic variability is a crucial defense mechanism: if everyone carried the same HLA type, a single pathogen could adapt and devastate the population. Instead, diversity ensures that pathogens encounter different immune filters in different individuals, making it far more difficult for them to spread universally.

However, this same variability also creates significant challenges for medicine. Treatments that depend on precise antigen recognition, such as vaccines, organ transplants, or cancer immunotherapies, can behave very differently depending on a patient's HLA type. For example, in cancer immunotherapy, immune cells must recognize tumor-specific antigens presented by HLA molecules. Since not all patients display the same antigens in the same way, therapeutic responses can vary greatly between individuals.

MHC Classes

MHC molecules are grouped into two main classes:

MHC Class I (HLA-A, HLA-B, HLA-C): expressed on nearly all nucleated cells; present peptides to cytotoxic CD8+ T-cells (2).
MHC Class II (HLA-DP, HLA-DQ, HLA-DR): expressed mainly on antigen-presenting cells; present peptides to CD4+ helper T-cells (3).

Figure 1. Illustration of different signalling pathways for MHC classes I and II. (Image source: Biorender.com)

Our project focuses on MHC Class I, specifically the HLA-A*02:01 allele, one of the most common and well-studied alleles worldwide. This allele has been extensively characterized in structural and immunological studies (5–7) and is widely used in cancer immunotherapy research (8). Choosing HLA-A*02:01 makes our results broadly applicable and directly comparable with findings from other studies. Focusing on such a common allele also increases the translational potential of our work and its relevance to real-world medical applications.

Looking forward, researchers are even exploring the idea of a "universal HLA": an engineered form that could overcome the challenge of HLA diversity and provide broader therapeutic coverage. While this remains a long-term vision, our project contributes to this larger effort by testing whether engineered HLA molecules can enhance antigen presentation and improve cancer immunotherapy outcomes.

Why We Chose HLA-A*02:01

We selected HLA-A*02:01 (HLA-A2) as our model allele for several reasons:

Prevalence in Global Populations: HLA-A*02:01 is one of the most frequent alleles worldwide, with high frequencies across multiple ethnic groups. This makes it highly relevant for a wide patient base (5).
Well-Characterized in Immunology: It is one of the best-studied alleles in cancer research, with abundant structural and immunological data available, including peptide motifs and tumor-associated epitopes (6).
Structural Availability: Crystal structures of HLA-A*02:01 are widely available in the Protein Data Bank (PDB), enabling detailed computational docking studies (7).
Clinical Relevance: Many immunotherapy trials, especially peptide-based cancer vaccines, have been conducted in HLA-A*02:01-positive patients, making this allele a strong translational target (8).

Working with HLA-A*02:01 therefore makes our results relevant not only to basic science but also to ongoing cancer immunotherapy efforts.
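To make the link between allele prevalence and patient coverage concrete, the minimal sketch below converts an allele frequency into the expected fraction of people carrying at least one copy. It assumes Hardy-Weinberg proportions, and the frequencies shown are hypothetical placeholders rather than values taken from reference (5). The function name `carrier_fraction` is ours, chosen for illustration.

```python
def carrier_fraction(allele_frequency: float) -> float:
    """Fraction of individuals carrying at least one copy of an allele.

    Assumes Hardy-Weinberg proportions: a person lacks the allele only if
    both of their gene copies are some other allele.
    """
    if not 0.0 <= allele_frequency <= 1.0:
        raise ValueError("allele_frequency must be between 0 and 1")
    return 1.0 - (1.0 - allele_frequency) ** 2


# Hypothetical allele frequencies, for illustration only (not measured values).
for frequency in (0.05, 0.15, 0.25):
    print(f"allele frequency {frequency:.2f} -> carriers {carrier_fraction(frequency):.4f}")
```

Because each person carries two copies of the HLA-A gene, the carrier fraction is always at least as large as the allele frequency itself, which is part of why a common allele can reach a wide patient base.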

Role of β2-Microglobulin (B2M)

The functional MHC Class I complex is composed of two chains:

the heavy chain (α-chain) encoded by HLA genes, and
the light chain β2-microglobulin (B2M).

While the heavy chain forms the peptide-binding groove, B2M plays a stabilizing role:

It ensures proper folding of the heavy chain, preventing aggregation.
It stabilizes the peptide–MHC complex at the cell surface.
In in vitro systems like ours, co-refolding heavy chain with B2M and peptide is essential for producing functional tetramers (9,10).

Without B2M, the heavy chain cannot achieve its proper conformation, and peptide presentation fails.

The Four Plasmids Used in Our Project

To experimentally evaluate our approach, we constructed and used four plasmids, each encoding a key component of the system:

HLA-A*02:01 Heavy Chain (Wild-Type): BBa_25LBQEGY
- Serves as the baseline control for comparison.
- Codon-optimized for E. coli expression.
- Contains a C-terminal biotinylation tag for tetramer formation.
HLA-A*02:01 Heavy Chain (Mutant 1: Trp167→Ala167): BBa_25SNBFGX
- A point mutation replacing tryptophan at position 167 with alanine.
- Suggested by DiffDock modeling to potentially alter the peptide-binding groove, improving stability for tumor-specific peptides.
HLA-A*02:01 Heavy Chain (Mutant 2: Tyr→Ala at positions 7, 99, 159, 171): BBa_25SCBFYY
- Four tyrosine residues predicted to influence peptide accommodation were replaced with alanine.
- Suggested by DiffDock to create additional space or reduce steric hindrance in the peptide-binding groove.
- Designed to test whether multiple coordinated substitutions could synergistically enhance antigen presentation.
β2-Microglobulin (B2M): BBa_25GNHXKA
- Common component used with all three heavy chain variants.
- Required for refolding of heavy chain + peptide into a stable, functional MHC Class I tetramer.

By testing two different mutation strategies against the wild-type, and holding B2M constant, we designed a system that allows direct comparison of wild-type vs. engineered variants under otherwise identical conditions.
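The two mutation strategies can be written down compactly in standard substitution notation (W167A and Y7A/Y99A/Y159A/Y171A). The sketch below shows one way to apply such substitutions to a protein sequence while checking that the expected wild-type residue is actually present at each position. It is an illustration only: the helper names are ours, and the sequence is a toy placeholder built to have Y and W at the relevant positions, not the real HLA-A*02:01 sequence (see the Registry entries for that).

```python
import re

# Matches substitutions such as "W167A": wild-type residue, 1-based position, new residue.
MUTATION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_mutations(sequence: str, mutations: list[str]) -> str:
    """Return a copy of `sequence` with the given point substitutions applied."""
    residues = list(sequence)
    for mutation in mutations:
        match = MUTATION_PATTERN.match(mutation)
        if match is None:
            raise ValueError(f"cannot parse mutation {mutation!r}")
        wild_type, replacement = match.group(1), match.group(3)
        position = int(match.group(2))
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        # Guard against numbering mistakes: the stated wild-type residue must match.
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"{mutation}: expected {wild_type} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


def make_toy_sequence(length: int = 180) -> str:
    """Placeholder sequence (NOT the real heavy chain): glycine everywhere,
    with Y at positions 7, 99, 159, 171 and W at position 167."""
    residues = ["G"] * length
    for position in (7, 99, 159, 171):
        residues[position - 1] = "Y"
    residues[167 - 1] = "W"
    return "".join(residues)


toy_wild_type = make_toy_sequence()
mutant_1 = apply_mutations(toy_wild_type, ["W167A"])
mutant_2 = apply_mutations(toy_wild_type, ["Y7A", "Y99A", "Y159A", "Y171A"])

# Count how many positions differ from the wild-type placeholder.
print(sum(a != b for a, b in zip(toy_wild_type, mutant_1)))
print(sum(a != b for a, b in zip(toy_wild_type, mutant_2)))
```

The wild-type check is the useful part of this design: if a position is given in the wrong numbering scheme, the function fails loudly instead of silently mutating the wrong residue.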

Our Contribution to the iGEM Registry: The HLA Engineering Toolkit

To advance the field of synthetic immunology, our team designed and validated a complete set of interoperable parts collectively named the HLA Engineering Toolkit. This toolkit was developed to pursue the Best New Parts Collection special prize and demonstrates a unified, end-to-end pipeline for engineering, refolding, and characterizing human immune display systems in E. coli.

The collection includes four standardized, Registry-compatible plasmids.

HLA Engineering Toolkit (New Parts Collection): c64ee9f8-f6ec-4ff4-9da1-38ee16033a42

Together, these parts form a complete, modular MHC Class I system that allows direct comparison of engineered variants under standardized experimental conditions. This integrated structure — rather than a single part — is what defines our eligibility for the Best New Parts Collection award.

Goal of the Collection

Our aim was not simply to create new parts, but to build a rational engineering framework for immune display systems.

The HLA Engineering Toolkit was specifically designed to:

Demonstrate that human immune proteins can be functionally expressed in bacteria through optimized inclusion body refolding;
Provide a comparative baseline for computationally guided protein engineering (using tools like DiffDock and AlphaFold-Multimer);
Establish a standardized assay system for future iGEM teams working on cancer, vaccine, or immunotherapy-related projects;
Serve as a transferable template for expanding to other HLA alleles or MHC Class II systems.

By meeting these goals, the toolkit moves synthetic biology closer to rational immune system engineering — bridging molecular design, wet-lab assembly, and translational application.

Design

Each construct was designed to balance experimental practicality and biological relevance:

The wild-type HLA-A*02:01 heavy chain (residues 25–308) serves as the structural backbone.
All constructs share the same pET21a expression vector, AviTag–6xHis purification system, and E. coli codon optimization, ensuring consistent refolding performance.
The mutants were guided by computational docking predictions to test two strategies:

W167A: minimal, single-site modification to subtly adjust groove flexibility.
Y7A/Y99A/Y159A/Y171A: multi-site reprogramming to evaluate cooperative effects.

The non-tagged β2M ensures physiological accuracy of the refolded complexes, avoiding artificial stabilization from affinity tags.

By combining these design choices, we produced an experimentally robust and biologically relevant toolkit that models how mutational design influences antigen presentation.
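One practical detail when working with these constructs is residue numbering. The sketch below assumes that "residues 25–308" refers to the precursor protein (including a 24-residue signal peptide), while the mutation names (W167A, Y7A, and so on) use mature-protein numbering that starts after the signal peptide. Under that assumption, it converts a mature position into a precursor position and into a 0-based index within the construct. The function names are ours, and the index ignores any start codon or tag residues added to the construct.

```python
# Assumption: the construct start at precursor residue 25 implies a
# 24-residue signal peptide that is absent from the mature protein.
SIGNAL_PEPTIDE_LENGTH = 24


def mature_to_precursor(position: int) -> int:
    """Convert a mature-protein position (1-based) to precursor numbering."""
    if position < 1:
        raise ValueError("mature positions start at 1")
    return position + SIGNAL_PEPTIDE_LENGTH


def mature_to_construct_index(
    position: int, construct_start: int = 25, construct_end: int = 308
) -> int:
    """Return the 0-based index of a mature position within the construct.

    `construct_start` and `construct_end` are precursor positions, taken from
    the stated range 25-308. Extra residues (start Met, tags) are not counted.
    """
    precursor_position = mature_to_precursor(position)
    if not construct_start <= precursor_position <= construct_end:
        raise ValueError(f"mature position {position} is outside the construct")
    return precursor_position - construct_start


for mature_position in (7, 99, 159, 167, 171):
    print(
        mature_position,
        mature_to_precursor(mature_position),
        mature_to_construct_index(mature_position),
    )
```

If the numbering assumption does not hold for a given sequence file, a wild-type residue check at each mutated position will reveal the mismatch quickly.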

Experimental Workflow

All four constructs were expressed and validated using a unified, reproducible protocol. The full process includes:

Transformation and Plasmid Confirmation in E. coli DH5α.
Protein Expression in E. coli BL21(DE3) using IPTG induction.
Inclusion Body Isolation and Refolding with β2M and tumor antigen peptides under optimized redox conditions.
Validation through SDS-PAGE, BCA quantification, and fluorescence-based binding assays.

Each construct was verified to produce correctly folded, biochemically active complexes under these conditions — confirming that the parts function as a single, coherent system.
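As an illustration of the BCA quantification step, the sketch below fits a straight line to a set of protein standards and uses it to convert a sample's absorbance into a concentration. All numbers are hypothetical placeholders, not our measurements, and the function names are ours. Real BCA curves are only approximately linear, so a linear fit is a simplification that works best within the range of the standards.

```python
def fit_line(x: list[float], y: list[float]) -> tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("standards must span more than one concentration")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_absorbance(
    absorbance: float, slope: float, intercept: float, dilution_factor: float = 1.0
) -> float:
    """Invert the standard curve and correct for sample dilution."""
    return (absorbance - intercept) / slope * dilution_factor


# Hypothetical BSA standards (ug/mL) and their absorbance readings at 562 nm.
standards_ug_per_ml = [0.0, 125.0, 250.0, 500.0, 1000.0]
standard_absorbance = [0.05, 0.175, 0.30, 0.55, 1.05]

slope, intercept = fit_line(standards_ug_per_ml, standard_absorbance)

# Hypothetical sample: absorbance 0.45 measured after a 5-fold dilution.
sample_ug_per_ml = concentration_from_absorbance(0.45, slope, intercept, dilution_factor=5.0)
print(f"slope={slope:.6f}, intercept={intercept:.4f}, sample={sample_ug_per_ml:.1f} ug/mL")
```

Running the same standard curve for wild-type and mutant preparations keeps the yield comparison between constructs on a common scale.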

Characterization

We characterized our toolkit collectively and comparatively:

Wild Type (BBa_25LBQEGY): Established baseline folding yield, binding affinity, and refolding efficiency.
Mutant #1 (BBa_25SNBFGX): Demonstrated the viability of hypothesis-driven single-point engineering.
Mutant #2 (BBa_25SCBFYY): Revealed structural trade-offs in multi-site mutagenesis.
β2M (BBa_25GNHXKA): Confirmed essential stabilizing role; absence of β2M led to aggregation.

Applications

The HLA Engineering Toolkit is a versatile platform with broad applicability:

Cancer Immunotherapy Research: Enables rational testing of tumor neoantigen display.
Vaccine Development: Applicable to viral epitope presentation studies.
AI-Guided Protein Design: Provides a wet-lab benchmark for computational predictions.
Synthetic Biology Education: Serves as a modular teaching system for protein refolding and immune complex assembly.
Industrial Applications: Demonstrates transferable folding workflows for high-value human proteins in E. coli.

This toolkit transforms HLA research from a specialized immunology task into a standardized synthetic biology platform — accelerating innovation in immune engineering.

Discussion

Through iterative design, expression, and refolding, our project validated a complete experimental ecosystem for HLA engineering. Key insights include:

Inclusion body refolding can produce large, functional human proteins in E. coli.
Standardization across all constructs allows controlled comparison of engineered variants.
Computational predictions must be empirically validated, as multi-site mutations can introduce unpredictable effects.
The presence of β2M is essential for stability and native-like conformation.

Beyond technical success, our project demonstrates that HLA molecules can be rationally redesigned — paving the way for next-generation synthetic immunology tools.

Conclusion

The HLA Engineering Toolkit is more than a collection of plasmids: it represents a complete, practical system for studying and engineering human immune proteins in a synthetic biology setting. Through this project, we showed that complex human molecules such as HLA Class I can be expressed, refolded, and functionally tested in E. coli using standardized methods. This achievement is important because it shows that even difficult, multi-domain human proteins can be handled through a systematic Design–Build–Test–Learn approach. By integrating computational predictions, controlled mutagenesis, and careful experimental validation, our toolkit bridges the gap between theoretical protein design and real, testable biological function.

The significance of this project lies in how it transforms HLA biology into something that can be engineered and studied with the same logic used in synthetic biology. Each component — wild type, mutants, and β2M — plays a specific role, but their real strength appears when used together as an integrated system. This structure allows direct comparison of engineered variants, helping researchers understand how specific mutations influence folding, stability, and peptide binding. Because the entire workflow is open and repeatable, future iGEM teams can reuse these parts, adapt the design for other HLA alleles, or apply the same principles to different protein families.

More broadly, the HLA Engineering Toolkit demonstrates how synthetic biology can contribute to medicine and biotechnology. It provides a foundation for developing improved cancer immunotherapies, vaccine candidates, and other applications where antigen presentation is important. At the same time, it serves as an educational model — a hands-on example of how computational design, molecular biology, and experimental testing come together in one continuous cycle. In this sense, the toolkit embodies the spirit of iGEM: using engineering principles to make complex biology understandable, reusable, and ultimately, useful for solving real-world challenges.

Sequences

In addition to the iGEM Parts Registry entries, the full-length sequences are provided in the PDF files below. All sequences were analyzed using SnapGene software.

PDF Attachment

References (Parts Section)

1. Neefjes, J., et al. (2011). Towards a systems understanding of MHC class I and class II antigen presentation. Nat Rev Immunol, 11(12), 823–836.
2. Rock, K. L., Reits, E., & Neefjes, J. (2016). Present yourself! By MHC class I and class II molecules. Trends Immunol, 37(11), 724–737.
3. Germain, R. N. (1994). MHC-dependent antigen processing and peptide presentation: providing ligands for T lymphocyte activation. Cell, 76(2), 287–299.
4. Robinson, J., et al. (2020). The IPD and IMGT/HLA database: allele variant databases. Nucleic Acids Res, 48(D1), D948–D955.
5. Cao, K., et al. (2001). HLA-A, -B, -C allele frequencies in world populations. Tissue Antigens, 57(5), 358–362.
6. Rammensee, H.-G., et al. (1999). SYFPEITHI: database for MHC ligands and peptide motifs. Immunogenetics, 50, 213–219.
7. Berman, H. M., et al. (2000). The Protein Data Bank. Nucleic Acids Res, 28(1), 235–242.
8. Ayyoub, M., et al. (2003). Tumor-reactive CD8+ T cells in HLA-A*0201+ melanoma patients. Cancer Res, 63(11), 3053–3057.
9. Zinkernagel, R. M., & Doherty, P. C. (1979). MHC-restricted cytotoxic T cells: studies on the role of β2-microglobulin. J Exp Med, 149(6), 1476–1489.
10. Garboczi, D. N., et al. (1992). Assembly, specific binding, and crystallization of a human TCR–MHC complex. Proc Natl Acad Sci USA, 89(8), 3429–3433.