SCU-China 2025 | Protein Modeling

Introduction

We aim to engineer TasA fusion proteins that maximize Bacillus subtilis adhesion to hydrophilic polystyrene (PS) so cells can be immobilized on PS supports. Our design space combines (i) short PS-binding peptides (PSBPs) that deliver strong affinity to PS, (ii) mussel foot proteins that provide long-reach, catechol-based adhesion, and (iii) optional SpyTag/SpyCatcher for covalent display. We evaluate single-chain fusions (TasA--linker--tail) and split assemblies (TasA--SpyTag with SpyCatcher--tail) using a common scoring pipeline to quantify chemical exposure, geometric reach, and assembly reliability.

For the PS-binding option, we include the literature PS tag PSBP-R (PS19-6; RIIIRRIRR) as a benchmark and two new candidates designed to intensify aromatic and cationic contacts with PS: PS-1 (WWMRHMFAWRI) and PS-2 (FWWRTIVWRHIR). PS19-6 is known to bind hydrophilic PS plates with nanomolar affinity (K_d ≈ 86 nM) and to retain binding in the presence of non-ionic detergents (Kumada et al., 2010). Arg is essential for high affinity and aliphatic residues (Ile/Leu) are beneficial, properties that make PS tags reliable anchors when their key residues are well displayed (Kumada et al., 2010). Guided by those rules, PS-1 and PS-2 were composed as Trp-rich, amphipathic peptides of 11 and 12 residues, respectively, that combine aromatics (W/F), hydrophobics (I/V/M), and cationics (R/H) to leverage π--π, hydrophobic, and cation--π interactions with the PS surface while remaining short enough for facile display.
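The residue classes named above can be tallied directly from the three tag sequences. The short Python sketch below is only an illustration: the class assignments follow the text (W/F aromatic, I/V/M hydrophobic, R/H cationic), while the names `RESIDUE_CLASS`, `composition`, and `TAGS` are hypothetical helpers introduced here.

```python
from collections import Counter

# Residue classes as described in the text; anything else is counted as "other".
RESIDUE_CLASS = {}
for class_name, letters in (("aromatic", "WF"),
                            ("hydrophobic", "IVM"),
                            ("cationic", "RH")):
    for aa in letters:
        RESIDUE_CLASS[aa] = class_name


def composition(seq):
    """Count residues per class for a one-letter amino-acid sequence."""
    return Counter(RESIDUE_CLASS.get(aa, "other") for aa in seq)


TAGS = {
    "PSBP-R": "RIIIRRIRR",
    "PS-1": "WWMRHMFAWRI",
    "PS-2": "FWWRTIVWRHIR",
}

for name, seq in TAGS.items():
    print(name, len(seq), dict(composition(seq)))
```

Read this way, PSBP-R is purely cationic plus aliphatic (5 Arg, 4 Ile), whereas PS-1 and PS-2 each carry four aromatic residues alongside their cationic and hydrophobic positions.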

For the mussel-protein option, we use Mefp-1 (full length) and Mefp-5 (≈10 kDa). Mefp-1 behaves as a hydrated random coil in water (hydrodynamic radius R_h ≈ 10.57 nm by dynamic light scattering, DLS) with little persistent secondary structure; polymer-physics analysis and DLS show coil dimensions that are largely insensitive to ionic strength, supporting its role as a long, flexible tether that samples the interface effectively (Haemers et al., 2005). Mefp-5, by contrast, is a small, DOPA-rich interfacial protein (∼30 mol % DOPA) that can reach adhesion energies of ≈ −14 mJ m⁻² on mica in the surface forces apparatus, exceeding other mussel foot proteins (Mfps); critically, its adhesion is strongest when DOPA is kept reduced and at acidic pH, reflecting the chemistry of catechols at wet solid interfaces (Danner et al., 2012). Together, Mefp-1 (reach) and Mefp-5 (potent interfacial adhesion) let us test whether geometry or chemistry limits PS immobilization for TasA fusions.

To make split assemblies reliable, we employ SpyTag/SpyCatcher, which forms a spontaneous Lys--Asp isopeptide bond within minutes and yields a product stable to boiling in SDS. Single-molecule measurements show rupture forces on the order of 1 nN, underscoring the mechanical and chemical robustness of the linkage once formed (Zakeri et al., 2012). In this framework our central question is: which combination of tail chemistry (PS-1, PS-2, PSBP-R, Mefp-1, Mefp-5) and assembly mode (single vs. Spy) best displays the adhesive residues and reaches the PS surface in a geometry that survives handling?

We evaluated four architectures that differ in how PS-affinity chemistry and assembly are delivered: S1 TasA--(GGGGS)₃--tail (single chain), S2 TasA--(GGGGS)₃--Mefp (single chain), S3 TasA--SpyTag + SpyCatcher--tail (split, covalent ligation), and S4 TasA--SpyTag + SpyCatcher--Mefp (split, covalent ligation). Tail variants were PSBP-R (PS19-6; RIIIRRIRR), the two new PS tags PS-1 (WWMRHMFAWRI) and PS-2 (FWWRTIVWRHIR), labeled PSBP_1 and PSBP_2 in Table 1, Mefp-1 (full length, intrinsically disordered), and Mefp-5 (≈10 kDa, DOPA-rich).

Methods

Modeling & Disorder Check

Each fusion (or pair, for split designs) was modeled with AlphaFold 3 (Abramson et al., 2024); disorder in Mefp-1 and Mefp-5 was evaluated with AIUPred (Erdős et al., 2025; Erdős & Dosztányi, 2024).

Exposure of PS-relevant Chemistry (CAE)

We quantify how well the tail's functionally relevant residues are solvent-presented near the terminus. For PSBP tails we weight aromatics/hydrophobics/cationics; for Mefp we weight aromatics (including DOPA-like hetero-residues when present):

$$CAE = \frac{1}{n_{res}^{\gamma}}\sum_{i \in \mathcal{T}}\frac{1}{1 + n_{i}/20}$$

with γ = 0.9, n_res the number of tail residues (n_{tail res} in Table 1), 𝒯 = tail residues of the relevant class, and n_i = the number of neighbors within 6 Å of residue i. Higher CAE ⇒ better presentation of PS-active chemistry.
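The CAE expression can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used for Table 1: it treats each residue as a single representative point (for example a Cα or side-chain centroid), counts neighbors among all other points, and takes n_res as the number of tail residues scored. The function names `neighbor_counts` and `cae` and the toy coordinates are hypothetical, and the sketch does not attempt to reproduce the absolute CAE magnitudes reported in Table 1.

```python
import math


def neighbor_counts(coords, tail_idx, cutoff=6.0):
    """For each tail point, count other points within `cutoff` (in Å).

    coords   : list of (x, y, z) tuples, one representative point per residue
               (an assumption; the text does not specify atoms vs. residues)
    tail_idx : indices into `coords` for the tail residues being scored
    """
    counts = []
    for i in tail_idx:
        n = sum(
            1
            for j, q in enumerate(coords)
            if j != i and math.dist(coords[i], q) <= cutoff
        )
        counts.append(n)
    return counts


def cae(counts, n_res, gamma=0.9):
    """CAE = (1 / n_res**gamma) * sum_i 1 / (1 + n_i / 20)."""
    return sum(1.0 / (1.0 + n / 20.0) for n in counts) / (n_res ** gamma)


# Toy example: four points on a line; points 0 and 3 play the role of tail residues.
toy_coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
toy_counts = neighbor_counts(toy_coords, tail_idx=[0, 3])  # point 0 has one neighbor, point 3 none
toy_cae = cae(toy_counts, n_res=len(toy_counts))
print(toy_counts, toy_cae)
```

Each residue contributes at most 1 to the sum, and the contribution shrinks as its neighborhood becomes more crowded, which is what makes buried tail residues score lower than exposed ones.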

Tethered Reachability (TRS)

We approximate the probability that the linker ± flexible head can span a cell→PS gap d ∈ [8,12] nm using a worm-like-chain proxy:

$$L_{c} = 0.36\,L + \mathrm{HEAD}$$

$$R^{2} = 2\,L_{c}\,\mathrm{PERSIST} + \mathrm{RIGID}^{2}$$

$$\mathrm{TRS} = \min_{d \in [8,12]}\exp\!\left(-\frac{3d^{2}}{2R^{2}}\right)$$

with PERSIST = 0.8 nm, HEAD = 1.5 nm, RIGID = 3.5 nm, and L the flexible length in residues (0.36 nm of contour length per residue; S2/S4 include additional flexible residues to reflect Mefp tails). Larger TRS ⇒ better geometric access when the surface is not tightly apposed.
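As a numerical check, the reachability proxy can be evaluated directly. The Python sketch below is an illustration rather than the pipeline's own code: the function name `trs`, the grid used to scan the gap range, and the assumption that L is counted in residues (15 for the (GGGGS)₃ linker) are all introduced here.

```python
import math

PERSIST = 0.8      # persistence length, nm
HEAD = 1.5         # flexible head contribution, nm
RIGID = 3.5        # rigid-body offset, nm
NM_PER_RES = 0.36  # contour length per residue, nm


def trs(flex_len, d_min=8.0, d_max=12.0, steps=41):
    """Tethered reachability for `flex_len` flexible residues (assumed unit).

    Scans gaps d in [d_min, d_max] nm and returns the minimum Gaussian-chain
    factor exp(-3 d^2 / (2 R^2)).
    """
    l_c = NM_PER_RES * flex_len + HEAD        # contour length, nm
    r2 = 2.0 * l_c * PERSIST + RIGID ** 2     # mean-square reach, nm^2
    gaps = [d_min + k * (d_max - d_min) / (steps - 1) for k in range(steps)]
    return min(math.exp(-3.0 * d * d / (2.0 * r2)) for d in gaps)


# Linker-only case: 15 flexible residues, assuming the (GGGGS)3 linker is the only flexible part.
trs_linker = trs(15)
print(f"{trs_linker:.3e}")
```

Under these assumptions `trs(15)` comes out at about 9.4×10⁻⁵, in line with the TRS reported for the linker-only designs in Table 1. Because the exponential factor decreases with d, the minimum is always taken at the far end of the gap range (12 nm).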

Spy Ligation Efficiency (SLI_eff; split only)

We combine a Lys--Asp geometry term centered on 2.8--3.8 Å, inter-chain contacts (≤ 6 Å), and H-bond-like pairs (≤ 3.8 Å) and penalize interface occlusion (OCI) by nearby tail atoms:

$$\mathrm{SLI}_{raw} = 0.50\,\mathrm{geom} + 0.35\,\min\left(1, \frac{\mathrm{contacts}}{300}\right) + 0.15\,\min\left(1, \frac{\text{H-bonds}}{20}\right)$$

$$\mathrm{SLI}_{eff} = \mathrm{SLI}_{raw}\,(1 - \mathrm{OCI})$$

Higher SLI_eff means the covalent complex forms reliably without the tail self-blocking the interface.
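The two SLI expressions translate into a short Python sketch. It assumes the geometry term and OCI have already been reduced to values in [0, 1] (the text does not give their functional forms), and the function names and example inputs are hypothetical, chosen only to show how the weights and the occlusion penalty combine.

```python
def sli_raw(geom, contacts, hbonds):
    """Weighted sum of the geometry term, capped contact count, and capped H-bond count.

    geom     : Lys--Asp geometry term, assumed to lie in [0, 1]
    contacts : number of inter-chain contacts (<= 6 Å), saturating at 300
    hbonds   : number of H-bond-like pairs (<= 3.8 Å), saturating at 20
    """
    return (0.50 * geom
            + 0.35 * min(1.0, contacts / 300.0)
            + 0.15 * min(1.0, hbonds / 20.0))


def sli_eff(geom, contacts, hbonds, oci):
    """Apply the interface-occlusion penalty: SLI_eff = SLI_raw * (1 - OCI)."""
    return sli_raw(geom, contacts, hbonds) * (1.0 - oci)


# Hypothetical inputs: ideal geometry, half-saturated contacts and H-bonds, 10 % occlusion.
example_raw = sli_raw(geom=1.0, contacts=150, hbonds=10)
example_eff = sli_eff(geom=1.0, contacts=150, hbonds=10, oci=0.1)
print(example_raw, example_eff)
```

Because the penalty is multiplicative, a heavily occluded interface drives SLI_eff toward zero even when the geometry and contact terms are favorable.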

Composite Score

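CAE and TRS are min--max normalized by column across all ten designs. The tabulated SLI_eff,n values are consistent with scaling SLI_eff by the column maximum (0.500) rather than with min--max normalization.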

Single-chain: $score = 0.5CAE_{n} + 0.5TRS_{n}$

Split: $score = 0.55CAE_{n} + 0.20TRS_{n} + 0.25SLI_{eff,n} + 0.03$
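The normalization and weighting can be checked against Table 1 with a few lines of Python. This is a sketch under stated assumptions: the helper names (`min_max`, `score_single`, `score_split`) are hypothetical, the raw CAE values are copied from Table 1, and the split example reuses the tabulated normalized values rather than recomputing them.

```python
def min_max(values):
    """Min--max normalize a column to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def score_single(cae_n, trs_n):
    """Composite score for single-chain designs (S1, S2)."""
    return 0.5 * cae_n + 0.5 * trs_n


def score_split(cae_n, trs_n, sli_eff_n):
    """Composite score for split designs (S3, S4), including the 0.03 offset."""
    return 0.55 * cae_n + 0.20 * trs_n + 0.25 * sli_eff_n + 0.03


# Raw CAE column from Table 1, in table order (S2 MEFP1 first, S4 MEFP5 last).
cae_raw = [244.3, 105.7, 168.4, 139.9, 204.7, 123.4, 200.3, 79.10, 178.7, 57.54]
cae_n = min_max(cae_raw)

s2_mefp5_cae_n = cae_n[1]                      # S2 MEFP5
s3_psbp1_score = score_split(0.594, 0.0, 0.897)  # tabulated normalized values for S3 PSBP_1
s1_psbp2_score = score_single(0.788, 0.0)        # tabulated normalized values for S1 PSBP_2
print(round(s2_mefp5_cae_n, 3), round(s3_psbp1_score, 4), round(s1_psbp2_score, 4))
```

Under these assumptions the normalized CAE for S2 MEFP5 comes out near 0.258, the split score for S3 PSBP_1 near 0.581, and the single-chain score for S1 PSBP_2 near 0.394, matching Table 1 to rounding.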

Results

Disorder Context for the Tails

AIUPred profiles show Mefp-1 is highly disordered across the sequence (≳0.8 for most positions), whereas Mefp-5 is only moderately disordered overall (mostly 0.2--0.5 with local increases) (Figure 1). Treating Mefp-1 as a long coil (supporting high TRS) and Mefp-5 as a shorter, still-flexible but more structured segment is therefore consistent with the literature (Mefp-1 R_H ≈ 10.5 nm; Mefp-5 highly adhesive but pH/redox-dependent).

Figure 1. Intrinsic disorder of Mefp-1 and Mefp-5 predicted by AIUPred.

Notes: Per-residue disorder probability (red trace; 0–1) is plotted against amino-acid index; the grey horizontal line marks the 0.5 disorder threshold.

Structure Analysis

Figure 2. AlphaFold 3 structures of TasA fusion designs (single-chain S1/S2 and split S3/S4) with PS-binding peptides or mussel proteins.

Notes: A S1--PSBP-R. B S1--PS-1. C S1--PS-2. D S2--Mefp-1. E S2--Mefp-5. F S3--PSBP-R. G S3--PS-1. H S3--PS-2. I S4--Mefp-1. J S4--Mefp-5. For S1/S2 panels, the large electrostatic body is TasA (red = acidic; blue = basic), the tail (cyan ribbon) is the fused peptide/protein, and the thin tan segment is the (GGGGS)₃ linker. For S3/S4 panels, TasA--SpyTag is purple, SpyCatcher--tail is pink; the covalent Spy interface lies between the two domains.

Each fusion (or pair, for split designs) was modeled with AlphaFold 3 (Abramson et al., 2024); the resulting structures are shown in Figure 2.

The single-chain S1 constructs (Figure 2A--C) show TasA as a compact, highly charged core (electrostatic surface; red = acidic, blue = basic) with the engineered PS tags emerging from the (GGGGS)₃ linker as solvent-exposed tails (cyan). S1--PSBP-R (A), S1--PS-1 (B), and S1--PS-2 (C) all place the PS peptide distal to the TasA surface rather than tucked into grooves, which explains their consistently high CAE values (PSBP-R = 178.7; PS-1 = 200.3; PS-2 = 204.7) (Table 1). Because the peptides are short, the tethered reachability is uniformly low (TRS ≈ 9.4×10⁻⁵), so these designs are expected to perform best when cells contact PS at short range; this matches their mid-ranking composite scores (0.324--0.394), where strong chemistry compensates only partly for limited reach.

The S2 single-chain Mefp fusions (Figure 2D--E) shift the balance toward geometry. S2--Mefp-1 (D) displays a long, disordered coil radiating from TasA; S2--Mefp-5 (E) is shorter and shows a compact helical segment near the terminus. In both cases the flexible tails project away from the TasA surface, yielding the highest TRS in our set (0.875 for both). Their exposure differs: Mefp-1 has the highest CAE of any design (244.3), whereas Mefp-5 is among the lowest (105.7) (Table 1). These structures rationalize why S2--Mefp-1 ranks first overall (score ≈ 1.000) and S2--Mefp-5 second (0.629), the latter carried mainly by reach: the tails can reach recessed or rough PS, and Mefp-1 additionally presents adhesive aromatics frequently, a mechanism consistent with Mefp-1's large coil dimensions and Mefp-5's strong interfacial chemistry.

The S3 split designs with PS tags (Figure 2F--H) separate TasA--SpyTag (purple) from SpyCatcher--PS peptide (pink). The AF3 poses show similar domain orientations but different packing of the tag around the ligation interface. S3--PS-1 (G) positions the tag clear of the interface, consistent with the highest SLI_eff among split PS designs (0.449) and a low OCI (0.086) (Table 1). S3--PSBP-R (F) is moderately occluded (SLI_eff = 0.275; OCI = 0.134) (Table 1). S3--PS-2 (H) visibly folds back toward the interface, matching its higher occlusion (OCI = 0.365) and intermediate SLI_eff (0.318) (Table 1). These geometric differences explain why S3--PS-1 tops the split group: it pairs the highest CAE among the split PS designs (168.4 vs. 139.9 and 123.4) with the most reliable ligation among them.

The S4 split Mefp constructs (Figure 2I--J) illustrate the main failure mode for large IDR tails on the SpyCatcher side. S4--Mefp-1 (I) shows the coil sampling across both proteins, yielding only middling performance overall. S4--Mefp-5 (J) brings the tail into persistent proximity with the interface, consistent with its extreme occlusion (OCI = 0.935), near-zero SLI_eff (0.031), and the lowest composite score (Table 1). These structures make it clear that self-crowding at the Spy interface can defeat the otherwise robust Spy reaction unless spacers or partner assignments are redesigned.

Table 1: Strategy-wise raw and normalized metrics (CAE, TRS, SLI_eff, OCI) and the final score

Strategy	Tail Label	n_{tail res}	CAE	TRS	SLI_eff	OCI	CAE_n	TRS_n	SLI_eff,n	Score
S2	MEFP1	36	244.3	0.875	NA	NA	0.999	0.999	NA	0.9999
S2	MEFP5	28	105.7	0.875	NA	NA	0.258	0.999	NA	0.6289
S3	PSBP_1	7	168.4	9.37e-05	0.449	0.086	0.594	0	0.897	0.5809
S3	PSBP_R	9	139.9	9.37e-05	0.274	0.134	0.442	0	0.550	0.4102
S1	PSBP_2	10	204.7	9.37e-05	NA	NA	0.788	0	NA	0.3941
S3	PSBP_2	10	123.4	9.37e-05	0.318	0.365	0.353	0	0.635	0.3827
S1	PSBP_1	7	200.3	9.37e-05	NA	NA	0.765	0	NA	0.3823
S4	MEFP1	36	79.10	9.37e-05	0.500	0	0.115	0	0.999	0.3433
S1	PSBP_R	9	178.7	9.37e-05	NA	NA	0.649	0	NA	0.3244
S4	MEFP5	14	57.54	9.37e-05	0.031	0.935	0	0	0.062	0.0455

Conclusions

For hydrophilic PS plates or PS beads in standard buffers, start with S2--Mefp-1 (highest final score via reach + exposure) and S3--PS-1 (best split balance of exposure + ligation). S2--Mefp-5 is a solid second choice when acidic, reducing conditions can be maintained, since Mefp-5 adhesion is strongest with reduced DOPA at acidic pH; under those conditions it should adhere strongly to polar PS. Avoid S4--Mefp-5 unless it is redesigned to reduce OCI (e.g., a longer spacer between SpyCatcher and Mefp-5, or swapping which partner carries the tail).

References

Abramson, J., Adler, J., Dunger, J., Evans, R., Green, T., Pritzel, A., Ronneberger, O., Willmore, L., Ballard, A. J., Bambrick, J., Bodenstein, S. W., Evans, D. A., Hung, C.-C., O'Neill, M., Reiman, D., Tunyasuvunakool, K., Wu, Z., Žemgulytė, A., Arvaniti, E., … Jumper, J. M. (2024). Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature, 630(8016), 493–500.
Danner, E. W., Kan, Y., Hammer, M. U., Israelachvili, J. N., & Waite, J. H. (2012). Adhesion of Mussel Foot Protein Mefp-5 to Mica: An Underwater Superglue. Biochemistry, 51(33), 6511–6518.
Erdős, G., Deutsch, N., & Dosztányi, Z. (2025). AIUPred – Binding: Energy Embedding to Identify Disordered Binding Regions. Journal of Molecular Biology, 437(15), 169071.
Erdős, G., & Dosztányi, Z. (2024). AIUPred: Combining energy estimation with deep learning for the enhanced prediction of protein disorder. Nucleic Acids Research, 52(W1), W176–W181.
Haemers, S., Van Der Leeden, M. C., & Frens, G. (2005). Coil dimensions of the mussel adhesive protein Mefp-1. Biomaterials, 26(11), 1231–1236.
Kumada, Y., Kuroki, D., Yasui, H., Ohse, T., & Kishimoto, M. (2010). Characterization of polystyrene-binding peptides (PS-tags) for site-specific immobilization of proteins. Journal of Bioscience and Bioengineering, 109(6), 583–587.
Zakeri, B., Fierer, J. O., Celik, E., Chittock, E. C., Schwarz-Linek, U., Moy, V. T., & Howarth, M. (2012). Peptide tag forming a rapid covalent bond to a protein, through engineering a bacterial adhesin. Proceedings of the National Academy of Sciences, 109(12), E690–E697.
