Model

Overview

The protein folding problem has long represented one of the greatest challenges in structural biology. The advent of Artificial Intelligence-based platforms, such as AlphaFold, has revolutionized structure prediction and opened new frontiers in de novo protein design. Within this field, generative models such as RFdiffusion and ProteinMPNN – developed at the Institute for Protein Design (IPD), directed by David Baker – have become benchmarks for Reverse Protein Folding. In this work, we design a synthetic peptide for the selective binding of the intrinsically disordered C-terminal region of progerin, a pathogenic variant of lamin A (LMNA) responsible for Hutchinson-Gilford Progeria Syndrome (HGPS). Our computational approach integrates the AI-based models AlphaFold3, RFdiffusion, ProteinMPNN, as well as the novel NeuroBind – developed by NeuroSnap – in combination with the molecular docking platforms HADDOCK and ClusPro.

**Figure 1.** Schematic representation of the developed computational pipeline

Bioinformatics is a central discipline in modern biology that applies computational, mathematical, and statistical methods to analyze and visualize biological data, from genomic sequences to protein structures, supporting biological discovery and guiding experimental design.

The development of deep-learning structure predictors, such as AlphaFold and RoseTTAFold, marked a turning point in structural bioinformatics. These tools revolutionized the field by achieving near-experimental accuracy in protein structure prediction, enabling reliable modeling even for proteins that are difficult to study experimentally. AlphaFold3 [3] further extends this paradigm by modeling structures of complexes—including proteins with nucleic acids, small molecules, ions and modified residues—broadening what can be explored in silico.

Some targets remain challenging for crystallography because of intrinsically disordered regions (IDRs) and membrane association. Progerin, a lamin A splicing variant implicated in Hutchinson–Gilford Progeria Syndrome, has a largely disordered tail region, which makes a high-resolution crystallographic structure generally unattainable. While predictions for disordered segments must be interpreted with caution, tools like AlphaFold3 can still support hypothesis generation by suggesting plausible conformations for IDRs and adjacent structured regions to guide experiments.

Beyond prediction, AI-driven de novo protein design now enables the creation of binders and scaffolds tailored to specific targets, even difficult ones such as progerin. Methods like RFdiffusion (generative backbone design and binder scaffolding) and ProteinMPNN (sequence design on fixed backbones) provide sequences expected to fold into user-specified structural motifs, accelerating the path from concept to testable designs.

For interaction assessment, modern docking platforms such as ClusPro and HADDOCK2.4 have made large-scale, physics-based and data-driven docking practical, integrating smoothly with predicted or designed structures for in silico screening. Downstream, tools like PRODIGY estimate binding affinity from complex structures and report a Kd to help prioritize candidates before wet-lab validation. Together, these four steps (prediction, design, docking and affinity estimation) form the foundation of our bioinformatics pipeline.

Within our project, we designed a library of synthetic peptides for the selective binding of the intrinsically disordered C-terminal region of progerin, the pathogenic variant of lamin A (LMNA) responsible for Hutchinson-Gilford Progeria Syndrome (HGPS). Our computational approach integrates the AI-based models AlphaFold3, RFdiffusion, and ProteinMPNN with the molecular docking platforms HADDOCK and ClusPro and with affinity estimation by PRODIGY. The pipeline begins with structural modeling and refinement of the target, proceeds through binder backbone and sequence design, and concludes with docking simulations and estimation of binding affinity.

**Figure 2.** Schematic of lamin A, lamin C, and progerin splicing and maturation. A: Schematic representation of the alternative splicing events of lamin A, lamin C, and the aberrant splice site responsible for progerin. B: Protein maturation of lamin A and progerin. Lamin A retains 50 amino acids that are deleted in progerin. These amino acids play a key role in the post-translational processing of lamin A, and their absence in progerin leads to multiple consequences.

Overview

Hutchinson–Gilford Progeria Syndrome (HGPS) is caused by a mutation in the LMNA gene that produces progerin, a truncated and permanently farnesylated isoform of lamin A. The C-terminal region of progerin is intrinsically disordered and highly flexible, representing the sole structural difference from wild-type lamin A. This makes therapeutic targeting particularly challenging: instead of recognizing a pre-formed structure, designed binders must impose conformational order on an otherwise unstructured sequence.

Hutchinson–Gilford Progeria Syndrome (HGPS) is caused by a point mutation (c.1824C>T; p.G608G) in the LMNA gene on chromosome 1, which encodes lamin A, a major component of the nuclear lamina. This silent mutation activates a cryptic splice donor site in exon 11 at codon 608, resulting in the production of a truncated protein, progerin, that lacks 50 amino acids near the C-terminus. Under normal conditions, pre-lamin A contains a CaaX motif at its carboxyl terminus, which is recognized by protein farnesyltransferase (FTase): the cysteine is farnesylated, the –aaX tripeptide is then cleaved, and the newly exposed farnesylated cysteine is carboxymethylated. Finally, the zinc metalloprotease ZMPSTE24 removes the last 15 amino acids from the C-terminus, including the farnesylated and carboxymethylated cysteine residue. In progerin, the deletion eliminates the ZMPSTE24 endoproteolytic cleavage site, leading to the accumulation of a permanently farnesylated protein anchored to the inner nuclear membrane. [1]

Lamin A, like all other lamins, is a type V intermediate filament protein usually associated with lamin-associated proteins (LAPs). All lamin isoforms share a conserved architecture: an N-terminal head domain; a central α-helical rod domain composed of four coiled-coil segments separated by flexible linkers that enable dimerization; a nuclear localization signal (NLS); and an Ig-like domain followed by an intrinsically disordered C-terminal tail. [2]

Intrinsically disordered regions (IDRs), such as progerin’s C-terminal tail, lack a stable native structure under physiological conditions because they are enriched in polar and charged amino acids and depleted in hydrophobic residues, which reduces the hydrophobic driving force required for stable folding. Instead, IDRs can be described as ensembles of functionally relevant, transient conformations at the secondary or tertiary structural level. [3]
As in all IDRs, the structure of progerin’s C-terminal region is poorly defined because of its flexibility. This characteristic is particularly problematic, as it represents the sole divergence from properly spliced lamin A, making progerin difficult to target therapeutically without also affecting lamin A. This feature forced us to rethink the binder design process: whereas conventional design aims at developing a binding pocket that recognizes a given structure through specific amino acids, in our case the interacting peptide must instead provide the properties needed to constrain the target sequence into a defined fold.

Progerin Structure Prediction

Overview

The progerin C-terminal tail was modeled using AlphaFold3, generating a full-length model and 10 fragment-based models to capture its intrinsically disordered nature. Confidence scores (pTM and pLDDT) were modest but improved for shorter fragments, reflecting reduced heterogeneity and the region’s inherent flexibility.

In the absence of experimentally determined structures of the progerin C-terminal tail, its conformation was predicted using two complementary approaches: de novo modeling with AlphaFold3 [4] and template-based homology modeling with SWISS-MODEL [5]. To overcome the challenges of modeling intrinsically disordered regions (IDRs), in addition to the full-length protein, we generated a set of overlapping constructs of the C-terminal tail, with lengths of 62, 57, 52, 47, 42, 41, 40, 39, 38, and 37 amino acids. This strategy yielded 22 putative structural models, thereby increasing the likelihood of capturing structurally plausible conformations of the progerin IDR.
However, SWISS-MODEL failed to produce reliable models for the C-terminal region, likely due to the lack of structural templates for that disordered region. Consequently, only the 11 AlphaFold3 models were retained for further analysis (see cycle 2 on our wiki).
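The construct set described above can be generated with a few lines of Python. This is a minimal sketch, assuming that each construct is a C-terminal suffix of the mature progerin sequence (the text only states that the constructs overlap); the function name and the placeholder sequence are hypothetical and are not taken from our pipeline.

```python
# Construct lengths listed in the text (amino acids).
FRAGMENT_LENGTHS = [62, 57, 52, 47, 42, 41, 40, 39, 38, 37]


def c_terminal_fragments(sequence, lengths=FRAGMENT_LENGTHS):
    """Return {length: last `length` residues of `sequence`}.

    Assumption: every construct ends at the C-terminus of the protein.
    """
    fragments = {}
    for n in lengths:
        if n > len(sequence):
            raise ValueError(
                f"fragment length {n} exceeds sequence length {len(sequence)}"
            )
        fragments[n] = sequence[-n:]
    return fragments


# Placeholder sequence (NOT progerin): 611 residues ending in a cysteine.
placeholder = "A" * 610 + "C"
fragments = c_terminal_fragments(placeholder)
print({n: len(seq) for n, seq in fragments.items()})
```

Each fragment can then be submitted to the structure predictor as an independent job, alongside the full-length sequence.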

Each AlphaFold3 model is associated with two confidence scores, the predicted Local Distance Difference Test (pLDDT) and the predicted TM-score (pTM), which indicate the reliability of local and global features of the prediction, respectively. In our study, the pTM values ranged from 0.27 for the full-length progerin sequence to 0.59 for the shortest construct (Table 1). A general increase was observed with decreasing sequence length, consistent with the notion that restricting the input to the intrinsically disordered C-terminal tail reduces structural heterogeneity, allowing AlphaFold3 to generate more coherent local conformations.

This observation aligns with the concept of conditionally folded states, where short peptides tend to adopt more stable structures, especially when they undergo folding upon binding to other molecules or post-translational modifications. Research has shown that AlphaFold frequently predicts conformations that resemble such folded states, although the method still provides only one structural snapshot and cannot represent the full conformational ensembles that characterize IDRs. This is why focusing on multiple predictions of smaller fragments allowed us to have a more accurate and more complete rendering of the potential folding of progerin’s IDR. [6] [7]

Yet, the absolute pTM values remained modest, reflecting the intrinsic flexibility of the C-terminal region and confirming the expected limitations of structural prediction in intrinsically disordered domains.

**Table 1.** pTM scores for each prediction. We divided the C-terminal tail of progerin into 10 fragments and predicted each fragment individually to increase the confidence of the structural predictions.
Length (aa)	pTM (AF3)
62	0.37
57	0.45
52	0.54
47	0.48
42	0.53
41	0.47
40	0.51
39	0.53
38	0.56
37	0.59
611	0.27
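The trend in Table 1 can be quantified with a simple correlation. The sketch below computes the Pearson coefficient between fragment length and pTM for the ten fragments only (the full-length model is left out because its length is an order of magnitude larger and would dominate the fit). The helper name `pearson` is ours, and this check was not part of the original analysis.

```python
import math

# Fragment lengths (aa) and AlphaFold3 pTM values copied from Table 1.
lengths = [62, 57, 52, 47, 42, 41, 40, 39, 38, 37]
ptm = [0.37, 0.45, 0.54, 0.48, 0.53, 0.47, 0.51, 0.53, 0.56, 0.59]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


r = pearson(lengths, ptm)
print(f"Pearson r (length vs pTM, fragments only): {r:.2f}")
```

A negative coefficient corresponds to the behaviour described in the text: shorter constructs tend to receive higher pTM values, although the relationship is not strictly monotonic.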

Structural Analysis and Refinement

Overview

The predicted structures of the progerin C-terminal fragments were refined through a multi-step process using AMBER and Rosetta Relax to reduce steric clashes and optimize conformations. PDBFixer ensured output integrity, and MolProbity scores assessed stereochemical quality. Final models were selected based on a balance of low energy and high structural quality.
All the parameters used in our computational pipeline are presented in the supplemental materials.

The raw models predicted by AlphaFold3 required structure relaxation to correct stereochemical imperfections and improve energetic stability. This refinement cycle was performed using two complementary approaches: the molecular mechanics of AMBER and the knowledge-based refinement protocol of Rosetta Relax. To exploit the complementarity between AMBER and Rosetta Relax, an initial AMBER minimization was applied to reduce the most evident steric clashes. The models were then refined with Rosetta Relax, which explores alternative conformations of backbone and side-chains to improve packing and stereochemistry. A second AMBER minimization was finally performed to regularize the Rosetta-refined structures within a physically consistent energy landscape. The refinement process was coupled at every stage with PDBFixer to ensure the integrity of the input files and with structural evaluation through MolProbity to select conformations with both improved energetic stability and stereochemical quality.

**Figure 3.** Scheme depicting the relaxation process applied to the predicted 3D structures of the progerin C-terminal fragments.

Rosetta Relax is a refinement protocol within the Rosetta suite that improves stereochemical quality and energetic stability by optimizing backbone and sidechain conformations [8]. The algorithm uses a Monte Carlo minimization strategy, in which small random perturbations of torsional angles are accepted or rejected according to the Rosetta energy function. This function combines physical terms (van der Waals, electrostatics, and hydrogen bonding) with statistical potentials derived from high-resolution protein structures. Through iterative cycles of side-chain repacking and backbone minimization, the software gradually reduces steric clashes and generates energetically favorable conformations. The final models represent a local minimum of the Rosetta energy landscape, consistent with stereochemical restraints. Rosetta Relax was accessed through ROSIE, the Rosetta Online Server that Includes Everyone. The parameters used were: “tether backbone coordinates of the pdbs being relaxed to the coordinates in the crystal native” and “use cartesian minimizer”. These parameters were selected to maintain the overall backbone geometry close to that of the original AlphaFold3 models, while allowing for local optimization and enabling finer adjustments of bond geometries through Cartesian minimization.
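The accept/reject step of a Monte Carlo search can be illustrated on a toy problem. The sketch below applies the Metropolis criterion to a hypothetical one-dimensional torsional energy; it is not Rosetta's energy function, and it omits the side-chain repacking and gradient-based minimization that Rosetta Relax performs in each cycle. All function names and parameter values are illustrative assumptions.

```python
import math
import random


def metropolis_accept(delta_e, kt, u):
    """Metropolis criterion: always accept downhill moves, accept uphill
    moves with probability exp(-delta_e / kt). `u` is a uniform draw in [0, 1)."""
    return delta_e <= 0 or u < math.exp(-delta_e / kt)


def toy_energy(phi):
    """Hypothetical 1-D torsional energy with several minima (arbitrary units)."""
    return 1.0 + math.cos(3 * phi) + 0.5 * math.cos(phi)


def monte_carlo_search(phi0, steps=2000, step_size=0.2, kt=0.3, seed=0):
    """Random-perturbation search; returns the lowest-energy state visited."""
    rng = random.Random(seed)
    phi, energy = phi0, toy_energy(phi0)
    best_phi, best_energy = phi, energy
    for _ in range(steps):
        trial = phi + rng.uniform(-step_size, step_size)  # small random perturbation
        trial_energy = toy_energy(trial)
        if metropolis_accept(trial_energy - energy, kt, rng.random()):
            phi, energy = trial, trial_energy
            if energy < best_energy:
                best_phi, best_energy = phi, energy
    return best_phi, best_energy


best_phi, best_energy = monte_carlo_search(0.0)
print(f"best torsion = {best_phi:.3f} rad, energy = {best_energy:.3f}")
```

Occasionally accepting uphill moves is what lets the search leave a shallow minimum instead of stopping at the first one it reaches.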

AMBER (Assisted Model Building with Energy Refinement) is a molecular mechanics package developed at the University of California, San Francisco [9]. It performs energy minimization and molecular dynamics using all-atom force fields. These force fields describe both bonded interactions (bond lengths, angles, and torsional rotations) and non-bonded interactions (electrostatics and van der Waals), with parameters calibrated from experimental and quantum mechanical data. Energy minimization is achieved through gradient-based algorithms, which iteratively adjust atomic coordinates to reduce steric clashes and strain in bond geometries. This produces structures that are physically consistent and correspond to a local energy minimum. AMBER was accessed through the NeuroSnap online suite.
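Gradient-based minimization can be shown on the simplest bonded term, a single harmonic bond. The sketch below runs steepest descent on E(r) = k (r - r0)^2; the force constant, equilibrium length, and step size are illustrative values chosen for the example, not parameters from an AMBER force field file, and a real minimization acts on all atomic coordinates at once.

```python
def bond_energy(r, k=300.0, r0=1.53):
    """Harmonic bond energy E = k * (r - r0)^2 (illustrative parameters)."""
    return k * (r - r0) ** 2


def bond_gradient(r, k=300.0, r0=1.53):
    """Derivative dE/dr of the harmonic bond energy."""
    return 2.0 * k * (r - r0)


def steepest_descent(r, step=1e-3, tol=1e-8, max_iter=10000):
    """Move against the gradient until its magnitude falls below `tol`."""
    for _ in range(max_iter):
        gradient = bond_gradient(r)
        if abs(gradient) < tol:
            break
        r -= step * gradient
    return r


stretched = 1.80  # strained starting bond length
relaxed = steepest_descent(stretched)
print(f"energy before: {bond_energy(stretched):.3f}, after: {bond_energy(relaxed):.3e}")
```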

**Table 2.** Energies of each predicted structure of the progerin C-terminal fragments at each stage of the relaxation cycle.
Length (aa)	Pre-Relax	Post-AMBER	Post-RosettaRelax	Post-AMBER 2.0
62	10405.41	-422.048	1475.999	-789.613
57	6217.509	-3423.500	-1983.066	-3428.483
52	4127.430	-3840.914	-2694.035	-4316.357
47	4958.840	-3977.398	-2747.251	-4393.889
42	4457.610	-3820.677	-2858.242	-4218.129
41	3233.990	-4243.939	-3391.465	-4491.515
40	3175.860	-4420.744	-3262.657	-4497.118
39	59603.60	-4024.410	-3154.428	-4183.405
38	2813.380	-4343.088	-3656.575	-4464.136
37	5082.929	-4037.173	-3334.752	-4168.164
611	31791.042	-82992.712	-62649.329	-84876.584

MolProbity is a structural validation software developed at Duke University, North Carolina, widely used for the assessment of macromolecular structures [10]. It provides an overall measure of stereochemical quality by evaluating parameters such as bond lengths and angles, steric clashes, Ramachandran plot outliers, and side-chain rotamer conformations. These criteria are combined into a single value, the MolProbity score, which allows direct comparison between different structures. Lower scores correspond to better stereochemical quality, and the metric is therefore commonly used to monitor improvements during refinement steps.

At the end of this multi-step cycle, the models were evaluated both in terms of total energy (from AMBER; Table 2) and stereochemical quality (from MolProbity; Table 3). Final candidates were selected according to the best compromise between low energy values and improved MolProbity scores, ensuring that only conformations with favorable geometry and stability were retained for downstream design.

**Table 3.** MolProbity scores of each predicted structure of the progerin C-terminal fragments at each stage of the relaxation cycle (lower is better).
Length (aa)	Pre-Relax	Post-AMBER	Post-RosettaRelax	Post-AMBER 2.0
62	2.59	2.13	1.49	2.09
57	2.86	1.24	1.70	1.78
52	2.08	2.73	1.28	1.48
47	2.59	2.76	1.17	2.62
42	3.50	2.84	2.15	2.49
41	3.55	1.73	1.65	2.00
40	2.09	1.46	0.98	3.28
39	2.64	1.86	1.59	2.59
38	2.78	1.68	2.01	2.18
37	2.94	2.08	1.70	1.95
611	2.62	2.01	1.06	1.86
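To read Table 3 at a glance, the stage with the lowest MolProbity score can be extracted for each model. The sketch below does only that; it is not the selection rule we applied, which also weighed the energies in Table 2, and the names `STAGES`, `MOLPROBITY`, and `best_stage` are introduced here for illustration.

```python
from collections import Counter

STAGES = ["Pre-Relax", "Post-AMBER", "Post-RosettaRelax", "Post-AMBER 2.0"]

# MolProbity scores copied from Table 3, keyed by construct length (aa).
MOLPROBITY = {
    62: [2.59, 2.13, 1.49, 2.09],
    57: [2.86, 1.24, 1.70, 1.78],
    52: [2.08, 2.73, 1.28, 1.48],
    47: [2.59, 2.76, 1.17, 2.62],
    42: [3.50, 2.84, 2.15, 2.49],
    41: [3.55, 1.73, 1.65, 2.00],
    40: [2.09, 1.46, 0.98, 3.28],
    39: [2.64, 1.86, 1.59, 2.59],
    38: [2.78, 1.68, 2.01, 2.18],
    37: [2.94, 2.08, 1.70, 1.95],
    611: [2.62, 2.01, 1.06, 1.86],
}


def best_stage(scores):
    """Name of the stage with the lowest (best) MolProbity score."""
    return STAGES[scores.index(min(scores))]


best = {length: best_stage(scores) for length, scores in MOLPROBITY.items()}
counts = Counter(best.values())

for length, stage in best.items():
    print(f"{length:>4} aa: {stage}")
print(dict(counts))
```

For most models the stereochemically best stage is the Rosetta-relaxed one, while the final AMBER step trades some stereochemical quality for lower energy, which is why both tables were considered together.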

Binder Design

Overview

Binders targeting progerin’s intrinsically disordered C-terminal tail were designed using RFdiffusion for backbone generation and ProteinMPNN for sequence design. The resulting sequences were structurally modeled with AlphaFold3 and subsequently evaluated through docking simulations to characterize their interaction with progerin and estimate binding affinities (Kd).

The design of peptide binders for the C-terminal tail of progerin was carried out through a two-step process, consisting of backbone generation with RFdiffusion and sequence design with ProteinMPNN, both accessed via the NeuroSnap platform. In binder design mode, RFdiffusion accepts as input a target structure along with user-defined binding pockets and hotspot residues that orient the diffusion process. The choice of these regions is crucial and depends on the specific nature of the target. In the case of progerin, the C-terminal tail differs from that of lamin A by only 6 amino acids, which in lamin A are usually processed by ZMPSTE24. This minimal divergence highlights the challenge of achieving selective binding to progerin while avoiding cross-reactivity with lamin A.

**Figure 4.** Lamin A and progerin sequences, with the differences in the C-terminal tail highlighted.

RFdiffusion is a generative deep learning model for de novo protein design, developed at the Institute for Protein Design (IPD), University of Washington, under the direction of David Baker [11]. The first version, published in Nature in 2023, represented a significant advance in the field of reverse protein folding. An updated framework, RFdiffusion2, was released in 2025 and is currently available as a preprint. The method extends RoseTTAFold by reformulating the protein design problem as a denoising diffusion process. RFdiffusion starts from random structural noise and, guided by geometric and biochemical constraints, progressively refines it through a series of diffusion steps until it converges toward a valid three-dimensional protein backbone. This iterative denoising process enables the model to generate highly diverse and realistic folds, which can be applied to various design tasks, including binder design. All the parameters we used to run our pipeline are presented in the supplemental materials.

ProteinMPNN (Protein Message Passing Neural Network) is a deep learning generative algorithm for designing protein sequences starting from three-dimensional backbone structures [12]. It was developed at the Institute for Protein Design (IPD) at the University of Washington, in the laboratory of David Baker, and was first published in 2022. The model is based on a graph neural network architecture, where each backbone residue is represented as a node connected to others through distances, orientations, and dihedral angles. The network relies on message passing, a mechanism by which each node exchanges information with its neighbours and iteratively updates its state. Through this process, both local and global structural relationships are integrated, allowing the model to predict, for each position, the probability distribution over possible amino acids. ProteinMPNN was accessed through the NeuroSnap platform. The "ProteinMPNN – Medium Gaussian Noise (0.20 Å)" model type was selected as a compromise between accuracy and sequence diversity. For each backbone, 100 sequences were generated in each of three runs, with the sampling temperature set to 0.1, 0.2, and 0.3. Lower values favoured high-probability sequences, whereas higher values promoted greater sequence diversity. From these runs, the five sequences with the highest Ligand Confidence scores were retained from each temperature setting as primary candidates for subsequent docking simulations, thus ensuring a balance between structural reliability and exploratory diversity.
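The effect of the sampling temperature can be illustrated with a temperature-scaled softmax. This is a minimal sketch, assuming the usual convention in which per-position logits are divided by the temperature before normalization; the logits below are hypothetical values for four amino acids, not ProteinMPNN output.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by `temperature`."""
    scaled = [x / temperature for x in logits]
    shift = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for one backbone position.
toy_logits = {"A": 2.0, "L": 1.5, "K": 1.0, "G": 0.2}

for temperature in (0.1, 0.2, 0.3):
    probs = softmax_with_temperature(list(toy_logits.values()), temperature)
    summary = ", ".join(f"{aa}={p:.3f}" for aa, p in zip(toy_logits, probs))
    print(f"T={temperature}: {summary}")
```

At the lowest temperature almost all probability mass sits on the top-scoring residue, whereas at higher temperatures the alternatives are sampled more often, which is the source of the additional sequence diversity.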

Starting from the sequences generated by ProteinMPNN, we predicted their structures using AlphaFold3 and subjected them to the relaxation cycle described above. These models, together with the C-terminal fragments and the full-length progerin, were used as inputs for docking simulations. This step allowed us to model peptide–protein interactions and to estimate binding affinities through dissociation constant (Kd) predictions.

Docking Analysis

Overview

Peptide docking with progerin was carried out using HADDOCK and ClusPro. HADDOCK employed hotspot restraints to guide binding toward the C-terminal tail, whereas ClusPro performed blind docking on the full-length protein to independently confirm localization. Docking outcomes were assessed based on HADDOCK scores and cluster convergence, allowing us to identify the most reliable C-terminal interactions.

The docking simulations were performed with HADDOCK and ClusPro, two complementary software approaches that combined hotspot-driven restraints with unbiased sampling of the protein surface. This strategy enabled us to model peptide–progerin interactions with higher reliability and to investigate whether binding occurred preferentially at the C-terminal tail. This analysis was particularly important for the full-length construct, where blind docking could serve as an independent control to assess whether the predicted interactions were restricted to the C-terminal region or, instead, distributed randomly along the protein surface, which would indicate nonsignificant binding. Each designed peptide was docked against the specific progerin fragment on which RFdiffusion had modeled it, since the generated binder backbone was conditioned by that particular conformational snapshot of the intrinsically disordered region (IDR). In parallel, all designed peptides were docked with ClusPro against full-length progerin in blind docking mode to assess whether their interactions were localized at the C-terminal region, also in the context of the entire protein. It is essential to note that these AlphaFold3-derived conformations represent only one of the numerous possible structural states of the disordered tail. Consequently, the binder is not meant to recognize that single structure as a rigid target, but rather to induce a compatible conformation upon binding.

HADDOCK (High Ambiguity Driven protein–protein Docking) is a data-driven docking platform that incorporates biochemical and structural information, such as hotspot residues, to guide the docking process [13]. It is available as a web server where the user can define active and passive residues, choose the type of restraints to be applied, and adjust parameters such as the flexibility assigned to backbone and side chains. The workflow consists of an initial rigid-body docking, followed by semiflexible refinement of the interface, and a final optimization in explicit solvent.

Docking results are reported in the form of clusters, ranked according to two primary metrics: the HADDOCK score (Equation 1), a weighted combination of van der Waals, electrostatics, desolvation, and restraint energies, and the cluster size, which represents the number of complexes sharing the same binding interface. A lower (more negative) HADDOCK score indicates a more favorable interaction, while a larger cluster size increases confidence by suggesting convergence toward a similar binding mode. The combination of these two parameters provides a robust criterion for selecting the most reliable docking solutions.

$$\text{HADDOCK score} = 1.0\,E_{\text{vdw}} + 0.2\,E_{\text{elec}} + 1.0\,E_{\text{desolv}} + 0.1\,E_{\text{air}} \tag{1}$$
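Equation 1 translates directly into code. The sketch below evaluates the weighted sum and ranks a few clusters by it; the energy terms in `clusters` are invented numbers used only to show the arithmetic, and the function name is ours.

```python
def haddock_score(e_vdw, e_elec, e_desolv, e_air):
    """Weighted sum of the four energy terms, with the weights of Equation 1."""
    return 1.0 * e_vdw + 0.2 * e_elec + 1.0 * e_desolv + 0.1 * e_air


# Hypothetical clusters: (name, Evdw, Eelec, Edesolv, Eair, cluster size).
clusters = [
    ("cluster_1", -40.0, -200.0, 10.0, 50.0, 116),
    ("cluster_2", -35.0, -150.0, 5.0, 120.0, 40),
    ("cluster_3", -20.0, -90.0, 2.0, 30.0, 12),
]

# Lower score = more favourable; sort ascending.
ranked = sorted(clusters, key=lambda c: haddock_score(*c[1:5]))
for name, e_vdw, e_elec, e_desolv, e_air, size in ranked:
    score = haddock_score(e_vdw, e_elec, e_desolv, e_air)
    print(f"{name}: score = {score:.1f}, size = {size}")
```

Because the electrostatic and restraint terms are down-weighted, a cluster with a large restraint violation energy can still rank well if its van der Waals and desolvation terms are favourable, which is one reason to inspect cluster size alongside the score.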

Active residues on progerin were defined as the last 15 amino acids of the C-terminal tail, excluding the terminal cysteine, which is normally farnesylated and therefore not suitable for direct inclusion. For the designed peptide binders, the entire sequence was defined as active, since no prior information about their binding interface was available. To approximate the biological effect of farnesylation in membrane anchorage, the terminal cysteine of progerin was set as semi-flexible. This configuration allows limited conformational adjustments during refinement, reducing steric artifacts while avoiding the unrealistic freedom that would result from defining it as fully flexible. Fully flexible regions were instead assigned to the intrinsically disordered segments (IDRs) of progerin, while no semi-flexible or fully flexible residues were defined for the binders. Finally, surface contact restraints were enabled to enforce intermolecular contacts and minimize the occurrence of unbound conformations.
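The active-residue selection can be written down explicitly. The sketch below assumes 1-based numbering on the 611-residue full-length construct, so the last 15 residues without the terminal cysteine correspond to positions 597-610; for a fragment, the fragment's own length would be passed instead. The function name and the comma-separated output format are illustrative choices.

```python
def active_residues(sequence_length, window=15, exclude_terminal=True):
    """1-based positions of the last `window` residues, optionally
    dropping the C-terminal residue (the farnesylated cysteine)."""
    start = sequence_length - window + 1
    end = sequence_length - 1 if exclude_terminal else sequence_length
    return list(range(start, end + 1))


residues = active_residues(611)
print(",".join(str(r) for r in residues))
```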

ClusPro is an automated protein–protein docking server that performs large-scale rigid-body docking followed by clustering of the resulting poses [14]. The docking engine relies on generating thousands of putative orientations of the two partners, in which the two proteins are systematically rotated and translated relative to one another to explore the entire interaction surface. These solutions are then grouped into clusters of structurally similar complexes, under the assumption that if many independent solutions converge toward the same orientation, this binding mode is more likely to be biologically relevant. Representative structures from the largest clusters are subsequently refined by energy minimization.
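The clustering idea can be sketched with a greedy procedure on a pairwise distance matrix: the pose with the most neighbours within a radius becomes a cluster centre, its neighbours are removed, and the process repeats. This is a simplified illustration, not ClusPro's implementation; the distance matrix and the radius value below are invented for the example.

```python
def greedy_cluster(dist, radius):
    """Greedy clustering of poses.

    dist   : symmetric matrix (list of lists) of pairwise RMSD-like distances
    radius : poses closer than this to a centre join its cluster
    Returns a list of (centre_index, sorted_member_indices), largest first.
    """
    remaining = set(range(len(dist)))
    clusters = []
    while remaining:
        # Neighbours of every remaining pose (each pose neighbours itself).
        neighbours = {
            i: {j for j in remaining if dist[i][j] <= radius} for i in remaining
        }
        # Pose with the most neighbours; ties go to the lowest index.
        centre = max(sorted(remaining), key=lambda i: len(neighbours[i]))
        members = neighbours[centre]
        clusters.append((centre, sorted(members)))
        remaining -= members
    return clusters


# Five hypothetical poses: 0-2 are similar to each other, as are 3-4.
toy_dist = [
    [0, 1, 1, 20, 20],
    [1, 0, 1, 20, 20],
    [1, 1, 0, 20, 20],
    [20, 20, 20, 0, 1],
    [20, 20, 20, 1, 0],
]

for centre, members in greedy_cluster(toy_dist, radius=9.0):
    print(f"centre {centre}: members {members}")
```

Cluster size then serves as a proxy for how often independent docking solutions converge on the same binding mode.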

Unlike HADDOCK, ClusPro does not require the user to define active residues, flexible regions, or other docking restraints. This makes the method particularly suited for blind docking, in which the entire surface of the target protein is sampled without prior assumptions about the binding site. For the full-length construct of progerin, this approach was essential to evaluate whether the designed binders explicitly interacted with the C-terminal region or distributed across the protein surface in a non-significant manner.

PRODIGY (PROtein binDIng enerGY prediction) is a web server developed at Utrecht University for the estimation of binding affinities of protein–protein complexes [15]. Given the three-dimensional structure of an interaction in PDB format, PRODIGY predicts the binding free energy (ΔG) and the corresponding dissociation constant (Kd) by analyzing interfacial contacts and surface properties. The underlying model relies on the number and type of residue–residue contacts within a cutoff distance, combined with the fraction of charged and apolar residues exposed at the non-interacting surface. These parameters are integrated into a statistical predictor that has been trained on a large dataset of experimentally determined complexes, allowing fast yet accurate estimation of binding affinities. We employed this tool for the evaluation of binding affinities between our interactors and the C-terminal region of progerin and lamin A.
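The conversion between binding free energy and dissociation constant follows the standard thermodynamic relation ΔG = RT ln(Kd), with Kd expressed relative to a 1 M standard state. The sketch below implements only this conversion, not PRODIGY's contact-based model; the function names and the example ΔG values are ours.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def kd_from_dg(dg_kcal_per_mol, temp_celsius=25.0):
    """Dissociation constant (M) from binding free energy (kcal/mol)."""
    temperature = temp_celsius + 273.15
    return math.exp(dg_kcal_per_mol / (R_KCAL * temperature))


def dg_from_kd(kd_molar, temp_celsius=25.0):
    """Binding free energy (kcal/mol) from dissociation constant (M)."""
    temperature = temp_celsius + 273.15
    return R_KCAL * temperature * math.log(kd_molar)


for dg in (-8.0, -10.0, -12.0):
    print(f"dG = {dg:.1f} kcal/mol -> Kd = {kd_from_dg(dg):.2e} M")
```

Because the relation is exponential, a difference of a few kcal/mol in predicted ΔG shifts Kd by orders of magnitude, so small errors in the predicted energy translate into large uncertainties in Kd.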

Overview

Our computational pipeline enabled the exploration of a broad library of designed peptides to assess their potential binding to the flexible C-terminal tail of progerin. Combining structural prediction, docking simulations, and stability optimization, we identified candidates displaying consistent and promising binding behavior. This workflow highlights how rational design, multiple sequence alignment, and energy-based evaluation can effectively guide the selection of peptide binders for subsequent experimental validation.

**Figure 5.** Overview of our computational strategy for designing selective binders against progerin. Starting from the amino acid sequence, structural models were generated with AlphaFold3, binder candidates were designed with RFdiffusion and ProteinMPNN, and their binding affinity was evaluated through docking analysis (HADDOCK, ClusPro) and energy estimation (PRODIGY).

Our docking analysis revealed that our binders exhibited heterogeneous binding patterns. Some peptides designed with RFdiffusion on the full-length structure of progerin only weakly interacted with the C-terminal region or did so in clusters of limited size, suggesting that these interactions were not significant. In contrast, other binders consistently engaged the C-terminal tail and produced several clusters of considerable size, indicating reproducible and reliable interactions. For these binders, the agreement between blind docking and hotspot-driven docking adds confidence to the overall analysis.

For the binders modelled on individual C-terminal fragments, the picture is more complex. Since IDRs can adopt multiple conformations, the AlphaFold3 model of full-length progerin represents only one possible conformation. This means that the lack of interaction in docking simulations does not necessarily indicate a binder would fail, but rather that the interaction may require or induce alternative conformations of the protein to engage properly.

To illustrate these patterns more concretely, we focus on a representative binder. The interactor 62aa11_1 was designed using the 62-amino-acid fragment and showed significant binding with the structure it was modeled on (Figure 6). Its predicted binding affinity corresponds to a Kd of 4.90 × 10⁻⁹ M with a cluster size of 116 structures, indicating a robust interaction. When analyzed against the full-length progerin structure, the Kd increases to 2.17 × 10⁻⁶ M (a weaker predicted affinity), because the same region adopts a different conformation in that predicted structure. In a cellular environment, both conformations are possible, so the weaker value does not rule out the interactor. By examining these cases in detail, we can better understand the structural features that contribute to strong and specific binding, and why certain sequences were selected for further analysis.

**Figure 6.** Structural model of the interaction between binder 62aa11_1 (salmon) and the 62-residue truncated fragment of progerin (purple). 62aa11_1 shows a stronger predicted binding affinity for the 62aa fragment and a weaker affinity for full-length progerin. The image was rendered in ChimeraX.

Based on these observations, we selected for affinity prediction only the binders that displayed interactions in proximity to the last 15 residues of their fragment, corresponding to the hotspot region defined during design with RFdiffusion. In addition, binders that also docked at the C-terminal tail of full-length progerin were considered as particularly promising candidates, as they suggested a stronger and more consistent affinity for that region. In total, we analyzed approximately 350 interactors; the sequences are provided in the supplemental materials.

Other techniques for binder design

Furthermore, for the binders designed against full-length progerin, the fifteen sequences retained from ProteinMPNN were subjected to multiple sequence alignment (MSA), revealing which positions were conserved and which varied across different designs.
Stability analyses were then performed using the Stabilize Protein protocol in ROSIE, which suggested specific substitutions to increase the thermodynamic stability of the candidates. From this process, a consensus sequence was derived and designated as LOGO, in which certain residues were substituted to enhance stability, while others, important for folding, were maintained according to the consensus. LOGO represents our rational design approach, using a consensus sequence optimized for stability (Figure 7).
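Deriving a consensus from aligned sequences amounts to taking the most frequent residue in each alignment column. The sketch below shows this step together with a simple per-column conservation measure; it does not include the stability-guided substitutions, and the three short sequences are invented placeholders rather than our designs.

```python
from collections import Counter


def consensus(aligned):
    """Most frequent residue in each column of equally long aligned sequences."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


def conservation(aligned):
    """Fraction of sequences carrying the most frequent residue, per column."""
    n = len(aligned)
    return [Counter(col).most_common(1)[0][1] / n for col in zip(*aligned)]


# Hypothetical aligned sequences (placeholders, not designed binders).
toy_alignment = ["ACD", "ACE", "GCE"]
print(consensus(toy_alignment))
print([round(c, 2) for c in conservation(toy_alignment)])
```

Columns with conservation close to 1 are candidates to keep unchanged, while variable columns are the natural places to try stabilizing substitutions.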

**Figure 7.** Insights into LOGO, the interactor obtained through rational design. A: Schematic illustrating how the LOGO sequence was generated for rational design, by analyzing the most frequent amino acids and their properties. B and C: Analysis of the interaction between progerin and the LOGO interactor.

In addition to the main computational pipeline, we employed NeuroBind, a fully automated AI-based binder design platform developed by NeuroSnap, as an auxiliary tool for the de novo generation of binders against progerin. NeuroBind was not part of our main workflow, but was used to explore additional candidates outside the RFdiffusion–ProteinMPNN design process. The platform requires only the target sequence and a few basic parameters, while backbone generation, sequence design, and preliminary validation are carried out automatically by the algorithm. Our inputs were the sequence of progerin, the design mode (peptide, with a binder length of 45 amino acids), and the hotspot region, defined consistently with RFdiffusion as the last 14 amino acids excluding the terminal cysteine (residues 597-610). NeuroBind generated 25 candidate binders in a single run, which were subsequently inspected in ChimeraX. Candidates that produced unrealistic conformations of progerin were discarded. For example, some models produced quadruple α-helices, or placed the binder in biologically implausible positions, for instance penetrating through the coiled-coil domain. The remaining models were retained for downstream analysis.

After refinement of the docking complex, the binding affinities of the complexes were estimated with PRODIGY at 36 °C. For the HADDOCK results, the first model from the top three clusters ranked by HADDOCK score was analyzed for each binder. For the ClusPro docking against full-length progerin, all clusters in which the interaction involved the C-terminal region were evaluated, while for the fragment-based docking, only the first two clusters by size were considered. For each case, the predicted Kd values were averaged by calculating a weighted mean, in which the weight was given by the cluster size. This procedure yielded one mean Kd per binder from HADDOCK and one from ClusPro. The best results were obtained for binders from the 62-residue and 52-residue fragments of progerin, as well as for the full-length construct. Compared with the original candidates generated directly by ProteinMPNN, LOGO consistently achieved improved results in terms of binding affinity, with a predicted Kd of 6.10×10⁻⁸M, highlighting the effectiveness of this consensus-based optimization strategy. This analysis allowed us to extract, from a library of 71 putative binders, six peptides with predicted Kd values ranging from 1.7×10⁻⁷M to 4.9×10⁻⁹M.
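PRODIGY reports a predicted binding free energy together with the corresponding Kd at the chosen temperature; the two are related by ΔG = RT ln(Kd). The following is a minimal sketch of that conversion at 36 °C, given only as an illustration and not as part of our workflow. The function names are hypothetical, and the gas constant is expressed in kcal/(mol·K).

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def kd_to_dg(kd_molar, temp_celsius=36.0):
    """Binding free energy (kcal/mol) from a dissociation constant in mol/L."""
    temp_kelvin = temp_celsius + 273.15
    return R_KCAL * temp_kelvin * math.log(kd_molar)


def dg_to_kd(dg_kcal, temp_celsius=36.0):
    """Dissociation constant (mol/L) from a binding free energy in kcal/mol."""
    temp_kelvin = temp_celsius + 273.15
    return math.exp(dg_kcal / (R_KCAL * temp_kelvin))


# Example with the Kd reported for 62aa11_1 on the 62-residue fragment.
dg_62aa11_1 = kd_to_dg(4.9e-9)
print(f"dG = {dg_62aa11_1:.2f} kcal/mol")
```

Because the relationship is logarithmic, a difference of about 1.4 kcal/mol at this temperature corresponds to one order of magnitude in Kd, which is worth keeping in mind when comparing predicted values.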

Moreover, the complexes from NeuroBind were evaluated with PRODIGY to verify the Kd values returned by the software. All the Kd values were extremely low, ranging from 6.06 × 10^-15 M down to 2.21 × 10^-21 M, so affinity alone could not discriminate between candidates; we therefore selected the three peptides with the most convincing binding modes, reaching a final set of nine interactors. Five of these interactors were selected for experimental validation.

Insights from the selected models

Overview

Five interactors were tested for binding to progerin, with 62aa11_1 showing the strongest predicted affinity for the fragment it was designed on and the highest prediction confidence. Rank 15 and Rank 21 were predicted to bind with very high affinity, but their low structural confidence suggests that these values are overestimated. Other interactors are discussed in the supplemental materials.

Table 4 summarizes key properties of the five interactors selected for experimental validation. Their conformations differ so that we can explore which structural features correlate with optimal binding to the progerin C-terminus. 62aa11_1 adopts an α-helical structure and stands out for its strong predicted affinity for the 62-residue C-terminal fragment, with a Kd of 4.90 × 10^-9 M and a cluster size of 116 structures. HADDOCK docking with full-length progerin gives a higher Kd (9.32 × 10^-7 M, Table 4), but the interaction remains satisfactory. 62aa11_1 also has the highest pTM score, indicating the highest confidence in its predicted structure. LOGO and n80_02 are both globular interactors. LOGO was designed rationally and shows a satisfactory predicted affinity for the progerin C-terminus, with a Kd of 7.50 × 10^-8 M, while n80_02 was generated through our computational pipeline and also stands out in the ClusPro analysis, with a predicted Kd of 8.73 × 10^-8 M.

Rank 15 and Rank 21 were produced using NeuroBind, which predicts very high binding affinities. We included these to assess the potential of this bioinformatic tool. However, while the algorithm reported extremely low Kd values, such affinities are likely overestimated and physically unrealistic, reflecting computational bias rather than true molecular behavior. Their structures are also less well defined and more disordered than the other interactors, reflected in relatively low pTM scores of 0.20 for Rank 15 and 0.32 for Rank 21.

**Table 4.** Key information about the interactors tested in the laboratory, including their pTM (prediction confidence), the predicted Kd from HADDOCK for the interaction with the full-length progerin C-terminus, as well as the peptide sequence and structural features.
Interactor	pTM	HADDOCK Kd progerin [M]	Sequence
62aa11_1	0.76	9.32E-07	GLEEAQRRAEEARRQIALANRAGRDQEEAARLQRELEALEAEIEEAKTG
LOGO	0.44	7.50E-08	APGRGRCRGNPPVCCCPNCPRCDADCTQGGGSGCPACPCP
n80_02	0.33	6.92E-08	AAGCGRCIGNPPVCCCCNCPECGQDCTQCGGSGCPNCPCP
Rank 15	0.2	1.08E-17	GLPLPELDLPEAMFRGKCAQANGAGASGTTHTAPPPEPREPLSGE
Rank 21	0.32	5.15E-15	ACAGSPRNCPAPCTGTDCPPCPGPAFEGDTEKPGPGEPPRGGAGG

Other interactors merited further analysis, but we were only able to test five of them. The sequences of all the other interactors are provided in the supplemental materials section.

Conclusion

Overview

We developed a computational pipeline for the de novo design of peptide binders selectively targeting the C-terminal tail of progerin, combining structural modeling, sequence design, docking, and PRODIGY-based affinity prediction. From an initial library of 96 candidates, nine binders were selected with consistent engagement of the C-terminal region and estimated dissociation constants in the nanomolar range or lower.

While limited by single AlphaFold3 conformations and rigid docking, the approach demonstrates the feasibility of designing selective binders for intrinsically disordered regions. Future experimental validation in yeast and human cells will test these predictions and guide iterative optimization, supporting the design of effective interactors for the ProgERASE therapeutic strategy.

These results must be interpreted in light of recent advances in the field. In particular, two studies recently published in Nature and Science by the Baker Lab have introduced pipelines for designing binders against intrinsically disordered regions, which have long been regarded as inaccessible to rational design [16] [17]. While our strategy shares some core elements with these pipelines, such as the integration of deep generative models and docking simulations, their approaches explicitly account for the conformational ensemble of the disordered region, either by co-folding the target and binder during diffusion or by guiding design with multiple conformations and secondary structure propensities. In contrast, our method relies on AlphaFold3 snapshots of individual conformations combined with rigid docking, which inevitably restricts the exploration of the full structural heterogeneity of the IDR. These differences underline both the strengths and the limitations of our method. While it demonstrates the feasibility of designing selective binders for disordered regions, it also highlights the need for broader conformational sampling. In particular, AlphaFold3 predictions could have been expanded by using multiple random seeds, thereby capturing a larger ensemble of conformations of the disordered C-terminal tail, and secondary structure predictions could have been integrated to better approximate its conformational landscape. Such refinements would bring our pipeline closer to the state of the art in IDR-targeted binder design and represent clear directions for methodological improvement in future work.
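As an illustration of the multi-seed idea mentioned above, the sketch below builds a job description requesting several random seeds in a single run. It assumes the JSON input layout of the open-source AlphaFold 3 release (field names such as `modelSeeds` and `dialect` should be checked against its documentation; the AlphaFold Server uses a different layout). The job name and the sequence are placeholders, not the progerin sequence, and this was not part of our workflow.

```python
import json

# Placeholder only: this is NOT the progerin sequence.
TARGET_SEQUENCE = "ACDEFGHIKLMNPQRSTVWY"


def build_af3_job(name, sequence, seeds):
    """Return a job dictionary asking for one prediction set per seed.

    Assumed layout: the JSON input format of the open-source AlphaFold 3 code.
    """
    return {
        "name": name,
        "modelSeeds": list(seeds),
        "sequences": [{"protein": {"id": "A", "sequence": sequence}}],
        "dialect": "alphafold3",
        "version": 1,
    }


# Ten seeds give ten independent samples of the disordered tail.
job = build_af3_job("idr_tail_ensemble", TARGET_SEQUENCE, range(1, 11))
job_json = json.dumps(job, indent=2)
print(job_json)
```

The resulting conformations could then be docked individually, so that each binder is scored against an ensemble rather than a single snapshot.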

Laboratory Experimentation

These computational predictions require rigorous experimental validation. First, the interaction between the designed peptides and progerin has been assessed using a yeast two-hybrid assay to detect direct interactions in a cellular context (visit our yeast wiki page). In parallel, the NanoLuc® Binary Technology (NanoBiT) complementation assay will be applied in human MRC-5 fibroblasts to confirm binding under physiologically relevant conditions (visit our Mammalian cell wiki page). Finally, quantitative affinity measurements will be carried out by Microscale Thermophoresis (MST) and Spectral Shift on a Monolith X instrument (NanoTemper Technologies), enabling the precise determination of dissociation constants for the peptide–progerin complexes. Together, these experiments will provide critical validation of our computational predictions and allow us to refine the design pipeline through an iterative cycle of in silico modeling and in vitro testing.

By assessing the potential of our binders to selectively target progerin while discriminating against lamin A, we aim to identify the structural and sequence features that are most important for designing precise and effective interactors suitable for the ProgERASE therapeutic approach. Building on these insights, future studies will expand the search for additional candidate binders, exploring larger libraries and a wider range of structural variants.
Small modifications to the strategies implemented in the pipeline can also be considered, such as cross-checking the data generated by different software more extensively. For instance, the literature suggests that integrating docking analyses through web servers like HADDOCK with predictive models such as AlphaFold could enhance the modeling of intrinsically disordered regions (IDRs). It would also be valuable to explore results using ensembles of potential conformations for the target protein, effectively simulating multiple “snapshots” of folding rather than relying on a single conformation [18]. This approach would allow us to refine our design strategies, optimize binding affinity and specificity, and ultimately move closer to identifying the most promising interactors for experimental validation and potential therapeutic applications.

The entire phase 2 was carried out in parallel with the experimental validation of the interactors, due to the time constraints of the iGEM competition. The results presented from this point onward did not have a direct impact on the project, but they will become relevant if the study is continued.

Step 1 - Without RING domain

After designing the interactors with AI-driven tools to target progerin specifically, we measured their predicted binding affinity for lamin A to check whether our peptides can discriminate between the two proteins. We focused on the nine interactors that showed the strongest binding affinity for progerin. First, we predicted their structures with AlphaFold3 a second time, so that the comparison between lamin A and progerin binding would start from the best structures available. We then ran the relaxation cycle and checked structural integrity with MolProbity, as described previously. For each interactor, we kept the structure offering the best compromise between a low energy level and a good MolProbity score. The same was done for lamin A. Table 5 shows the results of the relaxation cycles along with the structure scores. We highlighted the structures that were chosen for the downstream analysis. Finally, we docked the selected structures against lamin A and progerin using both ClusPro and HADDOCK.

**Table 5.** Relaxation results for the predicted most promising interactors. The table shows the energy levels and MolProbity scores of the structures after each relaxation step.
Interactor	Energy				MolProbity score
Interactor	Pre-Relax	Post-AMBER	Post-Rosie	Post-AMBER 2.0	Post-AMBER	Post-Rosie	Post-AMBER 2.0
lamin A	33167.06	-86321.45	-65282.53	-88179.65	1.90	0.90	1.84
LOGO	-716.32	-3606.94	-3001.67	-3914.43	1.93	0.86	2.70
n80_02	12057.20	-2528.40	-1272.80	-2425.82	2.02	1.68	1.56
Rank 15	316.67	-2755.90	-1375.60	-2868.68	1.46	1.88	1.99
Rank 21	7667.93	-1766.30	-329.59	-1912.69	2.02	1.88	1.19
62aa11_1	-6279.97	-9645.16	-7290.12	-9855.37	1.94	0.53	2.81
52aa2_1	-2907.52	-6645.47	-5453.73	-6911.33	3.01	1.01	2.35
42-2L03	2855.60	-1668.46	-844.73	-1712.35	1.73	0.50	2.22
n51_02	-618.09	-2918.23	-1947.73	-3036.73	2.43	1.65	1.54
Rank 7	1030.02	101.65	1247.85	-43.38	1.46	1.19	1.83

ClusPro results for lamin A docking showed no binding between the interactors and the region of interest in the C-terminus. Figure 8 depicts the interaction that got the closest to the region of interest, but it is evident that it is still far from the desired position.

**Figure 8.** Closest interaction between lamin A and LOGO with respect to the region of interest.

HADDOCK results were useful for assessing the binding capability between each interactor and lamin A. After defining the interacting region, we focused on clusters with low HADDOCK scores and large cluster sizes, as these indicate significant and reliable interactions. Specifically, we considered clusters with a HADDOCK score lower than -35 and computed the weighted average Kd, using the cluster size as the weighting factor. For peptides n80_02 and Rank 21, we analyzed two further clusters each, because relying on a single cluster would have had a strong impact on the final outcome. Both initially showed binding affinities on the order of 10^-9 M (9.40E-09 M and 9.70E-09 M, respectively), which would have been remarkably strong; the additional clusters moderated these values, leading to a more balanced and reliable estimate of the affinity. Table 6 shows the HADDOCK results:

**Table 6.** HADDOCK docking results between the interactors and lamin A. The final Kd was calculated as a weighted average of the Kd values from individual clusters, using cluster size as the weighting factor. For models with insufficient significant clusters, the first reliable cluster was retained.
lamin A	Cluster information			Prodigy
lamin A	Cluster	Haddock score	Cluster size	Kd [M]	Weighted average
LOGO	2	-41.8 ± 3.1	22	8.60E-08	1.28E-07
	3	-41.5 ± 4.6	16	3.80E-08
	1	-36.9 ± 3.3	29	2.10E-07
n80_02	2	-38.2 ± 2.7	21	9.40E-09	3.49E-08
	3	-31.2 ± 2.1	17	4.70E-08
	1	-29.2 ± 3.7	80	3.90E-08
Rank 15	3	-22.3 ± 6.9	15	1.90E-06	9.10E-07
Rank 15	10	-20.3 ± 6.3	5	1.00E-06	9.10E-07
Rank 21	2	-25.8 ± 2.2	17	9.70E-09	2.18E-07
	5	-42.4 ± 2.1	9	3.50E-07
	4	-35.1 ± 7.4	11	1.10E-07
62aa11_1	4	-40.6 ± 3.4	15	2.40E-08	1.73E-07
62aa11_1	1	-40.6 ± 5.2	60	2.10E-07	1.73E-07
52aa2_1	3	-34.0 ± 3.9	14	1.80E-06	1.80E-06
52aa2_1	5	-40.9 ± 1.6	13	1.60E-07	1.80E-06
42-2L03	2	-39.9 ± 2.6	39	6.40E-07	4.42E-07
42-2L03	4	-38.7 ± 0.5	20	2.40E-07	4.42E-07
n51_02	6	-35.8 ± 1.3	6	1.30E-07	1.30E-07
Rank 7	8	-23.7 ± 3.6	6	1.30E-06	1.30E-06
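The cluster-size-weighted averaging described above can be sketched as follows. This is a minimal illustration, not the script we used: the `Cluster` class and function name are hypothetical, the cutoff of -35 follows the text, and the fallback applied when no cluster passes the cutoff is left to the user because it was decided case by case.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    haddock_score: float  # mean HADDOCK score of the cluster
    size: int             # number of structures in the cluster
    kd: float             # PRODIGY Kd in mol/L


def weighted_mean_kd(clusters, score_cutoff=-35.0):
    """Cluster-size-weighted mean Kd over clusters scoring below the cutoff."""
    kept = [c for c in clusters if c.haddock_score < score_cutoff]
    if not kept:
        raise ValueError("no cluster passes the cutoff; choose one manually")
    total_size = sum(c.size for c in kept)
    return sum(c.size * c.kd for c in kept) / total_size


# LOGO versus lamin A, values taken from Table 6.
logo_lamin_a = [
    Cluster(-41.8, 22, 8.60e-08),
    Cluster(-41.5, 16, 3.80e-08),
    Cluster(-36.9, 29, 2.10e-07),
]
logo_kd = weighted_mean_kd(logo_lamin_a)
print(f"{logo_kd:.2e} M")
```

For this example the result agrees with the weighted average reported for LOGO in Table 6. Note that large clusters dominate the mean, so a single populous cluster with a weak Kd can outweigh several small clusters with strong ones.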

We repeated the same procedure with progerin, using the refined structures to improve our Kd predictions. Table 7 shows the HADDOCK results for progerin. Clusters with a HADDOCK score below -35 were considered significant, and the Kd was calculated as a weighted average based on cluster size. For cases where no cluster achieved a satisfactory HADDOCK score, we retained the most reliable cluster for analysis. The best Kd value is now 5.70 × 10^-9 M, corresponding to Rank 21, which likely retains part of the optimistic Kd predicted by NeuroBind. Among the binders not designed with NeuroBind, LOGO shows the lowest Kd (that is, the strongest predicted affinity) at 1.30 × 10^-8 M, which is encouraging for the outcome of our rational design.

Table 8 presents the differences between the first and second predictions (the first predictions are in Table 4). Overall, the data do not differ significantly and largely remain within the same order of magnitude. The largest differences are observed for Rank 15, Rank 21, and Rank 7, because NeuroBind had predicted very optimistic values, which is quite common in automatic prediction software.

**Table 7.** HADDOCK results for the interaction between progerin and the binding peptides. The final Kd was calculated as a weighted average of the Kd values from individual clusters, using cluster size as the weighting factor. For models with insufficient significant clusters, the first reliable cluster was retained.
progerin	Cluster information			Prodigy
progerin	Cluster	Haddock score	Cluster size	Kd [M]	Weighted average
LOGO	3	-35.7 ± 4.0	10	1.30E-08	1.30E-08
LOGO	7	-50.3 ± 5.7	8	3.60E-08	1.30E-08
n80_02	3	-42.4 ± 3.7	11	1.60E-07	7.81E-08
	5	-41.9 ± 3.6	10	1.90E-09
	4	-40.8 ± 1.7	10	2.30E-07
	2	-38.4 ± 4.2	11	2.00E-09
	1	-38.0 ± 2.7	48	6.80E-08
Rank 15	7	-39.9 ± 9.6	8	1.30E-07	1.30E-07
Rank 21	1	-43.4 ± 1.8	28	5.70E-09	5.70E-09
62aa11_1	1	-32.2 ± 2.0	29	1.40E-06	1.40E-06
52aa2_1	1	-56.0 ± 0.8	32	9.50E-08	4.17E-07
	3	-53.7 ± 7.0	53	5.00E-07
	2	-51.2 ± 2.1	29	6.20E-07
n51_02	7	-46.2 ± 2.8	8	3.00E-08	6.30E-08
n51_02	15	-43.8 ± 7.2	5	4.20E-08	6.30E-08
Rank 7	3	-37.5 ± 2.8	10	1.00E-07	1.60E-07
42-2L03	2	-29.9 ± 2.3	17	1.60E-07	4.53E-07
	9	-39.8 ± 2.4	8	1.00E-06
	10	-36.5 ± 3.4	7	4.00E-08

**Table 8.** Comparison between the Kd values from the first HADDOCK analysis of the interaction with progerin and those from the second analysis, performed with the refined structures.
Interactor	Haddock Kd [M] progerin - first analysis	Haddock Kd [M] progerin - second analysis
LOGO	7.50E-08	1.30E-08
n80_02	6.92E-08	7.81E-08
Rank 15	1.06E-17	1.30E-07
Rank 21	5.15E-15	5.70E-09
62aa11_1	9.32E-07	1.40E-06
52aa2_1	7.90E-07	4.17E-07
42-2L03	4.53E-07	4.53E-07
n51_02	8.51E-09	6.30E-08
Rank 7	2.44E-18	1.60E-07

To assess the selectivity of each designed interactor, we compared the dissociation constants (Kd) obtained from docking simulations with progerin and lamin A. To enable direct comparison, we calculated the ratio Kd(progerin)/Kd(lamin A), where values below 1 indicate stronger affinity for progerin, while ratios above 1 suggest preferential binding to lamin A. The results are summarized in Table 9.

This analysis revealed that several candidates displayed marked selectivity toward progerin, with ratios between 0.03 and 0.23, corresponding to roughly 4- to 30-fold stronger predicted binding compared to lamin A. These include the models LOGO, Rank 15, Rank 21, Rank 7, and 52aa2_1. Rank 21 is the most promising interactor for specific recognition of the pathogenic isoform. Conversely, a few candidates (n80_02, 62aa11_1, and 42-2L03) showed ratios exceeding 1, suggesting potential cross-reactivity with lamin A due to the high structural similarity between the two C-terminal regions. In particular, 62aa11_1 exhibited a significantly higher affinity for lamin A. One possible reason is that the binding sites of lamin A and progerin differ by only a few amino acids and lie in an intrinsically disordered region, so a peptide designed to bind progerin efficiently may bind lamin A equally well or even better.

Overall, these results indicate that our computational pipeline can generate interactors with differential binding profiles and suggest that both de novo (NeuroBind) and rational design approaches are valid and complementary strategies.

**Table 9.** Comparison of the Kd values for the interaction between the designed interactors and progerin versus lamin A. The third column reports the Kd ratio to facilitate the comparison of binding affinities. Values below 1 indicate stronger binding to progerin than to lamin A, while values above 1 correspond to higher affinity for lamin A. These results refer to the interactors without the RING domain.
Interactor	Haddock Kd [M] progerin - second analysis	Haddock Kd [M] lamin A	Kd (progerin) / Kd (lamin A)
LOGO	1.30E-08	1.28E-07	0.10
n80_02	7.81E-08	3.49E-08	2.24
Rank 15	1.30E-07	9.10E-07	0.14
Rank 21	5.70E-09	2.18E-07	0.03
62aa11_1	1.40E-06	1.73E-07	8.10
52aa2_1	4.17E-07	1.80E-06	0.23
42-2L03	4.53E-07	4.42E-07	1.02
n51_02	6.30E-08	1.30E-07	0.48
Rank 7	1.60E-07	1.30E-06	0.12
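The selectivity ratio in Table 9 can be recomputed with a few lines of code. The sketch below uses the rounded Kd values from the table, so individual ratios may differ in the last digit from those obtained with unrounded inputs; the variable and function names are illustrative only.

```python
# Kd values in mol/L, copied from Table 9 (interactors without the RING domain).
kd_progerin = {
    "LOGO": 1.30e-08, "n80_02": 7.81e-08, "Rank 15": 1.30e-07,
    "Rank 21": 5.70e-09, "62aa11_1": 1.40e-06, "52aa2_1": 4.17e-07,
    "42-2L03": 4.53e-07, "n51_02": 6.30e-08, "Rank 7": 1.60e-07,
}
kd_lamin_a = {
    "LOGO": 1.28e-07, "n80_02": 3.49e-08, "Rank 15": 9.10e-07,
    "Rank 21": 2.18e-07, "62aa11_1": 1.73e-07, "52aa2_1": 1.80e-06,
    "42-2L03": 4.42e-07, "n51_02": 1.30e-07, "Rank 7": 1.30e-06,
}


def selectivity_ratio(kd_target, kd_off_target):
    """Kd(progerin) / Kd(lamin A); values below 1 favour progerin."""
    return kd_target / kd_off_target


ratios = {
    name: selectivity_ratio(kd_progerin[name], kd_lamin_a[name])
    for name in kd_progerin
}
progerin_selective = sorted(name for name, r in ratios.items() if r < 1)

for name, r in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name:10s} {r:6.2f}")
```

Sorting by ratio places the most progerin-selective candidates first. A ratio close to 1, as for 42-2L03, should be read as no predicted preference rather than as weak binding.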

Step 2 - With the RING domain

To analyse the behaviour of the interactors when attached to the TRIM21 RING domain, it is first necessary to predict and relax the structures of the interactor-RING chimeric proteins. The interactors were joined to the RING domain at the N-terminus through a flexible linker composed of glycine and serine (GGGGSGGGGSGGGGS). Table 10 presents the total energy of each structure after every relaxation step (AMBER, RosettaRelax, then AMBER again), along with the MolProbity score used for structure validation. The pTM score returned by AlphaFold is also provided; it estimates the global confidence of the predicted fold and serves as an indicator of the reliability of the overall topology. In the table, we highlighted the structures that were retained based on their balance between MolProbity score and energy.

**Table 10.** Relaxation results for the predicted interactors attached to the RING domain. The table shows the energy levels and MolProbity scores of the structures after each relaxation step.
Interactor	pTM	Energy				MolProbity score
Interactor	pTM	Pre-Relax	Post-AMBER	Post-RosettaRelax	Post-AMBER 2.0	Post-AMBER	Post-RosettaRelax	Post-AMBER 2.0
LOGO	0.41	-1570.60	-5551.27	-3702.61	-6333.65	2.55	1.46	2.39
n80_02	0.40	-588.60	-4514.32	-2275.82	-4553.58	2.37	1.51	2.12
Rank 15	0.39	365.13	-3555.26	-850.96	-3647.22	2.30	2.02	2.38
Rank 21	0.39	302.12	-2899.76	-549.98	-3690.36	2.27	1.03	2.42
62aa11_1	0.47	-7308.03	-11329.11	-8128.80	-11533.71	2.12	1.40	2.17
52aa2_1	0.41	-5565.58	-8299.75	-5352.63	-8400.86	2.18	1.41	2.02
42-2L03	0.43	-447.58	-2869.33	-853.65	-3031.81	2.18	1.55	2.46
n51_02	0.41	-241.29	-4606.89	-1848.39	-4534.06	2.32	1.51	2.18
Rank 7	0.41	1741.06	-2435.16	124.32	-2813.69	2.15	1.45	2.01
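For readers who want to reproduce the chimeric constructs, the assembly of a fusion sequence can be sketched as below. The linker and the LOGO sequence are taken from this page; the RING sequence is a placeholder (it is not given here), and the orientation RING-linker-interactor is our reading of the construct description, exposed as a parameter so that the opposite orientation can also be generated.

```python
LINKER = "GGGGSGGGGSGGGGS"
LOGO = "APGRGRCRGNPPVCCCPNCPRCDADCTQGGGSGCPACPCP"

# Placeholder only: replace with the actual TRIM21 RING domain sequence.
RING_PLACEHOLDER = "M" + "X" * 10


def build_fusion(ring, binder, linker=LINKER, ring_first=True):
    """Concatenate RING domain, linker and binder into one sequence.

    ring_first=True  -> RING - linker - binder
    ring_first=False -> binder - linker - RING
    """
    parts = (ring, linker, binder) if ring_first else (binder, linker, ring)
    return "".join(parts)


fusion_n_term_ring = build_fusion(RING_PLACEHOLDER, LOGO)
fusion_c_term_ring = build_fusion(RING_PLACEHOLDER, LOGO, ring_first=False)
print(len(fusion_n_term_ring), fusion_n_term_ring)
```

Changing the `linker` argument, for example to a longer glycine-serine repeat, gives the variants that would be needed to study how linker length affects folding.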

This phase brought to light important insights about our interactors. We found that, when attached to the RING domain, not all structures maintain the fold required for binding. This implies that for some interactors, especially the globular ones, which are the most perturbed by the presence of RING, the linker length needs to be optimized to obtain a functional and properly folded protein. Increasing the linker length can improve the folding of the interactor, but it should not compromise the activity of the RING domain by placing it too far from the target progerin (we also discuss this on our engineering page, DBTL cycle 5). The most affected structure was LOGO, but we continued the downstream analysis to investigate whether it could retain some binding activity. Figures 9, 10 and 11 compare the properly folded structures with those whose folding was disrupted by the presence of the RING domain.

**Figure 9.** Folding distortion of interactor 52aa2_1.

**Figure 10.** Folding disruption of interactor LOGO. Globular conformations are the most affected by the presence of the RING domain.

**Figure 11.** Folding disruption of interactor n80_02. Globular conformations are the most affected by the presence of the RING domain. Nevertheless, the overall fold of the interactor is maintained.

We then calculated HADDOCK binding Kd values for lamin A and progerin to investigate whether our interactors could discriminate between the progerin target and lamin A. Clusters with a HADDOCK score below -35 were considered significant, and the Kd was calculated as a weighted average based on cluster size. For cases where no cluster achieved a satisfactory HADDOCK score, we retained the most reliable cluster for analysis. The most promising interactor appears to be Rank 15, with a Kd of 1.20 × 10^-8 M (Table 11). For other interactors, such as n80_02, the presence of the RING domain enhanced their ability to bind progerin, decreasing the Kd from 7.81 × 10^-8 M (Table 8) to 5.55 × 10^-8M (Table 11), likely due to conformational rearrangements (Figure 11).

**Table 11.** HADDOCK docking results for the analysis between the interactors attached to the RING domain and progerin.
progerin	Cluster information			Prodigy
progerin	Cluster	Haddock score	Cluster size	Kd [M]	Weighted average
LOGO	6	-40.4 ± 1.3	14	6.20E-07	1.38E-07
	1	-38.3 ± 2.8	47	3.90E-09
	8	-38.0 ± 3.2	6	3.80E-07
	5	-36.1 ± 3.6	15	1.40E-08
n80_02	1	-48.5 ± 1.8	98	7.10E-09	5.55E-08
	2	-48.3 ± 2.3	19	3.30E-07
	3	-40.8 ± 8.8	17	4.40E-08
	4	-35.5 ± 1.0	6	1.00E-08
Rank 15	1	-41.0 ± 3.1	49	1.20E-08	1.20E-08
Rank 21	2	-31.7 ± 2.5	26	7.80E-08	7.80E-08
62aa11_1	1	-23.4 ± 8.4	39	1.70E-07	1.70E-07
52aa2_1	1	-45.9 ± 1.3	49	3.80E-07	3.80E-07
42-2L03	2	-52.9 ± 1.7	24	1.60E-07	1.19E-07
42-2L03	4	-45.3 ± 3.3	9	8.50E-09	1.19E-07
n51_02	1	-48.3 ± 3.6	74	2.00E-08	1.97E-07
	10	-41.7 ± 4.9	5	7.00E-07
	2	-41.7 ± 2.3	36	4.80E-07
	3	-41.0 ± 1.4	10	1.90E-07
Rank 7	1	-31.1 ± 2.2	116	2.20E-08	2.20E-08

**Table 12.** HADDOCK docking results for the interaction between the interactors attached to the RING domain and lamin A.
Lamin A	Cluster information			Prodigy
Lamin A	Cluster	Haddock score	Cluster size	Kd [M]	Weighted average
LOGO	1	-31.7 ± 1.8	19	1.30E-07	1.30E-07
n80_02	2	-32.5 ± 2.8	50	4.90E-08	4.90E-08
Rank 15	2	-28.6 ± 13.4	19	4.50E-07	4.50E-07
Rank 21	11	-38.9 ± 9.0	4	1.80E-07	1.80E-07
62aa11_1	1	-43.1 ± 1.6	71	3.00E-09	3.00E-09
52aa2_1	1	-36.3 ± 2.6	73	7.30E-06	7.30E-06
42-2L03	5	-52.7 ± 4.1	10	1.70E-07	1.19E-07
	1	-50.2 ± 2.1	42	1.60E-07
	6	-48.0 ± 0.7	9	3.50E-08
	11	-42.7 ± 2.9	4	7.70E-08
	4	-41.6 ± 3.8	13	2.40E-07
n51_02	1	-48.3 ± 3.6	74	2.00E-08	1.97E-07
	10	-41.7 ± 4.9	5	7.00E-07
	2	-41.7 ± 2.3	36	4.80E-07
	3	-41.0 ± 1.4	10	1.90E-07
Rank 7	1	-31.1 ± 2.2	116	2.20E-08	2.20E-08

Table 13 compares the Kd values for the interaction between the RING-fused interactors and progerin or lamin A. The analysis revealed that, overall, the presence of the RING domain did not compromise the binding ability of most interactors. Several candidates maintained or even improved their selectivity toward progerin, as indicated by ratios below 1.

In particular, Rank 15 and 52aa2_1 showed the strongest preference for progerin, with Kd ratios of 0.03 and 0.05, respectively, suggesting that RING fusion can preserve and even enhance target discrimination. Rank 21 also retained moderate selectivity (ratio = 0.43). In contrast, the LOGO model lost its selectivity, with a ratio close to 1 (1.07), likely due to the conformational disruption shown in Figure 10. It would be interesting to test whether modifying the linker length or the position of the RING domain could restore its optimal configuration.

Notably, 42-2L03 maintained the same Kd for both progerin and lamin A (ratio = 1.00), confirming its neutral binding behavior. n80_02 showed a slight improvement compared to the isolated form, but still displayed a ratio above 1, indicating a stronger affinity for lamin A. 62aa11_1 exhibited a pronounced shift in specificity (ratio = 56.67), suggesting that RING fusion induced unfavorable conformational rearrangements that impaired binding to progerin and favored lamin A instead. Both 62aa11_1 and n80_02 demonstrated low affinity for progerin and higher affinity for lamin A, and would likely be excluded from future optimization rounds.

Taken together, these data indicate that the inclusion of the RING domain generally preserves the designed binders’ selectivity for progerin, while also highlighting how structural compatibility between the effector domain and the interactor scaffold is crucial for maintaining target specificity.

**Table 13.** Comparison of the Kd values for the interaction between the interactors and progerin versus lamin A. The third column reports the Kd ratio to facilitate the comparison of binding affinities. Values below 1 indicate stronger binding to progerin than to lamin A, while values above 1 correspond to higher affinity for lamin A. These results refer to the interactors attached to the RING domain.
Interactors	HADDOCK Kd progerin [M]	HADDOCK Kd lamin A [M]	Kd(progerin) / Kd(lamin A)
LOGO	1.38E-07	1.30E-07	1.07
n80_02	5.55E-08	4.90E-08	1.13
Rank 15	1.20E-08	4.50E-07	0.03
Rank 21	7.80E-08	1.80E-07	0.43
62aa11_1	1.70E-07	3.00E-09	56.67
52aa2_1	3.80E-07	7.30E-06	0.05
42-2L03	1.19E-07	1.19E-07	1.00
n51_02	1.89E-07	1.97E-07	0.96
Rank 7	2.20E-08	2.20E-07	0.10
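To see at a glance how RING fusion changed the predicted selectivity of each interactor, the ratios of Table 9 (isolated interactors) and Table 13 (RING fusions) can be compared on a log scale. This is only an illustrative sketch with hypothetical names; a negative shift means the ratio moved in favour of progerin after fusion, and all values remain docking-based estimates.

```python
import math

# Kd(progerin) / Kd(lamin A), copied from Table 9 and Table 13.
ratio_isolated = {
    "LOGO": 0.10, "n80_02": 2.24, "Rank 15": 0.14, "Rank 21": 0.03,
    "62aa11_1": 8.10, "52aa2_1": 0.23, "42-2L03": 1.02, "n51_02": 0.48,
    "Rank 7": 0.12,
}
ratio_with_ring = {
    "LOGO": 1.07, "n80_02": 1.13, "Rank 15": 0.03, "Rank 21": 0.43,
    "62aa11_1": 56.67, "52aa2_1": 0.05, "42-2L03": 1.00, "n51_02": 0.96,
    "Rank 7": 0.10,
}


def log10_shift(isolated, fused):
    """log10 of the change in selectivity ratio caused by RING fusion."""
    return math.log10(fused / isolated)


shifts = {
    name: log10_shift(ratio_isolated[name], ratio_with_ring[name])
    for name in ratio_isolated
}
more_selective_with_ring = sorted(n for n, s in shifts.items() if s < 0)

for name, shift in sorted(shifts.items(), key=lambda item: item[1]):
    print(f"{name:10s} {shift:+.2f}")
```

A shift of about +1 or -1 corresponds to a tenfold change in the ratio. Shifts close to zero, as for 42-2L03 and Rank 7, are well within the uncertainty of the docking-based estimates and should not be over-interpreted.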

Conclusion and future perspectives

Future perspectives

In the future, if we have the opportunity to continue our project, we would definitely aim to analyze the remaining interactors that we initially discarded, to see if any of them might outperform the current best options. Additionally, we would like to conduct studies focusing on the positioning (C-terminal vs. N-terminal) and linker length, to observe how these factors influence the folding of interactors when paired with the RING domain. Furthermore, we are interested in exploring the most effective approach for achieving optimal interaction. Should we rely solely on our existing pipeline, or would rational design, NeuroBind, or a hybrid method yield better results? To answer these questions, we will need to await experimental validation. This validation will also help us determine the most suitable features for the interactor, whether they should be globular, helical, or unstructured.

Another promising future direction would be to analyze interactors of varying lengths, potentially extending beyond peptides, to investigate whether changes in length impact binding affinity and RING domain efficiency. Lastly, if we had the resources, we would be very keen on conducting a comprehensive molecular dynamics study of progerin. We aim to generate a highly detailed model, including the farnesyl group, to allow for even more precise in vitro validation.

Looking further ahead, if the designed interactors were to be considered for therapeutic applications, a complete validation strategy would be required.
This would include:

immunogenicity assessment;
solubility and aggregation profiling;
stability and degradation studies;
affinity and target engagement tests;
off-target and safety evaluations;
absorption, distribution, metabolism and excretion (ADME)/pharmacokinetic characterization.

Each of these steps serves a distinct purpose and relies on complementary in silico (computational) and experimental approaches. Computational analyses would first allow us to predict critical properties such as potential immune epitopes, solubility, aggregation hotspots, and proteolytic cleavage sites. For instance, tools like IEDB and NetMHCpan can be used to assess immunogenicity risks by identifying HLA-binding epitopes; CamSol and AGGRESCAN3D help predict solubility and aggregation-prone regions; while PROSPER and PROSPERous allow the identification of protease-sensitive sites. These predictions guide early optimization and help identify variants less prone to immune reactions or degradation. Similarly, in silico screening against the human proteome, using sequence alignment and structural comparison tools (e.g. BLAST or AlphaFold Database), would help detect possible off-target similarities, reducing unwanted interactions before any laboratory testing.

At the experimental level, validation would proceed through a series of biophysical and cellular assays. Solubility and aggregation behaviour would be analyzed through size-exclusion chromatography (SEC) and dynamic light scattering (DLS), while stability would be assessed under thermal and oxidative stress to verify resistance to degradation. Binding affinity and kinetics could then be measured using surface plasmon resonance (SPR) or bio-layer interferometry (BLI), complemented by cell-based assays such as the cellular thermal shift assay (CETSA) to confirm interaction with the native target. Broader safety checks would involve cytotoxicity assays and hemocompatibility tests, together with in vitro immunogenicity screening using peripheral blood mononuclear cells (PBMCs) to detect potential cytokine release. Finally, pharmacokinetic (PK) and biodistribution studies in suitable models would clarify the peptide’s half-life, clearance and tissue localization.
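The kinetic constants measured by SPR or BLI are usually summarized as an equilibrium dissociation constant. The sketch below shows the standard 1:1 binding relations, K_D = k_off / k_on and ΔG = RT ln(K_D), in Python. The function names and the example rate constants are hypothetical and are not measurements from our project.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def dissociation_constant(k_on, k_off):
    """K_D in M from k_on (1/(M*s)) and k_off (1/s), assuming 1:1 binding."""
    if k_on <= 0 or k_off <= 0:
        raise ValueError("Rate constants must be positive")
    return k_off / k_on


def binding_free_energy(k_d, temperature_k=298.15):
    """Standard binding free energy in kcal/mol: dG = R * T * ln(K_D)."""
    if k_d <= 0:
        raise ValueError("K_D must be positive")
    return R_KCAL * temperature_k * math.log(k_d)


def fraction_bound(ligand_conc, k_d):
    """Fraction of target occupied at a given free ligand concentration (M).

    Assumes simple 1:1 equilibrium binding with the ligand in excess.
    """
    return ligand_conc / (ligand_conc + k_d)


if __name__ == "__main__":
    # Hypothetical example values for illustration only.
    k_d = dissociation_constant(k_on=1e5, k_off=1e-3)
    print(f"K_D = {k_d:.1e} M")
    print(f"dG  = {binding_free_energy(k_d):.2f} kcal/mol")
    print(f"Fraction bound at 100 nM: {fraction_bound(1e-7, k_d):.2f}")
```

Expressing measured affinities as free energies in this way would make them easier to compare with computationally predicted binding energies, with the caveat that both sides carry their own uncertainties.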

Altogether, this integrated workflow, combining in silico prediction, targeted redesign, and experimental validation, would help ensure that any candidate emerging from our project not only retains high affinity for progerin but also exhibits the safety, stability, and specificity required for therapeutic development. All these validation steps would follow international guidelines from the European Medicines Agency (EMA), the U.S. Food and Drug Administration (FDA), and the International Council for Harmonisation (ICH), in particular ICH S6(R1), in line with established safety and efficacy standards for therapeutic proteins.

Conclusion

In conclusion, as part of our project, we have developed a bioinformatics pipeline for the design of peptide interactors targeting an intrinsically disordered region: the C-terminal tail of progerin. This is a crucial aspect of our approach to tackling Hutchinson-Gilford Progeria Syndrome (HGPS). The pipeline went through multiple iterations of trial and error, with extensive engineering adjustments to refine the results, and we validated the resulting structures using a variety of bioinformatics tools.

After generating these interactors, we analyzed their binding affinities. From the hundreds of initial interactors, we selected the top 9 candidates that showed the most promising binding characteristics. These were then analyzed in more depth, both in laboratory experiments and in further computational studies. We included both versions of each interactor, with and without the RING domain from the TRIM21 ubiquitin ligase, which is central to our project’s goal of targeted progerin degradation.
Our findings revealed that the 9 interactors we analyzed showed varying degrees of specificity. Some performed better against progerin, while others were more effective against lamin A, which underscores the difficulty of targeting intrinsically disordered regions with precision. All the interactors developed during this process are now available in the registry (see our Contribution wiki page), and their sequences are provided in the supplemental material section for further reference. This resource will be crucial for future studies and experimental validation as we move forward with optimizing these interactors for therapeutic use in the ProgERASE approach.

At the end of it all, we remind ourselves that outside the laboratory, beyond the computer, the software, the proteins, and the de novo design of interactors... beyond all of this, there are people who are born and grow up with a rare, ultra-rare disease. Every day they live and continue to make their voices heard for who they are, both the same and different. Our iGEM community must remember that every year we participate in a competition among teams, but the ultimate goal is always to help build a simpler world for everyone.

Supplemental material

Interaction analysis template: the template we used for our data analysis. Write to mutans.biologia@unipd.it to request the Excel file; we will be happy to share it with you!

Tools' parameters: all the settings we used to obtain our results.

MULTI-FASTA: sequences of all the interactors and fragments we analyzed.

References

[1] L.B. Gordon, W.T. Brown, and F.S. Collins. “Hutchinson-Gilford Progeria Syndrome”. In: GeneReviews® [Internet] (1993-2025; Updated 2025 Mar 13). https://www.ncbi.nlm.nih.gov/books/NBK1121/
[2] X. Wong, A.J. Melendez-Perez, and K.L. Reddy. “The Nuclear Lamina”. In: Cold Spring Harb Perspect Biol 14 (2022). https://doi.org/10.1101/cshperspect.a040113
[3] R. Trivedi and H.A. Nagarajaram. “Intrinsically Disordered Proteins: An Overview”. In: Int J Mol Sci 23 (2022). https://doi.org/10.3390/ijms232214050
[4] J. Abramson et al. “Accurate structure prediction of biomolecular interactions with AlphaFold 3”. In: Nature 630 (2024). https://doi.org/10.1038/s41586-024-07487-w
[5] A. Waterhouse et al. “SWISS-MODEL: homology modelling of protein structures and complexes”. In: Nucleic Acids Res 46 (2018). https://doi.org/10.1093/nar/gky427
[6] T.R. Alderson, I. Pritišanac, and J.D. Forman-Kay. “Systematic identification of conditionally folded intrinsically disordered regions by AlphaFold2”. In: Proc Natl Acad Sci USA 120 (2023). https://doi.org/10.1073/pnas.2304302120
[7] Z.F. Brotzakis, S. Zhang, M.H. Murtada, P. Sormanni, and M. Vendruscolo. “AlphaFold prediction of structural ensembles of disordered proteins”. In: Nat Commun 16 (2025). https://doi.org/10.1038/s41467-025-56572-9
[8] S. Lyskov et al. “Serverification of Molecular Modeling Applications: The Rosetta Online Server That Includes Everyone (ROSIE)”. In: PLoS One 8 (2013). https://doi.org/10.1371/journal.pone.0063906
[9] D.A. Case et al. “The Amber biomolecular simulation programs”. In: J Comput Chem 26 (2005). https://doi.org/10.1002/jcc.20290
[10] C.J. Williams et al. “MolProbity: More and better reference data for improved all-atom structure validation”. In: Protein Science 27 (2018). https://doi.org/10.1002/pro.3330
[11] J.L. Watson et al. “De novo design of protein structure and function with RFdiffusion”. In: Nature 620 (2023). https://doi.org/10.1038/s41586-023-06415-8
[12] J. Wang et al. “Scaffolding protein functional sites using deep learning”. In: Science 377 (2022). https://doi.org/10.1126/science.abn2100
[13] R.V. Honorato et al. “The HADDOCK2.4 web server for integrative modeling of biomolecular complexes”. In: Nat Protoc 19 (2024). https://doi.org/10.1038/s41596-024-01011-0
[14] D. Kozakov et al. “The ClusPro web server for protein–protein docking”. In: Nat Protoc 12 (2017). https://doi.org/10.1038/nprot.2016.169
[15] L.C. Xue et al. “PRODIGY: a web server for predicting the binding affinity of protein–protein complexes”. In: Bioinformatics 32 (2016). https://doi.org/10.1093/bioinformatics/btw514
[16] C. Liu et al. “Diffusing protein binders to intrinsically disordered proteins”. In: Nature 644 (2025). https://doi.org/10.1038/s41586-025-09248-9
[17] K. Wu et al. “Design of intrinsically disordered region binding proteins”. In: Science 389 (2025). https://doi.org/10.1126/science.adr8063
[18] C. Geng, S. Narasimhan, J.P. Rodrigues, and A.M. Bonvin. “Information-driven, ensemble flexible peptide docking using HADDOCK”. In: Methods in Molecular Biology (2017). https://doi.org/10.1007/978-1-4939-6798-8_8
