Engineering

A comprehensive design strategy encompassing wetlab, model, software, and hardware, rigorously following the DBTL cycle to ensure iterative optimization and full integration.

Expression of Activated GZMK

Cycle 1: Acquisition of GZMK DNA

Design

To express GZMK, we need to obtain its DNA sequence. Considering the long synthesis cycle and high cost of synthesizing DNA sequences through commercial services, we propose to enrich and clone the GZMK DNA from a human genomic library.

Build

We designed two primers that bind to sequences flanking the GZMK gene on human chromosomes to enrich the GZMK gene from the genomic library. To enhance success rates, we developed primers of varying lengths (annealing temperatures) to balance nonspecific binding and yield.

Test

Using the human genome as a template, PCR was performed with the designed primers.

The shorter, less specific primers produced numerous nonspecific bands. The longer, more specific primers produced no amplification product at any of the annealing temperatures tested, so the target GZMK band was not obtained.

PCR Results for GZMK

Learn

The PCR results showed that low-specificity primers generated multiple nonspecific bands, while high-specificity primers failed to amplify the target band under various temperature conditions. This suggests that the GZMK gene is present at very low abundance or even absent in the genomic library used, rather than the issue being solely due to primer design. Therefore, we concluded that enrichment from the genomic library is not feasible and shifted to commercial DNA synthesis to ensure a correct and usable GZMK template.

Cycle 2: Prokaryotic Expression

Design

To screen for inhibitors and binding proteins of GZMK, obtaining active GZMK is critical. Based on literature review, GZMK can be expressed using either prokaryotic or eukaryotic vectors. We prioritized prokaryotic expression due to its lower requirements for equipment and reduced cost.

Build

We removed the signal peptide from the human GZMK gene, retained the proenzyme fragment, and performed codon optimization for Escherichia coli before cloning it into the pET-28a vector for GZMK protein expression in E. coli. As GZMK is a human-derived protein, expression in E. coli leads to incorrect folding and the formation of inclusion bodies, necessitating denaturation and refolding procedures for the inclusion bodies.

Test

We transformed the plasmid into Escherichia coli and cultured until turbidity was observed. Induction was performed at 37°C for 4–5 hours, followed by centrifugation to collect the bacterial cells. After cell lysis and centrifugation, a white precipitate was obtained, which was confirmed to contain the target protein GZMK via SDS-PAGE and Western Blot analysis.

Results of GZMK Prokaryotic Expression

The inclusion bodies were collected and dissolved in a denaturation buffer. The solution was then slowly pumped into a refolding buffer, followed by dialysis to exchange the buffer, yielding a GZMK solution. The presence of the target protein GZMK was confirmed via SDS-PAGE and Western Blot analysis.

Results of GZMK Denaturation and Refolding

The refolded GZMK was then treated with Cathepsin C (CTSC) to activate it, and its enzymatic activity was measured. No activity was detected.

Learn

SDS-PAGE and Western Blot confirmed that GZMK protein with the correct molecular weight and tag was obtained. However, the absence of enzymatic activity indicated that the problem was not at the genetic construction or translation level, but rather during the refolding process. As a human-derived protein, GZMK tends to form inclusion bodies in E. coli, and improper refolding conditions may result in misfolded or disordered protein that lacks activity. Thus, we concluded that optimizing refolding conditions is essential to obtain functional GZMK.

Cycle 3: Optimization of Prokaryotic Expression Conditions

Design

Based on Cycle 2, we need to modify the refolding conditions to obtain correctly folded GZMK.

Build

Through literature and textbook review, we found that the temperature during the refolding process affects the formation of disulfide bonds, which are critical for stabilizing the GZMK structure and generating its activity. Therefore, based on the literature, we increased the refolding temperature to promote the formation of correct disulfide bonds.

Test

Inclusion bodies were collected, dissolved in denaturation buffer, and slowly pumped into refolding buffer, with the refolding temperature raised from 4°C to room temperature (approximately 25°C) and a refolding duration of 30 hours.

After 30 hours of refolding at room temperature, a large amount of white flocculent precipitate formed in the solution, which was identified as denatured GZMK protein.

Learn

The refolding temperature experiments revealed that at low temperatures, GZMK failed to form correct disulfide bonds and proper structure, while at higher temperatures, it denatured and precipitated. Considering these results along with time and cost constraints, we recognized that the prokaryotic system is unlikely to meet the requirements for correct folding and enzymatic activity. Hence, we decided to abandon further optimization in the prokaryotic system and shift to eukaryotic cell expression, which naturally supports proper folding and modifications.

Cycle 4: Signal Peptide Selection

Design

GZMK is a human protein, and its expression in mammalian cells naturally facilitates the formation of correct disulfide bonds and proper folding, eliminating the need for denaturation and refolding processes. However, due to its dual intracellular and extracellular toxicity, we need to express it in the form of a secreted zymogen.

Build

We used HEK 293F cells for secretory expression of GZMK. Because the choice of signal peptide affects secretion efficiency, we selected three signal peptides reported in the literature and constructed the corresponding plasmids (GZMK-1, GZMK-2, and GZMK-3).

Test

Primers were first designed for PCR, and the results showed that the GZMK-1 vector could not be cloned, likely due to template contamination or degradation. Consequently, GZMK-2 and GZMK-3 were transfected into HEK 293F cells for protein expression.

PCR Results for Eukaryotic Expression of GZMK

Cells were periodically sampled, and Western Blot was used to assess protein expression levels. The secretion level of GZMK-2 was extremely low, whereas GZMK-3 exhibited high secretion levels, allowing GZMK to be obtained from the culture medium.

Expression Status of Eukaryotic GZMK

The culture medium of GZMK-3 was collected and subjected to affinity chromatography followed by size-exclusion chromatography. High-purity GZMK solution was obtained, as verified by SDS-PAGE.

Purification Results of GZMK

Learn

The comparison of three signal peptides showed that only GZMK-3 achieved efficient secretion and strong expression in HEK293F cells, while other constructs either failed to clone or exhibited very low secretion efficiency. This demonstrated that signal peptide selection plays a critical role in the success of secretory protein expression. We therefore identified GZMK-3 as the optimal construct and successfully obtained high-purity GZMK, laying a solid foundation for subsequent functional assays.

Cycle 5: Enzymatic Cleavage to Obtain Active GZMK

Design

Considering the cytotoxicity and stability of active GZMK, we expressed GZMK with its propeptide. After purification, the propeptide was cleaved in vitro to activate GZMK, yielding enzymatically active GZMK.

Build

Due to the difficulty in obtaining Cathepsin C, the enzyme responsible for propeptide cleavage in humans, we modified our approach by using the commonly available tool enzyme Enterokinase (EK). During vector construction, the signal peptide was followed by the EK cleavage sequence (DDDDK), and then the mature GZMK sequence, ensuring that EK processing yields enzymatically active GZMK.
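The layout described above (EK site placed directly before the mature chain) can be sketched in a few lines of Python. This is only an illustration of why cleavage after DDDDK leaves the mature chain as the C-terminal fragment. The function name `ek_cleave` and the placeholder string `MATURE_PLACEHOLDER` are assumptions made for this example and do not represent the real GZMK sequence.

```python
from typing import Tuple

EK_SITE = "DDDDK"  # enterokinase recognition sequence named in the text


def ek_cleave(zymogen: str, site: str = EK_SITE) -> Tuple[str, str]:
    """Split a protein sequence immediately after the first EK site.

    Returns (released_fragment, mature_chain).
    """
    index = zymogen.find(site)
    if index == -1:
        raise ValueError("no enterokinase site found in sequence")
    cut = index + len(site)
    return zymogen[:cut], zymogen[cut:]


# Placeholder string, NOT the real GZMK sequence (assumption for illustration).
MATURE_PLACEHOLDER = "MATURECHAIN"

# Secreted zymogen once the signal peptide has been removed during secretion:
# the EK site directly followed by the mature chain.
secreted_zymogen = EK_SITE + MATURE_PLACEHOLDER

released, mature = ek_cleave(secreted_zymogen)
print(released)  # DDDDK
print(mature)    # MATURECHAIN
```

Because EK cuts after the lysine of its recognition site, no extra residues from the site remain on the N-terminus of the mature chain in this layout.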

Test

The GZMK zymogen solution was dialyzed to exchange it into the enterokinase working buffer. Enterokinase was added at the recommended dosage, and dialysis was performed at room temperature for 16 hours to obtain the cleaved solution. Enzyme activity was then measured and was significantly higher than that of the control sample.

GZMK Activity Curve

Learn

By introducing an EK cleavage site into the construct, we successfully enabled in vitro activation of GZMK zymogen and observed the expected enzymatic activity after cleavage. This demonstrated the feasibility of our design, with all genetic elements (signal peptide, cleavage site, etc.) functioning as intended. The resulting active GZMK not only validated our experimental strategy but also provided a reliable protein source for downstream affinity measurements and inhibitor screening.

Cycle 6: Optimized GZMK Expression

Design

The subsequent inhibitor screening and affinity measurements require a substantial amount of GZMK, necessitating large-scale expression. Based on previous experiments, we continuously optimized the experimental protocol and ultimately established a stable protocol for the expression and purification of GZMK.

Build

To distinguish GZMK from its binding proteins, we replaced the His tag with a Flag tag and removed the linker between GZMK and the tag to prevent artificial glycosylation. The final construct utilized the pcDNA3.1 plasmid, into which the IgK signal peptide, EK cleavage site, GZMK sequence, and Flag tag were inserted.

Schematic Diagram of the Final GZMK Construct

Test

The aforementioned vector was transfected into HEK 293F cells, and the supernatant was collected after 72 hours. Affinity chromatography using Flag resin was performed, followed by concentration and EK cleavage. Post-cleavage, size-exclusion chromatography was conducted, with the elution peak corresponding to the size of GZMK. The protein at the peak fraction was aliquoted and cryopreserved. Subsequent activity testing revealed significant activity compared to the control.

SEC Image and Enzyme Activity Curve of GZMK

Learn

Through iterative exploration and optimization in the previous cycles, we ultimately established a stable and efficient workflow for producing active GZMK. This system combined appropriate signal peptide selection, tag design, and cleavage strategy, ensuring both yield and functionality. With aliquoted cryopreservation, we secured sufficient and stable supplies of GZMK, providing a solid foundation for large-scale affinity measurements and inhibitor screening in subsequent experiments.

Quantitative GZMK Activity Testing

Cycle 1: Activity Assay of Prokaryotically Expressed GZMK Using Z-Lys-SBZL

Design

To evaluate the biological function of prokaryotically expressed GZMK, we designed this experiment to measure its protease activity following activation by the key enzyme, Cathepsin C (CTSC).

Build

We selected Z-Lys-SBZL as a specific chromogenic substrate for GZMK. Active GZMK can cleave this substrate to release a free thiol group (-SH), which in turn reacts with DTNB to produce a yellow product with maximum absorbance at 412 nm. Enzyme activity is quantified by monitoring the change in absorbance. Due to instrument limitations, we ultimately chose to perform the measurement at a wavelength of 405 nm.
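As a rough illustration of how an absorbance change maps to released thiol, the Beer-Lambert relation can be applied to the slope of the absorbance trace. The sketch below assumes a molar absorption coefficient of 14,150 M⁻¹ cm⁻¹, a value commonly cited for the DTNB product at 412 nm, and a 1 cm path length. Neither value comes from this page, and a reading at 405 nm in a microplate would need its own calibration. The function name is hypothetical.

```python
# Assumed value: commonly cited coefficient for the yellow DTNB product (TNB)
# at 412 nm. Not taken from this page; recalibrate for 405 nm.
EPSILON_TNB_PER_M_PER_CM = 14150.0


def thiol_rate_um_per_min(delta_abs_per_min: float,
                          path_length_cm: float = 1.0,
                          epsilon: float = EPSILON_TNB_PER_M_PER_CM) -> float:
    """Convert an absorbance slope (A/min) to a thiol release rate (uM/min).

    Beer-Lambert: A = epsilon * c * l, so dc/dt = (dA/dt) / (epsilon * l).
    """
    molar_per_min = delta_abs_per_min / (epsilon * path_length_cm)
    return molar_per_min * 1e6  # M -> uM


# Hypothetical slope chosen so the result is easy to check by hand.
print(thiol_rate_um_per_min(0.01415))
```

A flat absorbance trace gives a rate of zero regardless of how high the baseline is, which is why the slope rather than the absolute absorbance is the quantity of interest.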

Structure of Z-Lys-SBZL

Test

We mixed CTSC-activated GZMK, the substrate Z-Lys-SBZL, and the chromogenic reagent DTNB in reaction buffer, and set up several control groups. The absorbance at 405 nm was measured using a microplate reader, and the results are shown below. Compared with the blank control, the absorbance of the GZMK+CTSC group increased significantly. However, the CTSC-only group also showed a similarly high signal, suggesting that the interference might originate from the CTSC preparation itself. Moreover, the absorbance time course of the GZMK+CTSC group was a flat line rather than the expected upward slope, so no GZMK enzymatic activity could be detected in this assay.

Activity Assay of GZMK Based on Z-Lys-SBZL

Learn

The GZMK+CTSC group showed an elevated baseline but a near-flat time course, while the CTSC-only control also produced a strong signal. Given DTNB's sensitivity to thiols, we infer that the interference arises from the reducing agent DTT in the CTSC buffer, which reacts directly with DTNB and yields a high background without a kinetic slope. Therefore, these data do not establish whether the prokaryotic GZMK is active; the core issue is assay interference. Next, we switch to a DTT-free activator (EK in the next cycle) to restore kinetic interpretability.

Cycle 2: Activity Assay of Eukaryotically Expressed GZMK Using Z-Lys-SBZL

Design

The objective of this experimental cycle was to verify whether the GZMK obtained from eukaryotic expression exhibits the expected protease activity after being cleaved and activated by Enterokinase.

Build

The experimental principle was the same as in the previous cycle: the assay quantifies enzyme activity by using GZMK to cleave the substrate Z-Lys-SBZL, which releases a thiol group. This thiol group then reacts with DTNB to produce a color signal detectable at 405 nm.

Test

We mixed the Enterokinase-activated GZMK solution with the substrate Z-Lys-SBZL and DTNB. To eliminate potential interference, we included both a blank control and a control group containing only Enterokinase. The results showed that while the absorbance of the GZMK experimental group increased significantly, the Enterokinase control group also exhibited a notable rise in absorbance.

Validation of GZMK Activity Before and After Enterokinase Cleavage

Learn

After EK activation, the GZMK group increased markedly, suggesting catalytic activity; however, the EK-only control also rose, indicating persistent background. The hierarchy “GZMK+EK > EK-only > blank” implies a positive net GZMK effect (after subtracting EK-only), yet EK or its buffer contributes to background. To clarify the mechanism and reduce error, the next cycle modulates EK activity (Ca²⁺/EDTA) to test whether EK directly cleaves the substrate or indirectly affects the readout.

Cycle 3: Investigation into the Influence of Enterokinase on Enzyme Activity Assay

Design

We hypothesized that Enterokinase (EK) itself might cleave the substrate, thus interfering with the accurate measurement of GZMK activity. This experiment was designed to test this hypothesis and to explore whether this interference could be eliminated by modulating EK activity (via Ca²⁺ concentration) to obtain more precise GZMK activity data.

Build

Our strategy involved using Ca²⁺ as an activator and EDTA as an inhibitor of Enterokinase. By adding supplemental Ca²⁺ (to enhance its activity) and EDTA (to inhibit its activity by chelating Ca²⁺) to separate reaction systems, we expected to observe corresponding changes in the background signal, thereby validating our hypothesis.

Test

The specific experimental groups were set up as follows: the final concentration of calcium ions was 2 mM, and the final concentration of EDTA was 4 mM.

Investigation of the Effect of Enterokinase on GZMK Activity

Learn

If our hypothesis were correct, Ca²⁺ should elevate and EDTA should reduce EK-related background; the observed changes did not match these predictions, failing to support EK directly cleaving Z-Lys-SBZL as the sole cause. The background is likely multifactorial (e.g., sensitivity of the substrate/chromophore to protein impurities, ions, or trace reductants). Since we cannot reliably isolate the true signal, the Z-Lys-SBZL + DTNB system is unsuitable for quantitative analysis here. We therefore switch to a fluorogenic peptide substrate to achieve higher S/N and cleaner kinetics.

Cycle 4: Activity Assay of GZMK Based on Fluorescent Peptide Substrate

Design

Given that the Z-Lys-SBZL colorimetric assay could not provide precise quantitative data, a critical requirement for our subsequent inhibitor screening experiments, we decided to develop and establish a new activity assay system based on a specific fluorogenic peptide substrate.

Build

Based on literature research, we designed and synthesized a specific fluorogenic peptide substrate (DABCYL-GDGRSIMTE-EDANS). The peptide sequence is a specific cleavage site for GZMK and is flanked by a fluorophore (EDANS) and a quencher (DABCYL). Upon cleavage by GZMK, the two are separated, abolishing the quenching effect. Upon excitation at 340 nm, the freed fluorophore emits at 490 nm, and the rate of fluorescence increase reflects enzyme activity.

Principle of Fluorescent Substrate Color Development

Test

We performed the activity assay in a standard buffer system. The final concentration of GZMK was 0.42 μM, and the fluorogenic peptide substrate was tested at three different concentrations (250, 125, and 62.5 μM), with two replicates per group. A blank control group without GZMK was also included. We monitored the change in fluorescence signal continuously for 30 minutes. It should be noted that due to filter limitations, the actual emission wavelength we measured was 460 nm.

Activity Assay of GZMK Based on Fluorescent Peptide Substrate

Learn

With a low, stable blank, all assay groups showed time-dependent increases that scaled with substrate concentration, yielding the expected kinetic slopes. Despite reading at 340/460 nm instead of the canonical 340/490 nm, the net signal remained strong, indicating robustness. We therefore confirm this peptide is a specific, efficiently cleaved substrate for GZMK, and the assay platform provides the reproducibility and sensitivity required for kinetic analysis and inhibitor screening.

Cycle 5: GZMK Activity Assay

Design

After successfully establishing and validating the fluorogenic peptide-based activity assay, we used this system to perform a systematic enzyme kinetics analysis of the GZMK obtained from our eukaryotic expression and purification pipeline.

Build

We treated GZMK as an enzyme that follows Michaelis-Menten kinetics. The experimental method involved measuring the initial reaction velocity (the rate of fluorescence increase at the start of the reaction) at a fixed GZMK concentration across a series of varying fluorogenic peptide substrate concentrations. Finally, the data was fitted to the Michaelis-Menten equation using GraphPad Prism software to calculate the Michaelis constant (Km) and maximum reaction velocity (Vmax).
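For readers without GraphPad Prism, the idea behind the fit can be sketched in plain Python. The example below is not the nonlinear regression used by the team. It substitutes a Hanes-Woolf linearization (S/v plotted against S), which needs no external libraries. The substrate concentrations are hypothetical, and the rates are noise-free synthetic values generated from the Km and Vmax reported later on this page, so the fit simply recovers those inputs.

```python
from typing import List, Tuple


def mm_rate(s: float, vmax: float, km: float) -> float:
    """Michaelis-Menten rate: v = Vmax * [S] / (Km + [S])."""
    return vmax * s / (km + s)


def hanes_woolf_fit(s_values: List[float],
                    v_values: List[float]) -> Tuple[float, float]:
    """Estimate (Km, Vmax) from the line S/v = S/Vmax + Km/Vmax."""
    xs = list(s_values)
    ys = [s / v for s, v in zip(s_values, v_values)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # equals 1 / Vmax
    intercept = mean_y - slope * mean_x  # equals Km / Vmax
    return intercept / slope, 1.0 / slope


# Hypothetical substrate series (uM) and synthetic, noise-free rates (RFU/min).
substrate_um = [3.75, 7.5, 15.0, 30.0, 60.0, 250.0]
rates = [mm_rate(s, vmax=437.5, km=50.20) for s in substrate_um]

km_fit, vmax_fit = hanes_woolf_fit(substrate_um, rates)
print(round(km_fit, 2), round(vmax_fit, 1))  # 50.2 437.5
```

With real, noisy data a direct nonlinear fit is usually preferred, because linearizations weight measurement errors unevenly across the concentration range.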

Test

We set up a series of substrate concentrations ranging from 0.9375 μM to 500 μM, with three parallel replicates and one blank control for each concentration. To correct for spontaneous substrate hydrolysis, the data used for the final fitting was calculated by subtracting the control group values from the average of the replicate groups. During data processing, we excluded the data points for the 500 μM and 125 μM concentrations due to anomalies or contamination.
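The blank correction described above can be sketched as follows: estimate the initial slope of each replicate by least squares, average the replicate slopes, and subtract the slope of the blank. The function names, the number of early time points used, and all the numbers are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def slope(times: Sequence[float], signal: Sequence[float]) -> float:
    """Least-squares slope of signal versus time."""
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(signal) / n
    stt = sum((t - mean_t) ** 2 for t in times)
    sty = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, signal))
    return sty / stt


def net_initial_rate(times: Sequence[float],
                     replicates: List[Sequence[float]],
                     blank: Sequence[float],
                     n_points: int = 5) -> float:
    """Mean replicate slope minus blank slope over the first n_points reads."""
    t = times[:n_points]
    rates = [slope(t, rep[:n_points]) for rep in replicates]
    return sum(rates) / len(rates) - slope(t, blank[:n_points])


# Synthetic traces (RFU) read once per minute; slopes of 10, 11, 12 and 1.
minutes = [0, 1, 2, 3, 4, 5]
reps = [[100 + 10 * m for m in minutes],
        [102 + 11 * m for m in minutes],
        [98 + 12 * m for m in minutes]]
blank_trace = [95 + 1 * m for m in minutes]

print(net_initial_rate(minutes, reps, blank_trace))
```

Subtracting slopes is equivalent to taking the slope of the blank-subtracted mean trace, since a least-squares slope is linear in the signal.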

Enzyme Kinetics Assay of GZMK

Learn

Through Michaelis-Menten fitting, we successfully obtained key enzyme kinetic parameters for GZMK: a Km value of 50.20 μM, indicating a relatively high affinity between GZMK and the fluorogenic substrate, and a Vmax of 437.5 RFU/min. These data provide a critical baseline for subsequent inhibitor screening experiments.

High-Precision Inhibitor Screening

Cycle 1: Determination of Screening Conditions

Design

Before initiating large-scale screening, we need to establish various conditions for the screening process, including substrate concentration, protein concentration, and inhibitor concentration. Therefore, we plan to conduct preliminary experiments to investigate these factors.

Build

For substrate concentration, we set two final concentrations: 50 μM and 100 μM. For protein concentration, we set four final concentrations: 25 nM, 50 nM, 100 nM, and 200 nM. The total volume of each reaction system was set to 50 μL.

Test

Based on the conditions described above, we established 10 groups for preliminary experiments. The basic information, fluorescence intensity curves, and curve slopes are as follows:

Preliminary Screening Experiment Results

Learn

The pilot runs showed that 100 μM substrate produced a larger initial-rate separation within the same time window while maintaining a clean linear range. At 100 nM enzyme, the slope difference versus control was already robust, and higher enzyme levels mainly increased consumption with limited gain in window size. Balancing signal-to-noise, substrate use, enzyme economy, and linear initial-rate stability, we selected 100 μM substrate with 100 nM enzyme for large-scale screening, which secures adequate kinetic slope while preserving throughput and cost efficiency.

Cycle 2: Large-Scale Screening

Design

We selected the L1000 compound library, which contains 1,813 drugs approved by regulatory agencies such as the FDA, EMA, and CFDA. We planned to screen the entire library.

Build

Each reaction system contains HEPES buffer, 84.23 nM GZMK protein, 100 μM fluorescent substrate, 1 mM inhibitor, 0.1 mg/mL BSA, and 0.01% Triton X-100. Fluorescence intensity was measured once per minute for 15 cycles.

Test

Due to the large volume of data, we cannot present all results here. Among the 1,813 compounds tested, three produced fluorescence intensity curves consistent with the inhibition we expected.

Results of Large-Scale Preliminary Screening

Learn

Under the chosen parameters, most compounds showed only background-level variation, while three compounds produced a stable and reproducible drop in initial rate across replicates with kinetics consistent with true inhibition. Together with BSA and low Triton X-100 suppressing nonspecific aggregation, this supports these as true positives rather than optical or aggregation artifacts. The next step is retesting and dose–response confirmation to rule out autofluorescence, inner-filter effects, and precipitation.
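A drop in initial rate can be expressed as a percent inhibition relative to the uninhibited control, which makes hit calling explicit. The sketch below is an assumption about how such a calculation might look. The 50% threshold, the compound names, and the rates are hypothetical and are not the team's screening criteria or data.

```python
from typing import Dict, List


def percent_inhibition(rate_compound: float, rate_control: float) -> float:
    """Inhibition (%) = 100 * (1 - v_compound / v_control)."""
    return 100.0 * (1.0 - rate_compound / rate_control)


def call_hits(rates: Dict[str, float],
              rate_control: float,
              threshold: float = 50.0) -> List[str]:
    """Return compounds whose inhibition meets the (hypothetical) threshold."""
    return [name for name, rate in rates.items()
            if percent_inhibition(rate, rate_control) >= threshold]


# Hypothetical initial rates in RFU/min; control well has no inhibitor.
control_rate = 100.0
compound_rates = {"cmpd_A": 98.0, "cmpd_B": 20.0, "cmpd_C": 105.0}

print(call_hits(compound_rates, control_rate))  # ['cmpd_B']
```

A negative inhibition value, as for `cmpd_C` here, means the well read faster than the control, which in practice often points to autofluorescence rather than activation.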

Cycle 3: Determination of Precise Inhibitory Effects

Design

Based on the results from the previous large-scale screening, we selected Nafamostat mesylate, which exhibited the highest inhibition rate, for more precise characterization. The IC50 value, which indicates the concentration of a protease inhibitor required to inhibit 50% of the target enzyme’s activity, is a critical parameter for evaluating drug potency and selecting candidate compounds. The IC50 curve, by demonstrating the dose-response relationship, aids researchers in optimizing drug design, predicting effective concentrations, and assessing potential therapeutic efficacy. We aim to measure the IC50 value of Nafamostat mesylate.

Build

To determine the IC50 value of Nafamostat mesylate, we established a control curve of GZMK activity at varying concentrations of Nafamostat mesylate. Starting from 200 μM, we performed serial twofold dilutions to create 18 concentration gradients, with the lowest concentration being approximately 1.53 nM, for the GZMK activity assay.
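The dilution scheme can be checked with a short calculation: 18 twofold steps starting at 200 μM end at 200 / 2¹⁷ μM. The helper name below is hypothetical.

```python
from typing import List


def twofold_series(top: float, n_points: int) -> List[float]:
    """Return n_points concentrations, each half of the previous one."""
    return [top / 2 ** i for i in range(n_points)]


series_um = twofold_series(200.0, 18)   # concentrations in uM
lowest_nm = series_um[-1] * 1000.0      # convert uM -> nM

print(len(series_um))        # 18
print(round(lowest_nm, 2))   # 1.53
```

The lowest point is about 1.53 nM, matching the value stated above, so the series spans roughly five orders of magnitude.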

Test

During the experiment, GZMK was first mixed with Nafamostat mesylate and incubated for 90 seconds. The substrate was then added and mixed, followed by immediate measurement of fluorescence intensity in a microplate reader, recorded once per minute for a total of 15 cycles. After data processing, we obtained the Nafamostat mesylate concentration–GZMK activity inhibition rate curve. Curve fitting yielded an IC50 value of 0.1951 μM.
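The shape of such a dose-response curve can be sketched with a standard logistic (Hill) model. The page does not state which model or Hill slope was used for the fit, so the slope of 1 and the 0 to 100% plateaus below are assumptions. Only the IC50 value is taken from the text.

```python
def inhibition_percent(conc_um: float,
                       ic50_um: float,
                       hill: float = 1.0,
                       top: float = 100.0,
                       bottom: float = 0.0) -> float:
    """Logistic dose-response: inhibition rises from bottom to top.

    conc_um must be greater than zero.
    """
    return bottom + (top - bottom) / (1.0 + (ic50_um / conc_um) ** hill)


IC50_UM = 0.1951  # value reported on this page

# By construction, inhibition is 50% when the concentration equals the IC50.
print(inhibition_percent(IC50_UM, IC50_UM))  # 50.0

# Predicted inhibition at the two ends of the dilution series (model only).
for conc in (200.0, 200.0 / 2 ** 17):
    print(conc, inhibition_percent(conc, IC50_UM))
```

Under these assumptions the highest concentration sits on the upper plateau and the lowest near zero inhibition, which is the coverage needed for a well-constrained IC50.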

Nafamostat Mesylate Concentration–GZMK Activity Inhibition Rate Curve

Learn

Dose–response fitting yielded an IC50 of 0.1951 μM for Nafamostat mesylate, indicating submicromolar potency and significant inhibition at low concentrations. This provides a quantitative baseline for selectivity profiling and mechanism assignment, and a reference for defining a standard control and refining hit thresholds. Next we will verify reproducibility, compare apparent IC50 stability across substrate levels, and extend to mechanism and reversibility assays.

High-Throughput Expression of Binders

Cycle 1: Positive Control Expression

Design

Subsequent experiments involving binders require a positive control, i.e., a pair of proteins confirmed to bind to each other. We also need such a pair to validate the entire experimental workflow. Therefore, before formally initiating binder expression, we will express a pair of positive control proteins.

Build

Our requirements for the positive control include simple expression (preferably via direct expression in Escherichia coli), moderate affinity (Kd in the μM range), and stable properties (to facilitate testing under extreme conditions). Ultimately, we identified a suitable protein pair: Ubc9 and SUMO1.

Test

We cloned Ubc9 and SUMO1 into plasmids and expressed them in Escherichia coli. Following affinity chromatography and size-exclusion chromatography, high-purity Ubc9 and SUMO1 solutions were obtained. Subsequently, surface plasmon resonance (SPR) was used to measure binding affinity, yielding a Kd of 8.267 μM.

Affinity Assay of SUMO1 and Ubc9

In the figure above, the Raw Data section shows the real-time SPR response curves. Each colored line corresponds to a different concentration of SUMO1 injected over Ubc9 immobilized on the chip. As the concentration increases, the response signal rises accordingly, and it decreases during the dissociation phase. This concentration-dependent response is a hallmark of specific binding. Minor negative values or sharp fluctuations observed in some curves are background noise or nonspecific signals, which do not affect the overall interpretation.

In the Fitted Data section, the x-axis represents sample concentration and the y-axis represents equilibrium response. Colored points indicate experimental data, while the black or gray line represents the fitted model. Good agreement between data points and the fitted curve indicates reliable results, from which the dissociation constant (Kd) can be calculated. The Kd value reflects the strength of molecular interaction: a lower Kd indicates tighter binding, whereas a higher Kd suggests weaker binding.
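The fitted curve in a steady-state affinity plot is commonly described by a 1:1 binding isotherm, Req = Rmax · C / (Kd + C). The sketch below assumes that model, which this page does not name explicitly, and uses a hypothetical Rmax of 100 RU. Only the Kd is taken from the text.

```python
def steady_state_response(conc_um: float, kd_um: float, rmax_ru: float) -> float:
    """Equilibrium SPR response for 1:1 binding: Rmax * C / (Kd + C)."""
    return rmax_ru * conc_um / (kd_um + conc_um)


KD_UM = 8.267    # Kd reported on this page for SUMO1 and Ubc9
RMAX_RU = 100.0  # hypothetical saturation response, for illustration only

# Response at a few multiples of Kd.
for multiple in (0.1, 1.0, 10.0):
    conc = multiple * KD_UM
    print(multiple, steady_state_response(conc, KD_UM, RMAX_RU))
```

At a concentration equal to Kd the response is half of Rmax, and at ten times Kd it is about 91% of Rmax, so a concentration series that brackets Kd on both sides constrains the fit best.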

Learn

The workflow was successfully completed, with Ubc9 and SUMO1 expressed stably in E. coli and purified to high quality. SPR analysis yielded a binding affinity in the micromolar range, meeting our design criteria for a positive control. This confirms that our experimental pipeline is functional and provides a reliable protein pair. The positive control can be used not only for binder validation but also for testing hardware devices and colloidal gold test strips.

Cycle 2: High-Throughput Protein Expression

Design

Due to the currently low success rate of computationally designed binders, we need to express a large number of proteins to ensure the identification of the target protein, necessitating high-throughput protein expression.

Build

Simply increasing the number of samples processed simultaneously in each experiment would reach a bottleneck, as many instruments have limited throughput and cannot handle multiple samples concurrently. Instead, we opted to overlap multiple rounds of experiments temporally, conducting different stages of multiple experiments simultaneously to maximize efficiency.

Test

The current protein purification cycle takes 4 days: Day 1 involves transformation and plating, Day 2 includes picking single colonies and small-scale culture, Day 3 entails scaling up to large-scale culture and induction, and Day 4 consists of cell harvesting, affinity chromatography, and size-exclusion chromatography. We conduct two rounds of expression simultaneously, with Days 3 and 4 of the first batch overlapping with Days 1 and 2 of the next batch.
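The overlapping schedule can be laid out as a small calendar. The sketch below assumes a new batch starts every two days, as described above. The stage labels paraphrase the text and the function name is hypothetical.

```python
from typing import Dict, List, Tuple

STAGES = [
    "transformation and plating",
    "colony picking and small-scale culture",
    "scale-up and induction",
    "harvest, affinity chromatography, SEC",
]


def overlapped_schedule(n_batches: int,
                        offset_days: int = 2) -> Dict[int, List[Tuple[int, str]]]:
    """Map each calendar day to the (batch, stage) pairs running that day."""
    calendar: Dict[int, List[Tuple[int, str]]] = {}
    for batch in range(n_batches):
        start_day = 1 + batch * offset_days
        for step, stage in enumerate(STAGES):
            calendar.setdefault(start_day + step, []).append((batch + 1, stage))
    return calendar


plan = overlapped_schedule(3)
for day in sorted(plan):
    print(day, plan[day])

print(max(plan))  # 8
```

Three batches finish on day 8 instead of day 12 when run back to back, and no day carries more than two stages, which keeps the daily workload bounded.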

Learn

By adopting a temporal overlapping strategy, we overcame the limitations of single-batch instrument throughput, increasing overall output while maintaining the same experimental cycle. Results confirmed that overlapping steps from different batches did not interfere with each other. This approach ensured consistent expression quality while significantly boosting efficiency, making it highly suitable for processing the large number of samples required in binder screening.

Cycle 3: Rapid Protein Expression

Design

Under specific circumstances (e.g., completing expression within a single weekend), we require a shorter cycle than the original 4-day protocol to achieve rapid protein expression.

Build

The main factors limiting the experimental duration are several overnight steps that each exceed 10 hours, including the overnight plate culture on Day 1, which is needed for picking single colonies. Since the goal of our experiment is to obtain protein, whether single colonies are used does not affect the outcome, so this step can be omitted.

Test

The new experimental cycle spans two days plus one night: transformation and small-scale culture on the evening of Day 1, scale-up to large-scale culture and induction on Day 2, and cell harvesting, affinity chromatography, and size-exclusion chromatography on Day 3. Experimental validation confirmed that protein expression results are equivalent to those of the standard 4-day cycle.

Learn

By omitting non-essential overnight steps, we successfully shortened the cycle to two days plus one night, with protein yields equivalent to the standard four-day protocol. This demonstrates that certain steps can be streamlined according to experimental goals, allowing flexible adaptation to time-constrained situations. For instance, during weekends or urgent schedules, this method enables efficient completion of experiments within limited time.

Cycle 4: Segmental Protein Expression

Design

During the semester, team members typically have classes from Monday to Friday, so substantial blocks of time for experiments are available only on the two weekend days. With the original experimental method, only one batch of experiments could be completed per week, resulting in low efficiency.

Build

We found that in the 3-day rapid expression protocol, the first two days involve minimal experimental work, which can be completed between classes. The final day requires over 10 hours of experimental time. Therefore, we pause the experiment on the morning of the third day and resume over the weekend, achieving higher efficiency.

Test

Segmental Protein Expression: Transformation and small-scale culture are performed on the evening of Day 1, followed by scale-up to large-scale culture and induction on Day 2. On Day 3, cells are harvested and the bacterial pellet is cryopreserved, pausing the experiment. Upon resumption, cell lysis, affinity chromatography, and size-exclusion chromatography are conducted, taking approximately 10 hours. Experimental validation confirmed that protein expression results are equivalent to those of the standard 4-day cycle.

Learn

By cryopreserving the bacterial pellet on Day 3 and resuming the process over the weekend, we effectively overcame the limitation of restricted weekday hours. Results showed that freezing did not impair protein expression or purification quality, and the outcomes were consistent with the full cycle. This method makes full use of spare time during the week and extended weekend hours, enabling more experimental batches per week and significantly enhancing overall efficiency.

Optimized Binder Affinity Assay

Cycle 1: Preliminary Experiment for Feasibility Validation

Design

Through preliminary in-depth research, we found that SPR technology can achieve high-throughput screening, while also offering advantages such as convenient operation, rapid detection, and low cost, which meet our needs. To verify its feasibility, we used the Biacore T200 to measure the affinity of positive control binding and compared it with its theoretical value.

Build

We coupled Ubc9 and YPet-Ubc9 to the chip as stationary phases, using SUMO1 and CyPet-SUMO1, respectively, as the corresponding mobile-phase proteins. The isoelectric points of Ubc9 and YPet-Ubc9 are 7.73 and 6.77, respectively, and the coupling pH should be below the pI so that the protein carries a net positive charge and concentrates on the negatively charged chip surface. We therefore set the protein coupling pH to 5.0.
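The pH selection rule above can be written as a small check. This is only an illustrative sketch: the function name and the lower bound of 3.5 (a commonly quoted limit below which the chip surface loses its negative charge) are assumptions and are not taken from our protocol.

```python
def coupling_ph_ok(coupling_ph, ligand_pis, surface_min_ph=3.5):
    """Return True if coupling_ph lies above the assumed surface limit
    and below the pI of every ligand to be coupled."""
    return surface_min_ph < coupling_ph < min(ligand_pis)

# pI values quoted in the text for the two stationary-phase proteins
ligand_pis = {"Ubc9": 7.73, "YPet-Ubc9": 6.77}
chosen_ph = 5.0

ph_is_valid = coupling_ph_ok(chosen_ph, ligand_pis.values())
print(ph_is_valid)
```

A single pH that sits below the lowest pI in the set can be shared by all ligands, which is why one value was used for both proteins.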

Test

During chip coupling, concentration gradients of SUMO1 and CyPet-SUMO1 were prepared. Running the program gave the raw SUMO1–Ubc9 binding data, and the fitted Kd value was 8.267 μM.

Affinity Assay of SUMO1 and Ubc9

Affinity Assay of CyPet-SUMO1 and YPet-Ubc9

The raw CyPet-SUMO1–YPet-Ubc9 binding data were obtained, and the fitted Kd value was 1.329 μM.

Learn

The measured Kd deviated significantly from the theoretical value, and curve fitting lacked sufficient reliability, indicating errors in the assay system. Based on the experimental details, the issues were likely caused by inappropriate buffer selection and inaccuracies in preparing the concentration gradient. Therefore, we concluded that reliable affinity data requires further optimization of buffer conditions and improvement of gradient preparation.
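To make clear what a fitted Kd depends on, the sketch below fits the steady-state 1:1 binding model R = Rmax·C/(Kd + C) by a simple linearization (C/R plotted against C). It is a stand-in for illustration only: the instrument software uses its own fitting routines, and the concentrations and responses here are synthetic values generated from assumed parameters, not our measurements.

```python
def fit_kd_hanes_woolf(concs, responses):
    """Fit R = Rmax*C/(Kd + C) by linear regression of C/R against C.

    Returns (kd, rmax) in the units of concs and responses.
    """
    xs = list(concs)
    ys = [c / r for c, r in zip(concs, responses)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # equals 1 / Rmax
    intercept = mean_y - slope * mean_x    # equals Kd / Rmax
    return intercept / slope, 1.0 / slope

# Synthetic, noise-free responses from assumed parameters (not measured data)
true_kd, true_rmax = 8.0, 100.0            # uM, RU
concs = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]   # uM
responses = [true_rmax * c / (true_kd + c) for c in concs]

kd, rmax = fit_kd_hanes_woolf(concs, responses)
print(round(kd, 3), round(rmax, 1))
```

Because Kd is derived from both the slope and the intercept, errors in the prepared concentrations propagate directly into the fitted value, which is consistent with the problems we identified in this cycle.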

Cycle 2: Preliminary Experiment Optimization

Design

Errors in the previous round of experiments reduced the credibility of the affinity data. We therefore optimized the experimental operations and repeated the experiment to obtain a more accurate Kd value for this positive-control pair.

Build

We employed a gradient dilution method to establish a concentration gradient to minimize errors, and adjusted the highest concentration to more effectively cover the concentration range corresponding to Kd.
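A minimal sketch of how such a dilution series can be planned is shown below. The top concentration, the two-fold step, the number of points, and the coverage margin are hypothetical choices for illustration, not the values used in our experiment.

```python
def serial_dilution(top_conc, fold, n_points):
    """Concentrations from highest to lowest for a fixed-fold serial dilution."""
    return [top_conc / fold ** i for i in range(n_points)]

def covers_kd(concs, kd, margin=2.0):
    """True if the series extends at least `margin`-fold below and above kd."""
    return min(concs) <= kd / margin and max(concs) >= kd * margin

# Hypothetical plan: 64 uM top concentration, two-fold steps, 8 points
series = serial_dilution(64.0, 2.0, 8)
print(series)
print(covers_kd(series, kd=8.0))
```

Each point is prepared from the previous one, so a pipetting error affects the whole series in a consistent direction instead of scattering independently, and the check confirms that the expected Kd falls well inside the tested range.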

Test

We prepared the mobile-phase protein concentration gradient and kept the remaining operations consistent with the previous cycle. After running the program, we obtained the affinity values from the fitted curves.

Improved Affinity Assay of SUMO1 and Ubc9

Improved Affinity Assay of CyPet-SUMO1 and YPet-Ubc9

Learn

With improved gradient dilution and concentration range coverage, the quality control score increased significantly, and the affinity fitting results became more stable and reliable. This demonstrated that proper gradient design and careful operation are critical for obtaining accurate Kd values in SPR assays. Through this optimization, we determined the concentration gradients and operational procedures required for subsequent affinity assays, providing a solid reference for large-scale screening.

Cycle 3: High-Throughput Screening for Affinity Binders

Design

We have verified the feasibility of determining affinity using SPR technology through the interaction between SUMO1 and Ubc9. Next, we need to preliminarily screen samples from the large number of binding proteins we have expressed, aiming to identify those with high affinity and specificity for GZMK.

Build

Based on the properties of GZMK and the binding proteins, we chose GZMK as the stationary phase and passed the binding proteins over the chip one after another as mobile phases. Since the isoelectric point of GZMK is 9.47, we selected pH 5.0 for protein coupling.

Test

We coupled GZMK to a new chip, prepared the 27 purified binding protein samples as mobile phases at the same concentration, and ran the program. Among the 27 samples, 1-6 and 1-24 gave positive RU values with highly reliable kinetic data.

Preliminary Screening Results for Binder 1–6

Preliminary Screening Results for Binder 1–24

Learn

Among the 27 candidate proteins screened, only 1-6 and 1-24 exhibited clear response signals with reliable kinetic data, while the others showed no effective binding. These results demonstrated that SPR technology can efficiently identify potential binders from a large sample set, guiding subsequent precise affinity measurements.

Cycle 4: Obtaining Accurate Affinity Values

Design

Binders 1-6 and 1-24 showed notable affinity in the preliminary screen, so we designed experiments to obtain more accurate Kd values, both to determine their suitability for colloidal gold test strip construction and to guide subsequent protein design.

Build

The pH value of the coupling protein was set to 5.0, and GZMK was selected as the stationary phase protein. The concentration gradients of the two mobile phase proteins were prepared using a gradient dilution method, and the program was run.

Test

We prepared the concentration gradients of the two protein samples according to the procedure and ran the instrument.

The fitted Kd value of binder 1-6 was 6.863 μM.

Accurate Affinity Measurement of Binder 1–6

The graph for binder 1-24 showed negative response units (RU), indicating invalid data.

Accurate Affinity Measurement of Binder 1–24

Learn

For binder 1-6, the affinity measurement yielded a stable Kd of 6.863 μM with high reliability, indicating promising potential. In contrast, the measurement of binder 1-24 produced negative response signals, rendering the data invalid. Based on analysis, this anomaly was likely caused by nonspecific binding interference. Therefore, we decided to adjust buffer conditions in subsequent experiments to confirm and eliminate nonspecific effects.

Cycle 5: Elimination of Nonspecific Binding

Design

To optimize experimental results and obtain accurate affinity data, it is necessary to eliminate the influence of non-specific binding on instrumental analysis. We suspect that the occurrence of non-specific binding is related to the salt concentration of the buffer solution, hence the focus of our experimental design lies in varying the salt concentration of the buffer solution.

Build

The original SPR Running Buffer formulation consisted of 1X PBS, 3 mM EDTA, 0.05% Tween-20, and 1 M NaCl. We attempted to reduce the salt concentration of the buffer, designing a new buffer formulation: 1X PBS, 3 mM EDTA, 0.05% Tween-20, and 363 mM NaCl.
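As a preparation aid, the sketch below converts the two NaCl concentrations into masses to weigh out. It assumes the stated concentrations refer to NaCl added on top of the 1X PBS base (the text does not say whether the NaCl already in PBS is counted), and the 0.5 L batch volume is a hypothetical example.

```python
NACL_MW = 58.44  # g/mol, molar mass of NaCl

def nacl_grams(conc_mM, volume_L):
    """Mass of NaCl (g) needed for conc_mM in volume_L.

    Ignores any NaCl already contributed by the PBS base (assumption).
    """
    return conc_mM / 1000.0 * volume_L * NACL_MW

batch_volume_L = 0.5                      # hypothetical batch size
original_g = nacl_grams(1000, batch_volume_L)   # original buffer, 1 M
reduced_g = nacl_grams(363, batch_volume_L)     # new buffer, 363 mM

print(round(original_g, 2), round(reduced_g, 2))
```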

Test

An accurate and highly reliable binding affinity was obtained for binder 1-24: Kd = 0.4033 μM.

Improved Affinity Assay of Binder 1–24

Learn

By reducing the salt concentration in the buffer, we eliminated the previously observed nonspecific binding signals and successfully obtained an accurate affinity value for binder 1-24, with Kd=0.4033 μM. This experiment demonstrated that buffer ionic strength is a key factor influencing specific binding versus background noise. We ultimately established a buffer system suitable for measuring GZMK binders, which will serve as the standard condition for all subsequent affinity assays.

Innovative Colloidal Gold Test Strip Development

Cycle 1: Determination of Optimal Coupling pH

Design

As mentioned above, we obtained a pair of interacting proteins as a positive control. We aim to use this protein pair to test the preparation process of colloidal gold test strips and to serve as a quality control line to assess the validity of the test strips.

Build

The first step in preparing colloidal gold test strips is to conjugate the target protein with colloidal gold, which requires determining the optimal coupling pH. A pH that is too high or too low can alter the charge, leading to unsuccessful conjugation.

Test

We took a small amount of colloidal gold and adjusted the pH to 6, 7, 8, and 9, respectively. Excess SUMO1 protein was added, followed by a specific concentration of NaCl to induce aggregation of unconjugated colloidal gold.

Results for Determining the Optimal Coupling pH

It was observed that at pH 6 or 9, the colloidal gold turned blue and formed precipitates. At pH 7 or 8, the color remained nearly identical to that of the initial colloidal gold.

Learn

At pH 6 or 9 the solution turned blue with precipitates, indicating colloidal gold aggregation and insufficient protein adsorption. At pH 7 or 8 the color matched the starting colloid, indicating a dispersed state and successful coating. We therefore defined the optimal window as 7 to 8 and set pH 7.5 for subsequent work.
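The selection of pH 7.5 can be reproduced with a few lines. This sketch simply takes the midpoint of the pH values at which the colloid stayed dispersed; the function name is our own, and it assumes the stable pH values form one contiguous window.

```python
# Salt challenge observations: True means the colloid kept its starting color
stayed_dispersed = {6: False, 7: True, 8: True, 9: False}

def stable_window(observations):
    """Return (low, high, midpoint) of the pH values where the colloid stayed dispersed."""
    stable = sorted(ph for ph, ok in observations.items() if ok)
    low, high = stable[0], stable[-1]
    return low, high, (low + high) / 2

low, high, midpoint = stable_window(stayed_dispersed)
print(low, high, midpoint)
```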

Cycle 2: Determination of Optimal Coupling Amount

Design

After determining the optimal coupling pH, the optimal coupling amount needs to be established to minimize the waste of protein and colloidal gold.

Build

We will take a small amount of colloidal gold, adjust it to the optimal pH, and observe whether the colloidal gold is fully conjugated under different coupling amounts.

Test

We took a small amount of colloidal gold, adjusted the pH to 7.5, and added 10, 20, 30, and 40 μg of SUMO1 protein, respectively. A specific concentration of NaCl was then added to induce aggregation of unconjugated colloidal gold.

Results for Determining the Optimal Coupling Amount

It was observed that for all tested coupling amounts, the color of the colloidal gold remained nearly identical to the initial color.

Learn

In the salt challenge assay, the color remained unchanged from the starting colloid across 10 to 40 μg, showing that every dose reached the protective threshold and that even the lowest dose was sufficient. To balance cost and stability, we chose the minimal effective amount plus a small safety margin and set the dose to 11 μg of SUMO1 per mL of colloidal gold.

Cycle 3: Conjugation of SUMO1 with Colloidal Gold

Design

After determining the optimal coupling pH and amount, we need to scale up the process for formal conjugation.

Build

We will scale up the volume of colloidal gold used for conjugation, following the optimal coupling pH and amount, to prepare test strips.

Test

We took 20 mL of colloidal gold, adjusted the pH to 7.5, and added 220 μg of SUMO1, stirring thoroughly. Excess BSA was then added to block unbound sites. After thorough centrifugation, the pellet was collected and resuspended in a mild buffer to obtain the conjugated colloidal gold.
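The scale-up arithmetic follows directly from the chosen dose of 11 μg of SUMO1 per mL of colloidal gold. The sketch below reproduces it; the stock concentration used to convert mass into pipetting volume is a hypothetical value.

```python
def protein_needed_ug(dose_ug_per_ml, gold_volume_ml):
    """Total protein mass (ug) for a given dose and colloidal gold volume."""
    return dose_ug_per_ml * gold_volume_ml

def stock_volume_ul(mass_ug, stock_mg_per_ml):
    """Volume of protein stock (uL) that contains mass_ug; 1 mg/mL equals 1 ug/uL."""
    return mass_ug / stock_mg_per_ml

sumo1_ug = protein_needed_ug(11, 20)         # 11 ug/mL dose, 20 mL of colloidal gold
volume_ul = stock_volume_ul(sumo1_ug, 2.0)   # hypothetical 2 mg/mL SUMO1 stock

print(sumo1_ug, volume_ul)
```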

Prepared Colloidal Gold

Learn

After scale-up the solution retained a stable red color without visible precipitation, and resuspension after centrifugation and blocking was smooth, indicating uniform coating and stable colloidal gold. This confirms the small-to-large process is transferable and ready for strip assembly and functional testing.

Cycle 4: Testing the Quality Control Line

Design

Conventional colloidal gold test strips typically use antibodies to achieve protein-protein binding. We replaced antibodies with smaller binders and modified the binding logic from target protein–primary antibody–secondary antibody to two binders capable of simultaneously binding the target protein.

Build

The colloidal gold test strip consists of the following components: a substrate, a sample pad, a conjugate pad, an NC membrane (bearing the detection and quality control lines), and an absorbent pad. Having conjugated SUMO1 with colloidal gold, we immobilized Ubc9 on the NC membrane to form the quality control line.

Test

Two groups of materials were prepared, with 1 μL and 10 μL of SUMO1-colloidal gold, respectively. A 1 μL Ubc9 solution was applied as a thin line on the NC membrane. After assembling and drying the test strips, 40 μL of PBS was added to the sample pad, and results were observed after 10–15 minutes.

Results of Quality Control Line Testing for Colloidal Gold Test Strips

The leftmost strip, with 1 μL SUMO1-colloidal gold, showed a faint band. The middle strip, with 10 μL SUMO1-colloidal gold, displayed a distinct band. The rightmost strip was a commercially available COVID-19 test strip.

Learn

The low-dose strip showed a faint band and the high-dose strip showed a strong band, giving an input-dependent response. A visible line formed only at the Ubc9 position, indicating specific binding to SUMO1. The band clarity matched a commercial control, confirming that the control line design is reliable for validating strip performance.

Cycle 5: Construction of Test Line and Testing of the Entire Test Strip

Design

We modified the existing principle of colloidal gold test strips, changing the binding mode from target protein–primary antibody–secondary antibody to Binder 1–target protein–Binder 2. For the test line, we conjugated the screened Binder 1-6 with colloidal gold and immobilized a second binder, Aprotinin, on the test line so that the colloidal gold is captured there and develops color when GZMK is present.

Build

Following the same steps as before, we first determined the optimal pH and coupling amount for conjugating Binder 1-6 with colloidal gold. We then formally conjugated Binder 1-6 with colloidal gold and assembled the entire test strip using the same method as previously described.

Test

We prepared the substrate, sample pad, conjugate pad, NC membrane, and absorbent pad. On the conjugate pad, 4 μL of SUMO1-colloidal gold and 4 μL of Binder 1-6-colloidal gold were applied. On the NC membrane, 0.5 μL of Ubc9 solution was applied at the control line (C line) position and 0.5 μL of Aprotinin solution at the test line (T line) position, each as a thin line. The test strip was then assembled, taking care to maintain the correct layering. For the blank sample, 30 μL of PBS was applied to the sample pad; for the test sample, 30 μL of GZMK solution was applied. Bands were observed after 10 minutes.

Colloidal Gold Test Strip Results

The left strip, representing the blank sample, showed only a single control line. The right strip, representing the test sample, displayed both a test line and a control line, indicating normal test strip functionality and results consistent with expectations.

Learn

The blank sample produced only the control line, while the test sample produced both test and control lines, indicating that in the presence of GZMK the target bridges Binder 1-6 and Aprotinin, capturing the gold conjugate at the test line and producing color. The contrast with the blank shows that the test-line band arises from specific recognition of GZMK. This validates the antibody-free binder design on a full strip and provides a foundation for further optimization of sensitivity and specificity.

Structure Prediction

Cycle 1

Design

Our primary goal was to obtain the active conformation of GZMK, as an accurate active-site structure is critical for studying inhibitor binding modes and ensuring the success of virtual screening. The only published crystal structure is of a pro-granzyme K S195A variant, designed to prevent autocatalytic cleavage. This conformation exhibits extremely low catalytic efficiency and fails to represent the native active state. Therefore, we decided to use the AlphaFold Server to predict the structure of wild-type GZMK, aiming to obtain a model that more closely resembles its physiological conformation.

Build

We submitted the amino acid sequence of GZMK to the AlphaFold Server and obtained five candidate models. According to the provided confidence metrics, the best model achieved a pTM (predicted TM-score) of 0.92, indicating high confidence in its global topology. We exported this predicted structure for subsequent alignment, visualization, and preparation for docking.

Test

We used Maestro to perform a structural alignment of the AlphaFold model with the reported pro-enzyme crystal structure. The alignment revealed a high degree of overlap in the backbone trace and secondary structure arrangement, demonstrating strong overall topological consistency. This agreement supports the reliability of our AlphaFold model at the level of the overall fold.

Learn

Through this cycle, we successfully obtained an active GZMK protein structure model with high global confidence. This model provides a reasonable initial definition of the active site's geometry and serves as a solid structural foundation for subsequent pocket definition, grid generation, and ligand screening.

Cycle 2

Design

Considering that post-translational modifications, particularly glycosylation, could alter the protein's surface chemical environment and interfere with the selection of binding hotspots, we planned to first predict potential N-glycosylation sites on GZMK. This step aimed to avoid mistakenly selecting regions that might be modified or conformationally unstable as key interaction sites.

Build

We used the Re-Glyco online tool from the GlycoShape platform to perform an in silico prediction of N-glycosylation sites on the AlphaFold-generated GZMK structure. To ensure reproducibility and standardization, we uploaded the PDB model and ran the prediction workflow using default parameters.

Ives, C.M., Singh, O. et al. Restoring protein glycosylation with GlycoShape. Nat Methods (2024).

Test

After the computation, we reviewed the output, focusing on the prediction scores and solvent accessibility of potential N-X-S/T motifs. The results detected no reliable N-glycosylation sites on the mature GZMK model (with the signal peptide removed), indicating that our selection of surface hotspots would not be significantly constrained by N-glycosylation.
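For readers unfamiliar with the N-X-S/T motif, the sketch below shows a sequence-only scan for such sequons (X being any residue except proline). It is a simple pre-check and not the structure-based Re-Glyco workflow we ran; the toy sequence is hypothetical and is not the GZMK sequence.

```python
import re

# N-X-S/T sequon with X != P; the lookahead lets overlapping motifs be reported
SEQUON = re.compile(r"(?=(N[^P][ST]))")

def find_sequons(sequence):
    """Return (1-based position of N, motif) for every N-X-S/T sequon."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(sequence.upper())]

# Hypothetical toy sequence for illustration only
toy_sequence = "MKNGSAANPTLLNNTSQ"
hits = find_sequons(toy_sequence)
print(hits)
```

A motif found this way is only a candidate site; whether it can actually carry a glycan also depends on solvent accessibility in the folded structure, which is what the structure-based prediction evaluates.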

Learn

The prediction ruled out interference from N-glycosylation in our target site selection. This allowed us to focus on hotspot identification strategies driven by structure and energetics, thereby streamlining the subsequent definition of small-molecule pockets and protein binding interfaces while reducing uncertainty from potential modifications.

Cycle 3

Design

While AlphaFold is highly reliable for backbone topology prediction, the side-chain orientations and local flexible regions in its models often lack high-resolution accuracy. To obtain a conformation that more closely represents the true physiological state, we decided to introduce Molecular Dynamics (MD) simulation for further structural relaxation and refinement.

Build

We performed a 100 ns MD simulation on the AlphaFold-predicted GZMK model using the Desmond module from the Schrödinger suite under near-physiological conditions. The simulation parameters were as follows:

Temperature: 310 K (near physiological temperature)
Ionic strength: 0.15 M NaCl (with Na⁺ ions to neutralize the system)
Force field: S-OPLS
Simulation time: 100 ns

After the simulation, we performed conformational clustering on the trajectory and selected the representative structure from the most stable cluster as the final model for all subsequent computational work.

Test

Root Mean Square Deviation (RMSD) analysis showed that the GZMK backbone fluctuation remained within 3.8 Å throughout the 100 ns simulation, indicating a highly stable overall conformation. Furthermore, key active site residues did not undergo significant displacement, preserving the critical geometry of the initial model.
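The RMSD reported above is, for each frame, the root of the mean squared displacement of the selected atoms from a reference structure. The sketch below shows that calculation on hypothetical coordinates. It omits the superposition step that trajectory analysis tools normally apply before measuring the deviation.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equally ordered lists of (x, y, z) tuples, without superposition."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Hypothetical three-atom example: every atom shifted by 1 angstrom along x
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
frame = [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0), (4.0, 0.0, 0.0)]

value = rmsd(reference, frame)
print(value)
```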

Learn

Through MD simulation, we obtained a GZMK structure that is dynamically more stable and conformationally more reasonable. This model is not only globally stable but also features local side-chain arrangements and pocket geometries that better represent its native state. This fully refined structure provided a solid and reliable foundation for subsequent ligand docking, pocket energetic analysis, and inhibitor screening.

Protein Binder Design

Cycle 1

Design

We employed a hotspot-guided de novo protein design strategy. This approach aims to ensure high geometric complementarity at the binder-GZMK interface and to provide potential for future affinity maturation. The primary goal of this cycle was to generate binders with detectable affinity, laying the groundwork for developing a dual-epitope pair required for a sandwich assay.

Build

Our computational workflow was as follows: First, we identified and selected residue clusters on the GZMK surface as hotspots by evaluating their Spatial Aggregation Propensity (SAP). Next, we used RFdiffusion to generate candidate protein backbones complementary to these hotspots, followed by sequence design using ProteinMPNN. Finally, we performed iterative rounds of structure prediction and energy scoring with AlphaFold2 and Rosetta to filter for the most stable designs. This pipeline yielded 30 binder candidates targeting 8 distinct hotspots.

Test

Wet-lab validation successfully identified two binders with detectable affinity for GZMK. Although both showed promising predicted binding modes, they were found to target the exact same hotspot. This prevented them from forming the non-competing, dual-epitope pair required for a sandwich assay.

Learn

This cycle validated the feasibility of our de novo design pipeline but also highlighted a critical limitation: the lack of epitope diversity. To meet the requirements of a sandwich assay, it became imperative to design a stable binder targeting a second, distinct hotspot on GZMK.

Cycle 2

Design

To explore ways of increasing the design success rate, we attempted a template-based strategy. The plan was to use a known natural inhibitor protein as a structural scaffold and re-engineer its interface to confer specificity for GZMK. This approach aimed to balance the inherent stability of a natural protein with engineered target specificity.

Build

During backbone generation, we used a partial-diffusion approach to preserve key structural fragments of the native inhibitor protein. In the sequence design phase, we used ProteinMPNN to replace native cysteine (Cys) residues, aiming to mitigate the risks of misfolding and sample heterogeneity caused by unintended disulfide bond formation during in vitro expression.

Test

The computational evaluation was discouraging. The predicted interface PAE scores from AlphaFold were generally above 20 Å, indicating significant uncertainty in the binding interface geometry. We inferred that the native disulfide bonds of the template protein were crucial for maintaining its overall scaffold rigidity and stability; simply removing them compromised the designs' foldability. Furthermore, Molecular Dynamics (MD) simulations revealed that these binders contained excessive flexible loops and exhibited large-scale motions over 100 ns, failing to maintain a stable conformation.
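One common way to summarize interface PAE is to average the predicted aligned error over all residue pairs that span the two chains. The sketch below illustrates this on a hypothetical 4x4 matrix; the residue ordering (target first, then binder) and the use of 20 Å as a pass/fail cutoff are assumptions for illustration, not our exact filtering script.

```python
def mean_interface_pae(pae, n_target):
    """Mean PAE over residue pairs with one residue in each chain.

    pae is a square matrix (list of lists) ordered target residues first,
    then binder residues; n_target is the number of target residues.
    """
    n = len(pae)
    values = []
    for i in range(n):
        for j in range(n):
            if (i < n_target) != (j < n_target):  # pair spans the two chains
                values.append(pae[i][j])
    return sum(values) / len(values)

# Hypothetical matrix: 2 target residues followed by 2 binder residues
toy_pae = [
    [1.0, 2.0, 22.0, 24.0],
    [2.0, 1.0, 21.0, 23.0],
    [25.0, 20.0, 1.0, 2.0],
    [26.0, 19.0, 2.0, 1.0],
]

score = mean_interface_pae(toy_pae, n_target=2)
passes = score < 20.0   # illustrative cutoff
print(score, passes)
```

Low values inside each chain combined with high values between chains, as in this toy matrix, correspond to the situation described above: each protein is predicted confidently on its own, but their relative placement is uncertain.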

(Comparison: MD simulation of a successful, stable binder)

Learn

This strategy failed to produce stable binder structures in the absence of their native disulfide bonds. This led us to revert to the proven de novo pipeline from Cycle 1 and focus our efforts on targeting a second, distinct hotspot to ensure the final delivery of a functional dual-site binding solution for the sandwich assay.

Cycle 3

Design

Drawing lessons from Cycle 1, we adopted a more refined four-residue combination strategy for hotspot definition. We carefully selected four new hotspots that were spatially distinct from the first successful one, aiming to create entirely new interfaces with different geometric fits and solvent accessibility profiles.

Build

We replicated the established workflow from Cycle 1: generating backbones with RFdiffusion, designing sequences with ProteinMPNN, and performing rigorous structural confidence and energy-based filtering using AlphaFold2 and Rosetta. This process yielded a new set of candidate binders targeting the newly defined hotspots.

Test

The computational screening results were highly encouraging. The overall distribution of key metrics (such as pAE, ΔΔG, and SASA) for the new candidate pool was very similar to that of the successful batch from Cycle 1. This filtering process yielded a set of binders with excellent scores across all computational criteria.

Learn

At the computational level, we have successfully generated a pool of promising binder candidates for a second, distinct hotspot. This provides the crucial molecular components needed for a sandwich assay pair. The next critical step will be to validate their binding affinity, epitope independence, and complex stability through biophysical and functional experiments, such as SPR.

Small-Molecule Screening

Cycle 1

Design

As GZMK is a serine protease, our core hypothesis was that its enzymatic activity could be effectively inhibited by small molecules that competitively occupy the active site pocket housing the catalytic triad (H41–D90–S188). Therefore, the objective of this cycle was to identify potential competitive inhibitors through an active-site-based virtual screening campaign.

Build

We established a docking workflow using the Schrödinger Glide module. The prepared GZMK structure (at a simulated pH of 7.6±0.5) was used to screen two compound libraries: Specs-21 (~20,000 compounds) and the L1000 library of approved drugs (1813 compounds). A docking grid was centered on the catalytic triad. Finally, the top 20 hits from each library were re-ranked using MM/GBSA binding free energy calculations (OPLS4 force field, VSGB solvent model).
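Centering a docking grid on the catalytic triad amounts to taking the geometric center of representative atoms from the three residues. The sketch below shows that calculation; the coordinates are hypothetical placeholders, and in practice the grid was defined in the Schrödinger interface.

```python
def centroid(points):
    """Geometric center of a list of (x, y, z) tuples."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))

# Hypothetical coordinates (angstrom) standing in for one atom each of H41, D90 and S188
triad_atoms = {
    "H41": (10.0, 4.0, -2.0),
    "D90": (13.0, 7.0, -1.0),
    "S188": (7.0, 1.0, 0.0),
}

grid_center = centroid(list(triad_atoms.values()))
print(grid_center)
```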

Test

The docking results were promising, with the top-scoring molecule from the Specs-21 library achieving a docking score of −8.574, and the best from the approved drug library scoring −7.915. The MM/GBSA calculations provided effective re-ranking, revealing that several top candidates formed reasonable hydrogen bonds and hydrophobic interactions within the active site.

Title	Docking Score (kcal/mol)	MMGBSA ΔG Bind (kcal/mol)
T0138	-6.905	-62.44
T0179	-6.613	-62.37
T0138	-7.457	-59.91
T0125	-6.634	-59.31
T1020	-6.634	-59.31
T2815	-6.674	-58.66
T0795	-7.915	-57.72
T0346	-6.651	-55.05
T0795	-6.806	-51.69
T0138	-6.618	-49.89
T0827	-6.637	-48.62
T1218	-6.703	-46.51
T0966	-6.683	-44.97
T0453	-7.007	-43.59
T0279	-7.315	-41.00
T0307	-6.826	-40.11
T1339	-6.914	-39.02
T0469	-7.234	-36.97
T1320	-7.566	-35.40
T1026	-6.935	-29.92

Title	Docking Score (kcal/mol)	MMGBSA ΔG Bind (kcal/mol)
T0795	-7.638	-64.20
T1218	-6.738	-60.95
T0827	-5.639	-59.53
T0138	-8.139	-59.49
T2815	-6.855	-57.92
T0138	-7.769	-52.72
T2391	-4.016	-52.50
T0795	-6.009	-49.69
T0346	-3.831	-47.35
T2392	-3.722	-45.44
T1339	-7.203	-39.69
T0966	-4.910	-39.52
T0453	-5.296	-32.16
T0279	-5.552	-32.13
T3359	-2.199	-28.09
T0307	-4.940	-24.38
T1320	-5.114	-22.64
T0469	-4.590	-21.37
T1026	-4.815	-15.30
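The effect of MM/GBSA re-ranking can be seen by sorting the same entries on each column. The sketch below uses three rows copied from the second table above; more negative values are treated as better for both metrics.

```python
# (compound ID, docking score, MM/GBSA dG bind) copied from the second table above
rows = [
    ("T0795", -7.638, -64.20),
    ("T1218", -6.738, -60.95),
    ("T0138", -8.139, -59.49),
]

# Ascending sort puts the most negative (most favorable) value first
by_docking = [cid for cid, dock, _ in sorted(rows, key=lambda r: r[1])]
by_mmgbsa = [cid for cid, _, dg in sorted(rows, key=lambda r: r[2])]

print(by_docking)
print(by_mmgbsa)
```

The two orderings differ, which illustrates why the docking score alone was not used as the final ranking criterion.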

Learn

This initial screen validated the feasibility of our active-site-based virtual screening approach and yielded a set of potential lead compounds. However, the fact that the experimentally validated hits did not rank highly in this screen highlighted the potential limitations of rigid docking in accurately capturing the true binding mode and accounting for receptor flexibility.

Cycle 2

Design

Given the poor docking scores of the experimental hits, we proposed an "allosteric regulation" hypothesis: these molecules might inhibit enzyme activity indirectly by binding to a pocket outside the active site. To test this, the goal of this cycle was to systematically identify potential allosteric pockets on the GZMK surface and evaluate the binding of our experimental hits within them. (Note: Due to cost and time constraints, all subsequent computational validation focused on the L1000 drug library.)

Build

We used the Schrödinger SiteMap tool to probe the surface of GZMK, which identified three potential binding pockets. We then re-docked the experimental hits into each of these newly identified sites to assess their binding potential and pose stability.

Test

One pocket identified by SiteMap was located adjacent to the catalytic triad. Although the experimental hits achieved their best allosteric poses in this pocket, their docking scores were only around −4.016, significantly weaker than their scores in the active site. While their MM/GBSA energies were comparable to some active site binders, the binding poses were not convincing.

Learn

The results did not support the existence of a functional allosteric pocket. This negative finding guided our attention toward another critical factor: the intrinsic flexibility of the protein itself likely plays a dominant role in ligand binding. We concluded that the dynamic nature of the receptor must be taken into account.

Cycle 3

Design

Building on the insights from the first two cycles, we formulated a new core hypothesis: "induced fit." We posited that the experimental hits achieve their potent binding by inducing a conformational change in the active site, a phenomenon that rigid docking cannot capture. The goal of this cycle was to test this hypothesis using Induced Fit Docking (IFD).

Build

We selected the top 15 virtual hits from Cycle 1 and the 3 experimental hits for IFD analysis. The docking grid was centered on the pose of a high-scoring virtual hit (T0138), and the side chains of residues within the pocket were allowed to move flexibly during the docking process.

Test

The IFD results were striking. The docking scores of two experimental hits dramatically improved to −9.545 and −9.476. By allowing receptor flexibility, the resulting binding poses were far more reasonable, forming a stable network of interactions including hydrogen bonds, a salt bridge, and π-π stacking.

(Comparison shows the binding pose after IFD (left) is more favorable than with rigid docking (right))

Learn

The results strongly supported the induced fit hypothesis. However, IFD provides a static optimal snapshot. We still needed to validate the stability of this induced-fit complex in a dynamic environment using molecular dynamics simulation.

Cycle 4

Design

To definitively confirm the reliability and stability of the binding mode predicted by IFD, this cycle was designed to use Molecular Dynamics (MD) simulation to examine the dynamic behavior of the protein-ligand complex in a simulated aqueous environment.

Build

Using Schrödinger Desmond (S-OPLS force field), we ran 100 ns MD simulations for the top IFD poses of the virtual hit T0138 and the two experimental hits. Crucially, we also simulated the initial rigid-docking pose of Nafamostat (T2392) as a control. To ensure robustness, the simulation of the Nafamostat IFD pose was repeated multiple times.

Test

Trajectory analysis showed that all complexes formed from IFD poses remained stable. The most critical finding was that the initial rigid-docking pose of Nafamostat (T2392) spontaneously transitioned during the relaxation phase into a stable conformation that was nearly identical to the IFD-predicted pose (forming a key hydrogen bond with H41 and a salt bridge with D182). This stable pose was maintained for the remainder of the simulation.
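One simple way to quantify whether an interaction such as the hydrogen bond with H41 persists is its occupancy, the fraction of trajectory frames in which the donor-acceptor distance stays under a cutoff. The sketch below uses hypothetical distances and an assumed 3.5 Å cutoff; it is not output from our trajectories.

```python
def contact_occupancy(distances, cutoff):
    """Fraction of frames in which the distance is at or below the cutoff."""
    return sum(d <= cutoff for d in distances) / len(distances)

# Hypothetical donor-acceptor distances (angstrom), one value per frame.
# The first frames mimic a pose that relaxes into the contact.
hbond_distances = [4.8, 4.1, 3.3, 2.9, 3.0, 3.1, 2.8, 3.2, 3.0, 2.9]

occupancy = contact_occupancy(hbond_distances, cutoff=3.5)
print(occupancy)
```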

Learn

The MD simulations strongly support the conclusion that the experimental hits act via an induced fit mechanism, which explains why traditional rigid docking failed to identify them effectively. We concluded that while the combined IFD+MD approach is powerful, its high computational overhead makes it impractical for large-scale primary screening.

Cycle 5

Design

Given GZMK's induced fit nature and the high cost of the IFD+MD protocol, we designed a novel workflow to balance efficiency and accuracy: a "Similarity-Search-Driven IFD-MD" pipeline. The goal was to create an efficient process for discovering potent inhibitors from large chemical databases.

Build

The new two-step workflow is as follows: First, using known serine protease inhibitors (e.g., Nafamostat, PMSF) as query "seeds," we perform a rapid chemical similarity search using our tool, GeminiMol, to enrich for high-potential candidates from a large library. Second, only this small, enriched subset of top-scoring molecules is subjected to the computationally expensive IFD+MM/GBSA+MD pipeline for rigorous validation and ranking.

1.0 → PMSF: Known to completely inhibit GZMK.
0.8 → AEBSF: Water-soluble, irreversible inhibitor.
0.7 → Nafamostat: Clinical drug, potent serine protease inhibitor.
0.6 → Camostat, Benzamidine: Weaker, but reported to inhibit GZMK.
0.5 → Gabexate, 3,4-DCI: Broad-spectrum protease inhibitors with some relevance.

SMILES	Label	Inhibitor	Notes
O=S(F)(=O)Cc1ccccc1	1.0	PMSF	Known to completely inhibit GZMK
FS(=O)(=O)c1ccc(cc1)CCN	0.8	AEBSF	Water-soluble, irreversible inhibitor
NC(=N)c1ccccc1	0.6	Benzamidine	Weak activity, but literature suggests inhibition of GZMK
ClC1=C(Cl)c2ccccc2C(=O)O1	0.5	3,4-DCI	Broad-spectrum protease inhibitor, partial relevance
N=C(N)c1ccc2cc(OC(=O)c3ccc(N=C(N)N)cc3)ccc2c1	0.7	Nafamostat	Clinical injection drug, strong serine protease inhibitor
CN(C)C(=O)COC(=O)CC1=CC=CC=C1OC(=O)C2=CC=CC=C2N=C(N)N	0.6	Camostat	Weak activity, but literature suggests inhibition of GZMK
O=C(Oc1ccc(cc1)C(=O)OCC)CCCCC/N=C(N)N	0.5	Gabexate	Broad-spectrum protease inhibitor, partial relevance

Test

We applied this new workflow to the L1000 drug library. The workflow ranked two of the experimentally validated hits within its top 10 predictions, supporting the efficiency and accuracy of the pipeline on this library. We subsequently applied the workflow to the much larger Specs library and performed IFD and MM/GBSA calculations on the top 30 hits to generate a further list of candidates.

SMILES	ID	w_1.0	w_0.8	w_0.6	w_0.5	w_0.7	Score
Cl.NCC1=CC=C(C=S1)S(N)(=O)=O	T1026	0.41429	0.62460	0.38088	0.26535	0.43794	2.1230
NS(=O)(=O)C1=CC=C(C=C1)C(O)=O	T0398	0.48743	0.54954	0.35754	0.28610	0.43834	2.1189
NCC1=CC=C(C=C1)C(O)=O	T1252	0.45990	0.62544	0.40630	0.23729	0.37817	2.1071
CS(O)(=O)=O.CN(C)C(=O)COC(=O)CC1=CC=CC=C1	T2391	0.27359	0.39438	0.44614	0.43391	0.54197	2.0900
O.[Na].CC(C)NS(=O)(=O)C1=CC=C(C(N)C=C1	T0087	0.48172	0.59416	0.32486	0.28470	0.39247	2.0779
OC(=O)C1=CC=C(C=C1)S(=O)(=O)N(C)C	T0037	0.60851	0.60449	0.22529	0.28336	0.31356	2.0352
NC1=CC=C(C=C1)S(N)(=O)=O	T0123	0.36788	0.56466	0.37540	0.27302	0.45233	2.0333
CS(O)(=O)=O.CS(O)(=O)=O.NC(N)=NC1=CC=CC=C1	T2392	0.22591	0.39310	0.37731	0.35836	0.66217	2.0168
CS(=O)(=O)C1=CC=C(C=C1)C(O)C(O)NC(=O)	T1550	0.52616	0.58203	0.28635	0.31276	0.29499	2.0023
COC(CN)=O	T0051	0.47927	0.42985	0.38996	0.32663	0.37379	1.9995
OCCS(O)(=O)=O.OCCS(O)(=O)=O.NC(=O)C1=CC=CC=C1	T1654	0.32289	0.40699	0.40953	0.37532	0.45566	1.9704
CC(CN(C)C)OC.CC(CN(C)C)OC.CC(CN(C)C)OC	T3614	0.56355	0.51427	0.35565	0.24139	0.28466	1.9595
NC(C)NS(=O)C1=CC=CC(=C1)C(N)C	T0241	0.24465	0.50061	0.37088	0.32622	0.51583	1.9538
Cl.CC(N)NCC(O)C1=CC=C(NS(C)(=O)=O)C=C1	T0483	0.53746	0.60055	0.22115	0.26553	0.31785	1.9425
CC(C)NS(=O)(=O)C1=CC=CC(=NC(=O)C2=CC=CC=C2)C1	T0409	0.38723	0.39565	0.38357	0.30095	0.46789	1.9335
OC(=O)C1N+[I-]O=C=C1	T0067	0.67609	0.40260	0.39212	0.23904	0.21825	1.9281
CC(=O)NC1=CC=CC(C(CO)=O)C=C1	T0310	0.56403	0.51920	0.33687	0.18753	0.24850	1.9141
COC(C=O)C1=CC=CC(=N)C=C1	T0924	0.53666	0.53666	0.28377	0.28365	0.32323	1.8777
NC(N)=NS(=O)C1=CC=CC(=C1)C(N)C	T0510	0.46683	0.36503	0.33043	0.33475	0.54416	1.8761
NC(C1=CC=CC=C1)C(O)=O	T0689	0.51580	0.49026	0.38748	0.19998	0.28168	1.8752
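
For most rows in the table above, the Score column equals the plain sum of the five per-seed similarity columns (for example, T1026: 0.41429 + 0.62460 + 0.38088 + 0.26535 + 0.43794 ≈ 2.1230). The Python sketch below reproduces that arithmetic for the first few rows. The unweighted sum is an assumption read off the table, since the text does not state how the seed labels enter the final score, and the function names are hypothetical.

```python
# Rows copied from the table above: (ID, [w_1.0, w_0.8, w_0.6, w_0.5, w_0.7], Score).
ROWS = [
    ("T1026", [0.41429, 0.62460, 0.38088, 0.26535, 0.43794], 2.1230),
    ("T0398", [0.48743, 0.54954, 0.35754, 0.28610, 0.43834], 2.1189),
    ("T1252", [0.45990, 0.62544, 0.40630, 0.23729, 0.37817], 2.1071),
    ("T2391", [0.27359, 0.39438, 0.44614, 0.43391, 0.54197], 2.0900),
    ("T2392", [0.22591, 0.39310, 0.37731, 0.35836, 0.66217], 2.0168),
]


def aggregate_score(similarities):
    """Combine per-seed similarities into one score (assumed: unweighted sum)."""
    return sum(similarities)


def rank(rows):
    """Return (ID, score) pairs sorted from highest to lowest score."""
    scored = [(cid, aggregate_score(sims)) for cid, sims, _ in rows]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for cid, score in rank(ROWS):
        print(f"{cid}\t{score:.4f}")
```

A few rows further down the table do not match this sum exactly, so the sketch should be read as a consistency check on the listed values rather than as the pipeline's scoring code.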

Learn

We successfully developed a novel screening workflow that intelligently combines prior chemical knowledge with rigorous physics-based models. This pipeline can efficiently and accurately identify potent inhibitors from large databases, laying a methodological foundation for future large-scale drug discovery. Importantly, this integrated strategy is not limited to GZMK; it represents a highly promising screening paradigm for other protein targets that also feature flexible active sites.

Software

Cycle 1: Traditional Protein Design Methods

Usage

We employed a protein design strategy based on RFdiffusion with the goal of obtaining high-quality binders. Specifically, RFdiffusion was used to generate backbones, ProteinMPNN was applied to design sequences, and subsequent screening was performed using AlphaFold and PyRosetta. The top candidates identified through this computational pipeline were then subjected to experimental validation, with the ultimate aim of discovering binders exhibiting superior affinity and structural stability.

Learn

In applying this workflow, we found that it is not strictly binding-oriented but instead relies on generating large numbers of candidates followed by high-throughput dry/wet screening. During sequence design, the method primarily ensures the structural integrity of the binder while insufficiently addressing interface construction and backbone potential. As a result, even binders with measurable affinity retain considerable room for improvement in both backbone and sequence. To address this, our goal is to develop a tool that can fully exploit and optimize the potential of each candidate protein to further enhance binding performance.

Cycle 2: BetterEvoDiff

Design

We aim to develop a tool capable of fully exploiting the potential of screened proteins. We recognized that existing optimization methods are largely restricted to single-point mutations, which fail to capture the interdependence among amino acid residues and thus limit further performance improvement. To address this, our objective is to leverage coordinated multi-point mutations to unlock greater optimization potential in candidate proteins. Specifically, the goals are: (1) to establish rational relationships among different mutation sites; and (2) to collect and learn from the effects of mutations to guide more effective designs. To achieve this, we combined discrete autoregressive modeling with reinforcement learning to construct a framework for multi-point mutation and evolution.

Build

Based on this design, we developed a protein optimization tool that integrates EvoDiff with the Group Relative Policy Optimization (GRPO) algorithm. The tool takes any input sequence and iteratively performs the following cycle until convergence: random masking of residues, denoising with EvoDiff to generate mutated sequences, property evaluation in an environment to obtain rewards, parameter optimization of EvoDiff using GRPO, and subsequent regeneration of sequences with the updated model. Through repeated iterations, the model gradually learns from mutation outcomes and converges to a state where it can reliably produce near-optimal solutions under random perturbations. Finally, the tool enables batch generation of candidate sequences, followed by dry-lab screening and wet-lab validation to identify binders with improved performance.
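
The reward-to-update step of this cycle can be illustrated with a short Python sketch. In GRPO, the rewards of a group of samples generated from the same input are standardized within that group, and the resulting advantages weight the policy update. The sketch covers only the random masking and the group-relative advantage; EvoDiff denoising, the reward environment, and the gradient step are omitted, and all function names and parameters are hypothetical.

```python
import random
import statistics


def mask_positions(sequence, fraction, rng):
    """Pick a random subset of residue indices to mask (at least one)."""
    count = max(1, int(len(sequence) * fraction))
    return sorted(rng.sample(range(len(sequence)), count))


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize each reward against the mean and std of its own group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


if __name__ == "__main__":
    rng = random.Random(0)
    seq = "MKTAYIAKQRQISFVKSHFSRQ"  # illustrative sequence, not a project binder
    print("masked indices:", mask_positions(seq, 0.2, rng))
    # Rewards for one group of mutated sequences (made-up numbers).
    print("advantages:", group_relative_advantages([0.42, 0.55, 0.48, 0.61]))
```

Sequences that score above the group mean receive positive advantages and are reinforced; those below the mean are suppressed. This is why the method needs no separate value network.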

Test

We first carried out dry-lab validation, in which a reward function built from AlphaFold (AF) output metrics was used to optimize binders with measurable affinity that had been generated by the classical pipeline. The results showed that our tool was able to achieve directed optimization of the protein binding interface under our reward function, converging at the highest reward state and reaching the theoretical optimum. We then randomly selected a subset of optimized sequences from the dry-lab experiments for wet-lab validation and obtained affinity data. Disappointingly, the optimized proteins did not exhibit significant improvements in affinity, and in some cases, proteins with seemingly better binding interfaces after optimization displayed lower affinity than their unoptimized counterparts.

Learn

We found that our tool can optimize protein properties toward specific targets, demonstrating the effectiveness of the proposed framework. However, the core challenge lies in aligning dry-lab rewards with wet-lab rewards. Even though AlphaFold provides highly accurate predictions, it still cannot fully match wet-lab results, which is beyond our ability to resolve. Therefore, in subsequent work, we will continue to use AF3 as the reward function, as it can at least verify whether our framework is capable of converging toward defined objectives. Current results also highlight that the backbone is a crucial starting point, playing an even more decisive role in affinity than the side chains themselves—a poor backbone cannot be optimized into a high-affinity binder. This motivates us to develop a de novo protein design framework capable of both selecting superior backbones and fully exploiting their potential, thereby enabling more accurate protein design.

Cycle 3: BetterMPNN

Design

In current protein design, the upper limit of performance is determined more by prediction models than by generative models, as the reliability of prediction results significantly exceeds that of generative outputs. Against this background, our work focuses on optimizing generative models so that they can distill structural knowledge from prediction models. At the same time, RFdiffusion, as the most effective backbone generation tool available, remains our best choice for backbone design—even though only a small fraction of its outputs prove usable in experimental validation, when combined with our framework it can still quickly provide effective starting points. Based on these backbones, ProteinMPNN has classically been used only to generate sequences that can fold into the target structure, but we aim to further extend its capability so that it not only ensures structural foldability but also learns to generate sequences with superior binding interfaces. To achieve this, we integrate RFdiffusion, ProteinMPNN, and GRPO into a unified high hit-rate (one-shot) protein design approach. Within this optimization loop, ProteinMPNN is continually refined through reinforcement feedback, enabling rapid convergence from backbone generation to high-quality binders.

Build

For implementation, we first use RFdiffusion to generate dozens to hundreds of backbones, each of which is subjected to an independent optimization loop (parallelized when resources permit). Within each loop, ProteinMPNN generates multiple sequences for a given backbone, which are then evaluated in an environment to compute rewards. These reward signals are used to optimize ProteinMPNN parameters via the GRPO algorithm, progressively teaching it to generate better binders. The updated ProteinMPNN is then reused for sequence design, and this cycle continues until convergence. During training, metrics such as loss, KL divergence, and reward values allow us to quickly assess backbone reliability: reliable backbones are optimized to convergence, while unreliable ones are discarded and replaced. Once converged, the optimized ProteinMPNN is used for batch design on high-quality backbones, and the resulting sequences, together with those collected during optimization, are advanced to dry-lab screening. A subset of high-confidence sequences is further selected for wet-lab validation.
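
The backbone triage described above (optimize reliable backbones to convergence, discard unreliable ones) could be expressed as a simple rule on the training history. The Python sketch below uses only the mean reward per optimization step; the window size, thresholds, and function name are illustrative assumptions rather than values from this project, and a fuller rule would presumably also consult the loss and KL divergence.

```python
def triage_backbone(rewards, window=5, min_gain=0.01, min_reward=0.5):
    """Classify a backbone from its per-step mean-reward history.

    Returns "continue", "converged", or "discard".
    All thresholds are hypothetical placeholders.
    """
    if len(rewards) < 2 * window:
        return "continue"  # not enough history to judge yet
    early = sum(rewards[-2 * window:-window]) / window
    late = sum(rewards[-window:]) / window
    if late - early >= min_gain:
        return "continue"  # reward is still improving
    # Reward has plateaued: keep the backbone only if the plateau is high.
    return "converged" if late >= min_reward else "discard"


if __name__ == "__main__":
    print(triage_backbone([0.1 * i for i in range(10)]))  # still improving
    print(triage_backbone([0.8] * 10))                    # high plateau
    print(triage_backbone([0.1] * 10))                    # low plateau
```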

Test

Through dry-lab experiments, we demonstrated several key capabilities of our method: the design of protein inhibitors directly targeting the GZMK active pocket within 20 hours (from backbone to binder), the design of binders targeting distinct GZMK hot-spots, the ability to design protease cleavage sites, autonomous selection of optimal hot-spots, and rapid evaluation of RFdiffusion backbones within minutes to hours. Under dry-lab conditions, the method shows strong potential for one-shot design, provided that dry-lab and wet-lab conditions can be correctly aligned. At present, our wet-lab experiments and validations on additional targets are still ongoing, and progress updates will be available on our GitLab.

Learn

Throughout this work, we have gained several important insights. First, reinforcement learning has proven to be highly suitable for biological applications, as it naturally simulates the process of biological evolution and allows knowledge to be accumulated and refined through iterative feedback. Second, the critical importance of the backbone was reaffirmed: it serves as the essential starting point and largely determines whether a binder can achieve high affinity, since a poor backbone cannot be rescued by subsequent sequence optimization. At present, the reward obtained from our evaluations is applied only to optimize the sequence design model, while the backbone remains unchanged. If we could design a mechanism whereby the reward signal is first broadcast to the side chains and then further broadcast to the backbone, it would enable the simultaneous optimization of both backbone and side chains, thereby yielding more rational and higher-quality binders. This will also be the direction of our future work.

Cycle 4: Future Plans

We are continuing to pursue the above idea with the goal of developing a more precise and efficient protein design tool. This work is still in progress—stay tuned, and future updates will be available on our GitLab.

Construction and Optical Performance Optimization for the Micro-Volume Liquid Detection Platform

Cycle 1: Detection Platform Construction

Design

Laboratory-scale cell-free systems typically feature small expression volumes and high unit production costs. The expression volume of a single system is nearly equivalent to the sample amount required for one read on a microplate reader. This severely limits the throughput of protein expression in cell-free systems and the reproducibility of subsequent detection experiments. To address this need, our team constructed a microchannel configuration with reduced detection volume, which significantly decreased the sample amount required per measurement; fully transparent materials were adopted to facilitate subsequent fluorescence detection.

Build

In this version, the chip uses soda-lime glass as the substrate. A flexible PDMS channel is obtained by negative photolithography. Plasma cleaning is applied to remove organic contaminants and activate surface oxygen radicals on both materials, followed by lamination and overnight heating to form a stable covalent bond between them. The main channel section includes an inlet, a waste outlet, and a detection window, and is connected to the external injection system via P20 tubing. Prior to use, Aquapel is applied to hydrophobize the inner walls of the channel, preventing adsorption of proteinaceous substances by PDMS that would introduce detection errors and cause channel blockage, while also mitigating capillary effects that can compromise flow stability. A ballast component is introduced at the detection window, which, together with an external Harvard pump, ensures flow stability at the detection window.

The first-generation chip

Test

The FCS detection results showed weak excitation signals, low photon flux density, and a poor signal-to-noise ratio (SNR).

FCS detection results of the first-generation chip

Learn

Based on the above results, it is hypothesized that the relatively thick soda-lime glass moderately absorbs excitation photons; additionally, the increased optical path length introduced by the thicker glass leads to spherical aberration in high-angle rays that cannot be compensated by the objective design, resulting in a substantial reduction in photon flux density. Meanwhile, surface-adhered organics also reduce transmissivity.

Cycle 2: Substrate Optical Performance Improvement

Design

Targeted optimizations were made to the previous chip version, particularly by improving the glass substrate materials in the optical path, exploring thinner materials with lower absorption at the specific wavelengths used. Surface treatment processes were also refined.

Build

In testing, microfluidic chips were constructed using substrates of ultra-clear glass, borosilicate glass, and fused silica with different thicknesses (2 mm and 0.15 mm). All substrate surfaces were treated by plasma at 30 W for 130 s to remove surface contaminants and were stored in clean Petri dishes after treatment.

Test

The results showed that 0.15 mm fused silica exhibited a distinct advantage as an optical medium for laser-based detection, representing a clear improvement over the previous-generation chip, though still falling short of the control group.

Learn

A new issue emerged: the 0.15 mm fused silica exhibited observable deformation at the objective interface, introducing an additional refractive surface in the optical path and thereby affecting focusing to some extent. Meanwhile, the small contact area at the objective face resulted in low friction, and the chip frequently slipped due to physical perturbations.

Cycle 3: Lens Coupling Improvement

Design

To address the above issues, our team first aimed to increase the hardness of the channel layer; in parallel, a support structure was designed for the chip. The chip layout was also redesigned to ensure exact coincidence of the detection center with the geometric center, preventing a tendency toward lateral tilting.

Build

Specifically, the PDMS layer thickness was increased, the crosslinker ratio was raised, and the curing time was extended, making the chip less prone to elastic deformation. In addition, drawing on common fixation methods in XRD measurements, moldable butyl plasticine was used to provide structural support for the chip. A cutting registration at 60 mm × 24 mm was introduced to fit the quartz thin glass used. The chip’s tubing layout was redesigned, placing the detection window at the geometric center of the cutting region.

Test

Design of the third-generation chip

Now, the detection window (the area within the red circle) represents the geometric center of the chip. The results show that these measures significantly reduce chip deformation, especially preventing elongation along the long axis. The addition of butyl plasticine both supports the chip against deformation and prevents slipping. Moreover, with the geometric center aligned vertically with the working fulcrum, the chip’s operational stability and anti-interference performance (anti-vibration) were improved.

Learn

Construction goals achieved.

Performance Optimization for Mixing in High-Reynolds-Number Differential Systems

Cycle 1: Mixing Ratio

Design

Empirically, relying solely on off-chip mixing followed by reinjection is labor- and time-intensive and prone to operator-dependent variability that compromises mixing uniformity, thereby limiting system throughput and reproducibility. By contrast, integrating mixing functionality directly on-chip can substantially shorten workflows and time costs, while achieving more efficient and controllable mixing under stable flow, thus laying the foundation for subsequent high-throughput heterogeneous-system detection.

In the specific design, different mixing modes were explored: simple two-layer co-flow, droplet-based dispersion, and a jetting mode in which a sheath-confined jet forms a lamella.

To prevent channel clogging by particulates in cell-free systems, this generation of chips introduced a weir structure; the inlet channel for the cell-free stream was also made relatively wider to reduce the risk of blockage.

Build

In the current chip design, while preserving the optimizations achieved in the previous generation, a cross-shaped mixer was introduced to evaluate both droplet-based and jetting-based mixing strategies. It is noteworthy that these two modes are realized by adjusting the channel width to modulate the sheath flow rate, thereby ensuring the stable formation of droplets and jets. Consequently, it can be inferred that the sheath flow velocity remains constant throughout the experiments.

Test

Multiple mixing schemes were evaluated in the comparative design study, and the stability of jetting at different flow rates was compared.

Learn

Jetting

Droplet

The results indicate that, relative to droplet-based dispersion or simple two-layer co-flow, the jetting mode—which forms a laminar liquid layer through a sheath-confined jet—achieves a more uniform and stable compositional distribution on shorter timescales. This conclusion is drawn primarily from microscopic observations of stable jetting and droplet generation rates; in two-layer co-flow, the interface was found to be the least stable due to differing wall-induced viscous drag on the two liquids.

It must be acknowledged, however, that jetting requires a narrow range of flow-rate ratios; stable jets are obtained only within appropriate velocity windows. Solutions to this constraint will be proposed in the work on “high-throughput detection.”

Cycle 2: Mixing Modes

Design

In essence, liquid mixing must be both “stable” and “unstable”: stability is needed prior to mixing to maintain a constant ratio, which requires robust self-sustained flow stability, whereas instability is desired at the interface under specific engineered perturbations to promote mixing. The key lies in designing suitable mixer structures. Our team is mixing a high-Reynolds-number stream with a low-Reynolds-number stream. Existing microfluidic designs for such conditions often rely on active mixing, which necessarily involves acoustic, thermal, magnetic, or other fields; these are undesirable here because they could interfere with stable protein affinity measurements under the microscope. Therefore, passive mixing was chosen as the primary strategy. Yet for systems with large Reynolds-number disparities, studies on passive mixing remain limited; such extreme conditions pose heightened challenges for both mixing efficiency and structural design.

In light of this, our team initially selected a passive micromixer containing triangular obstacles. This design performs well across a broad Reynolds-number range. Its operating principle is the canonical split-and-recombine (SAR): upon encountering triangular baffles, the flow is compelled to split and subsequently recombine downstream, repeatedly stretching and folding the fluid layers and enlarging interfacial area. Under low-Re conditions, such geometric perturbations markedly shorten diffusive path lengths; under higher Re, they also induce transverse velocity components that further promote convective mixing. Under the present experimental conditions, splitting/recombination and diffusion are coupled, allowing the triangular baffles to achieve efficient mixing within a short channel length.
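
To make the Reynolds-number comparison between two streams concrete, the following Python sketch evaluates Re = ρ·v·D_h/μ for a rectangular microchannel, with the hydraulic diameter D_h = 2wh/(w + h) and the mean velocity v = Q/(wh). The channel dimensions, flow rate, and water-like fluid properties are illustrative assumptions, not measurements from our chips.

```python
def hydraulic_diameter(width_m, height_m):
    """Hydraulic diameter of a rectangular channel, in metres."""
    return 2.0 * width_m * height_m / (width_m + height_m)


def reynolds_number(flow_m3_s, width_m, height_m,
                    density_kg_m3=1000.0, viscosity_pa_s=1.0e-3):
    """Re = rho * v * D_h / mu, using the mean velocity over the cross-section."""
    velocity = flow_m3_s / (width_m * height_m)
    d_h = hydraulic_diameter(width_m, height_m)
    return density_kg_m3 * velocity * d_h / viscosity_pa_s


if __name__ == "__main__":
    # Assumed example: 100 um x 50 um channel at 10 uL/min, water-like fluid.
    q = 10e-9 / 60.0  # 10 uL/min expressed in m^3/s
    print(f"Re = {reynolds_number(q, 100e-6, 50e-6):.3f}")
```

Running the same function for each inlet with its own flow rate, viscosity, and channel width gives the Re contrast that the mixer has to handle.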

Build

The mixing structure was positioned downstream and adjacent to the cross-mixer, with the same fabrication workflow as above.

Test

Examination under an optical microscope showed that mixing was incomplete and the luminous intensity was insufficient.

Learn

Mixing was found to be incomplete, with only a weak signal, yet an initial degree of homogenization was achieved and the Reynolds numbers of the two streams were brought closer together. In the next-generation chip, we plan to incorporate a mixer tailored for low-Re conditions with small Re contrasts.

Cycle 3: Composite Gradient Mixing

Design

Given the relatively successful mixing in the prior stage and the need to further blend the two liquids, the operating regime now transitions to mixing a low-Re stream with another low-Re stream. Therefore, a serpentine micromixer was introduced downstream to ensure thorough mixing. Its principle leverages Dean vortices generated by continuous bends: as fluid traverses a curved segment, centrifugal effects drive a pair of counter-rotating secondary flows across the cross-section; with alternating bends along the channel, these vortices flip correspondingly, continuously stretching, folding, and redistributing fluid layers. This enhances interfacial perturbations across the cross-section and markedly improves mixing efficiency.
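
The strength of the secondary flow in a curved channel is commonly characterized by the Dean number, De = Re·sqrt(D_h / (2·R_c)), where R_c is the radius of curvature of the bend. The Python sketch below evaluates it; the numerical inputs are placeholders chosen for easy arithmetic, not the geometry of our serpentine mixer.

```python
import math


def dean_number(reynolds, hydraulic_diameter_m, curvature_radius_m):
    """De = Re * sqrt(D_h / (2 * R_c)) for flow through a curved channel."""
    if curvature_radius_m <= 0 or hydraulic_diameter_m <= 0:
        raise ValueError("lengths must be positive")
    return reynolds * math.sqrt(hydraulic_diameter_m / (2.0 * curvature_radius_m))


if __name__ == "__main__":
    # Placeholder values: Re = 10, D_h = 100 um, bend radius = 200 um.
    print(f"De = {dean_number(10.0, 100e-6, 200e-6):.2f}")
```

For a fixed channel cross-section, tighter bends (smaller R_c) or faster flow (larger Re) raise De and strengthen the counter-rotating vortices.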

Build

Two sets of serpentine micromixers were designed to ensure complete mixing, and sufficiently sized ballast structures were placed downstream to stabilize the flow and allow diffusion to reach equilibrium.

Test

FCS detection experiments were conducted.

FCS results of SUMO-mGFP

G(t)-t graph of SUMO-mGFP

Learn

FCS measurements of the SUMO-mGFP protein solution yielded well-defined G(t)-t curves, with fluorescence intensity falling within the normal range. Notably, a small number of particles were observed to exhibit internal vortex circulation, which in turn affected the surrounding surface flow and further accelerated the mixing process.

The experimental results demonstrated that the objectives of this generation’s chip design were successfully achieved, thereby paving the way for the development of the next-generation chip.

Construction and Improvement of High-throughput Protein Affinity Detection Platform

Cycle 1: Vertically Variable Concentration System

Design

The project targets multiple proteins; thus, high-throughput detection is crucial not only for this work but also for its broad future applicability. The primary goal of high throughput is to construct complete affinity curves for a given protein across different concentrations, thereby obtaining continuous concentration-dependent affinity data. However, in second-generation chip tests we found that the jetting mode imposes clear constraints on the allowable flow-rate window, which in turn limits the accessible detection range. For this reason, subsequent designs adopted a pre-mixing approach to expand the measurement window.

Build

A Y-mixer was introduced upstream of the cross-mixer, with a ballast to stabilize the flow and a 45° angle to prevent backflow.

Test

The design is as follows:

Design of the third-generation chip

Learn

The pre-dilution region lacked sufficient space, allowing only a Y-mixer, whose performance is relatively limited—this can be addressed in the next generation.

Cycle 2: Multi-concentration Gradient Configuration

Design

Given the need to detect multiple proteins and switch between them, as well as the requirement for concentration-gradient measurements in cross-correlation modes on fluorescence-related spectrometers, integrating on-chip optical pumping to generate gradients is highly challenging and unstable. We therefore adopted an off-chip modular design, employing a primary–secondary chip architecture to produce different gradients.

Build

The N-to-1 miniature chip with a ballast incorporates an improved tubing connection scheme, delivering higher connection stability and reliability and supporting repeated plug-and-play without damage. The secondary chip enables pre-dilution and facilitates subsequent automation integration; it can be expanded with different secondary chips according to experimental needs, enhancing stability and scalability.

Test

Using a 2-to-1 chip as an example, the operational stability of the secondary chip was evaluated.

Learn

The results show that this primary–secondary chip strategy mixes and dilutes different liquids—such as cell-free systems and protein solutions—effectively.

Cell-free protein solution with yellow ink

Protein solution with blue ink

Cell-free protein solution with yellow ink and protein solution with blue ink mixing to green solution

Cycle 3: FCCS Detection

Design

FCCS (Fluorescence Cross-Correlation Spectroscopy) enables the detection of fluorescence intensity fluctuations at the single-molecule level within an extremely small observation volume, allowing the determination of molecular diffusion times and complex formation. Compared with traditional methods that rely on purification and multi-concentration titration, FCCS can simultaneously track protein synthesis and binding processes in situ within cell-free systems, thereby enabling faster construction of affinity curves.
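
As a reminder of what the cross-correlation quantifies, the Python sketch below computes the normalized cross-correlation G_cross(τ) = ⟨δI_a(t)·δI_b(t+τ)⟩ / (⟨I_a⟩⟨I_b⟩) from two equally sampled intensity traces. It is a direct, unoptimized estimator for illustration only; commercial correlators use multi-tau schemes, and the function name and toy traces are hypothetical.

```python
def cross_correlation(trace_a, trace_b, lag):
    """Normalized cross-correlation of two intensity traces at an integer lag."""
    if len(trace_a) != len(trace_b):
        raise ValueError("traces must have the same length")
    if not 0 <= lag < len(trace_a):
        raise ValueError("lag must be within the trace length")
    mean_a = sum(trace_a) / len(trace_a)
    mean_b = sum(trace_b) / len(trace_b)
    n = len(trace_a) - lag
    # Average the product of fluctuations over all overlapping time points.
    acc = sum((trace_a[t] - mean_a) * (trace_b[t + lag] - mean_b)
              for t in range(n))
    return acc / n / (mean_a * mean_b)


if __name__ == "__main__":
    green = [1, 3, 1, 3]  # toy photon counts
    red = [1, 3, 1, 3]    # perfectly co-fluctuating channel
    print(cross_correlation(green, red, 0))
```

When the two channels fluctuate together, as expected for a diffusing complex carrying both labels, the amplitude at zero lag is positive; a channel that does not co-fluctuate gives an amplitude near zero.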

Build

In implementation, we employed Fluorescence Cross-Correlation Spectroscopy (FCCS). Fluorescent molecules in solution undergo Brownian motion, randomly diffusing in and out of the confocal detection volume. This results in statistical fluctuations of the instantaneous number of fluorescent molecules, N(t), around its time-averaged value ⟨N⟩. Consequently, the detected fluorescence intensity, I(t), exhibits random fluctuations proportional to the number of molecules present within the observation volume. The instrument simultaneously monitors two fluorescent fusion proteins with distinct spectral properties, SUMO-mGFP and Ubc-mRuby2, and records fluorescence signals in two independent channels. By maintaining a constant concentration of SUMO-mGFP while gradually increasing the concentration of Ubc-mRuby2, a series of G_cross(0) values under different conditions can be obtained. Since G_cross(0) is proportional to the concentration of the complex [AB], a binding curve can be plotted and nonlinearly fitted to a binding model based on the law of mass action to determine the dissociation constant (Kd), thereby achieving precise and quantitative characterization of protein–protein interactions.
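
The fitting step described above can be sketched in Python. This is a minimal illustration under stated assumptions: a 1:1 binding model written in terms of total concentrations, a single proportionality constant between G_cross(0) and [AB], and a grid search over Kd instead of a dedicated nonlinear optimizer. The function names and the synthetic numbers are hypothetical and are not results from this project.

```python
import math


def complex_conc(a_total, b_total, kd):
    """[AB] from the law of mass action for A + B <-> AB (same units throughout)."""
    s = a_total + b_total + kd
    return (s - math.sqrt(s * s - 4.0 * a_total * b_total)) / 2.0


def fit_kd(a_total, b_totals, signals, kd_grid):
    """Grid-search Kd; for each Kd the best linear scale has a closed form."""
    best = None
    for kd in kd_grid:
        model = [complex_conc(a_total, b, kd) for b in b_totals]
        scale = (sum(y * m for y, m in zip(signals, model))
                 / sum(m * m for m in model))
        sse = sum((y - scale * m) ** 2 for y, m in zip(signals, model))
        if best is None or sse < best[0]:
            best = (sse, kd, scale)
    return best[1], best[2]


if __name__ == "__main__":
    a0 = 100.0                                # fixed SUMO-mGFP, nM (assumed)
    b0 = [10, 30, 100, 300, 1000, 3000]       # Ubc-mRuby2 titration, nM (assumed)
    # Synthetic G_cross(0) values generated with Kd = 200 nM, scale = 1e-3.
    g0 = [1e-3 * complex_conc(a0, b, 200.0) for b in b0]
    grid = [10 ** (1 + i / 50) for i in range(151)]  # 10 nM to 10 uM
    kd, scale = fit_kd(a0, b0, g0, grid)
    print(f"Kd ~ {kd:.1f} nM, scale ~ {scale:.2e}")
```

Using total rather than free concentrations matters when the fixed partner is present at a concentration comparable to Kd, which is why the quadratic form is used instead of a simple hyperbola.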

Test

Using FCCS to analyze the two fluorescent protein samples, we observed that although the 561 nm channel and the cross-correlation curve exhibited some noise, the overall profiles remained within the normal range. From the analysis, the binding data between the two proteins were determined.

FCCS results of 488/561 standard sample

Analyzed FCCS results of 488/561 standard sample

Learn

The experimental results collectively demonstrate that FCCS, owing to its high sensitivity and temporal resolution, enables rapid acquisition of protein–protein binding data.

Cycle 4: Integration of Automated Equipment

Design

In traditional protein affinity assays, injection and sample loading operations rely heavily on manual handling. When the scale of microfluidic screening expands to a large number of parallel assays, it becomes difficult to accurately perform multi-gradient dilution manually, and the timing of sample introduction at different concentrations cannot be precisely controlled. This not only incurs substantial labor costs but also introduces human-induced variability, leading to poor reproducibility and reduced data reliability.

To address these challenges, the present design incorporates programmable automated equipment—specifically syringe pumps and peristaltic pumps—to replace manual operations. This setup enables automated generation of concentration gradients and precise temporal control of sample injection, thereby laying the foundation for high-throughput and high-precision operation of the integrated microfluidic system.

Build

The system integrates two types of automated fluid-driving devices. First, the YHPLC0100S micro syringe pump, which employs a stepper motor coupled with a lead-screw transmission system, converts motor rotation into linear injection motion to achieve uniform and quantitative liquid delivery. It is used for the precise dispensing of protein solutions, ensuring the accuracy of concentration gradient generation within the secondary chip.
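
The conversion from motor rotation to volumetric flow in a lead-screw syringe pump follows directly from geometry: plunger speed equals revolutions per second times screw lead, and flow equals plunger speed times the syringe cross-section. The Python sketch below shows this relation and its inverse. The step count, lead, and syringe diameter are illustrative assumptions and are not specifications of the YHPLC0100S.

```python
import math


def syringe_flow_ul_per_min(steps_per_s, steps_per_rev, lead_mm, syringe_id_mm):
    """Volumetric flow (uL/min) produced by a stepper-driven lead screw."""
    plunger_mm_per_s = steps_per_s / steps_per_rev * lead_mm
    area_mm2 = math.pi * (syringe_id_mm / 2.0) ** 2
    return plunger_mm_per_s * area_mm2 * 60.0  # 1 mm^3 equals 1 uL


def steps_per_s_for_flow(flow_ul_per_min, steps_per_rev, lead_mm, syringe_id_mm):
    """Step rate needed to reach a target flow (inverse of the function above)."""
    area_mm2 = math.pi * (syringe_id_mm / 2.0) ** 2
    plunger_mm_per_s = flow_ul_per_min / 60.0 / area_mm2
    return plunger_mm_per_s / lead_mm * steps_per_rev


if __name__ == "__main__":
    # Assumed hardware: 3200 microsteps/rev, 1 mm lead, 4 mm syringe bore.
    rate = steps_per_s_for_flow(10.0, 3200, 1.0, 4.0)
    print(f"{rate:.1f} steps/s for 10 uL/min")
```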

YHPLC0100S Micro-syringe Pump

Second, the Longer BT100-2J peristaltic pump supports multi-pump-head configurations and a wide adjustable flow-rate range, making it suitable for delivering buffer solutions with lower precision requirements and fulfilling the needs of high-throughput experiments.

Longer BT100-2J Peristaltic Pump

Both devices allow pre-setting of parameters such as flow rate, injection duration, and operation mode, enabling precise control over the concentration gradient generation process in the secondary chip. This ensures that samples of varying concentrations enter the detection inlet in accordance with the timing requirements of cross-correlation measurements, substantially reducing manual workload and improving experimental consistency.
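
As an illustration of how such presets translate into a concentration gradient, the Python sketch below computes the flow rates for a two-stream dilution at constant total flow, using C_out = C_stock·Q_protein / (Q_protein + Q_buffer), and lays out a timed sequence of concentration steps. It assumes ideal mixing of one protein stream and one buffer stream; the function names, units, and numbers are hypothetical and do not represent the pumps' actual control interface.

```python
def dilution_flows(c_stock, c_target, q_total):
    """Split a fixed total flow into protein and buffer flows for a target concentration."""
    if c_stock <= 0 or not 0 <= c_target <= c_stock:
        raise ValueError("target must lie between 0 and the stock concentration")
    q_protein = q_total * c_target / c_stock
    return q_protein, q_total - q_protein


def gradient_schedule(c_stock, targets, q_total, hold_s):
    """Build a list of timed pump settings, one step per target concentration."""
    schedule = []
    for index, c_target in enumerate(targets):
        q_protein, q_buffer = dilution_flows(c_stock, c_target, q_total)
        schedule.append({
            "start_s": index * hold_s,
            "target": c_target,
            "q_protein": q_protein,  # syringe pump setting
            "q_buffer": q_buffer,    # peristaltic pump setting
        })
    return schedule


if __name__ == "__main__":
    # Assumed example: 1000 nM stock, 20 uL/min total flow, 60 s per step.
    for step in gradient_schedule(1000.0, [100.0, 250.0, 500.0], 20.0, 60.0):
        print(step)
```

Holding the total flow constant keeps the residence time in the detection window the same across concentration steps, which simplifies matching each measurement to its target concentration.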

Test

We conducted preliminary tests to evaluate the performance and compatibility of the automated devices. First, we examined the precision of gradient generation to ensure that the coordinated operation of the devices met experimental requirements. Second, we verified the accuracy with which samples of different concentrations entered the detection inlet according to the preset timing, confirming compliance with the temporal requirements of cross-correlation measurements. Third, repeated experiments were performed to assess operational stability and data reliability, while also confirming a significant reduction in manual labor.

The test results demonstrated that both types of automated equipment successfully executed the predefined operations and fully met the integration requirements of the system.

Preliminary Test of the Automated Devices

Learn

To accommodate the differing requirements of protein solutions and buffer delivery, we adopted a differentiated configuration: the micro syringe pump for high-precision control (maintaining a concentration deviation of less than 5%) and the peristaltic pump for cost efficiency and high-throughput adaptability. This approach meets our goal of minimizing cost while ensuring precision, and it reduces manual labor by over 80% compared to traditional manual operation.

However, current programming control still requires manual adjustment of pump parameters. Future iterations should enhance the integration between the pumps and the host software to enable automated parameter updates based on experimental needs. In addition, prolonged use of peristaltic pump tubing leads to a decline in flow-rate accuracy, and millisecond-level synchronization errors have been observed between the syringe and peristaltic pumps.

To address these issues, a scheduled replacement and calibration mechanism for wear-prone components should be established, and software algorithms should be introduced to compensate for timing discrepancies, thereby improving the long-term reliability and precision consistency of the system.

Cycle 5: Intuitive Analysis and Characterization

Design

In traditional protein affinity assays, data analysis and interpretation largely rely on manual processing. FCCS raw data must be manually reformatted, binding curves require third-party software for fitting, and experimental progress as well as device operation status must be recorded and correlated by hand. This workflow is not only inefficient but also prone to data-matching errors, which can compromise the reliability of analytical conclusions.

To address these limitations, our team developed an integrated analysis platform that enables fully automated management across the entire workflow—covering experiment design, process execution, data parsing, and iterative result optimization. This system provides intuitive analytical visualization and rapid optimization capabilities, supporting high-throughput experimentation with improved efficiency and accuracy.

Build

The software is built on the Streamlit framework and comprises three core functional modules:

(1) Experiment Design Module — This module supports parameter configuration for three dedicated pumps (protein solution, buffer A, and buffer B), allowing customizable flow rates and operation durations. It also provides a five-step standardized workflow (including pump operation, mixing reaction, and data acquisition) with an intuitive visual interface, thereby lowering the operational threshold through structured user interaction.

(2) Process Execution Module — Built upon session_state, this module enables real-time management of pump operation status (start/stop), step progression (completion flags and progress percentage), and emergency stop signals. It automatically records detailed system logs containing timestamps and event descriptions, ensuring full traceability throughout the experimental process.

(3) Data Parsing and Visualization Module — This component integrates a custom parse_fcs_data function for automatic validation and extraction of FCS data in CSV format, including columns such as protein, concentration, and affinity. The binding curve is fitted to the model

\[ Y = \frac{B_{\max} \cdot C}{K_d + C} \]

using the curve_fit function from SciPy, and the resulting data are visualized via Plotly, enabling direct transformation from raw measurements to interpretable conclusions.
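The fitting step can be sketched as follows. This is a minimal example, not our production code: the data are synthetic and noise-free, generated from known parameters, and the names `binding_model`, `conc`, and `signal` as well as the initial guesses and the non-negativity bounds are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit


def binding_model(c, b_max, k_d):
    """One-site binding model: Y = B_max * C / (K_d + C)."""
    return b_max * c / (k_d + c)


# Synthetic, noise-free data generated from known parameters (illustrative only).
conc = np.array([1.0, 2.0, 5.0, 10.0, 20.0, 50.0, 100.0])
true_b_max, true_k_d = 1.0, 10.0
signal = binding_model(conc, true_b_max, true_k_d)

# Initial guesses: plateau near the largest signal, K_d near the median concentration.
p0 = [signal.max(), np.median(conc)]

# Constrain both parameters to be non-negative, since neither can be negative physically.
popt, pcov = curve_fit(binding_model, conc, signal, p0=p0, bounds=(0, np.inf))
b_max_fit, k_d_fit = popt

# The diagonal of the covariance matrix gives the variance of each estimate.
b_max_err, k_d_err = np.sqrt(np.diag(pcov))
print(f"B_max = {b_max_fit:.3f}, K_d = {k_d_fit:.3f}")
```

With real measurements, the standard errors derived from `pcov` give a first indication of how well the concentration range constrains K_d.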

Test

Based on the software’s design logic, the completeness and coordination of the core functional modules were verified through simulation. First, the experiment design module was tested by simulating various pump parameters—such as flow rates ranging from 50 to 100 μL/min and operation times from 10 to 30 seconds—to ensure that parameter modifications were instantly synchronized with the process execution module, thereby maintaining consistency between design inputs and execution commands.
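For orientation, the simulated parameter ranges correspond to the dispensed volumes computed below, assuming a constant flow rate over the whole operation time. The helper name `dispensed_volume_ul` is hypothetical and is not part of the software.

```python
def dispensed_volume_ul(flow_rate_ul_per_min, duration_s):
    """Volume delivered at a constant flow rate, in microlitres."""
    return flow_rate_ul_per_min * duration_s / 60.0


# Lower and upper corners of the simulated parameter ranges.
for rate, seconds in [(50, 10), (100, 30)]:
    volume = dispensed_volume_ul(rate, seconds)
    print(f"{rate} uL/min for {seconds} s -> {volume:.2f} uL")
```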

Second, the process execution module was evaluated by simulating the triggering of all five experimental steps, verifying the logical coherence of pump state transitions, step progress updates, and system log recording. For instance, after Pump 1 completed its operation, the system was expected to automatically flag Step 1 as finished.
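The expected behavior can be sketched with a plain dictionary standing in for Streamlit's `session_state`. All key names, function names, and step labels below are hypothetical placeholders; the sketch only illustrates the kind of state transitions and log entries that were checked, not the actual implementation.

```python
from datetime import datetime

STEPS = ["Step 1", "Step 2", "Step 3", "Step 4", "Step 5"]  # placeholder labels


def new_state():
    """Create the run state; in the app this would live in st.session_state."""
    return {
        "pump_running": {1: False, 2: False, 3: False},
        "step_done": [False] * len(STEPS),
        "emergency_stop": False,
        "log": [],
    }


def log_event(state, message):
    """Append a timestamped entry so that every transition is traceable."""
    state["log"].append((datetime.now().isoformat(timespec="seconds"), message))


def start_pump(state, pump_id):
    """Start a pump unless the emergency stop is active."""
    if state["emergency_stop"]:
        log_event(state, f"Pump {pump_id} start refused: emergency stop active")
        return False
    state["pump_running"][pump_id] = True
    log_event(state, f"Pump {pump_id} started")
    return True


def finish_pump(state, pump_id, step_index):
    """Stop the pump and flag the step it belongs to as finished."""
    state["pump_running"][pump_id] = False
    state["step_done"][step_index] = True
    log_event(state, f"Pump {pump_id} stopped; {STEPS[step_index]} finished")


def progress_percent(state):
    """Share of completed steps as an integer percentage."""
    return 100 * sum(state["step_done"]) // len(STEPS)


state = new_state()
start_pump(state, 1)
finish_pump(state, 1, 0)
print(progress_percent(state), state["log"])
```

Keeping the transitions in small functions like these makes it possible to exercise the logic without launching the user interface.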

Finally, the data parsing and fitting module was validated using a constructed standard CSV dataset to confirm the completeness of data extraction, ensuring correct identification of required columns and accurate format conversion, as well as the logical reliability of the curve-fitting process.
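A validation of this kind can be sketched with the standard library alone. The function below is a simplified stand-in, not the actual `parse_fcs_data`: the name `parse_binding_csv`, the assumption that `concentration` and `affinity` are numeric, and the sample values are all hypothetical.

```python
import csv
import io

REQUIRED_COLUMNS = ("protein", "concentration", "affinity")


def parse_binding_csv(text):
    """Check the header and convert the numeric columns to float.

    Raises ValueError if a required column is missing or a row is malformed.
    """
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing required columns: {missing}")

    rows = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        try:
            rows.append({
                "protein": row["protein"].strip(),
                "concentration": float(row["concentration"]),
                "affinity": float(row["affinity"]),
            })
        except (AttributeError, TypeError, ValueError) as exc:
            raise ValueError(f"line {line_no}: malformed row") from exc
    return rows


# Constructed test input with made-up values.
sample = "protein,concentration,affinity\nGZMK,10,0.5\nGZMK,20,0.67\n"
records = parse_binding_csv(sample)
print(records)
```

Reporting the offending line number makes it easier to trace a rejected file back to the instrument export that produced it.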

Learn

Through an integrated design, the software achieves end-to-end connectivity from experimental setup to data visualization. Its structured interface and automated logic minimize manual intervention, theoretically improving both analytical efficiency and the accuracy of data correlation.

From a design perspective, however, there remains room for further optimization. First, data format support is currently limited to CSV files; expanding compatibility to include direct parsing of native FCS instrument formats would enhance applicability. Second, the curve-fitting function currently relies on a fixed binding model; incorporating additional models, such as competitive or cooperative binding, would better accommodate complex experimental scenarios. Third, the logging functionality is relatively basic; integrating visual statistical modules, such as pump runtime distributions and step duration analyses, would strengthen data traceability and provide greater support for workflow optimization.

Future iterations can build upon these directions to enhance the software’s flexibility and analytical depth.

Microfluidic Hardware Integration: 3D-Printed Chip Enclosure Design

Cycle 1: Preliminary Design and Build

Design

Functional requirements: Design a unified enclosure that houses one primary chip and one auxiliary chip. The primary chip measures 75×25 mm. The auxiliary chip has a PDMS layer measuring 25×30 mm with a thickness of 3–5 mm; its microchannels are 50×100 μm. The substrate glass slide is 75×25×1 mm, and all access ports are 1.4 mm in diameter.

Design concept: The enclosure should integrate sample injection on the auxiliary chip with protein mixing on the primary chip. Specifically, capillary tubing is connected to the top port of the auxiliary chip, and its outlet is routed via tubing to the inlet of the primary chip; proteins are mixed on the primary chip and discharged from its outlet, thereby enabling high-throughput sample loading. Two different auxiliary chips are used to introduce multiple proteins, so the system must allow effortless hot-swapping of the auxiliary chip during input. Mechanically, the device should provide a reliable clamping fixture that secures the chips without compressing the PDMS, and incorporate basic leak-prevention and a visual inspection window for operational transparency.

Build

Materials were selected by function: the outer housing is 3D-printed from PLA, which offers low warpage, dimensional stability, and low cost; the transparent cover uses acrylic (PMMA) to provide optical visibility while maintaining adequate mechanical strength; fasteners are stainless-steel M4 screws; and the connecting lines are medical-grade silicone tubing.

Design Blueprint of the First Version of 3D Printed Chip Cartridge

Exploded View of the First Version of 3D Printed Chip Cartridge

Physical Image of the First Version of 3D Printed Chip Cartridge

Test

The testing phase comprised three key evaluations—leak testing, optical testing, and mechanical stability testing. For the static-pressure leak test, deionized water was injected using a peristaltic pump to check for seepage at all interfaces and housing ports. The optical test assessed field-of-view occlusion and lid glare; if glare was pronounced, the cover needed to be replaced with an anti-reflective alternative. Mechanical stability was evaluated by moving the enclosure and applying light vibrations to verify that tubing connections did not loosen. All tests were passed.

However, during experiments using the primary and auxiliary chips, we observed that when the auxiliary chip was laid flat, the input liquid did not enter the channels smoothly and frequently exited from the other loading port; additionally, the interconnecting capillary between the two chips tended to exhibit liquid hold-up.

Learn

Material upgrades: If higher resistance to organic solvents or stronger cleaning capability is required, the housing should be switched to a material printed by selective laser sintering (SLS).

Tubing specification: If insertion or removal causes excessive resistance, replace with tubing of 1.3 mm or 1.5 mm outer diameter to match different needle bore sizes, or apply a small amount of silicone-free lubricant to the PDMS inlet before insertion.

Dimensions and tolerances: Based on the initial 3D printing and assembly experience, adjust the geometry slightly to ensure the PDMS chip remains flat and is not compressed.

Cycle 2: Optimization for Liquid Flow

Design

To address the liquid flow and retention issues observed in version 1.0 during experiments involving the primary and auxiliary chips, as well as to accommodate the updated dimensions of the primary chip, the 3D-printed housing was optimized. The redesign aims to fit the new chip size while ensuring smooth fluid flow within the auxiliary chip and minimizing liquid retention inside the connecting tubing.

Build

Slot adaptation: As the primary chip’s dimensions were updated in the new iteration, the slot structure was accordingly modified to ensure a precise fit and secure placement of the chip.

Trapezoidal platform design: A trapezoidal platform was introduced to position the auxiliary chip on an elevated, slanted plane above the primary chip. This adjustment optimizes the spatial relationship between the two chips and allows the auxiliary chip to be tilted, so that gravity assists the smooth flow of liquid from the auxiliary chip into the connecting tubing.

Design Blueprint of the Second Version of 3D Printed Chip Cartridge

Exploded View of the Second Version of 3D Printed Chip Cartridge

Test

We subsequently conducted the same series of tests on version 2.0, focusing on verifying the fitting accuracy of the primary chip slot, assessing the smoothness of liquid flow within the auxiliary chip, and checking whether any fluid retention persisted within the connecting tubing.

Learn

Although version 2.0 resolved several issues, software-based simulation tests revealed that the auxiliary chip, due to its inclined placement and insufficient slot fit, was not securely fixed in position. In addition, the solid trapezoidal platform consumed a considerable amount of 3D printing material, indicating the need for further structural optimization.

Cycle 3: Optimization for Mechanical Stability and Reduction of Material Loss

Design

Building upon the issues identified in version 2.0—namely insufficient fixation of the auxiliary chip and excessive material consumption in the trapezoidal platform—we further optimized the 3D-printed housing. The goal was to enhance the fixation of the auxiliary chip, reduce material usage, and maintain overall mechanical stability.

Build

Clamp addition: Because the auxiliary chip is placed at an angle and the slot size still does not perfectly match, we added clamping components to secure the chip more effectively and prevent loosening.

Platform optimization: The lower portion of the solid trapezoidal platform was hollowed out and reinforced with support structures, minimizing 3D printing material consumption while maintaining mechanical stability.

Application Scenarios of the Third Version of 3D Printed Chip Cartridge

Test

We conducted another round of hardware and software testing on version 3.0, examining the firmness of auxiliary chip fixation, the structural stability of the trapezoidal platform, and the effectiveness of material reduction in 3D printing.

Learn

Version 3.0 showed significant improvements in both auxiliary chip fixation and material efficiency. After verification, we obtained a well-refined 3D-printed chip enclosure suitable for integration into the hardware system, which has now been deployed in the project. Further refinements can be made based on continued user feedback and real-world performance.