SynBio is not just about genetic engineering or changing DNA. It is about designing and building new biological parts, devices, and systems, or redesigning natural systems to make them useful. It combines ideas from genetic engineering, biochemistry, systems biology, microbiology, and computer science, making it both an engineering discipline and an applied science. Engineering in SynBio follows a few key principles. Abstraction allows a part to be separated from its natural function and used somewhere else. Modularity lets us combine different parts to build bigger systems. Standardization makes sure parts fit together so that we can use the same tools and methods, like BioBrick assembly. Finally, characterization documents how parts behave, and this information is collected in the Parts Registry, which is useful for future projects. Using these principles, synthetic biologists work through a four-step engineering cycle: design, build, test, and learn. In the next sections, we explain how our team applied this cycle to our project.

Our Goal

Our goal was clear: we wanted to build an RNA detection system in which a protein scaffold condenses only in the presence of a specific RNA, enabling rapid in vivo sensing. To achieve this, we started by conceptually designing the system, initiating the first phase of the DBTL (Design-Build-Test-Learn) engineering cycle.

Engineering Cycle

Designing

To achieve our goal, two things became clear early on. First, we needed a valency-based synthetic condensate system as a foundation. This system could then be adapted to respond to specific RNAs by exchanging the linkers with RNA. Second, we needed reliable RNA-binding proteins.

During the first expert meetings of this project with Dr. Franzman, Prof. Dr. Alberti and Prof. Dr. Honigmann, a synthetic condensation system engineered by Michael K. Rosen was proposed (Li et al., 2012a). In our search for adaptable RNA-binding proteins, we found an engineered variant of the Pumilio factor family called ‘Pumby’ (Pumilio-based assembly). Pumbys exhibit a strong affinity toward RNA and are engineerable to bind RNA sequences up to 18 nucleotides (Adamala et al., 2016). With these two components, the first iteration of TRAPS was designed.

Design Cycle 1 – SH3-PRM-TRAPS

Illustration of the valency dependence of the SH3/PRM condensate system. The upper half shows SH3 and PRM chains of different lengths. Whether a condensate forms depends on the chain length, that is, the number of domain repeats: five and four repeats allow condensate formation, whereas three repeats do not.
Figure 1: The valency dependent SH3-PRM condensate.

The synthetic condensate system designed by Prof. Dr. Rosen consists of two engineered multivalent protein chains, in which each SH3 domain can bind to a PRM. This interaction results in the formation of a network visible as a condensate. Network formation is highly dependent on the valency, meaning the number of repeats of each binding domain in each chain. For example, if each chain consists of five repeats of the respective binding domain or motif, network formation is highly likely, resulting in a stable condensate. In contrast, if there are only three repeats, the likelihood of network formation is much lower (Figure 1) (Li et al., 2012b).

Animation of the first TRAPS-Pumby design using the SH3/PRM system. The PRM chain consists of only two repeats and the SH3 chain of five. PRM binds to SH3 but does not form a condensate. RNA is introduced to the freely diffusing proteins and connects them into a detectable condensate.
Figure 2: Design of the first TRAPS-Pumby system based on the SH3-PRM condensate.

We planned to make use of this valency dependence by using the SH3 chain with a high valency of five repeats and reducing the PRM chain to only two repeats. This valency combination results in a system that does not exhibit condensate formation. However, by fusing RNA-binding proteins to the N- and C-termini of the PRM chain, multiple PRM chains can connect to one specific target RNA, effectively increasing the PRM chain valency. This in turn results in target RNA-dependent condensate formation. To achieve this, it is important that we can precisely engineer the RNA-binding proteins, the Pumbys, to bind our sequence of choice.

Table 1: Overview of engineering Pumby domains for nucleotide specificity by changing the amino acids at positions 16 and 20
Ribonucleotide base | Amino acid 16 | Amino acid 20
Adenine (A) | Cysteine (C) | Glutamine (Q)
Guanine (G) | Serine (S) | Glutamic acid (E)
Cytosine (C) | Serine (S) | Arginine (R)
Uracil (U) | Asparagine (N) | Glutamine (Q)

Pumbys consist of concatenated RNA-binding domains of 36 amino acids, each binding to one nucleotide. The nucleotide specificity is determined by the amino acids at positions 16 and 20 (Table 1). To increase specificity while limiting the size of the Pumbys, we decided to design Pumbys consisting of 12 nucleotide-binding regions.
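The lookup in Table 1 can be sketched in a few lines of Python. This is only an illustration of the mapping, not our design software: the function name and the example site are hypothetical, and the sketch maps nucleotides to modules in the order given, without making any assumption about how the modules are oriented relative to the 5' and 3' ends of the RNA.

```python
# Residue pairs from Table 1: base -> (amino acid 16, amino acid 20), one-letter codes.
PUMBY_CODE = {
    "A": ("C", "Q"),
    "G": ("S", "E"),
    "C": ("S", "R"),
    "U": ("N", "Q"),
}


def pumby_residues(target_rna):
    """Return the (position 16, position 20) residue pair for each nucleotide.

    Hypothetical helper: one pair per module, in the same order as the input.
    """
    rna = target_rna.upper().replace("T", "U")  # accept DNA-style input as well
    unknown = set(rna) - set(PUMBY_CODE)
    if unknown:
        raise ValueError(f"unsupported bases: {sorted(unknown)}")
    return [PUMBY_CODE[base] for base in rna]


# Hypothetical 12-nt binding site, one module per nucleotide.
modules = pumby_residues("AUGGUGAGCAAG")
for base, (aa16, aa20) in zip("AUGGUGAGCAAG", modules):
    print(base, aa16, aa20)
```

A 12-module Pumby as described above would therefore need 12 such residue pairs, one per 36-amino-acid domain.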
With the Pumbys designed, we built a gene construct encoding the SH3 chain labelled with GFP. For the lower-valency PRM chain, we built two species to increase the chance of network formation. Each species had two Pumbys targeting different sequences on the target RNA. In total, four Pumbys were designed to target one RNA at different sites.

Before we ordered the DNA to introduce these proteins into our yeast, we double-checked our design and realized that the initial publication on the SH3-PRM system was in vitro work. The likelihood of this system working in vivo seemed very low, since SH3 domains generally bind proline-rich motifs present throughout the cell. With no proof that even the simple published synthetic system works in yeast, the uncertainty was too great to continue with this approach. Because of this, we started looking for a new synthetic condensate system for our idea.

While designing the Pumbys for the first system, we realized that the selection of a correct binding site in the target RNA is vital for the success of TRAPS. There was a multitude of factors to consider when looking for the correct target site. Therefore, we started to design our own software.

Design Cycles – RNA targeting Software

Software Cycle 2.1

While preparing our first constructs using Pumilio-based RBPs, we realized that common tools for binding-sequence design, e.g. for sgRNAs, which minimize off-target binding in a transcriptome dataset, could not accommodate our requirement of much shorter binding sequences of only 8-12 nt. To counteract the lower specificity of a shorter binding sequence, we planned to design multiple RBPs that bind to different sites on the target RNA.

Because the core problem of evaluating sequence similarity seemed simple enough, because we expected more requirements to reveal themselves as the project progressed, and out of curiosity for the challenge, we decided to build our own software tool. To evaluate a potential binding sequence (query), we counted the number of matched base pairs for each possible binding position in each transcript of the transcriptome and calculated a pseudo binding probability based on the Boltzmann factor, using the number of matching base pairs to estimate the energy of the bonds formed.
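The idea can be sketched as follows. This is a minimal, unoptimized illustration rather than our actual implementation: the function names, the energy per match, the value of kT and the normalisation across transcripts are all assumptions made for the example, and the sketch simply compares the query with each window for identity.

```python
import math


def match_counts(query, transcript):
    """Count matching positions for every alignment of the query along a transcript."""
    n = len(query)
    return [
        sum(q == t for q, t in zip(query, transcript[i:i + n]))
        for i in range(len(transcript) - n + 1)
    ]


def boltzmann_weight(matches, energy_per_match=1.0, kT=1.0):
    """Boltzmann factor exp(-E/kT), assuming E = -energy_per_match * matches."""
    return math.exp(energy_per_match * matches / kT)


def pseudo_binding_probabilities(query, transcripts):
    """Sum the window weights per transcript and normalise them to sum to 1."""
    weights = {
        name: sum(boltzmann_weight(m) for m in match_counts(query, seq))
        for name, seq in transcripts.items()
    }
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no transcript is long enough for this query")
    return {name: w / total for name, w in weights.items()}


# Hypothetical toy transcriptome: the query matches "target" perfectly once.
toy = {"target": "AAACCCGGGUUU", "other": "UUUUUUUUUUUU"}
probs = pseudo_binding_probabilities("CCCGGG", toy)
print(probs)
```

Because the weight grows exponentially with the number of matches, a single perfect window dominates many poor ones, which is the behaviour we wanted from a binding-like score.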

During the development of a first prototype, we tested different implementations of this moving-window sequence comparison, arriving at an implementation based on Numba (a JIT compiler for Python) that was almost 1000x faster than the very first implementation of the algorithm. Furthermore, the batch evaluation of all query sequences, i.e. all substrings of the target RNA, can be sped up by parallel computation distributed over many cores.

In its first iteration, the software was designed to compare all subsequences of the target RNA (mCherry) of a length n (these are the possible binding sites evaluated for Pumilio RBPs) against all sequences of a yeast transcriptome. While calculating the underlying binding metrics for each potential binding site on the target worked as expected, we were still missing a way to evaluate groups of binding sites simultaneously, as we had set out to do.

Software Cycle 2.2

We came up with more ideas for data we could use to model the binding more accurately. First, we decided to switch to a quantitative transcriptome dataset that includes the expression level of each transcript, allowing us to weight the binding affinities to the transcripts by their concentration. At the same time, we planned to look into the binding affinity towards the target, where the secondary structure of the target RNA could determine binding site availability. We also considered making the tool aware of sequence function, e.g. to avoid ribosome binding sequences and prefer UTRs. As we started to think about publishing this internal tool, we wanted to investigate the common ways in which tools like this are published, and how the usability of the codebase could be improved.

In this next iteration, we changed from a purely qualitative transcriptome dataset to a quantitative one, allowing us to weight the similarity to each transcript by the expression level of that transcript. This improved the search results by avoiding binding sites on highly abundant transcripts. We also reworked most of the visualisations. A major addition was the option to evaluate not just single binding sites but groups of binding sites, to avoid overlapping off-target effects in situations where multiple probes are used simultaneously, as in our case with multiple Pumilio RBPs and sgRNAs for Cas13. Since the codebase up to this point was a mix of .py files and Jupyter notebooks, we decided to put all essential functions into a compact .py file and to additionally offer an example implementation as well as a detailed explanatory Jupyter notebook showing each of the core functions. In addition to making the tool available for local installation via conda, pip, and uv (suggested by iGEM team Potsdam), we started to experiment with a web interface for our tool to be integrated into the wiki.
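One way to picture expression weighting and group evaluation is the sketch below. It is an assumption-laden illustration and not the scoring used in our tool: the function names, the use of the best-matching window, and the product rule for groups are all choices made for this example, and the toy transcriptome is assumed to exclude the target itself.

```python
from itertools import combinations


def best_match_fraction(query, transcript):
    """Fraction of matching positions in the best-matching window (0.0 to 1.0)."""
    n = len(query)
    if len(transcript) < n:
        return 0.0
    best = max(
        sum(q == t for q, t in zip(query, transcript[i:i + n]))
        for i in range(len(transcript) - n + 1)
    )
    return best / n


def weighted_off_target(query, transcriptome, expression):
    """Similarity to each transcript, weighted by that transcript's expression level."""
    return sum(
        expression[name] * best_match_fraction(query, seq)
        for name, seq in transcriptome.items()
    )


def group_off_target(queries, transcriptome, expression):
    """Score a probe group: a transcript only scores highly if it resembles all probes."""
    score = 0.0
    for name, seq in transcriptome.items():
        joint = 1.0
        for query in queries:
            joint *= best_match_fraction(query, seq)
        score += expression[name] * joint
    return score


def best_group(candidates, size, transcriptome, expression):
    """Pick the group of `size` candidate sites with the lowest group score."""
    return min(
        combinations(candidates, size),
        key=lambda group: group_off_target(group, transcriptome, expression),
    )


# Hypothetical off-target transcripts with made-up expression levels.
transcriptome = {"t1": "AAAACCCC", "t2": "GGGGUUUU"}
expression = {"t1": 10.0, "t2": 1.0}
chosen = best_group(["AAAA", "CCCC", "GGGG"], 2, transcriptome, expression)
print(chosen, group_off_target(chosen, transcriptome, expression))
```

In this toy case the pair that sits together on the highly expressed transcript t1 is penalised, while pairs whose off-target hits fall on different transcripts are preferred.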

Talking to more experts and testing example sequences revealed that RNA secondary structure prediction is still very much a field in progress, leading us to decide against a direct integration. Similarly, we have (for now) not implemented any sequence-function-based evaluation, as this evaluation might depend on the scientific question addressed; the tool focuses purely on off-target effects in the transcriptome. However, as the list of binding sites evaluated by the tool can be changed by the user, it is easy to apply a pre-selection of potential binding sites based on prior scoring from other tools, or to restrict the binding site search to certain regions. Looking into the hardware of the iGEM server hosting the wiki, we realized that our project would not be able to run natively on these servers; for the same performance reasons, an implementation running in the client's browser was discarded. iGEM's wiki restrictions do not allow hosting our own server to run the backend with the frontend in the wiki, so we discarded the idea of an online demo tool as part of the wiki.

This second design cycle confronted us with the questions of which features are viable and which make sense to include for the target audience, balancing 'smart', i.e. predetermined, behaviour against versatility at the cost of complexity, and against the time and work it would take to implement. Similar problems came up with the documentation, where we had to find a balance between simple, easy-to-understand documentation for new users and the details and optional parameters that more experienced users might be interested in.

Here Comes the Heidenreich System

It became apparent that the initial synthetic condensate wouldn’t work in yeast, so we continued looking for more options. When researching further, we found a fitting synthetic system published by Heidenreich, which had even been tested in the Alberti Lab already (Heidenreich et al., 2020). Additionally, we considered not only using Pumbys as our RNA-binding proteins but also dead Cas13. Using dCas13 was further evaluated and discussed with Yu (Elsa) Wei from the Hyman lab, who has previously worked extensively with Cas13.

Animation of the synthetic condensate designed by Heidenreich. Tetramerization units form a tetramer and are connected via the E9-Im2 interaction to the dimer. This results in many tetramers connecting to each other resulting in a large network visible as a condensate.
Figure 3: Synthetic condensate designed by Heidenreich et al.

The system designed by Dr. Heidenreich consists of two proteins: a tetramerization unit labelled with GFP and fused to a toxin (E9), and a dimerization unit connected to the respective antitoxin (IM2). Owing to its multivalency, the system forms a condensate with liquid-like properties based on the E9-IM2 link (Figure 3). We then had the idea to replace the dimerization unit with RNA, so that multiple GFP-labelled tetramers are linked by the specific RNA.

From this, we designed two new TRAPS systems, both based on the synthetic condensate system, one combined with Pumbys and one with Cas13 as RNA binding proteins.

Design Cycle 3 – TRAPS-Pumby keeping E9 and IM2

To ensure a high likelihood of the condensate formation, we needed not just two but multiple binding sites of our RNA-binding proteins to our target RNA. With more binding sites, one RNA can connect more than just two tetramers. For the TRAPS-Pumby system, we decided that four binding sites should be sufficient to begin with.

Animation of the first TRAPS-Pumby system using the synthetic condensate designed by Heidenreich. The dimerization domain is replaced by Pumby domains targeting different sites on the target RNA determined by our software. Tetramerization units form a tetramer and are connected via the E9-Im2 interaction to the Pumbys. Once the target RNA is introduced the Pumbys will bind and connect multiple tetramers, forming a large network visible as a condensate.
Figure 4: TRAPS-Pumby system with E9 and IM2 connection.

We initially planned on keeping the E9-IM2 link to maintain the liquid-like nature of the condensate. Therefore, we kept the tetramer as it was and created three versions of an IM2-Pumby fusion protein, each targeting a different site on our target mRNA. These three proteins should bind to the available E9 binding sites on the tetramer via IM2. If the target RNA is present in the cell, tetramers will be connected via the Pumbys (Figure 4).

It was time to get serious and think of a cloning strategy. For this we met with two cloning experts from the MPI-CBG in Dresden, Dr. Aliona Bogdanova and Andrey Pozniakovsky. In this meeting we decided to use conventional cloning and ordered all the genes with SacI and MluI restriction sites at the ends for insertion into the plasmids available in the Alberti lab.

Most importantly, one more thing became apparent: the E9/IM2 ratio should be close to 1:1 in order to avoid many freely diffusing Pumbys that would not contribute to the network when bound to the target RNA. With three designed proteins containing IM2 (the three different IM2-Pumbys) and only one containing E9 (the tetramerization unit), this ratio is almost impossible to achieve. With this realization, we went back to the drawing board to design a new TRAPS-Pumby system.
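The stoichiometry problem can be made concrete with a small calculation. This is only back-of-the-envelope arithmetic under the simplifying assumption that each protein's abundance is proportional to a single relative expression level; the function name and the numbers are hypothetical.

```python
from fractions import Fraction


def im2_to_e9_ratio(im2_levels, e9_level):
    """Total IM2-carrying protein relative to E9-carrying protein."""
    return sum(im2_levels) / e9_level


# Three IM2-Pumby species, each expressed as strongly as the E9 tetramerization unit.
equal_expression = im2_to_e9_ratio([Fraction(1)] * 3, Fraction(1))
print(equal_expression)  # IM2 is in threefold excess

# To reach 1:1, every IM2-Pumby would have to sit at one third of the E9 level.
tuned = im2_to_e9_ratio([Fraction(1, 3)] * 3, Fraction(1))
print(tuned)
```

Holding three separate cassettes at exactly one third of a fourth cassette's expression level, while also keeping them equal to each other, is what made this design so hard to realize in practice.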

Design Cycle 4.1 – TRAPS-Pumby

We decided to leave out the E9-IM2 interaction altogether, since there was no viable option to keep the interaction while still having multiple different Pumbys targeting different sites on the target RNA. We still wanted to have three Pumbys and decided to replace the E9 toxin entirely with Pumbys. This resulted in the tetramerization domain being covalently connected to its respective Pumby. Three versions of these tetramerization–Pumby fusion proteins were created.

Animation of the second and final TRAPS-Pumby design. Pumby domains are directly connected to the tetramerization units replacing the E9-toxin. Tetramers will assemble. Additionally, a Pumby is fused to the dimer and dimers will assemble. All proteins are freely diffusing in the cytosol. Once the target RNA is introduced it will be bound by the Pumbys connecting multiple tetramers, forming a large network visible as a condensate.
Figure 5: Final Pumby design.

If we had stopped here, there would have been a chance that one tetramer contained all three versions and could occupy one target RNA completely, meaning all possible binding sites of the RNA would be bound by one tetramer, which in turn would not contribute to network formation. Even if this outcome is statistically unlikely, we still wanted to ensure that at least two tetramers would be connected by one RNA. To achieve this, we planned to introduce a fourth Pumby connected to the original dimerization unit of the system. With this, the dimer will have two identical binding sites for the RNA, so in theory each dimer is connected to two target RNAs, while the tetramers can bind to the remaining binding sites on the RNA, thereby forming a network (Figure 5).

These four proteins, the three tetramerization units connected to different Pumbys and the one dimerization unit connected to a Pumby, composed our TRAPS-Pumby system.

Design Cycle 4.2 - TRAPS-Cas13 Design

Animation of the TRAPS-Cas13 design. Cas13 domains are fused to the Im2 antitoxin. Tetramerization units form a tetramer and are connected via the E9-Im2 interaction to the Cas13 domain. The binding sites of the Cas13 units are defined by the introduction of four gRNAs targeting different sites on the target RNA. Once the target RNA is introduced, it will be bound by the Cas13 units, connecting multiple tetramers and forming a large network visible as a condensate.
Figure 6: TRAPS-Cas13.

The TRAPS-Cas13 design is simpler, since the binding site of Cas13 on the RNA is not defined by its amino acid sequence, as in the case of Pumbys, but by the gRNA bound to Cas13. The tetramerization unit can remain as it is, resulting in a tetramer with four E9 binding sites, and the dimerization unit is replaced by Cas13. We initially used a miniaturized Cas13 version, Cas13X.1 (Xu et al., 2021). Cas13 fused to IM2 will bind to the four available E9 binding sites, maintaining the liquid-like E9-IM2 connection (Figure 6).

If a set of gRNAs is now introduced, the gRNAs will be distributed among the Cas13 proteins, thereby defining their binding sites on the target RNA. We again decided to initially have four binding sites on the target RNA, so four gRNAs were designed, each one binding to a Cas13 domain. To introduce the gRNAs, we used a self-cleaving gRNA cassette that can be genomically integrated and produces a self-cleaving RNA that results in a functional gRNA (Utomo et al., 2021).

The introduction of two proteins, the tetramerization domain as it is and the IM2-Cas13 fusion protein, results in freely diffusing tetramers with multiple Cas13 units noncovalently bound. The introduction of gRNAs now defines the target RNA for the Cas13 binding, resulting in network formation and condensation.

We have two systems that both can form a large network once a targeted RNA is present, resulting in visible local up-concentration of GFP signal. With TRAPS-Pumby, the RNA target is defined by the amino acid sequence of the Pumbys, whereas the TRAPS-Cas13 system defines the target by the introduction of gRNAs. In the Pumby system, four proteins need to be designed and integrated, whereas in the TRAPS-Cas13 system only two proteins are needed. The E9-IM2 bond that ensured the liquid-like behaviour of our synthetic condensate is still present in the TRAPS-Cas13 system but was lost in the TRAPS-Pumby system.

At this point, it became apparent that Cas13 is most likely the better RNA-binding protein for our application, being substantially more adaptable to new targets and significantly less complicated. Additionally, the network that results once an RNA is bound might also behave more like a liquid due to the E9-IM2 link. The only upside of the TRAPS-Pumby system is the very high affinity of the Pumbys for their target RNA sequences.

Building

As a proof of concept we used the mRNA of the fluorescent protein mCherry as our first target RNA in S. cerevisiae. This has multiple advantages. Since it is an artificially introduced gene that is not naturally expressed in yeast we can control the expression with a conditional promoter. For this we used a galactose-dependent promoter, which only activates when we switch from glucose medium to galactose medium for the growing cells (Hovland et al., 1989). This allows us to precisely observe whether condensates form when we activate the promoter. Additionally, we can directly assess the impact of our system on translation of the target RNA by looking at the mCherry fluorescence.

Build Cycle 1 – Cloning Strategy: Gateway Cloning

We planned on using Gateway cloning, since it is a well-established method in our lab. For this, we planned to add attB sites to the genes that we order and introduce them into donor vectors (Hartley, 2000).

When we became more concrete with the design of the gene cassettes, we realized that we would probably need to replace promoters and exchange genes quite often to improve our system step by step. We realized that for such a fast, iterative process, Gateway cloning is not optimal and decided instead to use conventional cloning to introduce our TRAPS genes. However, to integrate the conditional mCherry, we used the Gateway cloning approach, since a yeast-optimized mCherry entry clone was already available.

Build Cycle 2 – Cloning Strategy: Conventional Cloning

For our TRAPS proteins, we decided on using conventional cloning and selection with auxotrophic markers (Burke et al., 2000). To avoid using too many auxotrophic markers, we integrated two proteins per plasmid, resulting in only three auxotrophic markers in total per system.

In the TRAPS-Cas13 system, a tryptophan-selectable plasmid (Figure 7A) was used for mCherry transformation, a uracil-selectable plasmid (Figure 7B) was used for the integration of both our proteins, and a leucine-selectable plasmid (Figure 7C) was used to integrate the gRNA coding cassette. In the TRAPS-Pumby system, tryptophan (Figure 7A) was again used for mCherry, uracil (Figure 7B) for two of the tetramerization-Pumby units, and leucine (Figure 7C) for the last tetramerization-Pumby unit and the dimer-Pumby unit.

To ligate two protein cassettes, a SphI restriction site was placed at one end of one cassette and at the opposite end of the associated cassette. For the introduction into the vectors, we used SacI and MluI restriction sites to remove the toxic ccdB cassette and insert our cassette. Additionally, the enzymes cut directly in front of or behind M13 primer sites, allowing us to sequence the insert after ligation.
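A cloning strategy like this only works if the chosen enzymes do not also cut inside the ordered genes, which can be checked with a simple scan. The following is a minimal sketch and not part of our workflow: the function name and the example cassette are hypothetical, while the recognition sequences are the standard ones for SacI, MluI and SphI. All three are palindromic, so scanning one strand is sufficient.

```python
# Standard recognition sequences of the three enzymes used in this strategy.
SITES = {"SacI": "GAGCTC", "MluI": "ACGCGT", "SphI": "GCATGC"}


def find_sites(seq, sites=SITES):
    """Return the 0-based start positions of every recognition site in a DNA sequence."""
    seq = seq.upper()
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)  # step by one to catch overlaps
        hits[enzyme] = positions
    return hits


# Hypothetical toy cassette: SacI site, first ORF, SphI junction, second ORF, MluI site.
cassette = "GAGCTC" + "ATGAAACCC" + "GCATGC" + "TTTGGG" + "ACGCGT"
print(find_sites(cassette))
```

Any enzyme that shows more hits than the sites placed on purpose would indicate an internal site that has to be removed, for example by a silent mutation, before ordering.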

Illustration of the plasmids used for the transformation of yeast with our proteins. A: pAG304GAL-ymCherry is an expression vector containing a yeast-optimised mCherry. It is a genomically integrating vector with a tryptophan coding cassette. B: The pAG416GPD-ccdb destination vector is used for ligation with the designed gene coding cassettes of the TRAPS-Cas13 system and two of the four proteins of the TRAPS-Pumby system. Restriction and ligation happen at the SacI and MluI restriction sites. It is a centromeric vector with a uracil coding cassette. C: The pAG305GPD-ccdb destination vector is used for ligation with the designed gRNA coding cassette of the TRAPS-Cas13 system and the other two of the four proteins of the TRAPS-Pumby system. Restriction and ligation happen at the SacI and MluI restriction sites. It is a genomically integrating vector with a leucine coding cassette.
Figure 7: Used plasmids for the transformation of TRAPS.

Build Cycle 3 - Construct Building

For each protein, the respective construct was built. All constructs are controlled by a GAP (BBa_K4292002) promoter and an ADH1 (BBa_K1486025) terminator derived from the iGEM registry. As previously mentioned, the ratios of the proteins relative to each other are important. In the TRAPS-Pumby system, the expression ratio of the three different tetramerization units should be 1:1:1, and in the TRAPS-Cas13 system the ratio of both proteins should also be 1:1 for efficient network formation.

Since protein expression has many variables, expression rates are hard to predict. Therefore, we added an antibody tag to all proteins to check expression levels with a western blot after integration. Proteins of similar size, specifically the three tetramerization-unit-Pumby variants, were given different antibody tags. Once we know the expression rates, we can increase or reduce them as needed by exchanging the promoter of individual cassettes. For this purpose, each promoter was flanked by two unique restriction sites within the plasmid, enabling simple exchange.

Build Cycle 4 - Considering potential failure

Before we finalized and ordered our fragments, we considered all likely failure points of our system and tried to implement strategies that would allow us to directly demonstrate failure.

The first possible failure would be if our large fusion proteins were not expressed, or misexpressed. This was already testable by the western blot. Since both Cas13 (444 aa) and Pumbys (432 aa), which are either directly or indirectly connected to the tetramer, are substantially larger than the tetramerization domain (31 aa), we hypothesized that the domain might no longer form a tetramer due to steric clashes between the large bound proteins. We planned a strategy for testing this in the TRAPS-Pumby system. One of the three different tetramerization-Pumby units was not fused to GFP. This unit is the only one of the three that has a myc tag. With this setup, we could perform a pulldown assay against the myc tag, resulting in this specific tetramerization unit remaining bound in the gel. If the other two units, which are labelled with GFP, were unable to bind and form a tetramer with this unit, there would be no GFP fluorescence remaining in the gel. On the other hand, if they form a tetramer, there should be strong fluorescence in the gel even though only the unlabelled tetramerization unit was directly bound.

A further possible failure point is the binding of our RNA-binding proteins to the respective RNA binding sites. Here again, a pulldown assay could demonstrate binding. If we perform a pulldown against one of the antibody tags connected to our RNA-binding proteins, we should be able to extract the RNA that remains in the gel. One common problem in protein purification is contamination with RNA, so we already know that RNA often remains in the gel anyway. However, if our proteins bind the target RNA efficiently, it should be enriched compared to the usual RNA impurities.

Cas13-Constructs

Illustration of the TRAPS-Cas13 units and coding cassettes. Top: illustration of the tetramerization unit, with individual domains, parts and restriction sites labelled on the coding cassette depiction. Bottom: illustration of the Im2-Cas13 unit, with individual domains, parts and restriction sites labelled on the coding cassette depiction.
Figure 8: TRAPS-Cas13 protein coding cassettes, with the Tetramerization Unit and the Im2-Cas13 Unit, designed to be cloned into the pAG416-ccdb plasmid.
Illustration of the TRAPS-Cas13 gRNA coding cassettes. Four different sequences for the gRNA are placed into the general self-cleaving gRNA coding cassette with individual parts and restriction sites labelled.
Figure 9: TRAPS-Cas13 gRNA coding cassette designed to be cloned into the pAG305GAP-ccdb plasmid.

Pumby-Constructs

Illustration of two of the TRAPS-Pumby units and coding cassettes. Top: illustration of one tetramerization-Pumby unit, with individual domains, parts and restriction sites labelled on the coding cassette depiction. Bottom: illustration of a second tetramerization-Pumby unit, with individual domains, parts and restriction sites labelled on the coding cassette depiction.
Figure 10: TRAPS-Pumby protein coding cassettes, with two Tetramerization-Pumby Units, designed to be cloned into the pAG416GAL-ccdb plasmid.
Illustration of the other two TRAPS-Pumby units and coding cassettes. Top: illustration of the third tetramerization-Pumby unit, with individual domains, parts and restriction sites labelled on the coding cassette depiction. Bottom: illustration of the dimerization-Pumby unit, with individual domains, parts and restriction sites labelled on the coding cassette depiction.
Figure 11: TRAPS-Pumby protein coding cassettes, with one Tetramerization-Pumby Unit and the Dimerization-Pumby Unit, designed to be cloned into the pAG305GAL-ccdb plasmid.

Testing

Once the protein-coding cassettes were designed and the cloning strategy was clear, we started the testing by first introducing the mCherry gene into the yeast cells.

Testing Cycle 1 – mCherry integration, expression and microscopy

The mCherry DNA was genomically integrated with a galactose-dependent promoter in the tryptophan locus using Gateway cloning.

Upon changing the medium from glucose to galactose, thereby activating the galactose promoter, a strong increase in fluorescence is observed. Almost no fluorescence is visible in the cytosol of the uninduced cells, whereas strong fluorescence can be seen in the induced cells (Figure 12 A, B, C, D), which is also reflected in the analysis of the cumulative fluorescence intensity of the cells (Figure 12 E). Interestingly, we observed fluorescence in the vacuoles of the uninduced cells (Figure 12 B).

Figure 12: mCherry expression test. A, B: Images of uninduced cells continuously grown in glucose medium. C, D: Gal-induced cells grown in galactose medium. E: Analysis of the fluorescence intensity of both groups.

The conditional expression of mCherry under the galactose-dependent promoter was successful, and the resulting yeast strain was used for further tests of the TRAPS system targeting the mCherry RNA. The fluorescence of the vacuoles, even in the uninduced sample, was initially confusing. However, upon further research, we learned that adenine-deficient yeast strains, such as ours, accumulate an intermediate of the adenine biosynthesis pathway that is autofluorescent and is typically stored in the vacuole (Banta et al., 1988). Thus, the vacuole fluorescence we observed is independent of mCherry expression and not relevant for our further testing.

After we created our mCherry-expressing strain, we introduced the TRAPS-Cas13 proteins and the gRNA targeting the mCherry mRNA.

Testing Cycle 2 – TRAPS-Cas13 integration, expression and microscopy

Once the genes for our TRAPS-Cas13 proteins arrived, the two cassettes were ligated and introduced to our centromeric plasmid with the uracil auxotrophic marker. This plasmid was then transformed into the conditionally mCherry-expressing strain using our optimized transformation protocol. The same was performed with the gRNA coding cassette.

After successfully transforming the yeast, we started testing the functionality of TRAPS. We imaged one uninduced group, in which no mCherry RNA should be present, and one galactose-induced group expressing mCherry (Figure 13 B, E). When looking at our GFP-labelled TRAPS proteins, we saw local up-concentrations (Figure 13 C, F, arrows) indicating that our proteins have formed “droplets” in the cell. This is the desired outcome for our TRAPS system, but only when the RNA is present. Unfortunately, we found the droplet formation was independent of mCherry expression, therefore independent of our target RNA.

Figure 13: Testing the first iteration of TRAPS-Cas13. A-C: Images of uninduced cells continuously grown in glucose medium. D-F: Gal-induced cells grown in galactose medium.

At this point, we already speculated that we were not observing condensate formation but rather aggregation of our proteins independent of RNA, but this had to be proven. First, we needed to confirm that the formation of these droplets is RNA-independent since the conditional promoter can be leaky (Peng et al., 2015), resulting in RNA expression and condensate formation even without galactose induction. Then, we aimed to determine which of the two proteins caused the aggregation by splitting apart the two cassettes and transforming them individually.

Having observed the first microscopy results, we wanted to test whether our proteins are expressed at the correct size and, if so, in what ratio. As mentioned previously, an optimal ratio is 1:1. To test this, we performed a western blot.

Testing Cycle 3 – Western Blot

We were kindly taught the method of western blotting by PhD Student Moo Koo Kang of the Alberti Lab. Western blotting was conducted to confirm that our protein construct is expressed and to assess its molecular weight, thereby also confirming the success of the transformation process. Furthermore, western blotting allows the detection of potential protein degradation and enables the comparative analysis of construct expression levels in cell lines with gRNA and without gRNA expression, providing insight into expression efficiencies. To further investigate our suspicion that we are observing aggregates rather than the desired condensates, western blotting was performed on the Total cell lysate (T), which includes both soluble and insoluble fractions, as well as on the Supernatant (S), which contains primarily soluble proteins like the proteins forming condensates.

The protein constructs were designed to be detected through immunoblotting. Both constructs were tagged with a myc tag. As the GFP-E9-Tetra construct is approximately 48 kDa and the IM2-Dim construct approximately 65 kDa, it should be possible to distinguish the two constructs in the immunoblot. Furthermore, since the GFP-E9-Tetra construct is GFP-tagged, the samples were split into heated (denatured) and unheated (native) samples before performing SDS-PAGE, so that the GFP remains fluorescent in the native sample.

After two failed western blot attempts and extensive troubleshooting, a successful western blot was performed. Both protein constructs are expressed at the expected molecular weight, and no degradation was detected. The results from the supernatant and total fractions of the cell lysate further supported our suspicion that we are in fact observing aggregates, as there is more protein visible in the T fraction than in the S fraction. Condensates are expected to be in the soluble S fraction, whereas aggregates are expected to be in the insoluble fraction, which in this experiment is contained only in the T fraction (Alberti & Hyman, 2021). The experiments also revealed higher expression in the cell lines without gRNA expression than in the cell lines with gRNA expression. These results tell us that both proteins are expressed, but we would need to adjust the promoter controlling the expression of the tetramerization unit to a lower expression level to match the expression of our Im2-Cas13 unit, resulting in a 1:1 E9/Im2 ratio. Since we had already planned to introduce a new Cas13 version, which will most likely change the expression level of the unit, we have not adjusted the promoter yet.
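The comparison of the T and S fractions, and of the two constructs, could be put into numbers with band densitometry. The following is only a minimal sketch: it assumes background-corrected band intensities, equal loading of the T and S lanes, and one myc tag per construct so that band intensities are comparable. The function names and the example values are hypothetical and are not taken from our blot.

```python
def soluble_fraction(total_intensity, supernatant_intensity):
    """Share of the protein found in the soluble S fraction relative to the total lysate T."""
    if total_intensity <= 0:
        raise ValueError("total_intensity must be positive")
    return supernatant_intensity / total_intensity


def expression_ratio(e9_intensity, im2_intensity):
    """E9:Im2 band intensity ratio; 1.0 corresponds to the desired 1:1 ratio."""
    if im2_intensity <= 0:
        raise ValueError("im2_intensity must be positive")
    return e9_intensity / im2_intensity


# Hypothetical band intensities in arbitrary units (illustration only).
example_total = 1000.0
example_supernatant = 250.0
print(f"soluble fraction: {soluble_fraction(example_total, example_supernatant):.2f}")
print(f"E9:Im2 ratio: {expression_ratio(1500.0, 500.0):.1f}")
```

A low soluble fraction would point towards aggregates, and a ratio well above 1 would indicate that the tetramerization unit is expressed more strongly than the Im2-Cas13 unit.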

Testing Cycle 4 – Testing the Aggregation

To check whether our proteins aggregate independent of RNA, we transformed our plasmid carrying TRAPS into WT yeast, where no mCherry RNA is present. Additionally, we separated the two cassettes and integrated and transformed them individually to test whether our labelled tetramer also aggregates without the IM2-Cas13 unit bound to it.

As we expected, the droplets formed completely independent of RNA, since we also observed them during the microscopy of the WT cells that were transformed with our TRAPS proteins (Figure 15 A-C).

Figure 15: Testing for aggregation. A-C: Images of the TRAPS-Cas13 system introduced into WT W303 yeast cells, which lack the mCherry gene. D-F: Images of yeast cells carrying only the tetramerization unit of the TRAPS-Cas13 system.

The individual expression of the GFP-labelled tetramerization unit resulted in distributed GFP signal within the cytosol (Figure 15 D-F), further confirming our suspicion that the aggregation is caused by misfolding of the Cas13 domain.

FRAP

Another possible explanation for our observation is that Cas13 lacks specificity in its binding to the gRNAs. This would mean that Cas13 binds not only the designed gRNAs and subsequently the target RNA, but also other RNAs in the cell. These unspecifically bound RNAs could connect the TRAPS tetramers, resulting in condensate formation independent of our target RNA. In that case, the system should still behave liquid-like, since the bond between the tetramers is RNA-mediated, even though it is not mediated by the target RNA. This contrasts with the expectation for aggregates, which behave more solid-like (Taylor et al., 2019).

To test this, we performed a FRAP (Fluorescence Recovery After Photobleaching) experiment to examine the fluidity of our droplets. If there is substantial molecular movement within the droplet, as expected for liquid-like behaviour, the fluorescence of a bleached part of the droplet should recover quickly, since unbleached molecules migrate into the bleached area. If, on the other hand, there is little molecular movement within the droplet, as expected for solid-like aggregates, the recovery should be slow, since unbleached molecules struggle to migrate into the bleached area.
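The reasoning behind FRAP can be illustrated with a short analysis sketch. It assumes a list of mean intensities of the bleached region over time, a known number of pre-bleach frames, and no correction for photobleaching during acquisition. All function names and the two example traces are hypothetical and serve only to contrast liquid-like and solid-like behaviour; they are not our measured data.

```python
def normalize_frap(intensities, pre_bleach_frames):
    """Scale a FRAP trace so that pre-bleach = 1 and the first post-bleach frame = 0."""
    pre = sum(intensities[:pre_bleach_frames]) / pre_bleach_frames
    post = intensities[pre_bleach_frames]  # first frame after bleaching
    if pre == post:
        raise ValueError("no bleaching detected")
    return [(value - post) / (pre - post) for value in intensities[pre_bleach_frames:]]


def mobile_fraction(normalized, plateau_frames=3):
    """Estimate the mobile fraction as the mean of the last frames of the normalized trace."""
    tail = normalized[-plateau_frames:]
    return sum(tail) / len(tail)


def half_time(times, normalized, plateau):
    """Time at which recovery first reaches half of the plateau (linear interpolation)."""
    target = plateau / 2
    for i in range(1, len(normalized)):
        if normalized[i - 1] < target <= normalized[i]:
            span = normalized[i] - normalized[i - 1]
            step = times[i] - times[i - 1]
            return times[i - 1] + (target - normalized[i - 1]) * step / span
    return None  # half-recovery was not reached within the recording


# Hypothetical traces: two pre-bleach frames followed by six post-bleach frames.
post_bleach_times = [0, 2, 4, 6, 8, 10]  # seconds after bleaching
liquid_like = [100, 100, 20, 60, 80, 88, 90, 90]
solid_like = [100, 100, 20, 21, 22, 22, 22, 22]

for name, trace in (("liquid-like", liquid_like), ("solid-like", solid_like)):
    curve = normalize_frap(trace, pre_bleach_frames=2)
    plateau = mobile_fraction(curve)
    print(name, round(plateau, 3), half_time(post_bleach_times, curve, plateau))
```

A mobile fraction close to 1 with a short half-time would be expected for liquid-like condensates, whereas a mobile fraction close to 0 would be expected for solid-like aggregates.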

Intracellular FRAP experiments on condensates or aggregates are quite difficult, since these structures are usually small. Typically, FRAP is performed in vitro on large aggregates of purified proteins. Nevertheless, we performed the experiment by bleaching a part of the droplet and observed almost no recovery (Figure 16). A similar experiment was performed on the original condensate system published by Heidenreich et al. (2020, Figure 3b). An adaptation of that result is depicted here as the expected recovery (Figure 16).

Figure 16: FRAP of TRAPS-Cas13: A-I: Images of fluorescently bleached droplets in our cells before bleaching, and 2 seconds or 10 minutes after bleaching. J: Graph plotting the fluorescence recovery over time in seconds.

As expected for aggregates, there is almost no recovery after photobleaching, and the recovery is not comparable to the recovery observed in the original system. This provides final confirmation that we are observing aggregates and not condensates.

We can therefore conclude with good confidence that the Cas13 domains do not fold properly, leading to direct protein-protein interactions and subsequent aggregation of the entire system.

After confirming that the TRAPS system does not respond correctly to our target RNA because of aggregation of the Cas13 unit, which produces a signal without the involvement of our target RNA, we explored further Cas13 options.

Testing Cycle 5 – TRAPS-Cas13 with a RfxCas13d unit

We continued searching for a suitable Cas13 version to fuse to our IM2 unit. We found a catalytically inactive RfxCas13d variant that was reported to bind RNA effectively in yeast (Zhang et al., 2022). A new gene fragment was constructed by replacing the Cas13 unit; it was then integrated into our plasmid, transformed, and tested.

Figure 17: Testing the second iteration of TRAPS-Cas13. A-C: Images of uninduced cells continuously grown in glucose medium. D-E: Gal-induced cells grown in galactose medium.

Again, we observed the previously described aggregates independent of RNA presence (Figure 17). Extensive research was needed to overcome the aggregation problem.

Testing Cycle 6 – TRAPS-Cas13 with redesigned RfxCas13d unit

The observed aggregation was disappointing, since the RfxCas13d version of Cas13 is the only catalytically inactive variant reported to work in yeast. We therefore considered that Cas13 might not be compatible with our system, which would mean the end of our approach, at least in yeast. However, we did not give up and tried to determine whether there was a plausible and resolvable cause for this aggregation.

Upon a close look at the publication about the RfxCas13d, we realized two things. First, the version that they used was not yeast-optimised, as the gene for the protein was directly taken from a provided plasmid of previous work on human cell lines with this Cas13 application (Han et al., 2020). Yeast optimisation could potentially increase the expression levels, although it would likely not resolve the aggregation issue. Second, we found in the supplementary data of the initial publication on human cells that the efficacy of the RfxCas13d depends on the position of protein fusion: N-terminal fusion leads to aggregation and C-terminal fusion results in a functional Cas13 domain (Han et al., 2020, Figure S5).

In our construct the IM2 unit is fused N-terminally to the RfxCas13d domain, which is therefore the likely cause of aggregation. A new yeast-optimised version of our protein with a C-terminal IM2 fusion has been ordered and is expected to arrive on the 07.10. This new version will be tested and, hopefully, will lead to a functional system.

With this we found a very reasonable explanation of the continued aggregation and moved forward by addressing the problem with a new gene. However, we had reached the end of the development of TRAPS-Cas13 up to this point.

Testing Cycle 7 – TRAPS-Pumby integration, expression and microscopy

The TRAPS-Pumby genes were ligated, integrated into the respective vectors, and transformed into the yeast strain carrying the mCherry gene. After successfully transforming the yeast, we again examined one galactose-induced group and one uninduced group.

Microscopy revealed a slightly different result compared to the first TRAPS-Cas13 test. We again observed droplets independent of galactose induction, and therefore independent of our target RNA (Figure 18 C, F), but there were some interesting additional observations. Compared to the fluorescence in the TRAPS-Cas13 microscopy, we observed a substantial amount of seemingly unaggregated cytosolic GFP signal. While droplets were present, distributed GFP signal was also visible in the cytosol.

Another interesting observation was that mCherry fluorescence appeared reduced in cells that showed high GFP intensity and numerous droplets (Figure 18 E, F, arrows).

Figure 18: Testing the first iteration of TRAPS-Pumby. A-C: Images of uninduced cells continuously grown in glucose medium. D-F: Gal-induced cells grown in galactose medium.

We also observed aggregation independent of RNA, but since unaggregated GFP was still present, we can conclude that at least some of our proteins are not aggregating. Of the four proteins that we introduced, only two are labeled with GFP (two of the tetramerization units) and two are unlabeled (the third tetramerization unit and the dimer). It is possible that only one of the two labeled proteins aggregates while the other does not.

A more likely explanation for this partial aggregation is that, since we expressed four very similar proteins all under the highly expressive GAP promoter, we may have reached a concentration threshold where aggregation begins. Because we expressed four only slightly different Pumbys targeting different sites on the target RNA, this high Pumby concentration may have caused some of them to aggregate.

The interesting observation that mCherry fluorescence appeared reduced in cells with high GFP fluorescence may indicate that free, unaggregated Pumbys were binding our target mCherry mRNA and reducing its translation. All these assumptions still need to be tested. To determine whether only one of the tetramers aggregates, we would need to separate the protein cassettes and transform them individually. To test whether the aggregation is caused by high protein concentration, we could replace the GAP promoter with a less expressive promoter.

These results (Figure 18) were obtained at almost the same time that we confirmed aggregation in the TRAPS-Cas13 system, and we realized that it would be impossible within the available timeframe to continue testing and validating both TRAPS systems. We therefore actively decided, after the first iteration of the TRAPS-Pumby system, to pause its development and fully focus our efforts on the TRAPS-Cas13 system, since it appeared more promising due to its rapid adaptability to new targets, reduced complexity, and retention of the E9–IM2 interaction.

Learning

TRAPS, as a new RNA detection system, holds strong potential not only for sensing RNA but also for sequestering it into a protein-scaffold condensate. This feature opens the door for a wide range of additional applications, since the RNA becomes localized within the scaffold. This became clear to us during numerous discussions with experts and scientists throughout the development of TRAPS. We realized that we had not just designed an RNA detection method but also created an RNA sequestration platform that could be applied in several different ways. If you want to know more, read the sections Future Prospects and Integrated Human Practices.

The aggregation observed in many of our experiments was caused by design errors in the Cas13 or Pumby coding cassettes, or more generally, by the choice of the particular domain versions we used, which only became apparent during testing. We are confident that this aggregation issue is solvable, and we expect it to be resolved in the coming months, allowing us to continue testing the system in multiple directions.

As you have read, we researched, adapted, and learned throughout every step of this project. On the scientific level, each of us has acquired new knowledge and skills. But by the end of this project, and this large rollup, we realized it was not just about the science. For all of us, it was the first time carrying out such a large project with full self-responsibility. We experienced everything that comes with it: from fundraising and bureaucracy to PR and outreach events. This experience has made us more prepared than ever to advance in our scientific careers.

Most importantly, we grew not only as a team, moving from initial brainstorming to a concrete idea and testing a new tool in the toolbox of biology, but also as individuals, evolving from loose connections into real friendships. We didn’t just meet in the lab, but also outside of it, celebrating together and sharing good times.

The biggest lesson we take from this project is that science is not just about science, but also about enjoying the process and the fellow scientists around you!

References

  • Adamala, K. P., Martin-Alarcon, D. A., & Boyden, E. S. (2016). Programmable RNA-binding protein composed of repeats of a single modular unit. Proceedings of the National Academy of Sciences, 113(19). https://doi.org/10.1073/pnas.1519368113
  • Alberti, S., & Hyman, A. A. (2021). Biomolecular condensates at the nexus of cellular stress, protein aggregation disease and ageing. Nature Reviews Molecular Cell Biology, 22(3), 196–213. https://doi.org/10.1038/s41580-020-00326-6
  • Banta, L. M., Robinson, J. S., Klionsky, D. J., & Emr, S. D. (1988). Organelle assembly in yeast: Characterization of yeast mutants defective in vacuolar biogenesis and protein sorting. The Journal of Cell Biology, 107(4), 1369–1383. https://doi.org/10.1083/jcb.107.4.1369
  • Burke, D. J., Dawson, D. S., & Stearns, T. (with Cold Spring Harbor laboratory). (2000). Methods in yeast genetics: A Cold Spring Harbor laboratory course manual (2000 ed). Cold Spring Harbor: Cold Spring Harbor Laboratory Press.
  • Han, S., Zhao, B. S., Myers, S. A., Carr, S. A., He, C., & Ting, A. Y. (2020). RNA–protein interaction mapping via MS2- or Cas13-based APEX targeting. Proceedings of the National Academy of Sciences, 117(36), 22068–22079. https://doi.org/10.1073/pnas.2006617117
  • Hartley, J. L. (2000). DNA Cloning Using In Vitro Site-Specific Recombination. Genome Research, 10(11), 1788–1795. https://doi.org/10.1101/gr.143000
  • Heidenreich, M., Georgeson, J. M., Locatelli, E., Rovigatti, L., Nandi, S. K., Steinberg, A., Nadav, Y., Shimoni, E., Safran, S. A., Doye, J. P. K., & Levy, E. D. (2020). Designer protein assemblies with tunable phase diagrams in living cells. Nature Chemical Biology, 16(9), 939–945. https://doi.org/10.1038/s41589-020-0576-z
  • Hovland, P., Flick, J., Johnston, M., & Sclafani, R. A. (1989). Galactose as a gratuitous inducer of GAL gene expression in yeasts growing on glucose. Gene, 83(1), 57–64. https://doi.org/10.1016/0378-1119(89)90403-4
  • Li, P., Banjade, S., Cheng, H.-C., Kim, S., Chen, B., Guo, L., Llaguno, M., Hollingsworth, J. V., King, D. S., Banani, S. F., Russo, P. S., Jiang, Q.-X., Nixon, B. T., & Rosen, M. K. (2012a). Phase transitions in the assembly of multivalent signalling proteins. Nature, 483(7389), 336–340. https://doi.org/10.1038/nature10879
  • Li, P., Banjade, S., Cheng, H.-C., Kim, S., Chen, B., Guo, L., Llaguno, M., Hollingsworth, J. V., King, D. S., Banani, S. F., Russo, P. S., Jiang, Q.-X., Nixon, B. T., & Rosen, M. K. (2012b). Phase transitions in the assembly of multivalent signalling proteins. Nature, 483(7389), 336–340. https://doi.org/10.1038/nature10879
  • Peng, B., Williams, T. C., Henry, M., Nielsen, L. K., & Vickers, C. E. (2015). Controlling heterologous gene expression in yeast cell factories on different carbon substrates and across the diauxic shift: A comparison of yeast promoter activities. Microbial Cell Factories, 14(1), 91. https://doi.org/10.1186/s12934-015-0278-5
  • Taylor, N. O., Wei, M.-T., Stone, H. A., & Brangwynne, C. P. (2019). Quantifying Dynamics in Phase-Separated Condensates Using Fluorescence Recovery after Photobleaching. Biophysical Journal, 117(7), 1285–1300. https://doi.org/10.1016/j.bpj.2019.08.030
  • Utomo, J. C., Hodgins, C. L., & Ro, D.-K. (2021). Multiplex Genome Editing in Yeast by CRISPR/Cas9 – A Potent and Agile Tool to Reconstruct Complex Metabolic Pathways. Frontiers in Plant Science, 12, 719148. https://doi.org/10.3389/fpls.2021.719148
  • Xu, C., Zhou, Y., Xiao, Q., He, B., Geng, G., Wang, Z., Cao, B., Dong, X., Bai, W., Wang, Y., Wang, X., Zhou, D., Yuan, T., Huo, X., Lai, J., & Yang, H. (2021). Programmable RNA editing with compact CRISPR–Cas13 systems from uncultivated microbes. Nature Methods, 18(5), 499–506. https://doi.org/10.1038/s41592-021-01124-4
  • Zhang, Y., Ge, H., & Marchisio, M. A. (2022). A Mutated Nme1Cas9 Is a Functional Alternative RNase to Both LwaCas13a and RfxCas13d in the Yeast S. cerevisiae. Frontiers in Bioengineering and Biotechnology, 10, 922949. https://doi.org/10.3389/fbioe.2022.922949