UCSC - iGEM 2025

Cycle 1 & 2 full PDF

Cycle 1 & 2 — Full Document (PDF)

Cycle 3 5 iterations + citations

Iteration 1

Design

GOAL: Design a double-stranded gene block and establish a streamlined digestion protocol that will allow us to measure the digestion capabilities and processivity of Lambda exonuclease.

Design:

Lambda exonuclease strongly prefers 5′-phosphorylated ends. It digests one strand of double-stranded DNA (dsDNA) in the 5′→3′ direction, leaving the other strand intact. We will employ Lambda exonuclease (exo) to digest our gene blocks (gBlocks).

In order to evaluate the digestion capabilities and processivity of Lambda exo, we designed a dsDNA construct, TriCycleV1 (T), containing putative pause sequences. The initial design of TriCycleV1 was 830 bp in length and contained four 5′-GGGGATTCT-3′ pause sequences separated by 9-nt nonsense spacers [1]. This “rumble zone” flanks a 210-bp region intended to remain undigested to “hold” the undigested aptamer arms (Fig. 1). We also designed a control sequence with no pause sequences, NonStop (N), to better characterize the influence of the pause sequences on digestion. The greatest challenges we faced were determining how to pause the exonuclease and how to measure where in the sequence it stopped digesting.

There are two sets of primer binding sequences encoded within both of the experimental gBlocks. The T7 forward and T7 reverse primer binding sites are located at the 3′ ends to enable PCR amplification of the entire gene block and Sanger sequencing (Fig. 1). The M13 forward and M13 reverse primer binding sites are positioned downstream of their T7 counterparts. When used for Sanger sequencing, the M13 primers provide a second measure of digestion efficiency and of where the exonuclease halted digestion. The primer sequences were sourced from the University of California, Berkeley DNA Sequencing Facility stock primer sequences [2].

To amplify the gBlock, we use Touchdown PCR (TD-PCR) to avoid potential hairpin formation on one of our primer sites (Fig. 2). We use OneTaq DNA polymerase because it does not exhibit exonuclease activity. The aforementioned primers and primer binding sites are employed during Sanger sequencing as performed by the University of California, Berkeley DNA Sequencing Facilities.

**Fig. 1.** TriCycleV1 Gene Block.

Fig. 2 Hairpin structure of T7 reverse primer — **Fig. 2.** The hairpin structure of the reverse primer T7 from the gene block TriCycleV1 at 51 °C. Visualized using UNAfold **[3]**.

Build

For the purpose of experimenting with the “rumble zone”, we ordered our TriCycleV1 and NonStop sequences as gBlocks from Integrated DNA Technologies (IDT) [4]. We similarly ordered T7 forward and T7 reverse primers from IDT. We amplified the gBlocks using Touchdown PCR with 2X OneTaq Master Mix. We started the annealing temperature at 64 °C and decreased it by 1 °C per cycle for 14 cycles to reach a final annealing temperature of 50 °C.
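As a minimal sketch of the touchdown schedule described above, the annealing temperature for each cycle can be listed with a short Python helper. The function name and the number of cycles held at the final temperature (`hold_cycles`) are illustrative assumptions; only the 64 °C start, the 1 °C step, and the 14 touchdown cycles come from the protocol.

```python
def touchdown_schedule(start_c, step_c, touchdown_cycles, hold_cycles):
    """Return the annealing temperature (in °C) used in each PCR cycle."""
    # Touchdown phase: one cycle at each temperature, stepping down from start_c.
    ramp = [start_c - i * step_c for i in range(touchdown_cycles)]
    # Hold phase: remaining cycles at the final annealing temperature.
    final_c = start_c - touchdown_cycles * step_c
    return ramp + [final_c] * hold_cycles


# hold_cycles=20 is an assumed value used only for illustration.
schedule = touchdown_schedule(start_c=64.0, step_c=1.0,
                              touchdown_cycles=14, hold_cycles=20)
print(schedule[:3], "...", schedule[13:16])
```

Listing the schedule this way makes it easy to confirm that the last touchdown cycle runs at 51 °C and every later cycle at 50 °C.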

After sequencing, raw .ab1 files were assembled using Geneious Prime’s De Novo assembler for each sample, combining the four reads, one per primer, to generate assembled contigs [5]. These assembled contigs were mapped back to their respective template sequence to assess digestion. The mapped contigs were subsequently used to visualize the results of our experiments.

Test

We amplified our gBlocks using 2X OneTaq Master Mix in a 50 µL reaction format with 1 µL of our 10 ng/µL template DNA, 10 µM forward, and 10 µM reverse primer. We verified the amplicons on a 1% agarose gel with SYBR Safe to ensure our primers were viable. Then, we digested approximately 200 ng of each experimental gBlock at 37 °C with Lambda exonuclease following the manufacturer specifications for reaction conditions [6]. We then cleaned up the DNA using the “Monarch Spin PCR & DNA Cleanup Kit (5 µg)” following manufacturer instructions [7]. We used a NanoDrop spectrophotometer to verify the concentration and purity of our DNA.

We tested different methods of DNA isolation and exonuclease inactivation: EDTA, chloroform extraction, ethanol precipitation, guanidinium isothiocyanate (GITC) extraction, and ammonium acetate extraction with both glass beads and a silica column. Protocols can be referenced on our protocol page.

The digestion products were divided into four 10 µL aliquots, each sequenced using a different primer (M13 forward, M13 reverse, T7 forward, T7 reverse). We then sent the aliquots to UC Berkeley for Sanger sequencing.

Learn

We confirmed that our gBlocks were the correct size and our primers had successfully bound and amplified our target gene blocks (Fig. 3). The resulting amplicons were at the expected lengths—approximately 830 bp for TriCycleV1 and 786 bp for NonStop, respectively.

For each inactivation method tested, the DNA yield after purification was consistently poor. However, chloroform inactivation produced the highest yield compared to the other methods tested. Therefore, we planned to perform all subsequent digestions with chloroform inactivation.

Following the first round of Sanger sequencing, we observed that the gBlocks were hardly being digested by Lambda exonuclease. We assessed digestion by analyzing peak height on the chromatograms after De Novo assembly. A sharp drop in peak height can indicate lower confidence due to a lower quantity of fragments at that position. Based on the consensus results, we estimated that only 20 base pairs were digested in 15 minutes, which was significantly slower than expected (Fig. 4). Digestion of the TriCycleV1 and NonStop gBlocks was indistinguishable.

A second round of Sanger sequencing was performed on samples that were subjected to longer digestion times, but unfortunately, the results were similarly inconclusive. The reads were not interpretable, with some extending to approximately 2000 base pairs, more than twice the length of the gene block (Fig. 5). The amount of digestion seen on the 5′ ends was not consistent between samples.

The low read quality was likely due to low concentrations of DNA within the digested samples. In addition to the low DNA quantity, we had no means to verify that each aliquot contained equivalent concentrations and a proportional number of digested products. This issue is corroborated by multiple reads from one sample lacking consistent digestion from primers on the same strand.

Further analysis revealed that the gene blocks were not ordered with 5′-phosphorylated ends. The lack of phosphorylation significantly slows down and impairs the ability of Lambda exo to bind and digest the DNA [8]. Thus, the absence of these modifications may explain the limited digestion we observed. Therefore, in our next experiments we will test Lambda exo’s digestion of 5′-phosphorylated DNA substrates.

Fig. 3 Gel electrophoresis results — **Fig. 3.** Gel electrophoresis results run on a 1% agarose gel with SYBR Safe. Lane 1 contains 1 kb Plus DNA Ladder from NEB. The second and third lanes were loaded with 6X Purple Loading Dye from NEB and contain the NonStop and TriCycleV1 samples, respectively.

Fig. 4 Sanger sequencing results 15 min digestions — **Fig. 4.** First round of Sanger sequencing results for 15 minute digestions of NonStopV1 (top) and TriCycleV1 (bottom). N1, N2, N3, N4 reads were read using M13 forward, M13 reverse, T7 forward, T7 reverse respectively. T1, T2, T3, T4 reads were read using M13 forward, M13 reverse, T7 forward, T7 reverse respectively. Visualized using Geneious Prime **[5]**.

Fig. 5 Sanger sequencing results 30 min digestions — **Fig. 5.** Second round of Sanger sequencing results for 30 minute digestions of NonStopV1 (top) and TriCycleV1 (bottom). N1, N2, N3, N4 reads were read using M13 forward, M13 reverse, T7 forward, T7 reverse respectively. T1, T2, T3 reads were read using M13 forward, M13 reverse, T7 forward respectively. Visualized using Geneious Prime **[5]**.

Iteration 2

Design

GOAL: Optimize our digestion protocol to integrate 5′-phosphorylated ends on our gene blocks.

Design:

Our second iteration of testing began with designing a new gene block, TriCycleV2 (TV2), which contained a modified pause sequence lacking the run of four consecutive guanines that could promote G-quadruplex (G4) formation. We hypothesized that without this G-run, the polymerase could more effectively replicate our sequence during Sanger sequencing and PCR. The new gBlock was 838 base pairs long and still contained four copies of the putative pause sequence, now 5′-GGCGGATTCT-3′. The new control sequence, NonStopV2 (NV2), was shorter at 764 base pairs. Both of the new gBlocks were checked for G4s within their sequences.
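One way such a check could be scripted is sketched below in Python. It is not necessarily the tool the team used. It assumes two heuristics: flagging any run of four or more guanines, and matching the commonly used G4 motif pattern of four G-tracts (three or more Gs each) separated by loops of 1 to 7 nt. The function names are illustrative.

```python
import re

G_RUN = re.compile(r"G{4,}")                            # four or more consecutive Gs
G4_MOTIF = re.compile(r"(?:G{3,}[ACGT]{1,7}){3}G{3,}")  # G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def scan_for_g4(seq):
    """Report 0-based start positions of G-runs and G4-like motifs on both strands.

    Positions for the bottom strand are relative to the reverse-complemented sequence.
    """
    seq = seq.upper()
    report = {}
    for strand, s in (("top", seq), ("bottom", reverse_complement(seq))):
        report[strand] = {
            "g_runs": [m.start() for m in G_RUN.finditer(s)],
            "g4_motifs": [m.start() for m in G4_MOTIF.finditer(s)],
        }
    return report


print(scan_for_g4("GGGGATTCT"))   # V1 pause sequence
print(scan_for_g4("GGCGGATTCT"))  # V2 pause sequence
```

Scanning the reverse complement as well catches C-rich stretches on the top strand, which correspond to G-rich stretches on the bottom strand.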

We switched to using Q5 High-Fidelity DNA Polymerase rather than OneTaq DNA Polymerase in our TD-PCR reactions to amplify our gene blocks. This change was a response to the previously low levels of amplification observed when using OneTaq. Although Q5 polymerase exhibits exonuclease activity, it has much higher fidelity. We decided to further test ethanol precipitation as a means of DNA purification after heat inactivation due to the low concentration of DNA and high salt content present in our samples after cleanup. Additionally, we prepared a custom membrane binding buffer and membrane wash buffer to improve DNA yield and minimize contamination during silica column purification.

Build

To facilitate Lambda exonuclease binding and digestion, the primers and gene blocks were ordered from IDT with 5′-phosphorylated ends. TriCycleV2 contained the new stop sequences, but otherwise both gene blocks emulated the original design (Fig. 1). We began to exclusively employ heat inactivation, incubating the digestion reaction at 80 °C for ten minutes, as we believed it was sufficient to inactivate the exonuclease. This approach allowed us to proceed immediately to silica column cleanup and therefore maximize DNA yield.

The TD-PCR protocol was changed to adapt to the Q5 polymerase and allow for more specific annealing. The protocol begins the first annealing cycle at 3 °C above the annealing temperature of our primers and lowers it by 0.5 °C each cycle for 10 cycles, whereas in the previous iteration we lowered the temperature by 1.0 °C each cycle. After 10 cycles, this brings us to 2 °C below the annealing temperature, which is then held for the remaining 20 cycles.

Test

The gBlocks were amplified and measured on the NanoDrop spectrophotometer to determine their purity and concentration prior to being digested by Lambda exo in similar fashion to iteration 1.

For this iteration of testing, we digested for 30 seconds, 45 seconds, 1 minute, 1 minute 15 seconds, and 1 minute 30 seconds. Since the gBlocks were phosphorylated, we needed to know if 15 minutes was too long, because the theoretical rate of digestion should be close to 12 bp per second, which is significantly faster than when the substrate is unphosphorylated [1]. Each reaction contained approximately 1000 ng of template DNA and was run in a 50 µL reaction format following NEB’s Lambda exonuclease reaction mixture specifications [6].
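To illustrate why second-scale time points were chosen, the digestion expected at the cited rate can be worked out directly. This sketch assumes an idealized constant rate of 12 bp per second from a single 5′ end, with no pausing and no binding delay; the function name is illustrative.

```python
RATE_BP_PER_S = 12.0     # theoretical rate cited above [1]
GBLOCK_LENGTH_BP = 838   # TriCycleV2 length


def expected_digested_bp(seconds, rate=RATE_BP_PER_S, length=GBLOCK_LENGTH_BP):
    """Idealized number of base pairs digested after `seconds`, capped at full length."""
    return min(rate * seconds, length)


for t in (30, 45, 60, 75, 90, 15 * 60):
    print(f"{t:>4} s -> {expected_digested_bp(t):.0f} bp")

print(f"Time to traverse the full strand: {GBLOCK_LENGTH_BP / RATE_BP_PER_S:.0f} s")
```

Under these assumptions a full strand would be traversed in a little over a minute, so a 15 minute digestion would far exceed the time needed for a phosphorylated substrate.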

We tested several silica columns to see which would yield the most concentrated DNA with the least amount of contamination. We also tested ethanol precipitation as an alternative method of DNA purification.

Learn

When we performed digestions on 5′-phosphorylated gene blocks, shorter digestion times did not yield significant results. Specifically, shorter reaction times like 45 seconds produced higher ending concentrations of DNA after digestion and silica column cleanup. Despite yielding higher DNA concentrations and lower salt concentrations, these digestion times were too short for the exonuclease to reach the pause sequences. The undigested gBlock chromatograms looked almost identical to those that were digested; thus, our results were inconclusive (Fig. 6). The lack of evidence for digestion led us to believe that the digestion times were either too short for the exonuclease to bind and digest the DNA, or there was an inconsistent amount of starting DNA in each aliquot.

Fig. 6 Sanger sequencing results for 45 second digestions — **Fig. 6.** Sanger sequencing results for 45 second digestions of NonStopV2 (top) and TriCycleV2 (bottom). N1, N2, N3, N4 reads were read using M13 forward, M13 reverse, T7 forward, T7 reverse respectively. T1, T2, T3, T4 reads were read using M13 forward, M13 reverse, T7 forward, T7 reverse respectively. Visualized using Geneious Prime **[5]**.

Test 2: For the next round of Sanger sequencing, we repeated digestions using longer incubation times. Digestions were performed at 37 °C for 2, 3, 5, and 7 minutes for both TriCycleV2 and NonStopV2. Since our previous experiments with digestion times below 1 minute and 30 seconds showed limited digestion, testing longer times would give us a broader range of results to measure rate of digestion and better understand Lambda’s processivity.

Learn 2: As expected, the 2 minute digestion produced higher concentrations of DNA after cleanup compared to 3, 5, and 7 minutes. The Sanger results were still better than those for the V1 gBlocks. In particular, the 7 minute digestion of TriCycleV2 showed cohesive digestion for ~62 base pairs off the forward primers and ~17 base pairs for reverse primers (Fig. 7). The corresponding NonStopV2 sample showed digestion for ~82 base pairs off the forward primers and ~55 base pairs for the reverse primers. The M13 forward and reverse primers for NonStopV2 showed a noncoding region of ~100 base pairs past the DNA sequence. We learned that longer Lambda exo digestion correlates with lower recovered DNA concentration, further substantiating the time dependence of digestion.
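The digested lengths reported above can be converted into apparent rates and compared with the theoretical 12 bp per second. This is a rough sketch that assumes the lengths read from the chromatograms reflect average digestion over the full 7 minutes; the function and dictionary names are illustrative.

```python
THEORETICAL_RATE = 12.0  # bp per second, as cited from [1]


def apparent_rate(bp_digested, minutes):
    """Average digestion rate in bp per second over the whole incubation."""
    return bp_digested / (minutes * 60.0)


# Approximate digested lengths from the 7 minute digestions (Fig. 7).
observed_bp = {
    "TriCycleV2 forward": 62,
    "TriCycleV2 reverse": 17,
    "NonStopV2 forward": 82,
    "NonStopV2 reverse": 55,
}

for label, bp in observed_bp.items():
    rate = apparent_rate(bp, 7)
    print(f"{label}: {rate:.2f} bp/s, about {THEORETICAL_RATE / rate:.0f}x below theoretical")
```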

Fig. 7 Sanger sequencing results for 7 minute digestions — **Fig. 7.** Sanger sequencing results for 7 minute digestions of NonStopV2 (top) and TriCycleV2 (bottom). N13, N14, N15, N16 reads were read using M13 forward, M13 reverse, T7 forward, T7 reverse respectively. T13, T14, T15, T16 reads were read using M13 forward, M13 reverse, T7 forward, T7 reverse respectively. Visualized using Geneious Prime **[5]**.

Iteration 3

Design

GOAL: Analyze Lambda exonuclease’s processivity using 2% Agarose E-Gel electrophoresis methods.

Design:

Following our consistently inconclusive Sanger sequencing results, we needed to change how we were analyzing and visualizing Lambda processivity. Running our digested products on an E-Gel would tell us if our products were digested based on the intensity and distance traveled by the bands. We plan to run lanes side by side containing TriCycleV2 and NonStopV2 digested for the same amount of time. The intensity of a band reflects the concentration of DNA present in that well; therefore, we expect longer digestion times to produce lower intensity. We would expect the undigested template DNA to have the highest intensity band and ideally to be the longest as well. There should be a pattern of decreasing size and intensity as digestion time increases. We prepared a 10 minute digestion of both gBlocks to see whether partial digestion is visible compared to our longer digestions, where entire gBlock strands are likely fully digested.

Build

We employed Lambda exonuclease to digest the gBlocks for three different times: 1 hour, 2 hours, and 4 hours. We chose times significantly longer than recommended to give Lambda ample time to digest the gBlocks and, we hoped, to show a stark contrast in intensity. This would confirm that Lambda exonuclease is functional and help us measure overall processivity.

Test

We ran our digested samples on an Invitrogen E-Gel SizeSelect II, 2% Agarose gel [9] with SYBR Gold II Gel Stain [10]. Each lane contained approximately 100 ng of DNA from digested samples diluted in 22.5 µL nuclease-free water and 2.5 µL of 10X ThermoFisher E-Gel loading dye [11]. The six total digestions were run on the E-Gel, with 100 ng of undigested TriCycleV2 gBlock as a reference. The gel was run for 12 minutes.

Learn

When we ran the E-Gel, the 4 hour digested NonStop sample displayed the faintest band, supporting the conclusion that longer digestion times do result in greater digestion and therefore Lambda is functional (Fig. 8). Overall, the NonStop lanes consistently showed fainter bands compared to their counterpart TriCycle digestions at each digestion time. The undigested TriCycleV2 gBlock produced the brightest band, and the 1 hour TriCycleV2 digestion had the second brightest band. All of the digested bands migrated farther down the gel than the undigested template, and all were less intense. These results indicate that less DNA is present at longer digestion times. However, it is still unclear whether Lambda exonuclease is completely digesting some strands and leaving others intact rather than partially digesting most strands.

Fig. 8 E-Gel Size Select II, 2% agarose gel results — **Fig. 8.** E-Gel Size Select II, 2% agarose gel results. From left to right (1 to 7) the lanes are as follows: template TriCycleV2, 4 hour N, 4 hour T, 2 hour N, 2 hour T, 1 hour T, and 1 hour N.

Test 2: The 4 hour, 2 hour, and 1 hour digestion samples were aliquoted and sent to be Sanger sequenced. We also sent a 10 minute digestion of both gBlocks.

Learn 2: The Sanger sequencing results were consistent with the results from previous iterations: excessively long, low quality reads, with evidence of partial or no digestion for various primers. This led us to believe that some sequences were partially digested or not digested at all. For the 10 minute digestion, the results showed that approximately 50 base pairs were digested (Fig. 9). However, the 1 hour digestion looks nearly identical to the 10 minute digestion, despite the significantly longer digestion time (Fig. 9, 10). Because Sanger sequencing resolution is limited at the ends of reads, we cannot conclusively determine whether the gBlocks were actually digested or whether this reflects the method’s low resolution at the 5′ ends.

These observations contributed to our hypothesis that the amount of Lambda exonuclease and reaction buffer may be insufficient for the 2000 ng of DNA added to each reaction. Therefore, we decided to increase both the enzyme and buffer volumes by five times (5×) in subsequent digestions.

Fig. 9 Sanger results for 10 minute digestions — **Fig. 9.** Sanger sequencing results for 10 minute digestions of NonStopV2 (top) and TriCycleV2 (bottom). N1, N2, N3, N4 reads were read using M13 forward, M13 reverse, T7 forward, T7 reverse respectively. T1, T2, T3, T4 reads were read using M13 forward, M13 reverse, T7 forward, T7 reverse respectively. Visualized using Geneious Prime **[5]**.

Fig. 10 Sanger results for 5× reaction mixtures at 1 hour — **Fig. 10.** Sanger sequencing results for 1 hour NonStopV2 (top) and 1 hour TriCycleV2 (bottom). N5, N6, N7, N8 reads were read using M13 forward, M13 reverse, T7 forward, T7 reverse respectively. T5, T6, T7, T8 reads were read using M13 forward, M13 reverse, T7 forward, T7 reverse respectively. Visualized using Geneious Prime **[5]**.

Iteration 4

Design

GOAL: Digest gBlocks using five times the amount of Lambda exonuclease and 10X Lambda reaction buffer.

Design:

Following the E-Gel and Sanger sequencing, we changed how we visualized digestion amounts. We thought the amount of Lambda was not proportional to the amount of DNA we were digesting; therefore, by increasing the amount of Lambda exonuclease and Lambda exonuclease buffer fivefold (5×), we expected to see more significant digestion.

Build

Both gBlocks were digested using 5× Lambda exonuclease and 5× Lambda reaction buffer to bring the amount of enzyme into proportion with the amount of DNA and, we hoped, to improve Lambda’s processivity. Each digestion contained 2000 ng of DNA and ran for either 10 minutes or 1 hour. These times were chosen to once again see whether long or short digestion times affect how Lambda digests, and so that the results can be compared with previously sequenced 1× reactions.

Test

We digested the gBlocks for 10 minutes and 1 hour at standard reaction conditions, aside from the increased exonuclease concentration. The total reaction mixture for the 10 minute digestions of both gBlocks increased to 56 µL total for NonStop and 57 µL total for TriCycle. The total reaction mixture for the 1 hour digestions of both of our gBlocks increased to 74 µL total.

Learn

The 1 hour digestion had significantly lower DNA concentration and purity when compared to the 10 minute digestion, which is consistent with our previous results (Fig. 11). Although NanoDrop measurements are not always highly precise, these results revealed an interesting overall trend in the relationship between digestion time and both sample purity and yield.

As mentioned, DNA concentration measured by the NanoDrop for the 1 hour digestion was significantly lower when compared to that of the 10 minute digestion, suggesting that much of the starting material had been degraded. As visualized with Sanger sequencing, the remaining DNA appeared to consist of intact, full-length sequences with little to no distinguishability between the TriCycle and NonStop samples (Fig. 12, 13). Therefore, increasing the amount of Lambda exo and buffer did not lead to significant changes in processivity.

These findings supported our overall hypothesis that Lambda may be digesting entire gBlock strands rather than digesting partial strands. Additionally, we hypothesize that the quantity of DNA being digested is insufficient to produce a noticeable shift in chromatogram peak height. It is counterintuitive that we see such contrast in concentration and purity of DNA in the NanoDrop results, yet we see no prominent shifts in chromatograms.

**Fig. 11.** Post digestion and column purification analysis on the NanoDrop for 1 hour NonStop and 1 hour TriCycle (left), 10 minute NonStop and 10 minute TriCycle (right). For each, NonStop is sample 1, TriCycle is sample 2.

Fig. 12 Sanger results for 10 minute digestions — **Fig. 12.** Sanger sequencing results for ten minutes NonStopV2 (top) and ten minutes TriCycleV2 (bottom). N1, N2, N3, N4 reads were read using M13 forward, M13 reverse, T7 forward, T7 reverse respectively. T1, T2, T3, T4 reads were read using M13 forward, M13 reverse, T7 forward, T7 reverse respectively. Visualized using Geneious Prime **[5]**.

Fig. 13 Sanger results for 1 hour digestions — **Fig. 13.** Sanger sequencing results for one hour NonStopV2 (top) and one hour TriCycleV2 (bottom). N5, N6, N7, N8 reads were read using M13 forward, M13 reverse, T7 forward, T7 reverse respectively. T5, T6, T7, T8 reads were read using M13 forward, M13 reverse, T7 forward, T7 reverse respectively. Visualized using Geneious Prime **[5]**.

Iteration 5

Design

GOAL: Assay our digestion samples using loading dye with a denaturing agent on a 1% agarose gel and a 2% agarose E-Gel.

To investigate how the exonuclease behaves on the individual strands of the gBlocks, we aimed to denature our digested samples. This required running the products on a denaturing gel or using a denaturing loading dye. We chose 2X RNA Loading Dye containing formamide [12]. The formamide denatures the DNA, allowing us to observe any differences in migration distance and intensity of bands, which could indicate variability of Lambda exo binding and processing of each strand.

Build

We ran digestions for 4 hours, 2 hours, and 1 hour, reverting back to standard protocol amounts (1×) for Lambda exonuclease and Lambda exonuclease buffer. We used a 10 minute inactivation step at 80 °C.

We employed both an E-Gel and traditional gel electrophoresis to analyze our digestion samples. The SizeSelect II E-Gel contains SYBR Gold stain, a highly sensitive dye capable of detecting very low concentrations of nucleic acids. The E-Gel also allows for easy extraction; therefore bands of interest can be readily selected and used for subsequent experimentation, such as Sanger sequencing. In contrast, the traditional 1% agarose gel is stained with SYBR Safe to compare band separation and fluorescence. Both gels were tested to determine which method produced more interpretable results.

Test

We used the RNA loading dye on an Invitrogen E-Gel SizeSelect II, 2% agarose gel. We loaded the gel with our digested samples and an undigested TriCycleV2 gBlock to use as a reference. Each well contained approximately 20–30 ng of DNA diluted in 12.5 µL of nuclease-free H₂O with 12.5 µL of 2X RNA Loading Dye for a total 25 µL sample. We ran the gel for 12 minutes in an E-Gel Power Snap electrophoresis chamber.

We then ran the same protocol using the formamide loading dye on a 1% agarose gel. Each sample contained approximately 100 ng of DNA diluted in nuclease-free water with 10 µL 2X RNA Loading Dye for a total 20 µL sample. We ran the gel for approximately 45 minutes in a 0.5× TBE buffer.
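The loading volumes for either gel follow from the stock concentration measured on the NanoDrop. The sketch below assumes the 2X dye always makes up half of the final volume and that water fills the remainder; the stock concentration of 50 ng/µL is a hypothetical value, and the function name is illustrative.

```python
def loading_mix(stock_ng_per_ul, target_ng, total_ul):
    """Volumes (µL) of DNA, water, and 2X dye for one gel lane."""
    dye_ul = total_ul / 2.0                  # 2X dye diluted to 1X
    dna_ul = target_ng / stock_ng_per_ul     # volume that carries the target mass
    water_ul = total_ul - dye_ul - dna_ul    # nuclease-free water fills the rest
    if water_ul < 0:
        raise ValueError("Stock is too dilute to fit the target mass in this volume")
    return {"dna_ul": dna_ul, "water_ul": water_ul, "dye_ul": dye_ul}


# Hypothetical 50 ng/µL stock, 100 ng target, 20 µL sample (1% agarose gel format).
mix = loading_mix(stock_ng_per_ul=50.0, target_ng=100.0, total_ul=20.0)
print(mix)
```

The error case matters for heavily digested samples, where low recovered concentrations may not allow the target mass to fit in a single well.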

Learn

The E-Gel exhibited decent resolution with individual bands distinguishable from the general lane smear (Fig. 14). There was no visible band in the template lane, likely due to the template DNA being left out from that sample, which made direct size comparison more challenging. However, each visible band was relatively similar in size, as expected. The four-hour digestions run on the E-Gel produced fewer and fainter bands, suggesting more complete digestion of the DNA when compared to the shorter digestion times. Bands seen for the shorter digestions seemed to migrate less far down the gel as well.

The 1% agarose gel showed several bands for each sample, but there was less correlation between the brightness of the bands and digestion time. The presence of multiple bands of varying sizes in each sample indicated that digestion occurred. In addition, the most intense bands for each sample increase in size as digestion time decreases, corroborating our findings from the previous E-Gel.

In future experiments, we would like to test the addition of formamide to a higher-percentage agarose gel to improve band resolution and intensity. We would also like to design smaller testing sequences to run on such a gel to more accurately analyze subtle differences in digestion.

Fig. 14 Comparison of 2% E-Gel and 1% agarose gel results with denaturing RNA loading dye — **Fig. 14.** E-Gel using 2X RNA Gel Loading dye (left), from left to right the lanes contain: Template TriCycleV2, 4 hour N, 4 hour T, 2 hour N, 2 hour T, 1 hour N, 1 hour T. The 1% agarose gel using RNA loading dye (right), from left to right these wells contain: Template TriCycleV2, 4 hour T, 4 hour N, 2 hour T, 2 hour N, 1 hour T, 1 hour N, and the remainder of the wells were loaded with 2X RNA Gel Loading dye.

Citations

References

[1] T. T. Perkins, R. V. Dalal, P. G. Mitsis, and S. M. Block, “Sequence-Dependent Pausing of Single Lambda Exonuclease Molecules,” Science, vol. 301, no. 5641, pp. 1914–1918, Sep. 2003, doi: 10.1126/science.1088047.
[2] UC Berkeley DNA Sequencing Facility, “Free stock primers.” Accessed: Oct. 7, 2025. Available: https://ucberkeleydnasequencing.com/stock-primers-free-1.
[3] N. R. Markham, L. S. Zuker, and M. Zuker, “UNAFold.” Accessed: Jul. 21, 2025. Available: https://www.unafold.org/.
[4] “gBlocks & gBlocks HiFi Gene Fragments | IDT,” Integrated DNA Technologies. Accessed: Jul. 23, 2025. Available: https://www.idtdna.com/pages/products/genes-and-gene-fragments/double-stranded-dna-fragments/gblocks-gene-fragments.
[5] Geneious Prime 2025.0. Available: https://www.geneious.com.
[6] New England Biolabs, “NEB #M0262.” Accessed: Jul. 12, 2025. Available: https://www.neb.com/en-us/protocols/2019/07/24/protocol-for-lambda-exonuclease-m0262.
[7] New England Biolabs, “NEB #T1130.” Accessed: Jul. 23, 2025. Available: https://www.neb.com/en-us/protocols/2024/07/16/standard-cleanup-protocol-using-the-monarch-spin-pcr-and-dna-cleanup-kit-and-centrifugation.
[8] K. Subramanian, W. Rutvisuttinunt, W. Scott, and R. S. Myers, “The enzymatic basis of processivity in λ exonuclease,” Nucleic Acids Research, vol. 31, no. 6, pp. 1585–1596, Mar. 2003, doi: 10.1093/nar/gkg266.
[9] “E-Gel™ SizeSelect™ II Agarose Gels, 2% 10 gels | Buy Online | Invitrogen™.” Accessed: Aug. 25, 2025. Available: https://www.thermofisher.com/order/catalog/product/G661012.
[10] “SYBR™ Gold Nucleic Acid Gel Stain (10,000X Concentrate in DMSO) 500 μL | Buy Online.” Accessed: Oct. 7, 2025. Available: https://www.thermofisher.com/order/catalog/product/S11494.
[11] “E-Gel™ EX Agarose Gels 10 Gels/Pk | Buy Online | Invitrogen™.” Accessed: Oct. 7, 2025. Available: https://www.thermofisher.com/order/catalog/product/G401001.
[12] “RNA Loading Dye (2X).” Accessed: Oct. 7, 2025. Available: https://www.neb.com/en-us/products/b0363-rna-loading-dye-2x.

Cycle 4 template — iterations + citations

Overview

Project Summary

Using the pUC19 plasmid as a cloning vector, we aimed to design a Golden Gate compatible insert that would generate a final assembled "safeTEA" plasmid.

This final plasmid design, when in the presence of lactose, would be de-repressed to form a pUC19 backbone, which functions as an interior double-stranded segment between two aptamer arms capable of selectively binding to target molecules in aqueous solutions.

Our insert was modeled to be under the control of the LacZ promoter so that lactose regulation can occur. Because native pUC19 does not carry an active LacI gene, one would need to be added to the system so that LacI can bind the lac operator, preventing spontaneous derepression and unintended maturation of the backbone-aptamer arm system.

When lactose is introduced to this designed system, it binds LacI (as its isomer allolactose), changing the repressor’s conformation and inhibiting its ability to bind the lac operator. This then allows our restriction enzyme to be expressed and cleave the double-stranded DNA located between the two aptamer components of the insert, and subsequently allows the lambda exonuclease to cut away the anti-aptamer strands from their exposed 5′ ends. The lambda exonuclease is slowed down at rumble zones included in our insert, in order to protect the double-stranded backbone interior between the two aptamer arms.

This ultimately linearizes the plasmid, freeing the aptamers from the anti-aptamers, making them available for target molecule binding.

While the presence of lactose allows for transcription, the presence of glucose, either alone or in addition to lactose, suppresses this transcription, so glucose can be used as another means of suppressing plasmid maturation until the final form is desired.

Therefore, both in the absence of lactose as well as in the presence of glucose, transcription, and ultimately plasmid maturation, is suppressed, allowing for the plasmid to remain intact for reproducibility, transportation, and longevity purposes.

Version 1.0

Design

We created an aptamer–anti-aptamer structure. The vector insert contains the aptamer sequences on the sense strands created upon cleavage of the double-stranded DNA at our recognition site, and anti-aptamer sequences on the anti-sense strands. Therefore, once the plasmid is cleaved by our restriction enzyme, the exposed 5′ ends would be those of the anti-aptamer sequences rather than the aptamer sequences. This ensures that only the anti-aptamer strands, with their exposed 5′ ends, are digested by the lambda exonuclease, while the aptamer strands remain fully intact.

The following are the key components that we used to build our insert:

Spacer sequences: With our program NOODL, we are able to identify spacer sequences that can go in between our aptamers without interfering with hairpin formation. These spacers are spread out across our vector to minimize aptamers from entangling with each other.
Restriction Endonuclease cut site: for pUC19, we use the KpnI restriction enzyme, which creates a double-stranded cut at the recognition sequence (5′-GGTACC-3′) [1]. This recognition site is placed in the middle of our vector insert to, once cut, create the aptamer arms.
Restriction Endonuclease Coding Sequence: The coding sequence is under the LacZ promoter of pUC19 to create the restriction endonuclease that cleaves at our middle restriction site.
Lambda Exonuclease Coding Sequence: Once the restriction endonuclease has linearized the DNA carrying our aptamers, we need a way to cut the anti-aptamer away from the backbone, as its presence would impede the efficacy of the aptamer. We chose Lambda exonuclease because it digests from any exposed 5′-phosphorylated end of dsDNA, which our design places on the anti-aptamer strand. The Lambda exonuclease coding sequence is under the control of the LacZ promoter, and we have attached a HisTag for production quantification. We include start and stop codons to define where translation begins and ends, also as part of our lactose-activated promoter system.
Rumble Zone: These are four consecutive pause sites intended to slow Lambda exonuclease. These are placed on each side of the insert to protect the backbone from exonuclease digestion.

To create our plasmid, we utilized Golden Gate Assembly for multi-fragment insertion and constructed assembly models through Geneious Software [2]. Golden Gate Assembly uses a restriction enzyme to create overlapping “sticky ends” between each fragment and the associated backbone. We chose the BsaI restriction enzyme due to the recognition site (5′-GGTCTC-3′) not being present anywhere else on the plasmid, which ensures no off-target digestion. Our fragments consisted of our vector, in the form of a gBlock and Ultramer duplex, and our pUC19 plasmid backbone. The gBlock was designed to already include the BsaI restriction sites flanking both ends.

Build

Using PCR with flagged primers, we added BsaI sites to the ultramer, gBlock, and backbone so that BsaI digestion would create the “sticky ends” for scarless plasmid assembly. We used multistep Golden Gate assembly at a 1:2:2 molar ratio, with the backbone at 75 ng, the geneblock at 77 ng, and the ultramer at 21.6 ng. After Golden Gate Assembly, we transformed DH5α E. coli cells, either by electroporation or by using chemically competent Mix & Go! cells (Zymo Research). Once transformed, these cells can be grown in 5 mL LB cultures with 5 µL of ampicillin and on LB plates that have been treated with ampicillin. Plates are then grown overnight at 37 °C while the cultures are additionally shaken at 80 rpm.
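
Because molar ratios depend on fragment length, the mass of each part has to be converted to moles before the ratio can be checked. The sketch below shows that conversion under stated assumptions: it uses the common approximation of 650 g/mol per base pair of dsDNA, and the fragment lengths in the example are hypothetical placeholders, not the lengths of our actual parts.

```python
AVG_BP_MASS = 650.0  # approximate g/mol per base pair of dsDNA (assumption)

def ng_to_fmol(ng: float, length_bp: int) -> float:
    """Convert a dsDNA mass in nanograms to femtomoles."""
    return ng * 1e6 / (length_bp * AVG_BP_MASS)

def fmol_to_ng(fmol: float, length_bp: int) -> float:
    """Convert femtomoles of dsDNA back to nanograms."""
    return fmol * length_bp * AVG_BP_MASS / 1e6

def insert_mass_ng(backbone_ng: float, backbone_bp: int,
                   insert_bp: int, insert_to_backbone: float = 2.0) -> float:
    """Mass of insert needed for a given insert:backbone molar ratio."""
    backbone_fmol = ng_to_fmol(backbone_ng, backbone_bp)
    return fmol_to_ng(backbone_fmol * insert_to_backbone, insert_bp)

# Hypothetical lengths for illustration only.
BACKBONE_BP = 2686   # length of native pUC19; an amplified backbone may differ
INSERT_BP = 1000     # placeholder insert length

needed = insert_mass_ng(75.0, BACKBONE_BP, INSERT_BP, insert_to_backbone=2.0)
print(f"75 ng backbone = {ng_to_fmol(75.0, BACKBONE_BP):.1f} fmol")
print(f"insert needed for a 2:1 ratio: {needed:.1f} ng")
```

Running the same functions with measured lengths and NanoDrop concentrations gives a quick cross-check on hand calculations before a reaction is set up.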

Test

With our Golden Gate PCR product, we ran a gel consisting of two Golden Gate reactions that underwent DNA cleanup (see DNA Cleanup in Protocols). For this we utilized a 1 kb plus DNA ladder and a 1% agarose gel.

Figure 1: 1% gel electrophoresis of Golden Gate reactions — **Figure 1.** 1% Gel electrophoresis visualized with SYBR Safe dye and NEB 1 kb Plus Ladder. Well 1: DNA ladder, Well 2: Golden Gate reaction 1, Well 3: Golden Gate reaction 2.

Learn

After gel analysis, we concluded that we may not be seeing any visible amplicons because the reaction product had not been transformed. To remedy this, we chose to visualize the reaction after transformation and to add a control to determine whether a thermocycler setting was at fault.

Version 1.1

Design

We performed colony PCR of 10 grown colonies using gene-specific primers to confirm that our Golden Gate reaction integrated our gene insert. We used two gene-specific primers: one flanking the backbone and the geneblock, and the other flanking the ultramer. For the initial Golden Gate gel, we did not provide a control of native pUC19; in hopes of improving our results, we provided one here. We used the NEB OneTaq 2X Master Mix for nine 50 µL reactions (see Colony PCR SUMO in Protocols). 1% gel electrophoresis was then used to visualize amplicons.

Build

With colony PCR products, we ran a gel consisting of three different colonies from an index plate made of LB and ampicillin consisting of nine colonies. For this we utilized native pUC19 and a 1% agarose gel. The first four wells were initially run and re-run on the other end of the gel after fixing a calculation error.

Test

Figure 2: 1% gel electrophoresis for Version 1.1 colony PCR products — **Figure 2.** 1% Gel electrophoresis visualized with SYBR Safe dye and NEB 1 kb Plus Ladder. Well 14: DNA ladder, Well 13: Native pUC19, Well 12: Colony 8, Well 11: Colony 7, Well 10: Colony 6.

Learn

With the visualization of pUC within the gel and having strictly primer bands in wells 13–10, we hypothesized a possible problem with the amount of ampicillin, our cells, or the gene-specific primers.

Version 1.2

Design

We performed colony PCR using plasmid and gene specific primers to confirm that our golden gate integrated our gene insert and troubleshoot the concerns of our previous gel. We ran our golden gate product against both native pUC19 and native untransformed DH5α cells. We used NEB OneTaq 2X Master Mix Protocol for ten 50 µL colony PCR reactions with a 30 second initial denaturation time. 1% Gel electrophoresis was then used to visualize amplicons. For our two controls, we expected to see no bands for DH5α to verify that they were not already transformed with a plasmid, as well as having a band around 600 bp for pUC19, as it was treated with the gene specific primers.

Build

To optimize our gel, we utilized two different sets of primers to test our new plasmid specific primers while validating our gene specific primers. The plasmid specific primers aid us in testing through allowing visualization of “zeros” within the colony PCR by amplifying the area of our gene insert but with the annealing site being backbone-specific. Gene specific primers, on the other hand, allow us to see if we have the correct size insert by annealing to both sides of the insert and amplifying. The gene specific primers cannot bind to a “zero” pUC19 plasmid due to the lack of annealing sites present, therefore no band would show up at all.
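
The difference between the two primer sets can be illustrated with a toy in-silico PCR. This is a minimal sketch assuming exact primer matches on a linear template; all sequences are short hypothetical placeholders, not our real primers or insert, and `amplicon_length` is an illustrative helper rather than part of any tool we used.

```python
def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def amplicon_length(template: str, fwd_primer: str, rev_primer: str):
    """Product length if both primers anneal in the right orientation, else None."""
    start = template.find(fwd_primer)
    if start == -1:
        return None
    # The reverse primer anneals where its reverse complement sits on the top strand.
    rev_site = template.find(reverse_complement(rev_primer), start + len(fwd_primer))
    if rev_site == -1:
        return None
    return rev_site + len(rev_primer) - start

# Hypothetical toy sequences.
left, right = "GATTACAGGCTA", "TTGACCGTAAGC"       # backbone on either side of the site
insert = "CATCGTAGGCTTAACGGTCAGGATCCTTAGCAATCG"    # stand-in for the assembled insert
native = left + right                              # a "zero" plasmid with no insert
assembled = left + insert + right

plasmid_fwd, plasmid_rev = left[:8], reverse_complement(right[-8:])
gene_fwd, gene_rev = insert[:8], reverse_complement(insert[-8:])

for name, template in (("native", native), ("assembled", assembled)):
    print(name,
          "plasmid primers:", amplicon_length(template, plasmid_fwd, plasmid_rev),
          "gene primers:", amplicon_length(template, gene_fwd, gene_rev))
```

In this toy model the plasmid-specific pair gives a short product on the native template and a longer one on the assembled template, while the gene-specific pair gives no product at all on the native template.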

Test

Figure 3: 1% gel electrophoresis for Version 1.2 colony PCR products — **Figure 3.** 1% Gel electrophoresis visualized with SYBR Safe dye and NEB 1 kb Plus Ladder. Well 1: Ladder, Well 2: DH5α cells, Well 3: pUC19, Wells 4–13: colonies, Well 14: Ladder. For Wells 2–8, gene specific primers were used. For Wells 9–13, plasmid specific primers were used.

Learn

With the verification of our expected control results and a lack of bands for our colonies, we focused on primer bands and remnants of chromosomal DNA. For gene specific primers, we saw a lack of chromosomal DNA in all wells except well 7. For plasmid specific primers, we saw primer bands and a complete lack of chromosomal DNA. After verifying the plasmid primer sites in Geneious, and knowing that the cells had to have taken up the plasmid in order to grow on ampicillin plates, we decided to look over the plates again to determine any mistakes in ampicillin calculation or the possibility of killing our cells throughout the process. We then learned that our ampicillin calculations had assumed a stock concentration of 1000 mg/mL rather than the correct concentration of 100 mg/mL. We then pivoted to remaking the plates and redoing our golden gate reaction and transformation.
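
The effect of the stock-concentration mix-up can be worked through with C1V1 = C2V2. The sketch below assumes a target of 100 µg/mL (the final concentration used in our later plates) and a 5 mL culture; the function names are hypothetical and the volumes are illustrative.

```python
def stock_volume_ul(final_ug_per_ml: float, culture_ml: float,
                    stock_mg_per_ml: float) -> float:
    """Microliters of stock needed, from C1*V1 = C2*V2."""
    stock_ug_per_ml = stock_mg_per_ml * 1000.0
    return final_ug_per_ml * culture_ml / stock_ug_per_ml * 1000.0

def delivered_ug_per_ml(volume_ul: float, culture_ml: float,
                        stock_mg_per_ml: float) -> float:
    """Final concentration actually reached with a given stock and volume."""
    return (volume_ul / 1000.0) * (stock_mg_per_ml * 1000.0) / culture_ml

# Volume calculated under the mistaken assumption of a 1000 mg/mL stock.
planned_ul = stock_volume_ul(100.0, 5.0, stock_mg_per_ml=1000.0)
# Concentration that volume delivers when the stock is really 100 mg/mL.
actual = delivered_ug_per_ml(planned_ul, 5.0, stock_mg_per_ml=100.0)
print(f"planned volume: {planned_ul:.2f} uL, delivered: {actual:.1f} ug/mL")
```

Under these assumptions the culture receives one tenth of the intended ampicillin, which would weaken selection for plasmid-carrying cells.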

Version 1.3

Design

Based on the previous cycle, we chose to use the plasmid specific primers because they visualize “zeros”, native plasmids, and golden gate products. We then optimized our thermocycler settings by increasing the time for the initial denaturing step to five minutes.

Build

After adjusting the ampicillin calculation and redoing the golden gate reaction and transformation, we created four new cultures, of which only two grew. Those then went through plasmid isolation and were run on a 1% agarose gel.

Test

Our reformed colony PCR protocol was created with a longer initial denaturation time of 5 minutes instead of 30 seconds and only using plasmid-specific primers. The amplicons were visualized with 1% gel electrophoresis.

Figure 4: 1% gel electrophoresis for Version 1.3 colony PCR of golden gate transformed cells — **Figure 4.** 1% gel electrophoresis visualized using SYBR Safe gel stain for colony PCR of golden gate transformed cells. Well 1: 1 kb Plus Ladder, Wells 2–3: golden gate reactions 1, 2.

Learn

With an expected band size of around 2.1 kb, the band shown in the gel led us to three main hypotheses: that we had not been using pUC19 as we thought, that the insert design was incorrect, or that there was a miscalculation in the golden gate math. During review, we noticed that the geneblock has similar overlapping ends, which would allow copies to anneal to one another. With that in mind, we learned that our product value was too high, telling us that it was a band of annealed geneblock. To fix this, we worked to nanodrop each part and double-check our math.

Version 1.4

Design

In order to provide our golden gate with the best chance of success, we carefully redid each step that led up to the thermocycler reaction.

Build

Under blue–white screening, we found that some colonies were white while others were blue. This tells us that some plasmids could be broken and therefore unable to act in the way we intended. This allowed us to grow blue colonies in liquid culture and perform inverse PCR in order to make a new backbone that we know for sure has all of our required components. The geneblock and the ultramer were both amplified, and fresh OneTaq was used to avoid any possible issues with freeze–thaw sensitivity.

Test

Our reformed colony PCR protocol was created with eight colonies and gene specific primers. The amplicons were visualized with 1% gel electrophoresis.

Figure 5: 1% gel electrophoresis for Version 1.4 colony PCR — **Figure 5.** 1% Gel electrophoresis visualized with SYBR Safe dye and NEB 1 kb Plus Ladder. Well 1: Ladder, Well 2: pUC19 treated with gene specific primers, Wells 3–10: Colonies 1–8.

Learn

Through multiple rounds of testing golden gate plasmid reactions, isolated plasmids, and colony PCR reactions, we were unable to get the expected amplicons from our golden gate reaction. Possible reasons include non-specific binding, incorrect concentration ratios, incorrect transformation, or the lack of a selection factor specific to our assembly product. Had we added a selection factor not present within the native plasmid, growth under that selection would have picked out plasmids containing our desired insert.

Version 2.0

Design

When transforming our native pUC19 plasmid into the DH5α strain of E. coli, we had questions on whether the pUC19 plasmid had successfully transformed our cells, and determined that we could use a procedure known as “blue-white screening” as a tool to verify this.

This screening relies on a few conditions. The first is that the host cell, in our case DH5α, has a lacZΔM15 deletion mutation. This mutation leaves the cell incapable of producing functional β-galactosidase, the enzyme that cleaves lactose, on its own [1].

Although this mutation leaves the cell unable to produce the enzyme on its own, the cell still produces the omega fragment of β-galactosidase. On its own, this fragment is nonfunctional. However, the pUC19 plasmid encodes the portion of β-galactosidase that was removed from DH5α by the lacZΔM15 mutation. This is known as the alpha fragment.

When DH5α containing the omega fragment is successfully transformed with a pUC19 plasmid containing the alpha fragment, the transformant is capable of physically associating the two components, effectively creating functional β-galactosidase. This process is known as alpha-complementation.

When these transformants capable of alpha-complementation producing β-galactosidase are in the presence of a substrate known as X-Gal (5-bromo-4-chloro-3-indolyl-β-D-galactoside), the β-galactosidase hydrolyzes the substrate, ultimately producing a colony with a stark blue color [2].

Build

Upon transforming chemically competent DH5α cells with our pUC19 plasmid purchased from a vendor, we plated these transformed cells on LB + ampicillin (final concentration: 100 µg/mL) agar plates, and incubated overnight at 37 °C. With the appearance of colonies, we picked 16 well-defined individual ones off with the tip of a pipette, mixing them into individual solutions in 10 µL of DI water. From these created samples, we were then able to streak multiple plates.

We proceeded with this process, plating each of the sixteen colony samples on plates made with medium consisting of 500 mL of autoclaved LB + agar, to which, upon cooling to around 55 °C, we added 500 µL of 1000× ampicillin stock (final concentration: 100 µg/mL), 1.25 mL of 400× X-Gal stock (final concentration: 50 µg/mL), and 500 µL of 1000× IPTG stock (final concentration: 0.5 mM). We then incubated these plates overnight at 37 °C.
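
The plate recipe above can be sanity-checked with a few lines of arithmetic. This sketch takes the stock strengths, volumes, and final concentrations directly from the recipe and checks that each additive ends up at 1×; the helper names are hypothetical, and the small volume increase from adding the stocks is ignored.

```python
MEDIUM_ML = 500.0

# name: (stock strength in "x", stock volume in mL, stated final concentration, unit)
ADDITIVES = {
    "ampicillin": (1000, 0.500, 100.0, "ug/mL"),
    "X-Gal": (400, 1.250, 50.0, "ug/mL"),
    "IPTG": (1000, 0.500, 0.5, "mM"),
}

def working_fold(stock_fold: float, stock_ml: float,
                 medium_ml: float = MEDIUM_ML) -> float:
    """Strength of the additive in the plate medium (1.0 means 1x)."""
    return stock_fold * stock_ml / medium_ml

def implied_stock(final_conc: float, stock_fold: float) -> float:
    """Stock concentration implied by an N-fold stock and its 1x concentration."""
    return final_conc * stock_fold

for name, (fold, ml, final, unit) in ADDITIVES.items():
    print(f"{name}: {working_fold(fold, ml):.2f}x in plate, "
          f"implied stock = {implied_stock(final, fold):g} {unit}")
```

The implied ampicillin stock works out to 100,000 µg/mL, that is 100 mg/mL, which agrees with the corrected stock concentration identified in Version 1.2.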

Test

Upon performing this screening, we found that growing cells from colonies that were expected to be pUC19 transformed DH5α on the plates detailed above produced both individual white and blue colonies. These white colonies remained, even after prolonged incubation.

Figure 6: Blue–white screening plates for pUC19 in DH5α — **Figure 6.** Two plates, each with four different cell cultures taken from anticipated pUC19-transformed DH5α colonies, grown up on LB + ampicillin + X-Gal + IPTG media, showed mixed results with both blue and white colonies present.

Learn

The ampicillin + X-Gal + IPTG plate serves as a test for whether the DH5α cells have been successfully transformed with pUC19 in a manner that allows for alpha-complementation to generate the β-galactosidase enzyme responsible for the blue pigment.

Blue colonies indicate successful alpha-complementation between the transformed DH5α and native pUC19 plasmid.

White colonies indicate a lack of this complementation, and suggest that the components required are not present in the transformant.

This suggests that the pUC19 plasmid stock we purchased may not be entirely pure pUC19 plasmid. The ability for colonies to grow on these ampicillin plates without turning blue demonstrates that the cells had taken up some component that has ampicillin resistance, but did not contain the functional alpha fragment required for alpha-complementation. Therefore, while our pUC19 stock did include functional and complete plasmids, this is not true for the entirety of the product.

Version 2.1

Design

Concerned that the presence of both blue and white colonies may be attributed to an issue in the plates themselves rather than to the colonies consisting of both complete and incomplete pUC19 transformants, we decided to once again select the successful transformants, as indicated by the blue colonies, and perform the same screening process as before.

This was designed so that, when samples from blue colonies of intact pUC19-transformed DH5α cells were replated in the same conditions, the presence of solely blue colonies would indicate that the previous white colonies were not a result of the plates, but instead a result of incomplete alpha-complementation.

Build

We repeated the same protocol as before in re-plating blue selected colony samples on the same batch of ampicillin + X-Gal + IPTG plates. These were then incubated again at 37 °C overnight.

Test

When these individual blue colonies were re-plated and incubated, they grew up only blue colonies.

Figure 7: Replated blue colonies remain blue on LB + ampicillin + X-Gal — **Figure 7.** All blue colonies, when selected and replated on the same batch of LB + ampicillin + X-Gal plates, grew up only blue colonies.

Learn

The blue selected colonies, when grown up again on the same batch of blue-white screening plates, resulted in only blue colonies. This demonstrates that the white colonies from the initial plating were not due to ineffective plate media, but rather a result of ampicillin-resistant DH5α without complete alpha-complementation.

We are sure that ampicillin was functioning, as the colonies did not grow a lawn, even after prolonged incubation periods, demonstrating the selection factor of the antibiotic was functional.

These findings, in conjunction with one another, reinforced the idea that the pUC19 plasmid stock we were working with did not consist solely of pure pUC19 plasmid. There was clearly some additional component within the stock that was capable of transforming DH5α cells, giving them ampicillin resistance, without providing the alpha fragment needed for alpha-complementation.

Additionally, this conclusively confirmed that for colonies that were blue, alpha-complementation was successful, and the LacZ alpha, lac promoter, terminator, and ribosome binding site components are all active and usable in these transformed cells.

These findings also suggested that we may not have been using a complete pUC19 plasmid in our golden gate reactions. This initiated the idea of creating a liquid culture from blue selected colonies and performing a plasmid miniprep on it to obtain pure pUC19 plasmid for a new golden gate assembly reaction.

Version 3.1

Design

As part of our deliverable, we had questions regarding whether or not the LacI regulatory gene was present in DH5 alpha cells. LacI, the regulatory gene of the lac operon, encodes the lac repressor protein in E. coli, where it regulates transcription in the absence of lactose [3]. When the cell is not in the presence of lactose, this repressor binds to the operator site of the lac operon. This prevents the transcription of genes downstream of the promoter region [4].

However, if lactose is present, the lactose molecules bind to LacI, forcing it to release the operator. This allows for the transcription of genes downstream from the promoter of the lac operon, including those critical for lactose metabolism, which is the process that contributed to the blue pigmentation in blue-white screening.

Through researching this question, we found conflicting sources. Not only were we unable to determine whether or not the DH5 alpha E. coli cells contain LacI, but we were also unable to conclusively determine through literature whether, if present, this LacI would be active, and function in a manner that would provide us with the repression capabilities we were looking for.

In order to confirm whether the LacI gene was present in our cells, we had to determine whether lactose was required for the lac operon to be de-repressed. One method of doing so is to use a lactose mimic, such as isopropyl β-D-1-thiogalactopyranoside (IPTG), in conjunction with the X-Gal substrate when plating pUC19-transformed DH5α cells.

If colonies on the X-Gal only plate are solely white, but blue colonies are present on the plate containing IPTG as well as X-Gal, then LacI is present and effectively represses the operator site of the lac operon. If blue colonies are present on both plate types, then LacI is not present in the system, and lactose (or a lactose mimic) is not required to derepress the lac operon to allow for transcription of downstream genes.
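
The decision rule for this experiment can be written out explicitly. The function below is only an illustrative encoding of the two outcomes described above plus a catch-all for unexpected results; the name `interpret_laci_test` and the wording of the returned messages are our own assumptions.

```python
def interpret_laci_test(blue_without_iptg: bool, blue_with_iptg: bool) -> str:
    """Map plate outcomes (X-Gal only vs X-Gal + IPTG) to an interpretation."""
    if blue_with_iptg and not blue_without_iptg:
        return "LacI present and repressing: an inducer is required"
    if blue_with_iptg and blue_without_iptg:
        return "no effective LacI repression: an inducer is not required"
    return "inconclusive: re-check plates and transformants"

# Walk through all four possible plate outcomes.
for without_iptg in (False, True):
    for with_iptg in (False, True):
        print(f"X-Gal only blue={without_iptg}, X-Gal+IPTG blue={with_iptg}: "
              f"{interpret_laci_test(without_iptg, with_iptg)}")
```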

Build

From the previous DBTL cycle, we had successful blue colonies, which we were then able to create index plates from. The validity of these index plates having only successful transformants was confirmed in the previous cycle, due to the fact that when samples from these index plate colonies were replated on blue-white screening media, only blue colonies grew.

From these index plates, we were then able to plate the same sample among multiple plates of different medium types.

We proceeded to do just this, plating each of the sixteen colony index plate samples on two different plate types.

The first plate medium consisted of 500 mL of autoclaved LB + agar, which upon cooling to around 55 °C, we added 500 µL of 1000× ampicillin stock (final concentration: 100 µg/mL), as well as 1.25 mL of 400× X-Gal stock.

The second plate medium was prepared exactly the same, except for the addition of 500 µL of 1000× IPTG stock.

These plates were then incubated at 37 °C overnight.

Test

Both of the plate media types successfully grew blue colonies, and only blue colonies, for each index colony sample.

Figure 8: Blue colonies on LB+Amp+X-Gal with and without IPTG — **Figure 8.** Blue colonies from the same sample grew blue colonies once again on LB + ampicillin + X-Gal plates, both in the presence of and absence of IPTG.

Learn

The results of blue colonies on both media types indicate that LacI is absent from, or at least not effectively repressing the lac promoter in, this system, and therefore that lactose (or a lactose mimic) is not required to derepress the lac operon to allow for transcription of downstream genes.

This also indicates that there is no natural repression system of the lac operon promoter in this system.

Version 3.2

Design

Realizing that a natural repressor was not present in the system, we sought out a new method to perform this functionality. We realized that the binding of RNA polymerase to the lac operon promoter requires the assistance of a catabolite activator protein (CAP). CAP binds to a DNA region located upstream of the lac operon promoter and assists the binding of RNA polymerase to the promoter itself, which initiates transcription of all downstream genes [5].

Without CAP functioning in this way, RNA polymerase's ability to bind and begin transcription is extremely limited. CAP’s DNA-binding function depends on the presence of cyclic AMP (cAMP), a small signaling molecule that is not produced at high levels when glucose is abundant.

Therefore, high concentrations of glucose ultimately inhibit the expression of the lac operon in cells through preventing cAMP production, which inhibits CAP's DNA binding ability, and thus limits the binding ability of the RNA polymerase.

This concept could thus be tested through plating confirmed blue colonies on LB + ampicillin + X-Gal plates, both with and without glucose, and then examining if the amount of time required for the development of blue colonies is prolonged when glucose is present compared to in its absence.

Build

With the concentration guidelines determined in previous DBTL cycles for blue-white screening plate making, we prepared plates consisting of LB + ampicillin + X-Gal. We then repeated this process for the second plate media type, this time including a filter-sterilized glucose solution, bringing the medium concentration to 1% glucose.

Using confirmed blue colony index plate samples, we then plated identical samples on both the plate types. These plates were then incubated at 37 °C.

Test

While the 0% glucose plates grew colonies that were visibly becoming blue within 8 hours of incubation, the 1% glucose plates did not begin to display this blue coloring until 36 hours of incubation.

Figure 9: 0% vs 1% glucose—blue-white screening timeline — **Figure 9.** LB + ampicillin + X-Gal plates made with 1% glucose (left) show suppressed development of blue colonies, as demonstrated by this image taken 36 hours into incubation. The left plate with 1% glucose has only just begun to develop small blue colonies (shown above the “X” in the “X-Gal” label).

Learn

The difference in the appearance timeline of blue colonies between the 0% and 1% glucose solution plates indicates a 28 hour repression period of the lac operon in these pUC19 transformed cells when in the presence of 1% glucose media.

This confirms the functionality of glucose as a suppression method for the transcription of the lac operon, and therefore demonstrates that glucose may be an effective tool at ensuring that the maturation of our plasmid does not occur spontaneously.

Cycle 4 Citations

References

[1] A. A. Hamed, M. Khedr, and M. Abdelraof, “Activation of LacZ gene in Escherichia coli DH5α via α-complementation mechanism for β-galactosidase production and its biochemical characterizations,” Journal of Genetic Engineering and Biotechnology, vol. 18, no. 1, Dec. 2020. DOI: 10.1186/s43141-020-00096-w.
[2] M. R. Green and J. Sambrook, “Screening Bacterial Colonies Using X-Gal and IPTG: α-Complementation,” Cold Spring Harbor Protocols, vol. 2019, no. 12, p. pdb.prot101329, Dec. 2019. DOI: 10.1101/pdb.prot101329.
[3] B. Görke, “Activity of Lac repressor anchored to the Escherichia coli inner membrane,” Nucleic Acids Research, vol. 33, no. 8, pp. 2504–2511, Apr. 2005. DOI: 10.1093/nar/gki549.
[4] Khan Academy, “The Lac Operon,” Khan Academy, 2021. Available at: https://www.khanacademy.org/.../the-lac-operon .
[5] A. Tankeshwar, “Lac Operon: Mechanism and Regulation,” Learn Microbiology Online, Nov. 27, 2019. Available at: https://microbeonline.com/lac-operon-mechanism .

Software 2 iterations + citations

Iteration 1

Design

Our computational team set out to design single-stranded DNA spacer sequences that do not interact with neighboring coding regions of DNA. A common biological linker is a polyA tail, which consists of many adenine bases strung together. This is not optimal when working with aptamers because any of the bases in the polyA tail could pair with thymine bases in the aptamer or flanking sequence. Spacer sequences are shaped by user-given parameters, such as target A/T content and specific crossing over methods, with the goal of avoiding base pair complements to neighboring sequences and minimizing specific motifs (such as palindromes, repeats, and folding). The key tool that NOODL implements is a genetic algorithm (GA), which is a search-based optimization technique relying on principles of cell biology, genetics, and natural selection [1]. Within the GA, each spacer is evaluated in kmers (DNA substrings of length k), with each kmer scored individually and the penalties summed into a total score. The lower the score, the better the spacer.

An important requirement for our application, driven by our experimental procedures, was to strictly avoid interaction between a spacer sequence and its flanking regions. Mutation does not occur in these fixed regions and therefore the crossover and mutation functions act only on the variable portion of the spacer, thus preserving the shape of the aptamer. When brainstorming how to write our scoring function, we decided that it would penalize any predicted hybridization. Since this folding can inhibit target molecule/aptamer binding, it was essential to build a tool to prevent it.

**Figure 1.** “Seagull” formation with P6 (blue) and P4G03 (red) single-stranded aptamer arms attached. In the body of the seagull is the complementary sequence of 25 bp to limit the flexibility of the aptamers. Limiting flexibility reduces the chance of either aptamer impeding the other’s hairpin formation.

The “seagull” structure contains a double-stranded plasmid backbone, along with spacers and aptamers with overhangs that come off of each backbone strand. This required us to consider the possibility of spacers on the top strand of our sequence potentially interacting with spacers on the bottom strand. This gave rise to an initial limitation of NOODL’s design, where spacers were incorrectly made under the assumption that aptamers were evaluated moving from the 5’ to 3’ end.

Build

NOODL modules include:

noodl.jl: Contains the main genetic algorithm functions, parsing, and imports modules from the other Julia files.
Crossover: Implements multiple types of crossing over methods (Multipoint, Single, Uniform). Initially chose a random crossover method as default if the user did not input a selection.
RCScore: Computes total penalties; facilitates reverse complement search.
Bias Selection: Implements multiple types of bias for parent selection; this drives convergence rate of the GA. Good parents allow for fitter offspring (Stochastic Universal Sampling, Roulette, and Tournament). Selects random bias as the default if the user did not input a selection.
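
To make the operator names above concrete, here is a minimal Python sketch of tournament selection and the three crossover styles. NOODL itself is written in Julia, so this is not its code: the function names, signatures, and default values are assumptions chosen for illustration, and Roulette and Stochastic Universal Sampling are omitted for brevity.

```python
import random

def tournament_select(population, scores, size=3, rng=random):
    """Pick `size` random individuals and return the one with the lowest score."""
    contenders = rng.sample(range(len(population)), size)
    return population[min(contenders, key=lambda i: scores[i])]

def single_point(a, b, rng=random):
    """Child takes the start of parent a and the end of parent b."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def multipoint(a, b, points=2, rng=random):
    """Child alternates between parents at several random cut points."""
    cuts = sorted(rng.sample(range(1, len(a)), points))
    pieces, source, prev = [], 0, 0
    for cut in cuts + [len(a)]:
        pieces.append((a, b)[source][prev:cut])
        source, prev = 1 - source, cut
    return "".join(pieces)

def uniform(a, b, rng=random):
    """Each position is drawn independently from one of the two parents."""
    return "".join(rng.choice(pair) for pair in zip(a, b))

rng = random.Random(0)          # fixed seed so the demo is repeatable
parent_a, parent_b = "A" * 12, "T" * 12
for operator in (single_point, multipoint, uniform):
    print(operator.__name__, operator(parent_a, parent_b, rng=rng))
```

Using two visually distinct parents makes it easy to see where each operator switches between them.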

To combat the earliest limitation of NOODL, we modeled the right flank sequence as its reverse complement to mimic the DNA’s behavior on the opposite strand.

Test

Run Sequence:

Parse user input for parameters such as spacer length, target AT content, populations, generations, crossover type, bias selection type, mutation rate, flanking sequences, etc.
Generate a pool of random initial sequences with inputted length and rough A+T percentage.
Create a kmer dictionary for randomly generated sequence and inputted flanking regions; create a second dictionary of reverse complement kmers of the initial dictionary.
Assign penalties based on number of kmer matches between the two dictionaries. Also penalize based on proximity to user-inputted AT content.
Select parents using a bias method; apply crossover and mutation functions.
Re-score and repeat across the desired amount of generations to determine single best spacer sequence.
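
The run sequence above can be sketched end to end in a few dozen lines. This is a simplified Python illustration under stated assumptions, not NOODL's Julia implementation: the penalty weights, the use of sets rather than counted dictionaries, the per-base mutation model, and every function name here are our own choices for the example. The caller is assumed to pass the right flank already converted to its reverse complement where that applies.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def kmers(seq, k):
    """Set of all substrings of length k (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def score(spacer, left, right, k=5, target_at=0.5, at_weight=10.0):
    """Lower is better: reverse-complement kmer hits plus A+T deviation."""
    spacer_kmers = kmers(spacer, k)
    context = spacer_kmers | kmers(left, k) | kmers(right, k)
    rc_context = {reverse_complement(kmer) for kmer in context}
    hits = len(spacer_kmers & rc_context)
    at_fraction = sum(base in "AT" for base in spacer) / len(spacer)
    return hits + at_weight * abs(at_fraction - target_at)

def random_spacer(length, target_at, rng):
    """Random sequence with roughly the requested A+T fraction."""
    return "".join(rng.choice("AT") if rng.random() < target_at else rng.choice("CG")
                   for _ in range(length))

def mutate(spacer, rate, rng):
    """Replace each base with a random base with probability `rate`."""
    return "".join(rng.choice("ACGT") if rng.random() < rate else base
                   for base in spacer)

def evolve(left, right, length=20, k=5, target_at=0.5,
           pop_size=100, generations=50, mutation_rate=0.2, seed=0):
    rng = random.Random(seed)
    fitness = lambda s: score(s, left, right, k, target_at)
    population = [random_spacer(length, target_at, rng) for _ in range(pop_size)]
    best = min(population, key=fitness)
    for _ in range(generations):
        scores = [fitness(s) for s in population]

        def pick_parent():
            # Tournament of three: the lowest-scoring contender wins.
            contenders = rng.sample(range(pop_size), 3)
            return population[min(contenders, key=scores.__getitem__)]

        children = []
        for _ in range(pop_size):
            a, b = pick_parent(), pick_parent()
            cut = rng.randrange(1, length)      # single-point crossover
            children.append(mutate(a[:cut] + b[cut:], mutation_rate, rng))
        population = children
        best = min(population + [best], key=fitness)   # keep the best ever seen
    return best, fitness(best)

# Hypothetical flanks for demonstration only.
spacer, spacer_score = evolve("ACGTTGCAACGGTA", "GGATCCTTAGCATG",
                              length=20, pop_size=40, generations=20)
print(spacer, spacer_score)
```

Only the variable spacer is mutated or recombined; the flanks enter the score but are never changed, matching the requirement that fixed regions stay fixed.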

We took multiple sequences produced by NOODL and ran them through UNAFold’s DNA Folding Form [2] to observe folding differences and folding under different temperatures. Based on the differences observed, we refined NOODL’s input parameters. The goal was to identify the parameters that consistently produced 1) a low internal score and 2) desirable folding.

Our main changes in testing included:

Bias Selection: experimented with different kinds of bias to see which types of bias gave us the lowest scoring sequences
Crossover Functions: experimented with different crossing over points in the sequence to see which point gave us the lowest scoring sequences
Mutation Rate: experimented with values between 0.05 and 0.2
Population Size: experimented with values between 100 and 300
Number of Generations: experimented with values between 0 and 100
Kmer Count: the size of the kmers we are counting, which ranged from 5 to 8

We tested multiple values of the listed parameters until finding the lowest scoring combination.

Learn

We learned that Tournament Selection bias increased population diversity and reduced repeat parent picks within a selection pass. This led to steadier convergence than Roulette Wheel bias or Stochastic Universal Sampling bias. It also reduced the program’s runtime by more than half. Additionally, the Multipoint Crossover method consistently produced lower scoring sequences than other crossover methods, validating our decision to set it as the default. We also deduced that the reverse complement penalty system worked well as a scoring method because after plotting the structure of final sequences, we observed hairpins of length k-1. When any hairpins had a length greater than k-1, they were proven to be thermodynamically impossible.

Ideal input parameters (learned after testing):

Bias Selection: Tournament
Crossover type: Multipoint
Mutation Rate: 0.2
kmer Length: 5

Iteration 2

Design

Our next iteration of DBTL grew out of a thermodynamic rethinking of what makes a spacer truly optimal. Another limitation we encountered was that NOODL did not initially account for the junctions between spacers and adjacent fixed elements, which could lead to unwanted interactions with surrounding DNA. NOODL needed to treat fixed prefix and suffix regions (the beginnings and ends of aptamers concatenated to the spacer) as immutable. The optimal spacer would contain k-1mers, interrupted by base pairs that thermodynamically could not exist. This would prohibit a sequence of length k. Another property of an optimal spacer is a high degree of entropy, which means that there are many possible structures that the spacer can take. This makes it thermodynamically unfavorable for the spacer to settle into any single folded structure.

Build

NOODL modules include:

BleedingFlanks: builds sets of boundary kmers from the flank content that span the junctions between fixed and variable nucleotides, then scores complement hits.

To put the BleedingFlanks module to use, we implemented prefix and suffix parameters. These short fixed regions are immutable, but they are included in the scoring process so that kmers spanning the spacer-flank junction are properly considered. By doing this, NOODL evaluates both internal kmers that are fully contained within the spacer and kmers that “bleed over” and straddle the junction.
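A minimal sketch of this junction-aware scoring is shown below. The function names (`junction_kmers`, `complement_hits`) and the rule of counting one hit per scored kmer whose reverse complement appears anywhere in the full construct are assumptions for illustration; NOODL's real scoring may weight hits differently.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def kmers(seq, k):
    """All overlapping substrings of length k (empty list if seq is shorter than k)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def junction_kmers(prefix, spacer, suffix, k):
    """Kmers that straddle a flank/spacer boundary (requires k >= 2)."""
    # A window of k-1 bases on each side of a junction yields only
    # kmers that contain bases from both the flank and the spacer.
    left = prefix[-(k - 1):] + spacer[:k - 1] if prefix else ""
    right = spacer[-(k - 1):] + suffix[:k - 1] if suffix else ""
    return kmers(left, k) + kmers(right, k)

def complement_hits(prefix, spacer, suffix, k):
    """Count internal and junction kmers whose reverse complement occurs in the construct."""
    all_kmers = set(kmers(prefix + spacer + suffix, k))
    scored = kmers(spacer, k) + junction_kmers(prefix, spacer, suffix, k)
    # Note: a palindromic kmer (equal to its own reverse complement) counts as a hit.
    return sum(1 for kmer in scored if reverse_complement(kmer) in all_kmers)
```

For example, the spacer `CCTTTT` scores zero hits on its own at k=4, but one hit once the fixed prefix `AAAA` is included, because `TTTT` in the spacer can now pair with the flank. The flanks are read but never modified, which matches the immutability requirement.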

We also extended NOODL’s scoring algorithm to incorporate thermodynamic considerations that reflect the physical folding of the spacer at different temperatures. At higher temperatures, molecules have more vibrational energy, which makes alternative structures more likely and makes the spacer “slippery” (easily able to move from one conformation to the next). At lower temperatures, the sequence becomes more rigid because less energy is available to sample different conformations.
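The temperature effect described above can be illustrated with Boltzmann weights, where a structure with folding free energy ΔG has relative weight exp(-ΔG/RT). The sketch below is an illustration under stated assumptions, not part of NOODL: the free energies are hypothetical values, and they are treated as temperature-independent for simplicity, whereas real folding tools recompute ΔG from enthalpy and entropy at each temperature.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)

def boltzmann_probabilities(free_energies_kcal, temp_celsius):
    """Probability of each structure from its free energy (kcal/mol) at a given temperature."""
    temp_kelvin = temp_celsius + 273.15
    weights = [math.exp(-dg / (R_KCAL * temp_kelvin)) for dg in free_energies_kcal]
    total = sum(weights)
    return [w / total for w in weights]

def shannon_entropy_bits(probabilities):
    """Spread of the ensemble: higher means no single structure dominates."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Hypothetical free energies for four candidate folds of one spacer.
example_dg = [-2.0, -1.0, -0.5, 0.0]
for temp in (20, 37, 55):
    probs = boltzmann_probabilities(example_dg, temp)
    print(temp, [round(p, 3) for p in probs], round(shannon_entropy_bits(probs), 3))
```

Under these assumptions, the most stable fold loses probability as the temperature rises and the ensemble entropy grows, which is the "slippery" behavior described in the text.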

Test

Our new scoring considerations enabled NOODL to generate a spacer that also avoids hybridization with the fixed ends of the flanking sequences surrounding it.

While exploring the Geneious [3] software, we found that the temperature of the folding simulation, which we tested at 20°C, 37°C, and 55°C, had a significant impact on the spacer’s predicted folding probability. At higher temperatures, the spacer was less stable within the molecule as a whole. At lower temperatures, by contrast, the spacer was rigid.

In addition to testing various temperatures at a fixed kmer size, we experimented with different kmer sizes, from k=3 up to k=7. Minimizing the kmer size allows for more mobility within the spacer itself while leaving the aptamers preserved and structurally stable.

Learn

Our program generated a spacer that was shown to protect the structural integrity of both aptamers. Using Geneious’ DNA Folding tool, we validated the thermodynamic properties of the spacers and the aptamers: as we expected, they did not engage in any compromising interactions. At lower temperatures, the spacers typically adopted a compact, rigid fold with multiple small loops, none spanning length k or more. At higher temperatures, those loops expanded and some structural stems collapsed, meaning our spacer was most susceptible to hybridization under these conditions. This computationally confirmed our hypothesis.

We learned that a kmer size of three makes the spacers that NOODL outputs more “slippery”. In other words, minimizing the kmer size means that our spacer contains mostly kmers of length 2 or less and only sparse 3mers. This gives the spacer maximum entropy: it has so many possible conformations that settling on any one is thermodynamically unfavorable. As a result, the spacer slips easily from one conformation to another and never settles on just one.

We found that one reason the P4G03 aptamer wasn’t preserved was that it has a higher Kd (dissociation constant) than P6. The lower the Kd, the more structurally sound the aptamer would be. Towards the final days of our project, we also realized that we had been using P4G03’s sequence without its overhang, while P6 was entered with its overhang. The missing overhang led to the visible loss of structural integrity in the P4G03 aptamer when it was modeled with the spacer sequence in Geneious.

Ideal input parameters (learned after testing):

kmer Length: 3

Software Citations

References

“Genetic Algorithms Tutorial.” Tutorialspoint, https://www.tutorialspoint.com/genetic_algorithms/index.htm.
DNA Folding Form, UNAFold. https://www.unafold.org/mfold/applications/dna-folding-form.php.
Geneious Prime 2025.0. https://www.geneious.com.
