Overview
Our project, ABCS, is a breast cancer surveillance system designed for women who have had breast cancer or have undergone breast surgery. The core of the project is an effective AND-gate sensor sequence that detects RNA biomarkers of cancer-associated adipocytes and reports an output signal that is easy to detect. Here we present the four engineering cycles of the project:
Cycle 1 Identification and validation of optimal biomarker combinations
Cycle 2 Design and test of the optimal sensor sequences
Cycle 3 Exploration of the reaction kinetics of output signal
Cycle 4 Ethical assessment of our project with the Bio-Ethical Assessment Model
Four engineering cycles contribute to project progress step by step
Each engineering cycle addresses a crucial problem in developing ABCS, which we solved by combining wet lab work, dry lab work, and integrated human practices. In the Design stage, we use modeling to guide our experiments. In the Build stage, we construct the system in a suitable chassis. In the Test stage, we verify the construct experimentally. Finally, in the Learn stage, we summarize our findings and reflect on design improvements to advance the project.
Cycle 1: Identification of biomarker combinations
Design
To establish an ideal surveillance system for breast cancer, the first question is which biomarkers to detect. Published research shows that specific genes are significantly upregulated in adipocytes as they transform into cancer-associated adipocytes (CAAs) under the influence of the breast cancer tumor microenvironment. To find the single gene or gene combination with the highest specificity for breast cancer recognition, we integrated data from public databases (TCGA, GTEx, UCI) and developed a multi-layer perceptron classification model with an attention mechanism (Attention-MLP), intended to improve performance on high-dimensional data. The parameters of Attention-MLP were optimized by backpropagation on a binary cross-entropy loss, and the classification threshold was determined by Youden's index. Attention-MLP outperformed a traditional MLP across multiple UCI public datasets. Through literature review, we then shortlisted four candidate genes reported to have significantly increased expression in CAAs: PLOD2, LIF, FAM3C, and IL-6. We evaluated all combinations of these four genes with the Attention-MLP model and assessed the results using ROC curves and the AUC metric.
As illustrated in Figure 2, the PLOD2 and LIF combination achieved the highest AUC, indicating the best predictive performance, and was therefore selected as the gene combination to target in our engineered adipocytes.
ROC Curve of Gene Combinations
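To make the evaluation step concrete, the sketch below enumerates every non-empty combination of the four candidate genes and shows how an AUC and a Youden's-index threshold can be computed from classifier scores. It is a minimal illustration, not our Attention-MLP pipeline: the function names, the toy labels, and the toy scores are all assumptions made up for this example.

```python
from itertools import combinations

GENES = ["PLOD2", "LIF", "FAM3C", "IL-6"]


def gene_combinations(genes):
    """Return all non-empty subsets of the candidate genes (15 for 4 genes)."""
    return [c for r in range(1, len(genes) + 1) for c in combinations(genes, r)]


def roc_points(labels, scores):
    """Return (threshold, FPR, TPR) for each distinct score used as a cut-off.

    A sample is called positive when its score is >= the threshold.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 0)
        points.append((t, fp / neg, tp / pos))
    return points


def auc(labels, scores):
    """Area under the ROC curve, computed with the trapezoidal rule."""
    pts = [(0.0, 0.0)] + [(fpr, tpr) for _, fpr, tpr in roc_points(labels, scores)]
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


def youden_threshold(labels, scores):
    """Threshold that maximizes Youden's J = TPR - FPR."""
    return max(roc_points(labels, scores), key=lambda p: p[2] - p[1])[0]


# Toy data (made up): 1 = tumor-associated sample, 0 = normal sample.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.2, 0.5, 0.55, 0.4, 0.7, 0.8, 0.9]

print(len(gene_combinations(GENES)), "gene combinations to evaluate")
print("AUC:", auc(labels, scores))
print("Youden threshold:", youden_threshold(labels, scores))
```

In practice each gene combination would be scored by its own trained model, and the combination with the highest AUC would be kept.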
Build
Following the Attention-MLP analysis, we validated the best gene combination in wet-lab experiments. The aim was to confirm that expression of PLOD2 and LIF in human mature adipocytes is significantly and synergistically upregulated within a specific tumor microenvironment. First, human preadipocytes were differentiated into mature adipocytes using a commercial adipogenic induction kit. According to the literature, PLOD2 and LIF expression can be upregulated by PAI-1 and CXCLs, respectively, both of which are secreted by breast cancer cells, so the differentiated adipocytes were treated with PAI-1 and CXCL8. qPCR was then performed to measure the expression levels of PLOD2 and LIF, with GAPDH as the internal reference.
Test
We treated differentiated human adipocytes with PAI-1 and CXCL8 and measured the expression levels of PLOD2 and LIF by qPCR.
Expression levels of target genes after cytokine treatment
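For readers unfamiliar with qPCR quantification, relative expression against an internal reference gene is commonly computed with the 2^-ΔΔCt method. The sketch below assumes that method and roughly 100% amplification efficiency; the quantification method behind Figure 3 is not stated here, and the Ct values are invented for illustration.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^-ddCt method.

    Assumes ~100% amplification efficiency for both target and reference.
    """
    # Normalize the target to the reference gene (e.g. GAPDH) in each condition.
    d_ct_treated = ct_target_treated - ct_ref_treated
    d_ct_control = ct_target_control - ct_ref_control
    # Compare the treated sample with the untreated control.
    dd_ct = d_ct_treated - d_ct_control
    return 2 ** -dd_ct


# Made-up Ct values: the target crosses threshold 2 cycles earlier after treatment.
fold = relative_expression(24.0, 18.0, 26.0, 18.0)
print("Fold change:", fold)
```

A fold change above 1 indicates upregulation relative to untreated cells; statistical significance still has to be assessed across replicates.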
The qPCR results showed that, compared with untreated cells, PAI-1 treatment significantly upregulated PLOD2 expression and CXCL8 treatment significantly upregulated LIF expression (Figure 3).
Learn
These results confirm that the selected genes, PLOD2 and LIF, are upregulated by cytokines secreted from breast cancer cells, supporting PLOD2 and LIF as a suitable biomarker combination for detecting breast cancer in adipocytes. The wet-lab validation of PLOD2+LIF also supports the usefulness of the Attention-MLP model for biomarker identification. However, owing to time constraints we have not yet tested the other gene combinations, which is required to fully validate the accuracy of the Attention-MLP model.
Cycle 2: Optimal design of sensor sequences
Design
To design RADAR sensor sequences that target PLOD2 and LIF with high binding specificity, we used a variety of bioinformatics tools. The function of an RNA sensor depends not only on its sequence but also strongly on its folded secondary structure. In our project, the secondary structure of the sensor directly influences biomarker recognition and dsRNA formation between the sensor and its target, which ultimately determines the efficiency of ADAR-mediated editing. Multiple studies show that the binding efficiency of the sensor region to its target RNA is positively correlated with how exposed the sensor region is: the more exposed the binding region, the more readily binding occurs. We therefore used the iPknot++ algorithm to predict the secondary structures of candidate sensor sequences. By calculating the unpaired probability and the exposed length around the editing site, we selected sensor regions with high exposure and good editing-site accessibility as candidate sequence variants.
Specifically, we first identified candidate sensor sequences for each gene. We then defined the "unpaired probability" as the ratio of unpaired bases to total bases in the sensor region, and the "exposed length of the editing site" as the longest run of consecutive unpaired bases near the editing site. These two metrics were used to evaluate the candidate sequences quantitatively. Finally, we simulated the secondary structures of several constructed candidate RNA sequences (as listed in the table below) and selected Seq C as the functionally optimal sensor sequence, based on both unpaired probability and exposed length. Figure 4 shows the simulated conformation and the predicted results. For the detailed computational prediction procedure, please refer to the Model page.
Results of Seq C functional structure assessment
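The two metrics defined above can be computed directly from a predicted structure in dot-bracket notation, where "." marks an unpaired base. The sketch below is one possible reading of those definitions, not our actual scoring script: the function names, the `window` parameter that decides what counts as "near" the editing site, and the example structure are all assumptions.

```python
def unpaired_fraction(structure, start, end):
    """Ratio of unpaired bases ('.') to total bases in structure[start:end]."""
    region = structure[start:end]
    return region.count(".") / len(region)


def exposed_length(structure, edit_site, window=5):
    """Longest run of unpaired bases that overlaps the editing-site window.

    The window spans edit_site - window .. edit_site + window (0-based, inclusive).
    """
    lo = max(0, edit_site - window)
    hi = min(len(structure) - 1, edit_site + window)
    best = 0
    i = 0
    while i < len(structure):
        if structure[i] != ".":
            i += 1
            continue
        # Extend the current unpaired run to its end; it occupies [i, j - 1].
        j = i
        while j < len(structure) and structure[j] == ".":
            j += 1
        if i <= hi and j - 1 >= lo:
            best = max(best, j - i)
        i = j
    return best


# Made-up dot-bracket string standing in for a predicted sensor structure.
structure = "(((....)))......((..))"
print("Unpaired fraction:", unpaired_fraction(structure, 8, 18))
print("Exposed length:", exposed_length(structure, edit_site=12, window=2))
```

Candidates can then be ranked by both values, with higher values indicating a more accessible sensor region.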
Build
Based on the optimal PLOD2&LIF sensor sequence (Seq C) designed by our modeling group, we constructed the PLOD2&LIF-Gluc sensor and integrated it into the RADAR system to validate its effectiveness. Mature adipocytes obtained by induced differentiation were infected with rAAV9 carrying the PLOD2&LIF-Gluc sensor sequence. Two days post-infection, the cells were subjected to different treatments: no cytokine, PAI-1 alone, CXCLs alone, or both cytokines combined. To better simulate the in vivo interaction between breast cancer cells and adipocytes, we also treated the engineered adipocytes with conditioned medium from MDA-MB-231 cells. The effectiveness of the PLOD2&LIF-Gluc system was assessed by measuring luminescence in the culture supernatant.
Test
In the cytokine treatment experiment, we first assessed viral infection efficiency by imaging mCherry expression with fluorescence microscopy after infection. After collecting the supernatant, we measured luminescence following addition of the substrate coelenterazine. Luminescence intensity was significantly higher in samples from cells treated with both cytokines than in the untreated control or in samples from cells treated with a single cytokine (Figure 5). This indicates that Gluc reporter expression is induced only when both PLOD2 and LIF are upregulated simultaneously. Most importantly, the result confirms that the PLOD2&LIF sensor responds specifically to the simultaneous presence of both cytokines and induces expression of its downstream output gene, Gluc.
Luminescence of conditioned medium from infected adipocytes, with or without cytokine treatment.
Because cytokine treatment alone cannot fully replicate the complexity of the breast cancer tumor microenvironment, we further treated the engineered adipocytes with conditioned medium from MDA-MB-231 human breast cancer cells. After 24 hours of culture, the medium was collected and luminescence was measured after adding coelenterazine, the substrate of Gluc.
Luminescence of conditioned medium from adipocytes treated with MDA-MB-231 conditioned medium or cultured alone.
The results showed a significant increase in luminescence intensity in the supernatant of engineered adipocytes treated with MDA-MB-231 conditioned medium, demonstrating that adipocytes equipped with the PLOD2&LIF-Gluc sensor respond to the presence of breast cancer cells and induce expression of the Gluc signal for detection.
Learn
Cycle 2 confirmed the effectiveness of the PLOD2&LIF sensor sequence designed by our modeling group, showing that a RADAR detection element with optimized sensor sequences works in the intracellular environment of adipocytes. The results also suggest that the iPknot++ algorithm is well suited to predicting RNA secondary structures for sensor design and may be a valuable tool for other projects that use the RADAR system.
Cycle 3: Exploration of the reaction kinetics
Design
To let patients conveniently assess their breast health status with ABCS, we needed an output that is clearly visible without specialized equipment or complicated procedures. We chose Gaussia luciferase (Gluc). Gluc is a luciferase that can be secreted from cells, metabolized by the kidneys, and excreted in urine, so it has the potential to translate changes inside breast tissue into a signal detectable outside the body. When Gluc reacts with its substrate coelenterazine, it produces luminescence that is detectable in dark conditions. The kinetics of Gluc production in the RADAR system are essential for deciding when and how to detect the signal, so we built a kinetic model. We first analyzed the reaction mechanism of the RADAR system. Binding of the sensor RNA to its target (trigger) RNA forms a double-stranded region that recruits ADAR, which edits the stop codon UAG upstream of the Gluc reporter gene to UIG. The ribosome reads UIG as UGG (tryptophan), so translation continues into the reporter.
Building upon the interaction and catalytic processes of the RADAR system, we developed kinetic equations accounting for four key processes: transcription and basal degradation, translation and degradation, complex formation, and catalysis and reporter production. Based on biological and mathematical principles, we converted these kinetic equations into a set of ordinary differential equations (ODEs).
Transition from chemical reactions to ordinary differential equations
Parameter values were determined through literature research, and numerical solutions were obtained using Python's solve_ivp function with the LSODA method. This modeling allowed us to predict the temporal concentration changes of Gluc within the RADAR system, as illustrated in the figure below.
Gluc expression curves in dual input RADAR system
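To show how such a system can be integrated with `solve_ivp` and the LSODA method, the sketch below sets up a deliberately simplified, single-trigger version of the four processes (transcription and degradation, complex formation, editing, and reporter translation). It is not our model: the species list, the equations, and every rate constant in `PARAMS` are hypothetical placeholders, so the resulting curve is only qualitatively comparable to the figure above.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical rate constants (per hour, arbitrary units); NOT the values used in our model.
PARAMS = dict(d_pl=0.05, k_tx=50.0, k_trig=50.0, d_m=0.2,
              k_on=0.01, k_off=0.1, k_edit=0.5, k_tl=20.0, d_p=0.05)


def radar_rhs(t, y, p):
    """Right-hand side of a toy single-trigger RADAR model."""
    plasmid, sensor, trigger, duplex, edited, gluc = y
    # Net rate of sensor/trigger duplex formation.
    bind = p["k_on"] * sensor * trigger - p["k_off"] * duplex
    return [
        -p["d_pl"] * plasmid,                                  # plasmid loss/dilution
        p["k_tx"] * plasmid - p["d_m"] * sensor - bind,        # sensor mRNA
        p["k_trig"] * plasmid - p["d_m"] * trigger - bind,     # trigger RNA
        bind - (p["k_edit"] + p["d_m"]) * duplex,              # duplex awaiting editing
        p["k_edit"] * duplex - p["d_m"] * edited,              # ADAR-edited mRNA
        p["k_tl"] * edited - p["d_p"] * gluc,                  # Gluc protein
    ]


def simulate(hours=72.0, params=PARAMS):
    """Integrate the toy model from a normalized plasmid amount of 1."""
    t_eval = np.linspace(0.0, hours, 289)
    y0 = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    return solve_ivp(radar_rhs, (0.0, hours), y0, method="LSODA",
                     t_eval=t_eval, args=(params,))


def first_crossing(t, values, threshold):
    """First sampled time at which values reach the threshold, or None."""
    above = np.nonzero(values >= threshold)[0]
    return float(t[above[0]]) if above.size else None


sol = simulate()
gluc = sol.y[5]
print("Peak Gluc (arbitrary units):", gluc.max(), "at", sol.t[gluc.argmax()], "h")
```

With literature-derived parameters and a concentration threshold, `first_crossing` would give the kind of time-to-visibility estimate reported below.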
The kinetic model predicts that the Gluc level peaks at 1.91×10⁵ molecules (equivalent to 316.76 nM) at 35.75 hours and crosses the visualization threshold (150 nM) at 14.13 hours. The concentration reaches half of its maximum at 14.63 hours. These results indicate that the dual-input RADAR system needs approximately 14 hours to generate a visible output signal after detecting its target RNAs. The model supports the feasibility of using Gluc as the output signal, because it predicts that Gluc is expressed in the RADAR system and accumulates to visually detectable concentrations. We then sought to validate this prediction in wet-lab experiments.
Build
To verify the modeled Gluc expression dynamics in the dual-input RADAR system, we constructed a plasmid carrying an established AND-gate sensor sequence and Gluc, and co-transfected it with plasmids carrying the trigger RNAs.
Plasmid map of Sensor 1+2-Gluc
Test
The constructed plasmids were transfected into HEK-293T cells. Conditioned medium was collected from the transfected cells at 12, 24, 36, 48, 60, and 72 hours post-treatment. Immediately after collection, the substrate coelenterazine was added and luminescence was measured with an ALLSHENG Feyond-A300 microplate reader. The results are shown in Figure 10. Luminescence intensity of the conditioned medium rose over time, peaked at 60 hours, and then began to decline.
Time-course curve of Gluc luminescence intensity
Learn
By monitoring Gluc luminescence at multiple time points, we characterized its expression and secretion dynamics: a rising phase between 12 and 60 hours, a peak at 60 hours, and a subsequent decline. Compared with the kinetics predicted by our model, Gluc expression is slower in the wet-lab experiments. We attribute this to two main factors. First, the model sets the initial plasmid amount as plasmid already inside the cells, whereas in the experiment plasmids need time to enter the cells. Second, the model assumes maximum transfection efficiency, whereas the actual transfection efficiency is usually lower. Nevertheless, the curves show similar trends, indicating that the experimental results are broadly consistent with the simulated kinetics. In short, the kinetic model of Gluc production provides useful guidance for detecting Gluc in subsequent applications of ABCS.
Cycle 4: Ethical assessment of our project with BEAM
Design
Throughout the development of ABCS, we focused not only on technical optimization but also on real-world applicability. Because ABCS involves autologous adipocyte transplantation, we need to ensure that the project is ethically sound and socially acceptable. This requires careful ethical review and attention to broader societal feedback. Traditional ethical validation relies on interviews with patients, stakeholders, and experts. However, insights from interviews are often difficult to interpret systematically or to present intuitively. To address this gap, we sought to turn dispersed ethical considerations into a structured, data-driven evaluation framework, which led to the Bio-Ethical Assessment Model (BEAM).
In BEAM, we identified five key ethical assessment dimensions through literature review and expert consultation. We use BEAM as a supporting tool for systematic, quantitative ethical evaluation.
Build
Building on the five core dimensions identified in the Design phase, we expanded and refined them into a more comprehensive path structure of ethical evaluation dimensions. This path structure was then revised and validated through expert interviews to ensure its scientific rigor and rationality. With the path structure fixed, the next task was to evaluate the relationships among dimensions and quantify the path coefficients. Using data from the ethical dimension evaluation questionnaire, we built a structural equation model (SEM) to estimate the path coefficients between dimensions and thereby determine their relative weights within the overall ethical risk framework. Finally, to analyze risk in a complex system and reveal interactions among multiple factors, we used a Bayesian network to convert the weights into probabilistic relationships. Mapping functions translate the weights into probabilities, which define the conditional probability table of each node. The Bayesian network can then infer and predict risk probabilities from input dimension scores.
Test
To test BEAM, we selected several representative cases in synthetic biology, such as the synthetic reconstruction of the horsepox virus, to evaluate the model's predictive capability. We then used BEAM to analyze our own project with data from the public ethics assessment questionnaire conducted by our Human Practices team. The evaluation yielded a score of 32.2, corresponding to the mild risk category.
Evaluation results of ABCS by BEAM
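To illustrate how path weights can be turned into conditional probability tables, the sketch below maps weighted parent states to a probability with a logistic function and then computes the marginal risk of a child node by enumeration. This is only one plausible mapping: the logistic form, the `bias` term, the example weights, and the parent probabilities are hypothetical and are not the functions or values used in BEAM.

```python
import math
from itertools import product


def logistic(x):
    """Squash a weighted sum into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def build_cpt(weights, bias):
    """P(child = high risk | parent states) for every combination of binary parents."""
    return {
        states: logistic(bias + sum(w * s for w, s in zip(weights, states)))
        for states in product((0, 1), repeat=len(weights))
    }


def child_risk(cpt, parent_probs):
    """Marginal P(child = high risk), assuming the parents are independent."""
    total = 0.0
    for states, p_high in cpt.items():
        # Probability of this particular combination of parent states.
        weight = 1.0
        for s, p in zip(states, parent_probs):
            weight *= p if s == 1 else 1.0 - p
        total += weight * p_high
    return total


weights = [0.8, 0.4]  # hypothetical SEM path coefficients for two parent dimensions
cpt = build_cpt(weights, bias=-1.0)
risk = child_risk(cpt, parent_probs=[0.3, 0.6])  # hypothetical parent risk probabilities
print("Conditional probability table:", cpt)
print("Marginal risk of child node:", risk)
```

A larger path coefficient makes the child node more sensitive to that parent, which is how relative weights from the SEM would shape the network's predictions.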
Learn
According to BEAM, the risk scores of ABCS for genetic abnormality (41.3) and technology equity (41.5) are relatively high. Genetic abnormality refers to the risk of abnormal genetic mutations, and technology equity refers to the risk of regional technological inequity. In ABCS, autologous adipocyte engineering and transplantation are the major causes of public concern in these two dimensions. These scores prompt us to take extra care when applying the autologous adipocyte approach. As a general framework for quantifying ethics, BEAM shows how ethical concerns in synthetic biology can be turned into structured, data-driven models. Its validation suggests that ethical risks can be systematically analyzed and reduced. Notably, beyond the ethical issues we had already anticipated, such as genetic abnormality, BEAM also highlighted a risk we had not considered: the effect of ABCS on regional technological inequity. Together, these efforts complete a full methodological cycle, turning ethical evaluation from abstract discussion into quantitative analysis and continuous improvement.