Our project, ABCS, is a breast cancer surveillance system designed specifically for women who have had breast cancer or have undergone breast surgery. The core of our project is to develop an effective AND-gate sensor sequence that detects RNA biomarkers of cancer-associated adipocytes and reports an easily detectable output signal.
Here we present the four engineering cycles used in our project:
Four engineering cycles contribute to project progress step by step
Design
To establish an ideal surveillance system for breast cancer, the first problem we needed to solve was which biomarkers to detect.
From published research, we learned that specific genes are significantly upregulated in adipocytes as they transform into cancer-associated adipocytes (CAAs) under the influence of the breast cancer tumor microenvironment.
Aiming to find the optimal single gene or gene combination with high specificity for breast cancer recognition, we integrated data from public databases (TCGA, GTEx, UCI) and developed a multi-layer perceptron classification model incorporating an attention mechanism (Attention-MLP) to improve performance on high-dimensional data. The parameters of Attention-MLP were optimized iteratively by backpropagation of a binary cross-entropy loss, and the classification threshold was determined by Youden's index. Attention-MLP outperformed a traditional MLP across multiple UCI public datasets.
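As an illustration of the thresholding step, Youden's index J = sensitivity + specificity - 1 can be maximized over candidate cut-offs. The following is a minimal sketch in plain Python, not the project's implementation; the function name `youden_threshold` and the toy scores are hypothetical and stand in for the classifier's predicted probabilities.

```python
def youden_threshold(scores, labels):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A sample is called positive when its score is >= threshold.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative sample")

    best_threshold, best_j = None, -1.0
    # Every distinct score is a candidate cut-off.
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:
            best_threshold, best_j = t, j
    return best_threshold, best_j


# Toy predicted probabilities (illustrative values, not project data).
toy_scores = [0.10, 0.35, 0.40, 0.80, 0.65, 0.90, 0.70]
toy_labels = [0, 0, 1, 1, 0, 1, 1]
threshold, j_stat = youden_threshold(toy_scores, toy_labels)
```

On this toy input the best cut-off is 0.70, where sensitivity is 0.75 and specificity is 1.0. Choosing the threshold this way weights false positives and false negatives equally, which may or may not suit a surveillance setting.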
Subsequently, through literature review, we shortlisted four candidate genes reported to show significantly increased expression in CAAs: PLOD2, LIF, FAM3C, and IL-6. We then evaluated the performance of all combinations of these four genes using the Attention-MLP model. The results were assessed using ROC curves and the AUC metric.
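The combination screen described above can be sketched as follows. This is an illustrative outline only: it assumes that single genes count as combinations (giving 15 non-empty subsets of the four-gene panel), and the helper names `gene_combinations` and `roc_auc` are hypothetical. The AUC here is computed by pairwise ranking rather than by any particular library.

```python
from itertools import combinations

GENES = ["PLOD2", "LIF", "FAM3C", "IL-6"]


def gene_combinations(genes):
    """List every non-empty subset of the gene panel, smallest subsets first."""
    return [combo
            for size in range(1, len(genes) + 1)
            for combo in combinations(genes, size)]


def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need both classes to compute AUC")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


all_combos = gene_combinations(GENES)
# Toy scores (illustrative values, not project data).
toy_auc = roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

In a full screen, each entry of `all_combos` would select the input features for one trained classifier, and the resulting AUC values would be compared across combinations.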
As illustrated in Figure 2, the PLOD2 and LIF combination achieved the highest AUC value, indicating the best predictive performance, and was therefore selected as the gene combination to target in our engineered adipocytes.
ROC Curve of Gene Combinations
Expression levels of target genes after cytokine treatment
Design
To design RADAR sensor sequences that target PLOD2 and LIF with high binding specificity, we used a variety of bioinformatics tools. The function of an RNA sensor depends not only on its sequence but also strongly on its secondary structure after folding. In our project, the secondary structure of the sensors directly influences biomarker recognition and dsRNA formation between the sensor sequences and their targets, which ultimately determines the efficiency of ADAR-mediated editing. Multiple studies demonstrate that the binding efficiency of the sensor region to its target RNA is positively correlated with the degree of exposure of the sensor region: the more exposed the binding region, the more readily binding occurs.
Therefore, we employed the iPknot++ algorithm to simulate the secondary structures of candidate sensor sequences. By calculating the unpaired probability and the exposed length of the editing site, we selected sensor regions with high exposure and favorable editing site accessibility as candidate sequence variants.
Specifically, we first identified candidate sensor sequences for each gene, then defined the "unpaired probability" as the ratio of unpaired bases to total bases in the sensor region, and the "exposed length of the editing site" as the length of the longest run of consecutive unpaired bases near the editing site. These two metrics were used to quantitatively evaluate the candidate sequences.
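The two metrics defined above can be computed directly from a predicted structure. The sketch below assumes the structure is given in dot-bracket notation, where `.` marks an unpaired base and any bracket character marks a paired one. The function names, the `window` parameter used to interpret "near the editing site", and the toy structure are all hypothetical choices made for illustration.

```python
def unpaired_probability(structure, start, end):
    """Fraction of unpaired bases ('.') in the sensor region structure[start:end]."""
    region = structure[start:end]
    if not region:
        raise ValueError("sensor region is empty")
    return region.count(".") / len(region)


def exposed_length(structure, edit_site, window=10):
    """Longest run of consecutive '.' within `window` bases of edit_site.

    `window` is an assumed cut-off for what counts as near the editing site.
    """
    lo = max(0, edit_site - window)
    hi = min(len(structure), edit_site + window + 1)
    longest = current = 0
    for symbol in structure[lo:hi]:
        current = current + 1 if symbol == "." else 0
        longest = max(longest, current)
    return longest


# Toy dot-bracket string (illustrative, not a predicted sensor structure).
toy_structure = "((((....))))......((..))"
toy_unpaired = unpaired_probability(toy_structure, 0, len(toy_structure))
toy_exposed = exposed_length(toy_structure, edit_site=14, window=4)
```

For the toy string, half of the 24 bases are unpaired, and the editing site at index 14 sits inside a six-base unpaired stretch. Candidates could then be ranked by these two values, with the relative weighting left as a design decision.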
Finally, we simulated the secondary structures of several constructed candidate RNA sequences (as listed in the table below) and selected Seq C as the functionally optimal sensor sequence, based on both unpaired probability and exposed length.
Figure 4 shows the simulated conformation and the predicted results.
For detailed computational prediction procedures, please refer to the Model page.
Results of Seq C functional structure assessment
Luminescence detection of conditioned medium of infected adipocytes, with or without cytokine treatment.
Luminescence of conditioned medium from adipocytes treated with the supernatant of MDA-MB-231 conditioned medium or cultured alone.
Design
To enable patients to conveniently assess their breast health status with ABCS, we needed to design a clearly visible output detection method that requires neither specialized equipment nor complicated procedures. We chose Gaussia Luciferase (Gluc).
Gluc is a novel luciferase that can be secreted from cells, metabolized by the kidneys, and excreted in urine, and therefore has the potential to translate changes inside breast tissue into a signal detectable outside the body. When Gluc reacts with its substrate coelenterazine, it generates a luminescence signal that is detectable in dark conditions.
The reaction kinetics of Gluc production in the RADAR system are essential for determining the detection time and method. Therefore, we carried out kinetic modeling. We first analyzed the reaction mechanism of the RADAR system. The interaction between the sensor RNA and its target trigger RNA forms a duplex that recruits ADAR, which converts the stop codon UAG upstream of the Gluc reporter gene into UIG (read as UGG during translation), thereby enabling translation of the reporter. Building upon the interaction and catalytic processes of the RADAR system, we developed kinetic equations accounting for four key processes: transcription and basal degradation, translation and degradation, complex formation, and catalysis and reporter production. Based on biological and mathematical principles, we converted these kinetic equations into a set of ordinary differential equations (ODEs).
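To make the four processes concrete, a simplified single-sensor version of such an ODE system can be sketched with mass-action kinetics. This is not the team's model (see the Model page for that): the species, the rate constants in `PARAMS`, and the forward-Euler integration are all assumptions chosen for illustration, and the dual-input AND gate is not represented.

```python
# Hypothetical rate constants (per hour, arbitrary concentration units).
PARAMS = {
    "k_s": 1.0,    # sensor RNA transcription
    "k_t": 1.0,    # trigger RNA transcription
    "d_rna": 0.2,  # basal RNA degradation
    "k_on": 0.1,   # sensor-trigger duplex formation
    "k_off": 0.05, # duplex dissociation
    "k_cat": 0.5,  # ADAR editing of the duplex
    "k_tl": 2.0,   # translation of edited sensor into Gluc
    "d_p": 0.1,    # Gluc degradation
}


def radar_rhs(state, p):
    """Right-hand side for S (sensor), T (trigger), C (duplex), E (edited), G (Gluc)."""
    S, T, C, E, G = state
    bind = p["k_on"] * S * T - p["k_off"] * C      # net complex formation
    dS = p["k_s"] - p["d_rna"] * S - bind          # transcription and degradation
    dT = p["k_t"] - p["d_rna"] * T - bind
    dC = bind - p["k_cat"] * C - p["d_rna"] * C    # editing consumes the duplex
    dE = p["k_cat"] * C - p["d_rna"] * E           # catalysis
    dG = p["k_tl"] * E - p["d_p"] * G              # translation and degradation
    return [dS, dT, dC, dE, dG]


def simulate_gluc(p, t_end=48.0, dt=0.01):
    """Integrate with forward Euler from zero and return the final Gluc level."""
    state = [0.0] * 5
    for _ in range(int(round(t_end / dt))):
        deriv = radar_rhs(state, p)
        state = [max(0.0, x + dt * dx) for x, dx in zip(state, deriv)]
    return state[-1]


gluc_on = simulate_gluc(PARAMS)
gluc_off = simulate_gluc({**PARAMS, "k_t": 0.0})  # no trigger RNA present
```

In this sketch, Gluc accumulates only when the trigger is transcribed, which is the qualitative behaviour the sensor is meant to show. A stiff or more detailed model would call for an adaptive solver rather than fixed-step Euler.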
Transition from chemical reactions to ordinary differential equations
Gluc expression curves in dual input RADAR system
Plasmid map of Sensor 1+2-Gluc
Time-course curve of Gluc reaction luminescence intensity
Design
During the development of ABCS, we focused not only on technical optimization but also on real-world applicability. Because ABCS involves autologous adipocyte transplantation, we need to ensure that our project is ethically sound and socially acceptable. This requires careful ethical review and attention to broader societal feedback.
Traditional ethical validation relies on interviews with patients, stakeholders, and experts. However, insights obtained from interviews are often difficult to interpret systematically or present in an intuitive manner. To address this gap, we sought to transform dispersed ethical considerations into a structured, data-driven evaluation framework, which led to the Bio-Ethical Assessment Model (BEAM).
In BEAM, we identified five key ethical assessment dimensions through literature review and expert consultation. We plan to employ BEAM as a supporting tool to achieve a systematic and quantitative approach to ethical evaluation.
Build
Building on the five core evaluation dimensions identified in the Design phase, we expanded and refined them into a more comprehensive path structure of ethical evaluation dimensions. This path structure was then revised and validated through expert interviews to ensure its scientific rigor and soundness.
After establishing the path structure, our next task was to evaluate the relationships among dimensions and quantify the path coefficients. Based on data collected from the ethical dimension evaluation questionnaire, we built a structural equation model (SEM) to estimate the path coefficients between dimensions and thereby determine their relative weights within the overall ethical risk framework.
Finally, to analyze risks in complex systems and reveal interactions among multiple factors, we employed a Bayesian network to convert the weights into probabilistic relationships. We used mapping functions to convert the weights into probabilities and thereby defined a conditional probability table for each node. The Bayesian network can then infer and predict risk probabilities from input dimension scores.
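One way such a weight-to-probability mapping could look is sketched below. The source does not state which mapping functions BEAM uses, so the logistic link, the binary parent states, the independence of parents, and the example weights are all assumptions made for illustration. The names `build_cpt` and `marginal_high` are hypothetical.

```python
import math
from itertools import product


def build_cpt(weights, bias=0.0):
    """Map path weights to P(child = high | parent states) with a logistic link.

    Each parent state is 0 (low concern) or 1 (high concern). The logistic
    function is an assumed mapping, not the one used in BEAM.
    """
    cpt = {}
    for states in product([0, 1], repeat=len(weights)):
        z = bias + sum(w * s for w, s in zip(weights, states))
        cpt[states] = 1.0 / (1.0 + math.exp(-z))
    return cpt


def marginal_high(cpt, parent_probs):
    """P(child = high), summing over parent states treated as independent."""
    total = 0.0
    for states, p_child in cpt.items():
        p_states = 1.0
        for s, p in zip(states, parent_probs):
            p_states *= p if s == 1 else 1.0 - p
        total += p_states * p_child
    return total


# Hypothetical path weights for two parent dimensions of one risk node.
risk_cpt = build_cpt([1.0, 2.0], bias=-1.5)
risk_probability = marginal_high(risk_cpt, parent_probs=[0.3, 0.6])
```

Here the conditional probability table has one entry per combination of parent states, and a larger path weight makes the corresponding parent shift the child's risk probability more strongly. In a full network, the parent probabilities would themselves be derived from the input dimension scores.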
Test
To test the performance of BEAM, we selected several representative cases in synthetic biology, such as the synthetic reconstruction of the horsepox virus, to evaluate the model's predictive capability.
After testing, we used BEAM to analyze our project with data collected from the public ethics assessment questionnaire conducted by our Human Practices team. The evaluation yielded a score of 32.2, corresponding to the mild risk category.
Evaluation results of ABCS by BEAM