Our goal is not only to develop effective algorithms but also to transform them into a tool that can be easily used by iGEM community members and the broader scientific community. To this end, we have packaged the entire PROTEUS workflow into a user-friendly, web-based application, and have built a detailed and clearly structured iGEM Wiki page to showcase all of our work.
5.1 PROTEUS Web Platform: A Bridge from Algorithm to Application
Core Functions and User-friendly Interface Design
Because our users come from diverse backgrounds, we designed two distinct modes of operation. "Freshman Mode" is aimed at biology researchers who may not have a deep computational background and provides a minimalist, one-click interface: users simply select a target from a preset list of proteins and check the desired optimization characteristic (e.g., thermostability) to start the analysis. "Expert Mode" serves computational biologists and advanced researchers with specific needs and provides richer customization: it allows users to input their own protein sequences in FASTA format and to manually adjust advanced parameters such as the point-by-point scanning confidence threshold, and it automatically saves analysis history so that results can be traced and compared over time.
The user interaction flow is designed to be as simple as possible. After a request is submitted, the front-end displays the status of the computational task in real time. Once the back-end computation is complete (including calling the ESM-2 model for sequence generation and calling the scoring functions for evaluation), the final list of candidate sequences, their predicted scores, and the specific mutation sites are returned to the front end as clear tables and visualizations. The entire process is smooth and intuitive, so users never have to deal with the complexity of the back-end computation.
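To make this flow concrete, the sketch below shows how one analysis job could assemble the payload the front end renders. Here `generate_candidates` and `score_candidate` are hypothetical placeholders standing in for the ESM-2 generation step and the trained scoring functions, and the field names are illustrative rather than the actual PROTEUS API.

```python
# Minimal sketch of the submit -> compute -> display flow described above.
# generate_candidates / score_candidate are placeholders, not the real models.

def generate_candidates(sequence: str) -> list[str]:
    # Placeholder: the real step calls the ESM-2 model to propose variants.
    return [sequence]

def score_candidate(sequence: str, objective: str) -> float:
    # Placeholder: the real step evaluates the sequence with a scoring function.
    return 0.0

def run_job(sequence: str, objective: str = "thermostability") -> dict:
    """Assemble the result payload the front end renders as tables and plots."""
    results = []
    for cand in generate_candidates(sequence):
        mutations = [f"{wt}{i + 1}{mt}"                  # e.g. "A42V"
                     for i, (wt, mt) in enumerate(zip(sequence, cand)) if wt != mt]
        results.append({"sequence": cand,
                        "score": score_candidate(cand, objective),
                        "mutations": mutations})
    results.sort(key=lambda r: r["score"], reverse=True)   # ranked candidate list
    return {"status": "done", "candidates": results}
```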
5.2 BIT-LLM Web Platform Development Log
To ensure the project Wiki both meets the information display needs of an academic project and reflects the team's character, we systematically studied past iGEM project wikis. By analyzing the architectural logic and visual styles of similar projects, and considering our project's core needs, we identified the key top-level modules: Home, Project, Software, Model, Lab Work, Human Practices, Team, and Awards. This established a clear framework for the project.
The choice of the Wiki's back-end framework directly determines its future scalability. We focused on three core criteria: lightweight, adaptable, and easy to maintain. After comparing and evaluating three mainstream frameworks (Django, Flask, and FastAPI), we ultimately chose Flask for its lightweight, flexible, and easily extensible design. We then adapted it to our development needs and completed the basic configuration for front-end and back-end data interaction.
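The basic front-end/back-end data interaction set up at this stage can be illustrated with a minimal Flask sketch; the route names are ours for illustration, and the home route assumes a `templates/home.html` file exists.

```python
# Minimal Flask sketch of front-end/back-end data interaction (illustrative).
from flask import Flask, jsonify, render_template, request

app = Flask(__name__)

@app.route("/")
def home():
    # Assumes templates/home.html exists; rendered server-side by Flask/Jinja2.
    return render_template("home.html")

@app.route("/api/echo", methods=["POST"])
def echo():
    # The front end posts JSON; the back end answers with JSON.
    data = request.get_json(force=True)
    return jsonify({"received": data})

if __name__ == "__main__":
    app.run(debug=True)
```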
Building on the core module framework, we broke down the requirements. By mapping out the functional boundaries and logical connections of each module, we divided every module into sub-headings and content dimensions and allocated the resulting tasks, so that each member had a clear responsibility and work could proceed efficiently in parallel.
Based on the foundational style guide and design direction set by the art team, each member completed their respective front-end design work. During the design process, a component-based approach was adopted. Reusable elements such as navigation bars, title bars, and content cards were created as components to ensure a unified visual style across major modules.
Figure 16: Component-Based Front-end Design Interface
We then conducted a front-end design acceptance review using a process of cross-review and multi-role validation, evaluating the interface from multiple perspectives, from technical and functional completeness to aesthetic appeal. We discovered and corrected several issues of detail, such as incomplete display of the contents directory and mistimed pop-up windows.
Figure 17: Front-end Design Acceptance Review Process
Considering the project's overall needs, we agreed through discussion that the original Wiki, focused on information display, could not meet the core interactive requirements of the protein prediction model. A dedicated functional webpage was needed, serving both as a visual operating platform for the model and as a place to manage user data. We ultimately defined the core positioning of this webpage as a protein sequence prediction and analysis tool.
The core functionality of the dedicated webpage is its connection to the protein prediction model: the user submits a sequence for an original protein, and the page analyzes it and outputs the predicted sequence. The page also needs to support basic functions such as user registration, login, and saving past analysis data. On this basis, we planned and defined the back-end interfaces: `/predict` (prediction interface), `/register/login` (authentication interface), and `/history` (history record interface), providing a clear basis for technical development.
Figure 18: Backend Interface Architecture Design
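A skeleton of these planned interfaces, written against the Flask framework we had already chosen, is sketched below. The request and response fields are illustrative, and the authentication interface is shown as separate `/register` and `/login` routes for clarity.

```python
# Illustrative skeleton of the planned back-end interfaces.
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json(force=True)
    sequence = payload.get("sequence", "")
    # The real handler calls the protein prediction model here.
    return jsonify({"input": sequence, "predicted": sequence, "scores": {}})

@app.route("/register", methods=["POST"])
def register():
    return jsonify({"ok": True})      # create the user record

@app.route("/login", methods=["POST"])
def login():
    return jsonify({"ok": True})      # verify credentials, start a session

@app.route("/history", methods=["GET"])
def history():
    return jsonify({"records": []})   # past analyses for the logged-in user

if __name__ == "__main__":
    app.run(debug=True)
```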
We divided the webpage team into a front-end group and a back-end group. The front-end group actively collaborated with the art team on interface design, while the back-end group focused on server deployment planning, further specifying webpage server performance parameters and configuration requirements.
Figure 19: Front-end Group Collaboration
Figure 20: Server Deployment Planning
The front-end group began building the webpage front end according to the design mockups. Following the model-integration specifications proposed by the modeling group, the back-end group completed interface development and debugging with the Flask framework. Throughout development, the two groups stayed in constant communication and debugged collaboratively, optimizing the parameter-passing mechanism and refining functional details to ensure that the core functions ran smoothly.
Figure 21: Development and Integration Process
To comprehensively validate the webpage's functionality, the team deployed the webpage to a local test environment and invited modeling team members and other students to participate in testing. This covered functional testing, compatibility testing, and user experience testing, leading to the correction of issues such as webpage idle timeout, poor compatibility with multiple logins/logouts, and problems with input validation.
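One of the corrected issues concerned input validation. A minimal sketch of a sequence validator of this kind is shown below; it assumes inputs are restricted to the 20 standard one-letter amino-acid codes, and the length limit and helper name are illustrative.

```python
# Illustrative sequence validator; the team's actual rules may differ.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")

def validate_sequence(sequence: str, max_len: int = 2000) -> tuple[bool, str]:
    seq = sequence.strip().upper()
    if not seq:
        return False, "Empty sequence."
    if len(seq) > max_len:
        return False, f"Sequence longer than {max_len} residues."
    bad = sorted(set(seq) - VALID_RESIDUES)
    if bad:
        return False, "Invalid characters: " + ", ".join(bad)
    return True, "OK"
```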
After the modeling team completed the training of the protein prediction model, the back-end team immediately began the "model and webpage back-end integration" work. First, the modeling team provided the model's Python calling script `model_predict.py`, which included the `predict(sequence, protein_id)` function. The back-end team then wrapped this script into a utility function in `app/services/model_service.py`, replacing the original data module used for demonstration and connecting it to the database system. At the same time, the back-end team completed the design and integration of the database system. Using a MySQL database and the Navicat tool, they created the `users` and `history` data tables. They also wrote data manipulation functions `save_history(user_id, sequence, result)` to save prediction records and `get_history(user_id)` to list historical records.
Figure 22: Model Integration and Database Design
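A minimal sketch of this integration is shown below, assuming the modeling team's `predict(sequence, protein_id)` is importable from `model_predict.py` and the `history` table has `user_id`, `sequence`, and `result` columns; the connection settings, the use of the PyMySQL driver, and the SQL are illustrative.

```python
# Illustrative sketch of app/services/model_service.py and the history helpers.
import json
import pymysql
from model_predict import predict      # calling script provided by the modeling team

def run_prediction(sequence: str, protein_id: str) -> dict:
    """Wrap the model call so /predict no longer uses the demonstration data."""
    return predict(sequence, protein_id)

def _connect():
    # Placeholder credentials; the real ones live in the server configuration.
    return pymysql.connect(host="localhost", user="proteus", password="***",
                           database="proteus", autocommit=True)

def save_history(user_id: int, sequence: str, result: dict) -> None:
    """Insert one prediction record into the history table."""
    conn = _connect()
    try:
        with conn.cursor() as cur:
            cur.execute(
                "INSERT INTO history (user_id, sequence, result) VALUES (%s, %s, %s)",
                (user_id, sequence, json.dumps(result)),
            )
    finally:
        conn.close()

def get_history(user_id: int) -> list[dict]:
    """List the stored records for one user."""
    conn = _connect()
    try:
        with conn.cursor(pymysql.cursors.DictCursor) as cur:
            cur.execute("SELECT sequence, result FROM history WHERE user_id = %s",
                        (user_id,))
            return cur.fetchall()
    finally:
        conn.close()
```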
Continuing with local testing, the team focused on troubleshooting detailed issues. We found that when the sequence length exceeded 500 residues, the model took more than 300 seconds to run. The back-end team therefore optimized the inference logic, adopting a batch-processing and real-time display strategy: a streaming output method shows the current prediction results as they are produced, so the webpage no longer appears unresponsive.

In parallel, the team initiated the procurement of an Alibaba Cloud ECS server. After comparing instance specifications on the Alibaba Cloud website, a 2-core, 4 GB entry-level instance was selected. The team also learned basic operations such as configuring server security groups, binding public IPs, and connecting remotely, in preparation for the subsequent server setup.
Figure 23: Cloud Server Configuration Process
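The streaming strategy mentioned above can be sketched as a Flask endpoint that yields partial results as they are computed; the batch size, route name, and newline-delimited JSON format are illustrative.

```python
# Illustrative sketch of streaming partial predictions for long sequences.
import json
from flask import Flask, Response, request, stream_with_context

app = Flask(__name__)

def predict_in_batches(sequence: str, batch_size: int = 50):
    """Yield partial predictions so results can be shown as they are produced."""
    for start in range(0, len(sequence), batch_size):
        chunk = sequence[start:start + batch_size]
        # The real implementation runs the model on this batch of positions.
        yield {"start": start, "predicted": chunk}

@app.route("/predict/stream", methods=["POST"])
def predict_stream():
    sequence = request.get_json(force=True).get("sequence", "")

    def generate():
        for partial in predict_in_batches(sequence):
            yield json.dumps(partial) + "\n"   # the front end renders each line on arrival

    return Response(stream_with_context(generate()), mimetype="application/x-ndjson")
```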
We learned basic Linux commands, installed the required software (Python, uWSGI, Nginx, and MySQL), configured an isolated virtual environment, and established a stable core server environment.
Figure 24: Server Environment Configuration
We transferred the local code to the cloud server via Gitee and completed the adaptations that differ from the local setup, such as updating IP addresses and creating and connecting database accounts, achieving stable operation of the webpage in the cloud environment.
After discussing with members of the other groups, we fixed and polished several minor flaws on the webpage, purchased a domain name, and focused on developing a sequence comparison function that displays protein sequence comparisons visually.
Figure 25: Sequence Comparison Function Development
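The comparison logic behind the visual display can be sketched as follows, assuming the original and predicted sequences have equal length; the record format and the plain-text rendering are illustrative.

```python
# Illustrative per-position comparison of the original and predicted sequences.
def compare_sequences(original: str, predicted: str) -> list[dict]:
    """Return per-position records that the front end can colour-code."""
    return [{"position": i, "original": o, "predicted": p, "mutated": o != p}
            for i, (o, p) in enumerate(zip(original, predicted), start=1)]

def format_alignment(original: str, predicted: str) -> str:
    """Plain-text view: '|' marks identical residues, '*' marks mutations."""
    marks = "".join("|" if o == p else "*" for o, p in zip(original, predicted))
    return "\n".join([original, marks, predicted])

print(format_alignment("MKTAYIAKQR", "MKTVYIAKQR"))
```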
In addition to visualizing the sequence comparison, we also designed three visual charts based on the characteristics of the sequence and the scoring functions being trained by the modeling team: a score comparison chart, a feature radar chart, and a scoring function distribution chart.
Figure 26: Score Comparison and Feature Radar Charts
Figure 27: Scoring Function Distribution Chart
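As a stand-alone illustration of the feature radar chart's layout, the sketch below uses matplotlib; this is not the charting code used on the webpage, and the feature names and scores are placeholder values.

```python
# Illustrative radar chart with placeholder feature names and scores.
import numpy as np
import matplotlib.pyplot as plt

features = ["Thermostability", "Solubility", "Activity", "Expression", "Stability"]
scores = [0.82, 0.64, 0.71, 0.58, 0.77]

angles = np.linspace(0, 2 * np.pi, len(features), endpoint=False).tolist()
angles += angles[:1]                      # close the polygon
values = scores + scores[:1]

fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
ax.plot(angles, values, linewidth=2)
ax.fill(angles, values, alpha=0.25)
ax.set_xticks(angles[:-1])
ax.set_xticklabels(features)
ax.set_ylim(0, 1)
ax.set_title("Feature radar chart (placeholder data)")
plt.show()
```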
Because the modeling team needed roughly three more weeks to finish training the scoring function weights, we used this time for front-end beautification, following a plan of detail optimization plus animation enhancement. First, we unified the page element styles, changing all buttons and panels to a softer rounded-rectangle shape. Second, we added a glow animation to the input box and gave some buttons waiting and ready states to improve feedback and smoothness. Finally, we added an option to switch color themes according to personal preference.
Figure 28: Interface Beautification with Animation Effects
After the modeling team completed the training of all scoring function weights, we immediately started the real score integration work. First, the modeling team provided the final scoring function weight `.pth` files for 11 characteristics. The back-end team then modified the score calculation logic of the `/predict` interface, replacing the original demonstration scores with the scores calculated by the scoring model to ensure that the scores could reflect the true functional level of the sequence.
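A minimal sketch of this integration step is shown below; `ScoringModel`, the weight-file paths, and the two listed characteristics are placeholders standing in for the modeling team's actual architecture and the 11 trained `.pth` files.

```python
# Illustrative loading of trained scoring weights to replace the demo scores.
from pathlib import Path
import torch

class ScoringModel(torch.nn.Module):
    """Placeholder architecture; the real one comes from the modeling team."""
    def __init__(self, dim: int = 1280):
        super().__init__()
        self.head = torch.nn.Linear(dim, 1)

    def forward(self, x):
        return self.head(x).squeeze(-1)

CHARACTERISTICS = ["thermostability", "solubility"]   # ... 11 characteristics in total

def load_scorers(weight_dir: str = "weights") -> dict[str, torch.nn.Module]:
    """Load one scoring model per characteristic from its .pth weight file."""
    scorers = {}
    for name in CHARACTERISTICS:
        model = ScoringModel()
        state = torch.load(Path(weight_dir) / f"{name}.pth", map_location="cpu")
        model.load_state_dict(state)
        model.eval()
        scorers[name] = model
    return scorers

def score_sequence(scorers: dict, features: torch.Tensor) -> dict[str, float]:
    """Replace the demonstration scores in /predict with real model outputs."""
    with torch.no_grad():
        return {name: float(model(features)) for name, model in scorers.items()}
```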
To ensure the webpage meets the requirements of academic rigor and internationalization, we carried out final adjustments. First, we reviewed all page text to ensure that technical terminology was accurate and consistent. Second, to reach a wider audience, we added an English version alongside the Chinese version. Finally, the entire team took part in acceptance testing, evaluating the webpage's functionality, compatibility, and stability to confirm that its functions were complete, its data accurate, its interface aesthetically pleasing, and its use straightforward. This completed the design and development of our custom webpage.