Project Description | XJTLU-CHINA 2025

Introduction

In our project, the modeling component plays a crucial role in connecting different aspects. Through modeling, we were able to organically integrate human practice, wet lab, and hardware development into a unified framework. First, in the human practice aspect, by modeling the distribution and dynamics of Ulva prolifera, we provided quantitative support and decision-making guidance for practical ecological conservation efforts. Second, in the wet lab aspect, the model helped us optimize protein expression and regulatory strategies, ensuring the stability and effectiveness of synthetic biology experiments. Finally, in the hardware development aspect, the model provided theoretical guidance for hardware design.

Overall, the model not only helped us explain ecological processes and provided quantitative insights for experimental design but also documented our exploratory and iterative thought process. It has been instrumental in supporting the smooth progress of the entire project.

Fig1. Model framework mind map.

Distribution and Dynamics Modeling of Ulva prolifera

Two models address the origin and spread of Ulva prolifera spores. First, the MaxEnt model predicts where Ulva may occur under different environmental conditions and highlights potential risk zones. Next, the Growth and Trajectory Tracking model simulates how spores drift, grow, and aggregate under ocean currents, tracing their journey from source to bloom hotspots. Together, these models give the project a quantitative picture of the distribution and dynamics of Ulva and support decision-making, especially in identifying high-risk areas and predicting bloom trends.

Fig2. Mind map of distribution and dynamics modeling of Ulva prolifera.

MaxEnt (Maximum Entropy) Model for the Distribution of Ulva prolifera

Description

To predict the potential distribution of Ulva prolifera under different environmental conditions and identify high-risk outbreak zones, we developed a species distribution model based on MaxEnt. This model allows us to evaluate possible suitable habitats on a global scale and provides scientific support for assessing bloom risks. By linking these predictions with conservation and management strategies, our work can help guide early intervention, reduce the ecological and economic damage caused by green tides, and contribute to the sustainable protection of marine ecosystems.

Materials and Method

In this study, the occurrence data of Ulva prolifera were sourced from the Global Biodiversity Information Facility (GBIF). We first conducted rigorous data cleaning, which included removing missing or erroneous geographical coordinates, excluding records located on land, and deleting duplicate points to ensure the accuracy and independence of the data. To minimize bias caused by spatial autocorrelation, we performed spatial thinning at a scale consistent with the resolution of the environmental variables, retaining only one occurrence record per grid cell. Additionally, to improve the match with current environmental data, we prioritized records from the past two decades. After these considerations, a total of 272 usable data points were selected as source data for Ulva prolifera.
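The cleaning and thinning steps described above can be sketched as follows. This is a minimal illustration rather than the exact pipeline we ran: the function name, the `is_ocean` land-mask lookup, and the 0.25-degree cell size are assumptions chosen for the example, and real records would come from a GBIF download.

```python
import math

def clean_and_thin(records, cell_deg, is_ocean=lambda lat, lon: True):
    """Keep at most one valid marine record per grid cell.

    records  : iterable of (lat, lon) pairs; values may be None or NaN
    cell_deg : grid cell size in degrees (assumed equal to the raster resolution)
    is_ocean : hypothetical land-mask lookup; the default accepts every point
    """
    kept = {}
    for lat, lon in records:
        # Drop missing coordinates.
        if lat is None or lon is None:
            continue
        if math.isnan(lat) or math.isnan(lon):
            continue
        # Drop coordinates outside the valid geographic range.
        if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
            continue
        # Drop records that fall on land.
        if not is_ocean(lat, lon):
            continue
        # Index of the grid cell that contains the point.
        cell = (math.floor((lat + 90.0) / cell_deg),
                math.floor((lon + 180.0) / cell_deg))
        # The first record in a cell is kept; duplicates and close neighbours are dropped.
        kept.setdefault(cell, (lat, lon))
    return list(kept.values())

records = [
    (35.10, 120.20),   # valid
    (35.10, 120.20),   # exact duplicate
    (35.12, 120.22),   # falls in the same 0.25-degree cell as the first record
    (36.00, 121.00),   # valid, different cell
    (None, 121.00),    # missing latitude
    (95.00, 121.00),   # impossible latitude
]
thinned = clean_and_thin(records, cell_deg=0.25)
```

Keeping the first record per cell is the simplest thinning rule. It removes exact duplicates and clustered sightings in one pass, at the cost of making the result depend on record order.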

Environmental variables were selected from the Bio-ORACLE V3.0 database. After eliminating highly correlated variables, the final set of predictors comprised Bathymetry, Sea surface temperature, Phosphate concentration, Dissolved oxygen, Sea surface salinity, Nitrate concentration, Silicate concentration, Current velocity, Photosynthetically Active Radiation (PAR), Current direction, Chlorophyll concentration, and pH for the period 2020-2030 (Table 1), all of which are closely related to Ulva prolifera growth. All layers were standardized to the WGS84 coordinate system and processed in QGIS so that they covered marine areas only.

MaxEnt Output Variables	Corresponding Environmental Factors (Bio-ORACLE)
terrain	Bathymetry
temp	Sea surface temperature
phosphate	Phosphate concentration
dissolved_oxygen	Dissolved oxygen
salinity	Sea surface salinity
nitrate	Nitrate concentration
silicate	Silicate concentration
current_velocity	Current velocity
par	Photosynthetically Active Radiation (PAR)
current_direction	Current direction
chlorophyll	Chlorophyll concentration
ph	pH

Table 1. Bioclimatic variables used for modeling Ulva prolifera distribution.

Fig3. Workflow for MaxEnt Model.
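The correlation screening mentioned in Materials and Method can be illustrated with a simple greedy filter. This is a sketch under stated assumptions: the 0.8 cutoff, the function names, and the `temp_max` layer are hypothetical, since the write-up does not state which threshold or procedure was used.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def drop_correlated(layers, threshold=0.8):
    """Greedily keep variables whose |r| with every kept variable is <= threshold.

    layers    : dict mapping variable name -> values sampled at the same points
    threshold : assumed cutoff; the value actually used is not stated in the text
    """
    kept = []
    for name in layers:  # earlier entries take priority over later ones
        if all(abs(pearson(layers[name], layers[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

samples = {
    "temp":     [10.0, 12.0, 15.0, 18.0, 21.0],
    "temp_max": [12.0, 14.1, 17.0, 20.2, 23.0],  # hypothetical layer, nearly collinear with temp
    "salinity": [31.0, 34.0, 30.0, 33.0, 32.0],
}
selected = drop_correlated(samples)
```

Because the filter is greedy, the order of the input decides which member of a correlated pair survives, so variables with a clearer biological interpretation should be listed first.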

Results

The model's predictions indicate that under the 2020–2030 climate scenario, Ulva prolifera is projected to retain a broad potential suitable habitat globally. High-suitability areas (red-yellow) are mainly distributed in the shallow coastal waters of temperate and subtropical regions, including the Yellow Sea and East China Sea in China, the coast of Japan, the North Sea in Europe, the west coast of the United States, and parts of the South American coast. These areas share suitable temperatures, higher nutrient levels, and abundant sunlight, all of which meet the growth requirements of Ulva prolifera. Compared to the current distribution, the model predicts that new potential suitable habitats may emerge in high-latitude regions (such as parts of the North Atlantic and North Pacific coasts), while some low-latitude regions show a decline in suitability due to rising sea temperatures. This suggests that our mitigation strategy is relevant not only to the Yellow Sea but also as a reference for managing potential green tide occurrences elsewhere, which supports the long-term applicability of the project.

Fig4. All bioclimatic variables used for modeling Ulva prolifera distribution.

Discussion and Analysis

(1) Model Validation

The model was trained using 272 occurrence points and 10,272 background points, and it shows strong discriminative performance. The training AUC is 0.952, well above the random-prediction value of 0.5, indicating that the model separates occurrence points from background points reliably on the training data. The regularized training gain is 2.564, indicating that the predicted distribution is concentrated around the occurrence points far more tightly than a uniform distribution would be. Omission rates at different thresholds align closely with theoretical expectations: at the "10 percentile training presence" threshold the omission rate is 0.099, so the model still covers about 90% of the actual occurrence points while limiting the predicted range. The ROC curve (Fig5) is consistent with these results and supports using the model to infer the potential distribution of Ulva prolifera.

Fig5. Receiver operating characteristic (ROC) curve for Ulva prolifera.
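To make the two validation metrics concrete, the sketch below computes an AUC and a percentile-threshold omission rate from raw suitability scores. It is an illustration only: MaxEnt reports these values itself, the function names are ours, and the scores are made-up toy data. As in MaxEnt, background points stand in for absences.

```python
def auc(presence_scores, background_scores):
    """AUC as the probability that a presence point outscores a background point."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5  # ties count as half a win
    return wins / (len(presence_scores) * len(background_scores))

def percentile_threshold(presence_scores, fraction=0.10):
    """Score below which roughly `fraction` of the presence points fall."""
    ranked = sorted(presence_scores)
    index = min(int(fraction * len(ranked)), len(ranked) - 1)
    return ranked[index]

def omission_rate(presence_scores, threshold):
    """Share of presence points scored below the threshold (i.e. missed)."""
    missed = sum(1 for p in presence_scores if p < threshold)
    return missed / len(presence_scores)

# Toy suitability scores, not model output.
presence = [0.9, 0.8, 0.7, 0.65, 0.6, 0.55, 0.5, 0.45, 0.4, 0.2]
background = [0.1, 0.2, 0.3, 0.05, 0.6]

toy_auc = auc(presence, background)
threshold = percentile_threshold(presence)
toy_omission = omission_rate(presence, threshold)
```

By construction, a 10th-percentile threshold leaves about 10% of training points below it. With 272 training points, the reported omission rate of 0.099 corresponds to about 27 points.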

(2) Variable Contribution

The analysis of percent contribution and permutation importance shows clear differences in how strongly each environmental factor influences the model's predictions. As Table 2 shows, Bathymetry (terrain) is by far the most important explanatory factor, with a percent contribution of 46.8 and a permutation importance of 76.3. This suggests that Ulva prolifera primarily relies on shallow marine environments and that large-scale blooms are unlikely to occur in deep-sea areas. Sea surface temperature (14.6) and phosphate concentration (13.2) follow in percent contribution, while the remaining factors each contribute less than 8.

Variable	Percent contribution	Permutation importance
terrain	46.8	76.3
temp	14.6	8.9
phosphate	13.2	6.3
dissolved_oxygen	7.9	3.8
salinity	5.2	1.8
nitrate	3.0	0.4
silicate	2.6	1.1
current_velocity	2.4	0.0
par	2.0	0.7
current_direction	1.4	0.0
chlorophyll	0.5	0.0
ph	0.3	0.6

Table 2. Variable Contribution to Model.

(3) Quantitative Analysis

We conducted a quantitative analysis of the MaxEnt prediction results using two approaches: binary and suitability-weighted.

Binary Suitable Area

The output raster was binarized at a threshold of T = 0.5: cells with a suitability value greater than 0.5 were treated as potential distribution zones, and all other cells as non-distribution zones. The resulting global area of potentially suitable habitat was 4.73×10⁶ km² (1.31% of the ocean).

Weighted / Suitability-Integrated Area

To reduce threshold sensitivity, we also weighted each pixel's area by its suitability value, which preserves the gradient of suitability strength. The result can be read as an area of "equivalent 100% suitability": a pixel with a suitability of 0.8, for example, contributes 80% of its area. Applying this weighting to the pixels above the same threshold, T = 0.5, gives an equivalent area of 3.55×10⁶ km² (0.98% of the ocean).

Fig6. Comparison of suitable area under different suitability scores.
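Both area measures can be sketched in a few lines. The code below is a simplified illustration, not our actual raster workflow: it assumes a regular latitude/longitude grid on a spherical Earth, and the function names and toy cells are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)

def cell_area_km2(lat_center, cell_deg):
    """Area of a cell_deg x cell_deg lat/lon cell centred at lat_center, on a sphere."""
    half = math.radians(cell_deg) / 2.0
    lat = math.radians(lat_center)
    dlon = math.radians(cell_deg)
    return EARTH_RADIUS_KM ** 2 * dlon * (math.sin(lat + half) - math.sin(lat - half))

def suitable_areas(cells, cell_deg, threshold=0.5):
    """Return (binary_area, weighted_area) in km^2.

    cells : iterable of (lat_center, suitability) pairs for ocean cells only
    """
    binary = 0.0
    weighted = 0.0
    for lat, s in cells:
        if s > threshold:
            area = cell_area_km2(lat, cell_deg)
            binary += area        # every cell above the threshold counts in full
            weighted += s * area  # the same cell counts in proportion to its suitability
    return binary, weighted

# Three toy 1-degree cells near the equator; only the first two exceed T = 0.5.
cells = [(0.5, 0.9), (0.5, 0.6), (0.5, 0.3)]
binary_area, weighted_area = suitable_areas(cells, cell_deg=1.0)
mean_suitability = weighted_area / binary_area
```

Under this definition, the ratio of weighted to binary area equals the area-weighted mean suitability of the cells above the threshold. For the reported figures, 3.55×10⁶ / 4.73×10⁶ ≈ 0.75, and the stated percentages imply a total ocean area of roughly 3.6×10⁸ km².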

Limitations

Data Sources: The available marine environmental data still have limitations in spatial resolution and predictive accuracy. Some regions suffer from missing layers or inconsistent boundaries. These uncertainties may affect the reliability of the model predictions to some extent, particularly when interpreting species distribution at local scales.
Considerations: The quantitative analysis in this study mainly focuses on the influence of environmental factors on Ulva prolifera distribution, but does not incorporate the species' own dispersal and bloom mechanisms (such as drift diffusion and reproductive cycles) into the modeling process. Additionally, short-term disturbance factors, such as extreme weather events (e.g., storms, anomalously high temperatures), have not been considered, which could lead to an underestimation of actual bloom risks.
Quantitative Analysis: Due to the limited number of valid distribution records for Ulva prolifera worldwide, the quantitative analysis in this study primarily relies on statistical methods. The prediction results reflect the cumulative area of potential suitable habitats throughout the year, rather than the dynamic distribution during specific seasons or bloom events. Therefore, the model is currently unable to capture seasonal characteristics or the specific spatiotemporal processes of green tide outbreaks, and further optimization and validation are required using high temporal resolution monitoring data.

Growth and Trajectory Tracking Model of Ulva prolifera Spores