The CS model of Pst
Aims
We built an epidemiological model of the spread of Pseudomonas syringae pv. tomato (Pst) in the greenhouse to support rapid and accurate detection. The model consists of three parts: the natural infection process, the infection process with early intervention, and the resulting economic benefit. Together, these parts highlight the significance of our detection tool. Now let's move on to our CA model and SEIR model!
Research on the Transmission of Pst
The SEIR model is one of the most common models in infectious disease dynamics. Its core idea is to more accurately describe the spread dynamics of infectious diseases by dividing the disease into different stages of development. The following figure shows the basic structure of the SEIR model, where

Compared with the traditional SEIR model, we have added two new compartments:
Based on this, we have added two compartments,
Figure 2 fully demonstrates our improvements to the SEIR model for simulating this infection process.

The following arguments have helped us develop this model:
a. The presence of asymptomatic carriers involves environmental strains that come into contact with plants through rainwater, irrigation water, etc., and can survive latently on plant surfaces or within plant tissues. So we divide compartment E into two parts: surface colonization and internal latency. Pst-related strains in the environment (such as rainwater isolates) can be transmitted through irrigation water during the latent period, creating a need for early intervention.
b. For Pst infection, asymptomatic carriers or latent strains can be divided into infectious and non-infectious categories. Some lineages persist in plant vascular tissues without symptoms and require specific triggers to activate virulence.
As for vascular bundle transmission, its essence is the systemic spread of pathogens within a single plant, rather than direct transmission from one plant to another. Therefore, we divide compartment I into two parts: local infection and systemic infection. When Pst invades through stomata or wounds on leaves, if it successfully breaks through local defenses, it can enter and utilize the plant's vascular system for movement. The pathogen reproduces and moves with the fluid flow in the vascular bundles' xylem or phloem, spreading from the initial infection site to other parts of the plant, including stems, leaves, and roots, resulting in systemic infection. This results in the entire plant becoming infected, rather than just localized lesions.

Model Structure of SEIR Model
We assume that tomatoes are planted with uniform distribution across a large square farmland.
We list the required notations and their explanations as follows.
Table 1. Explanation of the added parameters
| Parameters | Description |
|---|---|
| E_s | Plants colonized by bacteria on their surface |
| E_i | Plants whose interior has been invaded and harbors bacteria |
| I_local | Infected plants whose vascular bundles have not yet been invaded |
| I_systemic | Plants whose vascular bundles have been invaded and infected |
| W | Pathogen concentration in water bodies |
| Sₒ | Soil pathogen concentration |
Table 2. Meanings and explanations of other variables
| Parameters | Name | Description |
|---|---|---|
| | Surface Colonization Rate | The rate at which environmental pathogens (W) at unit concentration successfully attach to the surface of susceptible plants (S) |
| | Direct Invasion Rate | The rate at which environmental pathogens (W) directly invade plant interiors (bypassing surface colonization) |
| | Surface-to-interior Penetration Rate | The rate at which surface-colonizing bacteria (E_s) successfully penetrate plant physical barriers and enter internal tissues |
| σ | Incidence Rate | The rate at which internal pathogen carriers (E_i) develop into local lesion infectors (I_local). 1/σ represents the incubation period. |
| | Infection Systematization Rate | The rate at which pathogens in local infections (I_local) successfully invade the vascular bundle and develop into systemic infections (I_systemic) |
| | Surface Clearance Rate | The rate at which plants remove surface-colonizing bacteria (E_s) through rainfall, their own secretions, or other mechanisms |
| | Local Infection Removal Rate | The rate at which locally infected individuals (I_local) are removed by farmers or die due to local diseases |
| | System Infection Removal Rate | The rate at which systemically infected individuals (I_systemic) are removed through whole-plant wilting, death, or removal by farmers |
| | Surface Microbial Release Rate Into the Environment | The rate at which surface colonizers (E_s) release pathogens into the environmental pathogen pool (W) |
| | Disease Spot Environmental Release Rate | The rate at which locally infected individuals (I_local) release pathogens into the environment (W) through lesion exudation per unit quantity |
| | System Infection Environment Release Rate | The rate at which systemically infected individuals (I_systemic) release pathogens into the environment (W) through vascular exudation and root secretion |
| | Environmental Pathogen Decay Rate | Natural mortality rate of pathogenic bacteria in the environmental pathogen pool (W) |
| | Soil Reservoir Decay Rate | Natural decay rate of pathogens in the soil/diseased residue reservoir (Sₒ) |
| | Rate of Diseased and Residual Plant Material Returning to Soil | The rate at which infected individuals (I) deposit pathogen-carrying remains into the soil reservoir (Sₒ) |
| | System Infection Rate | The probability of an internal latent agent ( |
| | Reseeding Function | Complete the function of |
| | Soil Function | Rate of pathogen release from soil to water source |
We obtain the equation as follows:
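One way to read the flows described in Tables 1 and 2 is as a system of ordinary differential equations over the six plant states plus the two environmental pools. The Python sketch below is only an illustration under stated assumptions, not the exact system used in this model. The parameter names (`beta_s`, `kappa`, and so on) are placeholders chosen for this sketch, and the values of `beta_s`, `beta_d` and `eta` are invented for the example (the other values follow Table 3). It also assumes that cleared surface colonizers return to S and that soil release is linear, and it leaves out the Reseeding Function and the System Infection Rate.

```python
from dataclasses import dataclass


@dataclass
class Params:
    beta_s: float = 0.3     # surface colonization rate (ASSUMED value)
    beta_d: float = 0.05    # direct invasion rate (ASSUMED value)
    kappa: float = 0.1      # surface-to-interior penetration rate
    sigma: float = 0.1      # incidence rate, 1/sigma = incubation period
    rho: float = 0.05       # infection systematization rate
    gamma_s: float = 0.5    # surface clearance rate
    mu_l: float = 0.083     # local infection removal rate
    mu_s: float = 0.05      # systemic infection removal rate
    eps_s: float = 0.5      # release into W from surface colonizers
    eps_l: float = 0.6      # release into W from local lesions
    eps_sys: float = 0.9    # release into W from systemic infections
    delta_w: float = 0.2    # environmental pathogen decay rate
    delta_so: float = 0.1   # soil reservoir decay rate
    theta: float = 0.7      # diseased residue returning to soil
    eta: float = 0.05       # soil-to-water release (ASSUMED, linear)


def rhs(y, p):
    """Right-hand side of the sketched system; plant states are fractions."""
    S, Es, Ei, Il, Is, R, W, So = y
    colonize = p.beta_s * W * S   # S -> E_s
    invade = p.beta_d * W * S     # S -> E_i directly
    dS = -colonize - invade + p.gamma_s * Es
    dEs = colonize - p.gamma_s * Es - p.kappa * Es
    dEi = invade + p.kappa * Es - p.sigma * Ei
    dIl = p.sigma * Ei - p.rho * Il - p.mu_l * Il
    dIs = p.rho * Il - p.mu_s * Is
    dR = p.mu_l * Il + p.mu_s * Is
    dW = (p.eps_s * Es + p.eps_l * Il + p.eps_sys * Is
          + p.eta * So - p.delta_w * W)
    dSo = p.theta * (Il + Is) - (p.delta_so + p.eta) * So
    return (dS, dEs, dEi, dIl, dIs, dR, dW, dSo)


def simulate(y0, p, days=90, dt=0.01):
    """Integrate with forward Euler; returns the state after `days` days."""
    y = tuple(y0)
    for _ in range(round(days / dt)):
        y = tuple(v + dt * d for v, d in zip(y, rhs(y, p)))
    return y
```

Calling `simulate((0.99, 0.01, 0, 0, 0, 0, 0, 0), Params())` starts from 1% surface-colonized plants and a clean environment. Because every plant-state term leaving one compartment enters another, the six plant fractions always sum to 1.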
Model Structure of CA Model
Thus, we used this equation to simulate the infection of Pst on tomato plants across an entire farmland. However, unlike typical models, since we are simulating an entire farmland, the larger environment means that
So we incorporated a cellular automaton (CA) model for simulation. A CA model consists of the following basic components:
a. Grid: A regular lattice of cells. In our model, each tomato plant occupies one grid.
b. States: Each cell can be in one of a finite number of states. In our model, it refers to the above six plant states.
c. Neighborhood: A definition of which cells are considered neighbors of a given cell. In our model, it specifically refers to the tomato plants grown in the surrounding area.
d. Rules: A set of rules that determines the next state of a unit based on its current state and the states of neighboring units. In our model, it refers to the pathogen concentration in the environment and the SEIR model.
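The four components above can be sketched in a few lines of Python. This is a simplified illustration, not the project code: it uses a Moore (8-cell) neighborhood, treats the Table 3 rates as per-day transition probabilities, approximates the environmental pathogen concentration by the shedding of neighboring plants, and uses an invented coefficient `BETA`. Surface clearance and direct invasion are omitted for brevity.

```python
import math
import random

# The six plant states, one per grid cell
S, ES, EI, IL, ISYS, R = range(6)

# state -> (next state, per-day probability); values borrowed from Table 3
PROGRESS = {ES: (EI, 0.1), EI: (IL, 0.1), IL: (ISYS, 0.05), ISYS: (R, 0.05)}
# Pathogen release of each shedding state (Table 3 release rates)
SHEDDING = {ES: 0.5, IL: 0.6, ISYS: 0.9}
BETA = 0.3  # ASSUMED colonization coefficient, for illustration only


def neighbors(grid, r, c):
    """Yield the states of the up to eight cells surrounding (r, c)."""
    rows, cols = len(grid), len(grid[0])
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr or dc) and 0 <= r + dr < rows and 0 <= c + dc < cols:
                yield grid[r + dr][c + dc]


def step(grid, rng):
    """Apply the update rule once to every cell and return the new grid."""
    new = [row[:] for row in grid]
    for r, row in enumerate(grid):
        for c, state in enumerate(row):
            if state == S:
                # Local pathogen pressure from shedding neighbors
                pressure = sum(SHEDDING.get(s, 0.0)
                               for s in neighbors(grid, r, c))
                if rng.random() < 1.0 - math.exp(-BETA * pressure):
                    new[r][c] = ES
            elif state in PROGRESS:
                nxt, prob = PROGRESS[state]
                if rng.random() < prob:
                    new[r][c] = nxt
    return new
```

A run starts from a grid of `S` cells with one or more infected cells and calls `step` once per simulated day. Passing a seeded `random.Random` makes a run reproducible.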
Then, to better reflect the impact of the environment on infection, we used weather data for Beijing and made some parameters weather-dependent. Since there are few models of Pst transmission in the literature, we make a qualitative analysis here.
According to the investigation, the optimum temperature range for Pst infection and disease development is about 15 °C to 22 °C. When the temperature exceeds 30 °C, disease development is significantly inhibited or even stopped (Ref 2). Taking β as an example, we design it as the product of two key environmental factor functions.
Among them,
For the temperature benefit function, we introduce the beta function:
We set the base point temperature for Pst that,
This function will output a value between 0 and 1 when
For humidity, we use a humidity gating function to qualitatively simulate the propagation ability from humidity.
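As a rough sketch of this construction, the snippet below multiplies a beta-type temperature response by a sigmoid humidity gate. The cardinal temperatures, the humidity threshold and steepness, and `beta_max` are all assumptions made for this example: `T_OPT` is taken as the midpoint of the 15 °C to 22 °C range, `T_MAX` as the 30 °C limit mentioned above, and `T_MIN` is a guess. The exact functions and constants used in the model may differ.

```python
import math

# Cardinal temperatures in degrees Celsius (ASSUMED for this sketch)
T_MIN, T_OPT, T_MAX = 5.0, 18.5, 30.0


def temperature_response(t):
    """Beta-type response: 0 outside (T_MIN, T_MAX), 1 at T_OPT."""
    if t <= T_MIN or t >= T_MAX:
        return 0.0
    shape = (T_OPT - T_MIN) / (T_MAX - T_OPT)
    rising = (t - T_MIN) / (T_OPT - T_MIN)
    falling = (T_MAX - t) / (T_MAX - T_OPT)
    return falling * rising ** shape


def humidity_gate(rh, threshold=80.0, steepness=0.2):
    """Smooth gate in (0, 1) that opens as relative humidity (%) rises."""
    return 1.0 / (1.0 + math.exp(-steepness * (rh - threshold)))


def beta(t, rh, beta_max=0.3):
    """Weather-dependent rate as the product of the two factor functions."""
    return beta_max * temperature_response(t) * humidity_gate(rh)
```

With these settings, a cool and humid hour such as `beta(18.5, 95)` gives a larger rate than a warm one such as `beta(28, 95)` or a dry one such as `beta(18.5, 40)`, which matches the qualitative behavior described above.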
We then substitute the data for calculation. Part of the data is taken from analogous data for another bacterial pathogen of tomato (Tomato Bacterial Canker).
Table 3. Parameter values (Ref 3)
| Parameters | Name | Value | Units |
|---|---|---|---|
| | Surface Colonization Rate | | |
| | Direct Invasion Rate | | |
| | Surface-to-interior Penetration Rate | 0.1 | |
| σ | Incidence Rate | 0.1 | |
| | Infection Systematization Rate | 0.05 | |
| | Surface Clearance Rate | 0.5 | |
| | Local Infection Removal Rate | 0.083 | |
| | System Infection Removal Rate | 0.05 | |
| | Surface Microbial Release Rate Into the Environment | 0.5 | |
| | Disease Spot Environmental Release Rate | 0.6 | |
| | System Infection Environment Release Rate | 0.9 | |
| | Environmental Pathogen Decay Rate | 0.2 | |
| | Soil Reservoir Decay Rate | 0.1 | |
| | Rate of Diseased and Residual Plant Material Returning to Soil | 0.7 | |
| | System Infection Rate | 0.1 | |
The image below shows the result.


The above image shows the infection process of Pst in a large area of farmland in Beijing from May to July 2025, when no suitable detection method is available.
In the common tomato growing season from May to July, uncontrolled Pst infects almost the entire tomato field in about 90 days, resulting in very large economic losses. Therefore, the hardware group developed a detection device to determine whether a plant is diseased. What we need to consider is how to use these detection devices sensibly, so as to reduce Pst infection and maximize economic benefits.
View our code on the iGEM GitLab.
Economic Benefit Model
In the previous CS model, we simulated and visualized the infection process of the pathogen on plants in a farmland. To further enhance the practical applicability of our project and product, we use a Q-learning model to simulate the economic benefits over a complete growth cycle. Now let's move on to our economic benefit model!
Background
The yield potential of facility-grown tomatoes is tremendous, starting from 5,000 kg per mu in conventional greenhouses. Meanwhile, tomato market prices exhibit a typical "U-shaped" annual fluctuation pattern. In January 2022, affected by supply-side factors, the national wholesale average price soared to 7.29 yuan/kg, a year-on-year increase of 73.5%.
We can see that the profitability of tomato cultivation, whether it makes a profit and how much, is closely related to market conditions and weather factors in that particular year. Moreover, if farmers encounter pests and diseases and do not take timely control measures, the resulting negative economic impact can be significant.
The greenhouse is planted with fresh tomatoes that have better taste but poorer disease resistance, requiring more labor input. What farmers need most are tools and strategies for rapid detection of pests and diseases. Our project is fundamentally based on early detection of Pst before symptom appearance, and the hardware team has developed a corresponding Kit. Our economic benefit model theoretically demonstrates the Kit's application, highlighting the economic benefits our device can bring. Below are the detection tools and relevant parameters we have developed for Pst:
We have conducted research on the parameters related to tomato cultivation and sales, which are listed below.
Table 4. Related economic parameters
| Parameters | Value | Source |
|---|---|---|
| Tomato price per unit | | Ref 4 |
| Planting density | 4500 | assumption |
| Individual plant weight | 13.62 | Ref 5 |
| Planting cost | 5000 | Ref 6 and assumption |
| Cost per single plant transplant | 0.42 | Ref 7 |
| The cost of Protato kit | 1.6 $ | From Drylab |
Training Principles and Process
This model employs the Deep Q-Network (DQN) algorithm from Deep Reinforcement Learning (DRL), applying Q-learning to train an artificial intelligence "agent" capable of making optimal disease detection decisions based on daily crop and environmental conditions.
Q-learning is a reinforcement learning algorithm that trains an agent to assign values to possible actions based on its current state. Its foundation is the Markov decision process: the agent interacts with an environment by taking actions that affect it. After each action, the environment transitions to a new state with certain probabilities and returns feedback to the agent as a reward determined by an underlying reward function. For any finite Markov decision process, Q-learning can, given sufficient exploration, find an optimal policy, that is, an action-selection strategy that maximizes the expected total reward over all successive steps starting from the current state.
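The core of Q-learning is a single update rule, shown here in its tabular form. The model itself uses DQN, which replaces the table with a neural network, so this is a simplified stand-in rather than the project code. The action labels, the learning rate `alpha`, and the discount factor `gamma` are illustrative assumptions.

```python
import random
from collections import defaultdict

# Illustrative labels for the three detection choices (ASSUMED names)
ACTIONS = ("none", "visual", "kit")


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.95):
    """Move Q(state, action) toward reward + gamma * max_a Q(next_state, a)."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])


# Q-table mapping (state, action) pairs to values, initialized to 0
q_table = defaultdict(float)
```

Each simulated time step calls `choose_action`, applies the action in the environment, and then passes the observed reward and next state to `q_update`. Repeating this over many simulated seasons gradually raises the values of the profitable actions.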

In this code, we still use the SEIR model mentioned earlier to simulate the infection of Pst, with parameters that reflect the pathogen's life cycle and spread dynamics. Crucially, every action has a cost, and every outcome has an economic consequence. This forces the agent to learn the complex trade-offs between the cost of advanced detection methods and the potential long-term loss from inaction, mirroring the real-world cost-benefit analysis a farmer must perform.
Besides, the final output of the training process is not a fixed set of rules, but a highly adaptive and intelligent policy. In other words, it will continuously adjust its strategies based on the training outcomes of multiple simulated growth cycles, optimizing economic benefits through direct feedback. This is precisely the method and data that farmers genuinely need.
Moreover, to simulate farmers' decision-making during crop growth more accurately, we introduced field weather data from Beijing for May to July 2025. With the weather data, the AI can first learn to associate plant health conditions with specific weather conditions. Second, it can make more forward-looking decisions. For example, the earlier Pst infection analysis shows that Pst spreads much more efficiently on cold, rainy days than in dry, sunny weather, and the AI learns from thousands of simulations that rainfall greatly promotes disease spread. Therefore, if the AI knows that heavy rainfall is coming in the next period, it may use Kit sampling for detection.
After the training, we designed a decision log to reflect the optimal detection strategy for each day, which allows us to analyze the behavioral patterns learned by deep learning and obtain long-term reward data.


In the code, we have designed 3 detection methods: doing nothing, visual inspection, and using our kit. Moreover, we have labeled the estimated detection success rate and economic cost required for each method.
To improve accuracy, we have set 8 AM, 12 PM, 4 PM, and 8 PM as testing times every day. The AI will observe based on the input weather and environmental conditions to determine whether to conduct testing.
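The trade-off between detection cost and avoided loss can be illustrated with a small sketch. The success rates and the cost of visual inspection below are invented placeholders, and only the kit cost of 1.6 comes from Table 4. The greedy one-step rule is a deliberate simplification: the trained agent optimizes long-term reward rather than a single step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionAction:
    name: str
    success_rate: float  # chance of catching an infection (ASSUMED)
    cost: float          # cost per use (kit: Table 4; others ASSUMED)


ACTIONS = (
    DetectionAction("no action", 0.0, 0.0),
    DetectionAction("visual inspection", 0.3, 0.2),
    DetectionAction("kit", 0.9, 1.6),
)

# Four decision points per day: 8 AM, 12 PM, 4 PM, and 8 PM
DECISION_HOURS = (8, 12, 16, 20)


def expected_reward(action, infected_fraction, loss_if_missed):
    """Expected one-step payoff: avoided loss minus the cost of acting."""
    avoided = action.success_rate * infected_fraction * loss_if_missed
    return avoided - action.cost


def best_action(infected_fraction, loss_if_missed):
    """Greedy one-step choice, used here only to show the trade-off."""
    return max(
        ACTIONS,
        key=lambda a: expected_reward(a, infected_fraction, loss_if_missed),
    )
```

With no infection present, `best_action(0.0, 100.0)` returns the "no action" entry because every test only adds cost. With `best_action(0.5, 100.0)`, the higher success rate of the kit outweighs its price.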
Training results

The revenue generated in each planting season gradually increases. The horizontal axis represents the number of training iterations, and the vertical axis represents the total revenue divided by 100. The economic benefit increased from approximately 27,500 RMB to a final convergence value of approximately 36,700 RMB. We can see that the AI, through continuous trial and error, attempts different detection plans each day based on factors such as weather and environment. After sufficient iterations, it retrieves plans with higher Q-values and greater efficiency from its database. This diagram clearly demonstrates its self-learning capability and the feasibility of the simulation.
Finally, the training results are summarized in a table format, with entries such as "Day16,20:00 | Soil(H:25%, T:15.8℃) Air(T:18.0℃, H:57%, R:2.6mm) | Healthy:0.27 | Action: 1: No Action", listing the simulated plant infection status and the corresponding AI action for each time period of the day. In theory, this provides direct guidance for growers' detection work.

The primary achievement of this codebase is the creation of a fully functional prototype for an agricultural Decision Support System (DSS). Unlike systems that simply classify disease, this project tackles the far more complex problem of sequential decision-making under uncertainty. Traditional Pst research and control strategies have typically focused on the biological aspect. For example, copper-based sprays are applied under specific conditions, following "reactive" rules usually based on experience or fixed thresholds. In contrast, our model centers around an economic optimization problem.
The core objective of the code is to maximize profits throughout the entire growing season, evaluating control measures from a novel bio-economic perspective to better implement practices and generate greater benefits for farmers. This is accomplished by training an AI agent within a custom-built, high-fidelity simulation of a tomato farm. This virtual environment serves as a risk-free "digital twin" where the agent can conduct thousands of trial-and-error experiments—representing thousands of growing seasons—without any real-world cost or crop loss.
View our code on the iGEM GitLab.
Future Work
In the future, our work will mainly focus on improving our parameter database. For example, we will use COMSOL to simulate the diffusion of Pst in plant tissues such as mesophyll cells and vascular bundles, in order to obtain the specific values of the parameters within.
Furthermore, our economic model should be more rigorous. Costs such as components, reagents, disposables, and labour, as well as throughput, must be added to improve the accuracy of this model.