Software

Overview

Accurate pesticide detection is crucial for ensuring food safety and environmental protection, but traditional colorimetric test strips rely heavily on human visual interpretation.
Differences in eyesight, lighting conditions, and color perception (including color blindness) frequently lead to misjudged color intensity and, in turn, inaccurate estimates of pesticide concentration. Moreover, for many users, especially farmers in rural areas, reading the strips by eye introduces subjective bias.
Thus, we developed a machine learning-based smartphone detection software to automate the entire color recognition and quantification process.
Our software aims to create a portable pesticide detection platform. The biosensor consists of AChE immobilized on gold nanoparticles, which produces a colorimetric reaction when exposed to samples containing pesticides. The smartphone application, built with Flutter for cross-platform deployment, serves as the analysis and visualization interface: it processes the visual data from the biosensor with computer vision and colorimetric algorithms to quantify the pesticide concentration.
*The following picture shows the color gradient of pesticide test strips sold on the market. It clearly fails to reflect differences in pesticide concentration well. Our wet lab therefore significantly improved the test strips, enhancing accuracy, allowing smaller datasets, and strengthening robustness.
Test results
The entire detection process follows four steps:
Software Workflow

Front-end Architecture

For the front end of our application, we chose Flutter as the framework because it builds native-quality applications for both Android and iOS from a single code base. The front-end interface offers a simple, user-friendly workflow, making it particularly suitable for practical applications. Its main features include:
  • Camera module: Utilizes the device's camera API for stable exposure shooting. This interface includes a rectangular overlay to guide users in aligning the test strips and optional reference color cards.
  • Image preview and cropping: After shooting is completed, the app automatically uses color segmentation and contour detection to identify the test area, and provides a manual adjustment function to ensure accuracy.
  • User interface: The user interface follows material design principles, featuring large buttons and clear instructions, ensuring usability even in non-laboratory environments. The result screen uses intuitive graphics - colored bars and icons - to convey pesticide content.
app_ui

Back-end Architecture

Once the image acquisition is completed, the analysis process begins.
Step 1: Region extraction – The application isolates the area containing the color reaction. This is achieved through edge detection and color thresholding, focusing on blue-dominated pixels.
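A minimal sketch of the blue-dominance thresholding described above, using NumPy on an RGB array (the on-device app performs this step with OpenCV; the function name and `margin` parameter here are illustrative):

```python
import numpy as np

def extract_blue_region(rgb: np.ndarray, margin: int = 20) -> np.ndarray:
    """Return a boolean mask of pixels where blue dominates red and green.

    rgb: H x W x 3 uint8 image. `margin` is how much larger the blue
    channel must be than the other channels to count as 'blue-dominated'.
    """
    r = rgb[..., 0].astype(np.int16)
    g = rgb[..., 1].astype(np.int16)
    b = rgb[..., 2].astype(np.int16)
    return (b > r + margin) & (b > g + margin)

# Toy example: a 2x2 image with two clearly blue pixels in the top row.
img = np.array([[[200, 200, 255], [10, 10, 200]],
                [[50, 50, 60],    [0, 0, 0]]], dtype=np.uint8)
mask = extract_blue_region(img)
```

In the real pipeline this mask would be combined with contour detection to pick the largest connected blue region rather than raw pixels.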
Step 2: Illumination correction – Due to the issue of illumination variation in mobile photography, we adopted two correction strategies:
  • Gray world algorithm: Assume that the average color of the scene should be neutral gray; adjust the RGB channels proportionally.
  • $$R' = R \cdot \frac{\bar{G}}{\bar{R}}, \quad B' = B \cdot \frac{\bar{G}}{\bar{B}}, \quad G' = G$$
  • Color card calibration: If a standard color card is present in the image, compute a 3×3 color conversion matrix that maps the captured colors to the reference values. If the device was previously calibrated with a color card, the stored calibration parameters are reused to improve accuracy when no usable color card is present.
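The gray world correction above is a few lines of array arithmetic; a NumPy sketch matching the formula (R and B scaled by the ratio of channel means, G unchanged):

```python
import numpy as np

def gray_world(rgb: np.ndarray) -> np.ndarray:
    """Gray-world white balance: scale R and B so their means match G's.

    Implements R' = R * (mean(G)/mean(R)), B' = B * (mean(G)/mean(B)), G' = G.
    Input and output are float arrays with values in [0, 255].
    """
    img = rgb.astype(np.float64)
    mean_r = img[..., 0].mean()
    mean_g = img[..., 1].mean()
    mean_b = img[..., 2].mean()
    img[..., 0] *= mean_g / mean_r
    img[..., 2] *= mean_g / mean_b
    return np.clip(img, 0.0, 255.0)

# A uniform reddish cast: mean R is twice mean G, mean B is half of it.
cast = np.full((4, 4, 3), [200.0, 100.0, 50.0])
balanced = gray_world(cast)
```

After correction all three channel means coincide, which is exactly the neutral-gray assumption the algorithm encodes.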
Step 3: Feature extraction – Convert the corrected region to a color space insensitive to light intensity by normalizing it to CIELab, then compute the mean of the a* and b* components to form a color descriptor.
$$\mathbf{x} = [\bar{a}, \bar{b}]$$
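A self-contained NumPy sketch of this feature extraction step (the on-device app uses OpenCV's color conversion; here the sRGB → XYZ → Lab conversion under a D65 white point is written out explicitly, and the helper names are illustrative):

```python
import numpy as np

def rgb_to_lab(rgb: np.ndarray) -> np.ndarray:
    """Convert an sRGB image (uint8, H x W x 3) to CIELab (D65 white point)."""
    c = rgb.astype(np.float64) / 255.0
    # sRGB gamma -> linear RGB
    lin = np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)
    # linear RGB -> XYZ (sRGB/D65 matrix)
    m = np.array([[0.4124564, 0.3575761, 0.1804375],
                  [0.2126729, 0.7151522, 0.0721750],
                  [0.0193339, 0.1191920, 0.9503041]])
    xyz = lin @ m.T
    # normalize by the D65 reference white
    xyz /= np.array([0.95047, 1.0, 1.08883])
    # XYZ -> Lab
    d = 6.0 / 29.0
    f = np.where(xyz > d ** 3, np.cbrt(xyz), xyz / (3 * d ** 2) + 4.0 / 29.0)
    L = 116.0 * f[..., 1] - 16.0
    a = 500.0 * (f[..., 0] - f[..., 1])
    b = 200.0 * (f[..., 1] - f[..., 2])
    return np.stack([L, a, b], axis=-1)

def color_descriptor(rgb_patch: np.ndarray) -> np.ndarray:
    """Feature vector x = [mean(a*), mean(b*)] over the corrected region."""
    lab = rgb_to_lab(rgb_patch)
    return np.array([lab[..., 1].mean(), lab[..., 2].mean()])

# Sanity check: a neutral white patch should land near a* = b* = 0.
white = np.full((3, 3, 3), 255, dtype=np.uint8)
x = color_descriptor(white)
```

Because L* (lightness) is dropped, residual brightness differences between photos have little effect on the descriptor, which is the point of working in CIELab here.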
Step 4: Classification and regression – Pre-trained SVM and SVR models take the extracted color features as input to classify the concentration level and regress a continuous value. For details, see “Main algorithm – SVM-based prediction”.
Step 5: Cloud and data management – To support large-scale environmental monitoring, the application can optionally upload detection results to a cloud SQL database. Each record includes:
  • Timestamp and geographical location information (if permitted)
  • Images and predicted pesticide concentrations
This enables data aggregation of pesticide residue trends within the region and helps regulatory agencies or farmers track pollution sources.
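The record layout described above could look like the following sketch, using SQLite locally as a stand-in for the cloud SQL database (the table and column names are illustrative, not the app's actual schema):

```python
import sqlite3

# In-memory database as a stand-in for the cloud SQL backend.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE detections (
        id            INTEGER PRIMARY KEY AUTOINCREMENT,
        timestamp     TEXT NOT NULL,
        latitude      REAL,              -- NULL when location permission is denied
        longitude     REAL,
        image_path    TEXT,              -- reference to the uploaded strip image
        concentration REAL NOT NULL      -- predicted pesticide concentration
    )
""")
conn.execute(
    "INSERT INTO detections (timestamp, latitude, longitude, image_path, concentration) "
    "VALUES (?, ?, ?, ?, ?)",
    ("2024-05-01T10:30:00Z", 31.23, 121.47, "strip_001.jpg", 0.42),
)
row = conn.execute("SELECT concentration FROM detections").fetchone()
```

Making latitude/longitude nullable mirrors the "if permitted" clause: records remain useful for trend aggregation even without location data.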
Software UI

Main algorithm

SVM-based prediction
In our application, SVM is used to map the measured color to a discrete pesticide concentration level.
After capturing the colorimetric area of the test strip, the application performs illumination correction and feature extraction, then feeds the normalized color vector into the pre-trained SVM model, which is embedded as an ONNX file.
$$f(\mathbf{x}) = \sum_{i=1}^{N_{sv}} \alpha_i y_i K(\mathbf{x_i}, \mathbf{x}) + b$$
To integrate the SVM-based prediction, the model must first be trained on a dataset. We employed the Radial Basis Function (RBF) kernel as \(K\) to project nonlinear color features into a higher-dimensional space, enabling accurate classification of subtle color gradients.
$$K(\mathbf{x_i}, \mathbf{x_j}) = \exp\left(-\frac{\|\mathbf{x_i} - \mathbf{x_j}\|^2}{2\sigma^2}\right)$$
This kernel allows the classifier to adapt to small nonlinearities in the color response curve, providing robustness to variations in lighting and camera sensors.
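The decision function and RBF kernel above can be sketched directly in NumPy; the support vectors, coefficients, and bias below are made-up values purely to illustrate the computation:

```python
import numpy as np

def rbf_kernel(xi: np.ndarray, xj: np.ndarray, sigma: float) -> float:
    """RBF kernel K(xi, xj) = exp(-||xi - xj||^2 / (2 * sigma^2))."""
    return float(np.exp(-np.sum((xi - xj) ** 2) / (2.0 * sigma ** 2)))

def svm_decision(x, support_vectors, alphas, labels, b, sigma):
    """Decision value f(x) = sum_i alpha_i * y_i * K(x_i, x) + b."""
    return sum(a * y * rbf_kernel(sv, x, sigma)
               for a, y, sv in zip(alphas, labels, support_vectors)) + b

# Tiny illustrative model with two support vectors (values are made up).
svs = [np.array([0.0, 0.0]), np.array([1.0, 1.0])]
f = svm_decision(np.array([0.0, 0.0]), svs,
                 alphas=[1.0, 1.0], labels=[+1, -1], b=0.5, sigma=1.0)
```

The sign of `f` gives the (binary) class; multi-class prediction composes several such decision functions, e.g. one-vs-one as scikit-learn's SVC does.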
Model Training and Optimization
The SVM model was trained using a dataset containing multiple colorimetric samples, each labeled with its corresponding pesticide concentration.
During the training process, grid search and cross-validation are applied to determine the optimal hyperparameters, namely the penalty coefficient C and the kernel width \(\sigma\), in order to balance classification accuracy and model generalization.
$$\text{Optimization Objective: } \min_{\alpha} \frac{1}{2}\sum_{i,j}\alpha_i \alpha_j y_i y_j K(\mathbf{x_i}, \mathbf{x_j}) - \sum_i \alpha_i$$
subject to \(0 \leq \alpha_i \leq C, \quad \sum_i \alpha_i y_i = 0\)
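A sketch of this grid search with cross-validation in scikit-learn, on toy stand-in data (the real training set comes from the wet lab; note that scikit-learn parameterizes the RBF kernel with \(\gamma = 1/(2\sigma^2)\) rather than \(\sigma\) directly):

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Toy stand-in for the colorimetric dataset: 2-D color features
# clustered around three concentration levels.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.1, size=(20, 2)) for c in (0.0, 1.0, 2.0)])
y = np.repeat([0, 1, 2], 20)

# Grid over the penalty C and kernel width (via gamma = 1 / (2 * sigma^2)).
param_grid = {"C": [0.1, 1.0, 10.0], "gamma": [0.1, 1.0, 10.0]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)
best_C = search.best_params_["C"]
best_gamma = search.best_params_["gamma"]
```

Cross-validation scores each (C, gamma) pair on held-out folds, so the selected hyperparameters reflect generalization rather than training-set fit.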
After training in Python (scikit-learn), the model is converted into an ONNX file with the skl2onnx converter, allowing deployment in cross-platform mobile environments. ONNXRuntime then performs inference on-device, delivering real-time feedback on the user's screen.
SVR-based Regression and correction
Additionally, we introduce Support Vector Regression (SVR) to enhance the quantitative accuracy of the detection system: it not only corrects the result output by SVM but also provides a continuous quantification as a reference for the overall level of pesticide residue.
SVR shares the same theoretical foundation as SVM and uses the same RBF kernel to map nonlinear color features. Unlike SVM, however, SVR fits a function that approximates the relationship between the feature vectors and the continuous target values while maintaining a flat regression surface.
In our system, SVR operates after the SVM classification stage. Once SVM determines the most probable color level, SVR refines the prediction by mapping the extracted color features to a continuous pesticide concentration value. This ensures that the final output reflects both discrete classification reliability and continuous precision.
The workflow can be summarized as below.
In this process, we let the input feature be \(\mathbf{x}\); SVM outputs the discrete level \(\hat{y}_{cls}\), and SVR outputs the continuous concentration \(\hat{C}_{reg}\).
$$\begin{aligned} \hat{y}_{cls} &= \arg\max_k f_k(\mathbf{x}) \\ \hat{C}_{reg} &= \sum_{i=1}^{N_{sv}} (\alpha_i - \alpha_i^*) K(\mathbf{x_i}, \mathbf{x}) + b \end{aligned}$$
The combination strategy:
$$\hat{C}_{final} = \begin{cases} \hat{C}_{reg}, & \text{if } \hat{C}_{reg} \in [C_{\hat{y}_{cls}-1}, C_{\hat{y}_{cls}+1}] \\ \text{clip}(\hat{C}_{reg}, \text{range}(\hat{y}_{cls})), & \text{otherwise} \end{cases}$$
This strategy keeps the final output within the range of the predicted class, preventing over- or under-prediction. The dual-stage approach gives users an interpretable concentration level (via SVM) alongside a precise quantitative value (via SVR).
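A simplified sketch of the clipping branch of this combination strategy (the class-boundary concentrations and helper name below are illustrative, not the app's calibrated values):

```python
import numpy as np

# Illustrative class-boundary concentrations: class k covers
# the interval [bounds[k], bounds[k + 1]].
bounds = np.array([0.0, 0.5, 1.0, 2.0, 5.0])

def combine(cls_level: int, c_reg: float) -> float:
    """Clip the SVR output to the range of the SVM-predicted class.

    If the regression value already falls inside the class interval it is
    returned unchanged; otherwise it is clipped to the interval edge.
    """
    lo, hi = bounds[cls_level], bounds[cls_level + 1]
    return float(np.clip(c_reg, lo, hi))

# SVR agrees with SVM (class 1 spans 0.5-1.0): value passes through.
ok = combine(1, 0.72)
# SVR overshoots the class range: value is clipped back to the boundary.
clipped = combine(1, 1.4)
```

The full strategy in the equation above is slightly more permissive, also accepting regression values that fall in the adjacent classes' intervals before clipping.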
This system of algorithms combines a native Flutter UI, OpenCV preprocessing, and ONNXRuntime inference, achieving relatively high precision while maintaining lightweight mobile performance, making it suitable for field applications.

Future development

There is still considerable room for future development:
  1. Integrating CSNorm-based neural correction to further mitigate light interference.
  2. Enabling real-time detection with augmented reality overlays showing concentration directly on camera preview (integrating MR technology).

References

Yao, M., Huang, J., Jin, X., Xu, R., Zhou, S., Zhou, M., & Xiong, Z. (2023). Generalized Lightness Adaptation with Channel Selective Normalization. arXiv e-prints, arXiv:2308.13783.
Hearst, M., Dumais, S., Osuna, E., Platt, J., & Scholkopf, B. (1998). Support vector machines. IEEE Intelligent Systems and their Applications, 13(4), 18-28.