Aechmi: A Portable Device
We designed a minimally invasive biological system for integration into a portable device that enables real-time monitoring of sepsis biomarkers. Such point-of-care tools are particularly valuable in emergency departments and low-resource settings, where rapid diagnosis is critical and laboratory infrastructure may be limited. Compared to conventional diagnostics, portable devices provide faster, more affordable results. (1) Beyond rapid diagnosis, our goal was to develop a system that gives clinicians continuous insight into disease progression, the appropriate type of early treatment, and therapeutic response. Such functionality is crucial for effective resource management and patient prioritization. Point-of-care devices that incorporate microfluidic technology are particularly well suited for this purpose, as they are portable, require only minimal blood volumes, demonstrate high accuracy, and deliver results within a short detection time. (1)
Diagnosis Pipeline
The proposed device's functionality is based on three integrated systems: a biological mechanism that performs the Collateral Cleavage and CHA reactions; a measurement mechanism composed of an electronic circuit that takes the measurements; and the software, which uses machine learning to produce the diagnostic results. The process starts with the biological system. Because each of the two reactions requires one hour to produce a measurable result, a single measurement is taken every two hours. This allows the device to complete 12 measurements over a 24-hour period. The measurement system activates next, detecting the electrochemical signals produced by the reactions. The electronic circuits do not analyze these signals; they simply record the data and send it via Wi-Fi to a server. On the server, the data is processed by machine learning models to generate a diagnosis. The results are then returned to the device and displayed on the screen. The following sections explain in detail the elements that make up each mechanism.
Biological Mechanism
Biomarkers can be detected in many biological fluids, but blood contains most of them, and at higher concentrations. Most point-of-care diagnostic devices that use microneedles extract interstitial fluid (ISF). However, the exact concentration of microRNAs in ISF remains unclear, which raises uncertainty about whether they can be detected in this fluid. For this reason, we selected blood as the biological sample, since microRNA concentrations are better characterized there. Reported values for the total extracellular microRNA concentration in serum range from ~3.8 to 7.7 pM (2), and there is also evidence for specific microRNAs such as miR-486-5p, which has been measured at approximately 13 pM in healthy individuals. (3)
The device would be designed to extract a small blood sample from the capillaries using an array of solid microneedles. Based on the literature, the optimal microneedle dimensions for increased blood flow are a length of 1000 μm, a width of 350 μm, and a thickness of 50 μm. This length balances satisfactory blood flow against pain. A clinical study that used microneedles of this length reported pain scores significantly lower than those for venepuncture. (4)
Once extracted, the blood is transported into microfluidic channels via capillary action. Each channel leads to a reservoir. Within the channels, the CRISPR/Cas13a reaction occurs, while the reservoirs host the Catalytic Hairpin Assembly (CHA) reaction. Unlike previous designs that connect all microchannels to a single reservoir, we propose a system that introduces smaller, separate reservoirs at the end of each channel. This modular design allows for the simultaneous detection of multiple microRNA biomarkers, while also spatially separating CHA from Cas13a to prevent interference from Cas13a's collateral cleavage activity. (5)
The microfluidic channels would be pre-loaded with lyophilized CRISPR/Cas13a reagents and Hairpin 0. Upon contact with blood, these reagents are rehydrated and activated. Each reservoir contains a CHA buffer (PBS) along with Hairpin 2. To control the start of the CHA reaction, a peptide nucleic acid (PNA) sequence, synthetic and complementary to the initiator, will be placed at the junction between each microchannel and its reservoir. This ensures that only the specific sequence can pass into the reservoir, effectively separating each reservoir from its corresponding channel. (6)
For detection, each reservoir integrates a three-electrode electrochemical system, consisting of:
- a gold (Au) working electrode (sensing surface),
- a platinum (Pt) counter electrode, and
- a saturated calomel reference electrode (SCE).
The Au electrode is functionalized with thiol-modified DNA hairpin probes (H1) via Au–S bonds. To minimize nonspecific adsorption, the remaining gold surface is blocked with 6-mercapto-1-hexanol (MCH), forming a stable and specific sensing interface (MCH/H1/Au). This modified electrode, integrated with our CRISPR/Cas13a–CHA amplification strategy, enables highly sensitive electrochemical detection of microRNA biomarkers. (13)
Measurement Mechanism
In the above section we explained the main components of the device, in which the reactions of our biological mechanism take place. In this section we will explain the electronic components that take the measurements.
Measurement Logic
As mentioned previously, our system uses electrodes for signal detection. The system has H1 DNA strands attached to those electrodes. When sepsis biomarkers are present, the CRISPR/Cas and CHA reactions allow H2 DNA strands to bind to these H1 strands. Each H2 strand carries a molecule called Methylene Blue (MB), which can transfer electrons and produce a measurable current when voltage is applied. When H2 binds to H1, the MB is brought close to the electrode, allowing electron transfer and generating a signal in the form of current. The more biomarker is present, the more H2 strands bind, bringing more MB to the electrode and resulting in a stronger signal. Therefore, a stronger signal indicates greater biomarker expression, and we can quantify the biomarkers from the magnitude of the signal. The technique used to apply voltage to the electrodes and measure currents is called Differential Pulse Voltammetry (DPV).
Differential Pulse Voltammetry
Differential Pulse Voltammetry (DPV) is an electrochemical technique that measures very small electrical currents to detect biological molecules with high precision. In our system it relies on methylene blue (MB), a dye molecule that can undergo redox reactions: it can gain electrons (reduction) or lose electrons (oxidation) when a suitable potential is applied. Every redox-active molecule has a characteristic redox potential, which acts like an electrical fingerprint. For methylene blue this is around -0.2 V, so when the applied voltage reaches this value, the MB molecules start exchanging electrons with the electrode surface and create a measurable current (the faradaic current) that serves as our detection signal. DPV applies the voltage in a pulsed pattern: it increases the base voltage in small steps and superimposes a short pulse on each step, and when the voltage reaches MB's redox potential, a peak in current indicates the presence of our target molecules. However, applying any voltage to an electrode also produces an unwanted background (non-faradaic, or charging) current that can overwhelm the real signal. DPV therefore samples the current twice per step, once just before the pulse and once near the end of the pulse, and subtracts the two values, which cancels most of the background and leaves the MB signal. This differential measurement makes DPV highly sensitive and allows us to detect very small amounts of biological targets that would otherwise be lost in the background.
DPV example
In our detection system, voltage pulses are applied to each electrode, and the resulting current is measured at each pulse step. When the applied voltage reaches the redox potential of methylene blue (~-0.2V), electron transfer occurs between the MB molecules and the electrode surface, generating a characteristic current peak that serves as our measurable signal. Since our biosensor is designed to detect two different biomarkers simultaneously, we employ a dual-electrode setup where each electrode is functionalized to specifically capture one target biomarker. DPV is performed independently on both electrodes, allowing us to obtain two distinct current peaks - one corresponding to each biomarker's concentration. The magnitude of each current peak directly correlates with the amount of the respective biomarker present in the sample, enabling quantitative dual-biomarker detection in a single measurement.
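The differential measurement described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the faradaic response is modelled as a simple sigmoidal wave centred at the MB redox potential, the background is a constant offset, and the step size, pulse amplitude, wave steepness and current values are illustrative numbers that do not come from our design or from the EmStat Pico.

```python
import math

E0_MB = -0.2        # approximate redox potential of methylene blue (V), from the text
SLOPE = 0.0128      # assumed steepness of the faradaic wave (V), illustrative only
BACKGROUND = 5.0    # assumed constant non-faradaic current (uA), illustrative only


def total_current(potential, i_lim):
    """Simulated electrode current (uA): sigmoidal faradaic wave plus background."""
    faradaic = i_lim / (1.0 + math.exp(-(potential - E0_MB) / SLOPE))
    return faradaic + BACKGROUND


def dpv_scan(i_lim, start=-0.5, stop=0.1, step=0.005, pulse=0.05):
    """Return (base potentials, differential currents) for one simulated sweep."""
    n_steps = int(round((stop - start) / step)) + 1
    potentials, deltas = [], []
    for k in range(n_steps):
        base = start + k * step
        before = total_current(base, i_lim)          # sampled just before the pulse
        during = total_current(base + pulse, i_lim)  # sampled at the end of the pulse
        potentials.append(base)
        deltas.append(during - before)               # the constant background cancels
    return potentials, deltas


def peak(potentials, deltas):
    """Return (potential, height) of the largest differential current."""
    idx = max(range(len(deltas)), key=deltas.__getitem__)
    return potentials[idx], deltas[idx]


if __name__ == "__main__":
    for i_lim in (1.0, 2.0):  # i_lim stands in for the amount of MB at the electrode
        e_peak, height = peak(*dpv_scan(i_lim))
        print(f"i_lim={i_lim:.1f} uA -> peak {height:.3f} uA at {e_peak:.3f} V")
```

In this toy model the peak height scales linearly with `i_lim`, while the 5 uA background never appears in the output. The peak sits half a pulse amplitude below the redox potential because each differential value is plotted against the base potential.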
Electronic Components
There are similar available mechanisms in recent literature. For instance, Cui et al. (2021) developed an ultrasensitive electrochemical biosensing platform for microRNA-21 assay and quantification. Their approach employed Differential Pulse Voltammetry (DPV) to measure biosensing electrodes following CRISPR-Cas and catalytic hairpin assembly (CHA) reactions, generating amplified electrochemical signals proportional to target miRNA-21 concentrations. However, their implementation required a large electrochemical workstation (CHI660E) with an integrated potentiostat for voltage waveform generation—a solution that is both costly and impractical for portable diagnostic applications.
To address this limitation, our design prioritizes miniaturization and cost-effectiveness while maintaining analytical performance. We selected the EmStat Pico (7), a compact, pre-calibrated dual-channel potentiostat that enables simultaneous measurement of two biomarkers through our double electrode configuration. For system control and connectivity, we propose the ESP32-S3-MINI-1 (8) microcontroller, which offers Wi-Fi capabilities in a compact, cost-effective package suitable for wireless data transmission and device management.
Our portable system architecture includes an Adafruit Mini LiPo Battery (3.7V, 4400 mAh) (9) providing full-day operation capacity, complemented by the PowerBoost 1000C (10) for efficient charging and voltage regulation. User interface functionality is achieved through an SSD1306 / SH1106 (11)(12) OLED display, chosen for its low power consumption and clear presentation of diagnostic results.
A high-level block diagram illustrating the device's operational logic is presented below.
Device Diagram
Our project primarily focused on biomarker identification and biological mechanism development rather than hardware implementation. The proposed device represents a conceptual design framework demonstrating how our biological innovations could be integrated into a portable diagnostic platform. Since the system was not physically constructed or tested, the specified components serve as illustrative examples of the technological approach rather than optimized hardware selections. This theoretical framework establishes the foundation for future engineering development while validating the feasibility of our biomarker detection methodology.
Below, we explain the software required for such an application.
Diagnosis Software
The diagnosis is achieved by providing data from each measurement as input to a machine learning diagnostic software. What follows is the core idea of how this software works.
More Than Just Sepsis Detection
As demonstrated in the biomarker selection section of the modeling page, miR-150-5p and miR-486-5p are highly reliable biomarkers for sepsis. However, their significance extends beyond diagnosis, as studies suggest they also hold prognostic value, providing insight into the severity of a patient's condition.
Severity can be captured by the SOFA score (Sequential Organ Failure Assessment score), a clinical tool used in intensive care units (ICUs) to track a patient's status and assess organ dysfunction or failure. A higher score indicates more severe organ failure and a higher risk of death.
- Vasilescu et al. (2009) showed a negative correlation between miR-150-5p's expression and the SOFA score.
- Sun & Guo (2021) showed a positive correlation between miR-486-5p's expression and the SOFA score.
From Vasilescu et al. (2009)
From Sun & Guo (2021)
Furthermore, How et al. (2015) demonstrated that the downregulation of miR-150-5p is correlated with gram-negative bacteria-induced sepsis.
Based on this information, we propose the following approach. By collecting data that correlates miRNA expression with patient condition (sepsis/no sepsis), SOFA score, and the presence of gram-negative bacteria, we can design machine learning software that analyzes miRNA expression values. This software would first determine whether a patient has sepsis. If so, it would then estimate the severity as a SOFA score and identify whether the condition is related to gram-negative bacteria.
What is a Machine Learning Classifier?
The Machine Learning models proposed above are called "classifiers" because they categorize data into distinct groups or classes. These models operate by learning from a training dataset composed of patient samples, where each patient is characterized by specific features (in our case, miRNA expression levels) and assigned a known class label (such as "septic" or "non-septic"). During the training process, the classifier algorithms analyze the miRNA expression patterns and identify which combinations of expression values are typically associated with septic patients versus non-septic patients, essentially learning to recognize the molecular signatures that distinguish between these two groups. Once trained, when presented with a new patient sample containing only miRNA expression data (without a known sepsis status), the classifier can predict whether that patient is likely to be septic or non-septic based on how closely their miRNA expression pattern matches the learned patterns from the training data. This predictive capability enables the model to serve as a diagnostic tool for identifying sepsis in new patients based solely on their biomarker profiles.
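The train-then-predict idea can be shown with a deliberately simple stand-in. The sketch below uses a nearest-centroid classifier, not the models used in our software, and the expression values are hypothetical numbers chosen only so that septic samples have low miR-150-5p and high miR-486-5p.

```python
def train_centroids(samples, labels):
    """Learn one mean feature vector (centroid) per class label."""
    sums, counts = {}, {}
    for features, label in zip(samples, labels):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Assign the class whose centroid is closest (squared Euclidean distance)."""
    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(features, centroids[label]))
    return min(centroids, key=distance)


# Hypothetical training data: [miR-150-5p, miR-486-5p] expression, arbitrary units.
train_x = [[0.9, 0.2], [0.8, 0.3], [0.2, 0.8], [0.3, 0.9]]
train_y = ["non-septic", "non-septic", "septic", "septic"]
model = train_centroids(train_x, train_y)

if __name__ == "__main__":
    # A new patient with unknown status is assigned to the closest learned pattern.
    print(predict(model, [0.25, 0.85]))
```

A more expressive classifier follows the same two steps, fitting on labelled samples and then predicting for unlabelled ones, but can learn more complex decision boundaries than a distance to the class mean.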
To understand how our diagnostic software functions, it is essential to first examine the training data.
Data Engineering
As mentioned above, machine learning models require training datasets whose structure matches the intended application. No publicly available dataset combines the miRNA expression profiles and sepsis classifications needed for our diagnostic system, and a real implementation of the device would require collecting patient data. We therefore generated realistic synthetic data to demonstrate our system's functionality. We accomplished this through data engineering techniques applied to the GEO dataset GSE134358, the same dataset used for our biomarker selection process. This dataset provided a reliable foundation due to its sufficient sample size and relevance to sepsis and miRNA expression analysis. Through computational processing, we transformed this existing dataset to match the structure and characteristics required by our machine learning classifiers. The resulting datasets are not intended for real use; they serve only to demonstrate how our models would learn from real data. The complete data engineering pipeline, including the R code developed by our team and the resulting synthetic datasets, is available in our GitLab repository in the Diagnosis Software directory, with detailed documentation provided in the accompanying README.md file. For a comprehensive understanding of the dataset structure and its application in model training, the reader can refer to the biomarker selection section of our modeling page.
Our device measures current, but the available dataset contains fluorescence intensity values for each miRNA. We transformed fluorescence to concentration using the Langmuir model from Gharaibeh et al. (2010), then converted concentrations to currents to create realistic training data. The model is:
I = ac/(b+c) + d
- I is the measured fluorescence intensity
- a is the maximum signal at saturation, estimated as the difference between the 95th percentile and the 5th percentile of intensity values
- b is the half-saturation constant, set to an initial value of 1.0
- c is the estimated concentration
- d is the background signal, estimated as the 5th percentile (lower quantile) of all intensity values
By estimating the parameters from raw data and solving for c, we can determine the concentrations.
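Solving the model for c gives c = b(I - d) / (a - (I - d)), which is defined only for intensities between d and a + d. The sketch below shows this inversion together with the percentile-based parameter estimates listed above. It is not the R code from our repository: the function names are hypothetical, the percentile uses plain linear interpolation, and rejecting out-of-range intensities with an error is an assumption of this example.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in 0..100) of a list of numbers."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def estimate_parameters(intensities, b=1.0):
    """Estimate (a, b, d): d = 5th percentile, a = 95th percentile - d, b fixed."""
    d = percentile(intensities, 5)
    a = percentile(intensities, 95) - d
    return a, b, d


def langmuir_intensity(c, a, b, d):
    """Forward model: I = a*c / (b + c) + d."""
    return a * c / (b + c) + d


def langmuir_concentration(intensity, a, b, d):
    """Inverse model: c = b*(I - d) / (a - (I - d))."""
    signal = intensity - d
    if not 0.0 <= signal < a:
        raise ValueError("intensity outside the invertible range [d, a + d)")
    return b * signal / (a - signal)


if __name__ == "__main__":
    a, b, d = 100.0, 1.0, 10.0  # illustrative parameter values
    i_obs = langmuir_intensity(2.5, a, b, d)
    print(langmuir_concentration(i_obs, a, b, d))
```

Because b is fixed at 1.0 rather than fitted, the recovered concentrations are on a relative scale, which is one reason this mapping should be treated as an approximation.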
Following this process, to map the concentrations to realistic current values we utilized the formula from Cui et al. (2021). They found that the relationship between the concentration (pM) of miR-21 and the current (μA) measured using differential pulse voltammetry (DPV) is the one shown below.
I(μA) = 0.2543log(Ctarget(pM)) + 0.5970
By applying two mappings — fluorescence → concentration and concentration → current — we transformed our raw dataset into one that resembles what we would use for training. The first mapping is only an approximation, and the second is specific to the system described in paper (13). This means our data are realistic but not accurate enough for real model training. However, they are sufficient to demonstrate how our mechanism works.
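The second mapping can be sketched as follows. The slope and intercept are the values quoted above; treating "log" as the base-10 logarithm is an assumption of this example, and the function names are hypothetical.

```python
import math

SLOPE_UA = 0.2543      # slope of the calibration quoted in the text (uA per decade)
INTERCEPT_UA = 0.5970  # intercept of the calibration quoted in the text (uA)


def concentration_to_current(c_pm):
    """Map a concentration in pM to a DPV peak current in uA (log10 assumed)."""
    if c_pm <= 0:
        raise ValueError("concentration must be positive")
    return SLOPE_UA * math.log10(c_pm) + INTERCEPT_UA


def current_to_concentration(i_ua):
    """Inverse mapping: recover the concentration in pM from a peak current."""
    return 10 ** ((i_ua - INTERCEPT_UA) / SLOPE_UA)


if __name__ == "__main__":
    for c in (1.0, 10.0, 100.0):
        print(f"{c:6.1f} pM -> {concentration_to_current(c):.4f} uA")
```

Under this assumption each tenfold increase in concentration adds 0.2543 μA to the peak current, so the current grows much more slowly than the concentration itself.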
Our software uses three machine learning (ML) models, as shown below.
1. Sepsis Detection Model
• Trained on the generated dataset containing miRNA expression profiles and a class label (sepsis/no sepsis).
• Produces two probability outputs.
• A probability threshold (e.g., 80%) determines the next step:
→ If probability < threshold: prints result and stops.
→ If probability ≥ threshold: classifies patient as septic and proceeds to the next models.
2. Gram-Negative Identification Model
• Trained on a subset of the dataset with only septic patients.
• Uses class labels (gram-negative/non-gram-negative).
• Since low miR-150-5p expression is linked to gram-negative infections (14), we took the 20% of patients with the lowest miR-150-5p expression and labeled each of them as gram-negative with 90% probability; the random element adds noise and keeps the dataset realistic.
3. Severity Estimation Model (SOFA Score Prediction)
• Uses the same dataset with new class labels (SOFA score 1–24).
• Lower miR-150-5p or higher miR-486-5p expression increases the score.
• The score range was divided into 24 equal intervals, assigning each patient a SOFA score based on their position in the range.
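The synthetic labels for the second and third models can be sketched as below. Several details are assumptions of this example rather than a description of our R pipeline: patients outside the lowest 20% are labeled non-gram-negative, the severity value is taken as miR-486-5p minus miR-150-5p, and the function names and seed are hypothetical.

```python
import random


def label_gram_negative(mir150, fraction=0.20, probability=0.90, seed=0):
    """Flag the lowest `fraction` of miR-150-5p values as gram-negative,
    each with the given probability, so the labels contain some noise."""
    rng = random.Random(seed)
    n_low = int(round(fraction * len(mir150)))
    lowest = set(sorted(range(len(mir150)), key=mir150.__getitem__)[:n_low])
    return [i in lowest and rng.random() < probability for i in range(len(mir150))]


def assign_sofa(mir150, mir486, n_bins=24):
    """Bin an assumed severity value into n_bins equal intervals (scores 1..n_bins)."""
    # Assumed combination: lower miR-150-5p or higher miR-486-5p raises severity.
    severity = [high - low for low, high in zip(mir150, mir486)]
    lo, hi = min(severity), max(severity)
    span = (hi - lo) or 1.0
    scores = []
    for value in severity:
        position = (value - lo) / span  # 0.0 .. 1.0 within the observed range
        scores.append(min(int(position * n_bins) + 1, n_bins))
    return scores


if __name__ == "__main__":
    print(assign_sofa([1.0, 0.5, 0.0], [0.0, 0.5, 1.0]))
    print(sum(label_gram_negative([float(i) for i in range(100)])))
```

Fixing the random seed keeps the generated labels reproducible between runs, which makes it easier to compare models trained on the same synthetic dataset.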
With the datasets and models defined, we can now explain the software workflow in more detail.
Software Demonstration
The Aechmi system follows a client–server architecture designed for continuous monitoring. The ESP32-based device acts as the client, automatically performing electrochemical DPV measurements from two electrodes every two hours over a 24-hour period (12 measurements in total). After each measurement cycle, the device packages the peak currents as JSON and sends them via HTTPS to the remote diagnosis server. The server then uses trained models to determine sepsis probability, Gram-negative status, and SOFA score, then returns a diagnosis string.
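The exchange between device and server can be sketched as follows. This is not the code in client.ino or server.py: the JSON field names, the function names and the wording of the diagnosis string are hypothetical, the HTTPS transport is omitted, and the three models are passed in as plain callables. The 80% threshold is the example value mentioned earlier.

```python
import json

SEPSIS_THRESHOLD = 0.80  # example threshold mentioned in the text


def build_payload(measurement_index, peak_mir150_ua, peak_mir486_ua):
    """Client side: package one measurement cycle as a JSON string."""
    return json.dumps({
        "measurement": measurement_index,  # 1..12 within the 24-hour run
        "peak_current_ua": {
            "miR-150-5p": peak_mir150_ua,
            "miR-486-5p": peak_mir486_ua,
        },
    })


def diagnose(payload, sepsis_model, gram_model, sofa_model):
    """Server side: run the three-model cascade and return a diagnosis string."""
    peaks = json.loads(payload)["peak_current_ua"]
    features = [peaks["miR-150-5p"], peaks["miR-486-5p"]]
    p_sepsis = sepsis_model(features)
    if p_sepsis < SEPSIS_THRESHOLD:
        # Below the threshold the cascade stops after the first model.
        return f"Sepsis probability {p_sepsis:.0%}: below threshold"
    gram = "gram-negative" if gram_model(features) else "non-gram-negative"
    return f"Sepsis probability {p_sepsis:.0%}, {gram}, SOFA {sofa_model(features)}"


if __name__ == "__main__":
    payload = build_payload(1, 0.41, 0.93)
    # Stub models standing in for the trained classifiers.
    print(diagnose(payload, lambda f: 0.93, lambda f: True, lambda f: 11))
```

Keeping the models behind simple callables means the server can swap in retrained models without any change to the payload format the device sends.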
In the Diagnosis Software folder of our GitLab repository, there is a subfolder named demonstration. This folder contains two example files: client.ino, which provides sample code for the portable device, and server.py, which serves as an example implementation of the server-side application.
The client.ino code was also incorporated into a graphical user interface (GUI) application, called device.py, to provide an interactive demonstration of our mechanism. When executed, the application opens a GUI with two sliders that let users set the measured value for each biomarker, along with a display area that presents the resulting diagnosis in real time. At its core, the program relies on three machine learning classifiers, specifically three multilayer perceptrons (MLPs) trained on the synthetic data described in the previous subsection. Examples of both a low sepsis probability case and a confirmed sepsis diagnosis are provided below for illustration.
Sepsis case
No sepsis case
Device Scalability
The diagnostic models can be retrained without individually dismantling each device. The code on the central server can be remotely updated, allowing for the use of new models with different parameters. As each correctly diagnosed case is added to the training data, the models become more accurate over time. This ensures the device continuously improves rather than becoming obsolete as it ages.
The device is a template diagnostic tool that can be extended to accommodate more biomarkers. This is achieved by retraining the models and making minimal hardware changes. More specifically, each new biomarker requires a duplicate detection channel, including a reservoir, electrode, and other components. Electrode multiplexing would also be required.
The CRISPR/Cas system is biomarker-specific and must be carefully redesigned for each new biomarker. However, the CHA mechanism can be reused as is, because the CHA components do not depend on the target sequence.
To demonstrate the device's scalability and the power of additional biomarkers, we trained machine learning models on more than two biomarkers and performed feature engineering. Through this process, we developed a model that achieved a ROC-AUC of approximately 90%. For an introduction to classifier evaluation metrics such as ROC-AUC, the reader can refer to the biomarker selection section of our modeling page. All of the code is available in our GitLab repository in the "Device Scalability" directory, and the entire process is documented in the subsections below.
Data Engineering
We used the original GSE134358 dataset because it consists of real, preprocessed expression values and was therefore likely to yield good results.
We ranked all available biomarkers based on a metric we called composite score. The composite score is the weighted sum of scores assigned to each biomarker by different data analysis/ML methods.
The methods utilized to compute feature importances as scores were the following.
- F-Score (ANOVA): Statistical test measuring linear dependency between features and target using variance analysis (18)
- Mutual Information: Non-linear dependency measure based on information theory that captures complex relationships (19)
- Chi-Square: Statistical test for independence between categorical variables (applied to discretized continuous data) (20)
- Random Forest: Tree-based feature importance from ensemble voting, measuring how much each feature decreases impurity (21)
- Extra Trees: Randomized tree-based importance with additional randomness in both feature and threshold selection (22)
- XGBoost: Gradient boosting feature importance based on split gains and frequency of feature usage in trees (23)
- Logistic Regression: Absolute coefficients from linear model weights after standardization (24)
- RFE (Recursive Feature Elimination): Backwards selection with ranking that iteratively removes least important features (25)
Each of these methods assigned a score to each biomarker. The scores were then normalized, and their weighted sum provided the final score for each biomarker.
Composite Score = 0.15×F_Score + 0.15×MI + 0.10×Chi² + 0.15×RF + 0.15×ET + 0.10×XGB + 0.10×LR + 0.10×RFE
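The weighted sum can be sketched as below. The weights are those of the formula above; min-max scaling as the normalization step, the assumption that a higher raw score always means a more important biomarker, and the function names are choices of this example and may differ from our script.

```python
WEIGHTS = {
    "F_Score": 0.15, "MI": 0.15, "Chi2": 0.10, "RF": 0.15,
    "ET": 0.15, "XGB": 0.10, "LR": 0.10, "RFE": 0.10,
}


def min_max(scores):
    """Scale one method's raw scores to the range 0..1 (assumed normalization)."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {name: (value - lo) / span for name, value in scores.items()}


def composite_scores(method_scores):
    """method_scores: {method: {biomarker: raw score}} -> {biomarker: composite}."""
    composite = {}
    for method, weight in WEIGHTS.items():
        for biomarker, value in min_max(method_scores[method]).items():
            composite[biomarker] = composite.get(biomarker, 0.0) + weight * value
    return composite


def top_biomarkers(method_scores, n=30):
    """Return the n biomarkers with the highest composite score."""
    ranked = sorted(composite_scores(method_scores).items(),
                    key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:n]]


if __name__ == "__main__":
    # Hypothetical raw scores: every method ranks A > B > C.
    toy = {method: {"A": 3.0, "B": 2.0, "C": 1.0} for method in WEIGHTS}
    print(top_biomarkers(toy, n=2))
```

Since the eight weights sum to 1, a biomarker ranked first by every method receives a composite score of 1 and one ranked last by every method receives 0.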
The Python script that performs the above score calculations is called "miRNA_pca_variance.py"; running it yields the 30 best biomarkers.
Top 30 Biomarkers:
1. hsa-miR-150-5p
2. hsa-miR-342-3p
3. hsa-miR-342-5p
4. hsa-miR-1275
5. hsa-miR-320d
6. hsa-miR-27a-3p
7. hsa-miR-320e
8. hsa-miR-1273g-3p
9. hsa-miR-320c
10. hsa-miR-4516
11. hsa-miR-486-5p
12. hsa-miR-4772-3p
13. hsa-miR-143-3p
14. hsa-miR-1273h-3p
15. hsa-miR-1587
16. hsa-miR-320b
17. hsa-miR-23a-3p
18. hsa-miR-26a-5p
19. hsa-miR-15a-5p
20. hsa-miR-25-3p
21. hsa-miR-320a
22. hsa-miR-2909
23. hsa-miR-22-3p
24. hsa-miR-29a-5p
25. hsa-miR-1272
26. hsa-miR-4513
27. hsa-miR-451b
28. hsa-miR-29c-5p
29. hsa-miR-1910-5p
30. hsa-miR-135a-3p
The dataset was now reduced from hundreds of features to just 30. Each patient was a point in a 30-dimensional space.
We then applied a quantile transformation (26) to the data so that they follow a normal distribution, which reduces the effect of outliers. Following that, we performed Principal Component Analysis (PCA) (27) to create our own biomarkers by combining the values of existing ones. To understand PCA, it helps to first visualize the data.
For example, suppose we measure two biomarkers for each patient, X1 and X2. If we plot these biomarker values on two perpendicular axes, we can form a grid, placing each patient according to their miRNA expression levels. PCA allows us to reduce the number of features from these two original measurements to a single engineered feature.

Principal Component Analysis Visualization
In the visualization, the yellow dots represent the positions of the patients when the original features are replaced by a new engineered feature, the red axis. As the axis rotates, the projections of the original points change. The axis that captures the greatest variance among the points is called the first principal component. In this example, it would be the line oriented at a 45° angle to the original axes. For datasets with more features, additional principal components can be identified. In our analysis, we selected enough principal components to explain 95% of the variance in the data, reducing overfitting.
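The two-biomarker example can be reproduced numerically. The sketch below uses the closed-form eigen-decomposition of a 2×2 covariance matrix, so it only covers the two-feature case; the data points are made up for illustration and the function names are hypothetical.

```python
import math


def pca_2d(points):
    """Return (axis angle in degrees, explained-variance ratios, projections)."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    var_x = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    var_y = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    cov_xy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    # Closed-form eigen-decomposition of the symmetric 2x2 covariance matrix.
    angle = 0.5 * math.atan2(2.0 * cov_xy, var_x - var_y)
    spread = math.hypot(var_x - var_y, 2.0 * cov_xy)
    lambda1 = 0.5 * (var_x + var_y + spread)  # variance along the first component
    lambda2 = 0.5 * (var_x + var_y - spread)  # variance along the second component
    axis = (math.cos(angle), math.sin(angle))
    projections = [(p[0] - mean_x) * axis[0] + (p[1] - mean_y) * axis[1]
                   for p in points]
    total = lambda1 + lambda2
    return math.degrees(angle), [lambda1 / total, lambda2 / total], projections


def components_needed(ratios, target=0.95):
    """Smallest number of leading components whose variance reaches the target."""
    cumulative = 0.0
    for count, ratio in enumerate(ratios, start=1):
        cumulative += ratio
        if cumulative >= target - 1e-12:
            return count
    return len(ratios)


if __name__ == "__main__":
    # Made-up (X1, X2) values for six patients, positively correlated.
    patients = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
    angle, ratios, _ = pca_2d(patients)
    print(f"first component at {angle:.1f} degrees, "
          f"explains {ratios[0]:.1%}, keep {components_needed(ratios)} component(s)")
```

For these six points the first component lies at 45° and explains about 91% of the variance, so a 95% target would still keep both components. With many correlated features, the same rule typically retains far fewer components than there are original features.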
Results
Through trial and error, we discovered that an Extra Trees Classifier (28) performed the best at classifying our engineered data. The same script mentioned above, using 5-fold cross validation and ROC-AUC as its evaluation metric, reached an average score of 90%.
ROC curves of the 5 folds
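For readers unfamiliar with the metric, ROC-AUC can be computed directly as the probability that a randomly chosen septic sample receives a higher score than a randomly chosen non-septic one. The sketch below shows that calculation and a simple way of forming five folds. It does not reproduce our script: the interleaved fold assignment and the function names are assumptions of this example.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.
    Ties count as half a correct pair."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def k_fold_indices(n_samples, k=5):
    """Yield (train, test) index lists; fold i tests every k-th sample from i."""
    for fold in range(k):
        test = [i for i in range(n_samples) if i % k == fold]
        train = [i for i in range(n_samples) if i % k != fold]
        yield train, test


def mean_auc(fold_aucs):
    """Average the per-fold ROC-AUC values into a single score."""
    return sum(fold_aucs) / len(fold_aucs)


if __name__ == "__main__":
    # Four samples: one septic sample is scored below one non-septic sample.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, so an average of 0.90 means that roughly nine out of ten septic/non-septic pairs are ordered correctly.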
In this demonstration, we showed that further data analysis and a combination of more biomarkers can lead to more accurate classification of patients. Specifically, a combination of the 30 best biomarkers, quantile transformation, PCA and an Extra Trees classifier yielded an average ROC-AUC of 90% under 5-fold cross-validation. This suggests that a careful selection of additional biomarkers, together with the right data processing, could scale the template device we proposed into a considerably more powerful diagnostic tool.
References
- Bradley, Z., & Bhalla, N. (2023). Point-of-care diagnostics for sepsis using clinical biomarkers and microfluidic technology. Biosensors and Bioelectronics, 227, 115181. https://doi.org/10.1016/j.bios.2023.115181
- Max, K. E. A., Bertram, K., Akat, K. M., Bogardus, K. A., Li, J., Morozov, P., Ben-Dov, I. Z., Li, X., Weiss, Z. R., Azizian, A., Sopeyin, A., Diacovo, T. G., Adamidi, C., Williams, Z., & Tuschl, T. (2018). Human plasma and serum extracellular small RNA reference profiles and their clinical utility. Proceedings of the National Academy of Sciences, 115(23), E5334–E5343. https://doi.org/10.1073/pnas.1714397115
- Li, Y., Liang, L., Zhang, C., & Yang, C. (2013). Isothermally sensitive detection of serum circulating miRNAs for lung cancer diagnosis. Analytical Chemistry, 85(22), 11174–11179. https://doi.org/10.1021/ac403462f
- Blicharz, T. M., Gong, P., Bunner, B. M., et al. (2018). Microneedle-based device for the one-step painless collection of capillary blood samples. Nature Biomedical Engineering, 2(3), 151–157. https://doi.org/10.1038/s41551-018-0194-1
- Mukerjee, E. V., Collins, S. D., Isseroff, R. R., & Smith, R. L. (2004). Microneedle array for transdermal biological fluid extraction and in situ analysis. Sensors and Actuators A: Physical, 114(2–3), 267–275. https://doi.org/10.1016/j.sna.2003.11.008
- Cadoni, E., Manicardi, A., & Madder, A. (2020). PNA-based microRNA detection methodologies. Molecules, 25(6), 1296. https://doi.org/10.3390/molecules25061296
- PalmSens. (2023). EmStat Pico data sheet (Rev. 7-2023-011). PalmSens. https://assets.palmsens.com/app/uploads/2023/04/PSDAT-ESP-EmStat-Pico-Datasheet.pdf
- Espressif Systems. (2021, November 16). ESP32-S3-MINI-1 & MINI-1U datasheet (v1.51). Espressif. https://www.espressif.com/sites/default/files/documentation/esp32-s3-mini-1_mini-1u_datasheet_en.pdf
- Adafruit Industries. (n.d.). Lithium ion battery pack – 3.7 V 4400 mAh [Datasheet]. Adafruit. https://www.verical.com/datasheet/adafruit-batteries-chargeable-%26amp%3B-non-chargeable-354-5047050.pdf
- Adafruit Industries. (2024, June 3). PowerBoost 1000C load-share USB charge + boost (Datasheet). Adafruit. https://cdn-learn.adafruit.com/downloads/pdf/adafruit-powerboost-1000c-load-share-usb-charge-boost.pdf
- Solomon Systech. (n.d.). SSD1306 OLED display driver datasheet. Solomon Systech. https://cdn-shop.adafruit.com/datasheets/SSD1306.pdf
- Velleman. (n.d.). SH1106 datasheet (CMOS OLED driver). Velleman. https://cdn.velleman.eu/downloads/29/infosheets/sh1106_datasheet.pdf
- Cui, Y., Fan, S., Yuan, Z., Song, M., Hu, J., Qian, D., Zhen, D., Li, J., & Zhu, B. (2021). Ultrasensitive electrochemical assay for microRNA-21 based on CRISPR/Cas13a-assisted catalytic hairpin assembly. Talanta, 224, 121878. https://doi.org/10.1016/j.talanta.2020.121878
- Vasilescu, C., Rossi, S., Shimizu, M., Tudor, S., Veronese, A., Ferracin, M., Nicoloso, M. S., Barbarotto, E., Popa, M., Stanciulea, O., Fernandez, M. H., Tulbure, D., Bueso-Ramos, C. E., Negrini, M., & Calin, G. A. (2009). MicroRNA fingerprints identify miR-150 as a plasma prognostic marker in patients with sepsis. PLoS ONE, 4(10), e7405. https://doi.org/10.1371/journal.pone.0007405
- Sun, B., & Guo, S. (2021). miR-486-5p serves as a diagnostic biomarker for sepsis and its predictive value for clinical outcomes. Journal of Inflammation Research, 14, 3687–3695. https://doi.org/10.2147/JIR.S321040
- How, C.-K., Hou, S.-K., Shih, H.-C., Huang, M.-S., Chiou, S.-H., Lee, C.-H., & Juan, C.-C. (2015). Expression profile of microRNAs in gram-negative bacterial sepsis. Shock, 43(2), 121–127. https://doi.org/10.1097/SHK.0000000000000282
- Gharaibeh, R. Z., Fodor, A. A., & Gibas, C. J. (2010). Accurate estimates of microarray target concentration from a simple sequence-independent Langmuir model. PLoS ONE, 5(12), e14464. https://doi.org/10.1371/journal.pone.0014464
- Knockmeier, V., & Elfving, H. (2023). Feature selection for sensor failure detection in manufacturing. DiVA portal. https://www.diva-portal.org/smash/get/diva2%3A1772363/FULLTEXT01.pdf
- Beraha, M., Metelli, A. M., Papini, M., Tirinzoni, A., & Restelli, M. (2019). Feature selection via mutual information: New theoretical insights. arXiv. https://arxiv.org/abs/1907.07384
- Huang, H., & Zhang, Y. (2021). Feature selection using an improved Chi-square for Arabic text classification. Journal of King Saud University - Computer and Information Sciences. Advance online publication. https://doi.org/10.1016/j.jksuci.2017.10.010
- Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32. https://doi.org/10.1023/A:1010933404324
- Geurts, P., Ernst, D., & Wehenkel, L. (2006). Extremely randomized trees. Machine Learning, 63(1), 3–42. https://doi.org/10.1007/s10994-006-6226-1
- Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining (pp. 785–794). ACM. https://doi.org/10.1145/2939672.2939785
- Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267–288. https://doi.org/10.1111/j.2517-6161.1996.tb02080.x
- Guyon, I., Weston, J., Barnhill, S., & Vapnik, V. (2002). Gene selection for cancer classification using support vector machines. Machine Learning, 46(1–3), 389–422. https://doi.org/10.1023/A:1012487302797
- Assaf, S., Forman, N., & Pitman, J. (2013). The quantile transform of a simple walk. Electronic Journal of Probability, 18, 1–16. https://doi.org/10.1214/EJP.v18-2061
- Jolliffe, I. T., & Cadima, J. (2016). Principal component analysis: A review and recent developments. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 374(2065), 20150202. https://doi.org/10.1098/rsta.2015.0202