Dry Lab
One of the challenges this project faces is determining the optimal period during which the enzyme should be introduced to the blood samples. RBC units are stored at 1–6 °C for up to 42 days; once removed from storage, they must be returned within 30 minutes or transfused within 4 hours [1]. For a standard 450 mL unit, our current estimate is that the enzyme should incubate in whole blood for at least ~1 hour inside the filtration bag. An enzyme that operates well at 4–10 °C would therefore be able to cleave antigen groups prior to a transfusion without interrupting the workflow or prolonging the overall blood prep.
One solution to these temperature and time restrictions is to use cold-adapted enzymes, also known as psychrophilic enzymes, which can catalyze biochemical reactions at low temperatures, typically below 20°C [6]. Unlike conventional (mesophilic) enzymes, which lose efficiency in the cold, these enzymes have unique structural and sequence adaptations that produce a looser, more flexible structure, allowing them to remain active where most biochemical reactions would otherwise slow down significantly.
The goal of the Dry Lab sub-team was therefore to identify psychrophilic counterparts to the five enzymes identified and used by the Wet Lab team.
Project 1: Computational Identification of Psychrophilic Enzymes using CAZy and NCBI
Goal: Parsing through large databases to find desired psychrophilic enzymes for cloning is time-intensive and requires referencing multiple databases and cross-referencing literature to ensure that the enzyme is suitable for colder temperature conditions. To automate this process, we developed a program that can identify specific enzymes from a given protein family and predict whether the enzymes belong to a psychrophilic organism.
Iteration 1: Using CAZy And The NCBI GenBank Database To Determine A Particular Enzyme.
Why CAZy and NCBI?
The CAZy database displays and analyzes genomic and biochemical information on structurally related carbohydrate-active enzymes (CAZymes: enzymes that make, break, and modify oligo- and polysaccharides). NCBI GenBank holds information about each enzyme, such as protein/mRNA/DNA sequences, accession IDs, source organism, and references to the original submission. Both databases are reviewed and updated frequently (every two months) and are maintained alongside other well-known databases such as GenBank, RefSeq, TPA, SwissProt, PIR, PRF, and PDB. Both are also widely used by the scientific community because they are uniform, up to date, and comprehensive, and because they permit unrestricted use and distribution of data. For these reasons, CAZy and NCBI were chosen.
Work done in this project is summarised as a flowchart, given below. The
shortlisted genera were obtained from Morita & Moyer, 2001.
The CAZy Database was used to identify all enzymes found in each of the GH families that were recognised in Akkermansia muciniphila. The families that were used are given below.
Because CAZy lists GenBank IDs for its entries, we were able to cross-reference each entry against the NCBI database. This step was crucial because each family contains different types of enzymes, which vary in the region of the substrate they cleave or in their carbohydrate specificity. It was therefore important to identify enzymes with a function similar to the mesophilic ones identified in Akkermansia muciniphila. Filtering by function first, rather than by temperature, is a major advantage because it significantly reduces the number of entries and expedites the rest of the filtering process.
Initially, this process was done manually for the GH36 and GH110 families: we looked up each GenBank ID and determined whether it was the required enzyme. To keep the manual work manageable, we first shortlisted the entries by comparing them with an established list of psychrophilic genera compiled in [3]. For the GH110 family, the number of entries after this step was significantly lower. For the other families, however, we had around 15,000 to 25,000 entries from CAZy and around 1,000 to 8,000 after shortlisting. Checking these by hand would have taken too much manpower, so we automated the process by accessing the NCBI database remotely through Python; preliminary code is given below:
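The original listing is not reproduced in this text. As a rough sketch only, a lookup of this kind might be structured as follows, assuming Biopython's `Bio.Entrez` module is installed. The function names `fetch_title_from_ncbi` and `filter_by_enzyme_name`, the placeholder e-mail address, and the made-up accessions and titles in the demo are hypothetical and are not taken from our script.

```python
from typing import Callable, Dict, Iterable, List


def fetch_title_from_ncbi(accession: str) -> str:
    """Return the NCBI protein record title for one GenBank accession.

    Assumes Biopython is installed and network access is available.
    """
    from Bio import Entrez  # imported lazily so the filter below also works offline

    Entrez.email = "your.name@example.org"  # placeholder contact address
    handle = Entrez.esummary(db="protein", id=accession)
    try:
        summary = Entrez.read(handle)
    finally:
        handle.close()
    return str(summary[0]["Title"])


def filter_by_enzyme_name(
    accessions: Iterable[str],
    target_name: str,
    fetch_title: Callable[[str], str] = fetch_title_from_ncbi,
) -> List[str]:
    """Keep accessions whose record title mentions the target enzyme name."""
    kept = []
    for accession in accessions:
        title = fetch_title(accession)
        if target_name.lower() in title.lower():
            kept.append(accession)
    return kept


# Offline demo with made-up accessions and titles (hypothetical data).
FAKE_TITLES: Dict[str, str] = {
    "FAKE00001.1": "alpha-galactosidase [Examplea frigida]",
    "FAKE00002.1": "hypothetical protein [Examplea frigida]",
}
demo_hits = filter_by_enzyme_name(
    FAKE_TITLES, "alpha-galactosidase", fetch_title=FAKE_TITLES.__getitem__
)
print(demo_hits)
```

Passing the lookup in as `fetch_title` is a design choice for this sketch: it lets the filtering logic be checked against a small hand-curated set without contacting NCBI.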
To test whether the code was successful, the number of filtered enzymes in its output was compared with the dataset that had been filtered manually.
Iteration 2: Improving Output Readability And Refactoring For Better Usability
At this point, the code had only been run on small datasets, which would not always be the case. We therefore wanted to see whether the filtration process had any problems with larger datasets, and tested it on the entire GH20, GH35, and GH95 families (results are provided in Table 2). In addition, to improve output readability, the output was changed from a list to a dictionary containing the enzyme name (which was compared to the given name), the accession ID, and the organism name.
As seen, the code was
successful in reducing the number of entries. On manual
inspection of the enzyme name that was outputted, most of the enzymes
were filtered and categorized correctly (errors with the process are
discussed later).
The next step was to determine whether the enzymes were psychrophilic. This was done by comparing the genus of the source organism with a shortlist of psychrophilic genera. This list comprised genera that contain some psychrophilic organisms; that is, a match did not guarantee that a given organism was found in cold temperatures. Hence, after the complete filtration, we had to manually look up whether each remaining enzyme is psychrophilic.
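The genus comparison described above can be sketched as follows. This is a minimal illustration, not our actual script: the record keys (`enzyme`, `accession`, `organism`), the function names, and the two-genus shortlist are assumptions, and the real shortlist from [3] is longer.

```python
from typing import Dict, Iterable, List, Tuple

Record = Dict[str, str]


def genus_of(organism: str) -> str:
    """Return the first word of an organism name, treated here as the genus.

    Simplification: names such as 'Candidatus ...' would need extra handling.
    """
    parts = organism.strip().split()
    return parts[0] if parts else ""


def flag_candidates(
    records: Iterable[Record], psychrophilic_genera: Iterable[str]
) -> Tuple[List[Record], List[Record]]:
    """Split records into shortlisted-genus candidates and everything else.

    A candidate still needs manual confirmation, because a shortlisted genus
    only contains *some* psychrophilic organisms.
    """
    shortlist = {genus.lower() for genus in psychrophilic_genera}
    candidates, others = [], []
    for record in records:
        if genus_of(record["organism"]).lower() in shortlist:
            candidates.append(record)
        else:
            others.append(record)
    return candidates, others


# Illustrative shortlist only; not the full list of genera from [3].
example_genera = ["Colwellia", "Polaribacter"]
example_records = [
    {"enzyme": "alpha-galactosidase", "accession": "ARD46000.1",
     "organism": "Colwellia sp."},
    {"enzyme": "alpha-galactosidase", "accession": "FAKE00003.1",  # made-up ID
     "organism": "Akkermansia muciniphila"},
]
to_review, set_aside = flag_candidates(example_records, example_genera)
print([record["accession"] for record in to_review])
```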
Because the enzymes for cleaving the Extended A and B antigens from Wet Lab’s Cloning Cycle were not successful, it was decided that the software team would focus on characterizing only the A and B antigens, comparing both function and structural similarity between the psychrophilic and mesophilic enzymes (Project 2).
Additionally, to make the code easy for others to use or improve, a Python script was written with each part as a function. The main goal was to make it user-friendly by providing methods to manipulate the data and produce readable outputs. The code and information about it can be accessed here: ASU GitLab.
Here is an easy-to-follow user manual made by our team!
Problems Encountered and Future Directions:
- Protein ID not recognized: Some of the protein IDs could not be found through the code, but could be accessed manually. The exact reason for this error is unknown; hence, we decided to include a list of IDs that were rejected in the output of the function.
- Difference in NCBI Title: For this project, the team used the Entrez.esummary function to compare the enzyme names. This function returned a nested list of information about the enzyme, including the Title, which contains the enzyme name and the organism (in square brackets) as one string. For comparison, the string was split at the square brackets, giving two values: the enzyme name and the organism name. However, for the GH20 family, this order was switched. It is unknown whether this is the case for other entries, so future work with the code must take it into consideration.
- Runtime: Currently, for ~18,000 entries, the code takes ~5–6 hours to finish (when “unidentified” enzymes are skipped). This could be reduced by using cazy-webscraper.
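The first two problems above could be handled along the following lines. This is a hedged sketch, not the code in our repository: it assumes that "switched" means the bracketed organism appears before the enzyme name, and the names `split_title` and `collect_titles` are hypothetical.

```python
import re
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# Matches the first "[...]" group in a title; assumed here to be the organism.
BRACKETED = re.compile(r"\[([^\[\]]+)\]")


def split_title(title: str) -> Tuple[str, Optional[str]]:
    """Split a title into (enzyme_name, organism), whichever part comes first.

    Returns (title, None) when no bracketed organism is present.
    Simplification: only the first bracketed group is treated as the organism.
    """
    match = BRACKETED.search(title)
    if match is None:
        return title.strip(), None
    organism = match.group(1).strip()
    remainder = title[:match.start()] + " " + title[match.end():]
    enzyme = re.sub(r"\s+", " ", remainder).strip()
    return enzyme, organism


def collect_titles(
    accessions: Iterable[str], fetch_title: Callable[[str], str]
) -> Tuple[Dict[str, Tuple[str, Optional[str]]], List[str]]:
    """Parse every title that can be fetched and list the IDs that failed."""
    parsed, rejected = {}, []
    for accession in accessions:
        try:
            parsed[accession] = split_title(fetch_title(accession))
        except Exception:  # broad on purpose: a failed lookup is reported, not fatal
            rejected.append(accession)
    return parsed, rejected


print(split_title("alpha-galactosidase [Colwellia sp.]"))
print(split_title("[Colwellia sp.] alpha-galactosidase"))
```

Returning the rejected IDs alongside the parsed ones keeps a long run from stopping at the first unrecognized protein ID, and leaves a list that can be checked by hand afterwards.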
Proof of Concept: Testing Colwellia Psychrophilic Enzyme in Porcine Type O Blood
As a proof of concept stemming from Project 1, where we filtered for psychrophilic enzymes predicted to target the B antigen chain (α-galactosidases), we selected two top candidates, one from Polaribacter sejongensis (AUC20683.1) and one from Colwellia sp. (ARD46000.1), for cloning and bench testing. We successfully cloned the Colwellia sp. candidate and advanced it to functional assays; importantly, the hemagglutination crossmatch assay demonstrated low-temperature activity consistent with our design goal of cold-active antigen cleavage. The results below summarize this initial validation in porcine Type O blood.
To test the effectiveness of the Colwellia cold-adapted (CA) enzyme on blood, we treated O-type porcine blood for 2 hours at two different temperatures. While this α-galactosidase was intended to cleave the human B antigen, it is also capable of cleaving the α-galactose antigen on porcine blood, which is detectable by human antibodies. Figure 5A shows that at 22 °C, the CA and B2 enzymes are not capable of cleaving the α-galactose antigen on porcine blood, resulting in detection by human antibodies and severe clumping. However, Figures 5B and 5C indicate that treatment with CA alone and with CA + B2 is capable of α-galactose cleavage, as shown by a reduced amount of agglutinated cells and an increased prevalence of single red blood cells. These results suggest that the cold-adapted enzyme is functional at colder temperatures, as intended.
Read more about the enzyme results here!
Project 2: Temperature-Based Sequence Alignment and Structure Analysis
Goal: Cold-adapted enzymes have structural and sequence differences from their conventional counterparts that allow them to function at lower temperatures. If we can find areas of similarity among GH families (specifically our chosen GH families) that may allow for low-temperature functionality, we would be able to alter our cloned enzymes to be cold-adapted and compatible with our blood conversion kits.
Iteration 1: Using CLUSTAL to find sequence alignment in identified enzymes.
The overall objective of this iteration was to determine regions of similarity between the three temperature groups of enzymes (thermophilic: 41–122°C, mesophilic: 20–45°C, psychrophilic: <20°C) [6].
These conserved sequences between temperature groups can be used to determine potential mutation sites to alter protein sequences and convert our original 5 enzymes into cold-adapted enzymes. This could give us insight into where active sites are located and critical regions for cold-adapted enzymes.
We compared enzyme sequences using CLUSTAL, a program for aligning DNA and protein sequences. It can be used to determine areas of similarity and dissimilarity and to create phylogenetic trees. For our study, we used CLUSTAL to look for regions necessary for cleaving sugar residues and for psychrophilic function.
The plan of action to complete the CLUSTAL sequence alignment was as follows:
- Identify a list of 5 enzymes per group (thermophilic, mesophilic, and psychrophilic enzymes) based on a literature search and review.
- Obtain protein sequences from the NCBI GenBank Database.
- Run the CLUSTALO program to obtain regions of similarity for all groups.
- Compare regions of similarity in one group to regions of similarity in other groups.
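Once an alignment has been produced, the last two steps can be sketched in a few lines. The snippet below is an illustration under stated assumptions: it takes already-aligned sequences (equal length, `-` for gaps) and reports stretches of fully identical columns, similar in spirit to the `*` marks in Clustal output. The function name `conserved_runs`, the `min_length` threshold, and the toy sequences are hypothetical.

```python
from typing import List, Sequence, Tuple


def conserved_runs(aligned: Sequence[str], min_length: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges where all sequences share one residue.

    Ranges are 0-based and end-exclusive. Columns containing a gap never count
    as conserved. Runs shorter than `min_length` are dropped.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")

    runs: List[Tuple[int, int]] = []
    start = None  # first column of the run currently being extended
    for col in range(length):
        column = {seq[col] for seq in aligned}
        identical = len(column) == 1 and "-" not in column
        if identical and start is None:
            start = col
        elif not identical and start is not None:
            if col - start >= min_length:
                runs.append((start, col))
            start = None
    if start is not None and length - start >= min_length:
        runs.append((start, length))
    return runs


# Toy alignment (made-up sequences, not real enzyme data).
toy_alignment = [
    "MKTAYIAK-QR",
    "MKTGYIAKLQR",
    "MKT-YIAKMQR",
]
print(conserved_runs(toy_alignment))  # [(0, 3), (4, 8)]
```

Running this once per temperature group and comparing the resulting ranges is one simple way to carry out the final comparison step, provided the groups are aligned against a common reference so that column numbers are comparable.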
Within each temperature group, we selected 5 enzymes from different, well-researched bacterial species to identify regions of similarity. We focused on bacteria and enzymes used within the dairy industry to create lactose-free milk and yogurt, and we used our pre-determined 5 enzymes as the mesophilic group and the enzymes output by Project 1 as the cold-adapted group.
Due to major differences in organisms (lack of matching between genus,
family, and order), there were very few regions of similarity between
temperature groups. For example, for the sequence alignment of the
thermophilic enzyme group (Figure 3), large gaps are found throughout
the sequence, and very few regions of similarity exist between the
enzymes. From this analysis, we were unable to determine the active site
region for each temperature group.
After assembling and running the CLUSTAL sequence alignments, we found no significant regions of similarity for the thermophilic or mesophilic enzymes. We therefore pivoted to aligning protein sequences of enzymes from the same family, order, and/or genus to obtain regions of similarity. From Project 1, we obtained 2 psychrophilic enzymes, only one of which we were able to order for cloning: the enzyme from Colwellia sp. PAMC 21821. For this reason, we decided to investigate enzymes from organisms in this organism’s order, Alteromonadales, focusing on the GH110 enzyme family.
Iteration 2: Researching enzymes within the Colwellia genus and the Alteromonadales order to find regions of similarity between selected bacteria.
Using the same general design, we refined our method to focus on the Alteromonadales order, which contains the Colwellia genus obtained from the earlier alpha-galactosidase search (Project 1 program). The order is not limited to psychrophiles (such as the Colwellia genus), but it does not include thermophilic enzymes, which operate above 45°C. Within this order, we were able to find more mesophilic enzymes whose sequences could be compared with enzymes from the Colwellia genus and other psychrophilic organisms.
From the Alteromonadales order, we compiled enzymes from a total of 5 families: Colwelliaceae (other than Colwellia sp. PAMC 21821), Alteromonadaceae, Pseudoalteromonadaceae, Psychromonadaceae, and Shewanellaceae. In total, we identified 25 enzymes from the GH110 family, as shown in the following table.
Sequence alignments were performed in groups of 5-8 sequences so that the CLUSTAL output could be easily read and organized. The alignments compared (1) the GH110 enzyme from Colwellia sp. PAMC 21821 with the enzyme from Akkermansia muciniphila, (2) all GH110 enzymes from the Colwelliaceae family, (3) the Colwellia sp. PAMC 21821 enzyme with enzymes from other psychrophilic organisms, and (4) all the enzymes from mesophilic organisms. There are distinct regions of similarity between Colwellia sp. PAMC 21821 and Akkermansia muciniphila, as seen below.
Among the Colwelliaceae family, there are long regions of similarity, which is consistent with a recent common ancestor.
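One way to put a number on how similar two aligned sequences are is pairwise percent identity. The sketch below is illustrative only: it uses one common definition (identical residues divided by columns where neither sequence has a gap; other definitions exist), and the function names and toy sequences are hypothetical rather than taken from our analysis.

```python
from itertools import combinations
from typing import Dict, Tuple


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(seq_a, seq_b):
        if res_a == "-" or res_b == "-":
            continue  # skip gapped columns under this definition
        compared += 1
        if res_a == res_b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_table(aligned: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Map each pair of sequence names to its percent identity."""
    return {
        (name_a, name_b): percent_identity(aligned[name_a], aligned[name_b])
        for name_a, name_b in combinations(aligned, 2)
    }


# Toy aligned sequences (made-up, not real GH110 data).
toy = {
    "enzyme_1": "MKTAYIAK",
    "enzyme_2": "MKTGYIAK",
    "enzyme_3": "MK-AYIAK",
}
for pair, value in identity_table(toy).items():
    print(pair, round(value, 1))
```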
For an extended list of our results and figures obtained from the CLUSTAL sequence analysis, please view this document:
Overall, this analysis of GH110 sequences provides a foundational understanding of the differences and similarities between mesophilic and psychrophilic enzymes at the sequence level. Due to time constraints, we were unable to perform 3D structural comparisons via AlphaFold, but below we have detailed our future directions for structural and sequence analysis.
Future Direction
1. Complete another round of CLUSTAL sequence comparisons with
related thermophilic organisms.
2. Conduct a thorough literature search on the alpha-galactosidase
enzyme from Colwellia sp. PAMC 21821,
Akkermansia muciniphila, or a different well-characterized
organism, to determine the active site.
3. Using sequence comparisons, find active site regions on all
other enzymes characterized and conduct structural comparison with
AlphaFold.
4. Characterize regions specific to each temperature group based
on AlphaFold results. Attempt to modify protein structures of our
selected enzymes.
5. Clone and compare the efficiency of both psychrophilic and
mesophilic enzymes at various temperatures.
REFERENCES
[1] American Red Cross. (2019). Blood Components. Redcrossblood.org. https://www.redcrossblood.org/donate-blood/how-to-donate/types-of-blood-donations/blood-components.html
[2] Pereira, J. Q., Ambrosini, A., Pereira, M., & Brandelli, A. (2017). A new cold-adapted serine peptidase from Antarctic Lysobacter sp. A03: Insights about enzyme activity at low temperatures. International Journal of Biological Macromolecules, 103, 854–862. https://doi.org/10.1016/j.ijbiomac.2017.05.142
[3] Morita, R. Y., & Moyer, C. L. (2001). Psychrophiles, Origin of. Elsevier EBooks, 917–924. https://doi.org/10.1016/b0-12-226865-2/00362-x
[4] Nowak, J. S., & Otzen, D. E. (2023). Helping proteins come in from the cold: 5 burning questions about cold-active enzymes. BBA Advances, 5, 100104. https://doi.org/10.1016/j.bbadva.2023.100104
[5] OpenStax. (2022, July 18). Temperature Effects on Bacterial Growth. Biology LibreTexts. https://bio.libretexts.org/Courses/Prince_Georges_Community_College/PGCC_Microbiology/08%3A_Microbial_Growth/8.02%3A_Factors_that_Affect_Bacterial_Growth/8.2.01%3A_Temperature_Effects_on_Bacterial_Growth
[6] Tankeshwar, A. (2019, November 25). Temperature requirements of Microorganisms. Learn Microbiology Online. https://microbeonline.com/psychrophiles-mesophiles-thermophiles/