Open Access Article
Jitske M. van Ede a, Suzanne van der Steen a, Geert M. van der Kraan b, Mark C. M. van Loosdrecht a and Martin Pabst *a
a Department of Biotechnology, Delft University of Technology, Delft, The Netherlands. E-mail: m.pabst@tudelft.nl
b Genencor International B.V., International Flavors & Fragrances, Oegstgeest, The Netherlands
First published on 25th September 2025
The immense microbial diversity on Earth represents a vast genomic resource, yet discovering novel enzymes from complex environments remains challenging. Here, we combine a microbial enrichment with metagenomics and metaproteomics to facilitate the identification of microbial glycoside hydrolases that operate under defined conditions. We enriched microbial communities on the carbohydrate polymer pullulan at elevated temperatures under acidic conditions. Pullulan is a natural polysaccharide composed of maltotriose units linked by α-1,6-glycosidic bonds. Pullulan, along with its hydrolyzing enzymes, has broad applications across various industries. The enrichment inocula were sampled from thermophilic compost and from soil from the bank of a pond. In both cases, Alicyclobacillus was identified as the dominant microorganism. Metaproteomic analysis of the enriched biomass and secretome enabled the identification of several pullulan-degrading enzyme candidates from this organism. These enzymes were absent in the metagenomic analysis of the initial inoculum, which is highly complex with a wide diversity of species. This underscores the effectiveness of combining microbial enrichment with multi-omics for uncovering novel enzymes and sequence variants that operate under defined conditions from complex microbial environments.
Pullulan represents a polysaccharide that consists of maltotriose units, in which three glucoses are linked via an α-1,4-linkage, and these maltotriose units are linked via α-1,6-glycosidic bonds.12 Pullulan degrading enzymes can be divided into two main groups: pullulanases (type I and type II) and pullulan hydrolases (types I, II and III). Pullulanases cleave the α-1,6 bond, thereby converting pullulan into maltotriose. In addition, pullulanases are also active against other sugar polymers, including starch and glycogen. While pullulanase type I can only cleave the α-1,6-linkages of these sugar polymers, pullulanase type II can cleave both the α-1,6- and α-1,4-linkages. Pullulan hydrolases are further divided into three types. Type I (neopullulanase) and type II (isopullulanase) cleave the α-1,4-linkage of pullulan, thereby forming panose and isopanose, respectively. Type III has the ability to cleave both the α-1,4- and α-1,6-linkages of pullulan, forming panose, maltotriose, maltose and glucose.13 In general, the optimal pH of pullulan-degrading enzymes varies widely, ranging from pH 3 to 11, although the majority exhibit optimal activity between pH 5 and 8. Similarly, their optimal temperatures span a broad range from 30 °C to 105 °C (SI, Fig. S5).13 Certain enzyme groups, such as pullulanase types I and II, include members active across this entire temperature and pH spectrum. The so far identified pullulan hydrolases of types I and II seem to occupy a more narrow space, with pH optima in the range of 5–8 for type I and 3.5–5 for type II. Type III pullulan hydrolases have only been identified in two archaeal species, both exhibiting high optimal temperatures of 95 °C and 97.5 °C, and optimal pH values of 6.5 and 3.5, respectively (SI, Fig. S5).13
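The classification above can be condensed into a small lookup table. The following Python sketch only restates what the paragraph says; the dictionary layout and the helper name `enzymes_cleaving` are illustrative assumptions, not part of the original work. For the pullulanases, the α-1,4 activity of type II refers to starch-like polymers, as described above.

```python
# Illustrative summary of the enzyme classes described in the text.
# Structure and names are hypothetical; the entries restate the paragraph only.
PULLULAN_ENZYMES = {
    "pullulanase type I": {
        "linkages": {"alpha-1,6"},
        "pullulan_products": ["maltotriose"],
    },
    "pullulanase type II": {
        "linkages": {"alpha-1,6", "alpha-1,4"},
        "pullulan_products": ["maltotriose"],
    },
    "pullulan hydrolase type I": {  # neopullulanase
        "linkages": {"alpha-1,4"},
        "pullulan_products": ["panose"],
    },
    "pullulan hydrolase type II": {  # isopullulanase
        "linkages": {"alpha-1,4"},
        "pullulan_products": ["isopanose"],
    },
    "pullulan hydrolase type III": {
        "linkages": {"alpha-1,4", "alpha-1,6"},
        "pullulan_products": ["panose", "maltotriose", "maltose", "glucose"],
    },
}


def enzymes_cleaving(linkage):
    """Return the sorted names of enzyme classes that cleave the given linkage."""
    return sorted(
        name for name, info in PULLULAN_ENZYMES.items() if linkage in info["linkages"]
    )
```

For example, `enzymes_cleaving("alpha-1,6")` lists the three classes that can release maltotriose-type products by debranching.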
Although a wide variety of enzymes are utilized in various industries, the quest for new enzymes – including pullulanases – with improved features is ongoing.13 There is currently a focused effort on employing advanced protein/enzyme optimization techniques, such as artificial intelligence, directed evolution, and intelligent design, to improve enzymes.14–20 However, despite progress in protein engineering and other modification strategies, these approaches do not always yield the anticipated results due to the complex relationship between the structure and function of proteins.21 Fortunately, a remarkable diversity of microorganisms inhabit our planet, providing us with a vast genetic and therefore biocatalytic resource.2 Exploring this natural diversity with sophisticated enzyme discovery methods presents a promising alternative for finding enzymes with the desired traits or for identifying templates for further engineering.
Conventional methods for mining genetic resources rely on cultivating microorganisms in the laboratory, after which DNA is extracted and sequenced if culturing is successful.22,23 After the genome is assembled, it can be mined for genetic targets of interest. Nevertheless, this approach has a major limitation: most microorganisms cannot be grown as a pure culture under laboratory conditions. This limitation is known as “the great plate count anomaly”. It is estimated that only about 1% of the microorganisms in a sampled environment can be recovered by cultivation in the laboratory.24–26 A way to avoid this problem and achieve culture independence is by using metagenomics.27 Over the past years, whole metagenome sequencing has become a popular method to sequence the entire metagenome from natural microbial communities.28 Currently, there are two main metagenomic strategies: the sequence-based and the function-based approach.29 The function-based approach involves constructing a genomic library from environmental DNA, followed by screening individual genes in a cultivable host cell for desired activities.28–30 While this method enables the discovery of entirely novel enzymes, it is limited by high costs and the requirement of high-throughput activity assays.2 In contrast, the sequence-based approach depends on sequence similarity to known templates,29,30 which restricts its ability to identify enzymes with novel functions.2
More recent strategies include microbial enrichment cultures followed by multi-omics analysis of the cultures (genomics and proteomics, supported by advanced bioinformatics tools).31 These strategies first enrich for certain organisms that possess an enzyme with a desired function. This route is only feasible when the enzyme plays a crucial role in the growth of the organism, e.g. is involved in the carbon or nitrogen source utilization.32 For instance, in pursuit of hydrolytic enzymes that cleave specific polymers, an enrichment with the polymer as the sole carbon source can be performed. This ensures that only microorganisms that can hydrolyse and utilize the polymer hydrolysis products will grow.33 Moreover, the conditions for enriching microbes can be tuned towards the conditions in which the enzymes should be active (e.g. temperature, pH and salinity). Subsequently, instead of only employing metagenomics to mine for the enzymes of interest, a more streamlined approach is to also integrate proteomics into the workflow.31,34 Advantageously, metaproteomics allows identification of the actually expressed enzymes, as well as their cellular location, such as cytosolic, membrane-bound, or secreted.35,36 This greatly simplifies genomic data and it supports identification of novel enzyme candidates or those with enhanced properties. It also provides template sequences for further engineering to optimize desired properties.37 In summary, this study demonstrates the discovery of novel pullulan-degrading enzymes that operate under defined conditions, by combining microbial enrichment with metagenomics and metaproteomics.
No organic buffer was used, to avoid providing an additional substrate for microbial growth. All steps were performed next to a Bunsen burner to maintain sterile conditions. The enrichments were transferred twice in a 1:1000 (v/v) ratio to fresh medium when almost all pullulan was consumed, after 3 and 4 days, respectively. After the second transfer, the OD600 (using an Ultrospec 10 cell density meter from Biochrom) and the pullulan consumption (measured by HPLC analysis) were monitored daily, while the pH was still measured twice a day. The microbial enrichments were analyzed once all pullulan was consumed, after 8 and 14 days for the compost and soil enrichment, respectively. Images of the enrichments were taken using a Zeiss Axio microscope at 100× magnification.
In addition, a 250 mL flask was inoculated with 0.25 g compost combined with 0.25 g soil collected on the TUD campus and incubated at 75 °C in a water bath. The enrichment was monitored as described above.
000 rcf, after which 667 μL of the supernatant was transferred to a clean screw cap vial. Trifluoroacetic acid (TFA) was added to reach a final concentration of 4 M TFA. The samples were incubated for 4 hours at 100 °C in a ThermoMixer with a ThermoTop (Eppendorf, ThermoMixer C). After a 15-min centrifugation step at 4 °C and 14 000 rcf, the supernatant was collected in a clean Eppendorf tube. The samples were measured using a Vanquish HPLC system (Thermo Scientific, Germany) with an Aminex HPX-87H separation column. A constant flow rate of 0.750 mL min−1 was maintained over a total run time of 45 minutes, using 1.5 mM phosphoric acid in Milli-Q water as the eluent. The column chamber temperature was kept at 50 °C and compounds were detected using an RI detector (ERC, RefractoMax 520). Data analysis was performed using Chromeleon 7 (Thermo Scientific, Germany).
000 rcf. This cycle was repeated twice. Afterwards, the samples were subjected to an ultrasonic bath for 10 min and centrifuged for 15 min at 4 °C and 14 000 rcf. All centrifugation steps were conducted at 4 °C and 14 000 rcf unless stated otherwise. The supernatant was collected in a LoBind Eppendorf tube. Trichloroacetic acid (TCA) was added in a 1:4 ratio to the supernatant. The samples were vortexed and incubated for 30 min at 4 °C, followed by centrifugation for 15 min. The supernatant was discarded, and the protein pellet was washed with 200 μL of ice-cold acetone, vortexed and centrifuged for 15 min. The supernatant was removed and 100 μL of 6 M urea was added to the protein pellets. To re-dissolve the pellets, the samples were vortexed thoroughly and incubated at 37 °C and 300 rpm in a ThermoMixer with a ThermoTop (Eppendorf, ThermoMixer). If necessary, an additional 100 μL of 6 M urea was added to dissolve the pellets. The samples were reduced by incubating at 37 °C for 60 min after adding 30 μL of 10 mM dithiothreitol and subsequently alkylated for 30 minutes at RT in the dark by adding 30 μL of 20 mM iodoacetamide. The samples were diluted with 200 mM ammonium bicarbonate to achieve a urea concentration < 1 M. To 100 μL of the final sample volume, 5 μL of 0.1 μg μL−1 trypsin solution was added. After gentle shaking, the samples were incubated at 37 °C overnight. The samples were cleaned and concentrated using the Oasis HLB 96-well μElution Plate with 2 mg sorbent per well, 30 μm (Waters, UK). In short, the columns were conditioned with 750 μL MeOH, equilibrated with 2 × 500 μL MS-H2O and then loaded with the samples. Two washing steps with 350 μL 5% MeOH in MS-H2O were performed, followed by elution with 200 μL of 2% formic acid in 80% MeOH and 200 μL of 1 mM ammonium bicarbonate (ABC) in 80% MeOH. The eluates were collected in a LoBind Eppendorf tube. The samples were dried using a SpeedVac concentrator (Thermo Scientific) at 50 °C and stored at −20 °C until analysis by LC–MS. For LC–MS analysis, the dried samples were dissolved in 15 μL of 3% ACN plus 0.1% trifluoroacetic acid (TFA) in MS-H2O, incubated for 30 min at RT, and vortexed thoroughly.
The samples were diluted to a final protein concentration of approximately 0.5 mg mL−1, estimated at 280 nm using a NanoDrop ND-1000 spectrophotometer (Thermo Scientific), before proteomic analysis.
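The dilution of the urea-containing samples to below 1 M, described in the protocol above, comes down to a simple mass balance. The sketch below is an illustration only: the function name is hypothetical, and it assumes that the pellet was dissolved in 100 μL (or 200 μL) of 6 M urea and that the only other liquids present are the 30 μL of dithiothreitol and 30 μL of iodoacetamide solutions.

```python
def min_diluent_volume_ul(urea_volume_ul, urea_molar, other_volume_ul, target_molar):
    """Volume of urea-free diluent (in uL) that brings urea exactly to target_molar.

    The protocol asks for a concentration below the target, so in practice
    slightly more than the returned volume would be added.
    """
    urea_umol = urea_volume_ul * urea_molar          # uL x mol/L = umol
    total_volume_needed = urea_umol / target_molar   # uL at the target molarity
    current_volume = urea_volume_ul + other_volume_ul
    return max(0.0, total_volume_needed - current_volume)


# 100 uL of 6 M urea plus 30 uL DTT and 30 uL IAA solutions (assumed volumes)
single = min_diluent_volume_ul(100, 6.0, 60, 1.0)
# Case where an additional 100 uL of 6 M urea was needed to dissolve the pellet
double = min_diluent_volume_ul(200, 6.0, 60, 1.0)
print(single, double)
```

Under these assumptions, more than 440 μL of 200 mM ammonium bicarbonate would be required for a pellet dissolved in 100 μL of urea, and more than 940 μL if 200 μL was used.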
Extracellular proteins from the secretome of the enrichment cultures were prepared as follows. Samples from the enrichment culture were centrifuged for 15 min and the supernatant was collected. In addition to TCA precipitation as described above, three further sample preparation methods were employed: acetone precipitation, acetone/salt precipitation and filter-assisted sample preparation (FASP). These three methods are described below. For acetone precipitation, a 250 μL sample was mixed with 1250 μL ice-cold acetone, vortexed and stored at −20 °C for 30 min. The samples were centrifuged for 15 min and the supernatant was removed. For acetone/salt precipitation, 50 μL of 3 M NaCl solution was added to a 250 μL sample. Subsequently, 1200 μL acetone (RT) was added and the samples were mixed gently. The samples were incubated for 30 min at RT and centrifuged for 15 min. The supernatant was carefully removed, and the protein pellet was washed with 400 μL acetone (RT). Another 15-minute centrifugation was performed, and the supernatant was removed. After protein precipitation, the protein pellet was re-dissolved in 100 μL of 6 M urea and processed as described above. For filter-assisted sample preparation, 200 μL of 6 M urea was added to a 10 kDa Microcon filter (Merck-Millipore), followed by centrifugation for 30 min at 14 000 rcf and 20 °C. After discarding the flow-through, this step was repeated once. Unless stated otherwise, all centrifugations during FASP were conducted at 14 000 rcf and 20 °C. Either a 250 μL sample or a 4 × 500 μL sample was loaded onto the filter followed by centrifugation for 30 min. The samples were then reduced by adding 30 μL of 10 mM DTT and 70 μL of 200 mM ammonium bicarbonate. After vortexing and incubating at 37 °C for 60 min, 30 μL of 20 mM IAA was added to alkylate the samples. Following 30-min incubation in the dark at RT, the samples were centrifuged for 30 min. Subsequently, a wash and equilibration step was performed by first adding 100 μL of 6 M urea and centrifuging for 30 min, followed by adding 100 μL of 200 mM ammonium bicarbonate twice and centrifuging for 30 min. The filter was transferred to a clean collection tube and 5 μL of 0.1 μg μL−1 trypsin solution together with 95 μL of 200 mM ammonium bicarbonate was added to the filter. The filter was incubated overnight in a ThermoMixer with a ThermoTop at 37 °C. Sample collection was performed by centrifuging for 30 min. Subsequently, 150 μL of 200 mM ammonium bicarbonate was added to the filter, followed by centrifugation for 30 min. The samples were again centrifuged for 30 min after the addition of 150 μL of 10% acetonitrile (ACN) and 0.1% formic acid (FA) in MS-H2O. Finally, the samples were cleaned and concentrated using solid phase extraction, as described previously. The results from the different sample preparation methods applied to identify extracellular proteins were combined before further analysis for glycosyl hydrolase enzyme candidates.
Shotgun proteomics was performed on an EASY-nLC 1200 coupled with a Q Exactive Plus Orbitrap mass spectrometer (Thermo Scientific, Germany). Chromatographic separation was carried out on a 0.05 × 150 mm2 C18 column (Thermo Scientific, catalogue no. 164943), with mobile phase A consisting of 0.1% formic acid (FA) and 1% acetonitrile (ACN) in MS-H2O and mobile phase B consisting of 0.1% FA and 80% ACN in MS-H2O. The chromatographic profile included an initial 2-minute run with 5% B, followed by a linear gradient from 5% to 25% B over 90 min, then a gradient to 55% B over 60 min. The gradient was subsequently returned to 5% B within 3 min and equilibrated for an additional 20 min, all at a constant flow rate of 350 nL min−1. Sample injections of 2 μL were performed with blank runs between each sample. Electrospray ionization operated in the positive mode, and MS1 analysis was conducted at a resolution of 70 000, with an AGC target of 3e6 and a maximum injection time of 75 ms. Applying the data dependent acquisition mode, precursors for fragmentation were isolated using a 2.0 m/z window over the scan range of 385–1250 m/z. A normalized collision energy (NCE) was set at 28%. MS2 spectra were collected at a resolution of 17 500, an AGC target of 2e5 and a maximum injection time of 75 ms. The database search was performed using PEAKS Studio 10.5 (Bioinformatics Solutions Inc., Canada) with the sequence database obtained from whole metagenome sequencing experiments (described above) and the GPM cRAP contaminant database. Carbamidomethylation was considered a fixed modification, while oxidation and deamidation were included as variable modifications. Trypsin was used as the proteolytic enzyme, allowing a maximum of three missed cleavages. The parent mass error tolerance was set to 20.0 ppm and the fragment mass error to 0.02 Da. Finally, filtering criteria included a 1% false discovery rate (FDR) for peptide-spectrum matches and a requirement of at least 2 unique peptides for protein identification.
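The LC gradient given in the paragraph above can be written as a piecewise-linear programme. The sketch below is an illustration under stated assumptions: the segments are taken to follow each other without gaps starting at 0 min, and the return to 5% B is treated as a linear ramp over 3 min. The names `GRADIENT` and `percent_b` are hypothetical.

```python
# (time in min, % mobile phase B) breakpoints derived from the described profile:
# 2 min at 5% B, 5 -> 25% B over 90 min, 25 -> 55% B over 60 min,
# back to 5% B within 3 min, then 20 min of equilibration.
GRADIENT = [(0, 5), (2, 5), (92, 25), (152, 55), (155, 5), (175, 5)]


def percent_b(t_min):
    """Linearly interpolate the percentage of mobile phase B at time t_min."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the gradient programme")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


total_run_time_min = GRADIENT[-1][0]
print(total_run_time_min, percent_b(47), percent_b(122))
```

With these assumptions the programme lasts 175 min per injection, not counting the blank runs between samples.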
000 rcf and 4 °C, the SN was filtered using a 0.22 μm filter. Purification of the His-tagged protein from the cell free extract (CFE) was performed on a Bio-Rad NGC Chromatography System using a 1 mL HisTrap FF crude column. The column was equilibrated with 5 column volumes (CVs) of 100% mobile phase A (50 mM potassium phosphate pH 7.4, 300 mM NaCl, 20 mM imidazole and 1 mM DTT), after which the CFE was loaded onto the column. A wash phase of 10 CVs with mobile phase A was performed, followed by a linear gradient to 100% mobile phase B (50 mM potassium phosphate (pH 7.4), 300 mM NaCl, 500 mM imidazole and 1 mM DTT) over 40 CVs, while collecting 0.5 mL fractions. Fractions containing the purified protein (A16–A35) were pooled and concentrated using an Amicon® Ultra Centrifugal Filter (MWCO: 10 kDa). The buffer was exchanged with 50 mM potassium phosphate buffer (pH 7.4) using a PD10 desalting column (GE Healthcare), according to the manufacturer's protocol. After eluting with 2.5 mL of 50 mM potassium phosphate buffer (pH 7.4), the sample was again concentrated to ∼0.5 mL.
000 rpm for 5 minutes to remove insoluble material. The samples were then applied to the PGC cartridges, washed twice with 200 μL H2O, and the carbohydrate mono- and oligosaccharide fraction was eluted with 300 μL of 10 mM ABC in 60% ACN. Eluates were collected in Eppendorf tubes, dried in a SpeedVac concentrator at 60 °C and re-dissolved in 50 μL H2O. For PGC-MS/MS analysis, 2 μL of this sample was injected to the LC–MS system.
000 over the scan range of 150–1250 m/z. The automatic gain control (AGC) was set to 1e6 and the maximum injection time (IT) to auto. Precursors for product hexose mono- and oligosaccharides were included for continuous isolation and fragmentation in PRM mode. The MS2 spectra were acquired at a resolution of 35 000, with the AGC set at 2e5 and the maximum IT at auto. Fragmentation occurred with a 25% normalized collision energy. Raw data were processed using Xcalibur 4.1 (Thermo Fisher Scientific, Germany). Calibration of the mass spectrometer was performed using the Pierce™ LTQ ESI negative ion calibration solution (Thermo Fisher Scientific, Germany).
Two days after inoculation, microbial growth was confirmed by microscopy and by the increase in OD600 in both the ‘compost’ and ‘soil’ enrichments, while no growth was observed in the control (Fig. 2A and B). In addition, no growth was observed for the enrichment at higher temperature (75 °C), which was subsequently terminated. Both the ‘compost’ and ‘soil’ enrichments were observed to acidify over time (Fig. 2B). This required daily pH adjustments with KOH, because an organic buffer was deliberately omitted so that it could not serve as a substrate for the microorganisms. Assuming the biomass growth stoichiometry given below, acidification was likely the result of ammonium uptake (see the SI). Moreover, HPLC-UV analysis using an Aminex organic acid analysis column showed no additional peaks indicative of organic acids, excluding the possibility of acid production during the enrichment (SI, Fig. S1). To confirm pullulan utilization, the residual pullulan in the medium was hydrolysed to glucose and analysed by HPLC. The glucose levels decreased over time in the ‘compost’ and ‘soil’ enrichments, but remained constant in the control, verifying the consumption of pullulan by microorganisms (SI, Fig. S2).
1 × C6H12O6 + 0.656 × NH4+ + 2.56 × O2 → 3.28 × CH1.8N0.2O0.5 + 2.72 × CO2 + 4.03 × H2O + 0.6 × H+
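As a plausibility check on the growth stoichiometry above, the element and charge balances can be computed directly. This is a minimal sketch: the data layout and function names are illustrative, and the coefficients are copied as printed. Small non-zero residuals for hydrogen and charge would be consistent with a rounded proton coefficient (0.6 rather than 0.656), but that interpretation is an assumption and is not stated in the source.

```python
# Each species: (stoichiometric coefficient, composition including net charge).
REACTANTS = [
    (1.0, {"C": 6, "H": 12, "O": 6}),            # glucose unit, C6H12O6
    (0.656, {"N": 1, "H": 4, "charge": 1}),      # NH4+
    (2.56, {"O": 2}),                            # O2
]
PRODUCTS = [
    (3.28, {"C": 1, "H": 1.8, "N": 0.2, "O": 0.5}),  # biomass, CH1.8N0.2O0.5
    (2.72, {"C": 1, "O": 2}),                        # CO2
    (4.03, {"H": 2, "O": 1}),                        # H2O
    (0.6, {"H": 1, "charge": 1}),                    # H+
]


def totals(side):
    """Sum coefficient-weighted element counts and charge for one reaction side."""
    result = {}
    for coefficient, composition in side:
        for key, amount in composition.items():
            result[key] = result.get(key, 0.0) + coefficient * amount
    return result


def residuals():
    """Return reactants minus products for every element and for charge."""
    left, right = totals(REACTANTS), totals(PRODUCTS)
    return {key: left.get(key, 0.0) - right.get(key, 0.0)
            for key in sorted(set(left) | set(right))}


for key, value in residuals().items():
    print(f"{key}: {value:+.3f}")
```

Carbon and nitrogen close exactly with the printed coefficients, and the release of protons alongside ammonium uptake is what links growth to the observed acidification.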
Whole metagenomic sequencing of the ‘compost’ enrichment yielded ∼14 000 genes. Applying the functional annotation and keyword filtering, this could be narrowed down to ∼2000 genes. However, validating the activity of each gene would remain a costly and time-consuming process. Advantageously, integrating metaproteomics data reveals enzymes that are actively expressed. Because pullulan degradation is essential for growth, relevant enzymes were expected to be highly abundant. Moreover, protein localization was inferred by analyzing supernatant and biomass pellet fractions separately, thereby distinguishing secreted enzymes from intracellular enzymes. This particularly supports the identification of enzymes that are involved in the degradation of the carbon source before cellular uptake.39 Remarkably, for the compost enrichment, integrating the metaproteomics data reduced the number of enzyme candidates to 96 intracellular and 17 extracellular (Table 1). Therefore, employing metaproteomics significantly narrowed down the pool of target enzyme candidates. This greatly streamlines subsequent functional and structural investigations, as well as biocatalytic testing after recombinant expression.
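The candidate funnel described above (all predicted genes, then annotation keywords, then proteins actually detected per fraction) can be sketched as a two-step filter. Everything in this example is hypothetical: the record layout, the keyword list, the gene identifiers, and the rule that a protein detected in the supernatant counts as extracellular. The actual criteria used in the study may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    gene_id: str
    annotation: str
    fractions: frozenset  # fractions in which the protein was detected


# Hypothetical keywords; the study's actual keyword list is not given here.
KEYWORDS = ("glycoside hydrolase", "amylase", "pullulanase")


def shortlist(genes, keywords=KEYWORDS):
    """Keep keyword-annotated genes with proteomic evidence, split by fraction."""
    annotated = [g for g in genes
                 if any(k in g.annotation.lower() for k in keywords)]
    extracellular = [g.gene_id for g in annotated
                     if "supernatant" in g.fractions]
    intracellular = [g.gene_id for g in annotated
                     if "pellet" in g.fractions and "supernatant" not in g.fractions]
    return intracellular, extracellular


# Toy records for illustration only.
toy_genes = [
    Gene("gene_a", "Glycoside hydrolase family 13", frozenset({"supernatant"})),
    Gene("gene_b", "Alpha-amylase", frozenset({"pellet"})),
    Gene("gene_c", "Ribosomal protein L2", frozenset({"pellet"})),
    Gene("gene_d", "Pullulanase", frozenset()),  # annotated but not expressed
]
print(shortlist(toy_genes))
```

Genes without any proteomic evidence drop out at the second step, which is the effect that reduced roughly 2000 annotated genes to 96 intracellular and 17 extracellular candidates in the compost enrichment.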
From the 17 extracellular candidates, six were annotated using InterPro, of which two (MMBJNONL_14124 and MMBJNONL_07072) were identified as containing a domain from glycoside hydrolase family 13 (GH13). These are the most likely candidates possessing pullulan degrading activity, since this family includes enzymes such as alpha-amylases and neopullulanases. Some expressed proteins lacked functional annotations and would also require further activity testing. The InterProScan annotation of the final extracellular pullulan-degrading enzyme candidates was further validated using dbCAN3 (ref. 40; see Table S2, Fig. S6 and the SI). Both GH13 candidates were confirmed, and MMBJNONL_05960 was further classified by dbCAN3 as belonging to glycoside hydrolase family 53. Finally, MMBJNONL_09916 and MMBJNONL_07048 were annotated using InterProScan, but not recognized by dbCAN3. Both were assigned the domain IPR015955 and annotated as “lactate dehydrogenase/glycoside hydrolase”. If these enzymes primarily function as lactate dehydrogenases, it might explain their absence in the dbCAN3 annotation. Overall, the annotation results are highly consistent between the two tools and support the hypothesis that MMBJNONL_14124 and MMBJNONL_07072 are likely responsible for pullulan degradation.
Matzke and co-workers previously identified two pullulan-degrading enzymes in Alicyclobacillus: the cytoplasmic protein cyclomaltodextrinase (CdaA), which hydrolyzes 1,4-linkages of pullulan to form panose, and amylopullulanase (AmyA), which hydrolyzes 1,6-linkages of pullulan to form maltotriose.41 AmyA likely hydrolyses pullulan into maltotriose extracellularly, which can be transported over the membrane and is further degraded by CdaA (Fig. 4A). A sequence alignment of AmyA and CdaA against the metagenomics data of the ‘compost’ enrichment resulted in six matches, with the best showing only 80% sequence identity (Fig. 4B). Moreover, the matches with high sequence identity to AmyA (MMBJNONL_07072 and MMBJNONL_14124, Fig. 4B) were identified as extracellular by the enzyme discovery approach, while the best match with CdaA (MMBJNONL_12997 in Fig. 4B) was classified as intracellular. These cellular localizations were further supported by SignalP-6.0 (ref. 42) and DeepLocPro (ref. 43) predictions (SI, Table S1). Besides the 80% sequence identity, the same domains were identified at similar locations, being CBM34 and GH13_39 for MMBJNONL_07072, MMBJNONL_14124 and AmyA and GH13_20 for MMBJNONL_12997 and CdaA (SI, Fig. S6). Pullulan degrading enzymes are known to contain four conserved sequence regions,44,45 all of which have been identified in MMBJNONL_12997, while three have been identified in MMBJNONL_07072 and MMBJNONL_14124 (SI, Fig. S7 and S8), further supporting the pullulan degrading activity. Interestingly, sequence alignment of the AmyA and CdaA genes against the metagenomics data of the compost inoculum yielded matches with only ∼45% sequence identity, while the enzymes identified through the employed enrichment metaproteomics approach were not detected at all (Fig. 4). Although deeper genomic sequencing might detect these enzymes, this underscores the effectiveness of combining microbial enrichment with metaproteomics.
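The sequence identities quoted above are percentages of matching positions in a pairwise alignment. The sketch below shows one common way to compute such a value from two already-aligned sequences. It is an illustration only: the function name is hypothetical, the toy sequences are not from the study, and the convention used here (identical residues divided by aligned, gap-free columns) is an assumption, since alignment tools differ in the denominator they report.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy aligned fragments (not real AmyA or CdaA sequences).
print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
print(percent_identity("AC-EFGHIKL", "ACDEFGHIKL"))
```

Because the denominator convention changes the result, identity values from different tools are only roughly comparable.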
To validate the results of our established meta-omics approach and the associated functional predictions, the top extracellular pullulan-degrading enzyme candidate MMBJNONL_14124 was recombinantly expressed in E. coli with a C-terminal His-tag and purified by affinity chromatography. The purified enzyme was subsequently incubated with 1 mg mL−1 pullulan at 60 °C and pH 4.5, and pullulan hydrolysis was monitored using high-resolution tandem mass spectrometry. Thus, we observed a predominant release of triose oligomers (Fig. 4D), which aligned with the cleavage of α-1,6-glycosidic bonds and the release of maltotriose units from pullulan. This confirms the AmyA-like pullulan-hydrolytic activity of MMBJNONL_14124 under elevated temperature and acidic pH conditions.
000 genes in the compost enrichment, which was narrowed down to around 2000 candidates after functional classification. Integrating metaproteomics data further allowed us to focus on actively expressed enzymes and their specific locations, such as intracellular, membrane-bound, or extracellular. For the compost enrichment, this approach reduced the number of enzyme candidates to 96 intracellular and only 17 secreted enzymes.
Nevertheless, the presence of dead and consequently lysed cells can result in the identification of intracellular proteins in the supernatant. To assess the extent of cell lysis, we compared the identified proteins in the supernatant with the 50 most abundant proteins in the biomass pellet (determined by spectral counts). Eighteen of these proteins were not identified in the secretome at all, and a significant cytoplasmic contamination is therefore unlikely. In addition, the supernatant showed traces of only a few of the enzymes of the abundant glycolytic pathway (glyceraldehyde-3-phosphate dehydrogenase, phosphoglycerate mutase and glucose-6-phosphate isomerase), which further indicated that no significant cell lysis was taking place. Finally, some proteins (e.g. membrane proteins) are difficult to detect by conventional proteomics approaches and might be missed without additional optimisation of the sample preparation.46 To complement the experimental data, we used SignalP-6.0 (ref. 42) and DeepLocPro (ref. 43) to predict the cellular localization of the proteins identified in the secretome. Of these proteins, 27% were predicted to contain a signal peptide, and 18% contained a signal peptide according to SignalP-6.0 and were predicted to be extracellular by DeepLocPro (SI). Signal peptides do not necessarily imply secretion: proteins may be membrane bound or, in the case of Gram-positive bacteria, be retained in the periplasm. Similarly, proteins without a signal peptide can still end up in the secretome.42 Nevertheless, these predictive tools are useful in validating whether the final pullulan degrading enzyme candidates are indeed extracellular (SI, Table S1).
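The lysis check described above, which compares the most abundant pellet proteins with the proteins found in the supernatant, amounts to a ranked set comparison. The following sketch illustrates the idea with toy data. The function name, the protein identifiers and the spectral counts are hypothetical, and ties in spectral counts are broken arbitrarily by the sort order.

```python
def lysis_overlap(pellet_counts, supernatant_ids, top_n=50):
    """Count how many of the top_n pellet proteins are present or absent in the supernatant.

    pellet_counts: dict mapping protein id -> spectral count in the biomass pellet
    supernatant_ids: set of protein ids identified in the supernatant
    Returns (number shared, number absent from the supernatant).
    """
    top = sorted(pellet_counts, key=pellet_counts.get, reverse=True)[:top_n]
    shared = [protein for protein in top if protein in supernatant_ids]
    return len(shared), len(top) - len(shared)


# Toy data for illustration only.
pellet = {"p1": 100, "p2": 80, "p3": 60, "p4": 5}
supernatant = {"p2", "p4", "x"}
print(lysis_overlap(pellet, supernatant, top_n=3))
```

A large number of highly abundant pellet proteins missing from the supernatant, as reported above for 18 of the top 50, argues against extensive lysis.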
Despite Alicyclobacillus being identified as the dominant genus, other coexisting organisms, though present at lower abundance, were identified. Therefore, it cannot be excluded that these organisms are also involved in hydrolyzing pullulan, which can subsequently be consumed by multiple microbes. Intriguingly, a third enrichment was conducted at 75 °C, but no growth was observed. A review by Kahar et al. highlights diverse pullulan-degrading enzymes identified from microbial sources, demonstrating activity at high temperature and low pH.13 For instance, pullulanases from organisms like Fervidobacterium nodosum Rt17-B1 (an optimal temperature, Topt, of 80 °C and an optimal pH, pHopt, of 5), Desulfurococcus mucosus DSM 2162T (Topt = 85 °C, pHopt = 5), and Thermoanaerobacter sp. B6A (Topt = 75 °C, pHopt = 5) are some examples. However, enzyme activity at 75 °C and pH 4.5 does not guarantee that the organism can grow under these conditions. For instance, for Fervidobacterium nodosum, the growth range is restricted to pH 6.0–8.0. Moreover, this bacterium, as well as Desulfurococcus mucosus and Thermoanaerobacter, are obligate anaerobes, preventing growth under the conditions employed in this study.47–49
One enrichment was inoculated with compost (‘compost’ enrichment), which typically has an elevated temperature and is expected to host thermotolerant microbes. To assess whether such conditions are essential, a second enrichment was inoculated using soil from the bank of a pond (‘soil’ enrichment). Despite the complex and diverse starting communities, Alicyclobacillus dominated in both cases. This bacterium was first isolated from an acidic thermal environment in Yellowstone National Park by Darland and Brock.50 It is a spore-forming, rod-shaped bacterium that typically grows at temperatures of 45–70 °C (optimum 60–65 °C) and pH values of 2–6 (optimum 3–4), closely matching the applied enrichment conditions. Cross-contamination between the enrichments is unlikely, as no growth was observed in the control experiment. Thus, similar organisms thrived under identical enrichment conditions regardless of the inoculum's initial temperature. The similarity may also be attributed to the proximity of the sampling sites (∼20 km) and their shared soil environment. Both type I and II pullulanases commonly degrade starch, a natural plant polymer found in soil, which may support the growth of similar organisms.
Although previous studies identified two pullulan-degrading enzymes in Alicyclobacillus, AmyA and CdaA, our enrichment metaproteomics approach revealed distinct enzyme sequences with potentially altered properties (Fig. 4). The four conserved sequence regions of alpha-amylases44,45 were identified in the enzyme candidates, along with the same glycosyl hydrolase domains found in AmyA and CdaA. These similarities support the hypothesis that the enzymes exhibit pullulan degrading activity. The identified enzymes were not detected in the whole metagenome sequencing data of the inoculum, highlighting the effectiveness of our approach. Metaproteomics confirmed that the putative AmyA is extracellular, while the putative CdaA is intracellular. This suggests that AmyA hydrolyzes pullulan into maltotriose units extracellularly, which are then imported into the cell and further broken down by CdaA into glucose. Supporting this, we also detected the expression of a maltose/maltodextrin-binding periplasmic protein and the maltose/maltodextrin transport system MalG. This aligns with previous findings of a high-affinity, binding-protein-dependent ABC transport system specific for maltose and maltodextrins in Alicyclobacillus.51
Finally, the pullulan-degrading activity of the extracellular enzyme MMBJNONL_14124 was confirmed by recombinant expression followed by activity testing at 60 °C and pH 4.5. High-resolution mass spectrometry confirmed the efficient release of predominantly hexotrioses, consistent with the predicted cleavage of the α-1,6-glycosidic bonds and the release of maltotriose units. Nevertheless, some larger and smaller oligomers were also detected, which may arise from some promiscuous activity under the tested conditions.
In summary, we demonstrated the combination of a microbial enrichment and metaproteomics for discovering novel microbial glycoside hydrolases, as exemplified by a pullulan enrichment from two different microbial sources. The integration of metaproteomics significantly accelerated enzyme identification in these complex environments, reducing the number of genes requiring subsequent biochemical validation.
Supplementary information (SI) is available. See DOI: https://doi.org/10.1039/d5cb00049a.
This journal is © The Royal Society of Chemistry 2025