Polymer-based chemical-nose systems for optical-pattern recognition of gut microbiota

Gut-microbiota analysis has been recognized as crucial in health management and disease treatment. Metagenomics, a current standard examination method for the gut microbiome, is effective but requires both expertise and significant amounts of general resources. Here, we show highly accessible sensing systems based on the so-called chemical-nose strategy to transduce the characteristics of microbiota into fluorescence patterns. The fluorescence patterns, generated by twelve block copolymers with aggregation-induced emission (AIE) units, were analyzed using pattern-recognition algorithms, which identified 16 intestinal bacterial strains in a way that correlates with their genome-based taxonomic classification. Importantly, the chemical noses classified artificial models of obesity-associated gut microbiota, and further succeeded in detecting sleep disorder in mice through comparative analysis of normal and abnormal mouse gut microbiota. Our techniques thus allow analyzing complex bacterial samples far more quickly, simply, and inexpensively than common metagenome-based methods, which offers a powerful and complementary tool for the practical analysis of the gut microbiome.

Fluorescence responses of the polymers to bacteria. Prior to the study, the stock solution of gut-derived bacteria was thawed at 4 °C and centrifuged at 6000 g (10 °C, 10 min). The supernatant was removed, and distilled water was added to reach OD600 = 0.5 for the bacteria. Fluorescence measurements were performed using a Cytation5 Imaging Reader. Solutions (120 μL)

Chemical-nose sensing
Gut-derived bacteria. The stock solution of gut-derived bacteria was thawed at 4 °C and centrifuged at 6000 g (10 °C, 10 min). The supernatant was removed, and distilled water was added to reach OD600 = 0.5 for the bacteria.
Obesity model bacteria mixtures. Two of the four different gut-derived bacteria were mixed in distilled water as indicated in Table S3.
Mouse gut microbiota. Homogeneous microbiome suspensions were prepared by combining the methods provided in two reports. 5,6 Phosphate-buffered saline (PBS) was added to fecal samples collected on day 28 from healthy or insomniac mice (n = 4 for both) to give a concentration of 40 mg/mL. The solution was mixed by vortexing for 1 min and allowed to stand at 4 °C for 5 min; this process was repeated several times. To remove the soluble fraction, the resulting suspension was centrifuged at 8000 g (4 °C, 10 min), the supernatant was removed, and PBS was added. This process was repeated twice. A homogeneous suspension of the gut microbiome was then obtained by filtration with a sterile sieving device (pluriStrainer®, mesh size 40 μm, pluriSelect) to remove large aggregates, followed by a further 200-fold dilution with distilled water.
For the analysis of intestinal bacterial strains, the Escherichia coli strains, and mouse gut microbiota, aliquots (108 μL) of solutions containing TPE-functionalized PEG-b-PLLs (167 nM) and 167 mM NaCl in 22.2 mM MOPS buffer (pH = 7.0) or 22.2 mM acetate buffer (pH = 5.0) were deposited in the wells of a 96-half-well plate using a PIPETMAX system. After incubation (35 °C, 10 min), the fluorescence intensity was recorded using two different channels (Ch1: λex/λem = 330 nm/480 nm; Ch2: λex/λem = 360 nm/530 nm). Subsequently, aliquots (12 μL) of the samples were added to each well, and the fluorescence intensity was recorded after incubation (35 °C, 10 min).
For other analyses, aliquots (17.5 μL) of solutions containing TPE-functionalized PEG-b-PLLs (214 nM) and 214 mM NaCl in 28.5 mM MOPS buffer (pH = 7.0) or 28.5 mM acetate buffer (pH = 5.0) were deposited in the wells of a 384-well NBS™ black microplate (Corning Inc.) using a PIPETMAX system. After incubation (35 °C, 10 min), the fluorescence intensity was recorded using the same two channels (Ch1: λex/λem = 330 nm/480 nm; Ch2: λex/λem = 360 nm/530 nm). Subsequently, aliquots (7.5 μL) of the samples were added to each well, and the fluorescence intensity was recorded after incubation (35 °C, 10 min).
These processes were performed at least six times for 4 distinct samples to generate a training data matrix. This training data matrix was processed using linear discriminant analysis (LDA), hierarchical clustering analysis (HCA), and principal component analysis (PCA) in SYSTAT 13 (Systat Inc.). For holdout testing, four fluorescence patterns out of ten or eleven for each analyte were separated from the training data matrix and used as a test data matrix. The test data were classified into the groups generated from the remaining training matrix according to their shortest Mahalanobis distances.
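The holdout procedure described above can be illustrated with a minimal sketch. It is not the SYSTAT implementation: it assumes that the Mahalanobis distance is computed from group centroids with a pooled within-group covariance (the usual LDA convention), and it runs on synthetic numbers rather than on any data from this study. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a fluorescence data matrix (NOT data from the study):
# 4 analytes x 10 replicate patterns x 6 sensor elements.
n_classes, n_reps, n_feat = 4, 10, 6
centers = 10.0 * np.eye(n_classes, n_feat)  # well-separated class means
X = np.vstack([c + rng.normal(size=(n_reps, n_feat)) for c in centers])
y = np.repeat(np.arange(n_classes), n_reps)

# Holdout split: set aside 4 patterns per analyte as the test matrix.
test_mask = np.zeros(len(y), dtype=bool)
for c in range(n_classes):
    members = np.flatnonzero(y == c)
    test_mask[rng.choice(members, size=4, replace=False)] = True
X_train, y_train = X[~test_mask], y[~test_mask]
X_test, y_test = X[test_mask], y[test_mask]

# Group centroids and pooled within-group covariance from the training matrix.
labels = np.unique(y_train)
centroids = np.array([X_train[y_train == c].mean(axis=0) for c in labels])
resid = X_train - centroids[np.searchsorted(labels, y_train)]
pooled_cov = resid.T @ resid / (len(X_train) - len(labels))
inv_cov = np.linalg.pinv(pooled_cov)

# Squared Mahalanobis distance of every test pattern to every centroid;
# each pattern is assigned to the group with the shortest distance.
diff = X_test[:, None, :] - centroids[None, :, :]
d2 = np.einsum("nkp,pq,nkq->nk", diff, inv_cov, diff)
predicted = labels[np.argmin(d2, axis=1)]
accuracy = float(np.mean(predicted == y_test))
print(f"holdout accuracy: {accuracy:.2f}")
```

With real data, `X` would hold one fluorescence pattern per row and `y` the analyte labels; the split size of four per analyte follows the text, whereas the covariance convention is an assumption.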
HCA dendrograms were created based on the Euclidean distances using the Ward method and a dataset standardized prior to analysis using the following equation: z = (x − μ)/σ, where z is the standardized score, x the raw score, μ the mean of the population, and σ the standard deviation of the population.

Tables
Table S1. Gut-derived bacterial strains used in this study. Columns: Phylum; Genus; Species; Strain; Abbr.; Ref. [Table body not recovered; the only surviving entry is the phylum Firmicutes.]

[Figure label missing] Two-dimensional LDA score plots for gut-derived bacteria (OD600 = 0.04) obtained from the array consisting of 12 TPE-functionalized PEG-b-PLLs. The ellipsoids represent the confidence intervals (±1 SD) for each analyte. Two datasets, one consisting of the fluorescence response patterns of (A) the raw data (I) and the other of (B) the data after background subtraction (I-I0), were subjected to LDA. For the raw data, the clusters of the 16 bacteria were spatially well separated, whereas using the I-I0 data, several clusters overlapped (e.g., F.A. and F.L.2). The results of the jackknife test also indicated better accuracy for the raw data (100% for the raw data and 99% for I-I0). These results suggest that background subtraction has a negative effect on the present system. Since many of the TPE-functionalized PEG-b-PLLs exhibited low background fluorescence, the negative effect of the data variability in measuring I0 may be greater than the positive effect of background cancellation.
[...] were also isolated in the plots of score (3) through score (5), which accounted for only 14.1%, 7.6%, and 7.2% of the variation, respectively, indicating that our chemical nose succeeds in extracting various independent aspects of the gut-derived bacteria.
[...] were not included in the cluster of the corresponding labels. Nevertheless, the LDA analysis, which allows a more accurate representation and assessment of the potential of the array for classification, yielded high accuracy as shown in Fig. 4B and Dataset 3. In other words, our array is sufficiently accurate to identify this analyte set.
Fig. S7. Cages for the preparation of mouse feces. In a normal cage, the mouse is free to enter and exit a running wheel, while in the sleep-disturbance cage, the mouse is constantly exposed to the stress of unstable ground in an unanchored running wheel. 2
Dataset 1 (separate file). Data-set matrix of the differences in fluorescence intensity before and after the addition of gut-derived bacteria (I-I0, OD600 = 0.04) generated from the chemical nose. The jackknife test afforded 99% accuracy.
Dataset 2 (separate file). Data-set matrix of the fluorescence intensity generated by the chemical nose after the addition of gut-derived bacteria (OD600 = 0.04). The three columns at the right indicate whether the data in the corresponding row was used as training data (denoted by "-") or test data (results of the verification are shown) in the holdout test.

Dataset 3 (separate file).
Data-set matrix of the fluorescence intensity generated by the chemical nose after the addition of obesity model bacteria mixtures. The two columns at the right indicate whether the data in the corresponding row was used as training data (denoted by "-") or test data (results of the verification are shown) in the holdout test.

Dataset 4 (separate file).
Data-set matrix of the fluorescence intensity generated by the chemical nose after the addition of mouse gut microbiota. The rightmost column indicates whether the data in the corresponding row was used as training data (denoted by "-") or test data (results of the verification are shown) in the holdout test.

General synthetic information
Physical data were measured as follows: 1H (400 MHz and 500 MHz) nuclear magnetic resonance (NMR) spectra were recorded using a Bruker NMR Spectrometer with DMSO-d6 [...] Biotechnology. Succinic anhydride, phthalic anhydride, and 2,3-pyrazinedicarboxylic anhydride were purchased from Sigma Chemical Co. All chemicals were used without further purification.

Guanidinylation of -None (-hA).
Guanidinylation of -None was carried out using a slightly modified literature procedure. 21
The amino acid modification of -None was carried out using a slightly modified literature procedure. 24

Acid anhydride modification of -None (-Suc, -Pht, and -Pyr).
Acid anhydride modification of -None was carried out using a slightly modified literature procedure. 25

Section 4: Characterization of the polymers
As the interactions between the TPE-functionalized PEG-b-PLLs and bacteria may be governed primarily by electrostatic forces (Fig. 2), elucidation of the charge state of the synthesized polymers was important in order to understand and construct the chemical nose. Therefore, acid-base titration was carried out to determine the pKa of the polymers (Fig. S14). The titration results indicated that the functionalization of PEG-b-PLL with amino acids not only changed the structure of the side chains, but also greatly affected the pKa of the amino groups (Table S4). The introduction of amino acids with dual amino groups (-

Section 5: Understanding the sensing elements and reproducibility, and construction of minimal sensor systems
The array of TPE-functionalized block copolymers provided 48 elements (12 polymers × 2 pH values × 2 channels).
Understanding how each element contributes to the extraction of microbial features in such a large array is important to provide guidelines for the effective selection or design of materials, as well as for discovering new applications of the arrays. Thus, we subjected all the signals from the 16 gut-derived bacteria to an unsupervised hierarchical clustering analysis (HCA), in which the calculated distance between elements corresponds to the similarities in the response patterns of these elements. 28 In the resulting dendrogram (Fig. S17) [...] These tendencies were also apparent in the loading plots (Fig. S18). For example, in the plots of PC1 vs. PC2, we observed (i) clusters of anionic polymers, (ii) clusters of cationic polymers with hydrophobic amino acids at pH = 7.0, and (iii) clusters of cationic polymers without amino acids at pH = 7.0. These plots also suggest that the data of the two channels (Ch1 and Ch2) covaried. Therefore, the contribution of the channels should be lower than those of the polymer structures and pH values, although the cost of obtaining data in different channels is quite low. In fact, even with Ch1 alone, the validation tests showed high reliability (99% in a leave-one-out cross-validation test and 100% in a holdout test). In addition, the loading plots showed that the difference in pH values particularly affected PC2 and PC4, e.g., the points corresponding to pH = 7.0 appeared to be distributed in the positive direction in PC2 and in the negative direction in PC4. These results indicate that the differences in both polymer structure and pH value are important for providing cross-reactivity to gut-derived bacteria.
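To make the loading-plot argument concrete, the sketch below computes PCA loadings for a 48-element array (12 polymers × 2 pH values × 2 channels) and checks how closely the Ch1 and Ch2 loadings track each other. The data are synthetic, the Ch2 signal is deliberately constructed to covary with Ch1, and the polymer names are placeholders; none of this reproduces Fig. S18.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(1)

polymers = [f"P{i:02d}" for i in range(1, 13)]  # hypothetical polymer names
elements = [f"{p}/pH{ph}/{ch}"
            for p, ph, ch in product(polymers, ("5.0", "7.0"), ("Ch1", "Ch2"))]
# 12 polymers x 2 pH values x 2 channels = 48 elements

n_samples = 96                                       # e.g. 16 analytes x 6 replicates
ch1 = rng.normal(size=(n_samples, 24))               # synthetic Ch1 signals
ch2 = 0.8 * ch1 + 0.02 * rng.normal(size=ch1.shape)  # Ch2 built to covary with Ch1
X = np.empty((n_samples, 48))
X[:, 0::2], X[:, 1::2] = ch1, ch2                    # channel is the fastest index

# Standardize each element, then obtain principal components by SVD.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
U, S, Vt = np.linalg.svd(Z, full_matrices=False)
explained = S**2 / np.sum(S**2)                      # fraction of variance per PC
loadings = Vt.T * S / np.sqrt(n_samples - 1)         # element-PC correlations

# Largest Ch1/Ch2 loading difference on PC1 and PC2 across all polymer/pH pairs.
gap = float(np.abs(loadings[0::2, :2] - loadings[1::2, :2]).max())
print(f"max Ch1-Ch2 loading gap on PC1/PC2: {gap:.3f}")
```

A small `gap` means the two channels occupy nearly the same position in the loading plot, which is the situation in which dropping one channel costs little information.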
In summary, (i) the charge state and (ii) hydrophobicity of the polymers, as well as (iii) the solution pH, contributed to the diversification of the fluorescence responses to bacteria, demonstrating the effectiveness of our polymer design and choice of solution conditions for sensing gut-derived bacteria. Based on the HCA results, the subsequent studies used a smaller number of combinations while maintaining a sufficiently high performance of the sensor elements in models of gut microbiota associated with obesity (Fig. 4), real mouse microbiota (Fig. 5), and others (vide infra), i.e., six polymers (-None, -Dap, -Gly3, -Leu, -Phe, and -Pht), the two pH values, and Ch1. The six polymers were selected evenly from the five clusters observed in Fig. S17, with the expectation of efficiently generating fluorescence responses with low similarity, i.e., diverse responses. We decided to continue to use two different pH values (pH = 5.0 and 7.0), as pH differences played an important role in diversifying responses (Fig. S17). For the TPE-functionalized polymers, we found that differences in detection channels contributed little to obtaining different information, and hence, Ch1 was chosen as it produces higher fluorescence intensity.
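The selection of sensor elements from an HCA dendrogram can be sketched as follows. The sketch standardizes each element with z = (x − μ)/σ, clusters the elements with the Ward method on Euclidean distances, cuts the tree into five clusters, and picks one representative per cluster. The data are synthetic, the array is reduced to 12 elements for brevity, and the rule "closest to the cluster mean" is a hypothetical choice, not the selection rule used in this study.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(2)

# Synthetic stand-in: 12 sensor elements whose responses to 16 analytes
# follow 5 latent response profiles plus small noise (not data from the study).
true_group = np.array([0, 0, 0, 1, 1, 2, 2, 2, 3, 3, 4, 4])
profiles = rng.normal(size=(16, 5))
X = profiles[:, true_group] + 0.05 * rng.normal(size=(16, 12))

# z = (x - mu) / sigma with the population standard deviation (ddof=0).
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# Cluster the elements (columns): each observation is one element's profile.
tree = linkage(Z.T, method="ward", metric="euclidean")
cluster_id = fcluster(tree, t=5, criterion="maxclust")

# Hypothetical selection rule: keep the element closest to each cluster mean.
selected = []
for c in np.unique(cluster_id):
    members = np.flatnonzero(cluster_id == c)
    center = Z[:, members].mean(axis=1, keepdims=True)
    dist = np.linalg.norm(Z[:, members] - center, axis=0)
    selected.append(int(members[np.argmin(dist)]))
print("cluster ids:", cluster_id, "selected elements:", selected)
```

Picking elements from different clusters favors response patterns with low mutual similarity, which is the stated goal of the reduced array.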
In order to gain a deeper insight into the effects of the pH value, we attempted to discriminate gut-derived bacteria with the addition of weakly basic conditions (pH = 9.0) using a chemical nose consisting of the above six polymers (-None, -Dap, -Gly3, -Leu, -Phe, and -Pht) and detection using Ch1 (Fig. S19; for the raw data, see Dataset 5). As inferred from the HCA analysis (Fig. S17), the distribution of clusters in the LDA plots obtained at pH = 5.0 and pH = 7.0 was different, and most of the clusters were well separated (the jackknife test afforded 100% accuracy in both cases). In contrast, at pH = 9.0, some of the clusters overlapped, and the accuracy based on the jackknife test was 92%. This decrease in accuracy was probably due to an increase in fluorescence intensity prior to bacterial addition, which should be associated with deprotonation of the cationic polymers (Figs. S15 and S16). These results suggest that the selection of weakly acidic (pH = 5.0) and neutral (pH = 7.0) conditions is sufficiently effective for sensing gut-derived bacteria with the present TPE-functionalized block copolymers.
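The jackknife (leave-one-out) accuracy quoted above can be sketched as a loop in which each pattern is removed, the groups are rebuilt from the remaining patterns, and the removed pattern is classified. The classifier here is an assumed Mahalanobis rule with pooled covariance, and the two synthetic datasets merely contrast well-separated and overlapping clusters; the noise levels are illustrative and do not model any particular pH condition.

```python
import numpy as np

def nearest_group(X_train, y_train, x):
    """Label of the group whose centroid has the smallest Mahalanobis distance to x."""
    labels = np.unique(y_train)
    cents = np.array([X_train[y_train == c].mean(axis=0) for c in labels])
    resid = X_train - cents[np.searchsorted(labels, y_train)]
    inv_cov = np.linalg.pinv(resid.T @ resid / (len(X_train) - len(labels)))
    d = cents - x
    d2 = np.einsum("kp,pq,kq->k", d, inv_cov, d)
    return labels[np.argmin(d2)]

def jackknife_accuracy(X, y):
    """Leave each pattern out once, refit on the rest, and score the left-out pattern."""
    keep = np.ones(len(X), dtype=bool)
    hits = 0
    for i in range(len(X)):
        keep[i] = False
        hits += int(nearest_group(X[keep], y[keep], X[i]) == y[i])
        keep[i] = True
    return hits / len(X)

rng = np.random.default_rng(4)
n_classes, n_reps, n_feat = 8, 6, 12   # e.g. 8 analytes, 6 polymers x 2 pH values
y = np.repeat(np.arange(n_classes), n_reps)
centers = 10.0 * np.eye(n_classes, n_feat)[y]

X_sharp = centers + 1.0 * rng.normal(size=centers.shape)    # well-separated clusters
X_blurred = centers + 8.0 * rng.normal(size=centers.shape)  # strongly overlapping clusters
acc_sharp = jackknife_accuracy(X_sharp, y)
acc_blurred = jackknife_accuracy(X_blurred, y)
print(f"sharp: {acc_sharp:.2f}, blurred: {acc_blurred:.2f}")
```

Because every pattern is scored by a model that never saw it, the jackknife accuracy drops as soon as clusters begin to overlap, which is why it is a useful single-number summary of the LDA plots.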
In addition, we compared the results of the same experiment performed on different days (E1 and E2) in order to examine the reproducibility of our chemical noses that consist of six polymers (-None, -Dap, -Gly3, -Leu, -Phe, and -Pht), two pH values (pH = 5.0 and 7.0), and Ch1 (Fig. S20; for the raw data, see Dataset 6). Fluorescence pattern data of 16 kinds of labels (8 gut-derived bacteria × 2 experiments) analyzed by LDA showed that the clusters of the same bacteria were very close to each other even if the experimental dates were different. Importantly, when a holdout test was conducted using the dataset obtained by E1 as training data and the dataset obtained by E2 as test data, 100% accuracy was obtained (48 out of 48).
Therefore, our chemical noses offer a robust evaluation system that is capable of reproducibly distinguishing gut-derived bacteria.
While the construction of chemical noses with high discrimination potential is important, for practical applications, it is also important to identify the minimum set of components that exhibits sufficient discriminatory power. Therefore, the accuracy of chemical noses composed of two polymers in identifying the 16 intestinal bacterial strains was comprehensively tested (Table S5) [...] were separated in the score (3) and score (5) space (Fig. S21). In addition, this combination also provided 97% accuracy in the holdout test. These results suggest that dual amino groups (-Dap) and an aromatic ring (-Phe) are particularly suitable structural features for the recognition of the bacterial surfaces. Thus, we have demonstrated that a wide range of chemical noses can be constructed, ranging from minimal systems with sufficient reliability (Fig. S21) to large systems with further potential for identifying more bacteria (Fig. 3).
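A comprehensive test of two-polymer arrays amounts to scoring every pair drawn from the 12 polymers (66 pairs) and ranking them. The sketch below does this on synthetic data in which two placeholder polymers, P02 and P05, are planted as the only informative ones. For brevity it scores each pair with a leave-one-out nearest-centroid rule on Euclidean distances, which is a simplified stand-in for the LDA-based jackknife test and not the procedure behind Table S5.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(3)

polymers = [f"P{i:02d}" for i in range(1, 13)]  # hypothetical names, 12 polymers
n_classes, n_reps = 16, 5
y = np.repeat(np.arange(n_classes), n_reps)

# Synthetic responses: two columns per polymer (pH 5.0 and pH 7.0), all noise
# except that P02 and P05 each encode part of the class identity.
X = rng.normal(size=(len(y), 24))
X[:, 2 * 1] += 10.0 * (y % 4)    # P02, pH 5.0 column
X[:, 2 * 4] += 10.0 * (y // 4)   # P05, pH 5.0 column

def loo_nearest_centroid_accuracy(X, y):
    """Leave-one-out accuracy of a Euclidean nearest-centroid classifier."""
    idx = np.searchsorted(np.unique(y), y)
    n_groups = idx.max() + 1
    sums = np.array([X[idx == k].sum(axis=0) for k in range(n_groups)])
    counts = np.bincount(idx)
    hits = 0
    for x, k in zip(X, idx):
        cents = sums / counts[:, None]
        cents[k] = (sums[k] - x) / (counts[k] - 1)  # drop x from its own group
        hits += int(np.argmin(((cents - x) ** 2).sum(axis=1)) == k)
    return hits / len(X)

results = []
for a, b in combinations(range(12), 2):
    cols = [2 * a, 2 * a + 1, 2 * b, 2 * b + 1]     # both pH columns of each polymer
    acc = loo_nearest_centroid_accuracy(X[:, cols], y)
    results.append((acc, (polymers[a], polymers[b])))
results.sort(key=lambda r: r[0], reverse=True)
best_accuracy, best_pair = results[0]
print(f"best pair: {best_pair}, accuracy: {best_accuracy:.2f}")
```

In this toy setup neither planted polymer can separate all 16 classes alone, so only their combination reaches high accuracy, which mirrors the idea that a minimal array works when its members respond to complementary features.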
The combination of solely -None and -Dap, which afforded 100% accuracy for gut-derived bacteria in the jackknife test above, also achieved high accuracy for the mouse fecal samples (93% for the jackknife test and 97% for the holdout test).
Cluster analysis of the response patterns generated by the chemical nose composed of six polymers (Fig. 5) showed a low correlation between the responses of these two polymers (Fig. S22). These results suggest that this minimal sensor could be useful for a wide range of applications, from identifying bacterial strains to distinguishing mouse gut microbiota.
[...] (Table S2). Statistical analysis of the fluorescence response patterns (Fig. S23A) using HCA showed that each strain was clustered to some extent (Fig. S23B). LDA showed that the clusters of all of the strains, except those of Rosetta2 (DE3) and Rosetta-gami B (DE3), were distributed without overlap in the two-dimensional space (Fig. S24A). The clusters of these two strains were almost separated in the plot of score (4) vs. score (5) (Fig. S24B) [...]
Blautia spp. (phylum Firmicutes) can vary from a few percent to about 15% of the total gut bacteria at the population level. 35 Members of the genus Bacteroides (phylum Bacteroidetes) can account for as much as 30% of the human gut microbiome. 36 The ability to detect changes in such particular classes of bacteria may lead to the creation of unique applications. Therefore, we have attempted to take a step forward on this issue, although this step remains preliminary at this point.
[...] with OD600 = 0.0003 and 0.0010 (Fig. S25B). Therefore, the detection limit of this method is estimated to be in the range of OD600 = 0.0010-0.0030.
In the present study, model samples were prepared wherein two gut-derived bacteria (F.C. and B.B.4) were spiked with 7.9%, 14.6%, and 20.5% of the fecal microbiome, based on the estimated detection limit and peak responses of these gut-