Leveraging autocatalytic reactions for chemical domain image classification†

Autocatalysis is fundamental to many biological processes, and kinetic models of autocatalytic reactions have mathematical forms similar to activation functions used in artificial neural networks. Inspired by these similarities, we use an autocatalytic reaction, the copper-catalyzed azide–alkyne cycloaddition, to perform digital image recognition tasks. Images are encoded in the concentration of a catalyst across an array of liquid samples, and the classification is performed with a sequence of automated fluid transfers. The outputs of the operations are monitored using UV-vis spectroscopy. The growing interest in molecular information storage suggests that methods for computing in chemistry will become increasingly important for querying and manipulating molecular memory.


Code Repository
All of the MATLAB methods and scripts used to train and simulate the autocatalysis-based winner-take-all networks can be found in the GitHub repository AutocatalyticWTA: https://github.com/Chris3Arcadia/AutocatalyticWTA.

Image Database
The images used in this study are from the CalTech 101 Silhouettes dataset [1], which is itself based on the CalTech 101 image annotations [2]. The database contains 8,641 binary 256-pixel (16×16) images, each labeled with one of 101 object classes. The data available for each class are summarized by their class averages in Figure S1. Additionally, as a measure of class distinctness, the Euclidean distance between each pair of class-averaged images is shown in Figure S2. From both Figures S1 and S2, it is clear that many of the images contain class-specific features, but there are also several classes with significant feature overlap. In particular, large, filled circular objects, such as the soccer ball, stop sign, and yin yang symbol, are nearly indistinguishable.

Figure S2 Euclidean distance between each class-averaged representation of the CalTech 101 Silhouettes [1] (those shown in Figure S1). Similar classes, such as Faces 2 and Faces 3 (indices 2 and 3, respectively), are nearer to each other and thus have intersections that appear darker in color. The distances were computed as $d = \lVert \mathbf{x} - \mathbf{y} \rVert_2 = \sqrt{\sum_i (x_i - y_i)^2}$, where x and y are vectors formed by reshaping the 16×16-pixel images being compared.
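To make the distance computation concrete, the following minimal MATLAB sketch reproduces the class averages and the pairwise distance matrix of Figure S2. It assumes the dataset has already been loaded into a matrix X (N × 256, one flattened 16×16 image per row) and a label vector labels (N × 1, values 1 to 101); these variable names are illustrative and are not taken from the repository code.

% Class-averaged silhouettes (as in Figure S1)
numClasses = max(labels);
avgImages = zeros(numClasses, 256);
for c = 1:numClasses
    avgImages(c, :) = mean(X(labels == c, :), 1);  % average over class members
end

% Pairwise Euclidean distances between class averages (as in Figure S2)
D = zeros(numClasses);
for i = 1:numClasses
    for j = 1:numClasses
        D(i, j) = norm(avgImages(i, :) - avgImages(j, :));  % d = ||x - y||_2
    end
end
imagesc(D); axis square; colorbar;  % nearby (similar) classes have small d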

Selected Classes
The data used for training and testing our winner-take-all network are a subset of the images available in the CalTech 101 Silhouettes database [1]. Specifically, the dragonfly, ibis, kangaroo, llama, and starfish classes are used in the main text. For completeness, all of the images from these classes are displayed in Figures S3 (dragonfly), S4 (ibis), S5 (kangaroo), S6 (llama), and S7 (starfish). These classes were selected for their aesthetic qualities and to exclude the large solid objects, which would pose a much more difficult classification task.

Classifier Training
The winner-take-all network in the main text was trained using gradient descent. Data from the 5 considered classes were split 70:30 into training (279 images) and test (119 images) sets. The objective function, F (defined in the main text), was minimized over 700 descent epochs using a learning rate of 5 × 10⁻⁴. Since the partial derivatives of the objective, $\partial F / \partial \mathbf{w}_c$ (also defined in the main text), depend only on class-specific terms, the weight vectors for each class were updated separately during gradient descent. The exact MATLAB function used to tune the classifier weights is presented in Listing S1 (also see Section 1). After each epoch, the objective function was evaluated, and its change between epochs was used to monitor training progress (see Figure S8a). The diminishing change in objective value towards the end of training indicated that a local minimum had been reached (see Figure S8b). To assess classifier performance, all images from the considered classes were labeled by the trained network. A comparison between these predictions and the known true labels is shown in Figure S9. Ultimately, the network correctly classified 81.16% of all the data.

Figure S9 Classification confusion matrix for the 5-class network. Cells represent the number of images from each known class (row) that were labeled with each predicted class (column). The predicted class of each image was determined by the reaction well with the shortest simulated time to transition. Data from both the training and test sets were used to generate this matrix. Overall, classes were assigned with 81.16% accuracy.
Listing S1 Network training function, as implemented in MATLAB.
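The published training function is available in the repository (see Section 1). For orientation, a minimal sketch of the training loop it implements is given below. The objective F and its class-specific partial derivatives are defined in the main text, so they appear here only as function handles (objF, gradF); the function name, variable names, and initialization are illustrative assumptions, not the repository's exact code.

function [W, Fvals] = trainNetworkSketch(X, labels, numEpochs, eta, objF, gradF)
% Gradient-descent training loop sketch for the winner-take-all classifier.
%   X        - N x 256 matrix of flattened training images
%   labels   - N x 1 vector of class indices (1..C)
%   eta      - learning rate (5e-4 in this work)
%   objF     - handle returning the objective F(W, X, labels)
%   gradF    - handle returning dF/dw_c for class c: gradF(W, X, labels, c)
numClasses = max(labels);
W = 0.01 * randn(numClasses, size(X, 2));      % small random initial weights
Fvals = zeros(numEpochs, 1);
for epoch = 1:numEpochs
    for c = 1:numClasses
        % dF/dw_c depends only on class-specific terms, so each class's
        % weight vector can be updated independently
        W(c, :) = W(c, :) - eta * gradF(W, X, labels, c);
    end
    Fvals(epoch) = objF(W, X, labels);         % monitor progress (Figure S8)
end
end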

Simulating Many 5-Class Networks
The winner-take-all network presented in the main text used a manually chosen subset of image classes from the CalTech 101 Silhouettes dataset [1]. To see how similar networks would perform if the classes were randomly selected, we trained and simulated 100 such classifiers, each with its own unique set of 5 object classes. The results are summarized in Figure S10 and Table S1. Despite the known class degeneracies present in this dataset (see Section 3) and the simple network topology, the majority of classifiers perform significantly better than random guessing.

Figure S10 Overall classification accuracy across all 100 simulated 5-class networks. (a) A histogram of the classification accuracies observed across both the training and test sets (those shown in panel b). The dashed orange line represents the probability of randomly guessing the correct image class (1/5 = 20%). (b) Classification accuracies of the simulated networks plotted against their ensemble class separations, the minimal Euclidean distance between class averages (see Figure S2). There is a moderate positive correlation (Pearson coefficient of 0.39) between accuracy and class separation. Each network was randomly assigned a unique set of 5 image classes to learn (see Table S1) and was trained over 700 epochs with a learning rate of 5 × 10⁻⁴.
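A sketch of this simulation procedure follows. It reuses trainNetworkSketch from above and assumes a hypothetical predictClass function that returns, for each image, the class whose reaction well has the shortest simulated time to transition; uniqueness of the class sets across networks is not enforced in this sketch. The distance matrix D is the one computed in Section 3.

numNets = 100;
accuracy = zeros(numNets, 1);
separation = zeros(numNets, 1);
for n = 1:numNets
    cls = randperm(101, 5);                    % random set of 5 classes
    keep = ismember(labels, cls);
    [~, y] = ismember(labels(keep), cls);      % relabel as 1..5
    Xs = X(keep, :);
    W = trainNetworkSketch(Xs, y, 700, 5e-4, objF, gradF);
    accuracy(n) = mean(predictClass(W, Xs) == y);
    Dsub = D(cls, cls);                        % ensemble class separation:
    separation(n) = min(Dsub(Dsub > 0));       % minimal pairwise distance
end
R = corrcoef(separation, accuracy);            % Pearson correlation
fprintf('Pearson coefficient: %.2f\n', R(1, 2));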

Simulating a 9-Class Network
To demonstrate that an autocatalytic winner-take-all network can be successfully applied to more difficult classification tasks, we trained one such network over 900 epochs on the following 9 classes from the CalTech 101 Silhouettes database [1]: revolver, lamp, mandolin, headphone, umbrella, helicopter, pyramid, chair, and saxophone. The learned weights are shown in Figure S11. These weights were used to classify all images associated with the 9 considered classes. The resulting confusion matrix is shown in Figure S12, and the overall classification accuracy was 80.00%.
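The confusion matrices of Figures S9 and S12 can be assembled from the true and predicted labels as in the short sketch below; yTrue and yPred are assumed to be N × 1 vectors of class indices, with yPred produced by the shortest-time-to-transition rule described above.

C = max(yTrue);
confusion = zeros(C);                          % rows: true class; cols: predicted
for i = 1:numel(yTrue)
    confusion(yTrue(i), yPred(i)) = confusion(yTrue(i), yPred(i)) + 1;
end
accuracy = 100 * sum(yPred == yTrue) / numel(yTrue);  % e.g. 80.00% here
imagesc(confusion); axis square; colorbar;     % visualize as in Figure S12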

Figure S11 Classifier weights for the 9-class network (panels, left to right: revolver, lamp, mandolin, headphone, umbrella, helicopter, pyramid, chair, saxophone). Weights were learned over the course of 900 gradient descent epochs with a learning rate of 5 × 10⁻⁴.

Figure S12 Classification confusion matrix for the 9-class network. Images from both the training and test sets were included. Overall, classes were assigned with 80.00% accuracy.