Yoon Ho Jang‡, Joon-Kyu Han‡, Sangik Moon, Sung Keun Shim, Janguk Han, Sunwoo Cheong, Soo Hyung Lee and Cheol Seong Hwang*
Department of Materials Science and Engineering and Inter-university Semiconductor Research Center, College of Engineering, Seoul National University, Seoul, 08826, Republic of Korea. E-mail: cheolsh@snu.ac.kr
First published on 15th November 2023
In-sensor reservoir computing (RC) is a promising technology to reduce power consumption and training costs of machine vision systems by processing optical signals temporally. This study demonstrates a high-dimensional in-sensor RC system with optoelectronic memristors to enhance the performance of the in-sensor RC system. Because optoelectronic memristors can respond to both optical and electrical stimuli, optical and electrical masks are proposed to improve the dimensionality and performance of the in-sensor RC system. An optical mask is employed to regulate the wavelength of light, while an electrical mask is used to control the initial conductance of zinc oxide optoelectronic memristors. The distinct characteristics of these two masks contribute to the representation of various distinguishable reservoir states, making it possible to implement diverse reservoir configurations with minimal correlation and to increase the dimensionality of the in-sensor RC system. Using the high-dimensional in-sensor RC system, handwritten digits are successfully classified with an accuracy of 94.1%. Furthermore, human action pattern recognition is achieved with a high accuracy of 99.4%. These high accuracies are achieved with the use of a single-layer readout network, which can significantly reduce the network size and training costs.
New concepts
Current in-sensor reservoir computing (RC) systems fuse in-sensor computing with RC by using volatile optoelectronic memristors as optical reservoirs. However, prior in-sensor RC approaches struggled to create higher-dimensional reservoirs, which limited the system's performance or versatility. This study introduces a method to address this limitation by exploiting the dual responsiveness of optoelectronic memristors to both optical and electrical stimuli. The method employs optical and electrical masks to diversify the reservoir states and enhance the reservoir dimensionality. Unlike conventional masking processes, which require intricate signal conversions, this technique enhances dimensionality efficiently and in a controllable manner. Optoelectronic memristors were fabricated from zinc oxide (ZnO) deposited by atomic layer deposition and exhibited a large optoelectronic switching window of ∼10³. Using the fabricated ZnO optoelectronic memristor as a reservoir component, the system classified MNIST handwritten digits (94.1%) and recognized human action patterns (99.4%). These results were achieved with a single-layer readout network and exceed the accuracies of previous studies. This study underscores the potential for novel computing hardware in materials research fields, marking a significant step toward the neuromorphic computing era.
Recently, many neuromorphic machine vision systems have been proposed using optoelectronic memristors as the central component of in-sensor computing systems.10–12 Optoelectronic memristors can detect light directly and perform preliminary processing of visual input as in the biological sensory neurons.13–18 Neuromorphic machine vision systems based on optoelectronic memristors are promising because data latency and power consumption can be significantly reduced without requiring analog-to-digital conversion.
Besides, reservoir computing (RC) is a promising approach to further decrease the power consumption and training costs of the neuromorphic system by incorporating temporal processing through reservoirs.19–24 Reservoirs within the RC system transmit temporal signals to high-dimensional states, which are processed by a simple readout layer. This approach substantially decreases the overall network size and the expenses associated with training. Therefore, implementing an in-sensor RC system with optoelectronic memristors provides a unique opportunity to combine the benefits of both models and significantly enhance the efficiency of processing optical signals. Driven by this motivation, various in-sensor RC systems have recently been proposed based on optoelectronic materials, such as metal oxides and 2-dimensional materials, and applied to temporal sensory signal processing and pattern classification.25–30
In the meantime, the dimensionality of the reservoirs, which refers to the number of input neurons, is a crucial factor that determines the performance of the RC system when dealing with complex problems.24,31–33 Increasing the dimensionality of the reservoirs enables the RC system to achieve higher prediction accuracy by providing a larger capacity to capture and differentiate information from the input signals within a broader feature space. The most widely adopted approach to achieve higher dimensionality involves using a mask.22 Typically, a mask is a matrix where random binary values (1 and −1) are assigned, transforming the original input signals into various modified input signals. However, applying such a mask in an in-sensor RC system is too complicated because optical inputs must be converted into electrical signals and then subjected to random modifications based on the mask. In addition, the mask cannot be controlled according to specific requirements since it is based on a random process. Thus, no methods were reported to enhance the dimensionality of in-sensor RC systems.
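For comparison, the conventional masking step described above can be sketched as follows. This is a generic illustration rather than the procedure of any cited work: the function name `apply_binary_mask`, the number of virtual nodes, the toy input, and the random seed are all assumptions.

```python
import numpy as np

def apply_binary_mask(u, n_nodes, seed=0):
    """Expand a 1-D input u into n_nodes masked copies using random +/-1 weights."""
    rng = np.random.default_rng(seed)
    # Hypothetical mask: one row of random +1/-1 values per virtual node.
    mask = rng.choice([-1.0, 1.0], size=(n_nodes, u.size))
    # Each row is the original signal with a different random sign pattern.
    return mask * u, mask

u = np.array([0.2, 0.8, 0.5, 0.1])   # toy input, already converted to an electrical signal
masked, mask = apply_binary_mask(u, n_nodes=3)
print(masked.shape)  # (3, 4)
```

The sketch makes the two drawbacks noted in the text visible: the input must already be an electrical signal before it can be multiplied, and the sign pattern is fixed by a random draw rather than by a controllable physical parameter.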
This paper proposes novel methods to improve the dimensionality of in-sensor RC systems, which can enhance the performance of neuromorphic machine vision systems. Since optoelectronic memristors can respond to both optical and electrical stimuli, two kinds of masks are adopted, which control the optical inputs and the memristor's electrical properties. As illustrated in Fig. 1a, an optical mask is employed to regulate light wavelength. An electrical mask is utilized to control the initial conductance of zinc oxide (ZnO) optoelectronic memristors before receiving the optical inputs. The distinct characteristics of these two masks help represent various distinguishable reservoir states, enabling the implementation of diverse reservoir configurations with minimal correlation and enhancing the dimensionality of the in-sensor RC system. Unlike conventional masking processes, complex conversion or random modification processes are unnecessary. Using such a high-dimensional in-sensor RC approach, handwritten digits are successfully classified with 94.1% accuracy using the Modified National Institute of Standards and Technology (MNIST) dataset, corresponding to the highest performance among previously reported in-sensor RC works. In addition, human action pattern recognition shows its capability for complex motion perception with a high accuracy of 99.4%. Notably, these accomplishments are achieved by utilizing a single-layer readout network, which can significantly reduce the network size and minimize the training costs.
The optoelectronic properties of the fabricated ZnO optoelectronic memristor were verified by measuring its current–voltage (I–V) characteristics, as shown in the left panel of Fig. 2a. A white light stimulus with an intensity of 5 mW cm⁻² was applied during the positive sweeps. As a result of the light illumination, the device transitioned into a low-resistance state (LRS), confirming the optical SET process. During the negative sweeps, the device gradually returned to a high-resistance state (HRS) due to negative voltage application, confirming the electrical RESET process.
The conductance values obtained from 100 consecutive optical SET and electrical RESET cycles are presented in the right panel of Fig. 2a, demonstrating excellent cycle-to-cycle uniformity. The conductance values were extracted at a fixed read voltage of −3 V, and Fig. S2 (ESI†) demonstrates that the read operation at −3 V does not affect the memristor's conductance. The results of Fig. 2a show that the fabricated device exhibits an exceptional on/off ratio exceeding 10³ at the read voltage, outperforming prior research on optoelectronic memristors.12,29,36 This substantial switching range not only offers sufficient capacity for representing various input patterns but also facilitates the implementation of the masking process, thereby contributing to the establishment of a high-dimensional reservoir state. The operation of the ZnO optoelectronic memristor is not associated with capacitive effects or ferroelectric polarization and does not exhibit non-zero crossing I–V characteristics,37 as illustrated in Fig. S3 (ESI†). Additionally, Fig. 2b shows the I–V curves of 15 independent devices, indicating minimal device-to-device variation.
Fig. 2c and d depict the conductance changes observed while applying optical pulses. The conductance values were read at a voltage of −3 V. The optical pulses increased the conductance, which is attributed to the capture of electrons in trap sites, a common property of optoelectronic memristors. When the illumination was removed, the conductance gradually decayed as the trapped electrons were released. This phenomenon represents a typical light-induced short-term plasticity (STP) mechanism employed to achieve in-sensor RC.23,25–29 In addition, paired-pulse facilitation (PPF), which plays a vital role in biological synapse function, results from STP.29 To further substantiate the presence of STP in this device, an additional experiment was conducted to measure the PPF characteristics. When a pair of optical pulses is applied, the PPF ratio is defined as A2/A1, where A1 represents the conductance resulting from the first pulse, and A2 represents the conductance resulting from the second pulse. In devices exhibiting STP, the PPF ratio depends on the pulse interval (Δt) because the conductance decays over time. As shown in Fig. S4 (ESI†), the PPF ratio decreased from 1.3 to 1.08 when Δt increased from 0.1 s to 1.0 s, confirming the STP in this device.
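The dependence of the PPF ratio on Δt can be illustrated with a toy exponential-relaxation model. This is a sketch under stated assumptions, not a fit to Fig. S4: the function name `ppf_ratio`, the time constant `tau`, and the facilitation strength `c` are hypothetical.

```python
import math

def ppf_ratio(dt, tau=1.0, c=0.5):
    """Toy paired-pulse facilitation ratio A2/A1 for a pulse interval dt (seconds).

    Assumes the first pulse produces a response A1 and that a fraction
    c*exp(-dt/tau) of it is still present and adds to the second response.
    """
    a1 = 1.0
    residual = c * a1 * math.exp(-dt / tau)   # what is left of the first response
    a2 = a1 + residual
    return a2 / a1

for dt in (0.1, 0.5, 1.0):
    print(dt, round(ppf_ratio(dt), 3))
```

In this model the ratio stays above 1 and falls monotonically as Δt grows, which is the qualitative signature of STP described in the text.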
Two parameters were controlled to manipulate the conductance changes. First, the wavelengths of the light were varied, and second, the initial conductance of the device was modulated. Wavelength modulation was achieved using an optical mask, while the initial conductance modulation was accomplished using an electrical mask. These masks enabled high-dimensional operations within the in-sensor RC.
The optical masks, controlling the wavelengths, were achieved using colored cellophane papers for filtering while generating identical optical pulses from the light source. Three different kinds of cellophane papers were used for masking: transparent cellophane paper for white light illumination, blue-colored cellophane paper for blue light illumination, and red-colored cellophane paper for red light illumination. It is worth noting that these colored cellophane papers can be substituted with a color filter array, commonly used in practical applications of image sensors.38,39 As shown in Fig. 2c, when the cellophane papers had colors, the increase in conductance due to optical pulses became smaller, as only photons with specific wavelengths were applied to the device. When the red-colored cellophane paper was used as the mask, the conductance increase was smaller than the blue-colored cellophane paper, owing to the lower photonic energy of red light.40 The decay rate was slower when illuminated with red light, as the activation energy at the lower conductance state is higher, and electrons rarely escape from the trap sites. For the quantitative activation energy analysis, the time-dependent current-relaxation characteristics were measured, as shown in Fig. S5 (ESI†). From the Arrhenius plots, the activation energies of the low and high conductance states were extracted as 0.27 eV and 0.18 eV, respectively.41 Because of the higher activation energy, it was difficult to detrap the trapped electrons from the trap sites at the lower conductance state formed by red light illumination.
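The Arrhenius analysis mentioned above can be sketched as a linear fit of ln(τ) against 1/(k_B·T). The text does not state which quantity was plotted, so the use of a relaxation time τ, the prefactor `tau0`, the temperature points, and the function name `activation_energy` are assumptions; the data below are synthetic, generated with the reported 0.27 eV, and are not measurements.

```python
import numpy as np

K_B = 8.617333262e-5  # Boltzmann constant in eV/K

def activation_energy(temps_k, taus):
    """Return Ea (eV) from a linear fit of ln(tau) against 1/(k_B*T)."""
    x = 1.0 / (K_B * np.asarray(temps_k, dtype=float))
    y = np.log(np.asarray(taus, dtype=float))
    slope, _intercept = np.polyfit(x, y, 1)   # slope of the Arrhenius plot is Ea
    return slope

# Synthetic relaxation times generated with Ea = 0.27 eV (not measured data).
temps = np.array([300.0, 320.0, 340.0, 360.0])
tau0 = 1e-5  # hypothetical prefactor in seconds
taus = tau0 * np.exp(0.27 / (K_B * temps))
print(round(activation_energy(temps, taus), 3))  # 0.27
```

A larger fitted slope corresponds to a higher barrier for detrapping, which is why the lower conductance state (0.27 eV) relaxes more slowly than the higher one (0.18 eV).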
On the other hand, with the electrical masks, the initial conductance of the device was regulated by applying negative electrical pulses before exposing it to light. The trapped electrons were released by applying negative voltages. Modifying the initial conductance can result in various reservoir states, as the optical response varies based on the initial conductance. Fig. S6 (ESI†) shows the conductance as a function of the number of electrical pulses. Such conductance modulation was utilized to control the initial conductance for the operation of the electrical mask. For example, memristors with three distinct initial conductance states can be created by not applying any electrical pulses to the first memristor (low initial conductance), applying a small number of negative electrical pulses to the second memristor (middle initial conductance), and applying a large number of negative electrical pulses to the third memristor (high initial conductance). As the initial conductance increased, the conductance increase induced by optical pulses became smaller because a significant number of electrons were already trapped at the trap sites, as shown in Fig. 2d. The decay rate was faster when the initial conductance was high, primarily because the activation energy was lower, allowing electrons to escape the trap sites more easily. Note that this behavior differs from that of the optical mask used for wavelength modulation. In the case of the optical mask, both the conductance increase and the decay rate were lower at longer wavelengths (red light). In the case of the electrical mask, however, the conductance increase was lower at higher initial conductance, while the decay was faster. These distinct characteristics of the optical and electrical masks can help represent various distinguishable reservoir states, thereby enhancing the dimensionality of the in-sensor RC system with minimal correlation.
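How different potentiation and decay behavior leads to different reservoir states can be sketched with a toy leaky-integrator model. It is not the device model of this work: the update rule, the function name `reservoir_state`, and the `gain`, `decay`, and `g0` values of the two hypothetical mask sets are assumptions chosen only for illustration.

```python
from itertools import product

def reservoir_state(bits, gain, decay, g0=0.0):
    """Final normalized conductance after a bit stream (toy model).

    Each time step first relaxes the state (g *= decay); a '1' bit then
    potentiates it toward saturation (g += gain * (1 - g)).
    """
    g = g0
    for b in bits:
        g *= decay
        if b == "1":
            g += gain * (1.0 - g)
    return g

patterns = ["".join(p) for p in product("01", repeat=4)]
# Two hypothetical mask sets with different gain, decay, and initial conductance.
set_a = {p: reservoir_state(p, gain=0.5, decay=0.7) for p in patterns}
set_b = {p: reservoir_state(p, gain=0.3, decay=0.6, g0=0.2) for p in patterns}
print(len(set(round(v, 6) for v in set_a.values())))  # 16
```

The same input pattern ends in a different state under `set_a` and `set_b`, so concatenating the outputs of several mask sets gives the readout more features per input than a single set.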
Fig. 2e depicts the heatmaps showing the normalized conductance value after applying different optical inputs, representing the operations of the optical reservoir. The input stream consisted of four optical pulses, corresponding to the input bits ranging from ‘0000’ to ‘1111’ for the 4-bit in-sensor reservoir operations.42 The current at each bit was measured 0.1 s after the optical pulses at a reading voltage of −3 V. The detailed measurement setup for the in-sensor reservoir operations is explained in Fig. S7 (ESI†) and the experimental section. The initial conductance used for each optical mask is summarized in Table S1 (ESI†). Notably, all 16 reservoir states were distinguishable, demonstrating the successful experimental implementation of in-sensor RC. It is essential to highlight that even when the same input stream was applied, the conductance varied depending on the wavelength (optical mask) and initial conductance (electrical mask). This finding implies that different optical and electrical masks can represent a wide range of distinguishable reservoir states, enhancing the dimensionality of the in-sensor RC system. In this work, 9 mask sets were employed, comprising combinations of 3 optical masks and 3 electrical masks. As depicted in Fig. 2e, each mask set was assigned a number from 1 to 9. Considering that the current passing through the ZnO optoelectronic memristor ranges from 1 to 400 pA and the magnitude of the reading voltage is 3 V, the power consumption is within the range of 0.003 to 1.2 nW. This power is lower than the values reported in previous studies, which is attributed to the low operation current of the ZnO optoelectronic memristor.25–30
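The quoted power range follows directly from P = |I|·|V|. A minimal check, in which the helper name `power_nw` is hypothetical:

```python
def power_nw(current_pa, voltage_v):
    """Static read power in nW from a current in pA and a voltage in V."""
    return abs(current_pa) * 1e-12 * abs(voltage_v) * 1e9

print(power_nw(1, 3))    # about 0.003 nW
print(power_nw(400, 3))  # about 1.2 nW
```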
As previously mentioned, the distinct characteristics of the optical and electrical masks play a crucial role in representing various distinguishable reservoir states. Fig. 3 shows the conductance of the ZnO memristor at each time step (t1–t4) when applying 4-bit inputs, demonstrating the significance of adopting two different types of masks (optical and electrical). The results in Fig. 3 include the 5 repeated measurements on a total of 16 devices for each input pattern and the 4 mask sets, revealing the reproducibility of the results. As shown in Fig. 3a, the input patterns ‘1001’, ‘1110’, and ‘0101’ were not distinguishable when mask set 1 (white light and low initial conductance state) was used. However, Fig. 3b shows that the ‘0101’ pattern became distinguishable when mask set 3 (white light and high initial conductance state) was applied, showing the impact of the electrical mask. Specifically, the conductance difference between ‘1110’ and ‘0101’ significantly increased from 3% to 20% when changing mask set 1 to mask set 3. In this mask set, the ZnO optoelectronic memristor experienced weaker potentiation and stronger relaxation than in mask set 1.
On the other hand, when employing mask set 7 (red light and low initial conductance state), the ‘1110’ pattern became distinguishable, as shown in Fig. 3c, highlighting the effect of the optical mask. In this mask set, the device underwent weaker potentiation due to the longer wavelength, leading to a weaker relaxation effect. As a result of the reduced relaxation, mask set 7 enhanced the capacity to detect ‘1’ in the input pattern, leading to a significant increase in the conductance of the ‘1110’ pattern. As shown in Fig. 3d, all three patterns became distinguishable by applying mask set 9 (red light and high initial conductance state), showing that combining both masks is essential. Mask set 9 minimized the potentiation effect while providing substantial relaxation, enabling clear differentiation among the three patterns. It should be noted that the effectiveness of a mask set in distinguishing patterns depends on the patterns themselves. Therefore, it is essential to create diverse reservoir states by utilizing different mask sets. Such distinguishable reservoir states enhance the dimensionality and significantly improve the performance of the in-sensor RC system, which will be further discussed in the following sections.
(1) |
As shown in Fig. 5b, the classification accuracy was low when a single mask set was utilized due to the limited dimensionality. However, when two mask sets were employed, the classification accuracy improved to over 91%. It should be noted that the classification accuracy was inversely related to the correlation coefficient between the two mask sets, as demonstrated in Fig. 5c. For a detailed analysis, Fig. 5d shows the Pearson correlation matrix, which gives the correlation coefficient between the reservoir states generated by two mask sets, together with the confusion matrix, which displays the classification accuracy. These results suggest that diverse dynamics in reservoir operations are necessary to achieve high performance, which is why this study adopted two different types of masks (optical and electrical). Fig. 5e shows the classification accuracies of four cases: when only mask set 1 was used as the baseline, when mask sets 1 and 3 were used to observe the effect of the electrical mask, when mask sets 1 and 7 were used to observe the effect of the optical mask, and when mask sets 1, 3 and 7 were used to examine the combined effects of the electrical and optical masks. The classification accuracy could be further enhanced when both electrical and optical masks were employed, thanks to the increased dimensionality. Notably, the accuracy was higher when combining the electrical and optical masks than when using three electrical or three optical masks separately (Fig. S9, ESI†). This is because electrical and optical masks can represent various distinguishable reservoir states, thus enhancing the dimensionality of the in-sensor RC system with minimal correlation. Fig. 5f shows the classification accuracy as a function of the number of training epochs, comparing the use of a single mask set with the utilization of all 9 mask sets, which are the combinations of 3 optical and 3 electrical masks. Furthermore, Fig. 5g presents the confusion matrix for each case.
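The correlation analysis between mask sets can be illustrated with a short Pearson-coefficient sketch. The state vectors below are made-up numbers, not data from Fig. 5, and the function name `mask_set_correlation` is hypothetical.

```python
import numpy as np

def mask_set_correlation(states_a, states_b):
    """Pearson correlation coefficient between two reservoir-state vectors."""
    return float(np.corrcoef(states_a, states_b)[0, 1])

# Hypothetical normalized states for the same five inputs under three mask sets.
a = np.array([0.10, 0.35, 0.50, 0.70, 0.90])
b = np.array([0.15, 0.30, 0.55, 0.65, 0.95])   # similar dynamics -> high r
c = np.array([0.60, 0.20, 0.80, 0.30, 0.50])   # different dynamics -> lower r
print(mask_set_correlation(a, b) > mask_set_correlation(a, c))  # True
```

A pair such as `a` and `c`, whose states are weakly correlated, adds more new information to the readout than a highly correlated pair, which is consistent with the trend reported for Fig. 5c.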
The successful classification was achieved with an accuracy of 94.1% when all 9 mask sets were employed, which is the highest accuracy among previously reported in-sensor RC works.27,29 Notably, this outstanding performance in MNIST recognition was accomplished using a readout network that consisted of a single layer without any hidden layers. To further validate these outcomes, the impact of device variations was tested. Cycle-to-cycle (C2C) and device-to-device (D2D) variations were included in the recognition process. Based on normal distributions with an average (μ) of 1 and various standard deviations (σ), random scale factors with errors were incorporated into the training and inference of MNIST images. C2C variation was introduced to induce variations in device responsiveness across different samples, whereas D2D variation was employed to generate variations in device responsiveness within a specific sample. Fig. S10 (ESI†) shows the RC performance based on the various σ/μ ratios of 0–0.125, including the experimental C2C and D2D variation values obtained from the DC cycling test (Fig. 2a) and the device variation test (Fig. 2b). This result demonstrated the robustness of classification accuracy even with C2C or D2D variations.
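The variation test can be sketched by scaling reservoir states with random factors drawn from a normal distribution with μ = 1. The text does not detail how the factors are shared between cycles and devices, so this sketch simply draws one independent factor per state; the function name `add_variation`, the array sizes, and the seed are assumptions.

```python
import numpy as np

def add_variation(states, sigma, rng):
    """Scale each reservoir state by a factor drawn from N(mu=1, sigma)."""
    factors = rng.normal(loc=1.0, scale=sigma, size=np.shape(states))
    return np.asarray(states) * factors

rng = np.random.default_rng(0)            # fixed seed for reproducibility
states = np.full((4, 5), 0.5)             # toy block of reservoir states
noisy = add_variation(states, sigma=0.05, rng=rng)
clean = add_variation(states, sigma=0.0, rng=rng)
print(np.allclose(clean, states))         # True
```

Sweeping `sigma` over the σ/μ range of interest and retraining the readout on the perturbed states would reproduce the type of robustness curve described for Fig. S10.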
Employing multiple mask sets in parallel proves efficient for creating a high-dimensional reservoir state. This approach significantly enhances system performance but increases the number of optoelectronic memristors. Therefore, the power consumption of reservoirs will increase when there are too many optoelectronic memristors, while the speed remains constant due to their parallel operation. However, it should be noted that the power consumption of the entire system can be reduced by simplifying the readout network. Remarkably, in this work, high accuracies were achieved even with a single-layer readout network, which can significantly decrease the power consumption of the entire system by reducing training synapses and training costs. Besides, a ‘mask blending’ technique can be adopted to enhance the performance without increasing the number of optoelectronic memristors. Blending multiple masks to specific regions can yield better results since frequently appearing patterns differ from region to region. When red, blue, and white masks were applied to the central, border, and edge sections of the 20 × 20 image, an improved accuracy of 90.35% was achieved while reducing the number of optoelectronic memristors by approximately 50% compared to the case using a single mask set based on 196 memristors (Fig. S11, ESI†). Therefore, achieving high accuracy while reducing the number of optoelectronic memristors is possible, reducing power consumption.
As shown in Fig. 6b, the recognition accuracy increased as the number of mask sets increased, thanks to the higher dimensionality. Fig. 6c depicts the recognition accuracy as a function of the number of training epochs, comparing the use of only one mask set versus all 9 mask sets, which are the combinations of 3 optical and 3 electrical masks. Furthermore, Fig. 6d shows the confusion matrix for each case. The recognition accuracies for all mask set combinations are presented in Fig. S13 (ESI†). When all 9 mask sets were used, successful recognition was achieved with an accuracy of 99.4% while using a single-layer readout network with 189 000 training synapses. It should be emphasized that this result is higher than the previously reported value (97.14%),28 even though that work employed a multi-layer readout network with 211 000 training synapses. In this in-sensor RC system, the mask set combination can be flexibly adjusted, allowing for the selective utilization of mask sets to achieve additional savings in training synapses and training costs. For example, when employing three mask sets, an accuracy of 98.44% was achieved while reducing the number of training synapses to 63 000. This significant improvement was attributed to the high dimensionality of our in-sensor RC system, which incorporates optical and electrical masks. The impact of device variation on system performance was tested to validate the results further. C2C and D2D variations were incorporated into the human action recognition process, similar to the variation test in MNIST recognition. The results of the variation test shown in Fig. S14 (ESI†) demonstrated the robustness of classification accuracy in the presence of C2C or D2D variations.
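The two synapse counts quoted above are consistent with a readout whose size scales linearly with the number of mask sets. The short check below assumes that each mask set contributes an equal share of the readout synapses, which the text implies but does not state; the names are hypothetical.

```python
TOTAL_SYNAPSES_9_SETS = 189_000          # value reported in the text
per_set = TOTAL_SYNAPSES_9_SETS // 9     # assumes an equal share per mask set

def training_synapses(n_mask_sets):
    """Readout synapses when n_mask_sets are used (assumed linear scaling)."""
    return per_set * n_mask_sets

print(training_synapses(3))  # 63000
```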
$z = W^{\mathrm{T}} \cdot x$ (2)
(3) |
(4) |
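The single-layer readout of eqn (2) can be sketched as one matrix product between the transposed weight matrix and the reservoir state vector. The sizes, the random weights, and the function name `readout` are illustrative assumptions; the forms of eqn (1), (3) and (4) are not recoverable from this text and are not reproduced here.

```python
import numpy as np

def readout(W, x):
    """Single-layer readout: z = W^T x, one output per class."""
    return W.T @ x

rng = np.random.default_rng(0)
n_features, n_classes = 6, 3              # toy sizes, not those of the paper
W = rng.normal(size=(n_features, n_classes))
x = rng.random(n_features)                # stand-in reservoir state vector
z = readout(W, x)
print(z.shape, int(np.argmax(z)))
```

Because there are no hidden layers, the number of trainable synapses is simply the number of entries in `W`, which is what keeps the training cost low.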
Footnotes
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d3mh01584j
‡ These authors contributed equally to this work.
This journal is © The Royal Society of Chemistry 2024