Yufei Shi†a, Ngoc Thanh Duong†ab and Kah-Wee Ang*a
aDepartment of Electrical & Computer Engineering, National University of Singapore, 4 Engineering Drive 3, Singapore 117583, Singapore. E-mail: eleakw@nus.edu.sg
bDepartment of Energy Science, Sungkyunkwan University, Suwon 16419, Republic of Korea
First published on 12th November 2024
The advent of the novel in-sensor/near-sensor computing paradigm greatly reduces the need for frequent data transfer between sensory terminals and processing units by integrating sensing and computing functions into a single device. This approach surpasses the traditional configuration of separate sensing and processing units, thereby greatly reducing system complexity. Two-dimensional materials (2DMs) show immense promise for implementing in-sensor computing systems owing to their exceptional material properties and the flexibility they offer in designing innovative device architectures with heterostructures. This review highlights recent progress in 2DM-based in-sensor computing research, summarizing the unique physical mechanisms that can be leveraged in 2DM-based devices to achieve sensory responses and the essential biomimetic synaptic characteristics for computing functions. Additionally, the potential applications of 2DM-based in-sensor computing systems are discussed and categorized. This review concludes with a perspective on future development directions for 2DM-based in-sensor computing.
Recent advances in sensing technology for optoelectronic sensory devices, including phototransistors, photonic memories, photovoltaic devices, gas sensors, and piezoelectric tactile sensors, have become a major focus in in-sensor device research. State-of-the-art devices based on 2D nanostructured materials for various applications are summarized in Fig. 1. In-sensor computing, which integrates computational capabilities directly within the sensor, enables local data processing rather than relying on an external processor. This approach can significantly enhance data processing efficiency and speed, reduce latency, and lower power consumption.
Fig. 1 Overall graphic review of progress and perspectives for in-sensor computing, including four different mainframes: retinomorphic hardware, circuit integration & materials, light detection and ranging (LiDAR), and multimodal processing. Reproduced with permission.4 Copyright 2024, Springer Nature Limited. Reproduced under the terms of Creative Commons CC BY License.5 Copyright 2023, The Authors. Advanced Science published by Wiley-VCH GmbH. Reproduced with permission.6 Copyright 2018, WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim. Reproduced with permission.7 Copyright 2021, American Chemical Society. Reproduced under the terms of Creative Commons CC BY License.8 Copyright 2021, The Author(s), published by Springer Nature. Reproduced under the terms of Creative Commons CC BY License.9 Copyright 2021, The Authors, published by Frontiers. Reproduced with permission.10 Copyright 2023, Wiley-VCH GmbH. Reproduced with permission.11 Copyright 2021, The Author(s), under exclusive license to Springer Nature Limited.
Significant focus is directed toward utilizing two-dimensional (2D) semiconductor materials to develop sensor circuits that are both more efficient and compact. Among these materials are two-dimensional WSe2, PtSe2, MoTe2/PdSe2 heterostructures, and electrostatically doped silicon, each capitalizing on their inherent ambipolar transport properties.21–23 Additionally, integrating ferroelectric materials, known for their non-volatility and low power consumption, is paramount for advancing in-sensor computing technologies. The literature explores the ferroelectricity and photoconductivity of two-dimensional materials, such as α-In2Se3 and SnS2, to develop innovative retinomorphic sensors.5,24–26 These ferro-photonic AI chips leverage switchable polarization dipoles, enabling electrical programming while optical stimulation is applied to the channel to modulate the polarization state, thereby facilitating multifunctional operation.
The fundamental mechanism of a typical LiDAR sensor involves transmitting laser signals and analyzing the reflected signals to determine the distance to an obstacle.27 The optical signal can be transmitted in one of two forms: as pulses or as frequency-modulated continuous waves (FMCW). The study of flying animals, such as bats, which use natural sonar, informs the development of LiDAR (light detection and ranging) systems for enhanced spatial awareness and navigation.9 In a manner reminiscent of bat echolocation, LiDAR is used to penetrate dense forest canopies and map the underlying terrain, uncovering archaeological sites hidden beneath dense vegetation. This application is crucial for forestry management, environmental monitoring, and habitat characterization. Additionally, some motion detection models are inspired by flying insects, such as Drosophila (fruit flies) and locusts, which use lobula giant movement detector (LGMD) neurons in their visual systems to detect motion within milliseconds, serving as a model for designing motion detection algorithms in LiDAR systems.8,28
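The two ranging schemes mentioned above can be sketched with the standard textbook relations. This is a minimal illustration, not the method of any cited work: the function names, the assumption of a stationary target, and the linear chirp are choices made here for clarity.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def pulsed_range(round_trip_time_s: float) -> float:
    """Pulsed time-of-flight: the pulse travels out and back, so d = c * t / 2."""
    return C * round_trip_time_s / 2.0


def fmcw_range(beat_freq_hz: float, chirp_bandwidth_hz: float,
               chirp_duration_s: float) -> float:
    """Linear FMCW chirp, stationary target (assumption: no Doppler shift).

    The chirp slope is S = B / T, the round-trip delay is tau = f_beat / S,
    and the distance is d = c * tau / 2.
    """
    slope = chirp_bandwidth_hz / chirp_duration_s
    delay = beat_freq_hz / slope
    return C * delay / 2.0


# A 1 microsecond round trip corresponds to roughly 150 m.
d_pulse = pulsed_range(1e-6)
# A 1 GHz chirp swept over 10 microseconds with a 100 MHz beat gives the same delay.
d_fmcw = fmcw_range(beat_freq_hz=1e8, chirp_bandwidth_hz=1e9, chirp_duration_s=1e-5)
```

Both calls describe the same 1 µs round-trip delay, so the two estimates should agree to within floating-point error.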
Multimodal processing in in-sensor computing refers to the integration and simultaneous processing of multiple types of sensory data within a single sensor system. This approach enables the sensor to capture and analyze different forms of information, such as visual, auditory, and tactile. For instance, a multimodal in-sensor computing system might combine image data from a camera sensor with depth information from a LiDAR sensor and sound data from a microphone. Sensors that can process tactile and taste information expand the range of detectable sensory inputs, provide comprehensive analysis in real-time, reduce data transfer to external processors, and enhance overall system efficiency. Integrating these data types allows the system to understand the environment more accurately. This capability is advantageous in applications such as autonomous vehicles, where combining visual and depth information can improve object detection and navigation, or in smart devices that need to interact with users more intuitively and responsively.
Fig. 2 Block diagrams depict the comparison between conventional imaging sensor architecture and in-sensor computing technology. Reproduced under the terms of Creative Commons CC BY License.29 Copyright 2017, The Author(s), published by Springer Nature. Reproduced under the terms of Creative Commons CC BY License.30 Copyright 2018, The Author(s), published by Springer Nature. Reproduced with permission.31 Copyright 2021 John Wiley & Sons, Ltd.
The essence of in-sensor computing lies in embedding certain levels of processing directly within the image sensor, thereby enabling early-stage processing and computing at the point of capture or at the edge where data is generated. The image sensor, which consists of an array of photodetectors, performs initial computations on the raw image data as it is captured. These computations can be categorized into low-level and high-level processing tasks. Low-level processing involves fundamental image processing operations, such as edge detection, noise reduction, and filtering, which are computationally intensive and require significant data movement in conventional systems. By integrating these functions into the sensor, the amount of data that needs to be transferred to subsequent hardware is significantly reduced. For instance, edge detection can be performed within the sensor array, producing a processed output that highlights contours and edges within the scene, as illustrated by the progression from the raw image to the processed edge map in Fig. 2. This approach lowers the required data bandwidth and shortens processing time.
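As a software analogue of the low-level edge detection described above, the following minimal sketch applies a 3 × 3 Laplacian kernel to a synthetic scene. The kernel choice, the helper name `convolve2d_valid`, and the 6 × 6 test scene are assumptions made for illustration. An in-sensor implementation would perform the weighted sum in the analogue domain rather than in NumPy.

```python
import numpy as np


def convolve2d_valid(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide the kernel over the image ('valid' region only) and sum the products.

    The Laplacian kernel used below is symmetric, so cross-correlation and
    convolution give the same result.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


# Standard 4-neighbour Laplacian: responds only where intensity changes.
LAPLACIAN = np.array([[0.0, 1.0, 0.0],
                      [1.0, -4.0, 1.0],
                      [0.0, 1.0, 0.0]])

# Synthetic scene: dark left half, bright right half (one vertical edge).
scene = np.zeros((6, 6))
scene[:, 3:] = 1.0

edges = convolve2d_valid(scene, LAPLACIAN)
print(edges)
```

Every row of `edges` is zero in the uniform regions and non-zero only in the two columns that straddle the intensity step, which is the feature map a downstream stage would receive instead of the raw frame.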
Building on low-level processing, high-level processing tasks, including sophisticated functions like pattern recognition and motion detection, are executed. High-level processing aims to identify and categorize objects within the image, leveraging machine learning algorithms and neural networks. In the context of in-sensor computing, these algorithms can be partially or fully implemented within the sensor hardware, allowing for rapid real-time identification and classification of objects. For example, the processed data from low-level edge detection can be used to extract specific features, such as shapes and textures, which are then used to classify objects into different categories. The final output of these processing stages is a labeled image indicating the presence and type of objects detected within the scene.
In-sensor computing technology leverages advanced fabrication techniques and sensor architectures to embed computational capabilities within the pixel array. This involves using specialized hardware such as analogue processing circuits, digital signal processors, and emerging devices like memristors and photonic processors, the latter of which compute directly with optical signals. These innovations enable the sensor to handle a substantial share of the computation, offloading work from the central processor and reducing overall system power consumption.
Moreover, in-sensor computing supports parallel processing due to the inherent parallelism of the sensor array. Each pixel can operate independently, performing computations concurrently with its neighbors, significantly accelerating the processing speed. This parallelism contrasts sharply with the sequential processing nature of conventional CPUs, making in-sensor computing highly efficient for image processing tasks.
The retina has a laminar organization, comprising ganglion cell, inner nuclear, photoreceptor, and pigmented epithelium layers.32 Light-sensitive photopigments in the disk membranes absorb light, triggering changes in the photoreceptor membrane potential.33 This change causes the synaptic terminal in the photoreceptor to encode the information in a spiking format, which is then transferred to the brain's visual cortex via the optic nerve for processing. The retina contains two types of photoreceptors, distinguished by the morphology of their outer segments. Rod photoreceptors have long, cylindrical outer segments containing numerous disks, while cone photoreceptors have shorter, tapering outer segments with fewer disks. Rods are over 1000 times more sensitive to light than cones due to their increased number of disks and higher photopigment concentration. Each human retina is estimated to contain approximately 5 million cones and 92 million rods.34 This mechanism is analogous to an artificial image sensor, which converts light into digital signals and encodes them into electrical spikes or synaptic excitatory/inhibitory signals for further processing by the CPU.
A similar approach is found in the auditory and vestibular systems. In the auditory system, sound waves, which are variations in air pressure, move the tympanic membrane. This movement passes through the ossicles, the oval window, and the cochlear fluid, ultimately causing a response in the sensory hair cells. The human auditory system can respond to sound wave frequencies ranging from 20 Hz to 20000 Hz.35 Analogous to hair cells in the inner ear, audio sensors capture sound wave information and transmit it to a processor for further analysis.
The chemical sense ability in humans, encompassing gustation (taste) and olfaction (smell), detects environmental chemicals. Smell and taste are directly connected to fundamental internal needs such as thirst, hunger, emotion, sexual behavior, and specific forms of memory.36 The chemically sensitive part of a taste receptor cell is its apical end, located near the tongue's surface. These apical ends have microvilli that extend into the taste pore, exposing the cell to mouth contents. Although taste receptor cells are not neurons by standard criteria, they form synapses with gustatory afferent axons and basal cells, creating a simple information-processing circuit within each taste bud. When a chemical activates a taste receptor cell, its membrane potential changes, typically depolarizing, leading to the opening of voltage-gated calcium channels and transmitter release.32 Sour and salty taste cells release serotonin, while sweet, bitter, and umami cells release adenosine triphosphate (ATP), which excites the postsynaptic sensory axon and causes action potentials that communicate the taste signal to the brain stem. Taste cells may also use other transmitters like acetylcholine, gamma-aminobutyric acid (GABA), and glutamate, though their functions are unclear. Unlike taste receptors, odorants or chemical stimuli in the air dissolve in the mucus layer before reaching olfactory receptor cells. Olfactory transduction involves odorants binding to receptor proteins, stimulating a G-protein, activating adenylyl cyclase, and forming cyclic adenosine monophosphate (cAMP), which opens cation channels, allowing Na+ and Ca2+ influx and subsequently opening Ca2+-activated Cl− channels, leading to membrane depolarization. If the receptor potential is large enough, it triggers action potentials that propagate to the central nervous system. 
The olfactory response can terminate due to odorant diffusion, enzymatic breakdown, or receptor cell adaptation, which reduces the response over time, even in the presence of the stimulus.
Mechanoreceptors, mainly present in somatic sensory systems, respond to physical distortion such as stretch, pressure, or bending. The Pacinian corpuscle is the most extensively studied mechanoreceptor, located deep within the dermis, with dimensions reaching 2 mm in length and nearly 1 mm in diameter. It converts physical stimuli into electrical signals that the nervous system can interpret. The functioning of the human somatic sensory system is comparable to artificial tactile sensors, which predominantly utilize piezoelectric materials such as lead zirconate titanate (PZT),37 barium titanate (BaTiO3),38 and zinc oxide (ZnO) to convert mechanical force into electrical signals.39
Both biological and artificial systems exemplify a modular approach to sensory processing, where distinct types of receptors or sensors are specialized for specific stimuli, ensuring efficient and accurate environmental perception and response. The CPU in artificial systems parallels the integrative and interpretative functions of the human brain, highlighting the interdisciplinary convergence of biology, neuroscience, and engineering in the development of sensory technologies.
In this review, we thoroughly examine recent advancements in in-sensor computing hardware utilizing emerging two-dimensional materials (2DMs). We categorize and consolidate the unique properties and physical mechanisms of 2DMs that enable in-sensor computing. To provide a comprehensive understanding of the in-sensor computing paradigm, we summarize its evolution and outline the essential characteristics required for high-performance hardware. In addition, various applications that could be realized by 2DM-based in-sensor computing hardware are categorized, such as in-sensor reservoir computing, motion detection, and multimodal signal processing. Finally, a summary of current research efforts, key challenges, and potential directions for future exploration in this field is provided.
Secondly, the electronic properties of 2DMs can be precisely tuned through various methods such as defect engineering, chemical doping, and strain application.4,6 This tunability enables the optimization of 2DMs for specific in-sensor computing functions. For example, leveraging the tunable bandgap of 2DMs can lead to broadband vision systems with robust photoresponse to near-infrared (NIR) light, which is crucial for applications like night vision and medical imaging.15,46–50 Reconfigurable, van der Waals infrared photodetectors can perform in situ edge extraction at the detection end, reducing the transmission burden by extracting key features directly within the sensor and enhancing the efficiency of image perception systems.51
Furthermore, the absence of dangling bonds in 2DMs leads to ideal periodic structures and uniform charge distribution, enabling the formation of clean interfaces in 2DM-based devices. By leveraging the weak interlayer interactions within 2DMs, various heterostructures can be constructed by stacking multiple 2DMs.52–55 This approach offers opportunities for designing novel device architectures that realize multifunctionality or precise control of specific functions,56 enabling devices to perform complex computing tasks directly at the sensor interface. Additionally, heterostructures of different 2DMs can exhibit synergistic effects that enhance their overall performance.
The rapid development of 2DM synthesis technology, such as chemical vapor deposition, enables the production of large-scale, high-quality 2DMs suitable for industrial applications.57–60 This compatibility facilitates the seamless incorporation of 2DMs into current manufacturing workflows, accelerating the development and deployment of in-sensor computing technologies. Moreover, their excellent mechanical flexibility and robustness allow integration into various substrates, further manifesting their potential in flexible and wearable electronics. Beyond these shared advantages, different 2DMs also possess unique properties and exhibit exceptional responses to specific external sensory signals or combinations of stimuli, demonstrating significant potential for in-sensor computing.
For instance, the ambipolar open-circuit voltage (Voc) observed in the gate-tunable junction mode of the broken-gap SnSe2/MoTe2 heterostructure exemplifies this potential.68 By harnessing the bidirectional photovoltaic effect of van der Waals (vdW) heterostructures, 2DM-based in-sensor computing systems can outperform conventional CMOS technology in broadband sensing and convolutional processing.22,69 Pi et al. demonstrated a practical broadband convolutional processing (BCP) approach using the gate-tunable linear broadband photovoltaic effect of PdSe2/MoTe2 heterostructures (Fig. 4a).22 Under different gate voltages, the band alignment between n-type PdSe2 and p-type MoTe2 varies, resulting in distinct positive and negative photovoltaic responses. Negative gate voltages induce type-II band alignment, promoting the separation of photogenerated electron–hole pairs through a built-in electric field. In contrast, positive gate voltages create a type-III band alignment, introducing a tunneling barrier that modulates the current flow (Fig. 4b). This configuration presents a linear relationship between photoresponse and both light intensity and gate voltage (Fig. 4c), which is ideal for implementing BCP and for integrating convolution operations directly within the sensor, facilitating efficient and low-power image processing. The device performs efficient sharpening and edge enhancement on images from the Jasper Ridge dataset, which contain information across different wavelength bands, achieving a significant improvement of over 90% in image recognition tasks compared to single band-based convolutional neural networks.
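The idea that a photoresponse linear in both light intensity and gate voltage can implement a convolution can be sketched as follows. This is an idealized illustration, not the device model of Pi et al.: the linear law `R = RESP_PER_VOLT * Vg`, the slope value, and the function names are all hypothetical, and kernel weights are treated directly as responsivities.

```python
import numpy as np

# Hypothetical responsivity slope (A/W per volt of gate bias); not a measured value.
RESP_PER_VOLT = 0.1


def gate_voltages_for_kernel(kernel, resp_per_volt=RESP_PER_VOLT):
    """Gate biases that program each pixel's responsivity to one kernel weight.

    Positive and negative weights map to opposite gate polarities, mirroring
    the positive and negative photovoltaic responses described in the text.
    """
    return np.asarray(kernel, dtype=float) / resp_per_volt


def summed_photocurrent(optical_power, gate_voltages, resp_per_volt=RESP_PER_VOLT):
    """Total current when per-pixel photocurrents I = R(Vg) * P are summed on one line."""
    responsivity = resp_per_volt * np.asarray(gate_voltages, dtype=float)
    return float(np.sum(responsivity * np.asarray(optical_power, dtype=float)))


# A common 3x3 sharpening kernel with signed weights.
sharpen = np.array([[0.0, -1.0, 0.0],
                    [-1.0, 5.0, -1.0],
                    [0.0, -1.0, 0.0]])
vg = gate_voltages_for_kernel(sharpen)

# Optical power pattern on the 3x3 pixel block: a bright centre on a dim background.
patch = np.array([[0.2, 0.2, 0.2],
                  [0.2, 0.8, 0.2],
                  [0.2, 0.2, 0.2]])

current = summed_photocurrent(patch, vg)
```

Under these assumptions the summed current equals the kernel-weighted sum of the light pattern, so one readout per kernel position yields one output pixel of the convolved image.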
Fig. 4 Photovoltaic effect adopted in in-sensor computing systems. (a) Schematic of broadband convolutional processing (BCP), dynamic kernels inside BCP are composed of PdSe2/MoTe2 heterostructure with gate-tunable photoresponsivity. (b) Different electron–hole pair generation patterns under light illumination, arising from different band-alignment caused by opposite gate voltages. (c) Light-tunable and gate-tunable current response. Reproduced with permission.22 Copyright 2022, Springer Nature Limited. (d) Schematic of graphene–Ge heterostructure device. (e) Edge features extraction of low-contrast images by using graphene–Ge heterostructure array. (f) Effective extraction edge profile of dim image using in-sensor dynamic computing. Reproduced with permission.63 Copyright 2024, Springer Nature Limited.
In addition to conventional static optoelectronic convolutional processing, Yang et al. employed the graphene/Ge heterostructure to construct an in-sensor dynamic computing system for robust tracking of dim targets (Fig. 4d).63 By dynamically tuning the photoresponse via adjustable gate voltages, a dynamic kernel is implemented using an optoelectronic device array that contains internal correlations and can respond to spatial light patterns. As shown in Fig. 4e, the array comprises both passive and active components, where passive devices are sensitive to local light intensity, and active devices’ responses are modulated by both local and surrounding light intensities, facilitating the selective amplification of minor light intensity variations. This configuration allows for robust extraction of edge features from low-contrast images, a critical capability in complex visual environments. Effective edge extraction from low-contrast scenes (Fig. 4f) and improved recognition accuracy of human faces across varying contrast levels are achieved with this intelligent machine vision system.
For instance, by exploiting the ferroelectricity in hafnium-zirconium oxide (HZO) and semiconducting two-dimensional p-type black phosphorus (BP), Hao et al. demonstrated a 2D optoelectronic memristive device with superior electrical properties and synaptic functions.76 This strategic implementation leverages the ferroelectric properties of HZO to modulate the energy band structure and electron transport in the BP channels, resulting in enhanced photodetection and memristive behavior. The broadband optical response of black phosphorus and the tunable conductance via HZO regulation underscore the device's potential for highly efficient, multifunctional in-sensor computing applications.
Furthermore, conventional hafnium-zirconium oxide not only possesses nonvolatile ferroelectricity, but with the careful engineering of material composition (i.e., increasing the proportion of Zr in HZO), volatile anti-ferroelectricity can also be achieved. This enhances the capability of oxide-2DM integrated devices for handling temporal information. By exploiting the inherent fading memory (short-term effect) in Hf0.25Zr0.75O2, Shi et al. demonstrated a multi-modal reservoir computing (RC) system for efficient spatiotemporal information processing.77 The dipoles polarize under the control of an external electric field and depolarize spontaneously upon its removal. This dynamic switching process regulates carrier flow inside the MoS2 channel and adjusts the output current, enabling the generation of diverse reservoir states with different combinations of pulse sequences. Besides the electrical modulation, the RC system also harnessed the light sensitivity induced by the 2D MoS2 integration to achieve control of temporal dynamics under optical stimuli. This dual-mode modulation response significantly enhances the reservoir state richness and boosts the overall system performance, resulting in an impressive accuracy of 95.4% in spoken digit recognition tasks.
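The role of fading memory in generating distinct reservoir states can be illustrated with a single leaky node driven by binary pulse trains. This is a deliberately simplified sketch: the linear update rule, the `decay` and `gain` values, and the 4-bit input length are assumptions chosen for clarity and do not model the Hf0.25Zr0.75O2/MoS2 device itself, whose response is nonlinear.

```python
from itertools import product


def reservoir_state(pulses, decay=0.5, gain=1.0):
    """Final state of one fading-memory node after a binary pulse sequence.

    At every time step the stored state relaxes by the factor `decay`
    (standing in for spontaneous depolarization), and an applied pulse
    adds `gain` (standing in for field-driven polarization).
    """
    state = 0.0
    for pulse in pulses:
        state = decay * state + gain * pulse
    return state


# Map every 4-bit pulse sequence to its final reservoir state.
states = {bits: reservoir_state(bits) for bits in product((0, 1), repeat=4)}

for bits, value in sorted(states.items(), key=lambda item: item[1]):
    print(bits, value)
```

Because older pulses are attenuated more strongly than recent ones, each of the 16 input sequences ends in a different state, and a simple trained readout layer can then separate them. A second modulation channel, such as light in the system described above, would add further dimensions to this state space.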
In contrast, the 2D ferroelectrics exhibit a small critical thickness and enhanced immunity to functional fatigue, facilitating ultra-fast polarization flipping, a flat surface, and controllable ferroelectric polarization. Beyond serving as dielectric layers, 2D ferroelectric materials display diverse properties, ranging from insulators (e.g., CuInP2S6) to semiconductors (e.g., α-In2Se3) and metals (e.g., WTe2), significantly expanding their application scenarios.78 Moreover, forming a clean interface with heterogeneous integration between 2D ferroelectrics and other 2D layers allows for active modulation of the optical and electrical properties of devices.
CuInP2S6 (CIPS) is widely utilized as the dielectric layer in fully two-dimensional ferroelectric field-effect transistors due to its robust switchable ferroelectric polarization at room temperature. The ferroelectric properties arise from the displacement of Cu ions within its layered structure, where these ions occupy off-center positions within the octahedral voids formed by the surrounding P and S atoms.86,87 This displacement creates an electric dipole, leading to spontaneous polarization that can be switched by an external electric field. The ferroelectricity of CIPS is preserved even at the monolayer limit and exhibits a relatively high Curie temperature of approximately 340 K.88 Integrating CIPS with a 2D semiconductor channel layer allows modulation of the hysteresis loop in dual-swept channel conductivity via ferroelectric polarization, enabling both sensing and memory functions within the same device. Additionally, the large polarization of CIPS and the strong coupling between its ferroelectric and mechanical properties make it highly suitable for applications in flexible electronics.
Wang et al. reported a 2D heterostructure-based ferroelectric field-effect transistor (FeFET) that uses light-modulated ferroelectric polarization to realize a multi-functional vision system.89 Under illumination, the interfacial charges at the SnS2/CIPS interface are compensated by photogenerated carriers, changing the built-in field and the polarization direction. This modulated polarization in CIPS causes the depletion or accumulation of electrons in the SnS2 channel, switching the device between a high resistance state (HRS) and a low resistance state (LRS) (Fig. 5a). This photoelectric modulation property enables the device to simulate associative learning and light adaptation behaviors. The intensity and duration of the light are crucial in controlling synaptic behavior, with longer light pulses resulting in longer retention times and a transition from short-term plasticity (STP) to long-term plasticity (LTP), as shown in Fig. 5b. Multi-level conductance states can be achieved with varying doses of light irradiation (Fig. 5c). Utilizing both STP and LTP features, a comprehensive reservoir computing system is physically constructed based on the demonstrated FeFET. This system includes a reservoir layer capable of capturing external temporal optical information and a fully connected layer for simultaneous training and processing. This configuration achieves a high accuracy of 93.62% in recognizing MNIST handwritten digits.
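The qualitative link between light-pulse duration and retention can be pictured with a toy decay model. Everything in this sketch is an assumption made for illustration: the exponential form, the rule that the retention time grows linearly with pulse duration, and the constants `tau0_s` and `tau_gain` are not taken from Wang et al. or from any measurement.

```python
import math


def retained_fraction(pulse_s, wait_s, tau0_s=1.0, tau_gain=5.0):
    """Fraction of the light-induced conductance change left `wait_s` after the pulse.

    Toy assumption: exponential decay with a retention time that grows with
    pulse duration, tau = tau0_s * (1 + tau_gain * pulse_s).
    """
    tau = tau0_s * (1.0 + tau_gain * pulse_s)
    return math.exp(-wait_s / tau)


# Compare a short and a long light pulse, both read out 10 s after the pulse ends.
short_pulse = retained_fraction(pulse_s=0.1, wait_s=10.0)
long_pulse = retained_fraction(pulse_s=10.0, wait_s=10.0)
print(short_pulse, long_pulse)
```

In this toy picture the short pulse leaves almost nothing after the waiting period (STP-like behavior), whereas the long pulse retains most of the change (LTP-like behavior), matching the trend described above without claiming its functional form.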
Fig. 5 2D ferroelectrics for in-sensor computing. (a) Change of channel carrier concentration by both electric and light-induced ferroelectric polarization change. (b) The transition of synaptic behavior from STP to LTP by increasing light pulse duration. (c) Multi-level PSC states are obtained by successive optical and electrical spike pulse stimuli. Reproduced under the terms of Creative Commons CC BY License.89 Copyright 2023 The Authors. Advanced Science published by Wiley-VCH GmbH. (d) Hysteresis loop present in the output characteristics of the α-In2Se3 device, resulting from OOP ferroelectric polarization. (e) Energy band diagrams of polarization state modulation on Schottky barrier height. Varying device states are generated with the change of polarization direction. (f) Long-term stable distinct intermediate states are achieved through electrical pulses of varied amplitudes. (g) Influence of ferroelectric polarization on the movement of photo-generated carriers under light illumination. Reproduced with permission.90 Copyright 2024, Wiley-VCH GmbH. (h) Schematic of vdWH FeFET (CIPS/hBN/α-In2Se3). (i) Light-induced dipole polarization change in α-In2Se3 layer. (j) Stable LTP behavior, with optical potentiation and electrical depression. (k) Successful demonstration of Pavlov's experiment implies the classical conditioning obtained from the 2D-ferroelectric-based optoelectronic synaptic device. Reproduced with permission.91 Copyright 2023, American Chemical Society.
Beyond being utilized as a dielectric layer, 2D ferroelectric α-In2Se3 can also function as a channel layer due to its natural semiconducting properties.92,93 Unlike conventional FeFETs, where polarization charges accumulate at the interface between the ferroelectric and the semiconductor, semiconductor FeFETs using α-In2Se3 as the channel layer exhibit charge accumulation at both the top and bottom surfaces. This approach maintains the ferroelectric property while allowing the substitution of the dielectric layer with other high-quality materials, alleviating non-ideal interface issues. The quintuple-layer structure of α-In2Se3, in which the asymmetric positions of the Se atoms break centrosymmetry, readily produces spontaneous in-plane and out-of-plane polarization.94 The covalent bond configuration causes an interlocking effect among dipoles, significantly stabilizing ferroelectric polarization even down to monolayer thickness.95–97 Moreover, the superior photosensitivity and light-induced polarization change of α-In2Se3 further enhance its capability for integrating sensing and processing functions within a single device.
The integration of electrical and optical modulation of α-In2Se3's ferroelectric polarization produces various dynamics, significantly enhancing its information processing capabilities. Leveraging the crystal regulation of α-In2Se3 ferroelectricity, Zeng et al. developed a multisensory synapse that merges sensing, computing, and memory within a single device. This system performs both in-sensor front-end processing and subsequent post-processing.90 The dynamic modulation of retention and plasticity is linked to the adjustment of Schottky barrier height in response to ferroelectric domain motions under both light and electric stimuli (Fig. 5d). The applied electric field shifts the Fermi level of the bottom graphene layer, causing the Schottky barrier height to increase (decrease) when the α-In2Se3 polarization is oriented downward (upward), resulting in the device's HRS (LRS) (Fig. 5e). Moreover, various parameters, such as pulse amplitude and width, alter the overall polarization state and direction, enabling the transition between STP and LTP and producing multiple intermediate conductance states, which is beneficial for synaptic characteristics (Fig. 5f). The electric field-induced change in Schottky barrier height also alters the strength of the built-in field, accelerating or decelerating the separation of photogenerated carriers under light stimuli (Fig. 5g). Consequently, the nonvolatile conductance and photoresponsivity are associated with the dynamics of ferroelectric polarization within the α-In2Se3 layer, enhancing the multifunctionality of neural networks.
In addition, Das et al. demonstrated a ferroelectric field-effect transistor composed of CIPS and α-In2Se3 van der Waals heterostructures (Fig. 5h), capable of sensing, storage, and processing of optical signals within a single device.91 The spontaneous dipole polarization is bidirectionally locked and controlled within the α-In2Se3 channel by gate voltages, enabling robust memory storage. Light-induced polarization within α-In2Se3 alters the dipole order in response to optical signals, facilitating optical sensing and the transition from short-term to long-term memory (Fig. 5i and j). The coupled in-plane (IP) and out-of-plane (OOP) polarizations, modulated by electrical and optical inputs, underpin complex functionalities and enable the device to mimic retinal functions, combining sensing, processing, and memory within a single unit. The system achieves high paired-pulse facilitation, color recognition, and accurate pattern recognition (92.5% with MNIST datasets), and also demonstrates optical logic operations and event learning tasks, such as Pavlov's dog experiment, as shown in Fig. 5k.
Furthermore, the migrating ions do not originate only from metal electrodes. Several studies concluded that 2DMs contain intrinsic defects or vacancies, which can also facilitate the migration of native ions (e.g., sulphur vacancies in MoS2, selenium vacancies in PdSe2). The formation and rupture of conductive filaments give the devices nonvolatile or volatile (for spontaneously ruptured filaments) characteristics, thereby enabling various forms of synaptic plasticity.
Besides forming conductive filaments intrinsically, 2DMs also regulate conductive filaments effectively when integrated with oxide layers. Wang et al. exploited the high hydrophilicity of the 2D MXene layer and integrated it with a photo-active ZnO layer to construct a multi-modal in-sensor computing system that emulates the capability of the human perception system to sense and process visual information in a complex environment.98 Inside the MXene-ZnO memristor, oxygen vacancies migrate and form a conductive filament under the applied voltage; simultaneously, owing to the abundant hydrophilic sites in the MXene nanosheet (Fig. 6a), the device shows high protonic sensitivity, and the electrostatic attraction between protons and oxygen vacancies restricts the growth of the conductive filament (Fig. 6b). Therefore, the electric field and relative humidity (RH) jointly modulate the resistive switching process of the memristor and enable an adaptive behavior sensitive to both visual and RH-sensory information (Fig. 6c and d). By combining these two characteristics, a humidity-adapted visual system is demonstrated based on the MXene-ZnO memristor crossbar array, showcasing noise suppression and filtering during the sensing and preprocessing of visual signals; it also functions as the artificial synapse for the postprocessing of sensed images. With RH modulation, good linearity of weight update is achieved, and the system finally exhibits image recognition with a high accuracy of 82.96%.
Fig. 6 Multimodal regulation on conductive filament formation. (a) Water molecule's adsorption process at the abundant hydrophilic sites located at the surface of MXene-ZnO heterojunction in high relative humidity (RH) environment. (b) Schematic of the proton-mediated resistive switching mechanism. The combination of protons and oxygen vacancies regulates the conductive filament growth. (c) RH modulation on I–V curves of the MXene-ZnO based memristor. (d) Optical modulation on the current response of the memristor in various RH environments. Reproduced with permission.98 Copyright 2021, Wiley-VCH GmbH. Dynamic modulation of photosensitivity by charge trapping process via defect engineering. (e) Schematic of MoS2 phototransistor with ultraviolet/ozone (UVO)-induced defect states. (f) Schematic of the band structure of the MoS2 phototransistor under different gate voltages and density of states (DOS). (g) Current change ratio (CCR) of the device at different VG values, indicating the presence of both excitation and inhibition effects. (h) Scotopic and photopic adaption achieved with efficient photoelectric modulation. Reproduced with permission.4 Copyright 2022, Springer Nature Limited.
Liao et al. demonstrated an in-sensor visual adaption system using MoS2 phototransistors, mimicking the human retina's dynamic modulation of photosensitivity under different light conditions to achieve image contrast enhancement and significantly improve image recognition accuracy.4 The visual adaption behavior is achieved by intentionally inducing defect states in MoS2 phototransistors via ultraviolet-ozone (UVO) treatment (Fig. 6e). Ambipolar trap states are created in the bandgap of UVO-treated MoS2. Under light stimulation, photogenerated carriers are trapped in these defect states, shifting the Fermi level and changing the conductance state (Fig. 6f). Furthermore, an applied back-gate voltage can tune the position of the Fermi level and influence the charge trapping–detrapping process at the defect states (Fig. 6g), producing current excitation or inhibition. With this photoelectric modulation, scotopic and photopic adaption is achieved by applying a negative (positive) back-gate voltage, VG, for a dim (bright) background to obtain excitation (inhibition) behavior (Fig. 6h). An 8 × 8 phototransistor array fabricated using this method exhibited accurate image perception under both dim and bright backgrounds. In addition, by harnessing the conductance change during the charge trapping process, an optoelectronic memory was implemented with the MoS2 phototransistor, enabling optical information to be recorded. Finally, the image recognition task performed using the MoS2 phototransistor array showed a noticeable improvement, with the recognition rate exceeding 80% owing to its superior photopic adaption.
It is evident that data sensing and processing are equally critical within in-sensor computing systems. Due to the diverse requirements of various applications, different processing methods may be employed even for the same type of sensory signals. Consequently, in-sensor computing devices may exhibit various forms of synaptic plasticity, tailored to the specific needs of target applications. Here, we summarize and categorize the most commonly utilized types of plasticity for implementing computing functions within in-sensor computing devices.
Short-term plasticity: to capture temporal information in rapidly changing environments (light, sound, or smell), short-term plasticity (STP) is favorable for real-time signal processing. It refers to temporary changes in synaptic strength that occur on timescales ranging from milliseconds to minutes, and its transient nature enables rapid and reversible adjustments in synaptic efficacy. STP comprises both short-term facilitation and depression, and is commonly characterized by paired-pulse facilitation (PPF), which probes the response of the device to two identical consecutive pulses separated by a small interval τ. Normally the post-synaptic current (PSC) shows a more significant increase when a smaller interval is adopted, and the time-interval-dependent PSC gain is evaluated by the PPF index, which is determined by the following equation:
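A common form of the PPF index, assumed here because the convention differs between reports, is PPF = (A2/A1) × 100%, where A1 and A2 are the PSC amplitudes evoked by the first and second pulses. The sketch below computes this index and evaluates a double-exponential decay that is often used to fit PPF against the pulse interval. The function names, the fitting form, and all numbers are illustrative assumptions, not values from the cited devices.

```python
import math


def ppf_index(a1: float, a2: float) -> float:
    """PPF index in percent, using the assumed convention A2 / A1 * 100."""
    if a1 <= 0:
        raise ValueError("first-pulse amplitude must be positive")
    return 100.0 * a2 / a1


def ppf_double_exponential(interval, c1, tau1, c2, tau2):
    """Hypothetical fitting form: the PPF index relaxes toward 100 %
    with a fast (tau1) and a slow (tau2) time constant."""
    return 100.0 + c1 * math.exp(-interval / tau1) + c2 * math.exp(-interval / tau2)


# Illustrative (not measured) pulse intervals in ms and fit parameters
intervals_ms = [0.1, 0.5, 1.0, 5.0]
curve = [ppf_double_exponential(t, 40.0, 0.3, 20.0, 3.0) for t in intervals_ms]
```

With these assumed parameters the index falls monotonically toward 100% as the interval grows, which is the qualitative trend described above.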
In a retina-inspired photodetector, Duong et al. demonstrated STP in an α-In2Se3-based retinomorphic sensor. Under the co-modulation of electrical and optical stimuli, the device exhibits gate-tunable excitatory and inhibitory functions in photon-induced short-term plasticity.5 The laser-induced paired-pulse facilitation is shown in Fig. 7a, and the extracted PPF ratio varies with the pulse interval (Fig. 7b). The PPF measurement verifies the sub-millisecond fading dynamics of the device, which is vital for the implementation of dynamic kernels for convolutional image processing. In addition, as shown in Fig. 7c and d, Chen et al. demonstrated temporal summation characteristics in MoS2-based phototransistors.103 This STP assists the encoding of light stimuli in the temporal domain and contributes to the construction of bioinspired vision sensors that emulate the graded neurons of the insect visual system.
Fig. 7 Biomimetic synaptic characteristics for data computation. (a) Characterization of short-term plasticity using paired-pulse facilitation. (b) Extracted PPF ratio corresponding to measurement results in (a), as a function of the pulse interval. Reproduced under the terms of Creative Commons CC BY License.5 Copyright 2023, The Authors. Advanced Science published by Wiley-VCH GmbH. Short-term plasticity in MoS2-based phototransistor. (c) Drain current exhibits a dramatic increase under stimulation and gradually decays towards zero once the light is removed, showing temporal summation behavior. (d) Distinguishable device states obtained from short-term potentiation. Reproduced with permission.103 Copyright 2022, Springer Nature Limited. (e) Realization of synaptic conductance modulation with all optically controlled modulation in a graphene/titanium dioxide quantum dot heterostructure. Reproduced with permission.104 Copyright 2023, American Chemical Society. (f) The transition from STP to LTP induced by increasing the duration of light pulses. Reproduced under the terms of Creative Commons CC BY License.89 Copyright 2023 The Authors. Advanced Science published by Wiley-VCH GmbH.
Long-term plasticity (LTP) and plasticity transition: unlike short-term plasticity, which is associated with adaptive response and spatiotemporal information processing, long-term plasticity is related to long-term learning algorithms (e.g. the Hebbian learning rule) and memory capabilities.105 Implementing LTP in a synaptic device requires not only effective modulation of the connection strength directly by external stimuli but also that the resulting modifications are nonvolatile. Experimentally, LTP is usually characterized by examining long-term potentiation and depression behaviors. Linearity and symmetry are two important parameters for evaluating the LTP of a device. For practical LTP, good linearity is expected under sequential identical pulses. Although linearity can be improved by using a pulse sequence with progressively increasing pulse amplitude or prolonged pulse interval, this comes at the expense of introducing complex peripheral circuits into the system.
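To make the linearity criterion concrete, the following sketch uses a saturating-exponential conductance update, a behavioral model often adopted for synaptic devices but assumed here rather than taken from the works cited above. The shape parameter `a` and the deviation metric are hypothetical choices for illustration: a large `a` gives a nearly linear response to identical pulses, while a small `a` gives early saturation.

```python
import math


def potentiation_curve(n_pulses, g_min, g_max, a):
    """Conductance after 0..n_pulses identical pulses (saturating-exponential model).
    `a` is a hypothetical shape parameter: large `a` -> nearly linear update."""
    norm = 1.0 - math.exp(-n_pulses / a)
    return [g_min + (g_max - g_min) * (1.0 - math.exp(-k / a)) / norm
            for k in range(n_pulses + 1)]


def nonlinearity(curve):
    """Largest deviation from the ideal straight line, as a fraction of the full range."""
    n = len(curve) - 1
    g_min, g_max = curve[0], curve[-1]
    worst = max(abs(g - (g_min + (g_max - g_min) * k / n))
                for k, g in enumerate(curve))
    return worst / (g_max - g_min)


# Illustrative conductance window (arbitrary units) and 50 identical pulses
nearly_linear = potentiation_curve(50, 1.0, 10.0, a=500.0)
saturating = potentiation_curve(50, 1.0, 10.0, a=5.0)
```

Comparing `nonlinearity(nearly_linear)` with `nonlinearity(saturating)` shows why a saturating device needs either pulse-scheme compensation or a more linear switching mechanism.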
In-sensor computing devices extend beyond simple modulation by electrical stimuli; they can achieve long-term potentiation and depression through the synergistic regulation of diverse external signals. As shown in Fig. 5j, co-modulation by electrical and optical stimuli is a widely adopted method for synaptic conductance modulation in optical synapses, comprising an optically controlled LTP process and an electrically controlled LTD process. However, this mixed-input control of long-term plasticity usually relies on complex external circuitry to generate the necessary pulse schemes, which adds considerable latency to the overall processing time. An alternative approach, which shows greater promise, employs all-optical regulation for synaptic conductance modulation.104,106 As depicted in Fig. 7e, this method allows bidirectional synaptic conductance modulation using solely optical stimuli, thereby reducing system complexity and shortening processing time.
Additionally, synaptic responses in the form of STP and LTP are not discrete. A gradual transition from STP to LTP can be achieved through repeated stimulation, mimicking the memory consolidation process in the human brain where information transfers from the hippocampus to the cerebral cortex. Wang et al. demonstrated a CIPS-based optoelectronic FeFET in which the transition from STP to LTP is achieved by adjusting the pulse count or duration (Fig. 7f).89 This mechanism enables the device to display both volatile and nonvolatile characteristics, facilitating seamless transitions between these states and broadening the applicability of in-sensor computing devices.
Among various approaches, memristor-based RC systems have gained the most attention owing to their simple fabrication and integration. However, the relatively fixed nonlinear transformation function of memristors makes it difficult to adjust timescales, thus limiting their performance.107,114,115 To address this, a multimode and multiscale reservoir computing system was developed using 2D α-In2Se3-based memristors with a structure of Au/Pd/α-In2Se3/Pd/Au (Fig. 8a).116 Leveraging the in-plane ferroelectricity and photoresponsivity of α-In2Se3, the memristor exhibits tightly coupled electrical and optical synaptic plasticity, allowing for tunable timescales modulated by this hetero-synaptic plasticity. Electric-field-driven ferroelectric switching results in noticeable potentiation and depression behaviors under electrical pulse train stimuli, with higher stimulus frequencies leading to more pronounced potentiation or depression.
Fig. 8 In-sensor reservoir computing systems. (a) Schematic of α-In2Se3-based optoelectronic heterosynapse with additional light/back-gate modulating terminal. (b) Tunable excitatory post-synaptic current (PSC) induced by a train of optical pulses. (c) Light modulation on electrical pulse induced inhibitory PSC, demonstrating tunable relaxation time and synaptic behavior of the device. (d) Demonstration of mixed-input reservoir computing based on multi-sensory fusion of the α-In2Se3-based optoelectronic device for feature extraction. (e) Recognition of images composed of mixed types of inputs (both tactile and visual inputs). (f) Multi-mode digit recognition accuracy at different working modes. Reproduced with permission.116 Copyright 2022, Springer Nature Limited. (g) Schematic of MoS2/h-BN/Te heterostructure-based optoelectronic FG memory unit. (h) Various volatile optoelectronic memory behaviors are obtained with optical pulse modulation, showing rich fading memory dynamics. (i) A sample from the N-MNIST dataset. The event stream of the N-MNIST dataset possesses a higher spatial resolution (34 × 34 pixels) than the MNIST dataset. (j) Feature encoding by the multimodal reservoir using linear discriminant analysis (LDA), exhibiting effective nonlinear transformation and dimensionality reduction. Reproduced with permission.117 Copyright 2023, Wiley-VCH GmbH.
Similarly, optical pulses induce an accumulation effect modulated by light intensity and frequency due to photo-generated carriers (Fig. 8b). These two modulation types interact, generating diverse current levels (reservoir states). For electrical stimulation, the relaxation time of the inhibitory post-synaptic current (PSC) decreases under constant light, and similar modulation can be achieved with light under negative back-gate voltage conditions (Fig. 8c). This tunable fading memory behavior and the coexistence of optical and electrical modulation enable a mixed-input RC system that demonstrates multi-feature extraction and multi-sensory fusion with one physical reservoir (Fig. 8d). Unlike conventional RC systems, which require multiple reservoirs to process various inputs individually before combining them for classification, this α-In2Se3-based multimode RC system dramatically reduces hardware costs and energy consumption. The system's advantage is demonstrated with a multimode hand-written digit recognition task, achieving 86.1% accuracy, higher than that of an RC system that can handle only one type of input (Fig. 8e and f). Moreover, the prediction of multiple superimposed oscillator (MSO) time-series data containing various frequency information is successfully demonstrated with a low normalized root-mean-square error of 0.105, further proving the capability of the α-In2Se3-based in-sensor RC system to handle sensory information.
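The role of fading memory in a physical reservoir can be illustrated with a toy model, given here as a minimal sketch rather than a description of the α-In2Se3 device. A single node relaxes by a factor `decay` at each time step and responds to a pulse with a saturating increment `gain`; both parameters and the nearest-state readout are hypothetical simplifications (a trained linear layer would normally serve as the readout).

```python
from itertools import product


def reservoir_state(pulses, decay=0.6, gain=0.5):
    """Final state of one fading-memory node driven by a binary pulse train.
    decay: fraction of the state retained per step (assumed tunable, e.g. by light or gate bias).
    gain: strength of the saturating response to a pulse."""
    s = 0.0
    for u in pulses:
        s = decay * s               # spontaneous relaxation (fading memory)
        s += gain * u * (1.0 - s)   # nonlinear, saturating potentiation
    return s


# Each of the 16 possible 4-bit pulse trains maps to its own reservoir state
states = {bits: reservoir_state(bits) for bits in product((0, 1), repeat=4)}


def readout(state):
    """Toy readout: return the pulse train whose stored state is closest."""
    return min(states, key=lambda bits: abs(states[bits] - state))
```

In this sketch an early pulse contributes less to the final state than a recent one, and raising `decay` lengthens the memory window, which mirrors the tunable relaxation time discussed above.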
Despite the satisfactory performance of the 2DM memristor-based in-sensor RC system in sensory data processing, challenges remain. Typically, the conductance states are used as reservoir states in memristor-based RC systems, necessitating a read pulse after each programming pulse to read out the conductance state, which limits processing speed and increases energy consumption.116,118 In addition, addressing single device nodes within crossbar arrays while minimizing the sneak current poses another practical challenge for memristor-based RC systems. Sneak paths allow unintended currents to flow through memristors in the array that are not being accessed, adding leakage current, corrupting read and write operations, and increasing the overall energy consumption. Consequently, transistor-based RC systems have gained more attention owing to the better control offered by an additional gate terminal and the absence of sneak path currents.
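The effect of a sneak path on a read operation can be estimated with a simple circuit sketch. The example assumes an idealized 2 × 2 passive crossbar with no wire resistance, in which cell (0, 0) is read while the unselected row and column are left floating; the resistance values are illustrative, not measured. Under these assumptions the sneak path consists of the three unselected cells in series, in parallel with the target cell.

```python
def parallel(r1, r2):
    """Equivalent resistance of two resistors in parallel."""
    return r1 * r2 / (r1 + r2)


def apparent_resistance_2x2(r):
    """Resistance seen when reading cell (0, 0) of a 2x2 passive crossbar
    with the unselected row and column floating (idealized, no wire resistance)."""
    target = r[0][0]
    # Sneak path: row 0 -> cell (0, 1) -> column 1 -> cell (1, 1) -> row 1 -> cell (1, 0) -> column 0
    sneak = r[0][1] + r[1][1] + r[1][0]
    return parallel(target, sneak)


HRS, LRS = 1e6, 1e4  # illustrative high/low resistance states in ohms
worst_case = apparent_resistance_2x2([[HRS, LRS], [LRS, LRS]])
```

With these numbers the target cell, although in its HRS, appears as roughly 29 kΩ, which is why a selector or a gate terminal is needed for reliable readout.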
As demonstrated in Fig. 8g, a 2DM heterostructure-based floating gate (FG) memory device leverages gate modulation to control the charge-trapping process, functioning as both memory and an in-sensor physical reservoir.117 The MoS2/hexagonal boron nitride (h-BN)/Te heterostructure is formed by drop-casting solution-synthesized Te nanoflakes and dry-transferring exfoliated h-BN and MoS2 flakes. The 2D Te serves well as a floating gate owing to its high trapped charge density and rapid recombination at shallow trap states. Both types of input stimuli (electrical and optical) rely on the charge-trapping and neutralization process to achieve nonvolatile and volatile behavior. Electrical pulses applied to the control gate modulate electron tunneling from MoS2 to Te, and the charge trapped in the FG changes the device state. Optical stimuli generate photo-excited carriers that can neutralize the electrons stored inside the FG. In addition, the device can be switched between nonvolatile and volatile behavior simply by tuning the stimulus intensity. Superior and stable nonvolatile behavior is obtained under high-intensity stimuli, while weak stimuli provide volatile behavior with rich dynamics (Fig. 8h). The performance of this heterostructure-based RC system is examined using the N-MNIST dataset, which is composed of temporal information and changes in pixel intensity rather than the static images of the MNIST dataset (Fig. 8i). By leveraging the multisensory fusion of the heterostructure-based RC system, effective nonlinear transformation and dimensionality reduction are achieved during the feature extraction process (Fig. 8j), resulting in a 90.77% accuracy, significantly higher than the single-mode RC system and surpassing the software baseline.
In insects, the lobula giant movement detector (LGMD) neurons, located in the lobula of their brains, are sensitive to the expansion of objects in the visual field and play a crucial role in collision avoidance behavior.121,122 Inspired by LGMD neurons, a collision avoidance system for autonomous vehicles is designed (Fig. 9a). However, traditional implementations require very-large-scale-integration (VLSI) systems, which are energy and area inefficient.123–125 To overcome these bottlenecks, a nanoscale collision detector was designed by stacking a monolayer MoS2-based photodetector on top of a non-volatile floating-gate (FG) memory (Fig. 9b).126 This detector effectively performs collision detection while consuming minimal energy (nanojoules) and having a small device footprint (∼1 μm × 5 μm). It successfully mimics the escape response of LGMD neurons by exhibiting an excitatory response to light stimulation (Fig. 9c) and an inhibitory response to electrical stimulation (Fig. 9d). During collision detection, a static light source with changing intensity (i.e., looming stimuli) simulates an approaching car. Owing to the photoexcitation effect of the MoS2 detector, the device exhibits a monotonic current increase when exposed to the light source. The inhibitory behavior is a result of dynamic engineering of the threshold voltage caused by charge trapping between the back gate and the floating gate under a programming voltage. Integrating these two competing mechanisms, the device shows an inflection point before the collision occurs (when the light source reaches maximum intensity), indicating its capability to work as a collision indicator. The MoS2-based collision detector demonstrates high energy and area efficiency as it comprises only a single device without requiring an array or peripheral circuits. 
It also works effectively with both direct incident light from a moving object and reflected passive light from the collision object, demonstrating its versatile functionality (Fig. 9e).
Fig. 9 In-sensor motion-detection systems. (a) Optoelectronic collision detector mimicking the working principles of LGMD neurons. Excitatory, inhibitory, and escape responses are obtained when only a looming optical stimulus, only an electrical stimulus, or both optical and electrical stimuli are applied, respectively. (b) Schematic of a biomimetic collision detector consisting of a MoS2-based photodetector and a programmable FG non-volatile memory stack. (c) Excitatory photocurrent responses in the MoS2 FET at negative constant back-gate biasing. (d) Inhibitory output current obtained with varying electrical programming pulses. (e) Schematic of two different collision detection scenarios emulated using LED light source: (1) direct light from collision object (2) passive light reflected from collision object. Reproduced with permission.126 Copyright 2020, Springer Nature Limited. (f) Schematic of the MoS2 phototransistor-based vision sensor. (g) Gate voltage modulation on the position of defect states. The device exhibits distinct temporal resolution, ranging from 10 ms to 10⁶ ms. (h) Spatiotemporal information encoding using MoS2 phototransistor array, accurately depicting the trajectory contour of the moving object. (i) Perception of motion direction using bioinspired vision sensor with output containing spatiotemporal frames. (j) Successful spatiotemporal information encoding across varying timescales. Reproduced with permission.103 Copyright 2023, Springer Nature Limited.
Moreover, the intrinsic defects in the as-grown MoS2 film can be employed to implement learning rules. By coupling photoexcitation with shallow trap centers in MoS2, an optoelectronic phototransistor was successfully implemented to perceive different types of motion (Fig. 9f).103 Under light stimulation, the trapping of photogenerated carriers at existing defects produces a nonlinear conductance increase, and the trapped charges are released spontaneously upon light removal, emulating the function of graded neurons. Furthermore, effective gate voltage modulation tunes the charge dynamics of shallow trapping centers, enabling various temporal resolutions ranging from 10 to 10⁶ ms (Fig. 9g). This efficient encoding of temporal information fuses spatiotemporal features across varying timescales into compressive images (Fig. 9h), achieving a high recognition accuracy of 99.2% for direction perception (Fig. 9i). This setup allows for the detection of motion at varying speeds and enhanced visual saliency of moving objects (Fig. 9j), significantly outperforming conventional image sensors.
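How a decaying response can fold a trajectory into a single frame is sketched below with a toy model that is an assumption for illustration, not the published device model. Each pixel is taken to respond when the light spot crosses it and then to relax exponentially with a time constant `tau`, which stands in for the gate-tunable trap dynamics; the visit times and the direction rule are hypothetical.

```python
import math


def compressive_frame(visit_times, t_read, tau):
    """Single-frame output of a pixel row.
    visit_times[i]: time at which the light spot crossed pixel i (None if never lit).
    After the spot leaves, each response decays as exp(-(t_read - t_visit) / tau)."""
    return [0.0 if t is None else math.exp(-(t_read - t) / tau)
            for t in visit_times]


def direction(frame):
    """+1 if brightness rises with pixel index (spot moved toward higher index), else -1."""
    slope = sum(b - a for a, b in zip(frame, frame[1:]))
    return 1 if slope > 0 else -1


# Illustrative timing in ms: the spot crosses one pixel every 10 ms
left_to_right = compressive_frame([0, 10, 20, 30, 40], t_read=50, tau=20.0)
right_to_left = compressive_frame([40, 30, 20, 10, 0], t_read=50, tau=20.0)
```

Because recently visited pixels are brighter, the intensity gradient in one frame carries the direction of motion, and choosing a smaller or larger `tau` selects which timescales remain visible.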
A high-performance tactile information processing system composed of seamlessly connected tactile sensors and memristor arrays has been demonstrated based on MXene, showcasing the potential for Morse code recognition and human health state monitoring.130 Requiring no high-temperature processes or complex material growth or transfer processes, the sensor and memristor are integrated using only a single functional MXene layer (Fig. 10a), significantly reducing the transmission delay and energy cost and simplifying the fabrication process. Due to the micropump structure produced by sandpaper templates in PDMS films, under physical pressure the contact area between the MXene sensitive layer and the PDMS films keeps changing and converts the physical pressing signal into action potentials. The generated stimulation is then transferred to the MXene-based memristor array for processing (Fig. 10b). The single functional MXene layer enables efficient sensing and conversion of external tactile signals (Fig. 10c) and shows highly linear and symmetric weight updates in the synaptic array, implying its capability for information encoding and processing. A high accuracy of 97.96% is achieved for human respiration state classification tasks (see Fig. 10d and e).
Fig. 10 Multi-modal in-sensor computing schemes. (a) Schematic of near sensor computing system constructed using MXene-based sensory neuron array and memristor array. (b) Signal transmission processing between the sensor unit and synapse unit. (c) Conversion between tactile signals and postsynaptic current within the sensory neuron. (d) Schematic three-layer fully connected ANN constructed based on MXene-based flexible sensory neuron for human respiratory state classification. (e) Confusion matrix and recognition accuracy for the human respiratory state classification task. Reproduced with permission.130 Copyright 2024, Elsevier Ltd. (f) Schematic of the bioinspired gustatory system consisting of graphene chemitransistor-based sensing unit and monolayer MoS2 memtransistor-based computing unit. (g) Evolution of temporal response to sweet and bitter taste stimulants in hunger and appetite neurons. (h) Schematic of adaptive feeding circuit using MoS2-based memtransistor as building blocks. Reproduced under the terms of Creative Commons CC BY License.36 Copyright 2023, The Author(s), Published by Springer Nature.
Besides sensing and processing physiological behavior, novel bio-inspired 2DM-based in-sensor computing systems have also shown the capability to mimic human psychological behavior. An all-2D gustatory circuit mimicking the physiology and psychology of adaptive feeding behavior in humans has been proposed.36 As demonstrated in Fig. 10h, this system integrates graphene-based chemitransistors as artificial gustatory taste receptors, forming an electronic tongue that allows the device to detect and process taste information, together with monolayer MoS2 memtransistors that form an electronic gustatory cortex, enabling the system to simulate the neural processes involved in feeding behavior (Fig. 10i). The advantages of different 2DMs are fully harnessed for different implementations: owing to its high carrier mobility at room temperature,131 inherently low electrical noise,132 and enhanced sensitivity arising from its exceptionally high surface-to-volume ratio,133 graphene is adopted for the artificial taste receptors. MoS2 is used to emulate the cortical decision function because of its superior semiconducting properties, which are beneficial for signal processing and decision-making. Such an in-sensor computing device is not only capable of co-processing physical and physiological information but also establishes a novel paradigm for emotional-AI systems that can bridge the gap between human and machine intelligence, with significant implications for human health and future AI applications.
In addition to processing tactile and gustatory information, 2DM-based sensors have also exhibited superior capabilities in dealing with various sensory sources (e.g., gas7,134,135 or audio information77,128). Integrating 2DM with intelligent memristors and transistors overcomes the limitation of in-sensor computing devices that rely solely on electrical modulation, significantly expanding their application in perception computing.
Device structure | Functional 2DM | Sensory source | Mechanism | Application | Performance | Ref. |
---|---|---|---|---|---|---|
(ITO)/MXene/ZnO/Al/PDMS memristor | MXene | Relative humidity | Multi-field controlled conductive filament formation | Noise suppression & image recognition | 82.96% | 98 |
Ag/MXene/Au/PI memristor | MXene | Pressure | Conductive filament formation | Human state classification | 97.96% | 130 |
Cr/Au/MoS2/SiO2/Si phototransistor | MoS2 | Light | Light-modulated charge trapping–detrapping | Handwritten digit recognition | 96.90% | 4 |
Pt/Au/WSe2/h-BN/Ti/Au/Si/SiO2 transistor | WSe2/h-BN | Light | Charge trapping & photoresponsivity | Color-mixed pattern recognition | Over 90% | 23 |
Cr/Au/SnS/Si/SiO2 memristor | SnS | Light | Charge trapping & photogating effects | Practical Korean sentence classification | 91% | 118 |
Cr/Au/MoS2/h-BN/Te/SiO2/Si transistor | MoS2/h-BN/Te | Light | Charge trapping & photoresponsivity | Event-based N-MNIST digit-recognition | 90.77% | 117 |
Cr/Au/MoS2/Al2O3/Cr/Au/SiO2/Si phototransistor | MoS2 | Light | Photoexcitation & gate-tunable charge trapping | Moving ball track recognition | 99.2% | 103 |
SnS2/h-BN/CIPS/Au/SiO2/Si FeFET | h-BN/CIPS | Light | Light-modulated ferroelectric polarization | (MNIST) handwritten digit recognition | 93.62% | 89 |
Cr/Au/CIPS/PET | CIPS | Pressure | Ferroelectric-enhanced piezoelectricity | Speech recognition & motion monitoring | 96% | 128 |
Graphite/CIPS/graphite/Si/SiO2 memristor | Graphite/CIPS | Light | Reconfigurable photovoltaic effect | Pattern recognition | Around 95% | 136 |
Ni/α-In2Se3/HfO2/SiO2/Si | α-In2Se3 | Light | Ferroelectric polarization & photoresponsivity | Handwritten digit recognition | 95% | 5 |
Au/graphene/α-In2Se3/graphene/SiO2/Si | α-In2Se3 | Light | Ferroelectricity regulated Schottky barrier height modulation | Pattern recognition | 80%–97% | 90 |
Pd/Au/α-In2Se3/Ti/Au phototransistor | α-In2Se3 | Light | Photoinduced ferroelectric polarization switching | Visual perceptual processing | N.A. | 137 |
Pd/Au/α-In2Se3/Si memristor | α-In2Se3 | Light | Coupled ferroelectric & optoelectronic effects | Multimode handwritten digit recognition | 86.1% | 116 |
Ti/Au/α-In2Se3/h-BN/CIPS/Ti/Au | α-In2Se3/h-BN/CIPS | Light | Electrical and optical modulation of ferroelectric polarization | Optical logic operations & handwritten digit recognition | 92.5% | 91 |
Cr/Au/graphene/MoTe2/PVDF/SiO2/Si photodiode | Graphene/MoTe2 | Light | Ferroelectric-tuned photoresponsivity | Pattern recognition & classification on robot dog movement | N.A. | 138 |
Cr/Au/PdSe2/MoTe2/SiO2/Si photodiode | PdSe2/MoTe2 | Light | Gate-tunable broadband photovoltaic effect | Broadband image classification | Nearly 100% | 22 |
Ti/Au/graphite/graphene/Ge/SiO2/Si | Graphite/graphene | Light | Gate-tunable photovoltaic effect | Human face recognition | Over 90% | 63 |
Ti/Au/WSe2/Al2O3/Ti/Au photodiode | WSe2 | Light | Photovoltaic effect | Image encoding & handwritten digit recognition | N.A. | 13 |
Cr/Au/h-BN/b-AsP/MoTe2/SiO2/Si | b-AsP/MoTe2 | Light | Photovoltaic & photothermoelectric effects | Handwritten digit recognition | 96% | 139 |
Au/MoS2/HfO2/Ti/Au/PET (SiO2/Si) phototransistor | MoS2 | Light | Gate-tunable photoresponsivity | Hand-written digit recognition | 90.64% | 10 |
Au/h-BN/WSe2/Al2O3/BP/Al2O3/Si transistor | h-BN/WSe2/BP | Light | Gate-tunable photoresponsivity | Detection and recognition of moving trolleys | 100% separation detection | 11 |
Pd/Au/WSe2/h-BN/Al2O3/Ti/Au | WSe2 | Light | Gate-tunable photoresponsivity | Image stylization, edge enhancement, contrast correction | N.A. | 15 |
Cr/Au/h-BN/WSe2/Cr/Pd/Si/SiO2 photodiode | WSe2 | Light | Photoresponsivity | Convolutional image processing | N.A. | 140 |
Au/WSe2/Al2O3/HfO2/Al2O3/Cr/Au/SiO2/Si | WSe2 | Light | Programmable photoresponsivity | Motion recognition | 92% | 141 |
Cr/Au/MoS2/sr-SiNx/Si phototransistor | MoS2 | Light | Photoconductivity | All-day motion detection and recognition | N.A. | 142 |
Au/h-BN/MoS2/graphene/Si/SiO2 transistor | h-BN/MoS2/graphene | Light | Photoelectric-coupling effect | Implementation of reconfigurable logic functions | N.A. | 143 |
Cr/Au/WSe2/Al2O3/Cr/Au phototransistor | WSe2 | Light | Photogating effect associated with trap states | Color-mixed pattern recognition | 95% | 144 |
Ni/Au/MoS2/Al2O3/Pt/TiN | MoS2 | Light | Photoexcitation & charge trapping | Collision detection | N.A. | 126 |
Multiple sensory computing: a fundamental requirement for multimodal sensory computing is the development of advanced sensors capable of capturing various types of sensory data. Although there has been notable progress in developing sensors for individual modalities such as vision, touch, and olfaction, the ability to effectively capture and process multimodal data in real time remains a significant challenge. A key question in multisensory integration is how the brain determines the weighting of each sensory input when forming a combined percept. One prominent theory is the “statistical optimization model” or “Bayesian integration model,” which suggests that the brain weights each sensory cue based on its reliability derived from its prior probability. Integrating data from multiple sensors is difficult owing to differences in temporal and spatial resolution, noise characteristics, and units of measurement. Developing efficient algorithms and hardware architectures to fuse this heterogeneous data into a coherent representation is crucial for enabling meaningful multimodal perception and decision-making.
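As a concrete illustration of reliability-based weighting, the sketch below implements the textbook maximum-likelihood form of cue combination, in which each cue is weighted by its inverse variance. This is a simplification that assumes independent Gaussian noise on each cue and a flat prior; the function name and the numbers are hypothetical and do not come from any of the cited systems.

```python
def fuse_cues(estimates, variances):
    """Reliability-weighted fusion of independent sensory estimates.
    Each weight is proportional to the inverse variance (reliability) of its cue.
    Returns the fused estimate, its variance, and the weights."""
    reliabilities = [1.0 / v for v in variances]
    total = sum(reliabilities)
    weights = [r / total for r in reliabilities]
    fused = sum(w * x for w, x in zip(weights, estimates))
    return fused, 1.0 / total, weights


# Illustrative example: a precise cue (variance 1) and a noisy cue (variance 4)
fused, fused_var, weights = fuse_cues([10.0, 14.0], [1.0, 4.0])
```

Here the precise cue receives a weight of 0.8 and the fused estimate lies closer to it, while the fused variance is smaller than that of either cue alone, which is the benefit multisensory hardware aims to capture.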
Integration scale: large-scale integration of sensing-memory-computing devices into arrays presents a significant hurdle. Current vision chips typically feature a relatively small scale of 1000 to 100000 pixels with a substantial device-to-device variation, impeding large-scale integration and limiting their suitability for practical applications.145 Furthermore, the absence of a compact physical model for emerging vision chip technologies based on 2D semiconductors complicates the prediction of device performance in larger systems.
Material system and device structure trade-offs: developing materials and device structures suitable for multifunctional sensing-memory-computing electronics poses a significant challenge. The ideal material must perform stably across sensing, memory, and computing functions. For example, developing memristors and light-induced ferroelectric materials with considerable photo-response across a wide light range is crucial for artificial visual systems. In-sensor computing also necessitates the development of novel device functions and mechanisms. While direct analog processing offers real-time readout and reduced data conversion, it is vulnerable to noise and complex to design on a large scale, requiring optimization for specific applications. Developing new algorithms will be necessary for non-visual data, as features from auditory and olfactory sensors have lower dimensionality than visual signals.
Ferroelectric materials: ferroelectric materials offer low-power operation and air stability for sensing-memory-computing devices, making them potentially suitable for retina-inspired sensors. For instance, Hf0.5Zr0.5O2 exhibits pronounced ferroelectric properties, displaying robust polarization at nanoscale thicknesses. However, interface states, ambiguous mechanisms, and integration challenges hinder the advancement of ferroelectric materials for in-sensor computing. Further research is essential to elucidate the material properties and enhance polarization characteristics, thereby facilitating the development of high-performance semiconductor devices.
Multimodal processing: current research predominantly focuses on single-sensory processing. Future intelligent devices and systems must be capable of real-time fusion of multiple sensory nodes, such as visual, auditory, olfactory, and tactile. This advancement is crucial for applications in robotics, intelligent vehicles, and wearable electronics. The ultimate objective is to develop intelligent systems that can perceive and interact with the environment in a human-like manner.
The successful development of multimodal sensory computing technologies could revolutionize various fields. In robotics, multimodal sensor systems could enhance robots’ perception capabilities, enabling them to navigate complex environments, interact safely with humans, and perform intricate tasks. In autonomous driving, combining data from cameras, lidar, radar, and other sensors could allow vehicles to perceive their surroundings more accurately and reliably, improving safety and driving capabilities. Wearable electronics for healthcare could also benefit, with multimodal sensors integrated into devices like smartwatches and fitness trackers providing comprehensive health monitoring, personalized feedback, and context-aware experiences.129 Advancing multimodal sensory computing requires a holistic approach inspired by biological sensory systems, such as the human retina, and the vision systems of flying animals and insects. This approach involves co-designing sensor technology, integration techniques, and computing architectures to optimize system performance, power consumption, and cost. Additionally, novel algorithms specifically designed for processing and interpreting multimodal sensory data are essential for enabling sophisticated perception, decision-making, and learning capabilities in these systems.
On-chip computing devices: conventional silicon CMOS electronics face limitations when implementing neural network algorithms due to their digital nature. Emerging neuromorphic computing devices such as two-terminal resistive switching memories are promising candidates for in-memory and in-sensor computing. Metal-oxide–semiconductor field effect transistors (MOSFETs) incorporating ferroelectric materials as dielectric components, known as negative capacitance field-effect transistors (NCFETs), offer potential advantages. Replacing the conventional insulating gate dielectric with a ferroelectric thin film exhibiting a “negative capacitance” effect can mitigate the short-channel effect that constrains integrated circuit performance. Ferroelectric materials are also well-suited for non-volatile memory applications due to their inherent spontaneous and remanent polarization properties.
Ferroelectric materials are commonly used in various sensing-memory-computing devices. For instance, because all ferroelectric materials are also piezoelectric, they can convert between mechanical and electrical energy in piezoelectric devices. Consequently, ferroelectric semiconductor materials can transduce information from both optical and mechanical signals. In photodiodes, ferroelectric materials such as BiFeO3 can be used to create optical devices with multifunctional capabilities. Despite significant advancements in applying ferroelectric materials to piezoelectric devices, research in other domains remains largely at the theoretical or early experimental stage.
Low-temperature integration technology is essential for maintaining existing device functionalities during three-dimensional (3D) integration. Reliable low-temperature bonding and interconnect processes help minimize thermal expansion mismatches between stacked chips. Additionally, reducing the thickness of active devices and passive components is necessary to minimize parasitic time delays. Near-/in-sensor architectures offer advantages in power efficiency, processing speed, and simplified circuitry. However, direct analog processing in these techniques is susceptible to noise, and the design of large-scale analog near-sensor systems is complex, limiting portability and scalability. Balancing noise, power, and design complexity is crucial for specific applications. Although there has been encouraging progress in emulating synapses and neurons using memristive devices, the field is still in its early stages. Experimental work is primarily at the single-device or small array level. Building practical hardware systems based on these prototypes requires significant scientific and engineering effort. Developing neuromorphic systems is highly interdisciplinary, necessitating collaboration among physicists, material scientists, electrical engineers, and computer scientists. This collaboration is essential for understanding device mechanisms, designing material structures to control device behavior, and optimizing algorithms based on device characteristics.146
• Stacked 2D FETs: this method involves stacking multiple layers of 2D field-effect transistors (FETs) to optimize area utilization and achieve scaling benefits.
• Monolithic 3D integration: this technique involves integrating 2D materials directly with silicon-based logic or memory devices on a single chip, supporting the development of complex systems, such as image sensors and memory devices combined with traditional silicon circuits.
Reduced electrical parasitic capacitance: the close proximity of components within a 3D monolithic integrated circuit shortens interconnect lengths, thereby reducing parasitic capacitance. This reduction in parasitic capacitance improves signal integrity and reduces signal delay, resulting in faster and more efficient circuits. Furthermore, the increased interconnect density, reduced parasitic capacitance, and the potential use of low-power materials like 2D semiconductors can significantly improve energy efficiency.148
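The link between interconnect length and signal delay can be illustrated with a first-order estimate. The sketch below uses the Elmore delay of a uniform distributed RC wire, 0.5 × r × c × L², where r and c are the resistance and capacitance per unit length and L is the wire length. The function name, the per-length values, and the two wire lengths are hypothetical placeholders chosen for illustration, not measured data.

```python
def elmore_wire_delay(length_m: float, r_per_m: float, c_per_m: float) -> float:
    """Elmore delay (seconds) of a uniform distributed RC wire: 0.5 * r * c * L**2."""
    if length_m < 0 or r_per_m < 0 or c_per_m < 0:
        raise ValueError("length, resistance, and capacitance must be non-negative")
    return 0.5 * r_per_m * c_per_m * length_m ** 2


# Hypothetical per-unit-length parasitics (assumed, not from the review).
R_PER_M = 2.0e5    # ohm per metre
C_PER_M = 2.0e-10  # farad per metre

# Hypothetical routes: a 1 mm planar wire versus a 100 um vertical connection.
planar_delay = elmore_wire_delay(1.0e-3, R_PER_M, C_PER_M)
stacked_delay = elmore_wire_delay(1.0e-4, R_PER_M, C_PER_M)
delay_ratio = planar_delay / stacked_delay
```

Because both the total resistance and the total capacitance grow with length, the estimated delay scales with the square of the wire length, so shortening a connection tenfold reduces this first-order delay by roughly a factor of one hundred.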
Optimal area utilization and enhanced scaling: 3D monolithic integration enables the vertical stacking of multiple device tiers, optimizing area usage to its fullest potential. This technique supports continued scaling beyond the constraints of conventional 2D planar technologies, facilitating the development of more intricate and potent devices without notably enlarging the footprint.
Integration of diverse functionalities: monolithic 3D integration serves as a platform for incorporating diverse functionalities within a single chip, combining logic, memory, and sensing elements to create highly integrated systems with unique capabilities. This integration supports advanced applications, such as smart cameras with embedded artificial intelligence, intelligent robots for medical use, and edge computing devices with integrated processing and data storage. Consolidating multiple functionalities onto a single chip simplifies system architecture, reduces complexity, and improves overall performance.149
Overall, 3D monolithic integration shows substantial promise for the future of electronics. Nonetheless, significant challenges remain, particularly in the application of 2D materials, which must be addressed for widespread adoption.
Performance and efficiency: the atomic thinness of 2D materials helps mitigate short-channel effects, a key concern in scaled transistors, leading to improved performance and reduced leakage currents. Certain 2D materials, such as transition metal dichalcogenides (TMDs), also retain comparatively high carrier mobilities at atomic-scale thicknesses, where the mobility of similarly thinned silicon degrades sharply, which enables faster switching speeds and enhances overall performance in scaled devices.
Back-end-of-line (BEOL) compatibility: the low-temperature processing requirements of many 2D materials make them compatible with BEOL integration within existing silicon CMOS processes. This compatibility allows 2D-based devices, such as sensors and memory, to be added onto prefabricated silicon circuits without risking damage to the underlying layers.150
Multifunctional chips: the application of 2D materials in 3D integration facilitates the development of multifunctional chips that combine logic, memory, sensing, and additional functions within a single 3D-integrated framework.
Optoelectronics: materials such as graphene and molybdenum disulfide (MoS2) have shown promise in applications like image sensors and photodetectors, highlighting their potential for advanced optoelectronic systems.
Memory devices: 2D materials, including MoS2 and hexagonal boron nitride (h-BN), are promising candidates for non-volatile memory devices such as memristors and memtransistors, which can be seamlessly integrated with logic circuits.
Sensors: the high surface-to-volume ratio and unique properties of 2D materials render them particularly well-suited for sensing applications, including gas and biosensing. Integrating these sensors within 3D systems can enable the development of novel functionalities and applications.
Material synthesis and transfer: the direct growth of 2D materials on substrates at temperatures compatible with back-end-of-line (BEOL) processing (below 450 °C) presents a significant challenge. High temperatures are required for the high-quality growth of many 2D materials, particularly transition-metal dichalcogenides (TMDs), which risk damaging lower layers in a 3D stack. Preserving the structural integrity of large-area 2D films during transfer, while minimizing defects and contamination, is also essential. However, the atomically thin and fragile nature of 2D materials makes them susceptible to damage during transfer, leading to wrinkles, folds, cracks, and contamination, all of which can significantly degrade device performance and reliability.
Device fabrication and performance: achieving low-resistance ohmic contacts in 2D FETs remains a major obstacle, impacting device performance. The atomically thin nature of 2D materials and phenomena such as Fermi-level pinning complicate contact engineering. While progress has been made with metals like bismuth (Bi) and antimony (Sb),151 further research is required to develop reliable, low-resistance contact strategies that are compatible with large-scale fabrication and 3D integration.152
High-κ dielectric integration: integrating high-κ dielectrics with 2D materials without introducing damage is crucial for achieving optimal performance in scaled devices, particularly in gate-all-around (GAA) architectures. The chosen dielectrics must have low leakage currents and high dielectric strength, and the integration process must minimize damage to the 2D material. Identifying suitable dielectric materials and developing damage-free integration processes are key areas of ongoing research.152
Threshold voltage control: precise control over the threshold voltage in both n-type and p-type 2D FETs is essential for designing low-power, high-performance 3D CMOS circuits. Techniques such as substitutional doping and surface charge-transfer doping are being explored. However, achieving uniform doping density and precise control at large scales compatible with CMOS processes remains challenging.
Performance variability and yield: minimizing performance variations across individual devices and achieving high yields are crucial for reliable 3D integrated circuits. Due to the sensitivity of 2D materials to defects and process variations, optimizing synthesis, transfer, and fabrication methods to reduce defects and enhance uniformity is essential for commercial viability.
Heat dissipation bottleneck: the increased device density and power consumption in 3D integration generate considerable heat, which must be removed efficiently from the upper tiers of a 3D stack to prevent performance degradation and ensure device reliability. However, the low out-of-plane thermal conductivity of 2D materials, combined with challenges in integrating effective heat spreaders, complicates thermal management.
Thermal expansion mismatch: the mismatch in thermal expansion coefficients between 2D materials and substrates induces stress and strain during temperature changes, affecting device reliability. Careful selection of materials and integration strategies is essential to mitigate these effects.
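The stress and strain caused by mismatched thermal expansion coefficients can be estimated to first order with the standard thin-film relations: mismatch strain ε = (α_substrate − α_film) × ΔT and biaxial stress σ = E / (1 − ν) × ε. The Python sketch below assumes a film perfectly bonded to a much thicker substrate with no interfacial slip, which may not hold for van der Waals interfaces. All function names and material values are hypothetical placeholders, not data from the reviewed works.

```python
def mismatch_strain(alpha_film: float, alpha_substrate: float, delta_t: float) -> float:
    """In-plane strain imposed on a bonded film when temperature changes by delta_t (K).

    Positive values mean the film is stretched (tensile).
    """
    return (alpha_substrate - alpha_film) * delta_t


def biaxial_stress(youngs_modulus_pa: float, poisson_ratio: float, strain: float) -> float:
    """Equi-biaxial film stress in Pa: E / (1 - nu) * strain."""
    if not 0.0 <= poisson_ratio < 1.0:
        raise ValueError("Poisson ratio must lie in [0, 1)")
    return youngs_modulus_pa / (1.0 - poisson_ratio) * strain


# Hypothetical values (assumed for illustration only).
ALPHA_FILM = 7.0e-6       # film expansion coefficient, 1/K
ALPHA_SUBSTRATE = 3.0e-6  # substrate expansion coefficient, 1/K
DELTA_T = -400.0          # cooling from process to room temperature, K
E_FILM = 270.0e9          # film Young's modulus, Pa
NU_FILM = 0.25            # film Poisson ratio

strain = mismatch_strain(ALPHA_FILM, ALPHA_SUBSTRATE, DELTA_T)
stress = biaxial_stress(E_FILM, NU_FILM, strain)
```

With these placeholder numbers the film contracts more than the substrate on cooling, so the substrate holds it in tension; reversing the sign of the expansion-coefficient difference or of the temperature change would give compression instead.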
3D architecture and interconnects: creating reliable, low-resistance inter-tier vias (ITVs) that connect different tiers in a 3D stack is essential for signal integrity and overall performance. Traditional high-temperature or high-energy ITV fabrication processes may damage 2D materials. Thus, developing low-temperature, damage-free processes compatible with 2D materials is critical.
Electrostatic coupling: the close proximity of conductive elements in densely packed 3D structures can lead to significant parasitic capacitance and electrostatic coupling, causing signal degradation, noise, and crosstalk. The atomically thin nature and high integration density achievable with 2D materials exacerbate these challenges. Mitigating these effects requires careful layout design, use of low-κ dielectrics, shielding layers, and proper grounding strategies.
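The effect of spacing and dielectric constant on coupling can be sketched with the parallel-plate approximation C = κ × ε0 × A / d. This ignores fringing fields and real layout geometry, so it indicates trends only. The function name, overlap area, spacing, and the low-κ value below are hypothetical assumptions made for illustration.

```python
EPSILON_0 = 8.8541878128e-12  # vacuum permittivity, F/m


def coupling_capacitance(kappa: float, overlap_area_m2: float, spacing_m: float) -> float:
    """Parallel-plate estimate of coupling capacitance (F) between two conductors."""
    if kappa < 1.0:
        raise ValueError("relative permittivity must be at least 1")
    if overlap_area_m2 < 0.0 or spacing_m <= 0.0:
        raise ValueError("area must be non-negative and spacing positive")
    return kappa * EPSILON_0 * overlap_area_m2 / spacing_m


# Hypothetical geometry: 1 um x 1 um overlap with 50 nm separation.
AREA = 1.0e-12    # m^2
SPACING = 5.0e-8  # m

c_reference = coupling_capacitance(3.9, AREA, SPACING)   # kappa typical of SiO2
c_low_k = coupling_capacitance(2.5, AREA, SPACING)       # assumed low-k dielectric
c_closer = coupling_capacitance(3.9, AREA, SPACING / 2)  # tiers moved closer together
```

In this approximation, halving the separation doubles the coupling capacitance, while substituting a lower-κ dielectric reduces it in proportion to κ, which is why dense stacking and dielectric choice pull in opposite directions.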
Addressing these challenges is essential to unlock the full potential of 2D materials in monolithic 3D integration. The field is advancing rapidly, with ongoing research focused on developing new materials, fabrication processes, and integration strategies to overcome these hurdles and pave the way for commercially viable 2D-based 3D integrated circuits.
Footnote
† Y. S. and N. T. D. contributed equally to this work.
This journal is © The Royal Society of Chemistry 2024