Quasi-two-dimensional α-molybdenum oxide thin film prepared by magnetron sputtering for neuromorphic computing

Two-dimensional (2D) layered materials have attracted intensive attention in recent years because of their rich physical properties, and they show great promise for integrated electronics owing to their low power consumption and high integration density. However, because preparation is mostly limited to mechanical exfoliation, large-scale production of 2D materials for applications remains challenging. Herein, a quasi-2D α-molybdenum oxide (α-MoO3) thin film with an area larger than 100 cm² was fabricated by magnetron sputtering, a technique compatible with the modern semiconductor industry. An all-solid-state synaptic transistor based on this α-MoO3 thin film was designed and fabricated. Through proton intercalation/deintercalation, the α-MoO3 channel shows a reversible conductance modulation of about four orders of magnitude. Several indispensable synaptic behaviors, such as potentiation/depression and short-term/long-term plasticity, are successfully demonstrated in this synaptic device. In addition, multilevel data storage has been achieved. Supervised pattern recognition with high recognition accuracy is demonstrated in a simulated three-layer artificial neural network constructed from this α-MoO3 based synaptic transistor. This work can pave the way for large-scale production of α-MoO3 thin films for practical application in intelligent devices.


Introduction
Today, the Internet of Things and artificial intelligence have brought great convenience to our lives through big data analytics, autonomous vehicles, and speech and image recognition [1,2]. However, they generate large quantities of complex data and pose a huge challenge to conventional von Neumann architectures. Against this background of massive information, neuromorphic computing, which draws inspiration from the human brain to work in parallel with high energy efficiency, aims to solve problems that are computationally hard for conventional computer systems [3-5]. Synapses in neuromorphic systems combine information processing and memory by changing the synaptic weight [1]. The electrolyte-gated transistor (EGT) can modulate the conductance of its channel through ion migration, which mimics the function of biological synapses very well [6].
For more than five decades, the traditional transistor has been shrinking exponentially in size in order to increase the switching speed, reduce the power dissipation and improve the integration density [7]. However, the scaling down of the transistor is approaching the limit of miniaturization because of short-channel effects [8]. An alternative way around this bottleneck is to use atomically thin semiconductor films, such as 2D materials, as the channel layer of the field-effect transistor (FET) [9-15].
α-Molybdenum oxide (α-MoO3) has a well-known layered crystal structure, which offers the possibility of obtaining ultrathin films [16]. Layered α-MoO3 has unit cell parameters of a = 3.96 Å, b = 13.86 Å and c = 3.70 Å, is composed of double layers of linked and distorted MoO6 octahedra, and belongs to the space group Pbnm [17,18]. α-MoO3 has been studied as the channel layer of synaptic transistors for neuromorphic computing in recent works [19,20]. Unfortunately, in these works the α-MoO3 film was prepared by mechanical exfoliation, which cannot deliver large-scale production for practical applications. Wang et al. fabricated an α-MoO3 based two-terminal memristive device by pulsed laser deposition (PLD) [21]. However, the α-MoO3 thickness in their device is 400 nm, which forfeits the excellent characteristics of devices based on 2D materials. Moreover, signal transmission and learning cannot be carried out simultaneously in a two-terminal memristive device [19]. Therefore, large-scale production of α-MoO3 thin films for practical application in neuromorphic computing needs further research.
In this work, an all-solid-state synaptic transistor based on an α-MoO3 thin film prepared by magnetron sputtering is designed and studied. The surface morphology, crystal structure and valence characterizations indicate that a highly uniform, single-phase orthorhombic α-MoO3 thin film is obtained. A reversible conductance modulation of about four orders of magnitude is observed in the α-MoO3 channel through proton (H+) intercalation/deintercalation under positive/negative gate voltages. The essential synaptic functions, including potentiation, depression and short-term/long-term plasticity, were successfully mimicked. High recognition accuracy is achieved using a simulated artificial neural network built from this α-MoO3 synaptic device. Overall, our study paves the way for large-scale production of α-MoO3 thin films for practical application in intelligent devices.

Experimental section
α-MoO3 thin film preparation and characterizations
A Mo metal target (99.99%) was used to prepare the α-MoO3 thin film on a SrTiO3 (100) single-crystal substrate (MTI, Hefei) at 450 °C by radio-frequency magnetron sputtering. The working pressure was 0.65 Pa (O2 23% + Ar 77%) at a gas flow of 19.0 sccm. The sputtering power was 45 W and the deposition rate was about 0.1 Å s⁻¹. The thickness of the α-MoO3 thin film was about 18 nm. The surface morphology of the film was examined by atomic force microscopy (AFM, Solver P47 PRO, NT-MDT) and scanning electron microscopy (SEM, G300 FE-SEM System). The crystal structure and thickness of the film were characterized by X-ray diffraction (XRD, Smartlab, Rigaku Co). The thickness was also measured with a step profiler (Kosaka, ET 150). The chemical composition and bonding states of the α-MoO3 thin film were determined by X-ray photoelectron spectroscopy (XPS, Escalab 250).
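As a quick consistency check on the sputtering parameters above, the following sketch estimates the deposition time implied by the quoted rate and thickness, and splits the total gas flow according to the stated O2/Ar percentages. It assumes a constant deposition rate and that the percentages are flow fractions; neither is stated explicitly in the text, and the function names are illustrative only.

```python
def deposition_time_s(thickness_nm: float, rate_angstrom_per_s: float) -> float:
    """Time needed to grow a film of the given thickness at a constant rate."""
    thickness_angstrom = thickness_nm * 10.0  # 1 nm = 10 angstrom
    return thickness_angstrom / rate_angstrom_per_s


def split_flow(total_sccm: float, o2_fraction: float) -> tuple[float, float]:
    """Split a total flow into (O2, Ar) parts, assuming flow fractions."""
    o2 = total_sccm * o2_fraction
    return o2, total_sccm - o2


# Values quoted in the text: 18 nm film, ~0.1 angstrom/s, 19.0 sccm, 23% O2.
t = deposition_time_s(thickness_nm=18.0, rate_angstrom_per_s=0.1)
o2_sccm, ar_sccm = split_flow(total_sccm=19.0, o2_fraction=0.23)

print(f"implied deposition time ~ {t:.0f} s ({t / 60:.0f} min)")
print(f"O2 ~ {o2_sccm:.2f} sccm, Ar ~ {ar_sccm:.2f} sccm")
```

Under these assumptions the 18 nm film corresponds to roughly half an hour of sputtering, which is only an order-of-magnitude estimate because the quoted rate is itself approximate.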

Device fabrication
Aer photolithography patterning, the bottom Cr/Au(3 nm/20 nm) source/drain electrodes were prepared on SrTiO 3 (100) substrate by thermal evaporation and li-off method. The distance between source and drain electrodes was 20 mm. Then the a-MoO 3 thin lm used as the channel was deposited on the electrodes. And 30 nm amorphous Gd 2 O 3 lm was deposited on a-MoO 3 thin lm used as the electrolyte by radio-frequency magnetron sputtering at room temperature. During the deposition of Gd 2 O 3 lm, the working pressure was 0.65 Pa (O 2 50% + Ar 50%) in a gas ow of 19.0 sccm, and the sputtering power was 70 W. Finally, 6 nm Pd lm was deposited on Gd 2 O 3 lm through a Cu hard mask as the top gate electrode by directcurrent magnetron sputtering at room temperature. In this process, an all-solid-state synaptic transistor device is successfully fabricated.

Electrical measurement
The transfer curves of the α-MoO3 based synaptic transistor device were measured using a custom-designed 4-probe station with two Keithley 2400 source meters, both in air and under vacuum (<2 × 10⁻⁴ Pa). The gate voltage range was from −1.2 V to 2.5 V with a sweep rate of 2 mV s⁻¹. The synaptic plasticity of the device was characterized on the same 4-probe station in air.

Results and discussion
Structure and composition of α-MoO3 thin film
In this work, a layered α-MoO3 thin film was prepared on a SrTiO3 (100) single-crystal substrate by magnetron sputtering [22]. Fig. 1a shows a 2 × 2 μm AFM image (the scale bar is 300 nm) of the α-MoO3 thin film, which reveals a uniform surface. The root mean square (RMS) roughness is 1.31 nm, indicating that the film is smooth. As deduced from the X-ray reflectance spectrum (Fig. S1a, ESI†), the thickness of the α-MoO3 thin film is about 18 nm (~14 layers). The thickness was also confirmed with a step profiler (Fig. S1c and d, ESI†). Fig. 1b shows a top-view SEM image (the scale bar is 2 μm) of this film, which demonstrates that the film is flat and continuous over the whole field of view. XRD characterization was carried out to study the microstructure of the α-MoO3 thin film. As shown in Fig. 1c, the peaks at 2θ = 12.7°, 25.6° and 38.9° correspond to the (020), (040) and (060) planes of α-MoO3 (JCPDS: 05-0508), respectively [18,19]. No impurity peaks appear in Fig. 1c apart from the peaks of the α-MoO3 thin film and the SrTiO3 (100) substrate. Because the film is so thin, its peak intensities are weaker than those of a bulk α-MoO3 single crystal [18]. The XRD results demonstrate that the α-MoO3 thin film is a single orthorhombic phase with a (010) preferred orientation.
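The (0k0) peak assignment can be cross-checked against the b lattice parameter with Bragg's law. The sketch below is a minimal check that assumes Cu Kα radiation (λ = 1.5406 Å), which is typical for laboratory diffractometers but is not stated in the text; the helper name `d_spacing` is illustrative.

```python
import math

CU_K_ALPHA_ANGSTROM = 1.5406  # ASSUMED X-ray wavelength (Cu K-alpha); not stated in the text
B_REPORTED_ANGSTROM = 13.86   # b lattice parameter quoted in the Introduction


def d_spacing(two_theta_deg: float, wavelength: float = CU_K_ALPHA_ANGSTROM) -> float:
    """First-order Bragg's law, lambda = 2 d sin(theta), solved for d."""
    theta = math.radians(two_theta_deg / 2.0)
    return wavelength / (2.0 * math.sin(theta))


# Reported 2-theta positions keyed by the k index of the (0k0) reflection.
peaks = {2: 12.7, 4: 25.6, 6: 38.9}

# For an orthorhombic cell d(0k0) = b / k, so every peak gives an estimate of b.
b_estimates = {k: k * d_spacing(two_theta) for k, two_theta in peaks.items()}

for k, b_est in b_estimates.items():
    print(f"(0{k}0): b ~ {b_est:.2f} angstrom (reported {B_REPORTED_ANGSTROM})")
```

With the assumed wavelength, all three peaks give b within about 0.1 Å of the reported 13.86 Å, which is consistent with the (010) preferred orientation described above.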
The chemical composition and bonding states of the α-MoO3 thin film were characterized by XPS (Fig. 1d-f). The C 1s peak at 284.6 eV was used to calibrate the XPS spectra. High-resolution scans of Mo 3d (Fig. 1d) and O 1s (Fig. 1e) were collected and fitted with Gaussian-Lorentzian line shapes using a Shirley background. The binding energy peaks at 232.81 eV and 235.95 eV are the characteristic peaks of Mo 3d5/2 and Mo 3d3/2, respectively, which indicates that Mo in this film has a valence of +6. The binding energy peak at 530.52 eV is the characteristic peak of O 1s. All of these peaks are consistent with previous reports [23,24]. The survey XPS spectrum (150-600 eV) of the α-MoO3 thin film is shown in Fig. 1f, in which the characteristic peaks of Mo 3d, C 1s, Mo 3p, Mo 3s and O 1s can be identified. No peak related to impurity elements appears in the survey spectrum. In short, all of the above results indicate that a high-quality, single-phase α-MoO3 thin film was obtained on the SrTiO3 (100) substrate by magnetron sputtering.

Electrical properties of α-MoO3 based synaptic transistor
Recently, electrolyte-gated synaptic transistors have been studied extensively for neuromorphic computing because of their excellent performance [6,14,25]. In this study, an all-solid-state synaptic transistor based on the α-MoO3 thin film was successfully fabricated. Fig. 2a and b show schematic diagrams of the biological synapse and the α-MoO3 based synaptic transistor.
The biological synapse is composed of a presynaptic neuron, neurotransmitters, a synaptic cleft and a postsynaptic neuron (Fig. 2a). After an action potential arrives at the presynaptic neuron, neurotransmitters are released from the presynaptic neuron into the synaptic cleft and bind to specific receptors on the postsynaptic neuron, changing the postsynaptic potential and thereby transmitting information [26-29]. Imitating the biological synapse, the Gd2O3 solid-state electrolyte and the α-MoO3 channel serve in the transistor device as the presynaptic neuron and the postsynaptic neuron, respectively (Fig. 2b). In this artificial synaptic device, an external stimulus (voltage pulse) applied to the gate electrode triggers a hydrolysis reaction at the interface between the Gd2O3 electrolyte layer and the top Pd electrode, creating a large number of protons (H+) that function as neurotransmitters. Under a sufficiently high positive gate voltage, these protons are driven across the Gd2O3 electrolyte and injected into the α-MoO3 channel. As a result, the channel conductance (synaptic weight) is modulated to transmit information.
To characterize the performance of the transistor, the transfer curves were measured in air (Fig. 2c) and under vacuum (Fig. 2d). The gate voltage (V_G) was swept from 0 V to 2.5 V, from 2.5 V to −1.2 V, and then back to 0 V. The sweep rate was 2 mV s⁻¹ and the source-drain voltage (V_read) was 0.5 V. The red and blue lines represent the source-drain current (I_SD) and the gate current (I_G), respectively. In air, I_SD increases significantly (by about 10⁴ times) and shows a clear anticlockwise hysteresis as V_G is swept (Fig. 2c). When V_G is swept to positive values, H+ ions are injected into the MoO3 channel, producing HxMoO3 in a high-conductance state. This process can be described by the following reaction [16]:
$$\mathrm{MoO_3} + x\,\mathrm{H^+} + x\,\mathrm{e^-} \rightarrow \mathrm{H}_x\mathrm{MoO_3}$$
In contrast, when V_G is reduced, H+ ions are extracted from the HxMoO3 channel into the Gd2O3 film, restoring the low conductance of the MoO3 channel. This process can be described by the following reaction:
$$\mathrm{H}_x\mathrm{MoO_3} \rightarrow \mathrm{MoO_3} + x\,\mathrm{H^+} + x\,\mathrm{e^-}$$
The magnitude of I_G stays below 4 nA during the whole sweep cycle, which is more than three orders of magnitude smaller than I_SD, indicating that the leakage current has little effect on I_SD. However, when the air was pumped out of the probe station chamber and the transfer curve was measured under vacuum, I_SD increased only slightly and showed a tiny anticlockwise hysteresis as V_G was swept (Fig. 2d, with the same color coding for I_SD and I_G). The dramatically different gating responses observed in air and under vacuum indicate that the water molecules in air may play a crucial role in modulating the channel conductance [19,30,31]. In addition, theoretical calculations based on density functional theory have also predicted that the conductance of MoO3 can be changed by hydrogenation [32-34]. The resistance data of the α-MoO3 channel were extracted from the I_SD-V_G curves, as shown in Fig. 2e and f. In air, the channel resistance changes from 10⁸ Ω to 10⁴ Ω and exhibits a clear clockwise hysteresis as V_G is swept, whereas it changes only slightly under vacuum. The slight variation in channel resistance under vacuum is attributed to residual H+ ions in the device.

Synaptic plasticity of the synaptic transistor
The obvious hysteresis in the transfer curve of this transistor lays a good foundation for potential synaptic applications. To demonstrate the nonvolatile characteristics, the solid-state α-MoO3 transistor was trained by sending a series of voltage pulses to the gate electrode while the source-drain current was recorded. The synaptic plasticity characteristics of the artificial synaptic transistor device are shown in Fig. 3a. A series of single voltage pulses with different magnitudes (1.0-3.0 V) and an identical pulse width of 100 ms were applied to the gate electrode, with a source-drain voltage of 0.5 V. Similar to short-term plasticity (STP) in a biological synapse, I_SD shows a sharp peak in response to V_G and quickly decays to its initial value under low V_G of 1.0 V and 1.5 V. Under a higher V_G (>2.0 V), however, I_SD does not return to its initial value after the gate stimulation, indicating nonvolatile behavior. This means that memory behavior can be realized using higher-voltage gate pulses. A more detailed study of the synaptic plasticity characteristics under different V_G pulses (pulse magnitude: 1.0-3.0 V, pulse width: 50-500 ms) is shown in Fig. 3b and c. These values of I_SD were recorded 10 s after each spike. The change in channel conductance clearly increases with the pulse width and amplitude of V_G. Voltage pulses with longer duration and larger magnitude lead to heavier H+ doping of the α-MoO3 channel, resulting in a larger nonvolatile change in the channel conductance. The energy consumption of the plasticity process was also estimated. The energy consumption per spike is about 5.0 × 10⁻¹⁰ J, as calculated with the formula I_peak × Δt × V_read [35], where I_peak, Δt and V_read represent the peak value of I_SD (~20 nA), the pulse width (50 ms) and the source-drain voltage (0.5 V), respectively.
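The per-spike energy estimate follows directly from the three quoted values. The short sketch below reproduces the arithmetic; the function name and argument names are illustrative.

```python
def energy_per_spike(i_peak_a: float, pulse_width_s: float, v_read_v: float) -> float:
    """Energy per spike in joules: E = I_peak * dt * V_read."""
    return i_peak_a * pulse_width_s * v_read_v


# Values quoted in the text: ~20 nA peak current, 50 ms pulse, 0.5 V read voltage.
energy_j = energy_per_spike(i_peak_a=20e-9, pulse_width_s=50e-3, v_read_v=0.5)
print(f"{energy_j:.1e} J")  # 5.0e-10 J
```

Because the estimate is linear in each factor, a shorter pulse or a smaller peak current lowers the energy proportionally.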
The energy consumption can be further reduced by device miniaturization for practical applications.
It is well known that bidirectional plasticity, that is, long-term potentiation (LTP) and long-term depression (LTD), is a key concept in synapses. Potentiation and depression represent synaptic strengthening and weakening, respectively. These functions were mimicked in our synaptic transistor device. The channel conductance (synaptic weight) can be reversibly modulated by inserting H+ ions into, or extracting them from, the α-MoO3 thin film under positive or negative electric bias. Fig. 3d shows the LTP and LTD of the synaptic transistor. When 50 identical positive gate spikes (2.5 V, 100 ms) are applied, I_SD gradually increases (LTP); after 50 subsequent negative gate spikes (−2.0 V, 100 ms), I_SD decreases back to its initial value (LTD). Fig. 3e depicts the spike-number-dependent plasticity (SNDP) of the synaptic transistor, which was measured by monitoring the channel conductance after applying 1, 5, 10 and 20 identical pulses (2.0 V, 100 ms). I_SD clearly increases with the number of pulses. Fig. 3e also shows that after the gate spikes are withdrawn, the channel current decays slowly with time, much like the forgetting process in biological systems. However, as long as the decay time is much longer than the switching time, this decay has little effect on the operation of the synaptic device. Fig. 3d and e thus also demonstrate the nonvolatile behavior of the synaptic transistor.

Analog switching of the synaptic transistor
As shown above, several essential synaptic functions have been realized in the α-MoO3 based synaptic transistor device. The realization of multiple states in the synaptic weight is a prerequisite for artificial neuromorphic computing. Fig. 4a shows the 16/32/64 multi-level data storage functions obtained by applying 16/32/64 positive (2.0 V, 100 ms) and negative (−2.0 V, 100 ms) V_G pulses. The channel conductance values were calculated as I_SD divided by V_read. The G_max/G_min ratios of the 16/32/64 multiple states are as large as 15.9/55.2/183.0, respectively, where G_max and G_min represent the maximum and the minimum (initial) channel conductance values, respectively. A large G_max/G_min ratio offers the opportunity to accommodate more storage states.
To demonstrate the weight update behavior, repeated 16/32-state potentiation and depression cycling tests were performed, as shown in Fig. 4b and c. The transistor device exhibits good repeatability and stability. To check the linearity and stability of the weight update behavior during analog switching, the asymmetric ratio (AR) and the cycle-to-cycle variation (C2C) were calculated, respectively. The AR can be obtained using the following formula [20,37,38]:
$$\mathrm{AR} = \frac{\max_{n}\left|G_P(n) - G_D(n)\right|}{G_{\max} - G_{\min}}$$
where n (from 1 to 32) is the voltage pulse number of the 32-state potentiation and depression cycle, G_P(n) and G_D(n) are the channel conductance values at the nth state during the potentiation and depression processes, respectively, and G_max and G_min are the maximum and minimum channel conductance values, respectively. The value of AR is zero for an ideal linear device. Here, the calculated AR of the α-MoO3 based synaptic transistor device is 0.58 ± 0.0069, which is comparable to a previous report [37]. The linearity of our device still needs to be improved to achieve higher image recognition accuracy, which will be discussed in the next section. The cycle-to-cycle variation (the average of the channel conductance standard deviation divided by the maximum conductance) [20,26,27] was characterized with 5 sequential switching cycles. It can be defined as:
$$\mathrm{C2C} = \frac{1}{32\,G_{\max}} \sum_{n=1}^{32} \sqrt{\frac{1}{5}\sum_{m=1}^{5}\left(G_m(n) - s(n)\right)^2}$$
where m (from 1 to 5) is the cycle number and n (from 1 to 32) is the voltage pulse number within each cycle. s(n) is the mean channel conductance of the nth state, that is, the average of the channel conductance at the nth voltage pulse over the 5 switching cycles. G_m(n) is the channel conductance of the nth state in the mth cycle, and G_max is the maximum value of the channel conductance. The calculated C2C of the α-MoO3 based synaptic transistor device is as low as 1.02%. This low C2C demonstrates low write noise during neuromorphic computing.
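To make the two figures of merit concrete, the sketch below computes AR and C2C from lists of conductance values, following the verbal definitions above (maximum potentiation/depression mismatch normalized by the conductance window, and the state-averaged standard deviation normalized by G_max). The exact normalization is an assumption based on those definitions, the function names are illustrative, and the numbers used are synthetic, not measured data.

```python
import math


def asymmetric_ratio(g_p, g_d):
    """AR = max_n |G_P(n) - G_D(n)| / (G_max - G_min).

    g_p and g_d must both be ordered by state index, so that an ideal
    linear device gives g_p[n] == g_d[n] and therefore AR == 0.
    """
    g_all = list(g_p) + list(g_d)
    span = max(g_all) - min(g_all)
    return max(abs(p - d) for p, d in zip(g_p, g_d)) / span


def cycle_to_cycle_variation(cycles):
    """Mean over states of the per-state standard deviation, divided by G_max.

    cycles[m][n] is the conductance of state n in cycle m.
    """
    n_cycles = len(cycles)
    n_states = len(cycles[0])
    g_max = max(max(cycle) for cycle in cycles)
    total = 0.0
    for n in range(n_states):
        values = [cycle[n] for cycle in cycles]
        mean = sum(values) / n_cycles
        total += math.sqrt(sum((v - mean) ** 2 for v in values) / n_cycles)
    return total / (n_states * g_max)


# Synthetic conductances in arbitrary units (NOT experimental data).
ar_ideal = asymmetric_ratio([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0])
ar_skewed = asymmetric_ratio([1.0, 3.0, 4.0], [1.0, 2.0, 4.0])
c2c_demo = cycle_to_cycle_variation([[1.0, 2.0], [1.0, 4.0]])

print(ar_ideal, ar_skewed, c2c_demo)
```

In the synthetic example, identical potentiation and depression branches give AR = 0, while a one-unit mismatch across a three-unit window gives AR = 1/3.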

Simulation of supervised pattern recognition
Finally, the computing performance of the α-MoO3 based synaptic transistor was evaluated. As shown in Fig. 5a, a three-layer artificial neural network (with one hidden layer) was constructed to perform supervised learning based on a back-propagation algorithm. The neurons in adjacent layers are fully connected. Back-propagation is a widely used method for training artificial neural networks in neuromorphic computing [39]. Two data sets were employed to train this artificial neural network: a small-image version (8 × 8 pixels) of handwritten digits from the "Optical Recognition of Handwritten Digits" data set [40], and a large-image version (28 × 28 pixels) of handwritten digits from the "Modified National Institute of Standards and Technology" (MNIST) data set [41]. Fig. 5b shows the crossbar array of the CrossSim simulator, which was used to perform the vector-matrix multiplication and outer-product update operations [36,42]. The crossbar array involves read pulses (red), programming pulses (green) and read outputs (blue). The α-MoO3 channel conductance represents the synaptic weight. The LTP/LTD experimental data in Fig. 4b were selected for the supervised simulation. The simulation results are shown in Fig. 5c-f. The channel conductance change (ΔG) between successive experimental values was characterized to evaluate the device nonideality. In contrast to C2C, ΔG is the change in channel conductance induced by a single voltage pulse within each cycle. The distribution of ΔG in the LTP process (Fig. 5c) is narrower than that in the LTD process (Fig. 5d) over the entire range of G. The image recognition accuracy of the simulated network after 25 training epochs is plotted in Fig. 5e and f. High image recognition accuracies of 94% for the small images (Fig. 5e) and 88% for the large images (Fig. 5f) were obtained. Although the recognition accuracy is comparable to recently reported results for α-MoO3 based synaptic devices [19-21], further work is still required to optimize the linearity and symmetry of analog switching in this device in order to increase the image recognition accuracy. Finally, the properties of our α-MoO3 synaptic transistor are compared with those of other devices reported in the literature, as summarized in Table 1 [19-21,26,27,43-45]. Our solid-state device shows high stability and good recognition accuracy. Moreover, no ionic liquid or organic material is used in our device, which makes it more compatible with current semiconductor processes.
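The two crossbar operations named in this section, vector-matrix multiplication and the outer-product weight update, can be illustrated in a few lines. The sketch below is plain Python and does not use or reproduce the CrossSim API; it assumes an ideal, linear conductance update clipped to a device window, so it ignores the measured nonlinearity and ΔG distributions. All names and values are illustrative.

```python
def vmm(voltages, conductances):
    """Vector-matrix multiply: column current I_j = sum_i V_i * G_ij."""
    n_cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(voltages, conductances))
        for j in range(n_cols)
    ]


def outer_product_update(conductances, x, delta, lr, g_min, g_max):
    """Update G_ij += lr * x_i * delta_j, clipped to the range [g_min, g_max]."""
    return [
        [min(g_max, max(g_min, g + lr * xi * dj)) for g, dj in zip(row, delta)]
        for xi, row in zip(x, conductances)
    ]


# Illustrative 2 x 2 conductance matrix in arbitrary units (not device data).
G = [[1.0, 2.0], [3.0, 4.0]]

# Read step: input voltages on the rows, summed currents on the columns.
currents = vmm([0.5, 1.0], G)  # [0.5*1 + 1*3, 0.5*2 + 1*4] = [3.5, 5.0]

# Write step: the rank-one update touches only row 0 because x[1] is zero.
G_new = outer_product_update(
    G, x=[1.0, 0.0], delta=[0.5, 10.0], lr=1.0, g_min=1.0, g_max=4.0
)
# Row 0 becomes [1.5, 4.0] (second entry clipped at g_max); row 1 is unchanged.
```

The clipping step is where the finite G_max/G_min window of a real device enters: updates that would leave the window are simply saturated in this simplified picture.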

Conclusion
In summary, an all-solid-state synaptic transistor based on a single-phase α-MoO3 thin film prepared by magnetron sputtering was designed and fabricated. A reversible conductance modulation of four orders of magnitude was realized experimentally in the α-MoO3 thin film through H+ ion intercalation/deintercalation. Several essential behaviors of biological synapses, including STP, LTP/LTD and SNDP, were successfully mimicked in our device. A simulated artificial neural network based on the α-MoO3 synaptic transistor was constructed to perform supervised learning. High recognition accuracies of 94% and 88% were achieved for the Handwritten Digits data set (small images) and the MNIST data set (large images), respectively. In addition, both the channel material and the electrolyte material in this device are prepared by magnetron sputtering, which is compatible with modern semiconductor technology. This work can pave the way for large-scale production of quasi-2D materials for practical application in synaptic transistor devices.

Conflicts of interest
There are no conicts to declare.