I Introduction
Many emerging applications have posed new challenges for the design of conventional analog-to-digital (A/D) converters (ADCs) [1, 2, 3, 4]. For example, multi-sensor systems desire programmable nonlinear A/D quantization that maximizes the extraction of useful features from the raw analog signal, instead of the uniform quantization performed directly by conventional ADCs [3, 4]. This can alleviate the computational burden and reduce the power consumption of back-end digital processing, which is the dominant bottleneck in intelligent multi-sensor systems. However, such flexible and configurable quantization schemes are not readily supported by conventional ADCs, whose dedicated circuitry has fixed conversion references and thresholds.
To overcome this inherent limitation of conventional ADCs, several recent works [5, 6, 7] have introduced neural-network-inspired ADCs (NN-ADCs) as a novel approach to designing intelligent and flexible A/D interfaces. For instance, a learnable 8-bit NN-ADC [7] approximates multiple quantization schemes whose NN weight parameters are trained offline and can be configured by programming the same hardware substrate. Another example is a 4-bit neuromorphic ADC [6] proposed for general-purpose data conversion using online training that leverages the input amplitude statistics and application sensitivity. These NN-ADCs are often built on resistive random-access memory (RRAM) crossbar arrays to realize the basic NN operations, and can be trained to approximate the specific quantization (conversion) functions required by different systems. However, a major challenge in designing such NN-ADCs is the limited conductance (resistance) resolution of RRAM devices. Although these NN-ADCs optimistically assume that each RRAM cell can be precisely programmed with 6–12-bit resolution, measured data from realistic fabrication processes suggest that the actual RRAM resolution tends to be much lower (2–4-bit) [8, 9]. There is therefore a gap between the reality and the assumption of RRAM precision, and a design methodology for building super-resolution NN-ADCs from low-precision RRAM devices is still lacking.
In this paper, we bridge this gap by introducing an NN-inspired design methodology that constructs super-resolution ADCs with low-precision RRAM devices. Taking advantage of a co-design methodology that combines a pipelined hardware architecture with a deep-learning-based custom training framework, our method achieves an NN-inspired ADC whose resolution far exceeds the precision of the underlying RRAM devices. The key idea of the pipelined architecture is that many consecutive low-resolution (1–3-bit) quantization stages can be cascaded in a chain structure to obtain higher resolution. Since each stage now only needs to resolve 1–3 bits, we can accurately train and instantiate it with low-precision RRAM devices to approximate the ideal quantization functions and residue functions. The key innovations and contributions of this paper are as follows:
We propose a co-design methodology that leverages a pipelined hardware architecture and a custom training framework to achieve super-resolution analog-to-digital conversion that far exceeds the limited precision of the RRAM devices.

We systematically evaluate the impact of NN size and RRAM precision on the accuracy of the NN-inspired sub-ADC and residue block, and perform a design space exploration to search for optimal pipelined stage configurations with a balanced trade-off between speed, area, and power consumption.

SPICE simulation results demonstrate that the proposed method generates a robust design for a 14-bit super-resolution NN-ADC using 3-bit RRAM devices. Comparisons with both state-of-the-art ADCs and other NN-ADC designs reveal improved performance and competitive figures-of-merit (FoMs).

Our proposed ADC can also support configurable nonlinear quantization with high resolution, high conversion speed, and low conversion energy.
II Preliminaries
II-A RRAM Device, Crossbar Array and NN
II-A1 RRAM device
II-A2 RRAM crossbar array
RRAM devices can be organized into various ultra-dense crossbar array architectures. Fig. 1(a) shows a passive crossbar array composed of two sub-arrays to realize bipolar weights without the use of power-hungry operational amplifiers (op-amps) [7]. The relationship between the input voltage "vector" V_i and the output voltage "vector" V_o can be expressed as V_o = W^T V_i. Here, i (1 ≤ i ≤ M) and j (1 ≤ j ≤ N) are the indices of the input ports and output ports of the crossbar array. Each weight w_ij is represented by the subtraction of two conductances in the upper (g⁺_ij) sub-array and lower (g⁻_ij) sub-array as

(1)   w_ij = (g⁺_ij − g⁻_ij) / Σ_k (g⁺_kj + g⁻_kj)

Therefore, the RRAM crossbar array is capable of performing analog vector-matrix multiplication (VMM), with the matrix parameters determined by the RRAM resistance states.
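To make the crossbar VMM of Eq. (1) concrete, here is a small numerical sketch. The conductance ranges, array size, and input voltages are illustrative, and the per-column normalization is an assumption reflecting the passive (op-amp-free) divider structure:

```python
import numpy as np

def crossbar_vmm(v_in, g_pos, g_neg):
    """Idealized VMM of a passive bipolar crossbar (a sketch, not the SPICE model).

    Each effective weight is the difference of an upper and a lower conductance,
    normalized per column so the passive array needs no op-amp gain.
    """
    w = (g_pos - g_neg) / np.sum(g_pos + g_neg, axis=0)  # shape (M, N), Eq. (1)
    return w.T @ v_in                                    # output voltage "vector"

rng = np.random.default_rng(0)
g_pos = rng.uniform(1e-6, 1e-4, size=(4, 3))  # conductances in siemens (illustrative)
g_neg = rng.uniform(1e-6, 1e-4, size=(4, 3))
v_in = np.array([0.2, -0.1, 0.4, 0.05])       # input voltages (illustrative)
v_out = crossbar_vmm(v_in, g_pos, g_neg)
```

Because each column's absolute weights sum to at most 1, every output voltage stays within the input voltage range, consistent with a passive array.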
II-A3 Artificial NN
With the RRAM crossbar array, the NN shown in Fig. 1(b) can be implemented on such a hardware substrate. Generally, the NN processes data by executing the following operation layer-wise [17]:

(2)   x^(l+1) = f(W^(l) x^(l))

Here, W^(l) is the weight matrix connecting layer l and layer l+1, and f(·) is a nonlinear activation function (NAF). These basic NN operations, i.e., VMM and NAF, can be mapped to the RRAM crossbar arrays and CMOS inverters shown in Fig. 1(a), where the voltage transfer characteristic (VTC) of the inverter is used as the NAF [7].

II-B NN-Inspired ADCs
A/D conversion can be viewed as a special case of the classification problem, mapping a continuous analog signal to a multi-bit digital code. An NN can be trained to learn this input-output relationship, and a hardware implementation of this NN can be instantiated in the analog and mixed-signal domain. This is the basic idea behind NN-ADCs, which implement the learned NN on a hardware substrate to approximate the desired quantization functions for data conversion:
(3)   F_q : V_in ∈ [V_min, V_max] → (D_1, D_2, …, D_N),   D_i ∈ {0, 1}

where N is the resolution; V_in is the input analog signal and (D_1, …, D_N) are the output digital codes; V_min and V_max are the minimum and maximum values of the scalar input signal V_in. Since the RRAM crossbar array provides a promising hardware substrate for building NNs, recent work has demonstrated several NN-ADCs based on RRAM devices [5, 6, 7]. Although the NN architectures adopted by these NN-ADCs vary, they all rely on a training process to learn the appropriate NN weights, so that flexible quantization schemes can be approximated and reconfigured by programming the weights stored in RRAM conductance (resistance). However, existing NN-ADCs [5, 6, 7] often exhibit modest conversion resolution (4–8-bit) and invariably rely on optimistic assumptions about RRAM precision (6–12-bit) that are not well substantiated by measurement data from realistic RRAM fabrication processes [8, 9]. This resolution limitation severely constrains the application of NN-ADCs in emerging multi-sensor systems that require high-resolution (≥10-bit) A/D interfaces for feature extraction and near-sensor processing [1, 3, 4].

II-C Pipelined ADCs
Pipelined architecture is a well-established ADC topology for achieving high sampling rate and high resolution with low-resolution quantization stages [11]. Fig. 2(a) illustrates a typical pipelined ADC with M stages, whose resolution RESO is achieved by concatenating the B_i bits of each stage with a digital combiner: RESO = Σ_{i=1}^{M} B_i. Note that B_i is usually small and not necessarily identical across stages. As Fig. 2(a) illustrates, an arbitrary stage i contains two sub-blocks: a sub-ADC and a residue. The sub-ADC resolves B_i-bit binary codes from the input residue, while the residue part amplifies the difference between the input residue and the analog output of the sub-DAC by 2^{B_i} to generate the output residue for the next stage. This process can be expressed as a simple function:

(4)   V_res,i = 2^{B_i} (V_res,i−1 − V_DAC,i)

Here, V_DAC,i is the analog output of the sub-DAC, which depends on the digital code resolved by the sub-ADC; Fig. 2(b) shows the corresponding residue function of a single stage. To understand the basic working principle of pipelined ADCs, we use the 4-bit pipelined ADC with four 1-bit stages in Fig. 2(c) as an example. Assuming the initial analog input falls in the upper half of the full-scale range, the first stage outputs "1" as its digital code and generates an analog residue according to Eq. (4), which the following stage then processes in the same way as the initial analog input. Finally, we obtain a 4-bit output code that is the quantization of the initial input. This example also shows that a higher resolution (4-bit) can indeed be constructed from low-precision (1-bit) stages in a pipelined ADC.
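The stage recursion of Eq. (4) is easy to simulate behaviorally. The following minimal sketch assumes ideal stages and a 1 V full-scale range (both illustrative choices, not the paper's circuit values):

```python
def pipelined_adc(v_in, n_stages=4, bits_per_stage=1, v_fs=1.0):
    """Behavioral model of an ideal pipelined ADC, applying Eq. (4) per stage.

    Each stage resolves `bits_per_stage` bits of its input residue, subtracts
    the sub-DAC output, and amplifies the remainder by 2**bits_per_stage.
    """
    code, residue = 0, v_in
    for _ in range(n_stages):
        levels = 2 ** bits_per_stage
        d = min(int(residue / v_fs * levels), levels - 1)  # sub-ADC decision
        v_dac = d * v_fs / levels                          # sub-DAC output
        residue = (residue - v_dac) * levels               # residue for next stage
        code = (code << bits_per_stage) | d                # digital combiner
    return code

# 0.6 V through four ideal 1-bit stages -> code 9 (binary 1001), i.e. 9/16 V
assert pipelined_adc(0.6) == 9
```

The four 1-bit stages reproduce exactly what a direct 4-bit uniform quantizer would output, illustrating how low-resolution stages compose into a higher overall resolution.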
III Co-Design Methodology
III-A Hardware Substrate
III-A1 Pipelined architecture
The observation from traditional pipelined ADCs motivates us to extend such an architecture to NN-ADCs to enhance their resolution beyond the limit of RRAM precision. The overall hardware architecture of the proposed high-resolution NN-ADC is presented in Fig. 3(a), where a pipelined architecture composed of cascaded conversion stages is adopted. This pipelined architecture brings two direct benefits. First, each stage in the proposed NN-ADC now only needs to resolve a 1–3-bit quantization, which is well within the precision limit of current RRAM fabrication processes [8, 9] and can easily be achieved with the automated design methodology introduced in previous work [7]. Second, although many cascaded stages are needed, there are only three distinct low-resolution configurations to choose from for each stage, namely 1-bit, 2-bit, and 3-bit. This allows us to simplify the design process by focusing on optimizing the sub-block design of each stage at these resolutions. The full pipelined system can then be assembled by iterating through combinations of sub-blocks with different resolutions.
III-A2 Low-resolution NN-ADC stage
For each stage in the proposed NN-ADC, we use a five-layer NN to implement the sub-ADC and the residue block. The five-layer NN can be decomposed into two three-layer sub-blocks, each of which maps to the corresponding sub-ADC and residue in Fig. 2(a). The cornerstone of this mapping methodology is the universal approximation theorem, which states that a feed-forward three-layer NN with a single hidden layer can approximate arbitrarily complex functions [13]. We use the RRAM crossbar array and CMOS inverters illustrated in Fig. 1(a) as the hardware substrate to design the sub-blocks of each stage. As Fig. 3(b) shows, for the sub-ADC, the input analog signal feeds the single "placeholder" neuron in the MLP's input layer. Therefore, the weight matrix dimensions are 1 × H between the input and hidden layer, and H × K between the hidden and output layer, assuming there are H and K neurons in the hidden and output layers. Here, we use a redundant "smooth" encoding method that replaces the standard B-bit binary encoding with K bits (K > B) following previous work [7], as it improves training accuracy and reduces the hidden-layer size of the sub-ADC. For example, we use smooth codes to train the 2-bit sub-ADC with 3-bit smooth codes as output in Fig. 4(b). For the residue, there are K + 1 input neurons (one analog input and K-bit smooth digital codes from the preceding sub-ADC block) and only one analog output neuron; therefore, the weight matrix dimensions are (K + 1) × H′ between the input and hidden layer and H′ × 1 between the hidden and output layer, assuming there are H′ hidden neurons. Sample-and-hold (S/H) circuits [18] are used in the output layer to drive the next stage. Since the op-amps in Fig. 2(a) are eliminated in the NN-inspired design of the residue circuit, considerable power savings can be obtained in each stage.

III-B Training Framework
III-B1 Training overview
We propose a training framework that accurately captures the circuit-level behavior of the hardware substrate in its mathematical model and is able to learn robust NNs and their associated hardware design parameters (i.e., RRAM conductances) to approximate the sub-ADC and residue of each stage. The training framework incorporates two important features. First, we employ collaborative training for the two sub-blocks in each stage. The sub-ADC is initially trained to approximate the ideal quantization function with high fidelity; its digital outputs and the original analog input are then fed directly to the residue block for residue training. This collaborative training flow effectively minimizes the discrepancy between the circuit artifacts and the ideal conversion at each stage. Second, device non-idealities, such as process, voltage, and temperature (PVT) variations of the CMOS devices and the limited precision of the RRAM devices, can be incorporated into training to make the proposed NN-ADC robust to these defects [14]. This is another advantage of the proposed NN-ADC over traditional ADC designs, where even delicate calibration techniques cannot fully mitigate the non-idealities [11].
III-B2 Training steps
The detailed training flow is shown in Fig. 3(b) and consists of four steps. We focus on describing the training steps for the residue block, as we adopt a similar sub-ADC training method that has been elaborated in previous work [7, 14].
Step ①: establish the learning objective. For the residue circuit, the output is an analog value; therefore, the hardware substrate can be modeled as a three-layer NN with a "placeholder" output neuron:

(5)   v = W^(1) [V_res; D],   h = f(v),   V̂_res = w^(2) h

Here, D = (D_1, …, D_K) with D_i ∈ {0, 1} indicates the digital output of the sub-ADC ("1" means V_DD and "0" means GND), and V_res is the scalar residue input of the stage; v = (v_1, …, v_H′) denotes the outputs of the first crossbar layer, which are modeled as a linear function of V_res and D, with learnable parameters W^(1) corresponding to the RRAM crossbar conductances. Each of these voltages is passed through an inverter (shown in Fig. 1(a)), whose input-output relationship is modeled by the nonlinear function f(·), to yield the vector h. The linear function with learnable parameters w^(2) models the second crossbar layer to produce the output residue V̂_res for the next stage. The learning objective is to find optimal values of the parameters such that, for all values of V_res in the input range, the circuit yields residues V̂_res that are equal or close to the desired "ground truth" V*_res of Eq. (4). To this end, we define a cost function that measures the discrepancy between the predicted V̂_res and the true V*_res based on the mean-square loss:

(6)   L = (1/S) Σ_{s=1}^{S} ( V̂_res^(s) − V*_res^(s) )²

over S training samples.
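Eqs. (5) and (6) amount to a small dense network evaluated under a mean-square loss. Below is a minimal numpy sketch; tanh stands in for the measured inverter VTC, and all layer sizes and parameter values are chosen arbitrarily for illustration:

```python
import numpy as np

def residue_nn(v_res, d_bits, W1, w2, vtc=np.tanh):
    """Three-layer model of Eq. (5): a linear crossbar layer, an inverter-like
    nonlinearity, then a linear layer to one analog output neuron.
    tanh is only a smooth placeholder for the real inverter VTC."""
    x = np.concatenate(([v_res], d_bits))  # analog residue + digital codes
    h = vtc(W1 @ x)                        # first crossbar layer + "inverters"
    return float(w2 @ h)                   # scalar output residue

def mse_loss(pred, target):
    """Mean-square discrepancy of Eq. (6)."""
    return float(np.mean((np.asarray(pred) - np.asarray(target)) ** 2))

rng = np.random.default_rng(0)
W1 = rng.normal(size=(6, 3))   # (hidden neurons, 1 analog + 2 digital inputs)
w2 = rng.normal(size=6)
y = residue_nn(0.3, np.array([1.0, 0.0]), W1, w2)
```

In training, gradients of `mse_loss` with respect to the weights would be computed by a framework such as TensorFlow, as described in Section IV-A.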
Step ②: model hardware constraints. Hardware constraints come from three aspects: PVT variations of the CMOS neurons, the limited precision of the RRAM devices, and the passive crossbar array. To reflect these constraints, we first group all VTCs obtained by Monte Carlo simulations into a set F_VTC using the technology specification in Section IV-A. Meanwhile, we restrict the precision of the weights to b bits during training. Finally, we require the summation of the absolute values of all elements in each column j of W to be 1:

(7)   Σ_i |W_ij| = 1

to reflect the weight constraints of Eq. (1).
Step ③: hardware-oriented training. We initialize the parameters randomly and update them iteratively based on gradients computed on mini-batches of training pairs randomly sampled from the input range. To incorporate the hardware constraints of Step ② into training, we let each neuron in Eq. (5) randomly pick a VTC from the Monte Carlo group F_VTC during training:

(8)   h_j = f_k(v_j),   f_k ∈ F_VTC

We then periodically clip and rescale all weight values to satisfy Eq. (7).
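The periodic projection in Steps ② and ③ can be sketched as follows. The per-column unit L1 norm mirrors Eq. (7), while the uniform b-bit weight grid is an assumption about how the limited RRAM precision is modeled:

```python
import numpy as np

def project_weights(W, n_bits=3):
    """Periodic projection used during training (a sketch of Steps 2-3).

    Columns are rescaled to unit L1 norm to respect the passive-crossbar
    constraint of Eq. (7), then snapped to the grid that an n_bits RRAM
    device can store (uniform levels are an assumption)."""
    W = W / np.maximum(np.sum(np.abs(W), axis=0), 1e-12)  # Eq. (7), per column
    levels = 2 ** n_bits
    return np.round(W * levels) / levels                   # limited precision

Wp = project_weights(np.array([[0.8, -0.4], [0.4, 1.2]]), n_bits=3)
```

After projection every weight lies in [−1, 1] on a 2³-level grid, so the trained matrix can be instantiated with 3-bit RRAM conductances.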
Step ④: instantiate conductance values. We adopt the same instantiation method as previous work [7], which is proven to always find a set of equivalent conductances from the trained weights and biases to map onto the RRAM devices of the hardware substrate. After this, we perturb each resistance R by:

(9)   R′ = R (1 + δ),   δ ∼ N(0, σ²)

to evaluate the robustness of the NN model to the stochastic variation of the RRAM resistance [2].
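A sketch of this perturbation step follows. The multiplicative zero-mean Gaussian form is the usual RRAM variation model; the exact distribution used in Eq. (9) is an assumption here, as are the nominal resistance and σ values:

```python
import numpy as np

def perturb_resistance(R, sigma=0.05, rng=None):
    """Apply a stochastic variation in the spirit of Eq. (9):
    R' = R * (1 + delta), delta ~ N(0, sigma^2)."""
    rng = rng or np.random.default_rng()
    return R * (1.0 + rng.normal(0.0, sigma, size=np.shape(R)))

R = np.full(1000, 1e5)  # 100 kOhm nominal cells (illustrative)
Rp = perturb_resistance(R, sigma=0.05, rng=np.random.default_rng(1))
```

Running many such perturbed instantiations through SPICE (the Monte Carlo batches of Section IV-B) is what establishes the robustness claims.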
III-C Examples of Trained Sub-ADC and Residue
Fig. 4 illustrates the SPICE simulation of different stages trained with the proposed training framework. The sub-ADC and the residue in Fig. 4(a) are each trained through a small three-layer NN for a 1-bit stage, while the sub-ADC and the residue in Fig. 4(b) are trained for a 2-bit stage. In both figures, we use 3-bit RRAM devices and apply the stochastic variation of Eq. (9) for evaluation. The comparison between the trained functions and the ideal functions shows that each stage with low-precision RRAM can accurately approximate the ideal stage function with the aid of the proposed training framework.
IV Experimental Results
IV-A Experimental Methodology
IV-A1 Training configuration
We set B = 1, 2, 3 to get three distinct resolution configurations for the pipeline stages in our experiments. For each stage, we train different NN models; each NN model is trained via stochastic gradient descent with the Adam optimizer using TensorFlow [15]. The weight precision during training is set to 1–7-bit. The batch size is 4096, and the projection step is performed every 256 iterations. We train each sub-ADC model and residue model for a fixed budget of iterations, annealing the learning rate across the iterations.

IV-A2 Technology model
We use an HfOx-based RRAM device model to simulate the crossbar array [16]. We set the resistance stochastic variation to a moderate value based on the evaluations in prior work [17]. The transistor model is based on a standard 130 nm CMOS technology. The inverters, output comparators, and transistor switches in the RRAM crossbars are simulated with the 130 nm model using Cadence Spectre. The VTC group is obtained by running 100 Monte Carlo simulations. The simulation results presented in the following section are all based on SPICE simulation.
IV-A3 Metric of training accuracy
The trained accuracy of the proposed NN-ADC is represented by the effective number of bits (ENOB), a metric that evaluates the effective resolution of an ADC. We report ENOB based on its standard definition, ENOB = (SNDR − 1.76)/6.02, where the signal-to-noise-and-distortion ratio (SNDR) is measured from the proposed NN-ADC's output spectrum. The training accuracy of the residue circuit is represented by the mean-square error (MSE) between the predicted residue function and the ideal residue function. We report the MSE based on 2048 uniform sampling points over the full input range.
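The ENOB metric can be sanity-checked on an ideal quantizer. The helpers below implement the standard definitions; the coherently sampled sine and the 14-bit quantizer are illustrative test signals, not the paper's measured data:

```python
import numpy as np

def enob(sndr_db_val):
    """Standard definition: ENOB = (SNDR - 1.76) / 6.02."""
    return (sndr_db_val - 1.76) / 6.02

def sndr_db(signal, converted):
    """SNDR in dB: signal power over total error (noise + distortion) power."""
    signal = np.asarray(signal)
    err = np.asarray(converted) - signal
    return 10 * np.log10(np.mean(signal**2) / np.mean(err**2))

t = np.arange(4096)
x = 0.5 * np.sin(2 * np.pi * 127 * t / 4096)  # 127 cycles: coherent, odd bin
q = np.round(x * 2**14) / 2**14               # ideal 14-bit uniform quantizer
```

Here `enob(sndr_db(x, q))` comes out close to 14 bits, as expected for an ideal 14-bit quantizer driven by a full-scale sine.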
IV-B Sub-block Evaluations
IV-B1 Resolution and robustness
To find a robust design for each stage, we study the relationship between the trained accuracy and the RRAM precision of each sub-block with different NN sizes at a fixed stochastic variation. For these experiments, we first incorporate both CMOS PVT variations and the limited precision of the RRAM devices into training, then instantiate several batches of 100-run Monte Carlo simulations with the resistance variation of Eq. (9), and finally compute the median accuracy of each model.
We plot the trends in Fig. 5. Generally, a moderate RRAM precision is enough to train an NN model to accurately approximate a low-resolution sub-ADC, which confirms the conclusion of previous work [7]. In particular, larger NN models with more hidden neurons can accurately approximate the same sub-ADC with even lower RRAM precision. Similar conclusions can be drawn for the trained residue circuits. As Fig. 5(b) shows, a moderate RRAM precision is enough to train an NN model to accurately approximate a residue circuit, and a larger NN with more hidden-layer neurons can approximate the residue circuit of a given stage with lower RRAM precision.
IV-B2 Sub-block design trade-off
Each stage has a design trade-off among power consumption, sampling rate, and area. A complete design space exploration may involve searching over different NN sizes for each sub-block in a stage, RRAM precisions, and stochastic variations. Here, we use the three pairs of sub-blocks highlighted by the solid boxes in Fig. 5 as an example to illustrate the design trade-off, since each of them shows sufficient accuracy and robustness with no more than 4-bit RRAM precision. For these experiments, we combine each pair of sub-blocks to form three distinct stages with resolutions of 1, 2, and 3 bits, respectively. We then fix the RRAM precision at 3-bit for all building blocks except for one residue block, which uses 4-bit RRAM devices. We finally study the relationship between the power, speed, and area of each distinct stage by simulating the minimum power consumption and area of each distinct stage that works well at different sampling rates.
The trends are plotted in Fig. 6, which shows clear trade-offs between speed and power consumption, as well as between speed and area, for each distinct stage. This is because, to make each sub-block work well at a higher speed, we need to increase the driving strength of the neurons by sizing up the inverters, which increases the power consumption and area of each stage.
IV-B3 Design optimization
Based on the exploration of different sub-block configurations, an optimal design for the proposed ADC with a given resolution can be derived by solving the following optimization problem:

(10)   min FoM = P / (2^ENOB · f_s),   then min A,   s.t. Σ_i B_i = RESO

Here, the first objective, the FoM, is a standard figure-of-merit describing the energy consumption of one conversion step of an ADC, and the second objective is the area A of the proposed ADC. We set the FoM as the main objective, since energy efficiency is usually the most important consideration for most applications. In this way, as shown in Fig. 7, we obtain an optimal design for a maximum 14-bit pipelined NN-ADC with 12.5 bits of ENOB working at 1 GS/s. This showcases the advantage of the proposed co-design framework, which incorporates many circuit-level non-idealities in the training process and allows us to realize a robust design cascading up to eleven stages, a depth often unattainable with traditional pipelined ADCs.
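With only three stage configurations, the optimization of Eq. (10) can be approached by exhaustive search. The sketch below assumes the Walden-style FoM P/(2^ENOB · f_s), which reproduces the 97.7 fJ/conv-step listed in Table I for the 8-bit design (25 mW, ENOB 8, 1 GS/s); the per-stage power/area costs are made-up placeholders for the Fig. 6 characterization data:

```python
from itertools import product

def fom_fj(power_mw, enob_bits, fs_gs):
    """Walden-style figure-of-merit in fJ per conversion-step:
    FoM = P / (2^ENOB * f_s)."""
    return power_mw * 1e-3 / (2 ** enob_bits * fs_gs * 1e9) * 1e15

# Hypothetical (power mW, area mm^2) cost of one 1-/2-/3-bit stage at a fixed
# sampling rate; the real numbers come from the Fig. 6 characterization.
stage_cost = {1: (2.0, 0.006), 2: (3.5, 0.010), 3: (6.0, 0.016)}

def best_pipeline(target_bits):
    """Enumerate stage mixes summing to the target resolution, minimizing
    total power first and area second (the objective ordering of Eq. (10))."""
    best = None
    for n1, n2, n3 in product(range(target_bits + 1), repeat=3):
        if n1 + 2 * n2 + 3 * n3 != target_bits:
            continue
        power = n1 * stage_cost[1][0] + n2 * stage_cost[2][0] + n3 * stage_cost[3][0]
        area = n1 * stage_cost[1][1] + n2 * stage_cost[2][1] + n3 * stage_cost[3][1]
        if best is None or (power, area) < (best[0], best[1]):
            best = (power, area, (n1, n2, n3))
    return best
```

The returned mix depends entirely on the cost table; the eleven-stage design reported in Section IV-C comes from the measured trade-off data, not from these placeholder costs.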
IV-C Full Pipelined NN-ADC Evaluation
We choose the three distinct stages from Section IV-B to evaluate the quantization ability of the proposed full pipelined NN-ADC. We find that although the co-design framework can train a low-resolution stage to approximate the ideal quantization function and residue function with high fidelity, the minor discrepancy between the trained stage and the ideal stage propagates and aggregates along the pipeline, and can finally result in a wrong quantization. Our simulations of various combinations of pipeline stages show that a maximum 14-bit pipelined NN-ADC working at 1 GS/s can be achieved by cascading nine 1-bit stages, one 2-bit stage, and one 3-bit sub-ADC with 3-bit RRAM precision. Note that the last stage of the 14-bit pipelined NN-ADC does not need to generate a residue. The reconstructed signal of this 14-bit ADC is shown in Fig. 7(a), where the ENOB is 12.5 bits at the given sampling frequency. We also report the SNDR trend versus input signal frequency in Fig. 7(b). The SNDR begins to degrade beyond a certain input frequency, verifying that the sampling frequency (twice the input signal frequency) of the proposed 14-bit NN-ADC is well above 1 GHz.
Finally, we train a nonlinear ADC based on the same methodology, using a logarithmic encoding of the input signal: we replace V_in in Eq. (3) with its logarithm to train a 1-bit stage. We find that a 10-bit logarithmic ADC with 9.1-bit ENOB working at 1 GS/s can be achieved by cascading ten such 1-bit stages; the reconstructed signal is illustrated in Fig. 8.
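The logarithmic target simply replaces the input with its logarithm in Eq. (3). A sketch of the ideal target function, with illustrative input-range bounds:

```python
import numpy as np

def log_quantize(v_in, n_bits=10, v_min=1e-3, v_max=1.0):
    """Ideal logarithmic quantization target: code steps uniform in log(v_in),
    i.e. Eq. (3) applied to the logarithm of the input.
    v_min/v_max are illustrative bounds, not the paper's values."""
    x = (np.log(v_in) - np.log(v_min)) / (np.log(v_max) - np.log(v_min))
    return np.clip((x * 2**n_bits).astype(int), 0, 2**n_bits - 1)
```

Equal code steps then correspond to equal ratios of the input, which is what makes such encodings attractive for wide-dynamic-range sensor front-ends.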
IV-D Performance Comparisons
IV-D1 Comparison with existing NN-ADCs
We first design an optimal 8-bit NN-ADC by cascading eight 1-bit stages from Section IV-B and compare it with previous NN-ADCs [6, 7]. The comparative data are summarized in the left columns of Table I. Compared with them, the proposed 8-bit NN-ADC achieves the same resolution and higher energy efficiency with ultra-low-precision 3-bit RRAM devices. Both NN-ADC1 and NN-ADC2 adopt a typical NN (Hopfield or MLP) architecture to directly train an 8-bit ADC without architectural optimization; therefore, they need high-precision RRAM to achieve the targeted ADC resolution. NN-ADC1 uses a large three-layer MLP as the circuit model, where parasitic aggregation on the large crossbar array degrades the conversion speed. In addition, NN-ADC1 uses more hidden neurons, which consume more energy. Since each stage in the proposed 8-bit NN-ADC resolves only 1 bit and has a very small size, it achieves faster conversion speed with higher energy efficiency, and high resolution with low-precision RRAM devices. Please note that the sampling rate reported for NN-ADC2 is based on sampling a low-frequency (44 kHz) signal at a high frequency (1.66 GHz). It is therefore outside the scope of a Nyquist ADC and cannot be compared directly with our work on the same basis.
ADC types  NN-ADC  Nonlinear ADC  Uniform ADC
Work  NN-ADC1 [7]*  NN-ADC2 [6]*  This work*  JSSC'09 [11]**  ISSCC'18 [3]**  This work*  JSSC'15 [12]**  This work*
Technology (nm)  130  180  130  180  90  130  65  130
Supply (V)  1.2  1.2  1.5  1.62  1.2  1.5  1.2  1.5
Area (mm^2)  0.2  0.005–0.01  0.02  0.56  1.54  0.03  0.594  0.1
Power (mW)  30  0.1–0.65  25  2.54  0.0063  31.3  49.7  67.5
f_s (S/s)  0.3G  1.66G–0.74G  1G  22M  33K  1G  0.25G  1G
Resolution (bits)  8  4–8  8  8  10  10  12  14
ENOB (bits)  7.96  3.7 (NA)  8  5.68  9.5  9.1  10.6  12.5
FoM (fJ/conv-step)  401  8.2–57.5  97.7  2380  263  57  108.5  11.6
RRAM precision  9  6–12  3  NA  NA  3  NA  3
Reconfigurable?  Yes  Yes  Yes  No  Yes  Yes  No  Yes

* Results based on simulation.
** Results measured on chip.
IV-D2 Comparison with traditional nonlinear ADCs
We then compare the trained 10-bit logarithmic ADC with state-of-the-art traditional nonlinear ADCs [11, 3]. The comparative data are summarized in the middle columns of Table I. As shown, the proposed 10-bit logarithmic ADC has competitive advantages in area, sampling rate, and energy efficiency. JSSC'09 [11] uses a pipelined architecture to implement an 8-bit logarithmic ADC; due to device mismatch, its ENOB degrades somewhat from the targeted resolution. ISSCC'18 [3] requires a 10-bit capacitive DAC to achieve a configurable 10-bit nonlinear quantization resolution; it therefore achieves high ENOB, but only works at 33 KS/s with significant area overhead. Since we adopt the proposed training framework to directly train on a log-encoded signal using small NN models while incorporating device non-idealities, we achieve a logarithmic ADC with small area, high sampling rate, and high ENOB.
IV-D3 Comparison with traditional uniform ADCs
Finally, we compare the trained 14-bit uniform ADC with a state-of-the-art traditional uniform ADC. The comparative data are summarized in the right columns of Table I. The proposed 14-bit NN-ADC shows competitive advantages in sampling rate, ENOB, and energy efficiency. JSSC'15 [12] uses power-hungry op-amps and dedicated calibration techniques, resulting in power overhead and degraded conversion speed. The proposed 14-bit NN-ADC uses low-resolution stages with very small NN sizes, enabling faster conversion speed with higher energy efficiency. The slight ENOB degradation of the proposed ADC is caused by the propagation of the discrepancy between the trained and ideal stages along the pipeline. Also note that the performance of the proposed NN-ADCs and of previous NN-ADCs is based on simulations, while the performance of the traditional nonlinear and uniform ADCs is based on measurements.
V Conclusion
In this paper, we present a co-design methodology that combines a pipelined hardware architecture with a custom NN training framework to achieve a high-resolution NN-inspired ADC with low-precision RRAM devices. A systematic design exploration is performed over the design space of the sub-ADCs and residue blocks to achieve a balanced trade-off between the speed, area, and power consumption of each distinct low-resolution stage. Using SPICE simulation, we evaluate our design on various ADC metrics and comprehensively compare our work with different types of state-of-the-art ADCs. The comparison results demonstrate the compelling advantages of the proposed NN-inspired ADC with pipelined architecture: high energy efficiency, high ENOB, and fast conversion speed. This work opens a new avenue toward future intelligent analog-to-information interfaces for near-sensor analytics using an NN-inspired design methodology.
Acknowledgement
This work was partially supported by the National Science Foundation (CNS1657562).
References
[1] R. LiKamWa et al., "RedEye: Analog ConvNet Image Sensor Architecture for Continuous Mobile Vision," IEEE ISCA, 2016, pp. 255–266.
[2] B. Li et al., "RRAM-Based Analog Approximate Computing," IEEE TCAD, vol. 34, no. 12, pp. 1905–1917, 2015.
[3] J. Pena-Ramos et al., "A Fully Configurable Non-Linear Mixed-Signal Interface for Multi-Sensor Analytics," IEEE JSSC, vol. 53, no. 11, pp. 3140–3149, Nov. 2018.
[4] M. Buckler et al., "Reconfiguring the Imaging Pipeline for Computer Vision," IEEE ICCV, 2017, pp. 975–984.
[5] L. Gao et al., "Digital-to-analog and analog-to-digital conversion with metal oxide memristors for ultra-low power computing," IEEE/ACM NanoArch, 2013, pp. 19–22.
[6] L. Danial et al., "Breaking Through the Speed-Power-Accuracy Tradeoff in ADCs Using a Memristive Neuromorphic Architecture," IEEE TETCI, vol. 2, no. 5, pp. 396–409, Oct. 2018.
[7] W. Cao et al., "NeuADC: Neural Network-Inspired RRAM-Based Synthesizable Analog-to-Digital Conversion with Reconfigurable Quantization Support," DATE, 2019, pp. 1456–1461.
[8] T. F. Wu et al., "14.3 A 43pJ/Cycle Non-Volatile Microcontroller with 4.7µs Shutdown/Wake-Up Integrating 2.3-bit/Cell Resistive RAM and Resilience Techniques," IEEE ISSCC, 2019, pp. 226–228.
[9] Y. Cai et al., "Training low bitwidth convolutional neural network on RRAM," ASP-DAC, 2018, pp. 117–122.
[10] H.-S. P. Wong et al., "Metal-Oxide RRAM," Proceedings of the IEEE, vol. 100, no. 6, pp. 1951–1970, June 2012.
[11] J. Lee et al., "A 2.5 mW 80 dB DR 36 dB SNDR 22 MS/s Logarithmic Pipeline ADC," IEEE JSSC, vol. 44, no. 10, pp. 2755–2765, Oct. 2009.
[12] H. H. Boo et al., "A 12b 250 MS/s Pipelined ADC With Virtual Ground Reference Buffers," IEEE JSSC, vol. 50, no. 12, pp. 2912–2921, 2015.
[13] K. Hornik, "Approximation capabilities of multilayer feedforward networks," Neural Networks, vol. 4, no. 2, pp. 251–257, 1991.
[14] W. Cao et al., "NeuADC: Neural Network-Inspired Synthesizable Analog-to-Digital Conversion," IEEE TCAD, 2019, Early Access.
[15] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv preprint arXiv:1412.6980, 2014.
[16] P. Chen and S. Yu, "Compact Modeling of RRAM Devices and Its Applications in 1T1R and 1S1R Array Design," IEEE TED, vol. 62, no. 12, pp. 4022–4028, Dec. 2015.
[17] B. Li et al., "MErging the Interface: Power, area and accuracy co-optimization for RRAM crossbar-based mixed-signal computing system," ACM/EDAC/IEEE DAC, 2015, pp. 1–6.
[18] W. Cao et al., "A 40 Gb/s 39 mW 3-tap adaptive closed-loop decision feedback equalizer in 65 nm CMOS," IEEE MWSCAS, 2015, pp. 1–4.