# CMOS Current-Mode Neural Associative Memory Design with On-Chip Learning

Chung-Yu Wu, Member, IEEE, and Jeng-Feng Lan, Student Member, IEEE

Abstract—Based on the Grossberg mathematical model called the outstar, a modular neural net with on-chip learning and memory is designed and analyzed. The outstar is the minimal anatomy that can interpret classical conditioning or associative memory. It can also serve as a general-purpose pattern learning device. To realize the outstar, CMOS (complementary metal-oxide semiconductor) current-mode analog dividers are developed to implement the special memory called the ratio-type memory. Furthermore, a CMOS current-mode analog multiplier is used to implement the correlation. The implemented CMOS outstar can store the relative ratio values of the trained weights on chip for a long time. It can also be modularized to construct general neural nets. HSPICE (a circuit simulator of Meta Software, Inc.) simulation results of the CMOS outstar circuits as associative memory and pattern learner have successfully verified their functions. The measured results of the fabricated CMOS outstar circuits have also confirmed the ratio memory and on-chip learning capability of the circuits. Furthermore, it has been shown that the storage time of the ratio memory can be as long as five minutes without refreshment. Also, the outstar can enhance the contrast of the stored pattern over a long period. This makes the outstar circuits quite feasible in many applications.

#### I. INTRODUCTION

The fundamental characteristics of artificial neural nets (ANN's) are parallel processing and learning capabilities [1], [2]. To realize these characteristics in real time, a hardware implementation is needed. Off-chip learning and memorizing of the weights were first proposed to realize the characteristics of ANN's in hardware. By using this method, the programmable neural nets were developed [3]–[5].
Because the weights are programmable, this type of neural net can have different weights for different applications. Nevertheless, this method needs an extra computer control. Due to the need for more powerful neural hardware, neural nets with on-chip learning became a new design trend [6]–[10]. Using on-chip learning, the neural net has much faster learning speed and much smaller area as compared to the conventional computer training. Due to the inherent leakage in the analog weight storage, however, it is very difficult to directly and permanently store the trained weights on chip no matter what method is used to train them. Among the proposed on-chip learning structures so far [6]–[10], one of the storage methods is to store the weights in the extra memory, e.g., DRAM (dynamic random access memory) [10]. When the learning has been completed in an analog neural net, the trained weights pass through A/D (analog-to-digital) converters and the resultant digital codes are stored in the memory. When the trained neural net starts to work, the stored weights are loaded back to the net from the memory through D/A (digital-to-analog) converters. This requires a complex control scheme to operate the neural net. Alternately, the refreshable neural net is an efficient way to store the trained weights [8]. It can store the trained weights in an extra on-chip memory through frequent refreshing.

Manuscript received January 10, 1994; revised January 5, 1995 and August 2, 1995. This work was supported by the National Science Council (NSC), Taiwan, ROC, under Grant NSC83-0416-E009-016. The authors are with Integrated Circuits and Systems Laboratory, Department of Electronics Engineering and Institute of Electronics, National Chiao Tung University, Hsinchu, Taiwan 300, ROC. Publisher Item Identifier S 1045-9227(96)00178-6.

In the hardware implementation of neural nets, both analog and digital techniques may be used.
Because the analog design technique can realize more neurons with higher synapse density and parallel processing rate, however, it is more suitable for neural net implementation than the digital one in certain applications [6], [11], [12].

In this paper, a new modular structure is proposed to implement an analog neural net with on-chip learning. The basic concept is extracted from Grossberg's prediction and learning theory, where a minimal structure that can serve as a classical conditioning learner and a general-purpose pattern learner is proposed and called the outstar [13]–[18]. In the outstar structure, a ratio-type memory that can store the trained analog weights on chip is used. Thus it can perform learning and storage without extra memory devices. With the ratio-type memory, the outstar as a classical conditioning learner can learn the related things and be refreshed by reminding. Therefore, it can be used as an associative memory. As a general-purpose pattern learner, the outstar can memorize the relative strengths of the input pattern but not the absolute values. Moreover, the outstar has a modular structure which can be used to construct general learnable neural nets.

In the following sections, the outstar model, architecture, CMOS (complementary metal-oxide semiconductor) implementation, and operating principles will be described. Then both simulation and experimental results are presented to verify the functions of the outstar circuits as associative memory and general-purpose pattern learning device. System applications and discussions are also given.

#### II. MODEL, ARCHITECTURE, AND CMOS IMPLEMENTATION

According to the Grossberg model [13]–[18], the minimal anatomy that can interpret the classical conditioning is called the outstar due to its structure shown in Fig. 1 [17].

CS: conditioning stimulus; UCS: unconditioning stimulus; UCR: unconditioning response.

Fig. 1.
The outstar is the minimal anatomy that can be used to interpret the classical conditioning [17].

The proposed outstar model can be described by the following nonlinear difference-differential equations [13]–[15]

$$\frac{dx_1(t)}{dt} = -\alpha_1 x_1(t) + I_1(t) \tag{1}$$

$$\frac{dx_{j}(t)}{dt} = -\alpha_{j}x_{j}(t) + \beta x_{1}(t - \tau_{j})y_{1j}(t) + I_{j}(t) \tag{2}$$

$$y_{1j}(t) = z_{1j}(t) \left[ \sum_{k=2}^{n} z_{1k}(t) \right]^{-1} \tag{3}$$

$$\frac{dz_{1j}(t)}{dt} = -u_j z_{1j}(t) + \beta x_1(t - v_j) x_j(t) \tag{4}$$

where $x_1(t)$ and $x_j(t)$ are the nonnegative neuron states (or short-term memory) variables, $z_{1j}(t)$ are the nonnegative long-term memory state variables, $y_{1j}(t)$ are the nonnegative ratio or normalized memory variables, $\alpha_j$, $\beta$, and $u_j$ are positive constants, $I_1(t)$ and $I_j(t)$ are the nonnegative inputs, and $\tau_j$ and $v_j$ are the propagating delay times through the axon with $j=2,3,\cdots,n$. Also, the initial data of this system must always be nonnegative and continuous. For convenience, $z_{1j}(0)>0$ is assumed.

The outstar model described above has a special memory style in (3) so that the learning results can be stored in the network as long as possible. Although $z_{1j}(t)$ decays as time elapses, the ratio relationship among $z_{1j}(t)$ can keep each relative strength $y_{1j}(t)$ constant in the ideal case where all the decay rates $\alpha_j$, $j=1,2,\cdots,n$, are equal and the same holds for $u_j$. This means that the learning results can be memorized no matter how $z_{1j}(t)$ decays, without practicing overtly.

Also, the outstar can memorize the relative strengths of the stimuli, i.e., it can learn spatial patterns. A pattern can be defined as the input $I_j(t)$ of the form [13]–[15]

$$I_j(t) = \theta_j I(t), \quad j = 2, 3, \cdots, n$$

where $\theta_j$ is an arbitrary, but fixed, nonnegative number and $I(t)$ is a nonnegative continuous function.
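As a rough numerical illustration of (1)–(4), the sketch below integrates the model with forward Euler for a constant pattern input. It is not the authors' simulation: the delays are set to zero ($\tau_j = v_j = 0$), all decay rates are equal, and the parameter values, step size, and the function name `simulate_outstar` are assumptions chosen only for illustration. Under these assumptions, the ratio memory $y_{1j}$ settles at $\theta_j / \sum_k \theta_k$, which equals $\theta_j$ when the weights sum to one.

```python
def simulate_outstar(theta, i1=1.0, i_pattern=1.0, alpha=1.0, u=1.0,
                     beta=1.0, dt=0.01, t_end=50.0):
    """Forward-Euler integration of (1)-(4), assuming tau_j = v_j = 0.

    theta     : pattern weights theta_j for the slave neurons j = 2..n
    i1        : constant input I_1 to the master neuron (assumed value)
    i_pattern : constant common input I(t) (assumed value)
    Returns the final ratio memory values y_1j.
    """
    n_slaves = len(theta)
    x1 = 0.0                      # master neuron state x_1
    x = [0.0] * n_slaves          # slave neuron states x_j
    z = [0.1] * n_slaves          # long-term memory, z_1j(0) > 0 as assumed in the model
    for _ in range(int(round(t_end / dt))):
        z_sum = sum(z)
        y = [zj / z_sum for zj in z]                              # equation (3)
        dx1 = -alpha * x1 + i1                                    # equation (1)
        dx = [-alpha * xj + beta * x1 * yj + th * i_pattern       # equation (2)
              for xj, yj, th in zip(x, y, theta)]
        dz = [-u * zj + beta * x1 * xj for zj, xj in zip(z, x)]   # equation (4)
        x1 += dt * dx1
        x = [xj + dt * d for xj, d in zip(x, dx)]
        z = [zj + dt * d for zj, d in zip(z, dz)]
    z_sum = sum(z)
    return [zj / z_sum for zj in z]
```

Calling `simulate_outstar([0.4, 0.3, 0.2, 0.1])` returns ratios close to the pattern weights themselves, while an unnormalized pattern such as `[2.0, 1.0, 1.0]` is stored as its normalized form, since $y_{1j}$ always sums to one by (3).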
When the stimulus is applied, it can be shown that $y_{1j}(t)$ approaches $\theta_j$ if the learning time is long enough [18]. So, the outstar can learn the relative strength of the inputs. Sometimes, the relative strengths of the inputs are much more important than the absolute ones because the information is always represented by the relative relationship. Because the outstar can learn the relative strength of the inputs, it can be used as a general-purpose pattern learning device which can learn a pattern as completely as possible.

According to (1)–(4), the block diagram of the outstar can be drawn as shown in Fig. 2. The architecture consists of multipliers, dividers, multi-input adders, integrators, and pseudo-linear transfer functions. Since the outstar has many summing actions, it is very convenient to implement the outstar architecture in the current mode, where the current summing is very simple to realize and the power dissipation is low. Therefore, the analog current-mode design technique is applied to the design of the outstar architecture in Fig. 2.

#### A. Analog Current Multiplier Using Subthreshold CMOS Transistors

In the subthreshold region with the drain-source voltage larger than $3V_t$, the MOSFET (MOS field effect transistor) drain current can be approximately expressed as [19], [20]

$$I_D = SI_{Do}e^{-V_{BS}[(1/nV_t) - (1/V_t)]}e^{(V_{GS} - V_T)/nV_t} \tag{5}$$

where $S = W/L$ is the geometrical shape factor of the transistor (effective width over effective length of the channel), $n$ is a constant between 1 and 2, $V_t = KT/q$, where $K$ is Boltzmann's constant, $T$ is the device temperature in kelvin, and $q$ is the charge of an electron, $I_{Do}$ is the effective drain current in the subthreshold operation, $V_T$ is the threshold voltage, $V_{GS}$ is the gate-source voltage, and $V_{BS}$ is the bulk-source voltage.
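To make the exponential dependence in (5) concrete, the following sketch evaluates the subthreshold drain current directly from the formula. The default values for `i_do`, `n`, and `v_th` are hypothetical placeholders, not device data from the paper; only the functional form is taken from (5).

```python
import math

BOLTZMANN = 1.380649e-23           # J/K
ELECTRON_CHARGE = 1.602176634e-19  # C


def thermal_voltage(temp_kelvin=300.0):
    """V_t = K*T/q, about 25.9 mV at 300 K."""
    return BOLTZMANN * temp_kelvin / ELECTRON_CHARGE


def subthreshold_current(v_gs, v_bs=0.0, *, shape=1.0, i_do=1e-15,
                         n=1.5, v_th=0.7, temp_kelvin=300.0):
    """Drain current from equation (5).

    shape is S = W/L; i_do, n and v_th are assumed example values.
    """
    v_t = thermal_voltage(temp_kelvin)
    body_term = math.exp(-v_bs * (1.0 / (n * v_t) - 1.0 / v_t))
    gate_term = math.exp((v_gs - v_th) / (n * v_t))
    return shape * i_do * body_term * gate_term
```

With these assumptions, raising $V_{GS}$ by $nV_t \ln 10$ (roughly 89 mV for $n = 1.5$ at 300 K) multiplies the current by ten, which is the logarithmic voltage-current relation the multiplier and divider rely on.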
Because the sum of the logarithms of two numbers is equal to the logarithm of their product, one can use this property to realize two-number multiplication. Fig. 3(a) shows the conceptual circuit. Suppose that all MOSFET's are operated in the subthreshold region and properly matched, i.e., they have matched $I_{Do}$ and $n$. By using the subthreshold current (5), the output current $I_{om}$ can be expressed as a function of the input currents $I_{m1}$ and $I_{m2}$. The detailed derivations that account for the body effect are given in the Appendix. If the body effect is neglected and $I_{biasm}$ is assumed to be much larger than $I_{om}$, (A-11) and (A-12) can be simplified as

$$I_{om} \cong \frac{S_3 S_4}{S_1 S_2 I_{biasm}} \cdot I_{m1} I_{m2} \equiv K_M \cdot I_{m1} I_{m2}. \tag{6}$$

Fig. 2. The block diagram of the outstar where the architecture consists of multipliers, dividers, multi-input adders, integrators, and pseudo-linear functions.

It can be seen from (6) that the analog current multiplier can be realized by using subthreshold operated transistors. Due to the subthreshold operation, the total power dissipation can be kept low. The whole circuit of the analog current multiplier and its symbol are shown in Fig. 3(b) and (c), respectively. The dimensions of the transistors are also given. In Fig. 3(b), MM10, MM11, and MM12 are bias circuits for the bias current $I_{biasm}$ and $V_{bm}$ is the bias voltage. $V_{bm}$ can be used to adjust the gain of the multiplier. To reduce the errors due to the channel length modulation effect and geometric mismatches, long channel devices have been used in the subthreshold operated MOS current mirrors. In Fig. 3(b), MMR is operated in the linear region with the equivalent output resistance $R_M$, which is one of the parameters in the forgetting term $\alpha_j$ or $u_j$ in (1), (2), and (4).
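The idealized relation (6) can be checked numerically with a few lines of code. The sketch below computes $K_M$ from the shape factors and bias current and compares the direct product with a log-domain evaluation, mirroring the "sum of logarithms" idea. Unit shape factors and a 20 $\mu$A bias current are assumed purely for illustration and are not the dimensions of the fabricated circuit.

```python
import math


def multiplier_gain(s1, s2, s3, s4, i_bias):
    """K_M = S3*S4 / (S1*S2*I_biasm), from equation (6)."""
    return (s3 * s4) / (s1 * s2 * i_bias)


def ideal_multiplier(i_m1, i_m2, k_m):
    """I_om = K_M * I_m1 * I_m2, body effect neglected."""
    return k_m * i_m1 * i_m2


def log_domain_multiplier(i_m1, i_m2, k_m):
    """Same product obtained by adding logarithms, then exponentiating."""
    return math.exp(math.log(k_m) + math.log(i_m1) + math.log(i_m2))


# Hypothetical operating point: unit shape factors, 20 uA bias current.
K_M_EXAMPLE = multiplier_gain(1.0, 1.0, 1.0, 1.0, 20e-6)
```

Both functions agree to floating-point precision for positive currents. The log-domain form is only defined for strictly positive inputs, which matches the nonnegative-signal assumption of the model.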
The equivalent output resistance $R_M$ of MMR is designed to have a value between 26.5 k$\Omega$ and 29.5 k$\Omega$ if the output voltage $V_{om}$ varies from 0 V to -270 mV. $I_{om}$ in Fig. 3(a) is amplified through the current mirrors MM6, MM7, MM8, and MM9 by a factor of $\xi = 7$. The output current $I_m$ is equal to the amplified $I_{om}$ minus the forgetting current in MMR.

#### B. Analog Current Divider Using Subthreshold CMOS Transistors

Similar to the multiplier, one can utilize the principle that the difference of the logarithms of two numbers is equal to the logarithm of their quotient to build a current divider. The conceptual circuit is shown in Fig. 4(a). Assume that all MOSFET's also operate in the subthreshold region. Using (5) and derivations similar to those given in the Appendix, the output current $I_{od}$ can be expressed as a function of the input currents $I_{d1}$ and $I_{d2}$ as in (A-13) and (A-14). If the body effect is neglected, $I_{od}$ can be simplified as

$$I_{od} \cong \frac{S_1 S_4 I_{biasd}}{S_2 S_3} \cdot \frac{I_{d2}}{I_{d1}} \equiv K_D \cdot \frac{I_{d2}}{I_{d1}}. \tag{7}$$

In (7), the bias current $I_{biasd}$ is used to stabilize the voltage $V_D$ at the node D of Fig. 4(a). Without $I_{biasd}$ and MD3, the voltage $V_D$ would drift depending on the current level of $I_{d2}$ and the divider would not work well. As may be seen from (7), the circuit of Fig. 4(a) performs the division operation and acts as an analog current divider. The whole circuit of the analog current divider and the transistor dimensions are shown in Fig. 4(b), where $V_{bd}$ is the bias voltage for the bias current $I_{biasd}$. To obtain a suitable output current level of the divider, $I_{od}$ is further amplified by a factor of $\psi=4$ through the current mirror MD7 and MD8. Fig. 4(c) shows the symbol of the divider, where N is the numerator input node and D is the denominator input node.

#### C. Integrating Circuit and One-Half Absolute-Value Circuit

It is well known that a capacitor can be used as an integrator if the input signal is a current signal to be integrated into an output voltage signal. Thus all the integrators in Fig. 2 can be realized as simple capacitors. The output voltage signal of the simple capacitor integrator has to be changed into a current signal for realization of the pseudo-linear functions in the current mode. Thus, a transconductance amplifier is required at the integrator output. Although many operational transconductance amplifiers (OTA's) have been proposed [21], [22], a simple four-transistor transconductance amplifier [22] is chosen since the chip size and the linearity, rather than the frequency response, are the most important concerns in this application.

Fig. 3. (a) The conceptual circuit of the analog current multiplier. (b) The complete circuit of the analog current multiplier with the forgetting resistor MMR. (c) The symbol representation of (b), where m1 denotes the first input node and m2 is the second input node.

The circuit diagram of the transconductance amplifier is shown in Figs. 5 and 6 [22], where the four transistors are denoted as MG1, MG2, MG3, and MG4. The total transconductance $g_m$, which depends on the MOS device parameters, can be tuned by the gate bias voltages $V_{G1}$ and $V_{G4}$.

Fig. 4. (a) The conceptual circuit of the analog current divider. (b) The complete circuit of the analog current divider. (c) The symbol representation of (b), where N denotes the numerator input node and D denotes the denominator input node.

As mentioned previously, the outstar model is only for nonnegative signals. Because the real signal is represented by the negative voltage in the integrator, the negative integrator output signal has to be converted into the nonnegative signal whereas the unwanted positive one has to be eliminated.
This can be achieved by adding an inverted pseudo-linear function. By combining the four-transistor transconductance amplifier with the current mirror circuits formed by MG5, MG6, and MG7 or MG71 of Figs. 5(a), 6(a), and 6(c) and by MG8 and MG9 in Fig. 6(a) and (c), two types of one-half absolute-value circuits (OHAVC's) can be used to realize the inverted pseudo-linear functions. The type 1 in Fig. 5(a) and (b) is used to implement (4), the learning equations. The output current $I_{ot1}$ is duplicated through the current mirrors for various current dividers. The type 2 in Fig. 6(a) and (b) is used in the master neuron $N_1$, whereas the type 2 in Fig. 6(c) and (d) is used in the slave neurons $N_j$, $j=2, 3, \dots, n$. In these two type 2 OHAVC's, two kinds of output currents are required. One is the neuron external output current $I_{ot2}$, which is amplified by a factor of 250 from $I_{ot}$. The other is the neuron internal output currents to be sent to various multipliers. They are duplicated from $I_{ot}$ through the current duplicators. Basically, these two types of OHAVC's use the same core structure but different output circuits.

Fig. 5. (a) Type 1 OHAVC for learning equations. (b) The symbol representation of (a).

The output currents $I_{ot1}$ in Fig. 5(a) and $I_{ot2}$ in Fig. 6(a) and (c) can be derived as

$$I_{ot1} = G_1 \cdot V_{it1} \tag{8}$$

$$I_{ot2} = G_2 \cdot V_{it2} \tag{9a}$$

$$I_{ot} = \gamma \cdot I_{ot2} \tag{9b}$$

where $G_1 = -g_m \cdot (W_7/W_6)$, $G_2 = -g_m \cdot [(W_{71}W_9)/(W_6W_8)]$, $\gamma = [(W_6W_8)/(W_{71}W_9)]$, and $V_{it1}$ and $V_{it2}$ are nonpositive. The HSPICE simulated characteristic of the type 1 OHAVC is given in Fig. 7(a). It can be seen from Fig. 7(a) that the OHAVC realizes the inverted pseudo-linear function. The operation range of $V_{it1}$ is from 0 V to -270 mV, whereas that of the output current is from 0 to 1.405 $\mu$A. The linearity analysis of the OHAVC is shown in Fig. 7(b).
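The inverted pseudo-linear function of (8) can be pictured as an idealized piecewise-linear map: nonpositive input voltages are scaled by a negative transconductance into nonnegative currents, and positive inputs are clipped to zero. The sketch below is such an idealization, not a circuit model. The slope `G1_FROM_ENDPOINTS` is only an estimate obtained by assuming a perfectly straight line between the quoted endpoints (0 V and -270 mV mapping to 0 and 1.405 $\mu$A).

```python
def ohavc(v_in, gain):
    """Idealized one-half absolute-value characteristic.

    For v_in <= 0 the output is gain * v_in (gain is negative, so the
    current is nonnegative); positive inputs are eliminated.
    """
    if gain >= 0.0:
        raise ValueError("gain must be negative, as G_1 and G_2 are in (8) and (9a)")
    return gain * v_in if v_in < 0.0 else 0.0


# Assumed slope from the quoted simulation endpoints, ignoring offset and
# nonlinearity: about -5.2 microsiemens.
G1_FROM_ENDPOINTS = -1.405e-6 / 0.270
```

Evaluating `ohavc(-0.270, G1_FROM_ENDPOINTS)` reproduces the 1.405 $\mu$A end of the range by construction, and any positive input returns zero.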
The offset output current is 1.97 nA when the input is 0 V. The error is below 5%.

#### D. Summing Circuit

The summing circuit has a very simple structure because the system uses current-mode signals. The basic Kirchhoff summing circuit is shown in Fig. 8. The negative output currents $I_{sk}$, for $k=2,3,\cdots,n$, at the node S are connected to $I_{ot1}$ of the type 1 OHAVC. The summed current $I_{sum}$, which realizes the denominator term $\sum_{k=2}^{n} z_{1k}$ of (3), is the drain current of the PMOS (p-channel MOS) transistor MPS in Fig. 8 with its source grounded. The gate of MPS is shorted to its drain to act as the master of the current mirrors which are used to distribute the summed current to the dividers.

Fig. 6. Type 2 OHAVC's. (a) Type 2 OHAVC for the master neuron. (b) The symbol representation of (a). (c) Type 2 OHAVC for the slave neuron. (d) The symbol representation of (c).

Fig. 7. (a) The HSPICE simulation result of the type 1 OHAVC where the inverted pseudo-linear characteristic is shown between the input voltage and the output current with $V_{DD}=2.5$ V, $V_{SS}=-2.5$ V, $V_{G1}=2.5$ V, and $V_{G4}=-2.2$ V. (b) The linearity analysis of the type 1 OHAVC in (a).

#### E. Input Circuit

As described in (1) and (2), the input current must be directly injected into the neuron. If the input current signals are directly connected to the neuron input nodes, however, some disturbances coming from outside to the input nodes may influence the characteristic of the whole system. To prevent this effect, a buffer is used at the input to reject the outside disturbance. The input circuit and its symbol are shown in Fig. 9. The MOS device MIR is operated in its linear region and its equivalent resistance is denoted as $R_I$. $R_I$ together with $R_M$, $G_1$, and $G_2$ realize the decay terms $\alpha_j$ in (1) and (2).
The value of $R_I$ is designed in the range of 26.5–29.5 k$\Omega$ if the output node voltage $V_{oin}$ varies from 0 V to -270 mV. $I_{iin}$ is compressed through the cascode current mirror MI1, MI2, MI3, and MI4 by a factor of $\zeta = 10$. The output current $I_{oin}$ is equal to the compressed $I_{iin}$ minus the forgetting current in MIR.

Fig. 8. (a) The Kirchhoff current summing circuit. (b) The symbol representation of (a).

Fig. 9. (a) The input circuit and (b) its symbol.

#### F. Complete Circuit

Using the basic building circuits described above, the whole outstar circuit can be formed from Fig. 2. The block diagram of the outstar circuit is shown in Fig. 10. As compared to Fig. 2, it can be seen that the neuron inputs $I_j(t)$ are realized by the input currents $I_j$. The neuron states $x_j(t)$ are realized by $I'_{oj}$ for external use and $I_{oj}$ for internal use. From (9a) and (9b), we have $I_{oj} = \gamma \cdot I'_{oj}$. The long-term memory states $z_{1j}(t)$ are realized by the currents $I_{1j}$.

Fig. 10. The complete circuit of the outstar.

Using the circuit parameters, (1)–(4) can be rewritten as

$$\frac{C_1}{G_2} \cdot \frac{dI_{o1}(t)}{dt} = -\frac{I_{o1}(t)}{R_I G_2} + \frac{1}{\zeta} \cdot I_1(t) \tag{10}$$

$$\frac{C_j}{G_2} \cdot \frac{dI_{oj}(t)}{dt} = -\frac{1}{RG_2} I_{oj}(t) + K_{M1} \cdot I_{o1}(t - \tau_j) y_{1j}(t) + \frac{1}{\zeta} \cdot I_j(t) \tag{11}$$

$$y_{1j}(t) = K'_D \cdot \frac{I_{1j}(t)}{\sum_{k=2}^{n} I_{1k}(t)} \tag{12}$$

$$\frac{C_{1j}}{G_1} \cdot \frac{dI_{1j}(t)}{dt} = -\frac{1}{R_M G_1} I_{1j}(t) + K_{M2} \cdot I_{o1}(t - v_j) I_{oj}(t) \tag{13}$$

where $R=R_MR_I/(R_M+R_I)$, $K_{M1}=\xi\gamma\cdot K_M$, $K_{M2}=\xi\gamma^2\cdot K_M$, $K_D'=\psi\cdot K_D$, and $\tau_j$ and $v_j$ are the propagation delay times through the interconnection line. Note that $\tau_j$ and $v_j$ affect the correlation and association rates, but they do not cause faulty function of the outstar.
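The parameter relations listed under (10)–(13) can be collected in a small helper, which also shows why the ratio memory (12) is insensitive to a common scaling of the stored currents. This is a bookkeeping sketch only. The factors $\xi = 7$, $\psi = 4$, and $\zeta = 10$ are the mirror ratios quoted in the text; $\gamma = 1/250$ is my reading of the statement that $I_{ot2}$ is amplified by 250 from $I_{ot}$ together with (9b); the 28 k$\Omega$ value used in the usage note is an assumed mid-range resistance.

```python
XI = 7.0             # multiplier output mirror gain
PSI = 4.0            # divider output mirror gain
ZETA = 10.0          # input compression factor
GAMMA = 1.0 / 250.0  # assumed from I_ot = gamma * I_ot2 with a 250x external gain


def parallel(r_a, r_b):
    """R = R_M * R_I / (R_M + R_I), the decay resistance in (11)."""
    return r_a * r_b / (r_a + r_b)


def effective_gains(k_m, k_d):
    """Gains K_M1, K_M2 and K'_D appearing in (11)-(13)."""
    return {
        "K_M1": XI * GAMMA * k_m,
        "K_M2": XI * GAMMA ** 2 * k_m,
        "K_D_prime": PSI * k_d,
    }


def ratio_memory(i_1, k_d_prime=1.0):
    """Equation (12): y_1j = K'_D * I_1j / sum_k I_1k."""
    total = sum(i_1)
    return [k_d_prime * i / total for i in i_1]
```

For example, `parallel(28e3, 28e3)` gives 14 k$\Omega$, and `ratio_memory([4e-7, 3e-7, 2e-7, 1e-7])` returns the same values after every current is halved, which is the ideal equal-decay case in which the stored ratios do not change.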
From the above equations, it can be clearly seen that the outstar can be realized by the CMOS circuits of Fig. 10. In Fig. 10, the PMOS switches $M_{sj}$ with the control clock $V\phi$ are added before $C_{1j}$. With these switches, the outstar can be operated as an associative memory or a general-purpose pattern learner. This will be explained in Section III.

#### III. OPERATIONAL PRINCIPLES OF THE OUTSTAR CIRCUIT

#### A. The Associative Memory

To serve as an associative memory, the switches $M_{sj}$ in Fig. 10 are always on with $V\phi = -2.5$ V. If the conditioning input stimulus $I_1$ and the unconditioning input stimulus $I_2$ are applied to the two neurons $N_1$ and $N_2$ simultaneously and no signals are applied to the other neurons $N_r$ ($I_r = 0$ for $r \neq 1$ and 2), the output signals $I_{o1}$ and $I_{o2}$ increase gradually according to (10) and (11), respectively. If the duration of the input signals is long enough, the type 2 OHAVC's $(T21)_{N1}$ and $(T22)_{N2}$ become saturated and go into the triggered state. On the other hand, the output $I_{or}$ of $N_r$ still remains in the zero state. From (13), the cross correlation of $I_{o1}$ and $I_{o2}$ makes $I_{12}$ high, whereas the zero $I_{or}$ keeps $I_{1r}$ low. Assume that all the decay rates of $I_{1k}$ $(k=2, 3, \dots, n)$ are equal, that is, all $R_M$ and $R_I$ are the same. Then, the memory (12) keeps $y_{12}$ and $y_{1r}$ nearly constant for an infinitely long time. This is why the circuit has memory. Nevertheless, it is impossible to memorize everything forever due to different circuit delay times, decay rates, and noise.

Thereafter, the inputs are released with $I_1=I_2=0$. The neuron outputs begin to decay due to the term $-[1/(RG_2)]I_{oj}(t)$ and finally come back to the zero state or inactive state. But $y_{12}(t)$ is still memorized as described above. After some time has elapsed, a trigger signal is sent to $(T21)_{N1}$ and $I_{o1}$ is triggered.
Due to the correlation term $K_{M1} \cdot I_{o1}(t-\tau_2)y_{12}(t)$, $I_{o2}$ goes high. But $I_{or}$ still stays low owing to the low value of $y_{1r}(t)$. This, however, is the ideal case. Actually, $I_{or}$ goes slightly high and then comes back to zero due to the slight memory loss of $y_{1r}(t)$. From the above descriptions, it is seen that if $N_1$ is triggered by a stimulus, $N_2$ can be triggered associatively and the other neurons $N_r$ remain inactive. Evidently, this system has learned the relationship between $N_1$ and $N_2$. Also, the memory loss can be restored by the above action. This is because, when $I_{o1}$ goes high, $I_{o2}$ associatively goes high immediately and the correlation term of the learning equation (13) enhances $I_{12}$ and decreases $I_{1r}$. Then $y_{12}$ can be restored to high and $y_{1r}$ to low. Now, the memory has been refreshed. This is similar to human memory. When we memorize something, we forget part of it after a certain time if we do not recall it. Our recollection stays clearer, however, if we recall it as often as we can.

Fig. 11. The HSPICE simulation results of the three-neuron outstar as the associative memory. (a) The neuron input stimuli. (b) The neuron output responses. (c) The ratio memory states.

The HSPICE simulation results of the outstar as an associative memory are shown in Fig. 11, where three neurons are simulated. Fig. 11(a) shows the input stimuli and Fig. 11(b) shows the neuron output responses, where one can find that $I_{o2}'$ has been associated correctly and $I_{o3}'$ goes slightly high and then comes back to zero due to the memory loss. From Fig. 11(c), it can be seen that the memory has been restored by reminding. Fig. 12 shows the HSPICE simulation results of the outstar, which can correct erroneous learning by retraining.

Fig. 12. The HSPICE simulation results of the three-neuron outstar for the relearning association. (a) The neuron input stimuli. (b) The neuron output responses.
(c) The ratio memory states.

Fig. 12(a) shows the input stimuli, whereas Fig. 12(b) shows that the associated output has been corrected. Fig. 12(c) shows that the memory states have been changed through relearning.

#### B. The General-Purpose Pattern Learning Device

To serve as a general-purpose pattern learning device, the outstar must memorize the relative strengths among the inputs. Because there may exist many mismatched parameters in the learning feedback loop, the environment of the memory nodes should be simplified after the learning has been finished. Otherwise, the memory nodes will be disturbed by the effects of mismatched parameters. In Fig. 10, when $V\phi$ is low (-2.5 V), the switch is on and the circuit is operated in the learning phase. When $V\phi$ is high (2.5 V), however, the switch is turned off to block the feedback signals and the circuit is operated in the memory phase. In this phase, the memory nodes $Z_j$ in Fig. 10 only see the mismatches among the switches $M_{sj}$ and the system can hold the relative strengths among the inputs much more accurately and steadily. Due to the inevitable leakage current at the memory node even when the switches $M_{sj}$ are off, the memorized weights still decay gradually. Due to the ratio memory, however, the outstar of Fig. 10 can effectively increase the memory storage time. The tolerance of the weights against various physical parameter variations can also be improved.

#### C. Device Nonideal Effects on the Outstar Operation

Since the MOS devices in the analog current multiplier and divider are operated in the subthreshold region, the device nonideal effects, including the body effect, the channel length modulation effect, and the threshold voltage $V_T$ mismatch, may influence the outstar circuit performance. They are discussed in the following.
1) The Body Effect: From (A-11)–(A-14), it can be seen that both current multiplier and divider have nonideal factors $K_M(V_{BS1},V_{BS3})$, $(I_{m2})^{n_2/n_1}$, and $K_D(V_{BS1},V_{BS2})$, $(I_{d2}/I_{d1})^{n_1/n_4}$, respectively, due to the body effect of the MOS devices. Since the $n$ ratio is very close to unity, the terms $(I_{m2})^{n_2/n_1}$ and $(I_{d2}/I_{d1})^{n_1/n_4}$ only induce a small nonlinearity. As seen from Fig. 3(a), the difference of the bulk-source voltages $V_{BS}$ in MM3 and MM1 is equal to that of the gate-source voltages of MM4 and MM2, which depend on $I_{om}$ and $I_{m2}$, respectively. Since $I_{om}$ is proportional to $I_{m2}$, $V_{BS3} - V_{BS1}$ has a small variation, which reduces the variation of $K_M(V_{BS1}, V_{BS3})$. A similar conclusion can be obtained for $K_D(V_{BS1}, V_{BS2})$. Thus the body effect can be suppressed. Moreover, because the multiplier output current is proportional to the inputs $I_{m1}$ and $I_{m2}$ and the outstar uses the relative quantities to process the signals rather than the absolute ones, the nonideal factors $K_M$ and $K_D$ can be further suppressed. Thus, the body effect does not affect the outstar operation significantly.

2) The Channel Length Modulation Effect: Due to the channel length modulation effect, the simple current mirror has some errors. Thus larger input currents to the current mirror lead to larger current-mirror output currents. This increases the signal ratio, as the ratio memory in the outstar does. Thus the channel length modulation effect in the simple current mirror does not disturb the outstar operation. Therefore, the simple current mirror is used. But long channel devices are used to reduce the device mismatches.

3) The Threshold Voltage $V_T$ Mismatch: As shown in (A-12) and (A-14), the $V_T$ mismatch in both multiplier and divider has a significant effect on their gain values due to the exponential functions.
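The exponential sensitivity to $V_T$ can be estimated from (5) alone: with all other terms fixed, a threshold shift $\Delta V_T$ scales the subthreshold current by $e^{-\Delta V_T/(nV_t)}$. The sketch below evaluates that single-transistor factor. It is a back-of-the-envelope estimate, not the gain expressions (A-12) and (A-14), and the slope factor `n=1.5` and the 300 K temperature are assumed values.

```python
import math

BOLTZMANN = 1.380649e-23           # J/K
ELECTRON_CHARGE = 1.602176634e-19  # C


def current_scale_from_vt_shift(delta_vt, n=1.5, temp_kelvin=300.0):
    """Factor by which the current in (5) changes when V_T shifts by delta_vt.

    Follows from the exp((V_GS - V_T) / (n * V_t)) term with V_GS held fixed.
    n and temp_kelvin are assumed example values.
    """
    v_t = BOLTZMANN * temp_kelvin / ELECTRON_CHARGE
    return math.exp(-delta_vt / (n * v_t))
```

Under these assumptions, a +2 mV shift lowers the current by about 5% and a -3 mV shift raises it by about 8%, so millivolt-level mismatches already translate into percent-level gain errors per device.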
The Monte-Carlo analysis in the HSPICE simulation is used to observe the $V_T$ mismatch effect on the circuit performance. According to the measured data from the wafer testing, the standard deviation of the NMOS (n-channel MOS) threshold voltage over a chip is about 2 mV and that of the PMOS threshold voltage is about 3 mV. Also, the threshold voltage variation is assumed to follow a Gaussian random distribution over all the transistors in the chip. In this assumed worst case, all the MOS devices have different threshold voltages even though they may be close in layout. The simulation of the chip performance under $V_T$ mismatches is considered passed if the order of the output levels is the same as that of the input levels. The success rate is 60% over 30 trials. In the actual layout design, the subthreshold operated devices are carefully arranged by using the interdigitized layout technique [23]. Thus, the $V_T$ directional variations can be partially compensated. With small $V_T$ deviation and interdigitized layout, the measured yield rate of the fabricated outstar circuit is about 90% over a wafer. This is quite satisfactory in production.

#### IV. EXPERIMENTAL RESULTS

#### A. Analog Current Multiplier and Divider

The designed analog current multiplier and divider have been fabricated by using a 0.8 $\mu$m double-poly double-metal n-well CMOS process. The measured characteristics of the analog current multiplier are shown in Fig. 13(a), where the horizontal axis represents the input current $I_{m1}$ and the vertical axis represents the output current $I_{om}$ with the parameter $I_{m2}$ from 0 to 10 $\mu$A at a step of 2 $\mu$A.

Fig. 13. (a) The measurement result of the analog current multiplier. (b) The linearity analysis of the analog current multiplier where $I_{om5}$ is the multiplier output current with parameter $I_{m2}=10~\mu\mathrm{A}$.

Fig. 14. (a) The measurement result of the analog current divider. (b) The relative ratio analysis of the analog current divider.
For observation convenience, the input operation range is from 0 to 500 nA whereas the output range is from 0 to 250 nA. The linearity analysis shown in Fig. 13(b) is performed by calculating the slope error of the output current $I_{om5}$, where $I_{om5}$ is the multiplier output current measured in Fig. 13(a) with the largest parameter $I_{m2} = 10\,\mu$A. In Fig. 13(b), the maximum error is about 4.2%.

The measurement results of the analog current divider are shown in Fig. 14(a), where the horizontal axis represents the input denominator current $I_{d1}$ and the vertical axis represents the output current $I_{od}$ with the parameter $I_{d2}$ from 5 to 25 $\mu$A at a step of 5 $\mu$A. The input current level is enlarged by 20 times and the output current level by 100 times for observation convenience. The real input operation range is from 0 to 500 nA and the output range from 0 to 10 $\mu$A. Because the divider is used to process the ratio of the signals, its performance can be evaluated by the output current difference at two different $I_{d2}$ values. In Fig. 14(b), it can be found that the error is below $\pm 7\%$ except at the low current level of $I_{d1}$ ($I_{d1} < 1\,\mu$A). The large error at the low current level is due to the leakage current, which gradually dominates the divider current.

#### B. The Associative Memory

A 3.5 $\mu$m CMOS double-poly single-metal p-well technology is used to implement the three-neuron outstar shown in Fig. 10 as a classical conditioning learner to test the associative memory capability [24]. The capacitors $C_{12}$ and $C_{13}$ are 1 pF and $C_k$ is the parasitic capacitor at the corresponding node where $k=1, 2, 3$. Fig. 15 shows the chip photograph and Fig. 16 shows the measurement results, where $I'_{o1}$ and $I'_{o2}$ are obtained by adding a 1 k$\Omega$ resistor to the output node and measuring its voltage. In the measurement, the pulses applied to $I_1$, $I_2$, and $I_3$ have a duration of 500 $\mu$s.
The elapsed time between two pulses is 1.5 ms. Fig. 16(a) shows the associative waveforms, which successfully verify the function of the fabricated associative memory as simulated in Fig. 11. Fig. 16(b) shows the relearning capability as simulated in Fig. 12.

Fig. 15. The chip micrograph of the fabricated three-neuron outstar as the associative memory.

#### C. The General-Purpose Pattern Learning Device

To experimentally verify the function of the general-purpose pattern learning device, the outstar neural network in Fig. 10 has been fabricated by a 0.8-$\mu$m double-poly double-metal n-well CMOS process. The chip photograph is shown in Fig. 17. There are five neurons in the chip, including one master neuron and four slave neurons. The storage capacitor $C_{1j}$ is 1 pF, where $j = 2, 3, 4, 5$. Based on the test chip layout, a single chip with a 100-mm² die area can hold 381 ratio memories. For convenience, a wide-range voltage-to-current converter is added at the input of the neuron so that input voltage testing signals can be used. Fig. 18 shows the input voltage waveforms used to test the ratio memory of the fabricated chip. To observe the memory retention time, both the learning time and the retrieving time in Fig. 18 are lengthened to several ms, although both can be smaller than 20 $\mu$s. In the learning phase, the input voltage of the master neuron $V_{i1}$ is given as 1.2 V, whereas the input voltages of the four slave neurons $V_{i2}$, $V_{i3}$, $V_{i4}$, and $V_{i5}$ are 0.8 V, 0.6 V, 0.4 V, and 0.2 V, respectively. After 0.8 ms has elapsed ($t_1$), the switches $M_{sj}$ are switched off by pulling up $V_\phi$ to 2.5 V to hold the learned voltages. After another 4.4 ms ($t_3$), the input signal of 1.2 V is given to the master neuron to retrieve the stored relative strength in the slave neurons. As can be seen from Fig. 19,
the slave neuron outputs have been retrieved in the sense that the ratio of the output currents is the same as that of the initial learned inputs. The maximum storage time $t_3$ can be as long as five minutes before two of the stored voltage levels decay to zero. Therefore, the fabricated outstar chip has stored the relative input strength. Although there are some offset voltages (100 mV) at the slave neuron outputs, the relative strength is not affected.

In the circuit of Fig. 10 with the switches $M_{sj}$, the voltages stored on the capacitors $C_{1j}$ decay gradually due to the leakage current. The measured decay characteristic of the stored voltages is shown in Fig. 20. The major leakage current is the junction diode reverse saturation current, which is measured as about 0.8 fA. With this leakage current, the relative error of the outstar ratio memory is about 11.5% in the first minute as shown in Fig. 19. The absolute error, however, is 45.1% in the first minute. As can be seen in Fig. 20, the relative memory, denoted as $y_U$, initially increases gradually if the stored voltage is greater than $z_{ave} = (z_{12} + z_{13} + z_{14} + z_{15})/4$, where $z_{12}$, $z_{13}$, $z_{14}$, and $z_{15}$ are the held trained voltages. Otherwise, the relative memory $y$ decreases gradually and is denoted as $y_D$. Therefore, the outstar can enhance the contrast of the relative memory in this initial period. This property can be further used in some applications such as pattern classification.

Fig. 16. (a) The measurement result of the fabricated three-neuron outstar as the associative memory. (b) The measurement result of the associative memory with relearning capability.

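The contrast-enhancement behavior described above can be illustrated with a minimal numerical sketch. It assumes, for illustration only, that the ratio memory is $y_j = z_j / \sum_k z_k$, that every storage capacitor loses voltage at the same constant rate $I_{leak}/C$ set by the quoted 0.8 fA leakage and 1 pF capacitance, and that the held voltages take hypothetical values. The function names and starting voltages are not taken from the measurements.

```python
# Idealized sketch of ratio-memory contrast enhancement under equal leakage.
# Assumptions (not from the measured chip): y_j = z_j / sum(z), identical
# constant leakage on every storage capacitor, hypothetical initial voltages.

LEAK_CURRENT_A = 0.8e-15   # junction leakage quoted in the text (0.8 fA)
STORAGE_CAP_F = 1e-12      # storage capacitor C_1j quoted in the text (1 pF)
DECAY_RATE_V_PER_S = LEAK_CURRENT_A / STORAGE_CAP_F  # dV/dt = I / C


def leak(z0, t_s, rate=DECAY_RATE_V_PER_S):
    """Held voltages after t_s seconds of linear decay, clamped at 0 V."""
    return [max(z - rate * t_s, 0.0) for z in z0]


def ratio_memory(z):
    """Relative strengths y_j = z_j / sum_k z_k (assumed definition)."""
    total = sum(z)
    if total <= 0.0:
        raise ValueError("all stored voltages have decayed to zero")
    return [v / total for v in z]


z0 = [0.8, 0.6, 0.4, 0.2]            # hypothetical held voltages z_12..z_15 (V)
z_ave = sum(z0) / len(z0)            # average held voltage
y_start = ratio_memory(z0)           # ratio memory right after learning
y_1min = ratio_memory(leak(z0, 60.0))  # ratio memory after one minute

for z, ya, yb in zip(z0, y_start, y_1min):
    side = "above" if z > z_ave else "below"
    trend = "rises" if yb > ya else "falls"
    print(f"z0={z:.1f} V ({side} z_ave)  y(0)={ya:.3f}  y(60 s)={yb:.3f}  {trend}")
```

Under these assumptions, subtracting the same voltage from every $z_j$ changes $y_j$ with the sign of $z_j - z_{ave}$, so entries above the average grow in relative terms while those below shrink, which is the contrast enhancement seen in the initial period.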
The useful contrast-enhancement period observed in measurement is about five minutes, defined as the time until one of the upper ratio memories $y_U$ (in this case, $y_{13}$) decreases below the average level of the initially trained ratio memory values (in this case, $[y_{12}(0) + y_{13}(0) + y_{14}(0) + y_{15}(0)]/4$). After five minutes, the memories continue to decay and gradually converge to the same voltage level.

Fig. 17. The photograph of the outstar chip as the general-purpose pattern learning device.

Fig. 18. The input voltage signals of the outstar chip where $t_0=0$ ms, $t_1=0.8$ ms, $t_2=1.6$ ms, $t_3=5.2$ ms, and $t_4=6.0$ ms.

Fig. 19. The measurement result of the outstar as the general-purpose pattern learning device.

Fig. 20. The measured decay characteristics of the stored voltages in the fabricated outstar circuit of Fig. 10.

Fig. 21. The feedforward Hamming net using the outstar structure.

From the above experimental results, it has been shown that even without the refresh operation, the outstar neural network can have a long storage period. Moreover, as the outstar memorizes the input patterns, the leakage current decays the storage voltages and causes the ratio memories to enhance the contrast of the stored pattern. In other words, as time elapses, the ratio memory not only memorizes the training pattern but also processes the stored pattern with contrast enhancement. Because contrast enhancement is one of the image processing techniques, one can use the outstar circuit to realize a pattern recognition network.

#### V. SYSTEM APPLICATIONS

Using the outstar structure, the feedforward Hamming net for pattern recognition and classification can be constructed. As shown in Fig. 21, every outstar represents a class, where Sc is the matching score function circuit.
The first-layer neurons $\mathbf{x}_{11}, \mathbf{x}_{12}, \cdots, \mathbf{x}_{1m}$ are the slave neurons and the second-layer neurons $\mathbf{x}_{21}, \mathbf{x}_{22}, \cdots, \mathbf{x}_{2n}$ are the master neurons. So there are $n$ outstars and they share the common slave neurons in the first layer. In the training phase, the neural net is trained one outstar at a time. After an outstar has been trained, it is isolated from the other outstars and the next outstar is trained. This process continues until all the outstars have been trained. The typical training time of an outstar is about 20 $\mu$s. Since the storage time of the outstar ratio memory can be as long as five minutes, it is long enough for the learning phase and the subsequent classification phase. To avoid the discontinuity of image classification due to the 20-$\mu$s training time, an alternating ping-pong structure may be used to allow one net to be trained while the other is in classification, and vice versa.

In the classification phase, the input pattern is sent to the slave neurons and goes through the norm circuit, which is the divider circuit of the outstar. Then it is compared with the internal exemplar pattern. After the comparison, the matching score is produced and sent to the master neurons. A winner-take-all circuit is needed to choose the master neuron with the minimum score, which represents the best-matched class.

Besides the feedforward neural nets, it is feasible that the outstar structure could be used to construct feedback neural nets with learning, such as the learnable Hopfield net. With the features of the ratio memory, the feedback learnable neural nets may be used in many applications.

#### VI. CONCLUSIONS AND DISCUSSIONS

In this paper, the analog current-mode design technique is used to realize the outstar. Two outstar structures are proposed to serve as a classical conditioning (or associative memory) learner and a general-purpose pattern learning device.
To implement on-chip learning and memory, an analog current multiplier and divider are developed. In addition, the ratio-type memory is realized to store the trained weights. In the classical conditioning learner, the memory could be lost due to the unequal decay, but it can be refreshed by reminding. This memory loss problem can be further improved by adding a switch before the integrator. In the general-purpose learning device, the switches before the integrators are added. When the switches are turned off, the trained weight values can be held in the ratio-type memory with less memory decay. Thus the relative relationship among the weights, and hence the relative information of the inputs, can be memorized for a longer period.

The fabricated outstars as the associative memory and the general-purpose pattern learning device have been measured and their functions have been successfully verified. The Hamming net constructed by the outstar is also described as an application example.

Future research will be conducted to explore the applications of the outstar circuits in image processing and feedback learnable neural nets.

#### APPENDIX

#### THE ANALOG CURRENT MULTIPLIER AND THE ANALOG CURRENT DIVIDER

In Fig. 3(a), all MOSFET's are operated in the subthreshold region and properly matched.
By using (5), the drain currents of the devices MM1, MM2, and MM4 are, respectively,

$$I_{MM1D} = I_{m1} = S_1 I_{Do1}\, e^{-V_{BS1} \{ [1/(n_1 V_t)] - (1/V_t) \}}\, e^{(V_{GS1} - V_{T1})/(n_1 V_t)} \tag{A-1}$$

$$I_{MM2D} = I_{m2} = S_2 I_{Do2}\, e^{(V_{GS2} - V_{T2})/(n_2 V_t)} \tag{A-2}$$

$$I_{MM4D} = S_4 I_{Do4}\, e^{(V_{GS4} - V_{T4})/(n_4 V_t)}. \tag{A-3}$$

Equations (A-1), (A-2), and (A-3) can be rewritten as

$$V_{GS1} = V_{T1} + n_1 V_t \ln \left\{ \frac{I_{m1}}{S_1 I_{Do1}\, e^{-V_{BS1} \{[1/(n_1 V_t)] - (1/V_t)\}}} \right\} \tag{A-4}$$

$$V_{GS2} = V_{T2} + n_2 V_t \ln \left\{ \frac{I_{m2}}{S_2 I_{Do2}} \right\} \tag{A-5}$$

$$V_{GS4} = V_{T4} + n_4 V_t \ln \left\{ \frac{I_{MM4D}}{S_4 I_{Do4}} \right\}. \tag{A-6}$$

From Fig. 3(a), it can be found that

$$V_M = V_{GS1} + V_{GS2} = V_{GS3} + V_{GS4}. \tag{A-7}$$

Substituting (A-4), (A-5), and (A-6) into (A-7), we have

$$\begin{split} V_{GS3} &= V_{GS1} + V_{GS2} - V_{GS4} \\ &= \left[ V_{T1} + V_{T2} - V_{T4} - (n_1 - 1)V_{BS1} \right] \\ &\quad + n_1 V_t \ln \left\{ \frac{I_{m1}}{S_1 I_{Do1}} \cdot \left( \frac{I_{m2}}{S_2 I_{Do2}} \right)^{n_2/n_1} \right\} \\ &\quad - n_4 V_t \ln \left\{ \frac{I_{MM4D}}{S_4 I_{Do4}} \right\}. \end{split} \tag{A-8}$$

Now, the drain current equation of MM3 can be written as

$$I_{MM3D} = S_3 I_{Do3}\, e^{-V_{BS3} \{ [1/(n_3 V_t)] - (1/V_t) \}}\, e^{(V_{GS3} - V_{T3})/(n_3 V_t)}. \tag{A-9}$$

Substituting (A-8) into (A-9), we have

$$\begin{split} I_{om} &= I_{MM3D} \\ &= \frac{(S_3 I_{Do3})(S_4 I_{Do4})^{n_4/n_3}}{(S_1 I_{Do1})^{n_1/n_3}(S_2 I_{Do2})^{n_2/n_3}(I_{MM4D})^{n_4/n_3}} \\ &\quad \cdot \exp\left\{\frac{(n_3 - 1)V_{BS3} - (n_1 - 1)V_{BS1}}{n_3 V_t} + \frac{V_{T1} + V_{T2} - V_{T3} - V_{T4}}{n_3 V_t}\right\} \\ &\quad \cdot (I_{m1})^{n_1/n_3} \cdot (I_{m2})^{n_2/n_3}. \end{split} \tag{A-10}$$

Because MM2 and MM4 have no body effect, $V_{T2}=V_{T4}$, $n_2=n_4$, and $I_{Do2}=I_{Do4}$. Also, assume $n_1\approx n_3$ and $I_{Do1}\approx I_{Do3}$. Then

$$I_{om} \cong K_M(V_{BS1}, V_{BS3}) \cdot I_{m1}(I_{m2})^{n_2/n_1} \tag{A-11}$$

where

$$K_{M}(V_{BS1}, V_{BS3}) = \frac{S_{3}(S_{4})^{n_{2}/n_{1}}}{S_{1}(S_{2})^{n_{2}/n_{1}}(I_{MM4D})^{n_{2}/n_{1}}} \cdot \exp\left\{\frac{(n_{1}-1)(V_{BS3}-V_{BS1})}{n_{1}V_{t}} + \frac{V_{T1}-V_{T3}}{n_{1}V_{t}}\right\}. \tag{A-12}$$

For the current divider shown in Fig. 4(a), similar derivations can be used to obtain the expression of the output current $I_{od}$ as

$$I_{od} = K_D(V_{BS1}, V_{BS2}) \left(\frac{I_{d2}}{I_{d1}}\right)^{n_1/n_4} \tag{A-13}$$

where

$$K_D(V_{BS1}, V_{BS2}) \equiv \frac{S_4 I_{biasd}}{S_3} \left(\frac{S_1}{S_2}\right)^{n_1/n_4} e^{(V_{T2} - V_{T1})/(n_4 V_t)}. \tag{A-14}$$

#### ACKNOWLEDGMENT

The authors would like to thank the reviewers for their valuable comments and suggestions. They also wish to thank the Chip Implementation Center (CIC) of the National Science Council, Taiwan, ROC, for giving them the chance to implement the chips.

#### REFERENCES

- [1] D. E. Rumelhart and J. L. McClelland, Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1: Foundations. Cambridge, MA: MIT Press, 1986.
- [2] R. P. Lippmann, "An introduction to computing with neural nets," *IEEE ASSP Mag.*, pp. 4–22, Apr. 1987.
- [3] B. W. Lee and B. J. Sheu, "General-purpose neural chips with electrically programmable synapses and gain-adjustable neurons," *IEEE J. Solid-State Circuits*, vol. 27, no. 9, pp. 1299–1302, Sep. 1992.
- [4] B. E. Boser, E. Säckinger, J. Bromley, Y. L. Cun, and L. D. Jackel, "An analog neural network processor with programmable topology," *IEEE J. Solid-State Circuits*, vol. 26, no. 12, pp. 2017–2024, Dec.
1991.
- [5] D. B. Schwartz, R. E. Howard, and W. E. Hubbard, "A programmable analog neural network chip," *IEEE J. Solid-State Circuits*, vol. 24, no. 2, pp. 313–319, Apr. 1989.
- [6] H. C. Card and C. R. Schneider, "Analog CMOS neural circuits—in situ learning," *Int. J. Neural Syst.*, vol. 3, no. 2, pp. 103–124, 1992.
- [7] M. Yasunaga, N. Masuda, M. Yagu, M. Asai, K. Shibata, M. Ooyama, M. Yamada, T. Sakaguchi, and M. Hashimoto, "A self-learning digital neural network using wafer-scale LSI," *IEEE J. Solid-State Circuits*, vol. 28, no. 2, pp. 106–114, Feb. 1993.
- [8] Y. Arima, M. Murasaki, T. Yamada, A. Maeda, and H. Shinohara, "A refreshable analog VLSI neural network chip with 400 neurons and 40 K synapses," *IEEE J. Solid-State Circuits*, vol. 27, no. 12, pp. 1854–1861, Dec. 1992.
- [9] A. Johannet, L. Personnaz, G. Dreyfus, J. D. Gasuel, and M. Weinfeld, "Specification and implementation of a digital Hopfield-type associative memory with on-chip training," *IEEE Trans. Neural Networks*, vol. 3, no. 4, pp. 529–539, July 1992.
- [10] T. Shima, T. Kimura, Y. Kamatini, T. Itakura, Y. Fujita, and T. Iida, "Neuro chips with on-chip backpropagation and/or Hebbian learning," *IEEE J. Solid-State Circuits*, vol. 27, no. 12, pp. 1868–1876, Dec. 1992.
- [11] M. Verleysen, B. Sirletti, A. M. Vandemeulebroecke, and P. G. A. Jespers, "Neural networks for high-storage content-addressable memory: VLSI circuit and learning algorithm," *IEEE J. Solid-State Circuits*, vol. 24, no. 3, pp. 562–569, June 1989.
- [12] C. Mead and M. Ismail, Analog VLSI Implementation of Neural Systems. Norwell, MA: Kluwer, 1989.
- [13] S. Grossberg, "Nonlinear difference-differential equations in prediction and learning theory," in *Proc. Nat. Acad. Sci. USA*, vol. 58, 1967, pp. 1329–1334.
- [14] \_\_\_\_\_\_, "Some nonlinear networks capable of learning a spatial pattern of arbitrary complexity," in *Proc. Nat. Acad. Sci. USA*, vol. 59, 1968, pp. 368–372.
- [15] \_\_\_\_\_\_, "Some physiological and biochemical consequences of psychological postulates," in *Proc. Nat. Acad. Sci. USA*, vol. 60, 1968, pp. 758–765.
- Netherlands: Elsevier, 1987, pp. 1–81.
- [17] \_\_\_\_\_\_, Studies of Mind and Brain: Neural Principles of Learning, Perception, Development, Cognition, and Motor Control. Dordrecht, The Netherlands: D. Reidel, 1982.
- [18] \_\_\_\_\_\_, "Some networks that can learn, remember, and reproduce any number of complicated space-time patterns, II," *Studies Appl. Mathematics*, vol. XLIX, no. 2, pp. 135–166, June 1970.
- [19] R. L. Geiger, P. E. Allen, and N. R. Strader, VLSI Design Techniques for Analog and Digital Circuits. New York: McGraw-Hill, 1990.
- [20] E. Vittoz and J. Fellrath, "CMOS analog integrated circuits based on weak inversion operation," *IEEE J. Solid-State Circuits*, vol. SC-12, no. 3, pp. 224–231, June 1977.
- [21] C. Toumazou, F. J. Lidgey, and D. G. Haigh, Analogue IC Design: The Current-Mode Approach. Stevenage, UK: Peregrinus, 1990.
- [22] C. S. Park and R. Schaumann, "A high-frequency CMOS linear transconductance element," *IEEE Trans. Circuits Syst.*, vol. CAS-33, pp. 1132–1138, Nov. 1986.
- [23] J. E. Franca and Y. Tsividis, Design of Analog-Digital VLSI Circuits for Telecommunications and Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1994.
- [24] C.-Y. Wu and J.-F. Lan, "A new neural associative memory with learning," in *Proc. Int. Joint Conf. Neural Networks*, Baltimore, MD, June 1992, vol. I, pp. 487–492.

Chung-Yu Wu (S'76–M'77), for a photograph and biography, see this issue, p. 166.

Jeng-Feng Lan (S'93) was born in Pingtung, Taiwan, Republic of China, in 1965. He received the B.S.
degree from the Department of Electrical Engineering, Tatung Institute of Technology, Taipei, Taiwan, in 1990 and the M.S. degree from the Institute of Electronics, National Chiao-Tung University, Hsinchu, Taiwan, in 1992. He is currently working toward the Ph.D. degree at the same institute. His research interests include neural network hardware design, signal processing, and VLSI design.