# Artificial Intelligence Techniques for SPICE Optimization of MOSFET Modeling

Jatmiko E Suseno, *Student Member, IEEE*, Munawar A Riyadi, *Student Member, IEEE*, Nurul Ezaila Alias, Yau Wei Heong, and Razali Ismail, *Member, IEEE* 

*Abstract*—This paper proposes a new method for optimizing and verifying the electrical characteristic curves of a MOSFET using an artificial neural network. Optimization using Neural Network (ONN) compares the current-voltage (I-V) characteristic curves obtained from TCAD simulation and from T-SPICE modeling, and uses them as the desired data to control the model parameters of BSIM. The neural network used in this paper is a dynamic feedforward neural network. After training, the best result is obtained with a 36-30-10-5 network architecture, which reaches a mean squared error (MSE) of 1e-28 at epoch 5.

## I. INTRODUCTION

Submicron CMOS technology appears to be a feasible and cost-effective integration solution for electronic device systems, owing to the maturity of silicon-based CMOS technology for small feature sizes and low-voltage digital circuits and to recent progress in MOSFET performance. A device model helps us to understand the meaning of the parameters that appear in its mathematical description.

Besides physical parameters, a model often introduces nonphysical parameters that do not necessarily correspond to any single physical quantity. Such nonphysical parameters often combine one or more physical effects. Each MOSFET model is characterized by a set of parameters, and these parameters have to be estimated if they cannot be measured easily or at all. The process of estimating model parameters is called parameter extraction. A brief description of various optimization methods that can be used to control the progress of a parameter extraction algorithm is given in Section II.

Parameters of a MOSFET model may represent component values such as the width and length of the MOSFET, or any other quantities that are fixed by the particular choice of circuit design and manufacturing process but that may, at least in principle, be adapted to optimize circuit or device performance. Constants of nature, such as the speed of light or the Boltzmann constant, are therefore not considered parameters.

Manuscript received April 15, 2009. This work was supported in part by the Malaysian Ministry of Science, Technology and Innovation (MOSTI) under e-science Grant.

Jatmiko E Suseno is a PhD student at the Electrical Engineering Faculty, Universiti Teknologi Malaysia (UTM), and works at the Physics Department, Diponegoro University, Semarang, Indonesia (e-mail: jatmikoendro@ieee.org).

Munawar A Riyadi is a PhD student at the Electrical Engineering Faculty, Universiti Teknologi Malaysia (UTM), and works at the Electrical Engineering Department, Diponegoro University, Semarang, Indonesia (e-mail: munawar.riyadi@ieee.org).

Nurul Ezaila Alias is a Master's student and staff member at the Electrical Engineering Faculty, Universiti Teknologi Malaysia (UTM) (e-mail: ezaila@fke.utm.my).

Yau Wei Heong is a Master's student at the Electrical Engineering Faculty, Universiti Teknologi Malaysia (UTM) (e-mail: yauwh83@live.com.my).

Razali Ismail is with the Electrical Engineering Faculty, Universiti Teknologi Malaysia (UTM) (e-mail: razali@fke.utm.my).

Existing approaches to transistor modeling are based on lumped equivalent circuits. The equivalent-circuit approach involves determining an equivalent circuit topology and formulating the circuit elements. Such an approach requires not only experience but also a difficult trial-and-error process. Because the drain current depends on the drain-to-source voltage, Vd, and the gate-to-source voltage, Vg, the transistor is implemented in SPICE as a voltage-controlled current source.

Artificial neural networks (ANNs) have been recognized as a powerful tool for modeling and optimization problems [1]. This paper therefore proposes an ANN method for modeling and optimizing the current-voltage (I-V) characteristics, comparing TCAD simulation with T-SPICE modeling. The universal approximation property of ANNs gives them the ability to learn arbitrary nonlinear input-output relationships [2] from measured or simulated data, and NN approaches have been investigated for modeling transistor DC [3], small-signal [4], and large-signal [5] behavior. The method also offers possible ways to extract other circuit quantities such as parasitic capacitance, time delay, and speed performance.

## II. THEORY

### A. MOSFET Device

The Metal-Oxide-Semiconductor Field-Effect Transistor (MOSFET) is the most common type of field-effect transistor.

Fig. 1 shows an n-channel MOSFET, which has n-type source and drain regions in a p-type substrate in which the channel forms. N-type semiconductor material is produced by a doping process that adds certain types of atoms to the semiconductor to increase the number of free negative charge carriers (electrons). N-type doping produces an abundance of electrons in the material, so electrons carry the charge. If we start with all terminals grounded and apply a positive voltage to the gate (V<sub>GS</sub>), an electric field is created. This field draws electrons toward the gate oxide and pushes holes away. Since the gate oxide is an insulator, the electrons cannot pass through it. Once the gate voltage exceeds a threshold, they form a conducting layer known as the "inversion layer", which connects the drain and source and closes the electric circuit. Current can now flow between source and drain, and the drain (or source) can supply more electrons.

The current gain capability of a field-effect transistor (FET) is easily explained by the fact that no gate current is required to maintain the inversion layer and the resulting current between drain and source. The device therefore has an infinite current gain at DC. The current gain is inversely proportional to the signal frequency, reaching unity at the transit frequency. The voltage gain of the MOSFET arises because the current saturates at higher drain-source voltages, so that a small drain current variation can cause a large drain voltage variation [6].

### B. MOSFET Modeling

$V_{GS}$ (gate-source voltage) controls the value of $I_{DS}$ (drain-source current), that is, how much current flows between drain and source, by creating an inversion layer that connects drain and source. $I_{DS}$ and $V_{GS}$, together with $V_{DS}$ (drain-source voltage) and $V_{BS}$ (bulk-source voltage), are tied to the shape of the inversion layer and are the only variables we measure. A mathematical model takes all the information we know and constructs a function that expresses the dependence of $I_{DS}$ on these voltages. The large-signal currents in the $I_D$-$V_{DS}$ plane are calculated from the expressions:

$$I_G = I_B = 0 \tag{1}$$

$$I_{D} = \begin{cases} I_{CUTOFF}, & V_{GS} < V_{TH} & (2) \\ I_{OHMIC}, & V_{GS} > V_{TH} \text{ and } V_{DS} < V_{DSAT} & (3) \\ I_{SAT}, & V_{GS} > V_{TH} \text{ and } V_{DS} > V_{DSAT} & (4) \end{cases}$$

where it is assumed that the drain and source are designated so that $V_{DS} \ge 0$. $V_{DSAT}$ is a parameter that characterizes the transition between the ohmic and saturation regions.
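As an illustration of the region selection in (1)-(4), the following is a minimal sketch that fills in $I_{OHMIC}$ and $I_{SAT}$ with the classic square-law expressions. These expressions, the gain factor `k`, the channel-length modulation factor `lam`, and the choice $V_{DSAT} = V_{GS} - V_{TH}$ are assumptions made for the example; they are not the BSIM3 equations used later in this paper.

```python
def drain_current(vgs, vds, vth=0.77, k=1e-4, lam=0.0):
    """Piecewise drain current in amperes for VDS >= 0.

    Assumed square-law model: k (A/V^2) and lam (1/V) are hypothetical values.
    """
    if vds < 0:
        raise ValueError("designate drain and source so that VDS >= 0")
    vov = vgs - vth                      # gate overdrive
    if vov <= 0:
        return 0.0                       # I_CUTOFF, idealized as zero
    vdsat = vov                          # assumed transition voltage
    clm = 1.0 + lam * vds                # channel-length modulation factor
    if vds < vdsat:
        return k * (vov * vds - 0.5 * vds ** 2) * clm   # I_OHMIC
    return 0.5 * k * vov ** 2 * clm                     # I_SAT


# One point per operating region at VGS = 3.3 V (and one in cutoff).
cutoff = drain_current(0.5, 1.0)
ohmic = drain_current(3.3, 0.5)
saturation = drain_current(3.3, 3.3)
```

The two branches for $V_{GS} > V_{TH}$ are written so that they meet at $V_{DS} = V_{DSAT}$, which keeps the modeled curve continuous across the region boundary.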

Operation in quadrant 3 of the $I_D$-$V_{DS}$ plane is characterized by the same equations, where the drain and source designations are made internally so that $V_{DS} \ge 0$ [7].

The device element line (card) in the SPICE deck contains information about the nodal location of the device in the circuit, geometrical information about the device, and optional initial-condition variables. The device element line refers to a specific device model. The .MODEL line (card) in the SPICE deck contains generic information about the electrical characteristics of devices formed in a process, based on the characterizing process parameters. Each device has a separate device element line. Typically, many devices reference a single .MODEL line.

### C. Optimization Methods

To classify parameter extraction methods, brief information is needed about the methods and heuristics used to obtain mathematical model parameters and to control the progress of the parameter extraction algorithm. This overview is also useful because the following methods either have already been used for parameter extraction or could be used in the future. Parameter extraction of a MOSFET model for a given process technology starts with an initial set of parameters that comes from 1) vendor-supplied models, 2) previous MOSFET models, or 3) models extracted from physical fundamentals. Interaction between the parameters optimized in a given strategy is controlled by the maximum and minimum limits of each parameter. Nine optimization strategies are implemented: threshold and subthreshold region parameters; threshold shift effect parameters; threshold shift and channel resistance effect parameters; threshold shift and channel resistance effect binning parameters; low-bias drain saturated current parameters; low-bias output resistance parameters; high-bias drain saturated current parameters; high-bias output resistance parameters; and junction capacitance parameters. These are standard optimization strategies, and they may vary from one technology to another [8].
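The role of the minimum and maximum limits within one strategy can be sketched as follows. This is only an illustration under stated assumptions: the coordinate search, the function `run_strategy`, the toy error function, and all numeric limits are hypothetical and do not reproduce the local optimizer of any particular extraction tool. The parameter names are used purely as labels.

```python
def clamp(value, low, high):
    """Keep a parameter inside its [low, high] limits."""
    return max(low, min(high, value))


def run_strategy(params, limits, names, error, step=0.1, sweeps=50):
    """Coordinate search over the parameters in `names` only.

    `step` is a fraction of each parameter's allowed range; it is halved
    whenever a full sweep brings no improvement.
    """
    best = dict(params)
    best_err = error(best)
    for _ in range(sweeps):
        improved = False
        for name in names:
            low, high = limits[name]
            for delta in (+step, -step):
                trial = dict(best)
                trial[name] = clamp(best[name] + delta * (high - low), low, high)
                err = error(trial)
                if err < best_err:
                    best, best_err, improved = trial, err, True
        if not improved:
            step *= 0.5
    return best, best_err


# Hypothetical limits and starting values; the toy error has its
# unconstrained minimum outside the limits on purpose.
limits = {"VTH0": (0.3, 0.8), "PCLM": (0.1, 3.0)}
start = {"VTH0": 0.5, "PCLM": 1.0}
toy_error = lambda p: (p["VTH0"] - 0.9) ** 2 + (p["PCLM"] - 5.0) ** 2

both, err_both = run_strategy(start, limits, ["VTH0", "PCLM"], toy_error)
vth_only, err_vth = run_strategy(start, limits, ["VTH0"], toy_error)
```

In this toy case both parameters end at their upper limits, and a strategy that lists only `VTH0` leaves `PCLM` untouched, which is how the limits and the per-strategy parameter subsets keep one optimization step from disturbing parameters fitted elsewhere.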

### D. Dynamic Feedforward Neural Network

Dynamic feedforward neural networks are conceived as mathematical constructions, independent of any particular physical representation or interpretation. This section shows how these artificial neural networks can be related to device and subcircuit models that involve physical quantities like currents and voltages. Feedforward neural networks can, under relatively mild conditions, be guaranteed to preserve monotonicity in the multidimensional static behaviour. With contemporary physical models, it is generally no longer possible to guarantee monotonicity, due to the complexity of the mathematical analysis needed to prove monotonicity. It is an important property, however, because many devices are known to have monotonic characteristics. A nonmonotonic model for such a device may yield multiple spurious solutions for the circuit in which it is applied and it may lead to nonconvergence even during time domain circuit simulation. The monotonicity guarantee for neural networks can be maintained for highly nonlinear multidimensional behaviour, which so far has not been possible with table models without requiring excessive amounts of data. Furthermore, the monotonicity guarantee is optional, such that nonmonotonic static behaviour can still be modelled.



Figure 2. A dynamic feedforward neural network architecture.

A feedforward neural network is characterized by the number of layers and the number of neurons per layer. Layers are counted starting with the input layer as layer 0, such that a network with output layer $K$ involves a total of $K + 1$ layers (or $K$ layers if one prefers not to count the input layer). Layer $k$ by definition contains $N_k$ neurons, where $k = 0, \ldots, K$. The number $N_k$ may also be referred to as the width of layer $k$. Neurons that are not directly connected to the inputs or outputs of the network belong to a so-called hidden layer, of which there are $K - 1$ in a $(K + 1)$-layer network. Network inputs are labeled as $\mathbf{x}^{(0)} \equiv (x_1^{(0)}, \ldots, x_{N_0}^{(0)})^T$, and network outputs as $\mathbf{x}^{(K)} \equiv (x_1^{(K)}, \ldots, x_{N_K}^{(K)})^T$. The neuron output vector $\mathbf{y}_k \equiv (y_{1,k}, \ldots, y_{N_k,k})^T$ represents the vector of neuron outputs for layer $k$, containing as its elements the output variable $y_{i,k}$ for each individual neuron $i$ in layer $k$. The network inputs are treated as a dummy neuron layer $k = 0$, with enforced neuron outputs $y_{j,0} \equiv x_j^{(0)}$, $j = 1, \ldots, N_0$. However, when counting the number of neurons in a network, we will not take the dummy input neurons into account.

The logistic function $F(s_{ik})$ is strictly monotonically increasing in $s_{ik}$. However, we will generally use nonzero $v$'s and $\tau$'s, and instead of the logistic function we will apply other infinitely smooth ($C^{\infty}$) nonlinear modelling functions $F$. The standard logistic function lacks the common transition between highly nonlinear and weakly nonlinear behaviour that is typical for semiconductor devices and circuits.
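The layer-counting convention above can be sketched in a few lines. The sketch assumes a fully connected, purely static network with one bias per neuron, `tanh` hidden neurons, and a linear output layer. It ignores the dynamic terms of the neuron model, and the helper names `describe` and `forward` are hypothetical.

```python
import math


def describe(widths):
    """Summarize a network given widths = [N_0, ..., N_K] (layer 0 = inputs)."""
    K = len(widths) - 1
    return {
        "total_layers": K + 1,            # counting the dummy input layer
        "hidden_layers": K - 1,
        "neurons": sum(widths[1:]),       # dummy input neurons not counted
        "weights_and_biases": sum(
            (widths[k - 1] + 1) * widths[k] for k in range(1, K + 1)
        ),
    }


def forward(x, weights, biases, activation=math.tanh):
    """Static forward pass; weights[k][i][j] links neuron j to neuron i of the next layer."""
    y = list(x)                           # y_{j,0} = x_j^(0)
    last = len(weights) - 1
    for k, (w_k, b_k) in enumerate(zip(weights, biases)):
        s = [sum(w * v for w, v in zip(row, y)) + b for row, b in zip(w_k, b_k)]
        y = s if k == last else [activation(v) for v in s]   # linear output layer
    return y


summary = describe([36, 30, 10, 5])
tiny = forward([1.0, -1.0],
               weights=[[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0]]],
               biases=[[0.0, 0.0], [0.0]])
```

Read this way, widths of 36, 30, 10, and 5 give $K = 3$, two hidden layers, 45 neurons, and 1475 adjustable weights and biases under the full-connectivity assumption.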

One such modelling function is

$$\begin{aligned} F_{1}(s_{ik}, \delta_{ik}) &= \frac{1}{\delta_{ik}} \left[ \ln \left( \cosh \frac{s_{ik} + \delta_{ik}}{2} \right) - \ln \left( \cosh \frac{s_{ik} - \delta_{ik}}{2} \right) \right] \\ &= \frac{1}{\delta_{ik}} \ln \frac{\cosh \frac{s_{ik} + \delta_{ik}}{2}}{\cosh \frac{s_{ik} - \delta_{ik}}{2}} \end{aligned} \tag{5}$$

with $\delta_{ik} \neq 0$.

Preliminary experience with modeling MOSFET DC characteristics indicates that this function helps to avoid unacceptable local minima in the error function (cost function) during optimization, where unacceptable means that the results show too gradual near-subthreshold transitions [9].
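A direct numerical evaluation of (5) can be sketched as follows. The helper `log_cosh` and its overflow-safe rewriting are implementation choices made for this example, not part of the original formulation.

```python
import math


def log_cosh(x):
    """ln(cosh(x)) without overflow: |x| + ln(1 + exp(-2|x|)) - ln 2."""
    ax = abs(x)
    return ax + math.log1p(math.exp(-2.0 * ax)) - math.log(2.0)


def f1(s, delta):
    """Smooth modelling function F_1 of equation (5); requires delta != 0."""
    if delta == 0:
        raise ValueError("delta must be nonzero")
    return (log_cosh((s + delta) / 2.0) - log_cosh((s - delta) / 2.0)) / delta


# Sample the function on a grid to inspect its shape for delta = 2.
grid = [0.5 * i for i in range(-20, 21)]
samples = [f1(s, 2.0) for s in grid]
```

On this grid the sampled values increase strictly with $s$ and stay between -1 and 1, and for small $\delta_{ik}$ the function approaches $\tanh(s_{ik}/2)$, so $\delta_{ik}$ sets how wide the transition between the two saturation levels is.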

## III. METHODOLOGY

The artificial neural network method used for this optimization is called Optimization using Neural Network (ONN). The steps of ONN are shown in the flowchart of Figure 3.



Figure 3. Optimization using Neural Network (ONN) Flow chart

The process used to fabricate the planar and vertical NMOS transistors is simulated using Silvaco TCAD, which is used to create the device structure, add dopants, define electrodes, and create the mesh. The resulting structure follows the parameters in Table I.

TABLE I. MOSFET DEVICE PARAMETERS

| Parameter                     | Value                                |
|-------------------------------|--------------------------------------|
| Type                          | n-MOSFET                             |
| Body doping of Boron          | $9 \times 10^{17}\ \mathrm{cm}^{-3}$ |
| Polysilicon doping of Arsenic | $1 \times 10^{15}\ \mathrm{cm}^{-3}$ |
| Channel length (Lg)           | 150 nm                               |
| Oxide thickness (Tox)         | 5 nm                                 |

The simulation with Silvaco Atlas then yields electrical characteristics such as the Id-Vd, Id-Vg, and C-V curves. In this paper, only the Id-Vd characteristic is optimized. The TCAD results are used as the comparison data for optimization by the ONN method. On the other hand, the electrical characteristics of the MOSFET can also be obtained from SPICE. The ONN can be trained using such data to produce fast and accurate DC neuromodels. For this work, the training samples are collected by T-SPICE simulations using BSIM3 according to 1.3 $\mu$m technology. The n-MOSFET length and width are 1.5 $\mu$m and 3 $\mu$m, respectively. The device is characterized from data gathered with Vg of 1.1, 2.2, and 3.3 V and Vd ranging from 0 to 3.3 V. The I-V characteristic simulations were taken for the drain current Id.

Each Id-Vd graph corresponds to a set of physical model parameters: the threshold voltage (VTH0), the drain-induced barrier lowering parameter (PDIBLC), the L-dependence coefficient of the DIBL effect in output resistance (DROUT), the gate dependence of the Early voltage (PVAG), the channel length modulation parameter (PCLM), and the carrier saturation velocity (VSAT). If the parameters change, the graph changes accordingly. These parameters serve as the target data. The pairing of an Id-Vd graph with its physical parameters, as ONN input and output respectively, is shown in Figure 4.



Figure 4. Input-Target Data of the ONN method
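How one input-target pair might be assembled can be sketched as follows. Several things here are assumptions and are not stated in the paper: the number of Vd samples per curve (12, chosen so that three curves give a 36-element input vector, which would match the first number of the reported 36-30-10-5 architecture if that number denotes the input width), the stand-in function `toy_id` that replaces a real T-SPICE run, and every parameter value except the 0.77 V threshold taken from the TCAD results.

```python
VG_VALUES = (1.1, 2.2, 3.3)        # gate voltages used in this work (V)
POINTS_PER_CURVE = 12              # assumption: not stated in the paper
TARGET_NAMES = ("VTH0", "PDIBLC", "DROUT", "PVAG", "PCLM", "VSAT")


def vd_grid(n=POINTS_PER_CURVE, vd_max=3.3):
    """Evenly spaced drain voltages from 0 to vd_max."""
    return [vd_max * i / (n - 1) for i in range(n)]


def toy_id(vg, vd, params, k=1e-4):
    """Hypothetical stand-in for a T-SPICE run (square-law, uses VTH0 only)."""
    vov = vg - params["VTH0"]
    if vov <= 0:
        return 0.0
    if vd < vov:
        return k * (vov * vd - 0.5 * vd ** 2)
    return 0.5 * k * vov ** 2


def make_pair(params, simulate=toy_id):
    """Flatten the three Id-Vd curves into one input vector; parameters are the target."""
    inputs = [simulate(vg, vd, params) for vg in VG_VALUES for vd in vd_grid()]
    targets = [params[name] for name in TARGET_NAMES]
    return inputs, targets


# Placeholder parameter set; only VTH0 = 0.77 V comes from the TCAD results.
example_params = {"VTH0": 0.77, "PDIBLC": 0.1, "DROUT": 0.5,
                  "PVAG": 0.0, "PCLM": 1.3, "VSAT": 8.0e4}
inputs, targets = make_pair(example_params)
```

Repeating `make_pair` over many parameter sets would produce the varied collection of input-target pairs that the training step needs.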



TABLE II. ARTIFICIAL NEURAL NETWORK PARAMETERS

| Parameter                  | Value                        |
|----------------------------|------------------------------|
| Network Type               | Feed-Forward Backpropagation |
| Training Function          | Trainlm                      |
| Adaption Training Function | Learngdm                     |
| Error Function             | Mean Square Error            |
| Input Transfer Function    | Tansig                       |
| Hidden Transfer Function   | Tansig                       |
| Output Transfer Function   | Purelin                      |

The more numerous and varied the input-target pairs, the better the trained network. Training used the settings in Table II. A series of neural networks with different numbers of hidden neurons was trained using the Levenberg-Marquardt (trainlm) algorithm. Among the configurations tried, the feedforward network was found to provide the best trade-off between the desired accuracy and the model complexity.
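The Levenberg-Marquardt update itself can be illustrated on a problem small enough to solve by hand. The sketch below fits the two parameters of a hypothetical saturating curve $a \tanh(bx)$ to noise-free samples. It is not the network training performed in this work, and the starting damping `mu` and its scaling factors are illustrative choices.

```python
import math


def residuals_and_jacobian(xs, ys, a, b):
    """Residuals r_i = a*tanh(b*x_i) - y_i and their derivatives w.r.t. (a, b)."""
    r, J = [], []
    for x, y in zip(xs, ys):
        t = math.tanh(b * x)
        r.append(a * t - y)
        J.append((t, a * x * (1.0 - t * t)))
    return r, J


def mse(r):
    return sum(v * v for v in r) / len(r)


def levenberg_marquardt(xs, ys, a, b, mu=1e-3, mu_dec=0.1, mu_inc=10.0,
                        epochs=100, goal=1e-20, mu_max=1e10):
    r, J = residuals_and_jacobian(xs, ys, a, b)
    history = [mse(r)]
    for _ in range(epochs):
        if history[-1] <= goal:
            break
        # Entries of J^T J and J^T r for the two-parameter problem.
        h11 = sum(j[0] * j[0] for j in J)
        h12 = sum(j[0] * j[1] for j in J)
        h22 = sum(j[1] * j[1] for j in J)
        g1 = sum(j[0] * v for j, v in zip(J, r))
        g2 = sum(j[1] * v for j, v in zip(J, r))
        accepted = False
        while mu < mu_max:
            # Solve (J^T J + mu*I) * step = -J^T r in closed form.
            a11, a22 = h11 + mu, h22 + mu
            det = a11 * a22 - h12 * h12
            da = -(a22 * g1 - h12 * g2) / det
            db = -(a11 * g2 - h12 * g1) / det
            r_new, J_new = residuals_and_jacobian(xs, ys, a + da, b + db)
            if mse(r_new) < history[-1]:
                a, b, r, J = a + da, b + db, r_new, J_new
                history.append(mse(r_new))
                mu *= mu_dec            # trust the quadratic model more
                accepted = True
                break
            mu *= mu_inc                # step rejected: damp more heavily
        if not accepted:
            break
    return a, b, history


xs = [0.3 * i for i in range(12)]
ys = [2.0 * math.tanh(0.5 * x) for x in xs]
a_fit, b_fit, history = levenberg_marquardt(xs, ys, a=1.0, b=1.0)
```

Because a step is accepted only when it lowers the error, the recorded MSE decreases at every epoch, and near the solution the small damping makes each step close to a Gauss-Newton step. That combination is why the method can drive the error down in few epochs on small, noise-free data sets.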

## IV. RESULTS

The MOSFET device simulation using TCAD yields the electrical parameters shown in Table III. These parameters were obtained for a planar n-MOSFET with a substrate concentration of $9 \times 10^{17}$ cm<sup>-3</sup>, a source and drain doping concentration of $1 \times 10^{15}$ cm<sup>-3</sup>, a channel length (L<sub>g</sub>) of 150 nm, and an oxide thickness (t<sub>ox</sub>) of 5 nm.

TABLE III. ELECTRICAL PERFORMANCE RESULTS OF MOSFET

| Parameter               | Value                             |
|-------------------------|-----------------------------------|
| Threshold Voltage (Vth) | 0.77 V                            |
| Leakage Current (Ioff)  | $1.36 \times 10^{-15}$ A/$\mu$m   |
| Drive Current (Ion)     | $8.06 \times 10^{-7}$ A/$\mu$m    |
| DIBL                    | 44 mV/V                           |
| Subthreshold Slope (S)  | 82 mV/decade                      |
With this MOSFET structure, the electrical characteristics are used by the SPICE model in the optimization process. The Id-Vd graphs before and after Optimization using Neural Network are shown in Figure 5.



Figure 5. Optimization method using Artificial Neural Network

The ONN method uses a neural network with the parameters listed in Table II. The feed-forward backpropagation network type was used because the monotonicity guarantee for such networks can be maintained for nonlinear multidimensional input and target data, which so far has not been possible with table models without requiring excessive amounts of data. This method can also avoid convergence problems, because it avoids the need for an iterative solver. The training error is shown in Figure 6.



Figure 6. Training Performance of ONN process

The stability of feedforward neural networks can be guaranteed: it depends solely on the stability of the individual neurons, so if all neurons are stable, the feedforward network is also stable. For training with the various network architectures, the training error graphs show a mean squared error (MSE) that decreases steadily for all architectures.

The training MSEs of all architectures fall below 1e-25 after 5 to 6 iterations. The best result is obtained with the 36-30-10-5 neural network architecture, with an MSE of 1e-28 at epoch 5.

## V. CONCLUSION

An alternative method was proposed for optimizing the electrical characteristic curves of a MOSFET, matching the MOSFET model to measured or simulated data using a neural network. The conventional optimization method is trial and error, so it needs more time to reach an optimized result. Optimization using Neural Network (ONN) requires few steps and yields a small error in obtaining the desired result. The neural network used in this paper is a dynamic feedforward neural network. After training, the best result is obtained with the 36-30-10-5 architecture, with a mean squared error (MSE) of 1e-28 at epoch 5.

## REFERENCES

- [1] Burrascano, P., M. Mongiardo, A review of artificial neural networks applications in microwave CAD, Int. J. RF Microwave CAE: Special Issue Application ANN to RF and Microwave Design, 1999, 9: 158-174.
- [2] Hornik, K., M. Stinchcombe, H. White, P. Auer, Degree of Approximation Results for Feedforward Networks Approximating Unknown Mappings and their Derivatives, Neural Computation, 1994, 6: 1262-1275.
- [3] Wang, F., Q.J. Zhang, Knowledge-Based Neural Models for Microwave Design, IEEE Trans. Microwave Theory Tech., 1997, 45: 2333-2343.
- [4] Devabhaktuni, V.K., C. Xi, Q.J. Zhang, A Neural Network Approach to the Modelling of Heterojunction Bipolar Transistors from S-Parameter Data, Proc. 28th European Microwave Conf., Amsterdam, Netherlands, Oct. 1998, pp 306-311.
- [5] Zaabab, A.H., Q.J. Zhang, M.S. Nakhla, A Neural Network Modelling Approach to Circuit Optimization and Statistical Design, IEEE Trans. Microwave Theory Tech., 1995, 42: 1349-1358.
- [6] B. V. Zeghbroeck, *Principles of semiconductor devices*, Electrical and Computer Engineering Department, University of Colorado at Boulder, 2004. Available: http://ecewww.colorado.edu/~bart/book/.
- [7] "BSIM3 Manual," University of California at Berkeley, March 1997. Available: http://www-device.EECS.Berkeley.EDU/~bsim3/
- [8] Simulation Silvaco Standard, Local Optimization Templates for Extracting BSIM3v3.1 Parameters in UTMOST III, VOL 9 No 1 January 1998.
- [9] P. B. L. Meijer, Fast and Smooth Highly Nonlinear Table Models for Device Modeling, IEEE Trans. Circuits Syst., Vol. 37, pp. 335-346, Mar. 1990.