Journal of Applied Sciences Research, 8(4): 2100-2108, 2012
ISSN 1819-544X
This is a refereed journal and all articles are professionally screened and reviewed
ORIGINAL ARTICLES
Design and Implementation of High speed ALU using Optimized PDP adder and
Multiplier
1 M. Kathirvelu, 2 Dr. T. Manigandan
1 Associate Professor, Department of ECE, KPR Institute of Engg & Tech, Coimbatore, Tamilnadu, India.
2 Principal, PA College of Engineering & Technology, Pollachi, Coimbatore, Tamilnadu, India.
ABSTRACT
Digital design is present in everyday products, including computers, calculators and video cameras.
There will always be a need for high-speed, low-power digital products, which makes digital design a
growing business. The arithmetic logic unit (ALU) is a critical component of a microprocessor and the
core of the central processing unit; it is the heart of the instruction execution portion of every
computer. The ALU is a combinational circuit that performs a number of arithmetic and logical
operations within a microprocessor. In this paper we split the ALU into two modules, a logic module
and an arithmetic module. Designing each module separately is easier than designing a bit-slice as one
unit. We designed an 8-bit ALU built from an adder cell optimized for power and speed. Different adder
architectures are designed and simulated with T-SPICE. The adder designed with the 4-transistor XOR
gate achieves the lowest delay and power-delay product and is used to design the multiplier and the
ALU. Generally, the CMOS design style is not area efficient for complex gates with large fan-in. Thus,
care must be taken when a static logic style is selected to realize a logic function. The designed
circuit is simulated in 5 micron technology with an operating voltage of 5 V using Tanner EDA.
Key words:
Introduction
In the past, ALU and full adder circuits have been implemented for optimum area and delay, each with
distinct features. Some of them are briefly described below to give an idea of earlier work and to shed
light on different optimization techniques. Past work related to multiple-input floating gate CMOS
applications has also been reviewed to understand its operation, design and simulation issues. Bui et al
designed a low-power 10-transistor full adder called the Static Energy-Recovery Full-Adder (SERF), using
a novel set of XOR and XNOR gates in combination with existing ones. The XOR and XNOR circuits designed
by them do not directly connect to the power and ground lines, respectively. Wang et al have shown an
improved version of XOR and XNOR gates that makes use of six transistors. In another design of a CMOS
1-bit full adder cell, four-transistor XOR and XNOR gates have been used by Wang et al. The cell offers
higher speed and lower power consumption than the standard 1-bit full adder cell. Radhakrishnan has
presented the design of low-power CMOS full adder circuits using transmission function theory. Suzuki
et al prescribed the usage of six-transistor CMOS XOR and XNOR gates. Their 16-bit, 2.4 ns, 0.5 µm CMOS
ALU design consists of a logical and arithmetic unit (LAU), a magnitude comparator (CMP), an overflow
detector (OVF) and a zero flag detector (ZERO). The ALU employs a binary look-ahead carry (BLC) adder.
All units in this design operate in parallel, and high speed is achieved.
The most efficient way to reduce the power consumption of digital circuits is to reduce the supply
voltage (Hosseinghadiry et al 2009), since the average power consumption of CMOS digital circuits is
proportional to the square of the supply voltage. The resulting performance loss can be overcome for
standard CMOS technologies by introducing more parallelism (Jou et al 1997) and by modifying the
process to optimize it for low supply voltage operation. In this paper we review various adder
architectures, present a novel method for designing the ALU, and report the power consumption of the
ALU with different adder and multiplier architectures.
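As a rough numerical illustration of the quadratic dependence on supply voltage noted above, the sketch below uses the textbook switching-power model P = activity x C x Vdd^2 x f. The activity factor, load capacitance and clock frequency are hypothetical placeholder values chosen for illustration; they are not figures from this work.

```python
def dynamic_power(vdd, c_load=1e-12, freq=50e6, activity=0.5):
    """Textbook CMOS switching power: P = activity * C * Vdd^2 * f.

    c_load, freq and activity are hypothetical defaults used only
    for illustration; they are not taken from the paper.
    """
    return activity * c_load * vdd ** 2 * freq


def power_ratio(vdd_new, vdd_old):
    """Factor by which switching power changes when only Vdd changes."""
    return dynamic_power(vdd_new) / dynamic_power(vdd_old)


if __name__ == "__main__":
    for vdd in (5.0, 3.3, 2.5):
        print(f"Vdd = {vdd} V: power relative to 5 V = {power_ratio(vdd, 5.0):.3f}")
```

Under this model, halving the supply voltage from 5 V to 2.5 V reduces the switching power to one quarter, which is why the lost speed has to be recovered by other means such as parallelism.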
Analysis of various full adder architectures:
a) Conventional CMOS Full Adder:
Corresponding Author: M. Kathirvelu, Department of ECE, KPR Institute of Engg & Tech, Coimbatore Tamilnadu, India.
E-mail: ***********@*****.***
J. Appl. Sci. Res., 8(4): 2100-2108, 2012
The conventional adder is implemented with 28 transistors in CMOS technology and requires a supply of
at least one volt to function properly.
b) Static Energy Recovery Full (SERF) Adder:
The Static Energy Recovery Full (SERF) adder requires only 10 transistors to implement a full adder. An
intermediately generated XNOR(A, B) signal is shared to generate the carry-out and sum outputs (Shalem
et al 1999).
c) Transmission Function Adder:
A transmission gate, or analog switch, is an electronic element that selectively blocks or passes a
signal level from the input to the output. This solid-state switch comprises a pMOS transistor and an
nMOS transistor. The control gates are biased in a complementary manner so that both transistors are
either on or off. The transmission function full adder uses 16 transistors to realize the circuit. It
occupies less area and consumes less power than the Ex-OR based adder.
d) Ex-OR and Multiplexer Based Full Adder Design:
This logic approach uses only one XOR gate and two multiplexers to implement the carry and sum [12].
The XOR gate is the most power-hungry component of full adder cells; therefore, this approach reduces
the power consumption.
e) Multiplexer Based Full Adder:
In this architecture, multiplexers are used to design the adder (Yingtao Jiang et al 2009). As the
multiplexers do not need any supply voltage to function, a full adder designed with multiplexers may
avoid leakage and short-circuit problems. The multiplexers are implemented using NMOS and PMOS pass
transistors. The select line of each 2-to-1 multiplexer can be taken from any one of the full adder
inputs. The CARRY function of the full adder can be implemented using the general equation
$$\mathrm{CARRY} = \bar{a}bc + a\bar{b}c + ab\bar{c} + abc \qquad (1)$$
Reducing this equation, we obtain
$$\mathrm{CARRY} = \bar{a}bc + ab\bar{c} + ac \qquad (2)$$
Implementing the sum and carry functions with equation 2 requires 6 identical multiplexers.
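The reduction from equation 1 to equation 2 can be checked exhaustively over the eight input combinations. The sketch below is only a truth-table check, reading equation 1 as the usual sum of the four carry minterms and equation 2 as its reduction with the term a·c factored out; the function names are illustrative.

```python
from itertools import product


def carry_eq1(a, b, c):
    """Equation (1): a'bc + ab'c + abc' + abc (sum of the four carry minterms)."""
    na, nb, nc = 1 - a, 1 - b, 1 - c
    return (na & b & c) | (a & nb & c) | (a & b & nc) | (a & b & c)


def carry_eq2(a, b, c):
    """Equation (2): a'bc + abc' + ac (ab'c and abc merged into ac)."""
    na, nc = 1 - a, 1 - c
    return (na & b & c) | (a & b & nc) | (a & c)


def equivalent(f, g):
    """True when f and g agree on all eight (a, b, c) combinations."""
    return all(f(a, b, c) == g(a, b, c) for a, b, c in product((0, 1), repeat=3))


if __name__ == "__main__":
    print("eq1 == eq2:", equivalent(carry_eq1, carry_eq2))
```

Both forms equal the majority function of a, b and c, so the reduced equation only changes which input signals drive the multiplexers, not the logic value of the carry.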
f) Proposed Multiplexer Based 12T Full Adder:
The adder proposed in this paper is designed with multiplexers. The proposed 1-bit full adder utilizes
6 identical multiplexers; substituting each multiplexer with a 2-transistor circuit gives the new
MBA-12T adder, which requires a total of 12 transistors to realize the full adder function and is shown
in figure 1. The carry function of the multiplexer-based adder discussed in section e) is implemented
with equation 2. There, the input a changes on every cycle, which switches the transistors on every
clock cycle and results in more switching power. In the proposed adder the carry function is
implemented with equation 3.
$$\mathrm{CARRY} = a\bar{b}c + ab\bar{c} + bc \qquad (3)$$
In the proposed adder the inputs b and c are used as select inputs for the multiplexers. The input b
changes only once every two clock cycles, so the transistors switch less often and the switching power
is lower.
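The switching argument above can be illustrated with a small behavioural sketch. It reads equation 3 as ab'c + abc' + bc, checks it against the majority function, and counts input transitions. It assumes the test vectors are applied in binary counting order with a as the least significant bit. The multiplexer wiring in `carry_mux` is one possible mapping chosen for illustration and is not claimed to be the transistor-level circuit of figure 1.

```python
from itertools import product


def mux2(sel, d0, d1):
    """2:1 multiplexer: passes d0 when sel is 0 and d1 when sel is 1."""
    return d1 if sel else d0


def carry_eq3(a, b, c):
    """Equation (3) as a Boolean sum of products: ab'c + abc' + bc."""
    nb, nc = 1 - b, 1 - c
    return (a & nb & c) | (a & b & nc) | (b & c)


def carry_mux(a, b, c):
    """Hypothetical mux mapping with b and c as select inputs.

    Only input signals feed the data pins, so no constant 0/1
    (supply or ground) connection is needed.
    """
    return mux2(b, mux2(c, b, a), mux2(c, a, b))


def matches_majority(f):
    """True when f equals the full adder carry for all eight inputs."""
    return all(f(a, b, c) == int(a + b + c >= 2)
               for a, b, c in product((0, 1), repeat=3))


def toggle_counts(vectors):
    """Count transitions of each input over a sequence of input vectors."""
    counts = [0] * len(vectors[0])
    for prev, cur in zip(vectors, vectors[1:]):
        for i, (p, q) in enumerate(zip(prev, cur)):
            counts[i] += int(p != q)
    return counts


# Assumed stimulus: (a, b, c) swept in binary counting order, a is the LSB.
COUNTING_VECTORS = [(n & 1, (n >> 1) & 1, (n >> 2) & 1) for n in range(8)]
```

With this assumed stimulus, `toggle_counts(COUNTING_VECTORS)` reports 7 transitions for a, 3 for b and 1 for c, which is the sense in which selecting with b rather than a reduces switching at the select inputs.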
In addition to reduced transition activity and charge recycling capability, this circuit has no direct
connections to the power supply nodes, and all the signal gates are directly excited by fresh input
signals, leading to a noticeable reduction in short-circuit power consumption. There are three major
sources of power dissipation in a digital CMOS circuit: logic transitions, short-circuit current and
leakage current. The short-circuit current is the direct current passing from the supply to ground when
both the NMOS and PMOS transistors are simultaneously active. Many adder designs suffer from the same
short-circuit current problem, as they have some internal nodes driven by signals with slow rise and/or
fall times. This leads to significant (20%) short-circuit power dissipation for loaded inverters. This
problem was partially solved in SERF adders through the absence of a connection to the Vss port, so
that no direct path from supply to ground can be formed.
Fig. 1: Proposed Multiplexer-Based 12-Transistor Circuit
The new MBA-12T adder moves one step further and provides the best solution for the short-circuit
current problem, as all of its internal gate nodes are directly excited by fresh input signals. On top
of that, the MBA-12T has no direct connections to the Vdd or Vss ports, which can substantially reduce
the probability of a direct path forming from the positive voltage supply to ground during switching.
g) Proposed 4T XOR Adder:
XOR gates form the fundamental building block of full adders. Enhancing the performance of the XOR
gates can significantly improve the performance of the adder. A survey of literature reveals a wide spectrum of
different types of XOR gates that have been realized over the years. The early designs of XOR gates were based
on either eight transistors or six transistors that are conventionally used in most designs. In this
paper considerable emphasis is laid on the design of a four-transistor XOR gate, and the resulting adder
is shown in figure 2.
[Figure 2 block diagram: inputs a, b and c; two XOR blocks and one MUX block; outputs sum and carry]
Fig. 2: Proposed XOR based 10 T Adder
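A behavioural sketch of an adder built from two XOR stages and one multiplexer is given below. The exact wiring of figure 2 is not recoverable here, so the connection used (the first XOR output selects between a and c for the carry) is an assumption based on the common XOR-XOR-MUX full adder structure, and the function name is illustrative.

```python
def xor_mux_full_adder(a, b, c):
    """Full adder from two XOR stages and one 2:1 multiplexer (assumed wiring)."""
    p = a ^ b                  # first XOR: propagate signal
    s = p ^ c                  # second XOR: sum output
    carry = c if p else a      # MUX: when a == b the carry equals a, otherwise c
    return s, carry


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            for c in (0, 1):
                print(a, b, c, "->", xor_mux_full_adder(a, b, c))
```

If each XOR is realised with four transistors and the multiplexer with two pass transistors, this structure accounts for 10 transistors, which matches the count in the figure caption.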
The full adder structures discussed above are simulated with Tanner SPICE, and the dynamic power
consumption of each structure is given in table 1. The proposed multiplexer-based adder achieves a
lower power-delay product than the existing designs and is used to design a multiplier. The output
waveform of the proposed structure, shown in figure 3, is similar to that of the other structures.
Fig. 3: Output waveform of proposed adder
Table 1: Power comparison between the adders
S.No  Name of the adder                      Avg. power   Delay (ns)   PDP
1     CMOS adder                             3.51E-05     1.2          4.21E-05
2     26 Transistor adder                    1.22E-04     1.8          219.6E-06
3     Transmission Function adder            2.85E-05     2.6          7.41E-05
4     SERF adder                             1.20E-05     2.5          30E-06
5     XOR and MUX based adder                1.54E-05     1.2          18.48E-06
6     Multiplexer based adder [18]           1.55E-05     1.8          2.79E-05
7     Proposed Multiplexer based 12T adder   1.39E-05     0.8          11.12E-06
8     Proposed 4T XOR adder                  1.52E-05     0.6          9.12E-06
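The PDP column of table 1 appears to be the listed average power multiplied by the delay in nanoseconds. The short sketch below recomputes it from the tabulated power and delay values and ranks the adders; the data are copied from the table, while the function names are illustrative.

```python
# (name, average power as listed in Table 1, delay in ns)
ADDERS = [
    ("CMOS adder", 3.51e-05, 1.2),
    ("26 Transistor adder", 1.22e-04, 1.8),
    ("Transmission Function adder", 2.85e-05, 2.6),
    ("SERF adder", 1.20e-05, 2.5),
    ("XOR and MUX based adder", 1.54e-05, 1.2),
    ("Multiplexer based adder", 1.55e-05, 1.8),
    ("Proposed Multiplexer based 12T adder", 1.39e-05, 0.8),
    ("Proposed 4T XOR adder", 1.52e-05, 0.6),
]


def pdp(power, delay_ns):
    """Power-delay product in the units used by Table 1 (power x ns)."""
    return power * delay_ns


def rank_by_pdp(adders):
    """Return the adders sorted from lowest to highest PDP."""
    return sorted(adders, key=lambda row: pdp(row[1], row[2]))


if __name__ == "__main__":
    for name, power, delay in rank_by_pdp(ADDERS):
        print(f"{name:38s} PDP = {pdp(power, delay):.3e}")
```

Ranked this way, the proposed 4T XOR adder has the lowest PDP and the proposed 12T multiplexer-based adder the second lowest, while the SERF adder has the lowest average power on its own.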
The average power consumption of the various adder architectures is plotted in figure 4. The proposed
adder is optimized for power and delay, and it is used in the multiplier.
[Figure 4 plot: "Power Comparison of adders"; vertical axis: power in 10^-5 watts; horizontal axis: time
in nanoseconds; curves: CMOS, 26T, MUX based, SERF, XNOR adder, proposed adder]
Fig. 4: Average power consumption of different adders.
Arithmetic Logic Unit Design:
An arithmetic logic unit is a digital circuit that performs arithmetic and logical operations. The ALU is a
fundamental building block of the central processing unit (CPU) of a computer, and even the simplest
microprocessors contain one for purposes such as maintaining timers. The processors found inside modern
CPUs and graphics processing units (GPUs) accommodate very powerful and very complex ALUs; a single
component may contain a number of ALUs. Most of a processor's operations are performed by one or more
ALUs. An ALU loads data from input registers, an external control unit commands the ALU to perform the
required operation on the data, and the ALU stores the result in an output register (Suma et al 2011).
The inputs to the ALU are the data to be operated on (called operands) and a code from the control unit
indicating which operation to perform. Most ALUs can perform the following operations:
- Integer arithmetic operations (addition, subtraction, and sometimes multiplication and division,
  though this is more expensive)
- Bitwise logic operations (AND, NOT, OR, XOR)
- Bit-shifting operations (shifting or rotating a word by a specified number of bits to the left or
  right, with or without sign extension)
When designing the ALU we used the principle of "divide and conquer" to obtain a modular design that
consists of smaller, more manageable blocks, some of which can be re-used. Instead of designing the
8-bit ALU as one circuit, we first design a one-bit ALU, also called a bit-slice. These bit-slices can
then be put together to make an 8-bit ALU.
Fig. 5: Block diagram of one bit ALU
There are different ways to design a bit-slice of the ALU. One method is to form the truth table with 6
inputs (M, S1, S0, C0, Ai and Bi) and two outputs (Fi and Ci+1), but forming this table is complex. An
alternative is to split the ALU into two modules, a logic module and an arithmetic module. Designing
each module separately is easier than designing a bit-slice as one unit. The block diagram of the ALU
is shown in figure 5 with three modules: a 2:1 MUX, a logic unit and an arithmetic unit. In the ALU
design, multiplication is the most demanding operation for the processor to compute. In this paper we
designed three different multiplier architectures, namely an array multiplier, a Wallace tree
multiplier and a multiplexer-based multiplier, for 8-bit multiplication. The proposed XOR-based
10-transistor adder is used to construct the multipliers in each architecture, and the power and delay
are computed with Tanner SPICE. The power consumption of the different multiplier architectures is
shown in table 2 and plotted in figure 6.
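The bit-slice organisation described above, a logic unit and an arithmetic unit feeding a 2:1 MUX controlled by M, can be sketched behaviourally as follows. The paper does not list which function each S1 S0 code selects, so the assignments below (AND, OR, XOR, NOT in the logic unit, and a full adder with a conditioned B operand in the arithmetic unit) are hypothetical and serve only to show how eight slices chain through the carry.

```python
def full_adder(a, b, c):
    """Single-bit full adder: returns (sum, carry_out)."""
    return a ^ b ^ c, (a & b) | (b & c) | (a & c)


def logic_unit(s1, s0, a, b):
    """Hypothetical logic functions for S1 S0 = 00, 01, 10, 11."""
    return [a & b, a | b, a ^ b, 1 - a][2 * s1 + s0]


def arithmetic_unit(s1, s0, a, b, c_in):
    """Full adder whose B operand is conditioned by S1 S0 (hypothetical)."""
    y = [0, b, 1 - b, 1][2 * s1 + s0]    # 0, B, NOT B, 1
    return full_adder(a, y, c_in)


def alu_slice(m, s1, s0, c_in, a, b):
    """One bit-slice: returns (F_i, C_{i+1})."""
    f_logic = logic_unit(s1, s0, a, b)
    f_arith, c_out = arithmetic_unit(s1, s0, a, b, c_in)
    f = f_arith if m else f_logic        # 2:1 MUX controlled by M
    return f, (c_out if m else 0)


def alu8(m, s1, s0, c0, a, b):
    """Chain eight slices; the carry ripples from slice i to slice i+1."""
    result, carry = 0, c0
    for i in range(8):
        f, carry = alu_slice(m, s1, s0, carry, (a >> i) & 1, (b >> i) & 1)
        result |= f << i
    return result, carry
```

With these assumed codes, `alu8(1, 0, 1, 0, a, b)` adds the operands and `alu8(1, 1, 0, 1, a, b)` subtracts b from a in two's complement, while M = 0 selects the bitwise logic functions.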
Table 2: Performance comparison of different multiplier architectures
Architecture                         Power (W)       Delay (ns)   PDP (W·s)   No. of transistors
8 bit array multiplier               3.508697e-002   44           1.564e-9    944
8 bit Wallace Tree multiplier        3.799979e-002   55           2.095e-9    1024
8 bit Multiplexer based multiplier   2.548114e-002   68           1.773e-9    844
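For readers unfamiliar with the array multiplier listed in table 2, the sketch below shows the general technique at the behavioural level: each partial-product bit is an AND of one multiplicand bit and one multiplier bit, and each row of full adders accumulates one partial product. It is a generic textbook arrangement, not the transistor-level design simulated in the paper, and the names are illustrative.

```python
def full_adder(a, b, c):
    """Single-bit full adder: returns (sum, carry_out)."""
    return a ^ b ^ c, (a & b) | (b & c) | (a & c)


def array_multiply(x, y, width=8):
    """Row-by-row array multiplier built from AND gates and full adders."""
    acc = [0] * (2 * width)                  # running sum bits, LSB first
    for j in range(width):                   # one adder row per multiplier bit
        yj = (y >> j) & 1
        carry = 0
        for i in range(width):               # one full adder per column
            pp = ((x >> i) & 1) & yj         # partial-product bit (AND gate)
            acc[i + j], carry = full_adder(acc[i + j], pp, carry)
        acc[j + width] = carry               # row carry-out is the next sum bit
    return sum(bit << k for k, bit in enumerate(acc))


if __name__ == "__main__":
    print(array_multiply(200, 100))          # 16-bit product of two 8-bit operands
```

Because every cell is the same full adder, the power and delay of the adder cell chosen for the array directly determine the figures of the whole multiplier.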
[Figure 6 plot: "Power comparison of Multipliers"; vertical axis: power in 10^-2 watts; horizontal axis:
time in ns; curves: Array Multiplier, Wallace Tree Multiplier, Mux based Multiplier]
Fig. 6: power comparison of different multiplier architecture
An 8-bit ALU has been designed for 5.0 V operation. The full adder has been implemented using the
proposed 4T XOR gate. The 8-bit ALU performs MULTIPLICATION, ADDITION, BARREL SHIFT, SUBTRACTION, EXOR,
NAND and NOR operations. The result of every computation is obtained from the output of an 8-to-1
multiplexer. The select signals of the multiplexer decide the operation to be performed, and the input
and output are selected accordingly. The schematic structure of the designed 8-bit ALU is shown in
figure 7, and the operation of the ALU for each select input is shown in table 3.
Table 3: ALU operation for various select inputs
S2   S1   S0   Operation
0    0    0    NAND
0    0    1    NOR
0    1    0    XOR
0    1    1    SUBTRACTION
1    0    0    SHIFT
1    0    1    ADDITION
1    1    0    MULTIPLICATION (P0-P7)
1    1    1    MULTIPLICATION (P8-P15)
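The select-code mapping of table 3 can be expressed as a small behavioural model in which all results are computed and an 8-to-1 selection picks one of them. The model is a sketch only: the shift direction and amount are not stated in the paper, so a left shift of A by the three low bits of B is assumed here, and subtraction and addition are taken to wrap modulo 256.

```python
MASK = 0xFF  # 8-bit data path


def alu(select, a, b):
    """Return the 8-bit output for a 3-bit select code (S2 S1 S0)."""
    product = a * b                       # 16-bit product P15..P0
    outputs = [
        ~(a & b) & MASK,                  # 000 NAND
        ~(a | b) & MASK,                  # 001 NOR
        (a ^ b) & MASK,                   # 010 XOR
        (a - b) & MASK,                   # 011 SUBTRACTION (wraps modulo 256)
        (a << (b & 0x7)) & MASK,          # 100 SHIFT (assumed: left by B[2:0])
        (a + b) & MASK,                   # 101 ADDITION (carry-out dropped)
        product & MASK,                   # 110 MULTIPLICATION, P0..P7
        (product >> 8) & MASK,            # 111 MULTIPLICATION, P8..P15
    ]
    return outputs[select]                # acts like the 8:1 output multiplexer


if __name__ == "__main__":
    for code in range(8):
        print(f"{code:03b} -> 0x{alu(code, 0xF0, 0x3C):02X}")
```

Reading the full 16-bit product therefore takes two selections, code 110 for the low byte and code 111 for the high byte.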
Fig. 7: Schematic structure of 8 bit ALU
Result:
The designed ALU is simulated with 5 micron CMOS technology. The 8-bit input data is applied to the ALU
and, based on the select input, the outputs of the various operations are taken from the multiplexer,
as shown in figure 8. The power consumption of the ALU with different adder and multiplier
architectures is given in table 4 and plotted in figure 9.
Fig. 8: Simulated waveform of 8 bit ALU
Table 4: Performance comparison of 8 bit ALU with various adder & multiplier architectures
8 bit ALU                            Power (W)       Delay (ns)   PDP (W·s)   No. of transistors
Using array multiplier               3.241411e-001   736          238.7e-9    1794
Using Wallace Tree multiplier        3.799979e-002   726          27.53e-9    1874
Using Multiplexer based multiplier   7.352331e-002   564          41.47e-9    1694
[Figure 9 plot: "Power Consumption of ALU with different multiplier architecture"; vertical axis: power
in 10^-2 watts; horizontal axis: time in nanoseconds; curves: Using Array Multiplier, Using Wallace Tree
Multiplier, Using Mux based Multiplier]
Fig. 9: Power Consumption of ALU with different multiplier architecture
Conclusion:
In this paper we split the ALU into two modules, a logic module and an arithmetic module, which is
easier than designing a bit-slice as one unit. We designed an 8-bit ALU built from an adder cell
optimized for power and delay. The optimized-PDP adder is used to construct the multiplier, and the
different ALU operations are performed using the multiplier and adder structures. The circuits are
designed in 5 micron CMOS technology and simulated with T-SPICE. The performance comparison shows that
the proposed 4T XOR based adder has the lowest delay and power-delay product of the architectures
compared.
References
Abu-Shama, E., M.B. Mazz, M.A. Bayoumi. A Fast and Low Power Multiplier Architecture, The Centre for
Advanced Computer Studies, The University of Southwestern Louisiana, Lafayette, LA 70504.
Abdulkarim Al-Sheraidah, Yingtao Jiang and Edwin Sha, 2009. A Novel Low Power Multiplexer-Based Full
Adder, European Conference on Circuit Theory and Design, August 28-31, Espoo, Finland.
Bui, H.T., A.K. Al-Sheraidah and Y. Wang, 2000. Design and Analysis of 10-Transistor Full Adders Using
Novel XOR-XNOR Gates, International Conference on Signal Processing 2000, World Computer
Congress, Beijing, China.
Goel, S., S. Gollamudi, A. Kumar, M. Bayoumi, 2004. On the Design of Low-Energy Hybrid CMOS 1-Bit
Full Adder Cells, 47th IEEE International Midwest Symp. on Circuits and Systems, 2: 209-211.
Hosseinghadiry, M., H. Mohammadi, M. Nadisenejani, 2009. Two New Low Power High Performance Full
Adders with Minimum Gates, International Journal of Electronics, Circuits and System, 3(2): 124-131.
Issam, S., A. Khater, A. Bellaouar, M.I. Elmasry, 1996. Circuit techniques for CMOS low power high
performance multipliers, IEEE J. Solid-State Circuits, 31: 1535-1544.
Bui, H.T., Y. Wang and Y. Jiang, 2002. Design and analysis of low-power 10-transistor full adders using
novel XOR-XNOR gates. IEEE Trans. on Circuits and Systems II: Analog and Digital Signal
Processing, 49: 25-30.
Jou, S.J., C.Y. Chen, E.C. Yang and C.C. Su, 1997. A pipelined multiplier accumulator using a high-speed,
low-power static and dynamic full adder design. IEEE Journal of Solid-State Circuits, 32: 114-118.
Keivan Navi and Omid Kavehei, February 2008. Low-Power and High-Performance 1-Bit CMOS Full-Adder
Cell, Journal of Computers, 3(2): 66-71.
Keivan Navi and Omid Kavehei, 2008. The Design of a High-Performance Full Adder Cell by Combining
Common Digital Gates and Majority Function, European Journal of Scientific Research, ISSN 1450-216X,
23(4): 626-638.
Lee, P.M., C.H. Hsu and Y.H. Hung, 2007. Novel 10-T full adders realized by GDI structure, Proc. of
Intl. Symposium on Integrated Circuits, pp: 115-118.
Marimuthu, C.N., P. Thangaraj, 2008. Low Power High Performance Multiplier, ICGST-PDCS, 8(1): 87-93.
Mónico Linares Aranda, Mariano Aguirre Hernández, 2011. New High Performance Full Adders Using
an Alternative Logic Structure, Computación y Sistemas, 14(3): 213-223.
Navi, K., V. Foroutan, B. Mazloomnejad, Sh. Bahrololoumi, O. Hashemipour, M. Haghparast, 2008. A Six
Transistors Full Adder, World Applied Sciences Journal, 4: 142-149.
Pekmestzi, K.Z., 1999. Multiplexer-based array multipliers. IEEE Trans. on Computers, 48: 15-23.
Prashant Gurjar, Rashmi Solanki, Pooja Kansliwal and Mahendra Vucha, 2011. VLSI Implementation of
Adders for High Speed ALU. International Journal of Computer Applications, 29(10): 11-15.
Pratibhadevi Tapashetti, A.S. Umesh, Ashalatha Kulshrestha, 2012. Design and Simulation of Energy
Efficient Full Adder for Systolic Array, International Journal of Soft Computing and Engineering
(IJSCE), ISSN: 2231-2307, 1(6): 356-360.
Radhakrishnan, D., 2001. Low-voltage low-power CMOS full adder, IEE Proceedings - Circuits, Devices and
Systems, 148: 19-24.
Ravi, N., T.S. Rao and T.J. Prasad, 2011. Performance Evaluation of Bypassing Array Multiplier with
Optimized Design. International Journal of Computer Applications, 28(5): 1-5.
Senthilpari, S., 2011. A Low-power and High-performance Radix-4 Multiplier Design Using a Modified
Pass-Transistor Logic Technique, IETE Journal of Research, 57(2): 149-155.
Shalem, R., E. John and L.K. John, 1999. A novel low power energy recovery full adder cell, Proc. of the
IEEE Great Lakes Symposium on VLSI, Feb., pp: 380-383.
Shams, A.M., M.A. Bayoumi, 2000. A Novel High Performance CMOS 1-Bit Full Adder Cell, IEEE
Trans. Circuits and Systems II: Analog and Digital Signal Process., 47(5): 256-263.
Shubhajit Roy Chowdhury, Aritra Banerjee, Aniruddha Roy, Hiranmay Saha, 2008. A high speed 8
transistor full adder design using novel 3 transistor XOR gates, International Journal of Electronics,
Circuits and System, 2(4): 217-223.
Sreenivasa Rao Ijjada, D. Srinivas and V. Malleswara Rao, 2011. Low Power and High Speed
Architecture for 32-Bit ALU Design, Journal of Information and Communication Technologies, 1(1): 1-5.
Srivastava, A. and C. Srinivasan, 2002. ALU Design using Reconfigurable CMOS Logic, Proc. of the 45th
IEEE 2002 Midwest Symposium on Circuits and Systems, 2: 663-666.
Suma, T., Hegde, Siva Yellampalli and R. Nandeesh, 2011. Design and Implementation of ALU Using
Redundant Binary Signed Digit. IJCA Proceedings on International Conference on VLSI,
Communications and Instrumentation (ICVCI), (5): 30-35.
Suzuki, K., M. Yamashina, J. Goto, Enomoto and H. Yamada, 1993. A 2.4-ns, 16-bit, 0.5-µm CMOS
arithmetic logic unit for microprogrammable video signal processor LSIs, Proc. of the IEEE Custom
Integrated Circuits Conference, 9: 12.4.1-12.4.4.
Wang, Y., Y. Jiang and E. Sha, 2001. On area-efficient low power array multipliers. Proceedings of the
8th IEEE International Conference on Electronics, Circuits and Systems, ICECS, 3: 1429-1432.
Wang, J.M., S.C. Fang and W.C. Fang, 1994. New efficient designs for XOR and XNOR functions on
transistor level, IEEE J. of Solid-State Circuits, 29: 780-786.
Yingtao Jiang, Abdulkarim Al-Sheraidah, Yuke Wang, Edwin Sha and Jin-Gyun Chung, 2009. A novel
multiplexer based low power full adder, IEEE Transactions on Circuits and Systems II: Express Briefs,
Vol. 52.
Zaid Al-bayati, Bassam Jamil Mohd and Sahel Alouneh, 2011. Low power Wallace multiplier design based on
wide counters, International Journal of Circuit Theory and Applications, 1: 26-32.
Zimmermann, R., W. Fichtner, 1997. Low-power logic styles: CMOS versus pass-transistor logic, IEEE J.
Solid-State Circuits, 32: 1079-1090.