Journal of Applied Sciences Research, 8(4): 2100-2108, 2012

ISSN 1819-544X

This is a refereed journal and all articles are professionally screened and reviewed

ORIGINAL ARTICLES

Design and Implementation of High speed ALU using Optimized PDP adder and Multiplier

1M. Kathirvelu, 2Dr. T. Manigandan

1Associate Professor, Department of ECE, KPR Institute of Engg & Tech, Coimbatore, Tamilnadu, India.

2Principal, PA College of Engineering & Technology, Pollachi, Coimbatore, Tamilnadu, India.

ABSTRACT

Digital design is present throughout daily life, in computers, calculators, video cameras and similar products. The continuing need for high speed, low power digital products makes digital design a growing field. The arithmetic logic unit (ALU) is a critical component of a microprocessor and the core of the central processing unit; it is the heart of the instruction execution portion of every computer. The ALU is a combinational circuit that performs a number of arithmetic and logical operations within a microprocessor. In this paper we split the ALU into two modules, a logic module and an arithmetic module, because designing each module separately is easier than designing a bit-slice as one unit. We designed an 8-bit ALU built from an adder cell optimized for power and speed. Different adder architectures are designed and simulated with T-SPICE. The adder designed with a 4-transistor XOR gate offers low power and the shortest delay, and is used to design the multiplier and the ALU. In general, the CMOS design style is not area efficient for complex gates with large fan-in, so care must be taken when a static logic style is selected to realize a logic function. The designed circuit is simulated in 5 micron technology with an operating voltage of 5 V using Tanner EDA.

Key words:

Introduction

In the past, ALU and full adder circuits have been implemented for optimum area and delay, each with distinct features. Some are briefly described below to give an idea of earlier work and to shed light on different optimization techniques. Past work related to multiple-input floating gate CMOS applications has also been reviewed for its operation, design and simulation issues. Bui et al designed a low power full adder called the Static Energy-Recovery Full-Adder (SERF) using 10 transistors. It uses a novel set of XOR and XNOR gates in combination with existing ones; these XOR and XNOR circuits do not connect directly to the power and ground lines, respectively. Wang et al have shown an improved version of XOR and XNOR gates that uses six transistors. In another design of a CMOS 1-bit full adder cell, Wang et al used four-transistor XOR and XNOR gates; the cell offers higher speed and lower power consumption than the standard 1-bit full adder cell. Radhakrishnan has presented the design of low power CMOS full adder circuits using transmission function theory. Suzuki et al prescribed the use of six-transistor CMOS XOR and XNOR gates. Their 16-bit, 2.4 ns, 0.5 µm CMOS ALU design consists of a logical and arithmetic unit (LAU), a magnitude comparator (CMP), an overflow detector (OVF) and a zero flag detector (ZERO). The ALU employs a binary look-ahead carry (BLC) adder. All units in this design operate in parallel, which achieves high speed.

The most efficient way to reduce the power consumption of digital circuits is to reduce the supply voltage (Hosseinghadiry et al 2009), since the average power consumption of CMOS digital circuits is proportional to the square of the supply voltage. For standard CMOS technologies, the resulting performance loss can be overcome by introducing more parallelism (Jou et al 1997) and by modifying the process to optimize it for low supply voltage operation. This paper reviews various adder architectures, presents a novel method for designing the ALU, and reports the power consumption of the ALU with different adder and multiplier architectures.
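The quadratic dependence on supply voltage mentioned above follows from the standard first-order model of dynamic (switching) power in CMOS, given here for reference. The symbols α (switching activity factor), C_L (switched load capacitance) and f (clock frequency) are not defined in this paper and are introduced only for illustration.

```latex
P_{\mathrm{dyn}} = \alpha \, C_L \, V_{DD}^{2} \, f
```

Under this model, halving V_DD at fixed α, C_L and f reduces the switching power to one quarter, which is why supply scaling is usually the first lever considered.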

Analysis of various full adder architectures:

a) Conventional CMOS Full Adder:

Corresponding Author: M. Kathirvelu, Department of ECE, KPR Institute of Engg & Tech, Coimbatore Tamilnadu, India.

E-mail: ***********@*****.***


The conventional adder is implemented with 28 transistors in CMOS technology and requires a supply of at least one volt to function properly.

b) Static Energy Recovery Full (SERF) Adder:

The Static Energy Recovery Full (SERF) adder requires only 10 transistors to implement a full adder. An intermediately generated XNOR(A, B) signal is shared to generate both the carry-out and the sum outputs (Shalem et al 1999).

c) Transmission Function Adder:

A transmission gate, or analog switch, is an electronic element that selectively blocks or passes a signal level from the input to the output. This solid-state switch comprises a pMOS transistor and an nMOS transistor. The control gates are biased in a complementary manner so that both transistors are either on or off. The transmission function full adder uses 16 transistors to realize the circuit. It occupies less area and consumes less power than the Ex-OR based adder.

d) Ex - OR and Multiplexer Based Full Adder Design:

This logic approach uses only one XOR gate and two multiplexers to implement the CARRY and SUM [12]. The XOR gate is the most power-hungry component of full adder cells, so using fewer XOR gates reduces the power consumption.

e) Multiplexer Based Full Adder:

In this architecture, multiplexers are used to design an adder (Yingtao Jiang et al 2009). Because the multiplexers do not need any supply voltage connection to function, a full adder designed with multiplexers may avoid leakage and short-circuit problems. The multiplexers are implemented using NMOS and PMOS pass transistors. The select line of each 2-to-1 multiplexer can be driven by any one of the full adder inputs. The CARRY function of the full adder can be implemented using the general equation

$CARRY = \bar{a}bc + a\bar{b}c + ab\bar{c} + abc$    (1)

Reducing this equation gives

$CARRY = \bar{a}bc + ab\bar{c} + ac$    (2)

Implementing the sum and carry functions with equation 2 requires 6 identical multiplexers.

f) Proposed Multiplexer Based 12T Full Adder:

The adder proposed in this paper is designed with multiplexers. The 1-bit full adder uses 6 identical multiplexers; substituting each multiplexer with a 2-transistor circuit gives the new MBA-12T adder, which requires a total of 12 transistors to realize the full adder function. The carry function of the multiplexer based adder discussed above is implemented with equation 2. There, input a changes on every cycle, which switches transistors on every clock cycle and results in more switching power. In the proposed adder the carry function is implemented with equation 3.

$CARRY = a\bar{b}c + ab\bar{c} + bc$    (3)

In the proposed adder, inputs b and c are used as the select inputs of the multiplexers. Input b changes only once every two clock cycles, so fewer transistors switch and the switching power is lower. The MBA-12T adder is shown in figure 1.
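The equivalence of the three carry expressions can be checked exhaustively. The sketch below writes each expression with explicit complements; the placement of the complements is a reconstruction (the overbars are not legible in this copy), and the helper names and the binary-count input sequence are assumptions made for illustration.

```python
from itertools import product


def carry_eq1(a, b, c):
    """Sum of the four carry minterms (assumed form of equation 1)."""
    na, nb, nc = 1 - a, 1 - b, 1 - c
    return (na & b & c) | (a & nb & c) | (a & b & nc) | (a & b & c)


def carry_eq2(a, b, c):
    """Reduced form ending in the term ac (assumed form of equation 2)."""
    na, nc = 1 - a, 1 - c
    return (na & b & c) | (a & b & nc) | (a & c)


def carry_eq3(a, b, c):
    """Reduced form ending in the term bc (assumed form of equation 3)."""
    nb, nc = 1 - b, 1 - c
    return (a & nb & c) | (a & b & nc) | (b & c)


def majority(a, b, c):
    """Reference carry-out: 1 when at least two inputs are 1."""
    return 1 if a + b + c >= 2 else 0


def all_forms_agree():
    """Compare all three expressions with the reference on all 8 input rows."""
    return all(
        carry_eq1(*v) == carry_eq2(*v) == carry_eq3(*v) == majority(*v)
        for v in product((0, 1), repeat=3)
    )


def toggles(bits):
    """Count the transitions in a sequence of bit values."""
    return sum(1 for prev, cur in zip(bits, bits[1:]) if prev != cur)


# Assumed stimulus: a 3-bit binary count in which a is the fastest-changing input.
count_sequence = [(n & 1, (n >> 1) & 1, (n >> 2) & 1) for n in range(8)]
a_toggles = toggles([v[0] for v in count_sequence])
b_toggles = toggles([v[1] for v in count_sequence])
```

With this stimulus, input a toggles on every step while b toggles half as often, which is the switching-activity argument made above for selecting on b rather than a.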

In addition to reduced transition activity and charge recycling capability, this circuit has no direct connections to the power supply nodes, and all of its signal gates are directly excited by fresh input signals, leading to a noticeable reduction in short-circuit power consumption. There are three major sources of power dissipation in a digital CMOS circuit: logic transitions, short-circuit current and leakage current. The short-circuit current is the direct current passing from the supply to ground when both the NMOS and PMOS transistors are simultaneously active. Other adder cells suffer from the same short-circuit current problem because some of their internal nodes are driven by signals with slow rise and/or fall times. This leads to significant (20%) short-circuit power dissipation for loaded inverters. The problem was partially solved in SERF adders through the absence of a connection to the Vss port, so that no direct path from supply to ground can be formed.

Fig. 1: Proposed Multiplexer-Based 12-Transistor Circuit

The new MBA-12T adder moves one step further and provides the best solution to the short-circuit current problem, as all of its internal gate nodes are directly excited by fresh input signals. On top of that, the MBA-12T has no direct connections to the Vdd or Vss ports, which substantially reduces the probability of a direct path forming from the positive supply to ground during switching.

g) Proposed 4T XOR Adder:

XOR gates form the fundamental building block of full adders, so enhancing the performance of the XOR gates can significantly improve the performance of the adder. A survey of the literature reveals a wide spectrum of XOR gate types realized over the years. Early XOR gate designs were based on either eight or six transistors and are conventionally used in most designs. In this paper considerable emphasis is laid on the design of a four-transistor XOR gate, shown in figure 2.

[Block diagram: inputs a, b and c; two XOR blocks; one MUX block; outputs sum and carry]

Fig. 2: Proposed XOR based 10 T Adder
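The labels that survive from figure 2 (inputs a, b and c, two XOR blocks, one multiplexer, outputs sum and carry) match a common textbook arrangement in which a XOR b drives both the second XOR and the multiplexer select. The behavioural sketch below models that arrangement; the wiring is an assumption for illustration, not a transcription of the authors' schematic, and the function names are hypothetical.

```python
from itertools import product


def xor_gate(x, y):
    """Two-input XOR on single bits."""
    return x ^ y


def mux2(select, d0, d1):
    """2-to-1 multiplexer: passes d0 when select is 0, d1 when select is 1."""
    return d1 if select else d0


def full_adder(a, b, c):
    """Full adder built from two XOR gates and one multiplexer (assumed wiring)."""
    p = xor_gate(a, b)            # propagate signal
    total = xor_gate(p, c)        # sum = a XOR b XOR c
    # When a == b (p = 0) the carry equals a; otherwise it equals the carry-in c.
    carry = mux2(p, a, c)
    return total, carry


def matches_arithmetic():
    """Check the model against integer addition for all 8 input rows."""
    for a, b, c in product((0, 1), repeat=3):
        total, carry = full_adder(a, b, c)
        if 2 * carry + total != a + b + c:
            return False
    return True
```

If each XOR takes four transistors and the multiplexer two, as with the 2-transistor multiplexer used in the 12T design, this structure would account for the ten transistors named in the caption; that count is an inference, not a figure stated in the text.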

The full adder structures discussed above are simulated with Tanner SPICE, and their dynamic power consumption is given in table 1. The proposed adder designed with multiplexers consumes less power, has lower delay, and is used to design a multiplier. The output waveform of the proposed structure, shown in figure 3, is similar to that of the other structures, while the proposed structure consumes less power and has lower delay than the other architectures discussed.


Fig. 3: Output waveform of proposed adder

Table 1: Power Comparison between the adders

S.No  Name of the adder                      Avg. Power  Delay in ns  PDP
1     CMOS adder                             3.51E-05    1.2          4.21E-05
2     26 Transistor adder                    1.22E-04    1.8          219.6E-06
3     Transmission Function adder            2.85E-05    2.6          7.41E-05
4     SERF adder                             1.20E-05    2.5          30E-06
5     XOR and MUX based adder                1.54E-05    1.2          18.48E-06
6     Multiplexer based adder [18]           1.55E-05    1.8          2.79E-05
7     Proposed Multiplexer based 12T adder   1.39E-05    0.8          11.12E-06
8     Proposed 4T XOR adder                  1.52E-05    0.6          9.12E-06
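The PDP column of table 1 equals the average power multiplied by the delay expressed in nanoseconds. A minimal sketch that recomputes and ranks it from the tabulated values follows; the variable and function names are hypothetical, and treating the power values as watts is an assumption based on the axis label of figure 4.

```python
# (name, average power, delay in ns) copied from table 1.
ADDERS = [
    ("CMOS adder", 3.51e-05, 1.2),
    ("26 Transistor adder", 1.22e-04, 1.8),
    ("Transmission Function adder", 2.85e-05, 2.6),
    ("SERF adder", 1.20e-05, 2.5),
    ("XOR and MUX based adder", 1.54e-05, 1.2),
    ("Multiplexer based adder", 1.55e-05, 1.8),
    ("Proposed Multiplexer based 12T adder", 1.39e-05, 0.8),
    ("Proposed 4T XOR adder", 1.52e-05, 0.6),
]


def pdp(power, delay_ns):
    """Power-delay product in the table's convention: power times delay in ns."""
    return power * delay_ns


def rank_by_pdp(rows):
    """Return (name, PDP) pairs sorted from lowest to highest PDP."""
    pairs = [(name, pdp(power, delay_ns)) for name, power, delay_ns in rows]
    return sorted(pairs, key=lambda pair: pair[1])


ranking = rank_by_pdp(ADDERS)

if __name__ == "__main__":
    for name, value in ranking:
        print(f"{name:40s} {value:.3e}")
```

Ranking by PDP rather than by power alone matters here: the SERF adder has the lowest average power in the table, but its longer delay places it behind both proposed adders once delay is taken into account.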

The average power consumption of the various adder architectures is plotted in figure 4. The proposed adder gives the best trade-off between power and delay and is used in the multiplier.

[Plot: Power Comparison of adders. Vertical axis: power in 10^-5 watts; horizontal axis: time in nano sec. Series: CMOS, 26T, MUX based, SERF, XNOR adder, proposed adder.]

Fig. 4: Average power consumption of different adders.


Arithmetic Logic Unit Design:

An arithmetic logic unit is a digital circuit that performs arithmetic and logical operations. The ALU is a fundamental building block of the central processing unit (CPU) of a computer, and even the simplest microprocessors contain one for purposes such as maintaining timers. The processors found inside modern CPUs and graphics processing units (GPUs) accommodate very powerful and very complex ALUs; a single component may contain a number of ALUs. Most of a processor's operations are performed by one or more ALUs. An ALU loads data from input registers, an external control unit commands the ALU to perform the required operation on the data, and the ALU stores the result in an output register (Suma et al 2011). The inputs to the ALU are the data to be operated on (called operands) and a code from the control unit indicating which operation to perform. Most ALUs can perform the following operations:

Integer arithmetic operations (addition, subtraction, and sometimes multiplication and division, though these are more expensive)

Bitwise logic operations (AND, NOT, OR, XOR)

Bit-shifting operations (shifting or rotating a word by a specified number of bits to the left or right, with or without sign extension).

When designing the ALU we used the "divide and conquer" principle to obtain a modular design that consists of smaller, more manageable blocks, some of which can be re-used. Instead of designing the 8-bit ALU as one circuit, we first design a one-bit ALU, also called a bit-slice. These bit-slices are then put together to make an 8-bit ALU.
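The bit-slice idea can be illustrated for the addition path alone: one 1-bit full adder is instantiated eight times, with the carry rippling from each slice to the next. This is a behavioural sketch under that assumption; the function names are hypothetical, and the paper's actual slice also contains logic and selection circuitry not modelled here.

```python
def full_adder(a, b, carry_in):
    """One-bit slice: returns (sum bit, carry out) for single-bit inputs."""
    total = a + b + carry_in
    return total & 1, total >> 1


def ripple_add(x, y, carry_in=0, width=8):
    """Chain `width` one-bit slices, passing the carry from slice i to slice i + 1."""
    result = 0
    carry = carry_in
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry


if __name__ == "__main__":
    value, carry_out = ripple_add(200, 100)
    print(value, carry_out)
```

Because every slice is identical, widening the datapath only changes the `width` argument, which is the re-use benefit the modular approach aims for.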

Fig. 5: Block diagram of one bit ALU

There are different ways to design a bit-slice of the ALU. One method is to form the truth table with 6 inputs (M, S1, S0, C0, Ai and Bi) and two outputs (Fi and Ci+1), but forming this table is complex. An alternative is to split the ALU into two modules, a logic module and an arithmetic module. Designing each module separately is easier than designing a bit-slice as one unit. The block diagram of the ALU is shown in figure 5 with three modules: a 2:1 MUX, a logic unit and an arithmetic unit. In the ALU design, multiplication is the most demanding operation for the processor to compute. In this paper we designed three different multiplier architectures, namely the array multiplier, the Wallace tree multiplier and the multiplexer based multiplier, for 8-bit multiplication. The proposed XOR based 10-transistor adder is used to construct the multipliers in each architecture, and power and delay are computed with Tanner SPICE. The power consumption of the different multiplier architectures is shown in table 2 and plotted in figure 6.

Table 2: Performance comparison of different multiplier architecture

Architecture                         Power (Watts)   Delay (Nano sec)  PDP (WS)   No. of Transistors
8 bit array multiplier               3.508697e-002   44                1.564e-9   944
8 bit Wallace Tree multiplier        3.799979e-002   55                2.095e-9   1024
8 bit Multiplexer based multiplier   2.548114e-002   68                1.773e-9   844


[Plot: Power comparison of Multipliers. Vertical axis: power in 10^-2 watts; horizontal axis: time in ns. Series: Array Multiplier, Wallace Tree Multiplier, Mux based Multiplier.]

Fig. 6: power comparison of different multiplier architecture

An 8-bit ALU has been designed for 5.0 V operation. The full adder is implemented using the proposed 4T XOR gate. The 8-bit ALU performs multiplication, addition, barrel shift, subtraction, XOR, NAND and NOR operations. The result of every computation is obtained from the output of an 8-to-1 multiplexer. The select signals of the multiplexer decide the operation to be performed, and the corresponding inputs and output are selected accordingly. The schematic structure of the designed 8-bit ALU is shown in figure 7, and the operation performed for each select input is shown in table 3.

Table 3: ALU Operation for various select inputs

Select Input    Operation
S2  S1  S0
0   0   0       NAND
0   0   1       NOR
0   1   0       XOR
0   1   1       SUBTRACTION
1   0   0       SHIFT
1   0   1       ADDITION
1   1   0       MULTIPLICATION (P0-P7)
1   1   1       MULTIPLICATION (P8-P15)
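The select-code mapping of table 3 can be expressed as a small behavioural model in which the select value picks one 8-bit result, as the 8-to-1 output multiplexer does. Several details below are assumptions that the paper does not specify: the function name, the wrap-around (modulo 256) treatment of addition and subtraction, and in particular the SHIFT operation, which is modelled as a logical left shift of A by the low three bits of B.

```python
MASK = 0xFF  # 8-bit data path


def alu(select, a, b):
    """Behavioural model of the 8-bit ALU; select is the 3-bit code S2 S1 S0."""
    a &= MASK
    b &= MASK
    product = a * b  # 16-bit product P15..P0
    if select == 0b000:
        return ~(a & b) & MASK           # NAND
    if select == 0b001:
        return ~(a | b) & MASK           # NOR
    if select == 0b010:
        return a ^ b                     # XOR
    if select == 0b011:
        return (a - b) & MASK            # SUBTRACTION (assumed modulo 256)
    if select == 0b100:
        return (a << (b & 0b111)) & MASK  # SHIFT (assumed logical left shift)
    if select == 0b101:
        return (a + b) & MASK            # ADDITION (assumed modulo 256)
    if select == 0b110:
        return product & MASK            # MULTIPLICATION, bits P0-P7
    if select == 0b111:
        return (product >> 8) & MASK     # MULTIPLICATION, bits P8-P15
    raise ValueError("select must be a 3-bit value")


if __name__ == "__main__":
    for code in range(8):
        print(f"{code:03b} -> {alu(code, 200, 100):#04x}")
```

Splitting the 16-bit product across two select codes lets the full result of an 8 x 8 multiplication be read through the same 8-bit output in two steps.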

Fig. 7: Schematic structure of 8 bit ALU


Result:

The designed ALU is simulated in 5 micron CMOS technology. The 8-bit input data is applied to the ALU and, based on the select input, the outputs of the various operations are taken from the multiplexer, as shown in figure 8. The power consumption of the ALU with different adder and multiplier architectures is shown in table 4 and plotted in figure 9.

Fig. 8: Simulated waveform of 8 bit ALU

Table 4: Performance comparison of 8 bit ALU with various adder & multiplier architecture

8 bit ALU                            Power (Watts)   Delay (Nano sec)  PDP (WS)   No. of Transistors
Using array multiplier               3.241411e-001   736               238.7e-9   1794
Using Wallace Tree multiplier        3.799979e-002   726               27.53e-9   1874
Using Multiplexer based multiplier   7.352331e-002   564               41.47e-9   1694


[Plot: Power Consumption of ALU with different multiplier architecture. Vertical axis: power in 10^-2 watts; horizontal axis: time in nano sec. Series: Using Array Multiplier, Using Wallace Tree Multiplier, Using Mux based Multiplier.]

Fig. 9: Power Consumption of ALU with different multiplier architecture

Conclusion:

In this paper we split the ALU into two modules, a logic module and an arithmetic module, which is easier than designing a bit-slice as one unit. We designed an 8-bit ALU built from an adder cell optimized for power and delay. The adder with the optimized PDP is used to construct the multiplier, and the multiplier and adder structures together perform the different ALU operations. The circuits are designed in 5 micron CMOS technology and simulated with T-SPICE. The performance comparison shows that the proposed 4T XOR based adder has the lowest delay and power-delay product of the architectures compared.

References

Abu-Shama, E., M.B. Mazz, M.A. Bayoumi. A Fast and Low Power Multiplier Architecture, The Centre for Advanced Computer Studies, The University of Southwestern Louisiana, Lafayette, LA 70504.

Abudulkarim Al-Sheridah, Yingtao Jiang and Edwin Sha, 2009. A Novel Low Power Multiplexer-Based Full Adder, European Conference on Circuit Theory and Design, August 28-31, Espoo, Finland.

Bui, H.T., A.K. Al-Sheraidah and Y. Wang, 2000. Design and Analysis of 10-Transistor Full Adders Using Novel XOR-XNOR Gates, International Conference on Signal Processing 2000, World Computer Congress, Beijing, China.

Goel, S., S. Gollamudi, A. Kumar, M. Bayoumi, 2004. On the Design of Low-Energy Hybrid CMOS 1-Bit Full Adder Cells, 47th IEEE International Midwest Symp. on Circuits and Systems, 2: 209-211.

Hosseinghadiry, M., H. Mohammadi, M. Nadisenejani, 2009. Two New Low Power High Performance Full Adders with Minimum Gates, International Journal of Electronics, Circuits and System, 3(2): 124-131.

Issam, S., A. Khater, A. Bellaouar, M.I. Elmasry, 1996. Circuit techniques for CMOS low power high performance multipliers, IEEE J. Solid-State Circuit, 31: 1535-1544.

Bui, H.T., Y. Wang and Y. Jiang, 2002. Design and analysis of low-power 10-transistor full adders using novel XOR-XNOR gates, IEEE Trans. on Circuits and Systems-II: Analog and Digital Signal Processing, 49: 25-30.

Jou, S.J., C.Y. Chen, E.C. Yang and C.C. Su, 1997. A pipelined multiplier accumulator using a high-speed, low-power static and dynamic full adder design, IEEE Journal of Solid-State Circuits, 32: 114-118.

Keivan Navi and Omid Kavehei, February 2008. Low-Power and High-Performance 1-Bit CMOS Full-Adder Cell, Journal of Computers, 3(2): 66-71.

Keivan Navi and Omid Kavehei, 2008. The Design of a High-Performance Full Adder Cell by Combining Common Digital Gates and Majority Function, European Journal of Scientific Research, ISSN 1450-216X, 23(4): 626-638.

Lee, P.M., C.H. Hsu and Y.H. Hung, 2007. Novel 10-T full adders realized by GDI structure, Proc. on Intl. Symposium on Integrated Circuits, pp: 115-118.

Marimuthu, C.N., P. Thangaraj, 2008. Low Power High Performance Multiplier, ICGST-PDCS, 8(1): 87-93.

Mónico Linares Aranda, Mariano Aguirre Hernández, 2011. New High Performance Full Adders Using an Alternative Logic Structure, Computación y Sistemas, 14(3): 213-223.


Navi, K., V. Foroutan, B. Mazloomnejad, Sh. Bahrololoumi, O. Hashemipour, M. Haghparast, 2008. A Six Transistors Full Adder, World Applied Sciences Journal, 4: 142-149.

Pekmestzi, K.Z., 1999. Multiplexer-based array multipliers, IEEE Trans. on Computers, 48: 15-23.

Prashant Gurjar, Rashmi Solanki, Pooja Kansliwal and Mahendra Vucha, 2011. VLSI Implementation of Adders for High Speed ALU, International Journal of Computer Applications, 29(10): 11-15.

Pratibhadevi Tapashetti, A.S. Umesh, Ashalatha Kulshrestha, 2012. Design and Simulation of Energy Efficient Full Adder for Systolic Array, International Journal of Soft Computing and Engineering (IJSCE), ISSN: 2231-2307, 1(6): 356-360.

Radhakrishnan, D., 2001. Low-voltage low-power CMOS full adder, IEE Proceedings-Circuits, Devices and Systems, 148: 19-24.

Ravi, N., T.S. Rao and T.J. Prasad, 2011. Performance Evaluation of Bypassing Array Multiplier with Optimized Design, International Journal of Computer Applications, 28(5): 1-5.

Senthilpari, S., 2011. A Low-power and High-performance Radix-4 Multiplier Design Using a Modified Pass-Transistor Logic Technique, IETE Journal of Research, 57(2): 149-155.

Shalem, R., E. John and L.K. John, 1999. A novel low power energy recovery full adder cell, Proc. of the IEEE Great Lakes Symposium of VLSI, Feb., pp: 380-383.

Shams, A.M., M.A. Bayoumi, 2000. A Novel High Performance CMOS 1 Bit Full Adder Cell, IEEE Trans. Circuits and Systems II: Analog Digital Signal Process., 47(5): 256-263.

Shubhajit Roy Chowdhury, Aritra Banerjee, Aniruddha Roy, Hiranmay Saha, 2008. A high speed 8 transistor full adder design using novel 3 transistor XOR gates, International Journal of Electronics, Circuits and System, 2(4): 217-223.

Sreenivasa Rao Ijjada, D. Srinivas and Dr. V. Malleswara Rao, 2011. Low Power and High Speed Architecture for 32-Bit ALU Design, Journal of Information and Communication Technologies, 1(1): 1-5.

Srivastava, A. and C. Srinivasan, 2002. ALU Design using Reconfigurable CMOS Logic, Proc. of the 45th IEEE 2002 Midwest Symposium on Circuits and Systems, 2: 663-666.

Suma, T., Hegde, Dr. Siva Yellampalli and R. Nandeesh, 2011. Design and Implementation of ALU Using Redundant Binary Signed Digit, IJCA Proceedings on International Conference on VLSI, Communications and Instrumentation (ICVCI), (5): 30-35.

Suzuki, K., M. Yamashina, J. Goto, Enomoto and H. Yamada, 1993. A 2.4-ns, 16-bit, 0.5-µm CMOS arithmetic logic unit for microprogrammable video signal processor LSIs, Proc. of the IEEE Custom Integrated Circuits Conference, 9: 12.4.1-12.4.4.

Wang, Y., Y. Jiang and E. Sha, 2001. On area-efficient low power array multipliers, Proceedings of the 8th IEEE International Conference on Electronics, Circuits and Systems, ICECS, 3: 1429-1432.

Wang, J.M., S.C. Fang and W.C. Fang, 1994. New efficient designs for XOR and XNOR functions on transistor level, IEEE J. of Solid State Circuits, 29: 780-786.

Yingtao Jiang, Abdulkarim Al-Sheraidah, Yuke Wang, Edwin Sha and Jin-Gyun Chung, 2009. A novel multiplexer based low power full adder, IEEE Transactions on Circuits and Systems II: Express Briefs, Vol. 52.

Zaid Al-bayati, Bassam Jamil Mohd and Sahel Alouneh, 2011. Low power Wallace multiplier design based on wide counters, International Journal of Circuit Theory and Applications, 1: 26-32.

Zimmermann, R., W. Fichtner, 1997. Low-power logic styles: CMOS versus pass-transistor logic, IEEE J. Solid-State Circuits, 32: 1079-1090.


