# Area Efficient High Speed Vedic Multiplier

Ketan Thakur, Tripti Sharma

Abstract: Very large scale integration (VLSI) is the process of integrating hundreds of thousands of transistors or devices into a single chip. VLSI design can be divided into two fields, frontend and backend; digital VLSI design falls under frontend design. Multiplication is an arithmetic operation that is important for digital signal processing (DSP) and for processors, and the multiplier is a principal hardware block of a digital circuit. More than 70% of the operations in a digital circuit are either additions or multiplications. Because these operations dominate the execution time, fast multipliers are needed. A good multiplier should offer high speed, low power consumption and small area. Vedic multipliers are fast and occupy little area. They are based on the Vedic mathematics sutra "Urdhva Tiryakbhyam". This paper reviews high-speed multipliers and the adder structures used in them.

Keywords: VLSI, DSP, Vedic, multiplier.

## I. INTRODUCTION

Multiplication can be regarded as repeated addition. It is one of the major operations in both the digital and the analog domain. In the digital domain, multipliers are used in digital signal processing, in image processing and in various computer arithmetic techniques. In the analog domain, an analog multiplier is a circuit whose output is proportional to the product of its two inputs; it is most commonly built around an operational amplifier, which keeps the implementation simple. The most common multiplier operation in the digital domain is binary multiplication. A binary multiplier is a combinational logic circuit that multiplies two binary numbers, known as the multiplicand and the multiplier; the result is known as the product. The multiplicand and multiplier can have various bit sizes, and the bit size of the product equals the sum of the bit sizes of the multiplier and the multiplicand. A binary multiplier is built from binary adders. Binary multiplication works in the same way as decimal multiplication. For operands wider than 1 bit it has two steps: AND gates first form the bit products, and adders then sum these products to give the final product. The schematic of the multiplier changes as the bit size increases. Efforts are continuously being made to obtain a high-performance multiplier that can keep up with current hardware. A multiplier architecture has three stages: partial product generation, partial product addition and final addition. To speed up the whole process, at least one of the three stages must be made faster, and to reduce the hardware, truncation of the multiplication is necessary. An array multiplier is simply the pen-and-paper method implemented in hardware.
An N×M-bit array multiplication requires N×M AND gates and (N-1)M adders [2]-[7]. Most of the area is devoted to adding the partial products. The Vedic multiplier is used to address this issue. The word 'Vedic' is derived from 'Veda', which means the storehouse of all knowledge. Vedic mathematics is based mainly on 16 sutras (aphorisms) dealing with various branches of mathematics such as arithmetic, algebra and geometry. The sutra chosen here is 'Urdhva Tiryakbhyam', which many implementers use as the multiplication method. It is a vertical and crosswise technique, and the result is obtained with the help of an appropriate adder. The main feature of the method is that the partial products of the multiplication are generated in advance. These partial products are then added according to the Vedic algorithm to obtain the final product. This reduces the delay and thus saves time.
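The vertical and crosswise procedure can be illustrated with a small behavioural model. The sketch below is only an assumption about how the steps map to software (the function and helper names are hypothetical and not taken from the cited designs): every step collects the AND products of one column, adds the carry from the previous step, and emits one product bit.

```python
def to_bits(value, width):
    """Return the bits of value, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Rebuild an integer from a list of bits (LSB first)."""
    return sum(bit << i for i, bit in enumerate(bits))


def urdhva_multiply(a_bits, b_bits):
    """Multiply two equal-length bit lists column by column (LSB first)."""
    n = len(a_bits)
    assert len(b_bits) == n
    product_bits = []
    carry = 0
    # Step k gathers every crosswise product a[i] & b[k - i].
    # There are 2n - 1 steps, i.e. 15 steps for 8-bit operands.
    for k in range(2 * n - 1):
        column = carry
        for i in range(max(0, k - n + 1), min(k, n - 1) + 1):
            column += a_bits[i] & b_bits[k - i]
        product_bits.append(column & 1)  # bit kept in this column
        carry = column >> 1              # rest moves to the next column
    # The remaining carry supplies the most significant product bit.
    while len(product_bits) < 2 * n:
        product_bits.append(carry & 1)
        carry >>= 1
    return product_bits


print(from_bits(urdhva_multiply(to_bits(13, 4), to_bits(11, 4))))  # 143
```

In hardware all column products are formed in parallel by AND gates, and only the column additions depend on one another; the loop above serializes them purely for readability.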

[Figure 1 image: line diagrams of Steps 1 to 15 of the vertical and crosswise pattern]

**Figure 1.** Vedic multiplier: crosswise and vertical technique [2].

Figure 1 shows the 'Urdhva Tiryakbhyam' method for an 8-bit multiplication. The arrows in the figure show which digits are multiplied in each step, and the resulting sums are combined in the final step. A compressor is a logic circuit intended to improve computational speed. It changes the way the adder cells are connected by introducing a new path, which limits the propagation of the carry signal, so that a much better multiplier tree is formed. It is a basic combination of gates that tends to minimize uneven carry propagation, and it accepts more input values than a conventional half or full adder.

Figure 2 shows the conventional 4:2 compressor, which has four primary inputs and two primary outputs (sum and carry), together with a carry input and a carry output. It is composed of two full adders connected in series. A basic compressor circuit may be used to implement higher-order compressors such as 10-4, 15-4 and 20-5.
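A behavioural sketch of the conventional 4:2 compressor is given below. It assumes the usual textbook arrangement in which the sum of the first full adder feeds the second one; the signal names (`x1` to `x4`, `cin`, `cout`) are illustrative and are not taken from Figure 2.

```python
def full_adder(a, b, c):
    """One-bit full adder: returns (sum, carry)."""
    s = a ^ b ^ c
    carry = (a & b) | (b & c) | (a & c)
    return s, carry


def compressor_4_2(x1, x2, x3, x4, cin):
    """4:2 compressor built from two cascaded full adders.

    Returns (sum, carry, cout) such that
    x1 + x2 + x3 + x4 + cin == sum + 2 * (carry + cout).
    """
    s1, cout = full_adder(x1, x2, x3)       # cout does not depend on cin
    total, carry = full_adder(s1, x4, cin)
    return total, carry, cout
```

Because `cout` is produced by the first full adder alone, a chain of such cells does not ripple the carry through more than one cell, which is the property that limits carry propagation.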

Revised Manuscript Received on July 22, 2019

Ketan Thakur, ECE, Chandigarh University, Mohali, India. Tripti Sharma, ECE, Chandigarh University, Mohali, India.



**Figure 2**. Conventional 4:2 compressor[9].

## II. LITERATURE REVIEW

Multiplication is a basic arithmetic operation performed by digital circuits. Early multipliers were based on array multiplication. The technique presented for the Dadda scheme uses (3,2) and (2,2) counters and a carry look-ahead adder [1]. As the number of inputs increases, the complexity of a carry look-ahead adder grows and its speed drops. The algorithm described in [1] shortens the carry look-ahead adder and thereby increases its speed, which also reduces the logic complexity. The multiplication is carried out in three steps:

i.) Generating the bit-product matrix with an array of AND gates.

ii.) Reducing the bit-product matrix to an equivalent two-row matrix with counters.

iii.) Generating the product by summing the two rows with a carry look-ahead adder.

The least significant product bits are obtained in step ii), which reduces the number of columns in the two-row matrix and hence the size of the carry look-ahead adder. The approach is employed in the Dadda multiplier. The first step of an N×N-bit multiplication is to generate the bit-product matrix using an array of N×N AND gates. In the second step, counters reduce the N rows to an equivalent two-row matrix in S stages.
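The number of reduction stages S can be estimated with the standard Dadda height sequence 2, 3, 4, 6, 9, 13, ... in which each target height is 1.5 times the previous one, rounded down. Using this sequence is an assumption made here for illustration; the exact stage rule of the scheme in [1] is not reproduced in this paper.

```python
def dadda_stage_heights(n_rows):
    """Target matrix heights for reducing n_rows rows to two rows.

    Heights follow d_1 = 2, d_(j+1) = floor(1.5 * d_j) and are returned
    in the order they are applied (largest first). The number of
    heights is the number of reduction stages S.
    """
    heights = []
    d = 2
    while d < n_rows:
        heights.append(d)
        d = d * 3 // 2
    return heights[::-1]


print(dadda_stage_heights(8))        # [6, 4, 3, 2]
print(len(dadda_stage_heights(16)))  # 6
```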

The described scheme has the following features:

i.) An array of $N\times N$ AND gates generates the bit-product matrix of $N\times N$ bits.

ii.) The N-row matrix is transformed into a two-row matrix in S stages.

iii.) The operations in stages i = 1 to S can be generalized as follows. Each stage i

- uses a half adder for (i+1) columns,

- uses full adders for the rest,

- or uses a half-adder counter for rows.

This reduces the length of the carry look-ahead adder from 22 to 17.

Array and parallel multipliers such as the Dadda multiplier suffer from carry accumulation and long carry propagation paths. The Vedic multiplier, which uses the ancient Indian system of mathematics, is employed instead. This system is described in the ancient Indian scriptures called the Vedas [2]. The Nikhilam Sutra is one of the sutras for arithmetic operations. It performs the multiplication by finding the complement of the larger number from its nearest base. The multiplications are done in parallel and are followed by addition, which reduces the multiplication to a shorter form. The resulting multiplier has minimum area and high speed, but the complexity imposed by the design is much higher than that of other designs. The Vedic multiplier approach can be implemented with different methods [3]. One such method is a pipelined architecture, which partitions larger operands into smaller ones to obtain the result. A nibble is taken as the leaf cell, which makes modification easier. The other approach to n-bit multiplication requires n\*4 AND gates, whereas this method uses only n AND gates. It therefore has less area and power consumption, and its speed increases. During multiplication the partial products are added with half and full adders. Adding more bits at once is difficult, however, because a half adder can add only 2 bits and a full adder only 3 bits [4]; this leads to additional hardware and logic. A compressor is a digital circuit capable of adding more than 3 bits with much less hardware, which improves efficiency in terms of speed.
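As a small illustration of the Nikhilam Sutra mentioned earlier in this section, the sketch below multiplies two numbers through their complements from a common base. It is a plain arithmetic model with hypothetical names, not the hardware architecture of [2], and it assumes the caller chooses a base close to both operands (for example a power of ten or a power of two).

```python
def nikhilam_multiply(a, b, base):
    """Multiply a and b using their complements from the given base."""
    comp_a = base - a          # deficiency of a from the base
    comp_b = base - b          # deficiency of b from the base
    left = a - comp_b          # cross subtraction, equal to b - comp_a
    right = comp_a * comp_b    # small multiplication of the complements
    return left * base + right


# 97 x 96 with base 100: complements 3 and 4, left part 93, right part 12.
print(nikhilam_multiply(97, 96, 100))  # 9312
```

The benefit is that the only true multiplication involves the two complements, which are small when the operands lie close to the base.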



Figure 3. 4:2 Compressor[4].

Figure 3 shows a 4:2 compressor, which is capable of adding 4 bits and producing 3 output bits. The critical path for adding 5 bits is thus shorter than with adders. The reported speed increase is 66.6% in comparison with the half adder and full adder design. The "Urdhva Tiryakbhyam" multiplier has 12 stages, compared with 15 stages for the traditional half adder and full adder multiplier. The Vedic multiplier using compressors has 1% less area than the traditional multiplier, so the new design improves both area and speed.

Multipliers have different uses in digital systems; one of them is digital image processing [5]. The design of the Q-format signed multiplier includes an 'Urdhva Tiryakbhyam' integer multiplier with certain modifications. A 16-bit Q15 multiplier produces a Q15 output that is 16 bits long. The bits of a number carry different information; for example, if the MSB is 1 the number is negative, so its 2's complement is taken before the multiplication. Because the MSB carries only the sign, it is excluded and a '0' is placed in front for the multiplication. The basic block of both the Q15 and the Q31 multiplier is the 4×4 "Urdhva Tiryakbhyam" integer multiplier, which in turn is built from 2×2 multiplier blocks. The design is 1.12 times faster than the Xilinx parallel multiplier (IP) and 1.25 times faster than the Booth multiplier.

A multiplier has three stages: partial product generation, partial product addition and final addition [6]. Figure 4 shows the flow chart of the multiplier and how it differs from the traditional array multiplier.
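The sign handling described for the Q15 multiplier can be sketched as follows. The code is an illustrative model only: the function name is hypothetical, the built-in integer product stands in for the unsigned 'Urdhva Tiryakbhyam' core, and the saturation of the single overflow case (-1 times -1) is an assumption that is not stated in [5].

```python
def q15_multiply(x, y):
    """Multiply two Q15 numbers given as 16-bit two's complement patterns."""
    negative = ((x ^ y) >> 15) & 1                    # sign of the result
    # Take the 2's complement of negative operands to get magnitudes.
    mag_x = ((~x + 1) & 0xFFFF) if x & 0x8000 else x
    mag_y = ((~y + 1) & 0xFFFF) if y & 0x8000 else y
    # Unsigned integer multiplication, then rescale from Q30 back to Q15.
    product = (mag_x * mag_y) >> 15
    product = min(product, 0x7FFF)                    # assumed saturation
    # Restore the sign with another 2's complement if required.
    return ((~product + 1) & 0xFFFF) if negative else product


print(hex(q15_multiply(0x4000, 0x4000)))  # 0x2000, i.e. 0.5 * 0.5 = 0.25
print(hex(q15_multiply(0xC000, 0x4000)))  # 0xe000, i.e. -0.5 * 0.5 = -0.25
```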



Figure 4. Algorithm Flow Chart [6]

To make the multiplier fast, one of the three stages, namely the partial product stage, can be improved. The improvement can be made with a Sklansky tree adder, which performs much better than the ripple carry adder. The speed improves, with 33.3% lower latency than the Wallace tree multiplier, and the design is also 21% faster than the Dadda multiplier. The tree adder also reduces the power consumption. The idea of the tree format comes from the class of parallel look-ahead adders: the adder tree generates all carry input bits simultaneously, which reduces the delay to the order of log N.

A compressor is also used in a multiplier to ease implementation for higher bit widths [7]. The 4:2 and 7:2 compressors serve as the basic design units. The multiplier has fewer parallel stages than the normal multiplier, and the hardware is redesigned to contain only the required stages.

To meet the demand for high-performance arithmetic multipliers, changes are often made by considering individual elements [8]. The modified Booth multiplier, an advanced version of the Booth multiplier, reduces the number of partial products generated. The Dadda multiplier is another form of multiplier, which has fewer adders. Combining the advantages of both gives a new Booth-Dadda algorithm, which may be more efficient than the earlier algorithms.
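The logarithmic depth of the Sklansky tree adder discussed in this section can be seen in a bit-level model. The sketch below is a generic parallel-prefix formulation written for this review (names are hypothetical, carry-in is assumed to be zero) and is not the circuit of [6].

```python
def sklansky_add(a, b, width):
    """Add two unsigned integers with a Sklansky parallel-prefix carry tree.

    Returns (a + b, number_of_prefix_levels).
    """
    propagate = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]
    g = [((a >> i) & (b >> i)) & 1 for i in range(width)]
    p = propagate[:]
    levels = (width - 1).bit_length()       # ceil(log2(width))
    for level in range(levels):
        for i in range(width):
            if (i >> level) & 1:
                # Combine with the last bit of the neighbouring lower block.
                j = ((i >> level) << level) - 1
                g[i] |= p[i] & g[j]
                p[i] &= p[j]
    # g[i] is now the carry out of bit i; shift by one to get carry-ins.
    carries = [0] + g[:-1]
    total = sum((propagate[i] ^ carries[i]) << i for i in range(width))
    return total | (g[-1] << width), levels


print(sklansky_add(200, 100, 8))  # (300, 3)
```

Within one level no updated position reads another position updated in the same level, so all combinations of a level can be evaluated simultaneously; this is what gives a delay proportional to log N.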

In this algorithm the partial products are formed with the modified Booth algorithm, as shown in Figure 5, and the final processing is performed by the Dadda algorithm. In the Booth multiplier the number of partial products equals the number of bits of the multiplier, so the cost of the multiplier is high and its speed is low. With the modified Booth algorithm the number of partial products is n/2 if n is even and (n+1)/2 if n is odd, which improves area and speed.

The Vedic multiplier can also be used to realize higher bit orders. A higher-bit multiplier is formed from smaller multiplier designs, which is much simpler than designing it directly. A $16{\times}16$ multiplier is thus formed from $8{\times}8$ multipliers, which in turn are formed from $4{\times}4$ multipliers [9]. However, this poses power consumption problems.
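The partial product count of the modified Booth algorithm follows from radix-4 recoding of the multiplier. The sketch below shows the standard recoding into digits from the set {-2, -1, 0, 1, 2}; it is a software illustration with hypothetical function names, and the Dadda reduction that follows in the combined design of [8] is replaced here by a plain sum.

```python
def booth_radix4_digits(y, n):
    """Recode an n-bit two's complement multiplier y into radix-4 digits.

    Produces n/2 digits for even n and (n+1)/2 digits for odd n,
    least significant digit first.
    """
    count = (n + 1) // 2
    digits = []
    previous = 0                            # implicit bit y[-1] = 0
    for k in range(count):
        b0 = (y >> (2 * k)) & 1             # Python shifts sign-extend
        b1 = (y >> (2 * k + 1)) & 1
        digits.append(b0 + previous - 2 * b1)
        previous = b1
    return digits


def booth_multiply(x, y, n):
    """Sum one partial product per radix-4 digit of the multiplier y."""
    digits = booth_radix4_digits(y, n)
    return sum((d * x) << (2 * k) for k, d in enumerate(digits))


print(booth_radix4_digits(7, 8))  # [-1, 2, 0, 0]
print(booth_multiply(-23, 7, 8))  # -161
```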



Figure 5. Modified Booth Architecture [8].

To overcome these problems, a binary to excess-1 converter (BEC) can be employed; it uses fewer gates than the normal Vedic multiplier, which reduces power consumption. The design consists of four groups of the same size, each containing an 8×8 Vedic multiplier whose inputs are partitioned according to the Urdhva Tiryakbhyam sutra. The outputs of the Vedic multipliers are fed to BEC adders of different sizes. Replacing circuitry with the BEC increases the speed of operation and lowers the area compared with the ripple carry adder.
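The partitioning into four 8×8 blocks can be written down directly. In the sketch below the 8×8 Vedic block is represented by a stand-in callable (`mul8`, a hypothetical parameter) and ordinary integer addition stands in for the adder stage, so only the decomposition itself is illustrated.

```python
def vedic_16x16(a, b, mul8=lambda x, y: x * y):
    """Build a 16x16 product from four 8x8 products.

    mul8 is a placeholder for an 8x8 multiplier block; the default
    uses integer multiplication purely for illustration.
    """
    a_lo, a_hi = a & 0xFF, (a >> 8) & 0xFF
    b_lo, b_hi = b & 0xFF, (b >> 8) & 0xFF
    low_low = mul8(a_lo, b_lo)      # vertical product of the low halves
    low_high = mul8(a_lo, b_hi)     # crosswise products
    high_low = mul8(a_hi, b_lo)
    high_high = mul8(a_hi, b_hi)    # vertical product of the high halves
    # The adder stage aligns the four 16-bit results by their weights.
    return low_low + ((low_high + high_low) << 8) + (high_high << 16)
```

The same decomposition applies one level down, so an 8×8 block can in turn be formed from four 4×4 blocks.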



Figure 6. Vedic Multiplier using BEC adder [10].

Figure 6 shows the use of BEC adders to build a 16×16 multiplier. The BEC adder has fewer gates, which reduces power consumption and delay. The Vedic multiplier hardware can be changed according to need. One problem is carry propagation; to solve it, a modified carry select adder is used instead of the simple carry select adder [10].
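How a BEC saves gates in an adder can be sketched with a carry select block in which the result for carry-in 1 is derived from the result for carry-in 0 instead of being computed by a second adder. This is a commonly described arrangement and is assumed here for illustration; the function names are hypothetical and the exact circuits of [9] and [10] may differ.

```python
def bec(bits):
    """Binary to excess-1 converter on a bit list (LSB first).

    Each output bit is the input bit XORed with the AND of all lower
    input bits, which yields the input value plus one.
    """
    out = []
    all_lower_ones = 1
    for bit in bits:
        out.append(bit ^ all_lower_ones)
        all_lower_ones &= bit
    return out


def carry_select_block(a, b, cin, width=4):
    """Add two width-bit values; returns (sum, carry_out)."""
    result0 = a + b                         # adder result assuming cin = 0
    bits0 = [(result0 >> i) & 1 for i in range(width + 1)]
    bits1 = bec(bits0)                      # result for cin = 1 via the BEC
    chosen = bits1 if cin else bits0        # multiplexer driven by cin
    value = sum(bit << i for i, bit in enumerate(chosen))
    return value & ((1 << width) - 1), value >> width


print(carry_select_block(9, 6, 1))  # (0, 1), since 9 + 6 + 1 = 16
```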



Figure 7. Carry Select Adder Multiplier[10].

The design in Figure 7 has less carry propagation because it uses separate units for carry and sum propagation. The carry select adder is one of the fastest adder structures, but the traditional design can be modified to improve performance. The modified carry select adder has a half-sum generation unit and a final sum generation unit. The carry select adder is analyzed and its redundant operations are eliminated to obtain the modified design. To reduce the number of gates, all the prime implicants of the logic functions are generated, and the essential prime implicants are combined to produce the final carry and sum. Separate sum and carry generation is used to reduce redundant operations. A single-stage carry select adder is implemented first. A higher-bit Vedic multiplier can then be implemented with a multistage carry select adder; multipliers with widths of 8, 16 and 32 bits are formed. The Vedic multiplier using the multistage carry select adder has a small delay, though not smaller than that of the binary excess adder based multiplier, but it occupies very little area.

Multiplication is an arithmetic operation that performs addition many times, so its hardware requirement is very large. Compressors are used to reduce the adder requirement. A compressor is a circuit that adds more than 3 bits; without it, many more half and full adders are needed and the structure becomes complicated. A 4-2 compressor built from XOR-XNOR gates can operate at low voltage. The new circuit is obtained by analyzing the circuit and eliminating the weak logic [12]. The 4:2 compressor can operate at voltages as low as 0.6 V. The compressor has 5 inputs and 3 outputs. Circuit changes in the 4:2 compressor are necessary for improvement: by employing four XOR circuits and two 2-1 multiplexers, the critical delay can be reduced. The output response time is also well maintained. This circuit change works at the gate level. The sum is generated by an XNOR circuit followed by an inverter, and the carry by a multiplexer.

Small compressor units can be used to form larger units. A novel 5:3 compressor architecture can be used to form a novel 15:4 compressor that exploits parallelism in the computation [13]. The resulting design improves the delay and requires minimum hardware. To reduce the delay further, a bit-slice technique can replace the parallelism in computation [14]. The design has only three gate delays rather than five. The inputs are divided into equal groups. Compression starts in the first stage, where full adders compress three bits to two. The sum and carry outputs of each adder are then sent to two 5-3 compressors, and the outputs of the compressors are sent to a 4-bit parallel adder.
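The 15:4 compression flow described above (full adders, then two 5-3 compressors, then a 4-bit parallel adder) can be checked with a behavioural model. In the sketch below the 5-3 compressor is modelled only by its function, counting the ones among five bits, so its internal gate structure from [13] and [14] is not represented; all names are hypothetical.

```python
def full_adder(a, b, c):
    """One-bit full adder: returns (sum, carry)."""
    return a ^ b ^ c, (a & b) | (b & c) | (a & c)


def compressor_5_3(bits):
    """Behavioural 5-3 compressor: 3-bit count of ones in five inputs."""
    assert len(bits) == 5
    return sum(bits)


def compressor_15_4(bits):
    """Behavioural 15:4 compressor: 4-bit count of ones in 15 inputs."""
    assert len(bits) == 15
    sums, carries = [], []
    # First stage: five full adders compress three bits to two each.
    for i in range(0, 15, 3):
        s, c = full_adder(bits[i], bits[i + 1], bits[i + 2])
        sums.append(s)
        carries.append(c)
    low = compressor_5_3(sums)       # counts bits of weight 1
    high = compressor_5_3(carries)   # counts bits of weight 2
    # Final stage: the parallel adder combines the two weighted counts.
    return low + (high << 1)
```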

## III. RESULT COMPARISON OF VEDIC MULTIPLIER

Multipliers have been used in digital design for a long time, and their designs are modified according to need. The earliest design is the simple array multiplier, whose problem is the long carry propagation adder [1]; the scheme in [1] reduces the carry look-ahead adder from 22 to 17. Methods such as the Dadda multiplier and the Booth encoder address the complexity of these designs but pose their own problems of speed, area and logic complexity. The Vedic multiplier, a method based on ancient Vedic mathematics, is therefore used in digital multiplier design. The Vedic sutras can be implemented with various methods to form a multiplier. One approach uses a pipelined architecture together with AND gates to implement the design [3]. The resulting multiplier is faster but consumes a large area. To use less area than the AND-gate implementation, half adders and full adders are used. They add 2 or 3 bits at a time and have a very small hardware requirement [4]. For greater efficiency a compressor is used, a digital circuit capable of adding more than 3 bits at a time. This improves speed by 66.6% in comparison with the half adder and full adder design. In terms of area, the number of stages is reduced in comparison with the half and full adder design, giving 1% less area than the traditional design.

Apart from the Vedic multiplier there are other efficient approaches. One is to improve the partial product stage, which is one of the stages of the multiplier; this gives a low latency of 18 and 33.3% lower area [6]. To give the multiplier more speed, the BEC structure is used: the outputs of the Vedic multipliers are fed to BEC adders, which increases the speed of the whole multiplier [9]. The Vedic multiplier for higher bit widths can also be built with the carry select adder; its delay is not lower than that of the binary excess adder design, but the area occupied is very small [10].

## IV. CONCLUSION

Multiplication is a basic digital operation. It is important in digital signal processing, in processors and in other major arithmetic operations. It is used in other fields under different names, but the operation is the same. Early realizations used gates to perform the addition multiple times. As technology advanced, multiplier design followed new trends: from gates it moved to half and full adders, which add up to three bits at a time, and then to compressors, which can add more than three bits at a time. The implementation approach is also important for multiplier design. Vedic mathematics, whose sutras solve arithmetic problems with a much easier approach, is employed. Urdhva Tiryakbhyam, one of the 16 sutras of Vedic mathematics, is used to build a high-speed multiplier. In most cases the multiplier implementation sacrifices area, and this is a major aspect on which future work can be done.



## REFERENCES

- [1] A. Dhurkadas, "Faster parallel multiplier", Proceedings of the IEEE, vol. 72, no. 1, January 1984.
- [2] Honey Durga Tiwari, et al., "Multiplier design based on ancient Indian Vedic Mathematics", 2008 International SoC Design Conference.
- [3] Vaijyanath Kunchigi, et al., "High speed and area efficient Vedic multiplier", 2012, IEEE conf.
- [4] Sushma R. Huddar and Sudhir Rao, "Novel High Speed Vedic Mathematics Multiplier", 2013, IEEE conf.
- [5] Sandesh S. Saokar, et al., "High speed Multiplier for digital signal processing", 2012, IEEE conf.
- [6] T. Arunachalam and S. Kirubaveni, "Analysis of High Speed Multipliers", International Conference on Communication and Signal Processing, April 3-5, 2013, India.
- [7] N. Rajasekhar and Dr. T. Shanmuganantham, "A Modified Novel Compressor based Urdhwa Tiryakbhyam Multiplier", 2014, ICCCI conf.
- [8] Sumod Abraham, et al., "Study of Various High Speed Multipliers", 2015 International Conference on Computer Communication and Informatics (ICCCI-2015), Jan. 08-10, 2015, Coimbatore, India.
- [9] G. Challa Ram, et al., "Area efficient modified Vedic multiplier", 2016, ICCPCT Int. conf.
- [10] Ms. G. R. Gokhale, et al., "Design of Area and Delay Efficient Vedic Multiplier Using Carry Select Adder", 2015 International Conference on Information Processing (ICIP), Vishwakarma Institute of Technology, Dec. 16-19, 2015.
- [11] Yogita Bansal, Charu Madhu, "A novel high-speed approach for 16×16 Vedic multiplication with compressor adders", Elsevier Computers and Electrical Engineering 49 (2016), pp. 39–49.
- [12] Jiangmin Gu, Chip-Hong Chang, "Ultra Low voltage, Low power 4-2 compressor for high speed multiplications", 2003, IEEE trans.
- [13] J. Subhajit Roy chaudhary, et al., "Design simulation and testing of high speed low power 15-4 compressor for high speed multiplication applications", 2008, IEEE conf.
- [14] R. Abhilash, et al., "ASIC design of low power VLSI architecture for different multiplier algorithms using compressors", 2016, ICIIS conf.

## **AUTHORS PROFILE**



**Ketan Thakur** is from Himachal Pradesh and completed his master's degree in Electronics and Communication at Chandigarh University, Punjab (2017-2019). He completed his bachelor's degree at SGI college, Panipat.



**Dr. Tripti Sharma** received her M.Tech. and Ph.D. degrees in the field of Low Power VLSI Circuits Design. She has more than 16 years of teaching experience along with an intense research interest. She started her career as a lecturer with C.S.J.M University (Govt.), Kanpur, where she continued until late 2003; she then served Mody Institute of Technology and Science (Deemed University), Rajasthan, for more than a decade. There she worked in the core team to achieve NBA, AICTE and NAAC accreditation. She joined Vivekananda Global University in 2015 as Professor and Head of the ECE department and, along with her academic and administrative responsibilities, worked towards UGC affiliation for the university during her short stay. She joined Chandigarh University as Professor in the ECE department in January 2016.

Her research interests include digital and analog low-power VLSI circuits and double-gate MOSFET circuit design and analysis. She has more than 60 publications in international journals and national and international conferences in the areas of high-performance integrated circuits and emerging semiconductor technologies. She has also authored five technical books useful for research in the field of digital circuit design and has filed a patent relating to neuro-developmental disorders to help society.

