# Analysis and Optimization of timing paths in MTCMOS based Change-Sensing Flip Flop for SoC Design

Bhargavi N.S, Shylashree N

Abstract: The VLSI industry is large and fast-growing, with steadily increasing design complexity. In semiconductor design there is always a trade-off between area, power and timing. As technology advances and devices are miniaturized, timing and power play an important role in SoC design and production. The main aim in SoC design is a cost-effective chip with low area, low power consumption and a fast timing response. With current design techniques the required timing can be maintained, and proper analysis of the timing paths combined with suitable optimization techniques gives accurate timing and reduced power consumption. In this paper, a Change-Sensing flip flop is designed for accurate timing and minimum power consumption. The design is implemented in Cadence Virtuoso with 45nm technology, and the MTCMOS technique is used for power reduction. Results verify that the designed Change-Sensing flip flop with MTCMOS consumes much less power than many flip flops designed for the industry. The designed FF reduces power consumption by 44.37% when compared to the TGFF. The FF has a setup time of 47.70 ps and a hold time of -15.15 ps. The designed flip flop will help reduce overall power in SoC design and will improve the timing response.

Key words: setup and hold time measurement, MTCMOS technique, low power, single-phase clock, flip-flop (FF).

## 1. INTRODUCTION

Moore's law is commonly stated as "the number of transistors present on a chip doubles about every 18 months". As transistor density increases with each technology generation, it has undesirable effects on many parameters of the SoC. Chips in the older 180nm and 90nm technologies had lower density, but with the introduction of 45nm and smaller technologies the power consumption increases drastically with the number of transistors. Power reduction at an early design stage therefore plays a very important role in SoC design. Timing analysis plays an equally vital role: the designed digital circuit is analyzed to check that all timing constraints are met.

Flip flops are the fundamental component of all digital systems designed today. The area and power of a designed circuit depend mainly on the area and power consumed by its flip flops, so flip flops must be designed with great care according to the requirements. Flip flops have a large number of redundant transitions at their internal clocked nodes. These transitions take place even when the data is constant, and they consume a large amount of dynamic power. In a SoC, the flip flops and their clock distribution systems account for 50% of the power consumed by the entire system. Therefore, in this paper a low-power, low-area, true single-phase flip flop with accurate timing is designed, implemented and simulated, and its functional parameters are analyzed [1].

#### Revised Version Manuscript Received on April 12, 2019.

Bhargavi N.S, Department of ECE, R V College of Engineering, Bengaluru, Karnataka, India (Email: janavi.bhargavi@gmail.com)

Shylashree N, Department of ECE, R V College of Engineering, Bengaluru, Karnataka, India (Email: shylashreen@rvce.edu.in)

The transmission gate FF (TGFF) is the FF most commonly implemented in SoC designs. Because it uses transmission gates, its transistor count is lower than that of flip flops designed using CMOS logic. The major drawback of the TGFF is its very high dynamic power consumption, caused by the heavy loading of the clock signal. The TGFF also requires a two-phase clock, which increases the demand for a highly efficient clock distribution system to reduce clock jitter and clock skew. The proposed Change-Sensing flip flop reduces power consumption to a very large extent by eliminating redundant clocked node transitions, and it improves the timing performance of the entire designed system.

The rest of the paper is organized as follows. Section 2 describes the major drawbacks of recently designed flip flops and explains the working of the TGFF. Section 3 describes the design of the CSFF. Section 4 explains the working of the proposed Change-Sensing flip flop, describes the implementation of the CSFF and of the proposed Change-Sensing flip flop with the MTCMOS scheme, and presents the simulation results. Final conclusions are given in Section 5.

## 2. BACKGROUND

In recent years a large number of flip flops have been designed using low-power techniques. Initially, conditional-clocking flip flops (CCKFF) were designed to reduce the unwanted transitions that the local nodes produce when the data does not change. The CCKFF adds logic circuits that detect and monitor input data changes. However, the area overhead and C-Q delay of the CCKFF are very large compared to other circuits [2].

The conditional pre-charge flip flop (CPFF) uses a conditional pre-charge circuit to prevent unnecessary toggling of the internal nodes. The CPFF also incurs a larger area penalty and delay, and functional failure can occur because of strong contention at the output of the latch. The data mapping flip flop (DMFF) reduces the power dissipated by unwanted node transitions. Its disadvantage is the internally generated clock CM, which consumes switching power even when the data activity is low. Like the CPFF, the DMFF also requires additional transistors to reduce the chance of functional failure due to contention at the output latch.

The adaptive coupling flip flop (ACFF) is designed mainly to reduce unwanted node transitions without increasing the area overhead. The ACFF employs an adaptive coupling element in the master latch and removes it in the slave latch. Its major drawback is that a higher operating voltage is required when there is strong contention in the slave latch.

The topologically compressed flip flop (TCFF) reduces the clocked node transitions by compressing the structure of a combinational type flip flop. The TCFF does not impose a large area overhead, but its large number of shared transistors degrades the cell's robustness and causes functional failures at low-voltage operation [3].

The static single-phase contention-free flip flop (SSCFF) reduces unwanted transitions and, like the TGFF, operates over a wide supply voltage range while saving energy. The major drawback of the SSCFF is that it saves energy only when the input data stays persistently high. When the input data stays at the low logic level, the SSCFF neither saves energy nor reduces the internal node transitions. From this survey, there is a need for a flip flop that reduces power and area overhead and, at the same time, addresses functional failures at low-voltage operation [4].

The transmission gate flip flop (TGFF) is the most commonly used flip flop in present technology because of its static and contention-free nature. Using transmission gates in the design reduces the number of transistors drastically. The TGFF is robust under voltage scaling but consumes a large amount of power because of its toggling internal clock nodes. The circuit diagram of the transmission gate flip flop is shown in Fig. 1. The clocked nodes CKN and CKI toggle on every clock cycle and cause large power consumption [5].



Fig. 1 Conventional TGFF

The working principle of the TGFF for high and low clock and data values is explained in Fig. 2. When the clock is low, node X follows the input node D continuously, but node M does not depend on node X, so the output node Q holds the value stored in the slave latch. When the clock goes high, node X is isolated from the input node D and node M is connected to node X. The output Q therefore takes the value stored at node X.
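The transparent and hold phases described above can be sketched as a small behavioural model. This is an illustrative abstraction in Python, not the transistor-level circuit of Fig. 1: the class name `MasterSlaveFF` and the function `simulate` are hypothetical, node X is modelled simply as the value stored in the master latch, and signal inversions inside the real circuit are ignored.

```python
class MasterSlaveFF:
    """Behavioural sketch of a positive-edge master-slave flip flop.

    x models the master-latch node X, q models the output Q.
    """

    def __init__(self, x=0, q=0):
        self.x = x
        self.q = q

    def step(self, clk, d):
        if clk == 0:
            # Clock low: master is transparent, X follows D; slave holds Q.
            self.x = d
        else:
            # Clock high: master is isolated from D; slave passes X to Q.
            self.q = self.x
        return self.q


def simulate(clk_seq, d_seq):
    """Apply paired clock and data samples and return Q after each step."""
    ff = MasterSlaveFF()
    return [ff.step(c, d) for c, d in zip(clk_seq, d_seq)]


if __name__ == "__main__":
    clk = [0, 1, 0, 1, 0, 1]
    d = [1, 0, 0, 1, 1, 1]
    print(simulate(clk, d))
```

In this model a change of D while the clock is high has no effect on Q, which mirrors the isolation of the master latch during the high phase.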



Fig. 2 Working of TGFF

Setup and hold time play an important role at this point, because node X can enter a metastable state when the clock goes from low to high. In detail: when the clock is initially low and then goes high, node X stops following the input node D.

If the input node D changes from low to high or from high to low at this moment, the value stored at node X cannot be predicted and the node might enter a metastable state. The value of D must therefore remain stable for some time before the clock event and for some time after it. These intervals are defined as the setup time and hold time of the flip flop, and they ensure that the value at the input node D is successfully transferred to the output node Q.
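As an illustration of these definitions, the check below tests whether a data transition falls inside the window in which D must be stable. It is a sketch with a hypothetical function name (`violates_timing`) and all times in picoseconds; the example values in the usage note are the MTCMOS based CSFF figures reported in Table II. A negative hold time is interpreted here as a window that closes before the clock edge.

```python
def violates_timing(t_data, t_clk, t_setup, t_hold):
    """Return True if a data transition at t_data falls inside the
    window in which D must be stable around a clock edge at t_clk.

    All times are in picoseconds. The window runs from t_setup before
    the edge to t_hold after it; a negative t_hold ends the window
    before the edge.
    """
    window_start = t_clk - t_setup
    window_end = t_clk + t_hold
    return window_start < t_data < window_end


if __name__ == "__main__":
    # Rising clock edge at 1000 ps, setup 47.70 ps, hold -15.1581 ps.
    for t in (900.0, 970.0, 990.0):
        print(t, violates_timing(t, 1000.0, 47.70, -15.1581))
```

With these numbers the forbidden window spans roughly 952.3 ps to 984.8 ps, so a transition at 970 ps is flagged while transitions at 900 ps and 990 ps are not.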

The two transmission gates in the main path never turn on simultaneously, so node D has no direct effect on the output node Q. The major drawback of the TGFF is that it consumes a large amount of dynamic power because the clock nodes CKI and CKN are always toggling.

A total of 12 transistors are clock driven, so the capacitive loading on the clock is large even when the switching activity of the data is very low. The TGFF is not suitable for low-power applications because it consumes a large amount of dynamic power on unwanted clock transitions even when the data is constant.
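The cost of the always-toggling clock nodes can be related to the standard first-order expression for CMOS switching power. This is the textbook relation, not a result derived in this paper, and the symbols $\alpha$ (switching activity of a node), $C_L$ (switched capacitance), $V_{DD}$ (supply voltage) and $f$ (clock frequency) are generic notation introduced here for illustration.

```latex
P_{\text{dyn}} = \alpha \, C_L \, V_{DD}^{2} \, f
```

For a clock-driven node the activity term does not fall when the data is idle, so the capacitance of the clock-driven transistors is charged and discharged in every cycle regardless of the data activity.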



Table I. Comparison of Various Low Power Flip Flops

|                            | CCKFF | CPFF | DMFF | ACFF | TCFF | SSCFF | TGFF |
|----------------------------|-------|------|------|------|------|-------|------|
| Transistor Count           | 40    | 28   | 24   | 22   | 21   | 24    | 24   |
| Operation at low voltage   | Yes   | No   | No   | No   | No   | Yes   | Yes  |
| Unwanted Clock transitions | No    | Yes  | Yes  | No   | No   | Yes   | Yes  |

The TGFF requires two clock phases, which calls for additional circuits to generate the two-phase clock and for a robust and efficient clock distribution system to minimize skew and jitter. Table I compares various recently designed low-power FFs in terms of transistor count, low-voltage operation and unwanted clock transitions.

To overcome the drawbacks of the TGFF, the change sensing flip flop (CSFF) is designed to reduce the dynamic power to a very large extent. The CSFF eliminates unwanted internal node transitions while maintaining functionality at low-voltage operation.

## 3. DESIGN APPROACH OF CSFF

The CSFF is mainly designed using the change-sensing scheme that makes use of single-phase clock for operation. The CSFF eliminates the unwanted internal clock node transitions while maintaining the performance and area.



Fig.3 Change Sensing Scheme

In the CSFF, stacked transistors are used instead of transmission gates. The stacked transistors are then replaced with logically equivalent transistors, which reduces the transistor count of the CSFF design. The TGFF, by comparison, has higher power consumption.

The change sensing scheme used to design the CSFF is shown in Fig. 3. Its main purpose is to reduce the unwanted toggling of the internal clock nodes, which consumes a large amount of power. The scheme identifies a change in the input data D and toggles the internal clock node CS to indicate that the input data has changed. The flip flop stores the received input data only if its state differs from the previous input data state. The change sensing scheme is made up of a total of 6 transistors and prevents toggling of the node CS when there is no change in the input data. The change sensing scheme uses 4 more transistors than the TGFF, hence a detailed analysis is done to replace the additional transistors with a smaller number of functionally and logically equivalent devices [6].



Fig.4 Change Sensing flip flop (CSFF)

Fig. 4 shows the circuit diagram of the final CSFF. The CSFF has the same transistor count as the TGFF, with the additional advantages of improved functionality, reduced power and removal of the unwanted toggling. The CSFF is composed of 24 transistors: 11 PMOS transistors and 13 NMOS transistors. The circuit contains 3 inverters. The input data is denoted by D and its inverted value by DN. The single-phase clock is denoted by CK, and CS is the change sensing signal generated internally by the CSFF circuit. The outputs are obtained at the Q and QN nodes. The transistors Tr 1, Tr 2 and Tr 3 make up the master latch of the CSFF, and Tr 7, Tr 10, Tr 11, Tr 12 and Inv 1 are the components of the latch for the master. The transistors Tr 10, Tr 15, Tr 16, Tr 17 and Inv 2 compose the slave latch, and Tr 9, Tr 22, Tr 23, Tr 24 and Inv 3 are the components of the latch for the slave. The transistors Tr 4, Tr 5, Tr 6, Tr 7, Tr 8 and Tr 9 form the change sensing scheme that gives the final CSFF design its reduced power consumption.



## 4. PROPOSED CS FLIP FLOP

### A. Change-Sensing Scheme

The working principle and implementation of the CSFF are explained in detail in this section. The change sensing scheme has two phases, a pre-charge phase and a sensing phase. Fig. 5 shows these two phases along with the operation when sensing low-to-high and high-to-low data transitions. The change sensing circuit is in the pre-charge phase when the clock signal is at logic low, and it enters the sensing phase at the rising edge of the clock, when the clock signal becomes high. In the pre-charge phase the CS node is pre-charged through the transistor Tr 4.



Fig.5 Operation of Change Sensing scheme

The CS signal discharges in one of two ways, through low-to-high sensing or through high-to-low sensing, as shown in Fig. 5. The change in the D input decides the path through which CS discharges. When the incoming data changes from low to high, CS discharges through the low-to-high sensing path, which is composed of the transistors Tr 5, Tr 6 and Tr 7. When the incoming data changes from high to low, CS discharges through the transistors Tr 5, Tr 8 and Tr 9, which make up the high-to-low sensing path. With the change sensing scheme, the CS signal therefore toggles only when the D input changes. In contrast, the internal clock nodes of the TGFF, and of several other flip flops described in Section 2, toggle continuously irrespective of the D input value. The proposed design thus reduces power consumption to a very large extent by avoiding the unnecessary toggling of internal clock nodes.
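The effect of this scheme on internal activity can be illustrated with a simple cycle count. The sketch below is a behavioural abstraction under stated assumptions, not a model of the transistor network in Fig. 5: the function name `count_clocked_node_activity` is hypothetical, D is assumed to be sampled once per rising clock edge, and a CS discharge is counted whenever the sampled value differs from the stored value.

```python
def count_clocked_node_activity(d_at_edges, initial_q=0):
    """Count cycles with internal clocked-node activity.

    d_at_edges: value of D sampled at each rising clock edge.
    Returns (always_toggling, change_sensing):
      always_toggling - cycles in a conventional design whose clock
                        nodes switch on every cycle, whatever D does;
      change_sensing  - cycles in which the CS node discharges because
                        D differs from the currently stored value.
    """
    stored = initial_q
    cs_discharges = 0
    for d in d_at_edges:
        if d != stored:
            # Low-to-high or high-to-low sensing path conducts.
            cs_discharges += 1
            stored = d
    return len(d_at_edges), cs_discharges


if __name__ == "__main__":
    data = [0, 0, 1, 1, 1, 0, 0, 0]
    print(count_clocked_node_activity(data))
```

For the sample sequence, the conventional count covers all eight cycles while the change-sensing count covers only the two cycles in which the data actually changes.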

### B. Working of the CSFF

The working of the CSFF is shown in Fig. 6, which explains the operation of the circuit for low and high clock values and low and high data values. Initially, while the clock signal CK is low, the CS signal, which is the local clocked node, is pre-charged through the transistor Tr 4, and the master latch is in transparent mode and passes the incoming new data signal into the FF. The slave latch is still in hold mode, so it stores the old data [7].

When the clock signal goes high, the change sensing scheme enters the sensing phase and discharges the CS signal after sensing a change in the input data. The discharge path depends on the new data value. If the new data is low, CS discharges through the high-to-low path, and if the new data is high, CS discharges through the low-to-high path. After the discharge, the master latch is in hold mode and the slave latch is in transparent mode, transferring the data to the output node.

### C. Multi-threshold CMOS Implementation

The MTCMOS technique is one of the recent techniques for reducing power consumption to the lowest possible level. The operation of the CSFF remains the same; only the transistors used in the circuit are replaced with multi-threshold transistors, which reduce the leakage current and hence the power required for circuit operation. Transistors with low VT are used to realize the logic of the circuit, and transistors with high VT are used to reduce leakage current in standby mode by isolating the power and ground lines from the low-threshold transistors. Circuits that use the MTCMOS technique have very high speed in active mode and reduce power to a very large extent in standby mode. The transistors in the master and slave latches of the CSFF are replaced with these multi-threshold transistors to reduce power.

### D. Simulation of CSFF circuit

The designed low-power FF is implemented in Cadence using 45nm technology to compare results and verify the functionality. The TGFF, CSFF and MTCMOS based CSFF circuits are implemented along with their testbench circuits. The input data, clock input, flip flop output and power waveforms are plotted at 1V and 100MHz frequency, and the results are analyzed. The TGFF consumes far more dynamic power than the CSFF, and the MTCMOS based CSFF has the lowest power consumption. The CSFF is observed to consume power only when the input data changes. The TGFF has unwanted clock node transitions and hence consumes power in every clock cycle. Both the CSFF and the MTCMOS based CSFF eliminate the unwanted clock node transitions.

The main parameters used to verify the performance are the setup time, the hold time, and the two propagation delays of the circuit: the Clock-to-Q delay (C-Q) and the Data-to-Q delay (D-Q). The power metrics are then determined, namely the average power and the power delay products PDP (C-Q) and PDP (D-Q). The TGFF, CSFF and proposed MTCMOS based CSFF are compared on these parameters and the best one is identified.
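The power delay products can be written out explicitly. The relations below assume the usual definition of PDP as average power multiplied by the corresponding propagation delay; the symbols $P_{\text{avg}}$, $t_{CQ}$ and $t_{DQ}$ are notation introduced here for illustration, and the worked numbers are the TGFF entries of Table II.

```latex
\begin{align*}
\mathrm{PDP}_{CQ} &= P_{\text{avg}} \cdot t_{CQ}, &
\mathrm{PDP}_{DQ} &= P_{\text{avg}} \cdot t_{DQ} \\[4pt]
1\,\text{nW} \times 1\,\text{ps} &= 10^{-21}\,\text{J} = 10^{-6}\,\text{fJ} \\[4pt]
\mathrm{PDP}_{CQ}^{\text{TGFF}} &= 1295.4\,\text{nW} \times 58.7634\,\text{ps}
  \approx 7.612 \times 10^{-17}\,\text{J} \approx 0.0761\,\text{fJ}
\end{align*}
```

The result agrees with the PDP (C-Q) value listed for the TGFF in Table II, which suggests the table entries follow this definition.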



Fig.6 Operation of CSFF

Fig. 7 shows the schematic of the multi-threshold based CSFF implemented in 45nm technology in Cadence Virtuoso. The schematic consists of 24 transistors with data and clock input pins and a Q output pin. The PMOS and NMOS transistors are taken from the gpdk45 library, and the vdd/gnd points from the analog library. The designed FF works as explained in Fig. 6 for high and low clock values: it senses the change in the input data signal and thereby reduces the power consumption.



Fig.7 Schematic of CSFF

Fig. 8 shows the testbench circuit designed for the proposed CSFF with multi-threshold transistors. The power supply VDD of the testbench circuit is 1V and the clock frequency is 500MHz in the Cadence tool. A testbench circuit is created for the TGFF, the CSFF and the proposed FF. The simulation results are then compared to determine the performance parameters.



Fig.8 Testbench for CSFF

The first important parameter is the setup time, which defines how long before the clock event the incoming data must be stable. The hold time, another important parameter in flip flop design, defines how long after the clock event the data must remain stable. Power is the most important design parameter in the latest technologies, with their increasing complexity and density and reducing area. For comparison purposes, the average power is determined for all the designed flip flops. The parameters Clock-to-Q and Data-to-Q define the propagation delay of the flip flops.





Fig.9 Clock, Power, Input and Output waveform for CSFF

Fig. 9 shows the simulated results and output waveforms for the designed MTCMOS CSFF. The power delay product is calculated to identify the most efficient cell, with the lowest delay and power consumption. Table II compares these parameters for the various flip flop designs.

Table II. Comparison of Flip Flop Designs

| Performance Parameter             | TGFF    | CSFF    | MTCMOS based CSFF |
|-----------------------------------|---------|---------|-------------------|
| Number of transistors             | 24      | 24      | 24                |
| Setup Time (ps)                   | 38.5678 | 60.09   | 47.70             |
| Hold Time (ps)                    | 2.8760  | -6.08   | -15.1581          |
| Clock-to-Q propagation delay (ps) | 58.7634 | 24.18   | 24.498            |
| Data-to-Q propagation delay (ps)  | 96.7654 | 84.28   | 72.262            |
| Average Power (nW)                | 1295.4  | 761.3   | 720.56            |
| PDP (C-Q) (fJ)                    | 0.07612 | 0.01840 | 0.01765           |
| PDP (D-Q) (fJ)                    | 0.12534 | 0.06416 | 0.05206           |

Table II shows that the TGFF has higher power consumption than the CSFF and the MTCMOS based CSFF. The average power of the CSFF is an improvement of \*\*.% over the TGFF, and the MTCMOS based CSFF has a \*\*.\*% improvement over the TGFF and a \*\*.\*% improvement over the CSFF. There is a large improvement in the C-to-Q and D-to-Q delays of the proposed FF. The PDP (C-Q) and PDP (D-Q) of the proposed design improve by \*\*.\*% and \*\*.\*% compared to the TGFF, and by \*\*.\*% and \*\*.\*% compared to the CSFF.
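The percentage figures in the paragraph above are left blank in the source. As a hedged aid to the reader, the sketch below shows how such improvements could be computed from the Table II values, assuming "improvement" means the relative reduction with respect to the baseline design. The dictionary keys and the function names `improvement` and `compare` are hypothetical, and the computed values are derived here from Table II, not figures stated by the authors.

```python
# Values copied from Table II (average power in nW, PDP in fJ).
TABLE_II = {
    "TGFF":        {"power": 1295.4, "pdp_cq": 0.07612, "pdp_dq": 0.12534},
    "CSFF":        {"power": 761.3,  "pdp_cq": 0.01840, "pdp_dq": 0.06416},
    "MTCMOS_CSFF": {"power": 720.56, "pdp_cq": 0.01765, "pdp_dq": 0.05206},
}


def improvement(baseline, proposed):
    """Percentage reduction of `proposed` relative to `baseline`."""
    return 100.0 * (baseline - proposed) / baseline


def compare(metric, baseline, proposed, table=TABLE_II):
    """Improvement of design `proposed` over design `baseline` for a metric."""
    return improvement(table[baseline][metric], table[proposed][metric])


if __name__ == "__main__":
    pairs = (("TGFF", "CSFF"), ("TGFF", "MTCMOS_CSFF"), ("CSFF", "MTCMOS_CSFF"))
    for metric in ("power", "pdp_cq", "pdp_dq"):
        for base, prop in pairs:
            value = compare(metric, base, prop)
            print(f"{metric}: {prop} vs {base}: {value:.2f}%")
```

Under this definition the average power of the MTCMOS based CSFF comes out about 44.4% below that of the TGFF, which is in line with the 44.37% reduction quoted in the abstract.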

The setup time is positive for all three flip flops. Tc-q is the clock-to-Q delay measured with effectively infinite setup and hold time. Fig. 8 shows the graph used for the setup and hold time calculations.



Fig.8 Setup, Hold Time Calculation

The graph is obtained by plotting the clock-to-Q propagation delay on one axis and the data-to-clock delay on the other. The data-to-clock delay at which the clock-to-Q delay reaches 110% of the Tc-q value gives the setup or hold time of the designed FF. For the designed multi-threshold based CSFF, the setup and hold times are calculated in this manner, as shown in Fig. 9 and Fig. 10 respectively.
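The 110% criterion described above can be sketched as a small post-processing routine. This is an illustrative example only: the function name `setup_time_from_sweep` is hypothetical, the sweep data in the usage example is synthetic and not taken from the Cadence simulations, and linear interpolation between sweep points is an assumption made here for simplicity.

```python
def setup_time_from_sweep(offsets_ps, cq_delays_ps, nominal_cq_ps, factor=1.10):
    """Estimate setup time from a data-to-clock offset sweep.

    offsets_ps    : data-to-clock offsets, ordered from large (data early)
                    to small (data close to the clock edge).
    cq_delays_ps  : clock-to-Q delay measured at each offset.
    nominal_cq_ps : clock-to-Q delay with data arriving very early.
    Returns the offset at which the delay first reaches
    factor * nominal_cq_ps, using linear interpolation between
    neighbouring sweep points, or None if it is never reached.
    """
    threshold = factor * nominal_cq_ps
    points = list(zip(offsets_ps, cq_delays_ps))
    for (o0, d0), (o1, d1) in zip(points, points[1:]):
        if d0 < threshold <= d1:
            frac = (threshold - d0) / (d1 - d0)
            return o0 + frac * (o1 - o0)
    return None


if __name__ == "__main__":
    # Synthetic sweep: the delay grows as data moves closer to the edge.
    offsets = [100.0, 80.0, 60.0, 40.0]
    delays = [20.0, 20.0, 21.0, 25.0]
    print(setup_time_from_sweep(offsets, delays, nominal_cq_ps=20.0))
```

The same routine could be applied to a hold-time sweep by supplying the offsets of data changes after the clock edge, with the sign convention chosen to match the measurement setup.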





Fig.9 Setup value of proposed FF 47.705 ps



Fig.10 Hold value of proposed FF -15.158 ps

## 5. CONCLUSION

This paper describes an ultra-low-power FF designed using the MTCMOS technique. The proposed MTCMOS based CSFF consumes less power than the CSFF.

The main motivation for the FF design is to reduce power by reducing the unwanted clock node transitions that take place in other flip flop designs. By using a single-phase clock, the proposed design reduces the need for a highly efficient clock distribution system. Extensive simulations are performed, and the three flip flops are compared on performance parameters such as propagation delay, PDP (C-Q) and PDP (D-Q), average power, hold and setup time, and number of transistors. The simulation and analysis show that the proposed FF outperforms the TGFF and the CSFF in many performance parameters.

## REFERENCES

1. J. Yuan and C. Svensson, "High-speed CMOS circuit technique," IEEE J. Solid-State Circuits, vol. SC-24, no. 1, pp. 62–70, Feb. 1989.
2. H. Kawaguchi and T. Sakurai, "A reduced clock-swing flip-flop for 63% power reduction," IEEE J. Solid-State Circuits, vol. 33, no. 5, pp. 807–811, May 1998.
3. V. Stojanovic and V.G. Oklobdzija, "Comparative analysis of master-slave latches and flip-flops for high-performance and low-power systems," IEEE J. Solid-State Circuits, vol. 34, no. 4, pp. 536–548, Apr. 1999.
4. J.-C. Kim, S.H. Lee, and H.J. Park, "A low-power half-swing clocking scheme for flip-flop with complementary gate and source drive," IEICE Trans. Electronics, vol. E82-C, no. 9, pp. 1777–1779, Sep. 1999.
5. N. Nedovic and V.G. Oklobdzija, "Hybrid latch flip-flop with improved power efficiency," in Proc. Symp. Integr. Circuits Syst. Design, pp. 211–215, 2000.
6. V. G. Oklobdzija, "Clocking and clocked storage elements in a multigigahertz environment," IBM J. Res. Develop., vol. 47, pp. 567–584, Sep. 2003.
7. A. Hirata, K. Nakanishi, M. Nozoe, and A. Miyoshi, "The cross charge control flip-flop: A low-power and high-speed flip-flop suitable for mobile application SoCs," in Symp. VLSI Circuits Dig. Tech. Papers, pp. 306–307, 2005.

