# Design of CABAC Entropy Decoding, Inverse Quantization, Inverse Transform Blocks of H.264 Decoder using Verilog HDL

# H. N. Shwethashree, C. Kanagasabhapati, Siva S Yellampalli

Abstract: Video compression/decompression refers to methods for compressing or decompressing digital video, and several video compression/decompression standards exist. H.264, also known as Advanced Video Coding (AVC), is the standard most commonly used today. The objective of H.264 is to deliver good video quality at lower bit rates than previous standards. To improve compression efficiency, H.264 encoders and decoders use complex algorithms and modes, which require more power. Studies show that software-based implementations of these encoders and decoders on DSPs and CPUs suffer from real-time limitations; hence hardware implementations of video encoders and decoders are necessary. In this paper, the Entropy Decoding, Inverse Quantization and Inverse Transform blocks of an H.264 video decoder are designed in Verilog and simulated with the Xilinx ISE Simulator.

Index Terms: H.264, CABAC, Inverse Quantization, Inverse Transform, Xilinx ISE.

## I. INTRODUCTION

Video compression techniques play a predominant part in the efficient transmission and storage of large volumes of multimedia data where storage space and bandwidth are limited. H.264, also called MPEG-4 Part 10 or AVC, is a compression standard that is widely used in HD applications today. The standard was developed to fulfil the growing needs of the market: high-definition video is becoming common, and there is a strong need to store and transmit such data more efficiently. H.264 aims to achieve a high compression rate while ensuring the transmission of high-quality images at both higher and lower bit rates. H.264 is a block-based codec, meaning that every frame is split into small blocks called macroblocks. Coding tools are applied to macroblocks instead of the complete frame; this reduces computational complexity and improves motion prediction accuracy [2]. A pixel is the smallest addressable unit of a bitmapped image.

Revised Manuscript Received on July 22, 2019.

- H. N. Shwethashree, VLSI design and Embedded Systems, UTL Technologies Ltd, Bangalore, Karnataka, India.
- C. Kanagasabhapati, VLSI design and Embedded Systems, UTL Technologies Ltd, Bangalore, Karnataka, India.
- Dr. Siva S Yellampalli, VLSI design and Embedded Systems, UTL Technologies Ltd, Bangalore, Karnataka, India.

Each pixel is a set of three integers: either Red, Green and Blue, or Luminance, the blue colour difference and the red colour difference. Because the human eye is more sensitive to brightness than to colour, the YCbCr colour model is used in the H.264 standard [1].
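As an illustration of the luminance and colour-difference representation, the following Python sketch converts one RGB pixel to Y, Cb and Cr. The weights are the ITU-R BT.601 values commonly quoted alongside H.264 and are an assumption here, since the paper does not state which conversion it uses. The function name `rgb_to_ycbcr` is hypothetical, and no offset or clipping is applied.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one RGB pixel to (Y, Cb, Cr), rounded to integers.

    Assumed BT.601 weights; Cb and Cr are signed colour differences
    (no +128 offset is added in this sketch).
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance
    cb = 0.564 * (b - y)                    # blue colour difference
    cr = 0.713 * (r - y)                    # red colour difference
    return (round(y), round(cb), round(cr))


if __name__ == "__main__":
    # A white pixel carries brightness only, so both colour differences vanish.
    print(rgb_to_ycbcr(255, 255, 255))
    print(rgb_to_ycbcr(255, 0, 0))
```

For a grey or white pixel all of the information sits in Y, which is why the colour-difference components can be represented more coarsely than the luminance.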



Fig.1: Elements of Video Sequence

The H.264 standard has several profiles. Each profile uses a set of coding tools defined by the H.264 compression standard; the tools are the algorithms used for video encoding and decoding. The Baseline Profile is the simplest profile and is supported by almost all decoders. It is useful for real-time applications, where the decoder must run quickly. It supports I and P slices and uses CAVLC for entropy encoding and decoding. The Extended Profile is targeted at streaming video. It supports B slices and interlaced video coding, as well as the special SI and SP slices designed for streaming, which allow the server to switch between streams of different bit rates whenever needed.



Fig.2: H.264 Profiles

The Main Profile is a mainstream consumer profile used for broadcast and storage applications. It supports I, P and B slices and both the CABAC and CAVLC coding techniques. A Main Profile decoder is used for non-HD television broadcast. The High Profile is an advanced version of the Main Profile; it offers a higher compression ratio than all other profiles, at slightly higher implementation complexity and computational cost. It is used for HDTV broadcasts [4]. In general, the H.264 standard supports two entropy coding techniques, CAVLC and CABAC. This paper focuses on an H.264 video decoder that uses CABAC for entropy coding. CABAC attains a better compression rate than CAVLC.

## II. LITERATURE SURVEY

Reference [1] explains the basic concepts of video compression and pixels, describes the basic blocks of H.264 and gives a brief introduction to the H.264 video standard. It also deals with the design of an H.264 decoder in Verilog HDL. Reference [2] gives an overview of the H.264 standard: the elements of a video sequence, the different frame types, the H.264 coding tools, a comparison of H.264 with previous standards, and the profiles and levels of H.264. It also provides information about all blocks of the H.264 decoder. Reference [3] presents an RTL model of an entropy decoder for the AVC standard; the design was verified on a Virtex-5 FPGA board (ML506) and checked against the AVC reference software JM 12.2. Reference [4] covers the scalable video coding (SVC) extension of the AVC standard: scalability types, applications and requirements, the history of SVC, H.264/AVC basics, the concepts of extending AVC to SVC, and the SVC high-level design.

Reference [5] focuses on optimizing an HEVC decoder for mobile devices and discusses solutions and performance. The complexity of the HEVC decoder is evaluated, and the most demanding modules are optimized with SIMD. Reference [6] describes CABAC in the AVC standard, where it is presented as part of the new ITU-T/ISO/IEC standard. References [8] and [13] describe implementations of the forward transform and quantization and the inverse transform and quantization algorithms used in the AVC standard; the hardware is designed in Verilog HDL for a low-power AVC system used in portable applications. Reference [9] presents AVC forward and inverse quantization operations, with an architecture based on a highly flexible structure suited to efficient FPGA and ASIC implementations. Reference [10] describes a combined kernel architecture that efficiently decodes residual data in an AVC baseline decoder. The kernel architecture consists of a CAVLC decoder, an Inverse Quantization (IQ) unit and an Inverse Transform (IT) unit.

Reference [11] explains a combined software and hardware architecture for an AVC decoder. The decoder software was implemented on a NIOS II processor on an FPGA board, and the mixed software and hardware architecture was proposed to improve the decoder's output speed. Reference [12] is a handbook on the AVC compression standard that provides complete information on H.264 and its importance: video formats, video quality, video coding concepts, H.264 basics, syntax, prediction, transform and coding, conformance, transport and licensing, performance, and extensions and directions. Reference [14] presents the transform and quantization designs in AVC. The 4x4 transforms in H.264 can be computed in integer arithmetic, so that mismatch problems in the inverse transform are avoided.

## III. H.264 VIDEO DECODER

Entropy decoding decompresses the bit stream and yields a set of quantized coefficients, which are used by the subsequent steps to obtain the final output. Inverse quantization is applied to the quantized coefficients after entropy decoding. Quantization is the main reason video coding is lossy; usually the error is very small and cannot be noticed by human vision. The inverse transform is then applied to the reconstructed coefficients to obtain the residual data. The predicted signal can be produced by Intra prediction or Inter prediction.



Fig.3: H.264 Video Decoder Diagram

In Intra prediction, the predicted signal is created from samples of the same picture. In Inter prediction, the signal is predicted from the data of different frames, known as reference frames, by means of Motion Compensation. Motion Compensation is a computationally expensive function whose purpose is to produce a temporal predictor, which is added to the decoded prediction error to produce the reconstructed block. The temporal predictor is produced from previously decoded images, the reference frames. Partitioning each frame into macroblocks causes blocking artefacts, hence a Deblocking filter is used to improve the visual quality of the final output [5].

## IV. IMPLEMENTATION

## A. CABAC Entropy Decoder

The CABAC algorithm consists of three stages:

1. Context & Probability Modelling
2. Binary Arithmetic Decoding
3. De-Binarization



Fig.4: Block Diagram of CABAC

H.264 defines a large amount of context information associated with the syntax elements. The Context Modeler stores the context model (also known as the probability model) and updates it based on recently decoded data. Binary arithmetic decoding decodes the binary symbols of the bit stream using the given context. The De-Binarizer maps the decoded symbols into syntax elements. Each decoded binary symbol is called a bin, and binIdx indicates the position of a decoded bin within the bin string [3][6][7].
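The last two stages can be illustrated with a small Python sketch. It models only the bypass decoding mode, which uses no context model, followed by unary de-binarization; the context-based normal decoding mode additionally needs the standard's probability state tables and is not shown. The class and function names (`BypassDecoder`, `debinarize_unary`) are hypothetical, and the register widths and initial values follow the usual description of the H.264 arithmetic decoder rather than the Verilog design of this paper.

```python
class BypassDecoder:
    """Toy model of the CABAC bypass decoding path (no context model)."""

    def __init__(self, bits):
        self.bits = list(bits)   # bit stream as a list of 0/1 integers
        self.pos = 0
        self.cod_range = 510     # assumed initial range of the arithmetic decoder
        self.cod_offset = 0
        for _ in range(9):       # the offset register is seeded with 9 bits
            self.cod_offset = (self.cod_offset << 1) | self._read_bit()

    def _read_bit(self):
        # Reads past the end of the stream return 0 in this sketch.
        bit = self.bits[self.pos] if self.pos < len(self.bits) else 0
        self.pos += 1
        return bit

    def decode_bypass(self):
        """Decode one equiprobable bin: shift in a bit, compare with the range."""
        self.cod_offset = (self.cod_offset << 1) | self._read_bit()
        if self.cod_offset >= self.cod_range:
            self.cod_offset -= self.cod_range
            return 1
        return 0


def debinarize_unary(bins):
    """Map a unary bin string (N ones followed by a zero) to the value N."""
    value = 0
    for b in bins:
        if b == 0:
            return value
        value += 1
    raise ValueError("unary bin string is not terminated by a 0 bin")


if __name__ == "__main__":
    dec = BypassDecoder([0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0])
    bins = [dec.decode_bypass(), dec.decode_bypass()]
    print(bins, debinarize_unary(bins))
```

In hardware the same bypass step reduces to a shift, a comparison and a subtraction, which is consistent with it being the cheapest of the decoding modes.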

## B. Inverse Quantization

H.264 uses scalar quantization. The quantization parameter (Qp) controls the trade-off between bit rate and quality. It ranges from 0 to 51, which gives a wide range of quantization step sizes for fine adjustment of bit rate and quality. Qp can be defined for the whole video, for a frame, for a slice, or for each macroblock [8] [9] [10] [11].

Inverse quantization is a multiplication operation, given by the following equations:

$$W_{ij} = Z_{ij} \cdot Q_{step}$$

$$W_{ij} = Z_{ij} \cdot V_{ij} \cdot 2^{\lfloor Q_p/6 \rfloor}$$

Table 1: Formulas for calculating Inverse Quantization

$Z_{ij}$ - Inverse Quantization input

$W_{ij}$ - output

$V_{ij}$ - rescaling factor

$Q_{step}$ - Quantization factor

$Q_p$ - Quantization Parameter
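The second formula can be sketched in Python as follows. The rescaling factors in `V_TABLE` are the values commonly tabulated for the H.264 4x4 transform and are an assumption here, since the paper does not list them; the function names are hypothetical, and the sketch is not the Verilog implementation described in this paper.

```python
# Assumed rescaling factors V for Qp % 6 = 0..5.
# Tuple order: positions with both indices even | both indices odd | all others.
V_TABLE = [
    (10, 16, 13),
    (11, 18, 14),
    (13, 20, 16),
    (14, 23, 18),
    (16, 25, 20),
    (18, 29, 23),
]


def rescale_factor(i, j, qp):
    """Return V_ij for coefficient position (i, j) and quantization parameter qp."""
    a, b, c = V_TABLE[qp % 6]
    if i % 2 == 0 and j % 2 == 0:
        return a
    if i % 2 == 1 and j % 2 == 1:
        return b
    return c


def inverse_quantize(z, qp):
    """Compute W_ij = Z_ij * V_ij * 2^floor(Qp/6) for a 4x4 block z."""
    if not 0 <= qp <= 51:
        raise ValueError("Qp must lie in the range 0..51")
    scale = 1 << (qp // 6)   # 2^floor(Qp/6)
    return [[z[i][j] * rescale_factor(i, j, qp) * scale for j in range(4)]
            for i in range(4)]


if __name__ == "__main__":
    block = [[1] * 4 for _ in range(4)]
    for row in inverse_quantize(block, 28):
        print(row)
```

Because the factor $2^{\lfloor Q_p/6 \rfloor}$ is a power of two, a hardware design can realise it as a left shift, leaving one small multiplication per coefficient.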

## C. Inverse Transform

H.264 employs the Inverse Integer Cosine Transform (ICT). Its properties are similar to those of a 4x4 inverse DCT. Like the Discrete Cosine Transform, the ICT is a matrix multiplication, but it uses different matrix coefficient values.
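A minimal Python sketch of a 4x4 inverse integer transform is given below. It uses the add-and-shift butterfly usually given for the H.264 inverse core transform, applied first to rows and then to columns, with a final rounding shift by 6. This is an assumed reference formulation for illustration; the function names are hypothetical and the code is not taken from the Verilog design of this paper.

```python
def _butterfly(d0, d1, d2, d3):
    """One 1-D inverse transform stage using only additions and shifts."""
    e0 = d0 + d2
    e1 = d0 - d2
    e2 = (d1 >> 1) - d3
    e3 = d1 + (d3 >> 1)
    return [e0 + e3, e1 + e2, e1 - e2, e0 - e3]


def inverse_transform_4x4(w):
    """Apply the 1-D stage to every row, then every column, then round."""
    rows = [_butterfly(*row) for row in w]
    # cols[j][i] holds the result for row i of column j.
    cols = [_butterfly(*(rows[i][j] for i in range(4))) for j in range(4)]
    return [[(cols[j][i] + 32) >> 6 for j in range(4)] for i in range(4)]


if __name__ == "__main__":
    # A block with only a DC coefficient reconstructs to a flat residual.
    dc_only = [[64, 0, 0, 0]] + [[0, 0, 0, 0] for _ in range(3)]
    for row in inverse_transform_4x4(dc_only):
        print(row)
```

No multiplier is needed in this formulation, which is one reason the integer transform maps well to hardware.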



Fig.5: DC coefficients positions in a macro block

The Inverse Transform is applied to every 4x4 block. In 16x16 Intra prediction mode, a supplementary Hadamard transform is additionally applied to the DC coefficients. For a 16x16 Intra coded macroblock, most of the energy is concentrated in the DC coefficients, and this transform helps to de-correlate the DC coefficients by exploiting the correlation among them [8] [9] [10] [11]. An Inverse DC Quantization is also applied to the DC matrix.
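The Hadamard step on the 4x4 matrix of DC coefficients can be sketched in Python as the matrix product H·C·H. The matrix `H` below is the 4x4 Hadamard matrix usually given for H.264 luma DC coefficients and is an assumption here; the function names are hypothetical, and the subsequent DC rescaling and rounding are deliberately left out.

```python
# Assumed 4x4 Hadamard matrix (symmetric, entries +1 / -1).
H = [
    [1,  1,  1,  1],
    [1,  1, -1, -1],
    [1, -1, -1,  1],
    [1, -1,  1, -1],
]


def matmul4(a, b):
    """Multiply two 4x4 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def inverse_hadamard_4x4(c):
    """Compute H * C * H for the 4x4 matrix of DC coefficients C (no scaling)."""
    return matmul4(matmul4(H, c), H)


if __name__ == "__main__":
    dc = [[4, 0, 0, 0]] + [[0, 0, 0, 0] for _ in range(3)]
    for row in inverse_hadamard_4x4(dc):
        print(row)
```

Since every entry of `H` is +1 or -1, the products reduce to additions and subtractions in hardware.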

## V. SIMULATION RESULTS

The figures below show simulation results for some of the blocks designed in this project. The blocks are designed in Verilog HDL and simulated using the Xilinx ISE Simulator.





Fig.6: Top Module and simulation waveform of Normal Decoding Process





Fig.7: Top Module and simulation waveform of Bypass Decoding







Fig.8: Top Module and simulation waveform of Final Decoding Process





Fig.9: Top Module and simulation waveform of Context Variable Initialization







Fig.10: Top Module and simulation waveform of Inverse Quantization





Fig.11: Top Module and simulation waveform of Inverse Transform

Device Utilization Summary - xc7vx330t-3.4fg1157. Each cell lists Used / Available (Utilization).

| Logic Utilization | CABAC Normal Decoding | CABAC Bypass Decoding | CABAC Final Decoding | CABAC Context Variable Initialization | Inverse Quantization | Inverse Transform |
|---|---|---|---|---|---|---|
| Number of Slice Registers | 11 / 408000 (0%) | | | 14 / 408000 (0%) | 22 / 408000 (0%) | - |
| Number of Slice LUTs | 358 / 204000 (0%) | 178 / 204000 (0%) | 255 / 204000 (0%) | 6280 / 204000 (3%) | 548 / 204000 (0%) | 1888 / 204000 (0%) |
| Number of fully used LUT-FF pairs | 11 / 358 (3%) | 0 / 178 (0%) | 0 / 255 (0%) | 14 / 6280 (0%) | 4 / 566 (0%) | 0 / 1888 (0%) |
| Number of bonded IOBs | 113 / 600 (18%) | 91 / 600 (15%) | 100 / 600 (16%) | 51 / 600 (8%) | 523 / 600 (87%) | 516 / 600 (86%) |
| Number of BUFG/BUFGCTRL/BUFHCEs | 1 / 200 (0%) | 1 / 200 (0%) | 1 / 200 (0%) | 1 / 200 (0%) | 1 / 200 (0%) | 1 / 200 (0%) |
| Number of DSP48E1s | - | - | - | 1 / 1120 (0%) | 16 / 1120 (1%) | - |

Fig.12: Device Utilization summary of Configuration

## VI. CONCLUSION & FUTURE SCOPE

In this paper we have designed the Entropy Decoding, Inverse Quantization and Inverse Transform blocks of an H.264 video decoder using Verilog. We used the CABAC algorithm for entropy decoding and the Inverse Integer Transform for the inverse transform. CABAC attains a better compression rate than CAVLC [6], and the Inverse Integer Transform is more accurate and easier to implement than the Inverse Discrete Cosine Transform [10]. The resulting design is effective, and the designed blocks were simulated using the Xilinx ISE simulator. This work presents the design of only three blocks of the H.264 video decoder. As future scope, the remaining blocks of the H.264 video decoder can be designed in Verilog, and the architecture of these blocks can be implemented in an ASIC environment.

## REFERENCES

1. Shriram K Vasudevan, Subashri V, Sasikumar P, "H.264 Decoder Design using Verilog", International Journal of Computer Applications, Volume 1 - No.4, pp. 0975 – 8887, 2010.
2. B. Juurlink et al., "Scalable Parallel Programming Applied to H.264/AVC Decoding", Springer Briefs in Computer Science, DOI 10.1007/978-1-4614-2230-3\_2, © The Author(s) 2012.
3. Yi-Tsen Chen, Chun-Jen Tsai, "Design of a Unified Entropy IP for H.264 CAVLC/CABAC Decoding", National Chiao Tung University,
4. Detlev Marpe, Thomas Wiegand and Gary J. Sullivan, "The H.264/MPEG4 Advanced video coding standard and its Applications", IEEE Communications Magazine, vol. 44, no. 8, pp. 134-144, Aug. 2006.
5. M. Bariani, P. Lambrushchini, M. Raggio, L. Pezzoni, "An optimized SIMD implementation of the HEVC/H.265 video decoder", IEEE Trans. On Wireless Telecommunications Symposium (WTS), pp. 1934-5070, June 2014.
6. D. Marpe, T. Wiegand and H. Schwarz, "Context-based adaptive binary arithmetic coding in the H.264/AVC Video compression standard", IEEE Trans. on circuits and systems for video technology, vol. 13, no. 7, pp. 620-636, July 2003.
7. Varan Jain, Manisha Ingle, "Hardware Software co-design for CABAC Entropy Decoder", International Conference on Computation Technologies (ICICT), 26-27 Aug. 2016.
8. Ozgur Tasdzen, Ilker Hamzaoglu, "A high performance and low-cost hardware architecture for H.264 Transform and Quantization algorithms", Faculty of Engineering and Natural Sciences, Sabanci University, Istanbul, Turkey, 13<sup>th</sup> European Signal Processing Conference, 2005.
9. Tiago Dis, Luis Rosario, Nuno Roma, Leonel Sousa, "High Performance Unified Architecture for Forward and Inverse Quantization in H.264/AVC", 15th Euromicro Conference on Digital System Design, 2012.
10. Yi-Chih Cao, Shih-Tse Wei, Bin-Da Liu, Jar-Ferr Yang, "Combined CAVLC Decoder, Inverse Quantizer, and Transform Kernel in Compact H.264/AVC Decoder", IEEE Transactions on Circuits and Systems for Video Technology, Volume: 19, Issue: 1, Jan. 2009.
11. Taheni Damak, Hassen Loukil, Ahmed Ben Atitallah, Nouri Masmoudi, "Software and Hardware Architecture of H.264/AVC Decoder", International Journal of Computer Applications (0975-8887), Volume: 59, No. 19, Dec. 2012.
12. Richardson, I.E.G, "H.264/AVC and MPEG4 video compression. Video Coding for Next Generation Multimedia", Wiley editor, 2003.
13. G. Dileep Vamshi, P. Ramakrishna, "VLSI Implementation of H.264 Transform and Quantization Algorithms", International Journal of Science and Research (IJSR), India Online ISSN: 2319-7064, Volume: 2, Issue 3, March 2013.
14. Henrique S. Malvar, Antti Hallapuro, Marta Karczewicz, Louis Kerofsky, "Low Complexity Transform and Quantization in H.264/AVC", IEEE Transactions on Circuits and Systems for Video Technology, Vol. 13, no. 7, July 2003.

## **AUTHORS PROFILE**



Shwethashree. H. N. received her Bachelor of Engineering degree in Instrumentation Technology from Visvesvaraya Technological University (VTU), Belgaum, Karnataka, India in 2009. She has 6.5 years of work experience in the field of Industrial Automation. She is currently pursuing her Master of Technology in VLSI Design and Embedded Systems from VTU Extension Centre, UTL Technologies Limited, Bangalore.



Mr. C. Kanagasabapathi is currently working as an Assistant Professor in the Department of VLSI Design and Embedded Systems at VTU Extension Centre, UTL Technologies Limited, Bangalore. He has 30 years of work experience in various industries. He has published 20 research papers in various journals and conferences.



Dr. Siva. S. Yellampalli obtained his MS & Ph.D from Louisiana State University. He is currently with VTU Extension Centre, UTL Technologies Ltd. He worked on a broad range of research topics including Very Large Scale Integration (VLSI), mixed signal circuits/systems development, micro-electromechanical systems (MEMS), and integrated carbon nanotube based sensors. He has published a book in the area of mixed signal design, and edited two books on carbon nanotubes. He has also published multiple journal papers and IEEE conference papers in these areas of research. In addition, he has given many professional presentations including invited talks. He has been a consultant to a variety of industries and acts as a reviewer for technical journals and book publishers. Currently he is the Education Activities Chair, Executive Committee IEEE Bangalore Section. He is also a life member of Indian Society for Technical Education (ISTE), senior member, International Association of Computer Science and Information Technology (IACSIT), VLSI Society of India (VSI) and National Society of Collegiate Scholars (NSCS).