Introduction
Understanding the Application: An Overview of the H.264 Standard
Discovering the Parallelism: Task-level Parallelism in H.264 Decoding
Exploiting Parallelism: the 2D-Wave
Extracting More Parallelism: the 3D-Wave
Addressing the Bottleneck: Parallel Entropy Decoding
Putting It All Together: A Fully Parallel and Efficient H.264 Decoder
Conclusions
Existing software applications must be redesigned if programmers want to benefit from the performance offered by multi- and many-core architectures. Performance scalability now depends on finding and exploiting enough Thread-Level Parallelism (TLP) in applications to use the growing number of cores on a chip.
Video decoding is an example of an application domain whose computational requirements increase with every new generation. On the one hand, this is due to the trend towards high-quality video systems (high definition, high frame rate, 3D displays, etc.), which continuously increases the amount of data that has to be processed in real time. On the other hand, there is the requirement to maintain high compression efficiency, which is only possible with video codecs like H.264/AVC that use advanced coding techniques.
In this book, the parallelization of H.264/AVC decoding is presented as a case study of parallel programming. H.264/AVC decoding is an example of a complex application with many levels of dependencies, different kernels, and irregular data structures. The book presents a detailed methodology for parallelizing this type of application. It begins with a description of the algorithm, an analysis of the data dependencies, and an evaluation of the different parallelization strategies. It then presents the design and implementation of a novel parallelization approach that scales to many-core architectures. Experimental results on different parallel architectures are discussed in detail. Finally, an outlook is given on parallelization opportunities in the upcoming HEVC standard.