Development of Highly Efficient Video Encoders Based on Spatial-Temporal-View for Multiview Video Systems


Prof Sam Kwong and his team members at City University of Hong Kong

Multiview video, that is, video sequences recorded using multiple cameras, has attracted much attention recently because it can represent high-quality 3D scenes of the real world and provides new visual experiences beyond 2D, such as 3D depth impression and interactive selection of an arbitrary viewpoint or direction within a certain range of distances. With these features and the advances in display technology, it enables many new visual media applications, such as photorealistic rendering of 3D scenes, free-viewpoint television (FTV), 3D television (3DTV) broadcasting, and 3D games. However, multiview video consists of video sequences of the same scene captured simultaneously by multiple cameras from different angles and locations, resulting in tremendous amounts of data with extremely high temporal and inter-view redundancies. For example, an 8-view multiview video plus depth (1920x1080@60Hz) intended for autostereoscopic 3D displays produces 5.56 GBytes of raw data per second, which is 16 times the size of a single-view video. Moreover, the data volume increases with the number of views, and more views are required for a more realistic 3D representation.
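The 5.56 GBytes per second figure can be reproduced with a short calculation. The sketch below is only a plausibility check: it assumes 3 bytes per pixel for both the colour and the depth streams, and it assumes that "GBytes" means 2^30 bytes. Neither assumption is stated in the text; they are simply the values under which the quoted number comes out.

```python
def raw_data_rate_bytes(width, height, fps, bytes_per_pixel, streams):
    """Uncompressed data rate in bytes per second for a set of video streams."""
    return width * height * fps * bytes_per_pixel * streams


VIEWS = 8
STREAMS = VIEWS * 2          # each view carries one colour and one depth stream
BYTES_PER_PIXEL = 3          # assumption: 24 bits per pixel for every stream

single_view = raw_data_rate_bytes(1920, 1080, 60, BYTES_PER_PIXEL, 1)
multiview = raw_data_rate_bytes(1920, 1080, 60, BYTES_PER_PIXEL, STREAMS)

# assumption: 1 GByte is taken as 2**30 bytes
multiview_gbytes = multiview / 2**30
ratio = multiview // single_view

print(f"{multiview_gbytes:.2f} GBytes/s, {ratio}x a single view")
```

Under these assumptions the 16 streams (8 colour plus 8 depth) account for the factor of 16 mentioned above.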

Thus, efficient compression is vital for the success of multiview video. Multiview Video Coding (MVC), which is in high demand, was developed as an amendment to the H.264/MPEG-4 AVC video compression standard to achieve better compression efficiency than coding each view independently. However, this coding improvement comes at the cost of a dramatic increase in encoding complexity, for two reasons: the size of the coding units (blocks) is variable to allow higher prediction accuracy (the variable-block-size mode), and predictions are drawn not only from temporally related pictures of the same camera but also from pictures of neighboring cameras. It is therefore imperative to design optimization approaches that remove the computational obstacles to MVC.



In this project, a series of highly efficient, low-complexity optimization techniques has been developed to overcome this computational problem. We found that the video content and coding information of multiview video are highly correlated across the spatial, temporal and view dimensions. Consequently, the coding parameters of the current coding unit can be estimated or predicted from previously coded information in the spatial-temporal-view domain. As a result, these coding parameters need not be transmitted, and the complexity of selecting the best parameters is reduced.
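To make the idea of predicting coding parameters from neighbours concrete, here is a minimal sketch for one parameter, the block-size mode. It is not the project's algorithm: the mode names, the `candidate_modes` function and the rule "only test modes already used by spatial, temporal or inter-view neighbours" are all illustrative assumptions.

```python
# Hypothetical list of modes a full search would test, in testing order.
ALL_MODES = ["SKIP", "16x16", "16x8", "8x16", "8x8", "INTRA"]


def candidate_modes(spatial, temporal, interview):
    """Return a reduced mode list for the current block.

    Each argument is the list of modes chosen by already-coded neighbours:
    adjacent blocks in the same picture (spatial), the co-located block in
    a previous picture (temporal), and the corresponding block in a
    neighbouring camera view (interview).
    """
    seen = set(spatial) | set(temporal) | set(interview)
    reduced = [mode for mode in ALL_MODES if mode in seen]
    # With no neighbour information, fall back to the full search.
    return reduced or list(ALL_MODES)


modes = candidate_modes(
    spatial=["SKIP", "16x16"],
    temporal=["SKIP"],
    interview=["16x16"],
)
print(modes)
```

In this example the encoder would evaluate two modes instead of six. A real encoder would also need a safeguard for blocks whose content differs from all of their neighbours.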

In addition, we have successfully developed several all-zero block detection algorithms, which predict in advance whether the coefficients of the current block will all be zero (an all-zero block is one whose residual coefficients are all zero after being encoded). These algorithms can be applied to the complexity-intensive coding modules, such as motion/disparity estimation, transform and variable-block-size mode decision, to skip the coding of all-zero blocks and other unnecessary memory accesses.
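One generic way to detect an all-zero block before transforming it is a sufficient condition on the sum of absolute differences (SAD) of the residual. The sketch below is not the project's detection algorithm. It assumes a 4x4 orthonormal floating-point DCT and a uniform quantizer with a rounding offset, whereas a real H.264 encoder uses an integer transform. The function names and the parameters `qstep` and `rounding` are illustrative.

```python
import math

N = 4  # assumed block size


def dct_basis(u, x):
    """Orthonormal 1-D DCT-II basis function."""
    scale = math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)
    return scale * math.cos((2 * x + 1) * u * math.pi / (2 * N))


def dct2(block):
    """2-D DCT of an N x N residual block (list of lists)."""
    return [[sum(block[x][y] * dct_basis(u, x) * dct_basis(v, y)
                 for x in range(N) for y in range(N))
             for v in range(N)]
            for u in range(N)]


def quantize(coeffs, qstep, rounding=1 / 3):
    """Uniform quantizer: level = floor(|c| / qstep + rounding), sign kept."""
    return [[int(abs(c) / qstep + rounding) * (1 if c >= 0 else -1)
             for c in row]
            for row in coeffs]


def is_all_zero_by_sad(block, qstep, rounding=1 / 3):
    """Sufficient (not necessary) test that every level quantizes to zero.

    Each 2-D basis value has magnitude below 0.5, so |coefficient| is at
    most 0.5 * SAD. A coefficient quantizes to zero when its magnitude is
    below (1 - rounding) * qstep.
    """
    sad = sum(abs(value) for row in block for value in row)
    return 0.5 * sad < (1 - rounding) * qstep


residual = [[1, -1, 0, 2], [0, 1, -1, 0], [1, 0, 0, -1], [0, 0, 1, 0]]
if is_all_zero_by_sad(residual, qstep=16.0):
    levels = [[0] * N for _ in range(N)]   # transform and quantization skipped
else:
    levels = quantize(dct2(residual), qstep=16.0)
```

Because the test is only sufficient, some all-zero blocks will still go through the full transform; tighter thresholds trade extra analysis for more skipped blocks.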

Multiview Video Coding for 3D Video System

Additionally, since the signals in predictive coding have certain statistical characteristics (for example, prediction residuals are Gaussian distributed), a series of statistical early termination approaches have been developed and successfully applied to multi-reference motion/disparity estimation and mode decision, significantly reducing the complexity of the encoder. These novel methods open a new horizon for early termination in MVC as well as in mono-view video coding. Based on these optimization techniques, fast MVC encoders can be designed, making MVC readily applicable to interactive 3D cinema/TV, FTV, 3D gaming, immersive virtual reality and other multiview video-based services. In addition, these techniques reduce the cost of industrial realization and production of 3D encoders.
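The general shape of a statistical early termination can be sketched as follows. This is a generic illustration, not the project's method: the threshold rule (mean plus `k` standard deviations of the best costs of previously coded blocks), the function names and the example costs are all assumptions.

```python
import statistics


def early_termination_threshold(previous_best_costs, k=1.0):
    """Threshold from the costs of already-coded blocks (assumed rule).

    If the best costs are roughly Gaussian, a cost at or below
    mean + k * std is typical of a good match.
    """
    mu = statistics.mean(previous_best_costs)
    sigma = statistics.pstdev(previous_best_costs)
    return mu + k * sigma


def search_with_early_termination(candidates, cost_fn, threshold):
    """Test candidates in order; stop once the best cost is good enough."""
    best, best_cost, evaluated = None, float("inf"), 0
    for candidate in candidates:
        cost = cost_fn(candidate)
        evaluated += 1
        if cost < best_cost:
            best, best_cost = candidate, cost
        if best_cost <= threshold:
            break  # remaining reference pictures or modes are skipped
    return best, best_cost, evaluated


# Hypothetical costs of three reference pictures for the current block.
costs = {"temporal_ref0": 300, "interview_ref0": 105, "temporal_ref1": 50}
threshold = early_termination_threshold([100, 110, 90, 100])
best, best_cost, evaluated = search_with_early_termination(
    list(costs), costs.get, threshold
)
print(best, best_cost, evaluated)
```

Here the search stops after the second candidate and never evaluates the third, even though the third has a lower cost. That is the trade-off of any early termination: a larger `k` saves more computation but accepts more sub-optimal choices.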

Prof Sam Kwong Tak Wu
Department of Computer Science
City University of Hong Kong