New algorithms for video compression, image classification and image enhancement

Kai-Lung Hua, Purdue University

Abstract

In the first topic, we propose the use of large dictionaries of tilings for video compression. We construct a rate-distortion cost function that admits fast search algorithms to select the optimal tiling for the motion compensation stage of a video coder. The computation of the cost is enabled through novel algorithms to approximate the bit rate and the distortion. We propose efficient arithmetic coding algorithms to encode the selected tiling. We illustrate the effectiveness of our approach by showing that a video coder utilizing one of the proposed tiling selection methods achieves up to 16% savings in bit rate for several standard video sequences as compared to H.264/AVC.

In the second topic, we present an automatic method for classifying a document image using a probabilistic decision strategy. Our algorithm is tailored to inexpensive hardware and significantly reduces both the running time and memory requirements compared to our previously proposed algorithms of [1, 2], while substantially improving the classification accuracy. In addition, we develop a new classification module to help avoid moiré patterns by identifying periodic halftone noise.

The third topic is JPEG-XR post-processing. JPEG-XR is a very promising, recently adopted still-image compression standard that has complexity comparable to JPEG and quality comparable to JPEG-2000. We investigate a subtle artifact of JPEG-XR coded images that arises due to the nonlinearity of transforms used by JPEG-XR. The artifact will typically not be visible in natural images; however, it is important for document images and needs to be taken into account by document image processing pipelines. We present a post-processing method to suppress this artifact. Our method consists of a region-of-interest locator and a filter. Experimental results on several document images show that this method improves the peak signal-to-noise ratio (PSNR) by up to 4.20 dB and the mean structural similarity (MSSIM) index by up to 0.0136.
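For readers unfamiliar with the PSNR metric quoted above, the following is a minimal sketch of its standard definition, 10 log10(peak^2 / MSE). It is not taken from the dissertation: the function name `psnr`, the flat-list pixel representation, and the 8-bit peak value of 255 are assumptions made for illustration only.

```python
import math


def psnr(reference, processed, peak=255.0):
    """Return the PSNR in dB between two equal-length sequences of pixel values.

    Assumption: pixels are given as flat sequences of numbers and the peak
    value is 255 (8-bit images). The dissertation's own evaluation setup
    is not described in the abstract.
    """
    if len(reference) != len(processed) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the two images.
    mse = sum((r - p) ** 2 for r, p in zip(reference, processed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    original = [100, 100, 100, 100]
    decoded = [104, 96, 104, 96]      # hypothetical decoder output
    filtered = [101, 99, 101, 99]     # hypothetical post-processed output
    # A PSNR improvement is the difference of the two values in dB.
    gain = psnr(original, filtered) - psnr(original, decoded)
    print(f"PSNR gain: {gain:.2f} dB")
```

Under this definition, an improvement such as the one reported is the difference between the PSNR of the post-processed image and the PSNR of the decoded image, each measured against the same original.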

Degree

Ph.D.

Advisors

Pollak, Purdue University.

Subject Area

Electrical engineering
