Document image segmentation and compression

Hui Cheng, Purdue University

Abstract

In the first part of this research, we propose an image segmentation algorithm called the trainable sequential MAP (TSMAP) algorithm. The TSMAP algorithm is based on a multiscale Bayesian approach. It has a novel multiscale context model that can capture complex aspects of both local and global contextual behavior. In addition, its image model uses local texture features extracted via a wavelet decomposition, and the textural information at various scales is captured by a hidden Markov model. The parameters that describe the characteristics of typical images are extracted from a database of training images and their accurate segmentations. Once training is complete, scanned documents may be segmented using a computationally efficient fine-to-coarse-to-fine procedure.

In the second part of this research, we introduce a multilayer compression algorithm for document images. This compression algorithm first segments a scanned document image into different classes, then compresses each class using an algorithm specifically designed for that class. We also propose a rate-distortion optimized segmentation (RDOS) algorithm developed for document compression. Compared with the TSMAP algorithm, the RDOS algorithm often yields a better rate-distortion trade-off and produces more robust segmentations by eliminating the misclassifications that can cause severe artifacts. Experimental results show that, at similar bit rates, the multilayer compression algorithm using RDOS can achieve a much higher subjective quality than well-known coders such as DjVu, SPIHT, and JPEG.
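The idea of choosing a segmentation by its rate-distortion consequences can be illustrated with a small sketch. It is not the RDOS algorithm itself, whose cost function and class set are not given in this abstract. It assumes a standard Lagrangian cost D + lam * R evaluated independently for each block, and the class names, the function `choose_classes`, and all numbers are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical per-block estimates: class name -> (rate in bits, distortion).
# In a real coder these would come from trial-encoding the block with each
# class-specific compressor; here they are made-up illustrative values.
BlockCosts = Dict[str, Tuple[float, float]]


def choose_classes(blocks: List[BlockCosts], lam: float) -> List[str]:
    """For each block, pick the class that minimizes distortion + lam * rate.

    Assumption: blocks are decided independently, which ignores any
    contextual dependence between neighboring blocks.
    """
    labels = []
    for costs in blocks:
        best = min(costs, key=lambda c: costs[c][1] + lam * costs[c][0])
        labels.append(best)
    return labels


blocks = [
    {"text": (120.0, 2.0), "picture": (400.0, 1.0), "background": (10.0, 50.0)},
    {"text": (150.0, 30.0), "picture": (300.0, 3.0), "background": (10.0, 80.0)},
]

# A small lam favors low distortion; a large lam favors low bit rate.
labels_low = choose_classes(blocks, lam=0.01)
labels_high = choose_classes(blocks, lam=1.0)
print(labels_low)
print(labels_high)
```

Under these assumed numbers, raising `lam` pushes both blocks toward the cheapest class, which shows how a single parameter trades bit rate against distortion when the segmentation is chosen with the coder in mind.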

Degree

Ph.D.

Advisors

Bouman, Purdue University.

Subject Area

Electrical engineering|Computer science|Statistics
