Memory-efficient algorithms for raster document image compression

Maribel Figuera Alegre, Purdue University

Abstract

In this dissertation, we develop memory-efficient algorithms for compression of scanned document images. The research is presented in two parts. In the first part, we address key encoding issues for effective compression of multi-page document images with JBIG2. The JBIG2 binary image encoder achieves bit rates dramatically lower than those of previous encoders. The effectiveness of JBIG2 is largely due to its use of pattern matching techniques and symbol dictionaries for the representation of text. While dictionary design is critical to achieving low bit rates, little research has been done on the optimization of dictionaries across stripes and pages. In Chapter 1, we propose a novel dynamic dictionary design that substantially reduces JBIG2 bit rates, particularly for multi-page documents. This dynamic dictionary updating scheme uses caching algorithms to manage the symbol dictionary memory more efficiently. Results show that, when encoding multi-page documents, the new dynamic symbol caching technique reduces the bit rate by between 12% and 53% relative to the best previous dictionary construction schemes for lossy compression. We also show that when pages are striped, the bit rate can be reduced by between 2% and 25% by adaptively changing the stripe size. In addition, we propose a new pattern matching criterion that is robust to substitution errors and results in both low bit rates and high encoding speeds.

In the second part, we propose a hardware-efficient solution for compression of raster color compound documents. Effective compression of a raster compound document comprising a combination of text, graphics, and pictures typically requires that the content of the scanned document image first be segmented into multiple layers. Each layer is then compressed as a separate image using a different coding method. Layer-based compression formats such as mixed raster content (MRC) achieve very low bit rates while maintaining text and graphics quality. However, MRC-based compression algorithms typically require large memory buffers and are therefore not easily implemented in imaging pipeline hardware. In Chapter 2, we propose a hardware-friendly, block-based lossy document compression algorithm, which we call mixed content compression (MCC), designed to work with conventional JPEG coding using only an 8-row buffer of pixels. MCC uses the JPEG encoder to effectively compress the background and picture content of a document image. The remaining text and line graphics in the image, which require high spatial resolution but can tolerate low color resolution, are compressed using a JBIG1 encoder and color quantization. To separate the text and graphics from the image, MCC uses a simple mean square error (MSE) block classification algorithm, which allows a hardware-efficient implementation. Results show that for our comprehensive training suite, the average compression ratio achieved by MCC was 60:1, whereas JPEG achieved only 35:1. In particular, MCC compression ratios are very high on average (82:1 versus 44:1) for mono text documents, which are very commonly copied and scanned with all-in-one devices. In addition, MCC has an edge-sharpening side effect that is very desirable for the target application.
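The abstract does not spell out the MSE block classifier, so the following is only a rough sketch of how such a classifier might look under stated assumptions. It assumes grayscale 8x8 blocks cut from an 8-row strip, a two-level approximation obtained by thresholding at the block mean, and two hypothetical parameters, `mse_threshold` and `min_contrast`. None of these names or values come from the dissertation.

```python
# Illustrative sketch only. The two-level approximation, the thresholds, and
# all function names are assumptions, not the dissertation's actual method.

def two_level_fit(block):
    """Fit a block with two gray levels by thresholding at the block mean.

    Returns (mse, contrast): the mean square error of the two-level
    approximation and the difference between the two fitted levels.
    """
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    dark = [p for p in pixels if p <= mean]   # never empty: min(pixels) <= mean
    light = [p for p in pixels if p > mean]   # empty for a perfectly flat block
    dark_level = sum(dark) / len(dark)
    light_level = sum(light) / len(light) if light else dark_level
    error = sum((p - dark_level) ** 2 for p in dark)
    error += sum((p - light_level) ** 2 for p in light)
    return error / len(pixels), light_level - dark_level


def classify_block(block, mse_threshold=100.0, min_contrast=64.0):
    """Label a block "binary" (text or line graphics) or "jpeg" (everything else).

    A block is treated as text-like when two levels reproduce it with low MSE
    and those levels are far enough apart. Flat background has low MSE but no
    contrast, so it is left to the JPEG path.
    """
    mse, contrast = two_level_fit(block)
    if mse <= mse_threshold and contrast >= min_contrast:
        return "binary"
    return "jpeg"


def classify_strip(strip, block_width=8):
    """Classify each block of an 8-row strip, left to right."""
    width = len(strip[0])
    return [
        classify_block([row[x:x + block_width] for row in strip])
        for x in range(0, width, block_width)
    ]
```

In this sketch, a high-contrast checkerboard block is routed to the binary path, while a flat block and a smooth gradient block are both routed to JPEG. Only one strip of rows has to be held at a time, which is the property the 8-row buffer constraint calls for.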

Degree

Ph.D.

Advisors

Bouman, Purdue University.

Subject Area

Electrical engineering
