Three Problems in Image Analysis and Processing: Determining Optimal Resolution for Scanned Document Raster Content, Page Orientation, and Color Table Compression

Zhenhua Hu, Purdue University

Abstract

This thesis addresses three problems in image analysis and processing: determining the optimal resolution for scanned document raster content, determining page orientation, and compressing color tables.

Determining the optimal resolution for scanned document raster content aims to find the best scan resolution for different scan materials. Here the optimal scan resolution means the lowest resolution that keeps all the information in the scanned material. Choosing this resolution saves storage. In this study, the candidate resolutions are 300 dpi, 150 dpi, and 75 dpi. We start with 300 dpi because this resolution preserves the information in most scanned pages. 75 dpi is usually the lowest scan resolution that a printer offers, and 150 dpi is the resolution in between. We developed an algorithm that extracts features and uses a support vector machine (SVM) to find the optimal scan resolution. The features include the tile standard deviation (STDDEV) structural similarity index measure mean (tile-STDDEV SSIM), the tile-STDDEV structural similarity index measure STDDEV (tile-STDDEV SSIM STDDEV), sample power spectrum MSE, spatial activity, edge density, and edge contrast. These features reflect the fidelity of low-resolution images to their high-resolution counterparts (the references) and the intrinsic changes that occur when going from the high resolution to the lower resolutions. Feeding these features into the SVM classifier yields a prediction accuracy of 93.4%.

Automatically determining page orientation spares users from manually aligning printed pages before scanning. In this thesis, we propose an algorithm based on hand-crafted features and an SVM. The features include the vertical document vector (VDV), horizontal document vector (HDV), zonal density vector (ZDV), and profile document vector (PDV). Concatenating them gives a feature vector for the document page. The feature vectors are then fed into the SVM for training and prediction. This algorithm works on multiple scripts, including Chinese, Devanagari, Japanese, Korean, Numeral, English, French, German, Greek, Italian, Portuguese, Russian, and Spanish. Our algorithm detects the script first, and then the orientation of the page. We also build a script detection hierarchy based on the structural similarities of different scripts. Experimental results show that the overall script accuracy is 98.2%, and the overall orientation accuracy for all scripts is 99.2%.

Color management plays an important role in color reproduction and in the transformation of color information among various devices. Device profiles, such as color look-up tables (LUTs), provide color management systems with the information necessary to convert color data between native device color spaces and device-independent color spaces. LUTs are often embedded in color documents to achieve color fidelity between different devices. The size of color tables increases with finer sampling of the color spaces and with larger bit depths. A method to compress LUTs is therefore desirable for conserving memory and storage and for reducing network traffic and delay. In this dissertation, we propose a lossless 1D color table compression method based on the discrete cosine transform (DCT). The compressed data consists of four files: the rounded quantized DCT coefficients for the color table; the residue table, whose values are the differences between the original color tables and the initial reconstructed color tables; and the coefficient bit assignment tables (CBAT) and the residue bit assignment tables (RBAT) that we propose for the quantized DCT coefficients and the residue table, respectively.
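The coefficient-plus-residue idea behind the color table compression can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the uniform quantization step `step`, and the sample table are assumptions made for illustration, and the bit assignment tables (CBAT and RBAT) and any entropy coding are not modeled. The sketch only shows why storing the residue alongside the rounded quantized DCT coefficients makes the scheme lossless: the decoder repeats the same initial reconstruction and adds the residue back.

```python
import math

def dct(x):
    """Orthonormal 1D DCT-II of a sequence of numbers."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct(c):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out

def initial_reconstruction(q, step):
    """Dequantize the coefficients, invert the DCT, and round to integers."""
    return [round(v) for v in idct([qk * step for qk in q])]

def compress(table, step):
    """Return (rounded quantized DCT coefficients, residue table).

    `step` is a hypothetical uniform quantization step, not a value
    taken from the thesis.
    """
    q = [round(c / step) for c in dct(table)]
    initial = initial_reconstruction(q, step)
    residue = [t - r for t, r in zip(table, initial)]
    return q, residue

def decompress(q, residue, step):
    """Rebuild the original table exactly from coefficients and residue."""
    initial = initial_reconstruction(q, step)
    return [r + d for r, d in zip(initial, residue)]

# Hypothetical 1D slice of an 8-bit color table.
table = [0, 18, 37, 55, 76, 99, 120, 146, 170, 199, 228, 255]
q, residue = compress(table, 8.0)
assert decompress(q, residue, 8.0) == table
```

In this sketch, a larger `step` gives smaller coefficient values but larger residue values, so a real encoder has to balance the bits spent on each; the bit assignment tables named in the abstract presumably describe how many bits each stored value receives.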

Degree

Ph.D.

Advisors

Allebach, Purdue University.

Subject Area

Artificial intelligence|Computer science|Mathematics

