Can a scalable system be built to capture web page thumbnails?

Feras Hirzalla, Purdue University

Abstract

Screenshots of web pages are a novel user interface improvement that can be used in various systems. Several software packages can generate a visual representation of a web page, but none of them is designed to capture web pages in a scalable way. The goal of this research is to build and test a web page thumbnailer that captures screenshots of multiple web pages simultaneously using multiple threads, thus maximizing the number of web pages processed per unit of time. From the data collected during tests, recommendations on thread count can be made. To test the thumbnailer, the author used a sample set of 30,000 web pages. Web page sizes were collected from the sample set, and 10,000 web pages were assigned to each of the small, medium, and large categories. The thumbnailer was timed at different thread counts for each size category. The data showed that as more capture threads were added, the time it took to generate visual representations of web pages decreased.
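
The experimental setup described in the abstract, timing one batch of pages at several thread counts, can be illustrated with a minimal sketch. This is not the author's thumbnailer: `capture_thumbnail` is a hypothetical stand-in that sleeps briefly to mimic I/O-bound page rendering, and the function names, URLs, and thread counts are assumptions chosen only for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def capture_thumbnail(url):
    """Hypothetical stand-in for a real capture call (assumption, not the thesis code).

    Sleeps briefly to mimic I/O-bound rendering and returns a fake output name.
    """
    time.sleep(0.01)
    return url + ".png"


def run_batch(urls, thread_count):
    """Capture every URL with a pool of thread_count workers.

    Returns the list of outputs (in input order) and the elapsed seconds.
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        outputs = list(pool.map(capture_thumbnail, urls))
    return outputs, time.perf_counter() - start


def benchmark(urls, thread_counts):
    """Time the same batch once per thread count; returns {thread_count: seconds}."""
    return {n: run_batch(urls, n)[1] for n in thread_counts}


if __name__ == "__main__":
    # Illustrative URLs only; a real run would use the sampled page set.
    sample = ["http://example.com/page%d" % i for i in range(40)]
    for n, seconds in benchmark(sample, [1, 2, 4, 8]).items():
        print(f"{n} threads: {seconds:.3f} s")
```

Because the simulated work is a sleep, elapsed time in this sketch falls roughly in proportion to the thread count. A real capture workload would also be limited by CPU, memory, and network capacity, which is why measured timings are needed before recommending a thread count.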

Degree

M.S.

Advisors

Springer, Purdue University.

Subject Area

Information Technology

