Date of Award

12-2017

Degree Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Statistics

Committee Chair

William Cleveland

Committee Co-Chair

Ryan Hafen

Committee Member 1

Bowei Xi

Committee Member 2

Vinayak Rao

Abstract

Statistical visualization of large-scale data has become an increasingly essential task in the era of big data. In particular, exploratory data analysis and visualization is the first step towards any in-depth statistical modeling and analysis. Being able to rapidly specify and generate visualizations regardless of data scale is crucial. Trelliscope handles data visualization at scale by attaching cognostics (univariate metrics) to each panel, which aids in organizing panels of interest. While Trelliscope provides a general framework for visualizing data at scale, several aspects, such as cognostics and axis scales, can be improved to help users generate displays more rapidly. When visually modeling complex data with Trelliscope, traditional two-grouped plot matrices do not allow for a mixed-scale axis to display both continuous and discrete data natively. Web-based visualization systems like Trelliscope that retrieve information from a back-end service such as R must maximize performance for an engaging user experience. To address the mixed-scale plot matrix axis, a generalized plot matrix is developed for two-grouped data that displays both continuous and discrete data using appropriate visualization methods for each panel. To complement Trelliscope’s panel organization, automatic cognostic summaries are established by mapping the context of what is visualized to classes of metrics that are meaningful for each type of visualization layer, at no additional user effort. Finally, communication from web-based visualization systems to back-end R services is greatly improved by leveraging the GraphQL query language, which minimizes the number of data queries needed to perform data extraction. Together, these three contributions curtail the increasing complexity and scale of data visualization.
