A probabilistic network-based mechanism for multimedia database searching and data warehousing

Mei-Ling Shyu, Purdue University

Abstract

A good multimedia database management system (MDBMS) should be able to store, retrieve, and manage rich semantic data in multimedia database systems. Data can be stored not only in standardized databases but also in object repositories, knowledge bases, file systems, document retrieval systems, multimedia databases, and so on. Due to the complexity of real-world applications, the number of databases and the volume of data they hold have increased tremendously. With this explosive growth in the amount and complexity of data, effectively managing the network of databases and making use of the large amount of data becomes important. For this purpose, a probabilistic network-based mechanism is proposed for constructing a federation of data warehouses and speeding up information retrieval, in order to facilitate the functionality of an MDBMS. Our solution procedure consists of three steps. First, we build the probabilistic network by reasoning about the probability distributions and mining the generalized affinity-based associations from a set of historical data collected from the network of operational databases. By doing so, summarized and useful knowledge can be discovered. Second, we derive a similarity measure to construct a federation of data warehouses so as to reduce the number of inter-warehouse accesses required for queries. Databases with high similarity values are placed in the same data warehouse. The similarity value is measured via a stochastic process from the mined probability distributions. Third, a second stochastic process generates a list of possible paths with respect to a given query and specifies the particular media objects over the constructed data warehouses so as to speed up multimedia query processing and information retrieval. To illustrate these benefits, our approach has been implemented and empirical studies on real databases are presented.
Metrics for measuring the performance of the proposed mechanism are presented and the effectiveness of the system is thereby evaluated. The empirical study results show that the probabilistic reasoning and data mining processes lead to a better federation of data warehouses and reduce the cost of query processing.
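
The second step, grouping similar databases so that fewer queries cross warehouse boundaries, can be illustrated with a minimal sketch. It is not the dissertation's method: the abstract does not give the stochastic similarity measure, so the sketch substitutes a simple co-access frequency (the fraction of historical queries that touch both databases) and a threshold-based grouping with union-find. All function names, the sample query log, and the threshold value are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_access_similarity(query_log):
    """Fraction of logged queries that access both databases of each pair.

    Assumption: a stand-in for the mined affinity-based similarity,
    not the stochastic measure described in the abstract.
    """
    pair_counts = Counter()
    for accessed in query_log:
        for a, b in combinations(sorted(set(accessed)), 2):
            pair_counts[(a, b)] += 1
    total = len(query_log)
    return {pair: n / total for pair, n in pair_counts.items()}


def group_into_warehouses(databases, similarity, threshold):
    """Place databases whose similarity reaches the threshold in one warehouse."""
    parent = {db: db for db in databases}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), value in similarity.items():
        if value >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for db in databases:
        groups.setdefault(find(db), set()).add(db)
    return sorted(groups.values(), key=lambda g: sorted(g))


def inter_warehouse_accesses(query_log, warehouses):
    """Count extra warehouses each query must visit beyond its first."""
    home = {db: i for i, group in enumerate(warehouses) for db in group}
    return sum(max(len({home[db] for db in q}) - 1, 0) for q in query_log)


# Hypothetical historical log: each entry is the set of databases one query used.
query_log = [{"A", "B"}, {"A", "B"}, {"C", "D"}, {"A", "C"}, {"C", "D"}]
databases = ["A", "B", "C", "D"]

similarity = co_access_similarity(query_log)
warehouses = group_into_warehouses(databases, similarity, threshold=0.3)
one_per_db = [{db} for db in databases]

print(warehouses)
print(inter_warehouse_accesses(query_log, one_per_db),
      inter_warehouse_accesses(query_log, warehouses))
```

In this toy log, the pairs that are queried together most often end up in the same warehouse, and the count of inter-warehouse accesses drops compared with keeping every database separate. That count is the kind of cost the abstract says the federation is meant to reduce.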

Degree

Ph.D.

Advisors

Kashyap, Purdue University.

Subject Area

Computer science
