Information retrieval and knowledge management in catalyst chemistry discovery environments

Balachandra B Krishnamurthy, Purdue University

Abstract

Computers are increasingly used to manage the deluge of experimental and computational data generated in chemical research. Chemistry-related data comes in different forms, including text, molecules, diagrams, and tables, and each type of data needs to be handled differently. Chemists also query the data in many ways, and each type of query requires a different approach. They perform both approximate searches (by keyword or structural similarity) and exact searches, in which molecules are categorized into different classes. Methods suited to one type of data or query are not suitable for another. In this work, we look at three ways of handling data and answering nontraditional forms of queries. In the first part, we look at how to search chemical information present in textual data such as journal articles. We show the feasibility of statistical techniques for identifying the chemical entities present in a text document. In the second part, we look at how to capture the classification hierarchy that a chemist might use to categorize the molecules of interest. We demonstrate an efficient rule-based expert system that classifies molecules into different categories based on user-defined rules. The classification information is then used as a search criterion for searching molecules. In the third part, we look at how to search for molecules that are similar in structure to a given molecule. We apply information retrieval techniques to compute the similarity of two molecules and incorporate user feedback to improve search performance.
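The third part of the abstract can be illustrated with a minimal sketch. The abstract does not say which information retrieval techniques the dissertation uses, so everything below is an assumption chosen for illustration: molecules are treated as bags of "terms" (here, character bigrams of a SMILES string, a toy stand-in for real substructure fragments), similarity is the cosine of the term vectors, and user feedback is folded in with a Rocchio-style query update. The function names `fragments`, `cosine`, and `rocchio` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def fragments(smiles, n=2):
    """Toy term extraction (assumption): overlapping character n-grams
    of a SMILES string, standing in for real substructure fragments."""
    return Counter(smiles[i:i + n] for i in range(len(smiles) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio-style feedback: move the query vector toward the centroid of
    molecules the user marked relevant and away from those marked
    non-relevant. The weights are conventional defaults, not values
    taken from the dissertation."""
    updated = defaultdict(float)
    for term, weight in query.items():
        updated[term] += alpha * weight
    for vec in relevant:
        for term, weight in vec.items():
            updated[term] += beta * weight / len(relevant)
    for vec in nonrelevant:
        for term, weight in vec.items():
            updated[term] -= gamma * weight / len(nonrelevant)
    # Drop terms whose weight fell to zero or below.
    return {t: w for t, w in updated.items() if w > 0}


if __name__ == "__main__":
    query = fragments("CCO")      # ethanol
    candidate = fragments("CCN")  # ethylamine
    before = cosine(query, candidate)
    # Suppose the user marks the candidate as relevant.
    refined = rocchio(query, relevant=[candidate], nonrelevant=[])
    after = cosine(refined, candidate)
    print(f"similarity before feedback: {before:.3f}")
    print(f"similarity after feedback:  {after:.3f}")
```

In this sketch, marking a molecule as relevant raises its similarity to the refined query, which is the general effect feedback is meant to have; a real system would replace the bigram terms with chemically meaningful descriptors.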

Degree

Ph.D.

Advisors

Caruthers, Purdue University.

Subject Area

Chemical engineering
