Date of Award

5-2018

Degree Type

Thesis

Degree Name

Master of Science (MS)

Department

Computer and Information Technology

Committee Chair

William G. McCartney

Committee Co-Chair

Joseph P. Robinson

Committee Member 1

Bartek Rajwa

Committee Member 2

John A. Springer

Abstract

This paper introduces a novel approach for assessing multiple patterns in biological imaging datasets. The developed tool is intended to provide the most probable structure of an image dataset consisting of biological patterns not encountered during model training. The tool has two major parts: (1) a feature learning and extraction pipeline and (2) subsequent clustering with estimation of the number of classes. The feature-learning part includes two deep-learning techniques, with a feature quantitation pipeline as a benchmark method. Clustering includes three non-parametric methods. K-means clustering is employed for validation and hypothesis testing by comparing results with the provided ground truth. The most appropriate methods and hyper-parameters are suggested for achieving maximum clustering quality. A convolutional autoencoder demonstrated the most stable and robust results: an entropy-based V-measure of 0.9759 on a dataset of classes used for training and 0.9553 on a dataset of completely novel classes.
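For readers unfamiliar with the V-measure reported above, the following is a minimal sketch of its standard definition: the harmonic mean of homogeneity and completeness, each derived from conditional entropies between ground-truth classes and predicted clusters. It is not taken from the thesis; the function names (`entropy`, `conditional_entropy`, `v_measure`) are illustrative, and equal weighting of the two terms is assumed.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy H(labels), natural log."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def conditional_entropy(target, given):
    """Conditional entropy H(target | given), natural log."""
    n = len(target)
    joint = Counter(zip(target, given))
    given_counts = Counter(given)
    return -sum(
        (c / n) * math.log(c / given_counts[g]) for (_, g), c in joint.items()
    )


def v_measure(truth, pred):
    """Harmonic mean of homogeneity and completeness (assumed equal weights)."""
    h_truth = entropy(truth)
    h_pred = entropy(pred)
    # Homogeneity: each cluster contains members of a single class.
    homogeneity = (
        1.0 if h_truth == 0 else 1.0 - conditional_entropy(truth, pred) / h_truth
    )
    # Completeness: all members of a class fall in the same cluster.
    completeness = (
        1.0 if h_pred == 0 else 1.0 - conditional_entropy(pred, truth) / h_pred
    )
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)


# Cluster ids are arbitrary, so a relabelled perfect clustering still scores 1.
perfect = v_measure([0, 0, 1, 1], [1, 1, 0, 0])
# Putting everything in one cluster is complete but not homogeneous.
collapsed = v_measure([0, 0, 1, 1], [0, 0, 0, 0])
```

Because the score depends only on how labels co-occur, it is invariant to permutations of cluster ids, which makes it suitable for comparing clusterings against ground truth when the number of clusters is itself estimated.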
