Model Selection for Gaussian Mixture Models for Uncertainty Qualification

Yiyi Chen, Purdue University
Guang Lin, Purdue University
Xuan Liu, Purdue University

Keywords

Gaussian mixture models, Model selection, EM algorithm, Penalized likelihood

Presentation Type

Event

Research Abstract

Clustering is the task of assigning objects to groups so that objects in the same group are more similar to each other than to objects in other groups. A Gaussian mixture model fitted with the Expectation-Maximization (EM) algorithm is one of the most general ways to cluster a large data set. However, this method requires the number of Gaussian components (one per cluster) as an input in order to approximate the original data set. A method that determines the number of components automatically would allow the approach to be applied in a much wider range of settings. The original algorithm includes a weight for each cluster, which describes how strongly that cluster contributes to the data set and, more precisely, to each data point. Our idea is to first set the number of clusters to a large value and then apply a penalized likelihood method to update the weights while the other parameters are also being updated. A cluster is deleted if its weight falls below a preset threshold. When the iterations finish, the algorithm yields the number of clusters along with the other parameters of the Gaussian mixture model. Simulation results in MATLAB show that the modified method can determine the number of clusters and that the resulting clustering represents the original data set well. Although the modified algorithm can carry out the whole clustering process automatically, its accuracy needs further investigation and its speed needs to be improved.
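
The procedure described above can be illustrated with a minimal sketch. The abstract does not give the form of the penalty, the pruning threshold, or the initialization, so everything specific here is an assumption: the sketch works on one-dimensional data, subtracts a constant `penalty` from each component's effective count before renormalizing the weights, and drops components whose weight falls below `min_weight`. The function name `penalized_em_1d` and all parameter values are hypothetical, and the authors' MATLAB implementation may differ in every one of these choices.

```python
import math
import random


def penalized_em_1d(data, k_init=8, penalty=5.0, min_weight=0.02,
                    n_iter=200, seed=0):
    """EM for a 1-D Gaussian mixture with weight shrinkage and pruning.

    `penalty` and `min_weight` are illustrative tuning knobs (assumptions),
    not values taken from the abstract.
    """
    rng = random.Random(seed)
    n = len(data)
    mean_all = sum(data) / n
    var_all = sum((x - mean_all) ** 2 for x in data) / n

    # Deliberately start with more components than we expect to need.
    means = rng.sample(data, k_init)
    variances = [var_all] * k_init
    weights = [1.0 / k_init] * k_init

    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            dens = [w * math.exp(-0.5 * (x - m) ** 2 / v)
                    / math.sqrt(2.0 * math.pi * v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300  # guard against underflow
            resp.append([d / total for d in dens])

        # Penalized weight update (assumed form): shrink each component's
        # effective count by a constant, clip at zero, then renormalize.
        counts = [sum(r[k] for r in resp) for k in range(len(weights))]
        shrunk = [max(0.0, c - penalty) for c in counts]
        total = sum(shrunk)
        if total == 0.0:
            break  # penalty too strong for this data; keep last estimate
        new_weights = [s / total for s in shrunk]

        # Delete components whose weight is below the threshold.
        keep = [k for k, w in enumerate(new_weights) if w >= min_weight]
        if not keep:
            keep = [max(range(len(new_weights)), key=new_weights.__getitem__)]

        # Standard M-step for the means and variances of the survivors.
        new_means, new_vars = [], []
        for k in keep:
            m = sum(r[k] * x for r, x in zip(resp, data)) / counts[k]
            v = sum(r[k] * (x - m) ** 2 for r, x in zip(resp, data)) / counts[k]
            new_means.append(m)
            new_vars.append(max(v, 1e-6))  # variance floor for stability
        kept_total = sum(new_weights[k] for k in keep)
        weights = [new_weights[k] / kept_total for k in keep]
        means, variances = new_means, new_vars

    return weights, means, variances


if __name__ == "__main__":
    gen = random.Random(1)
    sample = ([gen.gauss(-5.0, 1.0) for _ in range(150)]
              + [gen.gauss(5.0, 1.0) for _ in range(150)])
    w, m, v = penalized_em_1d(sample)
    print("components kept:", len(w))
    print("weights:", w)
    print("means:", m)
```

The number of surviving components depends on `penalty` and `min_weight`, so in practice both would need tuning. That is consistent with the abstract's remark that the accuracy of the approach needs further investigation.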

Session Track

Data: Insight and Visualization

Recommended Citation

Yiyi Chen, Guang Lin, and Xuan Liu, "Model Selection for Gaussian Mixture Models for Uncertainty Qualification" (August 6, 2015). The Summer Undergraduate Research Fellowship (SURF) Symposium. Paper 25.
https://docs.lib.purdue.edu/surf/2015/presentations/25

Included in

Categorical Data Analysis Commons, Multivariate Analysis Commons, Statistical Models Commons

Aug 6th, 12:00 AM

The Summer Undergraduate Research Fellowship (SURF) Symposium