Date of Award


Degree Type


Degree Name

Doctor of Philosophy (PhD)



First Advisor

Katy M. Rainey

Second Advisor

William M. Muir

Committee Chair

Katy M. Rainey

Committee Co-Chair

William M. Muir

Committee Member 1

Shaun Casteel

Committee Member 2

Bruce Craig

Committee Member 3

Torbert Rocheford


Increasingly, new sources of data are being incorporated into plant breeding pipelines. The enormous amounts of data generated by field phenomics and genotyping technologies place data mining and analysis on a completely different level, one that is challenging from both practical and theoretical standpoints. Intelligent decision-making relies on our ability to extract useful information from data that may help us achieve our goals more efficiently. Many plant breeders, agronomists, and geneticists perform analyses without knowing the relevant underlying assumptions, strengths, or pitfalls of the methods they employ. This study endeavors to assess the statistical learning properties and plant breeding applications of supervised and unsupervised machine learning techniques. A soybean nested association mapping panel (SoyNAM) was the base population for experiments designed in situ and in silico. We used mixed models and Markov random fields to evaluate phenotypic-genotypic-environmental associations among traits and the learning properties of genome-wide prediction methods. Alternative methods of analysis were proposed.