Discovery Undergraduate Interdisciplinary Research Internship

Crop Yield Prediction at Multiple Spatial Scales with Statistical Machine Learning

Abstract

Accurate crop yield prediction is increasingly important in the face of global food security challenges, which makes standardized data and scalable models essential. To support this, researchers have developed CY-Bench (Crop Yield Benchmark), a comprehensive dataset for forecasting maize and wheat yields on a global scale. This research project worked primarily with the CY-Bench dataset, with the aim of improving crop yield prediction through machine learning. First, papers describing the CY-Bench dataset and other work on agricultural modeling were studied in detail. The benchmark results were then reproduced, which demonstrated the accessibility of the dataset. Building on this foundation, new models such as regression trees were developed, and new derived features such as Leaf Area Index and evapotranspiration were incorporated into existing models. To streamline development, a subset covering Tippecanoe County was isolated from the broader US dataset. With the newly engineered features, the mean absolute percentage error (MAPE) decreased for 2 of the 3 models. These initial outcomes are promising, and there remains scope for further improvement: including additional features and experimenting with more advanced models could improve the accuracy of the predictions.
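
Since the abstract reports its results as a change in MAPE, a minimal sketch of how that metric can be computed may help. The function name `mape` and all yield values below are hypothetical and chosen only for illustration; they are not data or results from this project, and the project's own evaluation code may differ.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    # Average the absolute errors relative to the observed values.
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical yields (illustration only, not from the CY-Bench dataset).
observed = [10.0, 8.0, 12.5]
baseline_pred = [9.0, 9.0, 12.5]       # assumed model without the new features
engineered_pred = [9.5, 8.4, 12.5]     # assumed model with the new features

print(f"baseline MAPE:   {mape(observed, baseline_pred):.2f}%")
print(f"engineered MAPE: {mape(observed, engineered_pred):.2f}%")
```

A lower MAPE means predictions are closer to observed yields in relative terms, which is the sense in which the abstract describes a reduction for 2 of the 3 models.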

Keywords

Crop Yield Prediction, Machine Learning, Feature Engineering, CY-Bench Data

Date of this Version

8-4-2025

Recommended Citation

Charan, Vaibhav and Poudel, Pratishtha, "Crop Yield Prediction at Multiple Spatial Scales with Statistical Machine Learning" (2025). Discovery Undergraduate Interdisciplinary Research Internship. Paper 66.
https://docs.lib.purdue.edu/duri/66

Included in

Agriculture Commons, Computer Sciences Commons, Data Science Commons, Statistical Models Commons
