Date of this Version



ReliefF, Ensemble learning, Support vector machine, Neural networks, Basil, Kale


Cadmium (Cd) is a toxic element that can accumulate in edible plant tissues and negatively impact human health. Traditional Cd quantification methods are time-consuming, expensive, and generate substantial toxic waste, slowing the development of methods to reduce uptake. The objective of this study was to determine whether hyperspectral imaging (HSI) and machine learning (ML) can be used to predict Cd concentrations in plants, using kale (Brassica oleracea) and basil (Ocimum basilicum) as model crops. The experiments were conducted in an automated phenotyping facility where all environmental conditions except soil Cd concentration were kept constant. Cd concentrations were determined at harvest using traditional methods and used, together with data collected from the imaging sensor, to train the ML models. Visible/near infrared (VNIR) images were also collected at harvest and processed to calculate reflectance at 473 bands between 400 and 998 nm. All reflectance spectra were subjected to the feature selection algorithm ReliefF and to Principal Component Analysis (PCA) to generate input data for evaluating three ML classification models: artificial neural network (ANN), ensemble learning (EL), and support vector machine (SVM). Plants were categorized according to Cd concentrations higher or lower than the safety threshold of 0.2 mg kg⁻¹ Cd. Wavelengths with the highest ranks for Cd detection were between 519 and 574 nm and between 692 and 732 nm, indicating that Cd content likely altered the plants’ chlorophyll content and internal leaf structure. All models were able to sort the plants into groups; the best F1 score on the validation subset was obtained by the ANN using reflectance from all wavelengths. This study demonstrates that HSI and ML are promising technologies for the fast and precise diagnosis of Cd in leafy green plants, though additional studies are needed to adapt this approach for more complex field environments.
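The labeling and scoring steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the sample concentrations, and the predicted labels are hypothetical, and treating a concentration exactly equal to the threshold as "below" is an assumption, since the abstract only says "higher or lower". Only the 0.2 mg kg⁻¹ threshold and the use of the F1 score come from the abstract.

```python
# Safety threshold stated in the abstract (mg Cd per kg).
SAFETY_THRESHOLD_MG_PER_KG = 0.2


def label_plants(cd_mg_per_kg, threshold=SAFETY_THRESHOLD_MG_PER_KG):
    """Return 1 for plants above the threshold and 0 otherwise.

    Assumption: a value exactly equal to the threshold is labeled 0.
    """
    return [1 if c > threshold else 0 for c in cd_mg_per_kg]


def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical measured concentrations (mg/kg), not data from the study.
measured_cd = [0.05, 0.12, 0.31, 0.45, 0.18, 0.27]
y_true = label_plants(measured_cd)

# Hypothetical classifier output for the same six plants.
y_pred = [0, 1, 1, 1, 0, 0]

score = f1_score(y_true, y_pred)
print(y_true, round(score, 3))
```

In this toy case the classifier makes one false positive and one false negative, so precision and recall are both 2/3 and the F1 score is also 2/3. In the study itself, the predicted labels would come from the ANN, EL, or SVM models trained on the reflectance features.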


This is the published version of Souza A, Rojas MZ, Yang Y, Lee L, Hoagland L. Classifying cadmium contaminated leafy vegetables using hyperspectral imaging and machine learning. Heliyon. 2022 Dec 14;8(12):e12256. doi: 10.1016/j.heliyon.2022.e12256.

For the original published source visit