Multivariate Statistical Analysis and Machine Learning Methods to Predict Grain Yield in Barley (Hordeum vulgare L.) in Dry Regions of Fars Province

Document Type: Original Research Article

Authors

Department of Agroecology, College of Agriculture and Natural Resources of Darab, Shiraz University, Iran

Abstract

This study aimed to identify the farm management variables that most strongly affect grain yield in barley, a major cereal crop that farmers typically cultivate on poor, saline, and dryland soils around the world. Data on 15 agronomic variables and grain yield were collected from 104 farms in the southern part of Fars Province, Iran. Multivariate statistical analyses (stepwise linear regression, correlation analysis, and principal component analysis (PCA)) and machine learning techniques, namely support vector regression (SVR) and partial least squares regression (PLSR), were applied to the agronomic and farm management variables influencing barley grain yield in the dry regions of southern Fars Province. Grain yield was positively correlated with most of the studied variables; the exceptions were pest damage, disease damage, number of weeds per m², seeding depth, and salinity level. The strongest positive correlation was between grain yield and irrigation (r = 0.860**). Stepwise regression retained irrigation (x4), salinity level (x11), phosphorus fertilizer application (x14), and weed infestation percentage (x8) as the variables that accounted for most of the variation in barley grain yield. The three modeling methods gave similar results, and the highest R² (0.79) was obtained with stepwise linear regression.
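As an illustration of the variable-selection idea behind stepwise regression, the following is a minimal sketch and not the authors' procedure. It uses a simplified forward-only variant that adds the predictor giving the largest gain in R² and stops when the gain falls below a threshold, whereas classical stepwise regression normally uses F-test or p-value criteria for entry and removal. The data are synthetic and purely hypothetical (104 rows to mirror the number of farms, 5 made-up predictors); the function names and the `min_gain` threshold are assumptions introduced for this example.

```python
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def r_squared(X, y, cols):
    """R^2 of an ordinary least squares fit of y on an intercept plus the chosen columns."""
    rows = [[1.0] + [row[c] for c in cols] for row in X]
    p = len(cols) + 1
    # Normal equations: (X'X) beta = X'y
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    pred = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def forward_stepwise(X, y, min_gain=0.01):
    """Greedily add the predictor that raises R^2 most; stop when the gain is below min_gain."""
    selected, best_r2 = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining:
        r2, c = max((r_squared(X, y, selected + [c]), c) for c in remaining)
        if r2 - best_r2 < min_gain:
            break
        selected.append(c)
        remaining.remove(c)
        best_r2 = r2
    return selected, best_r2


# Synthetic, hypothetical data: only columns 0 and 2 actually drive the response.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(104)]
y = [3.0 * row[0] - 2.0 * row[2] + rng.gauss(0, 0.5) for row in X]

selected, r2 = forward_stepwise(X, y)
print("selected columns:", selected, "R^2:", round(r2, 3))
```

On real farm data, the predictors would usually be standardized first and the selection validated on held-out farms, since a greedy R² criterion tends to overfit when predictors are correlated.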
