Crop yield estimation is important for national food security, people, and
the environment. Timely and
accurate estimation of crop yield at the field scale is of great significance
for crop management, harvest and trade. It ultimately enables farmers to
optimize inputs and economic return. We selected an irrigated wheat field in a
region near Kaifeng, Henan Province, for this study. The terrain in that region
is undulating and shows spatial differences. We used a low-altitude unmanned aerial
vehicle (UAV) remote sensing platform equipped with a multi-spectral camera,
thermal infrared camera, and RGB camera to simultaneously obtain different
remote sensing parameters during the key growth stages of wheat. Based on the
extracted spectral reflectivity, thermal infrared temperature, and digital
elevation information, we calculated the spatial variability of remote sensing
parameters and growth indices under different terrain characteristics. We also
analyzed the correlations between vegetation indices, temperature parameters
and wheat yield. Using four machine learning methods, namely multiple linear
regression (MLR), partial least squares regression (PLSR), support vector
machine regression (SVR), and random forest regression (RFR), we compared the
yield estimation capability of single-modal data versus
multimodal data fusion frameworks. The results showed that slope was an
important factor affecting crop growth and yield. We observed significant
differences in remote sensing parameters under different slope grades. Soil
water content, plant water content, and above-ground biomass at the three
growth stages were significantly correlated with slope. Most of the vegetation
indices and temperature parameters at the three growth stages were significantly
correlated with yield as well. Based on the strength of their correlation with
yield, seven vegetation indices (NDVI, GNDVI, EVI2, OSAVI, SAVI, NDRE, and
WDRVI) and two temperature parameters (NRCT and CTD) were
selected as the final input variables for the model. For the single-modal data
framework, the model constructed with the vegetation indices was better than
the yield model constructed with the temperature parameters, and the highest
accuracy was obtained with an RFR model based on vegetation indices at the
filling stage (R² = 0.724, RMSE = 614.72 kg hm⁻², MAE = 478.08 kg hm⁻²).
For the dual-modal data fusion approach, the highest accuracy was obtained at
the flowering stage with an RFR model combining the temperature parameters and
the vegetation indices (R² = 0.865, RMSE = 440.73 kg hm⁻², MAE = 374.86 kg hm⁻²).
Even higher accuracy was obtained using the multimodal data fusion approach
with an RFR model based on vegetation indices, temperature parameters, and slope
information at the flowering stage (R² = 0.893, RMSE = 420.06 kg hm⁻²,
MAE = 352.69 kg hm⁻²); the flowering-stage fusion model also gave the highest
validation accuracy (R² = 0.892, RMSE = 423.55 kg hm⁻², MAE = 334.43 kg hm⁻²).
The results revealed that by using a
multimodal data fusion framework of terrain factors combined with RFR, we can
fully exploit the complementary and synergistic roles of different remote
sensing information sources. This effectively improves the accuracy and
stability of the yield estimation model, and provides a reference and support
for crop growth monitoring and yield estimation.
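
The three accuracy measures reported above (R², RMSE, and MAE) can be illustrated with a minimal Python sketch. It assumes that R² is the coefficient of determination, 1 − SS_res/SS_tot; the abstract does not say whether this form or the squared correlation coefficient was used. The function name `regression_metrics` and the yield values are hypothetical and serve only as an illustration; they are not data from this study.

```python
import math


def regression_metrics(observed, predicted):
    """Return (R2, RMSE, MAE) for paired observed and predicted yields.

    Assumption: R2 is computed as 1 - SS_res / SS_tot (coefficient of
    determination), which may differ from the definition used in the study.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equal in length")

    n = len(observed)
    mean_obs = sum(observed) / n

    # Residuals between measured and estimated yield (same units as yield).
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R2 is undefined")

    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    return r2, rmse, mae


# Hypothetical plot-level yields in kg hm^-2 (illustrative values only).
observed = [6000.0, 6500.0, 7000.0, 7500.0]
predicted = [6100.0, 6400.0, 7200.0, 7300.0]

r2, rmse, mae = regression_metrics(observed, predicted)
print(f"R2 = {r2:.3f}, RMSE = {rmse:.2f} kg hm^-2, MAE = {mae:.2f} kg hm^-2")
```

RMSE and MAE carry the units of yield (kg hm⁻²), whereas R² is dimensionless, which is why the two error terms are the more direct basis for comparing models across growth stages.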