
Acta Agronomica Sinica ›› 2018, Vol. 44 ›› Issue (04): 569-580. doi: 10.3724/SP.J.1006.2018.00569

• Tillage and Cultivation · Physiology and Biochemistry •

Application of Random Forest Method in Maize-soybean Accurate Identification

Li-Min WANG, Jia LIU, Ling-Bo YANG, Fu-Gang YANG, Chang-Hong FU

  1. Institute of Agricultural Resources and Regional Planning, Chinese Academy of Agricultural Sciences, Beijing 100081, China
  • Received: 2017-05-08 Accepted: 2018-01-08 Published: 2018-01-26 Published online: 2018-01-26
  • About the author:

    wanglimin01@caas.cn

  • Supported by:
    This study was supported by the National Science and Technology Major Project of the 12th Five-Year Plan (09-Y30B03-9001-13/15) and the National Key Research and Development Program of China (2016YFD0300603).

Abstract:

Accurate crop identification from remote sensing images is important for obtaining crop distribution information. Remote sensing offers high efficiency, high accuracy, low cost, and wide monitoring coverage, and these advantages carry over to maize-soybean identification and planting area estimation. Random forest classification (RFC) is a machine learning method, and there are currently few studies of crop classification based on it. To evaluate its potential for accurate maize-soybean identification, this paper used Landsat-8 OLI satellite imagery to classify the study area into three classes (soybean, maize, and other ground objects) and systematically compared RFC with two well-established methods, maximum likelihood classification (MLC) and support vector machine (SVM) classification. The overall classification accuracies of MLC, SVM, and RFC were 91.68%, 91.49%, and 94.32%, with Kappa coefficients of 0.87, 0.87, and 0.91, respectively, showing that RFC identified the crops more accurately than MLC and SVM. Principal component analysis (PCA) was applied to the original seven-band image and the first four principal components were extracted; the normalized difference vegetation index (NDVI) and the normalized difference water index (NDWI) were also calculated. These six auxiliary feature bands were stacked onto the original seven bands, and the MLC, SVM, and RFC classifications were run again. With the added features, the crop identification accuracies of MLC and SVM did not improve. The overall accuracy of RFC rose by 1.49 percentage points to 95.81%, and its Kappa coefficient rose by 0.03 to 0.94, a limited improvement. The near-infrared band and the two shortwave infrared bands were the most important features, while the newly added bands contributed little to soybean-maize identification, which explains why the auxiliary bands helped so little. In terms of classification time, SVM took the longest, about 11 000 s; MLC the least, only 145 s; and RFC about 1800 s. SVM therefore has no advantage in either accuracy or time, MLC delivers results quickly, and RFC gives the highest classification accuracy at a moderate time cost. In conclusion, RFC has a clear advantage in accurate soybean-maize identification and has potential for operational use in crop area extraction for regional agricultural remote sensing monitoring.

Key words: Landsat-8, random forest, maize, soybean, remote sensing, identification capacity
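
The feature stacking described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the band names, the function names, and the sample reflectance values are assumptions. NDWI is taken in the green/near-infrared form because the abstract does not say which definition was used, and the four PCA components are left out to keep the sketch short.

```python
from typing import Dict, List

# Hypothetical names for the seven OLI bands used in the paper.
BANDS = ["coastal", "blue", "green", "red", "nir", "swir1", "swir2"]


def normalized_difference(a: float, b: float) -> float:
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def stack_features(pixel: Dict[str, float]) -> List[float]:
    """Append NDVI and NDWI to the seven original band values of one pixel."""
    ndvi = normalized_difference(pixel["nir"], pixel["red"])
    # Assumption: green/NIR form of NDWI; the source does not specify it.
    ndwi = normalized_difference(pixel["green"], pixel["nir"])
    return [pixel[name] for name in BANDS] + [ndvi, ndwi]


# Illustrative reflectance values for a vegetated pixel (not from the paper).
sample = {"coastal": 0.03, "blue": 0.04, "green": 0.08, "red": 0.05,
          "nir": 0.45, "swir1": 0.20, "swir2": 0.10}
features = stack_features(sample)
print(len(features), round(features[7], 3), round(features[8], 3))
```

In a full workflow the same per-pixel vector would also carry the first four principal components, giving the 13 features (7 + 4 + 2) that the abstract describes.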

Fig. 1

Geographical location of the study area

Table 1

Radiometric calibration coefficients for each band of the Landsat 8 OLI image

Band              Gain         Bias
Coastal aerosol   0.01298      -64.89967
Blue              0.013292     -66.45805
Green             0.012248     -61.24053
Red               0.010328     -51.64146
Near infrared     0.0063204    -31.602
SWIR 1            0.0015718    -7.85913
SWIR 2            0.00052979   -2.64895
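
As a rough illustration of how the coefficients in Table 1 are used, the sketch below converts a raw digital number (DN) to radiance. It assumes the usual linear form, radiance = gain × DN + bias, which the page itself does not spell out; the dictionary keys, the function name, and the sample DN are illustrative and do not come from the paper.

```python
# Gain and bias per band, copied from Table 1.
CALIBRATION = {
    "coastal": (0.01298, -64.89967),
    "blue": (0.013292, -66.45805),
    "green": (0.012248, -61.24053),
    "red": (0.010328, -51.64146),
    "nir": (0.0063204, -31.602),
    "swir1": (0.0015718, -7.85913),
    "swir2": (0.00052979, -2.64895),
}


def dn_to_radiance(band: str, dn: int) -> float:
    """Apply the assumed linear calibration: radiance = gain * DN + bias."""
    gain, bias = CALIBRATION[band]
    return gain * dn + bias


# Hypothetical DN value, chosen only for illustration.
radiance = dn_to_radiance("red", 10000)
print(f"red band radiance: {radiance:.3f}")
```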

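Fig. 2

Landsat 8 OLI image of the study area and distribution of sample plots. a: original Landsat 8 image and sample plot distribution; b: original image of a sample plot; c: classification result of the sample plot.

Fig. 3

Spectral curves of the main land cover classes in the study area

Fig. 4

Technical workflow of the study

Fig. 5

Visual interpretation results based on the RapidEye image. a: RapidEye image (bands 5/4/3); b: visual interpretation result of the RapidEye image.

Fig. 6

Classification results of the three methods based on the original image. a: maximum likelihood classification result; b: support vector machine classification result; c: random forest classification result; d: detail of the maximum likelihood classification result; e: detail of the support vector machine classification result; f: detail of the random forest classification result.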

Table 2

Confusion matrices of the three classification methods based on the original image

Crop                Method   Soybean (pixel)   Maize (pixel)   Other (pixel)   Total (pixel)   Mapping accuracy (%)
Soybean             MLC      864849            15393           56545           936787          91.03
Soybean             SVM      878082            7498            66202           951782          92.42
Soybean             RFC      911841            9285            65006           986132          95.98
Maize               MLC      21443             1134475         76472           1232390         92.38
Maize               SVM      35185             1140200         93000           1268385         92.85
Maize               RFC      4158              1186858         68423           1259439         96.65
Other               MLC      63758             78172           1436164         1578094         91.52
Other               SVM      36783             80342           1409979         1527104         89.85
Other               RFC      34051             31897           1435752         1501700         91.50
Total (pixel)                950050            1228040         1569181         3747271
User accuracy (%)   MLC      92.32             92.05           91.01
User accuracy (%)   SVM      92.26             89.89           92.33
User accuracy (%)   RFC      92.47             94.24           95.61

Method   Overall accuracy (%)   Kappa coefficient
MLC      91.68                  0.87
SVM      91.49                  0.87
RFC      94.32                  0.91
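
The overall accuracy and Kappa coefficient in Table 2 can be recomputed from the pixel counts. The sketch below is one way to do that, assuming rows are the classified classes and columns the reference classes (the reading under which the row and column totals in the table add up); the variable and function names are illustrative.

```python
from typing import List, Tuple

# Pixel counts from Table 2; rows are classified classes and columns are
# reference classes, in the order soybean, maize, other.
CONFUSION = {
    "MLC": [[864849, 15393, 56545],
            [21443, 1134475, 76472],
            [63758, 78172, 1436164]],
    "SVM": [[878082, 7498, 66202],
            [35185, 1140200, 93000],
            [36783, 80342, 1409979]],
    "RFC": [[911841, 9285, 65006],
            [4158, 1186858, 68423],
            [34051, 31897, 1435752]],
}


def accuracy_and_kappa(matrix: List[List[int]]) -> Tuple[float, float]:
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix."""
    n = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return observed, (observed - expected) / (1 - expected)


for method, matrix in CONFUSION.items():
    oa, kappa = accuracy_and_kappa(matrix)
    print(f"{method}: overall accuracy {oa * 100:.2f}%, kappa {kappa:.2f}")
```

Run on these counts, the function gives the same overall accuracies and Kappa coefficients as reported in Table 2.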

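Fig. 7

Classification results of the three methods after adding auxiliary features. a: maximum likelihood classification result; b: support vector machine classification result; c: random forest classification result; d: detail of the maximum likelihood classification result; e: detail of the support vector machine classification result; f: detail of the random forest classification result.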

Table 3

Confusion matrices of the three classification methods after adding auxiliary feature bands

Crop                Method   Soybean (pixel)   Maize (pixel)   Other (pixel)   Total (pixel)   Mapping accuracy (%)
Soybean             MLC      857118            3824            56780           917722          90.22
Soybean             SVM      885153            7611            66296           959060          93.17
Soybean             RFC      934124            2276            55865           992265          98.32
Maize               MLC      56343             1201206         167854          1425403         97.81
Maize               SVM      31759             1145348         87264           1264371         93.27
Maize               RFC      2053              1205962         63085           1271100         98.20
Other               MLC      36589             23010           1344547         1404146         85.68
Other               SVM      33138             75081           1415621         1523840         90.21
Other               RFC      13873             19802           1450231         1483906         92.42
Total (pixel)                950050            1228040         1569181         3747271
User accuracy (%)   MLC      93.40             84.27           95.76
User accuracy (%)   SVM      92.29             90.59           92.90
User accuracy (%)   RFC      94.14             94.88           97.73

Method   Overall accuracy (%)   Kappa coefficient
MLC      90.81                  0.86
SVM      91.96                  0.88
RFC      95.81                  0.94
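
The per-class figures in Table 3 follow from the same pixel counts. The sketch below recomputes them for RFC, assuming that mapping accuracy is the diagonal count divided by the reference (column) total and user accuracy is the diagonal count divided by the classified (row) total; the names are illustrative.

```python
from typing import List, Tuple

CLASSES = ["soybean", "maize", "other"]

# RFC pixel counts from Table 3; rows are classified classes and columns
# are reference classes, both in the order given by CLASSES.
RFC_WITH_FEATURES = [[934124, 2276, 55865],
                     [2053, 1205962, 63085],
                     [13873, 19802, 1450231]]


def per_class_accuracy(matrix: List[List[int]]) -> List[Tuple[float, float]]:
    """Return (mapping accuracy %, user accuracy %) for each class."""
    col_totals = [sum(col) for col in zip(*matrix)]
    result = []
    for i, row in enumerate(matrix):
        mapping = 100 * row[i] / col_totals[i]  # share of reference pixels found
        user = 100 * row[i] / sum(row)          # share of classified pixels correct
        result.append((mapping, user))
    return result


for name, (mapping, user) in zip(CLASSES, per_class_accuracy(RFC_WITH_FEATURES)):
    print(f"{name}: mapping accuracy {mapping:.2f}%, user accuracy {user:.2f}%")
```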

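Fig. 8

Importance of each feature variable before and after adding auxiliary features. a: importance of each band before adding auxiliary features; b: importance of each band after adding auxiliary features.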

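Table 4

Time cost of crop classification and extraction for the three classification methods

Classification method                      Time cost (s)
Maximum likelihood classification (MLC)    145
Support vector machine (SVM)               11000
Random forest classification (RFC)         1800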