Acta Agronomica Sinica ›› 2011, Vol. 37 ›› Issue (12): 2179-2186. doi: 10.3724/SP.J.1006.2011.02179

• Crop Genetics and Breeding · Germplasm Resources · Molecular Genetics •

Application of Artificial Neural Network in Genomic Selection for Crop Improvement

SHU Yong-Jun, WU Lei, WANG Dan, GUO Chang-Hong*

  1. Key Laboratory of Molecular Cytogenetics and Genetic Breeding of Heilongjiang Province / College of Life Science and Technology, Harbin Normal University, Harbin 150025, China
  • Received: 2011-05-30 Revised: 2011-09-09 Published: 2011-12-12 Published online: 2011-10-09
  • Contact: GUO Chang-Hong, Email: kaku3008@yahoo.com.cn, Tel: 0451-88060576
  • Funding:

    This study was supported by the Young Backbone Teachers Fund of Harbin Normal University (KGB201010), the Science and Technology Innovation Team of Harbin Normal University (KJTD201102), the International Science and Technology Cooperation Program (2009DFA32470), the Preliminary Project of the National Key Basic Research Program of China (973 Program) (2011CB111505), and the Heilongjiang Province Science and Technology Research Project (GA08B104).

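Abstract (Chinese version): At present, genomic selection breeding mainly uses linear models to estimate the breeding values that guide selection in crop breeding, yet the relationships among genes and genetic loci in organisms are governed mainly by complex non-linear regulation. In this study, artificial neural network techniques were applied to crop genomic selection breeding to optimize existing crop genomic selection models. An efficient crop genomic selection prediction system was established and compared with linear regression prediction models. Analysis of wheat breeding data showed that breeding value estimation based on the artificial neural network outperformed the linear regression prediction models: the mean correlation coefficient between predicted and actual breeding values reached 0.6636, whereas the predictive abilities of ridge regression BLUP, the Bayesian linear regression model, and the pedigree-based Bayesian regression model were 0.6422, 0.6294, and 0.6573, respectively. The best prediction reached 0.8379, far higher than the best results of the other two models. In addition, the prediction performance of the neural-network-based genomic selection model was stable and similar to that of traditional statistical models. Building genomic selection models with artificial neural network techniques is therefore feasible.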


Abstract: With important progress in marker technologies, marker-assisted selection (MAS) has been widely used for crop improvement. Biparental populations are designed for the detection of quantitative trait loci (QTLs), but their application is limited. Association mapping (AM), which is applied directly to natural populations, has been proposed to mitigate the limited relevance of biparental populations in QTL identification. Many QTLs have been identified by these two methods, which has promoted the genetic improvement of crops. However, both methods use significance thresholds to identify QTLs, which means that the estimated effects are biased. As a result, small-effect QTLs can be missed entirely, although many crop traits are controlled by such QTLs. Genomic selection (GS) has been proposed to address these deficiencies. GS predicts the breeding values of lines in a population from their phenotypes and high-density marker scores. Because it includes all markers in the prediction model, GS benefits from unbiased estimation of all chromosome segment effects, even small ones, and so captures more of the variation due to small-effect QTLs. Furthermore, markers carry information on the relatedness among lines, which contributes to prediction accuracy. Such accuracies are sufficient to select parents strictly on the basis of marker scores, even for traits such as yield and tolerance to abiotic stress. From the perspective of plant breeding, GS would greatly accelerate the breeding cycle and enhance annual gains. GS first develops a prediction model from a training population that has been both genotyped and phenotyped, by estimating marker effects.
The GS model then takes genotypic data from a candidate population to predict genomic estimated breeding values (GEBVs). Several methods are used to build GS models, such as best linear unbiased prediction (BLUP), ridge regression BLUP (RR-BLUP), and Bayesian linear regression (BLR), and these models are widely used in crop genomic selection breeding. However, all of them are based on linear regression, whereas the relationships among genetic loci in living organisms are largely non-linear. In this study, a neural network was introduced into genomic selection for crop improvement, and the crop genomic selection model was optimized with this non-linear model. A highly efficient genomic selection system was established, and its predictions were compared with those of linear models such as RR-BLUP and BLR. In the analysis of wheat genetic data, the correlation coefficient between the true breeding values of unphenotyped experimental lines and those predicted by neural-network-based genomic selection reached 0.6636, while those of RR-BLUP, BLR, and BLR with pedigree information were 0.6422, 0.6294, and 0.6573, respectively. The best prediction reached 0.8379, indicating that genomic selection based on the neural network outperformed the linear regression models. This level of accuracy is sufficient for selecting for agronomic performance using marker information alone. Such selection would substantially accelerate the breeding cycle and enhance gains per unit time. Therefore, this research shows that GS has considerable potential for incorporation into breeding schemes.

Key words: Genomic selection, Wheat, Artificial neural network, Ridge regression BLUP, Bayesian linear regression
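
The workflow described in the abstract (fit a model on a genotyped and phenotyped training population, predict breeding values for candidate lines, then score the predictions by their correlation with the true breeding values) can be illustrated with a minimal sketch. The abstract does not state the network architecture, software, or tuning used in the study, so everything below is an assumption made for illustration: the data are simulated rather than the wheat data, `simulate`, `ridge_predict`, `mlp_predict`, and `evaluate` are hypothetical helper names, ridge regression with a fixed penalty stands in for RR-BLUP, and a small one-hidden-layer network trained by gradient descent stands in for the neural network model.

```python
import numpy as np


def simulate(n_lines=300, n_markers=200, n_qtl=20, h2=0.5, seed=0):
    """Simulate marker scores coded -1/1 and phenotypes (hypothetical data)."""
    rng = np.random.default_rng(seed)
    X = rng.choice([-1.0, 1.0], size=(n_lines, n_markers))
    effects = np.zeros(n_markers)
    qtl = rng.choice(n_markers, size=n_qtl, replace=False)
    effects[qtl] = rng.normal(size=n_qtl)
    g = X @ effects  # true breeding values (purely additive in this toy setup)
    noise_sd = np.sqrt(g.var() * (1.0 - h2) / h2)
    y = g + rng.normal(scale=noise_sd, size=n_lines)
    return X, y, g


def ridge_predict(X_tr, y_tr, X_te, lam):
    """Ridge regression on all markers; lam is a fixed, assumed penalty."""
    mu = y_tr.mean()
    p = X_tr.shape[1]
    beta = np.linalg.solve(X_tr.T @ X_tr + lam * np.eye(p), X_tr.T @ (y_tr - mu))
    return mu + X_te @ beta


def mlp_predict(X_tr, y_tr, X_te, hidden=8, lr=0.05, epochs=300, decay=0.01, seed=0):
    """One-hidden-layer tanh network fitted by full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    n, p = X_tr.shape
    mu, sd = y_tr.mean(), y_tr.std()
    t = (y_tr - mu) / sd  # standardized training targets
    W1 = rng.normal(scale=1.0 / np.sqrt(p), size=(p, hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(scale=1.0 / np.sqrt(hidden), size=hidden)
    b2 = 0.0
    for _ in range(epochs):
        H = np.tanh(X_tr @ W1 + b1)              # hidden activations
        err = H @ w2 + b2 - t                    # residuals on the training set
        dH = np.outer(err, w2) * (1.0 - H ** 2)  # backpropagate through tanh
        grad_w2 = H.T @ err / n + decay * w2
        grad_W1 = X_tr.T @ dH / n + decay * W1
        w2 -= lr * grad_w2
        b2 -= lr * err.mean()
        W1 -= lr * grad_W1
        b1 -= lr * dH.mean(axis=0)
    return mu + sd * (np.tanh(X_te @ W1 + b1) @ w2 + b2)


def evaluate(X, y, g, n_train=240, lam=200.0):
    """Correlation between predicted and true breeding values on held-out lines."""
    X_tr, X_te = X[:n_train], X[n_train:]
    y_tr, g_te = y[:n_train], g[n_train:]
    preds = {
        "ridge": ridge_predict(X_tr, y_tr, X_te, lam),
        "mlp": mlp_predict(X_tr, y_tr, X_te),
    }
    return {name: float(np.corrcoef(p, g_te)[0, 1]) for name, p in preds.items()}


if __name__ == "__main__":
    X, y, g = simulate()
    for name, r in evaluate(X, y, g).items():
        print(f"{name}: r = {r:.3f}")
```

Because the simulated trait here is purely additive, this sketch gives no reason to expect the network to beat ridge regression. It only shows the train, predict, and correlate steps, and the numbers it prints are unrelated to the accuracies reported in the abstract.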
