
Acta Agronomica Sinica ›› 2014, Vol. 40 ›› Issue (01): 72-79. doi: 10.3724/SP.J.1006.2014.00072

• Crop Genetics and Breeding · Germplasm Resources · Molecular Genetics •

Statistical Genetics Approach for Functional Difference Identification of Allelic Variations and Its Application

HU Wen-Ming**, KAN Hai-Hua**, WANG Wei, XU Chen-Wu*

  1. Jiangsu Provincial Key Laboratory of Crop Genetics and Physiology / Key Laboratory of Plant Functional Genomics of Ministry of Education, Yangzhou University, Yangzhou 225009, Jiangsu, China
  • Received: 2013-04-01 Revised: 2013-07-25 Published: 2014-01-12 Published online: 2013-10-01
  • Corresponding author: XU Chen-Wu, E-mail: qtls@yzu.edu.cn, Tel: 0514-87979358
  • Funding:

    This study was supported by the National Key Basic Research Program of China (973 Program) (2011CB100100), the National Natural Science Foundation of China (31171187), and the "Qing Lan Project" Science and Technology Innovation Team program of Jiangsu Higher Education Institutions.

Abstract:

Allelic variation is ubiquitous in organisms and plays an important role in regulating gene expression. To examine how the number of varieties (A), the average polymorphism information content of alleles (B), and the total contribution of candidate genes (C) affect candidate-gene association analysis, this study used the empirical Bayes (E-Bayes) method to evaluate the effects of these three factors on the statistical power of candidate-gene detection, the accuracy and precision of the genetic effect estimates, and the false discovery rate (FDR). The results were as follows. (1) As factors A, B, and C increased, the statistical power and the accuracy and precision of the effect estimates improved markedly, and the FDR decreased. (2) Factor B had a significant influence on statistical power. When B was high, the average power could reach 80% even when both A and C were low. When B was at a medium level, a larger number of varieties was needed for the average power to exceed 80%. When B was low, the power did not reach 50% at any of the three levels of C, even with 100 varieties. (3) Factor B had a significant influence on the accuracy and precision of the estimated effects of candidate genes, both of which improved as B increased. (4) Factor B also had a significant influence on the FDR. In an analysis of real rice data, four candidate genes were found to be significantly associated with pasting temperature (PT). The polymorphism information content of alleles is therefore the primary factor in the statistical genetic analysis of functional differences among alleles, while a larger number of varieties and a higher contribution rate also have an important influence on the statistical power and on the accuracy and precision of the effect estimates.

Key words: Allelic variation, Oversaturated model, Variable selection, E-Bayes
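
Factor B in the abstract is the average polymorphism information content (PIC) across candidate genes. The abstract does not say which PIC formula the authors used, so the sketch below assumes the widely used definition PIC = 1 − Σ pᵢ² − Σ_{i<j} 2 pᵢ² pⱼ², where pᵢ is the frequency of allele i at a locus. The function names `pic` and `average_pic` and the example frequencies are illustrative only and do not come from the paper.

```python
from itertools import combinations


def pic(freqs):
    """PIC of one locus from its allele frequencies (assumed formula, see lead-in)."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Sum of squared allele frequencies (expected homozygosity).
    homozygosity = sum(p * p for p in freqs)
    # Correction term summed over every unordered pair of distinct alleles.
    pair_term = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - pair_term


def average_pic(loci):
    """Mean PIC over several loci; this mean plays the role of factor B."""
    values = [pic(freqs) for freqs in loci]
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical allele frequencies for three candidate genes.
    loci = [[0.5, 0.5], [0.9, 0.1], [0.4, 0.3, 0.3]]
    for freqs in loci:
        print(freqs, round(pic(freqs), 4))
    print("average PIC:", round(average_pic(loci), 4))
```

Under this definition a locus with balanced allele frequencies has a higher PIC than one dominated by a single allele, which is consistent with the abstract's finding that low values of B limit detection power even with 100 varieties.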

