Acta Agronomica Sinica, 2011, Vol. 37, Issue 06: 965-974. doi: 10.3724/SP.J.1006.2011.00965

• Crop Genetics and Breeding · Germplasm Resources · Molecular Genetics •

Analysis of Nuclear Gene Codon Bias on Soybean Genome and Transcriptome

ZHANG Le1, JIN Long-Guo1, LUO Ling1, WANG Yue-Ping1, DONG Zhi-Min1, SUN Shou-Hong2, QIU Li-Juan1,*

  1 National Key Facility for Crop Gene Resources and Genetic Improvement / Key Laboratory of Germplasm Utilization, Ministry of Agriculture / Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing 100081, China; 2 Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China
  • Received: 2010-12-23 Revised: 2011-03-28 Published: 2011-06-12 Published online: 2011-04-12
  • Contact: QIU Li-Juan, E-mail: qiu_lijuan@263.net
  • Funding: This study was supported by the National High Technology Research and Development Program of China (863 Program) (2006AA10A110), the National Natural Science Foundation of China (30871621), and the Key Project of the National Major Science and Technology Program for Breeding New Varieties of Genetically Modified Organisms (2009ZX08009-088B).

Abstract: Characterizing the codon composition and usage patterns of soybean nuclear genes, and the factors that shape them, can provide a theoretical basis for applying genetic engineering technology to improve soybean varieties. A total of 46 430 high-confidence coding sequences predicted from the soybean genome and 2 071 full-length transcripts were used to analyze the composition and characteristics of soybean nuclear gene codons. CodonW software was used to calculate nucleotide composition, relative synonymous codon usage, and other codon usage parameters for the soybean genome and the full-length transcriptome. The results indicated that gene expression level was highly significantly and positively correlated with both G+C and GC3s contents, and that genes with higher G+C and GC3s contents showed stronger codon usage bias. UCC and GCC were identified as optimal codons in soybean. Analysis of coding sequences (CDS) grouped by length showed that codon usage bias decreased as CDS length increased, and that longer CDSs tended to use codons randomly. Within the transcriptome data, genes with CDS lengths between 400 and 600 bp had the highest expression level. Codon usage bias and expression level were similar between leaf-specific and seed-specific genes. However, seed-specific genes had significantly higher G+C and GC3s contents than leaf-specific genes, and the content of aromatic amino acids encoded by seed-specific genes was highly significantly lower than that of leaf-specific genes.
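The quantities named in the abstract (G+C content, GC3s, and relative synonymous codon usage) can be illustrated with a minimal Python sketch. This is not CodonW's implementation, only a simplified illustration under common textbook definitions: GC3s is taken as the G+C fraction at the third position of synonymous codons (Met, Trp, and stop codons excluded), and RSCU as a codon's observed count divided by the mean count of its synonymous family. The function names and the toy sequence are hypothetical and are not taken from the paper.

```python
from collections import Counter

# Standard genetic code (NCBI table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def split_codons(cds):
    """Split a coding sequence into codons, ignoring a trailing partial codon."""
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def gc_content(cds):
    """Fraction of G and C over all bases of the sequence."""
    seq = cds.upper().replace("U", "T")
    return sum(base in "GC" for base in seq) / len(seq) if seq else float("nan")


def gc3s(cds):
    """G+C fraction at the third position of synonymous codons.

    Met (ATG), Trp (TGG), stop codons, and unrecognized codons are excluded.
    """
    synonymous = [c for c in split_codons(cds)
                  if CODON_TABLE.get(c, "*") not in "MW*"]
    if not synonymous:
        return float("nan")
    return sum(c[2] in "GC" for c in synonymous) / len(synonymous)


def rscu(cds_list):
    """Relative synonymous codon usage pooled over a list of sequences."""
    counts = Counter(c for cds in cds_list for c in split_codons(cds)
                     if c in CODON_TABLE)
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    values = {}
    for aa, family in families.items():
        if aa == "*":
            continue  # stop codons are not scored
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from the data set
        mean = total / len(family)
        for codon in family:
            values[codon] = counts[codon] / mean
    return values


if __name__ == "__main__":
    toy_cds = "ATGGCCGCCGCTTCCTCTTGGTAA"  # invented example, not real soybean data
    print("G+C :", round(gc_content(toy_cds), 3))
    print("GC3s:", round(gc3s(toy_cds), 3))
    print("RSCU(GCC):", round(rscu([toy_cds])["GCC"], 3))
```

An RSCU value above 1 means the codon is used more often than expected under equal use of its synonymous codons. Identifying optimal codons, as the study does for UCC and GCC, additionally requires comparing usage between highly and lowly expressed genes, which this sketch does not attempt.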

Key words: Soybean, Genome, Transcriptome, Codon
