
Acta Agronomica Sinica, 2025, Vol. 51, Issue (8): 2128-2138. doi: 10.3724/SP.J.1006.2025.44199

• Crop Genetics & Breeding · Germplasm Resources · Molecular Genetics •

Genome-wide association study of yield components using a 40K SNP array and identification of a stable locus for boll weight in upland cotton (Gossypium hirsutum L.)

LI Yi-Qian², XU Shou-Zhen¹, LIU Ping¹, MA Qi¹, XIE Bin¹, CHEN Hong¹,*

  ¹ Cotton Research Institute, Xinjiang Academy of Agricultural and Reclamation Science, Shihezi 832000, Xinjiang, China
  ² College of Agriculture and Biotechnology, Zhejiang University, Hangzhou 310058, Zhejiang, China
  • Received: 2024-12-03 Accepted: 2025-04-27 Published: 2025-08-12 Published online: 2025-05-14
  • Corresponding author: *CHEN Hong, E-mail: xjchenh990122@163.com
  • Author information: E-mail: 11816029@zju.edu.cn
  • Supported by:
    Science and Technology Innovation 2030-Major Project (2023ZD0404106)

Abstract:

The economic yield of cotton is determined mainly by its yield components: boll number per plant, boll weight, and lint percentage. Understanding the genetic basis of these traits is essential for guiding molecular breeding. In this study, a natural population of 612 upland cotton (Gossypium hirsutum L.) accessions was genotyped with a 40K SNP array based on liquid-phase probe hybridization. Boll number per plant, boll weight, lint percentage, and seed cotton yield were measured in five natural environments. A genome-wide association study (GWAS) detected six significant loci: two for boll number per plant (chromosomes A03 and A05), one for boll weight (chromosome A07), one for lint percentage (chromosome D01), and two for seed cotton yield (chromosomes A05 and D07). Among them, the QTL located at 89.01-90.45 Mb on chromosome A07 was significantly associated with boll weight in five environments (P = 5.3646×10⁻⁸), indicating high stability. Haplotype analysis revealed two major haplotypes in this interval; accessions carrying the favorable haplotype had a significantly higher mean boll weight, by 0.64 g. By integrating deep resequencing and transcriptome data, seven candidate genes were identified within the interval, and a key SNP was pinpointed for potential use in molecular marker development. These findings extend the genetic dissection of yield traits in upland cotton and provide molecular information for high-yield breeding.

Key words: upland cotton, yield components, GWAS, boll weight, high-yield breeding

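Supplementary Fig. 1

Frequency distribution histograms of four yield-related traits in five natural environments. BN: boll number per plant; BW: boll weight; LP: lint percentage; Yield: seed cotton yield. 2019SHZ, 2019KRL, 2020SHZ, 2020KRL, and 2021SHZ denote the five natural environments: Shihezi 2019, Korla 2019, Shihezi 2020, Korla 2020, and Shihezi 2021.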

Table 1

Basic statistics and broad-sense heritability (H²) of yield-related traits in each environment

Trait                Environment  Min.   Max.   Mean   SD    CV (%)  H²
BN (bolls plant⁻¹)   2019SHZ       3.45  13.65   6.94  1.63  23.44   0.50
                     2019KRL       3.95  14.10   7.12  1.59  22.35
                     2020SHZ       2.75   9.35   4.86  0.93  19.17
                     2020KRL       4.00  10.58   6.31  0.94  14.95
                     2021SHZ       3.00  13.30   6.86  1.71  24.92
                     Mean          4.06  10.53   6.42  0.90  14.09
BW (g)               2019SHZ       3.78   7.63   5.60  0.49   8.66   0.83
                     2019KRL       3.82   7.90   5.74  0.58  10.13
                     2020SHZ       4.17   7.82   5.67  0.46   8.14
                     2020KRL       3.90   8.23   5.69  0.60  10.63
                     2021SHZ       2.58  10.35   5.73  0.68  11.90
                     Mean          4.51   7.53   5.68  0.41   7.20
LP (%)               2019SHZ      28.58  49.57  40.81  3.02   7.39   0.94
                     2019KRL      33.89  51.22  43.65  2.72   6.23
                     2020SHZ      34.72  50.29  44.08  2.43   5.54
                     2020KRL      31.94  53.59  42.71  2.61   6.12
                     2021SHZ      27.91  56.97  42.38  2.95   6.97
                     Mean         34.44  50.24  42.67  2.39   5.60
Yield (kg)           2019SHZ       0.84   3.14   1.61  0.36  22.56   0.51
                     2019KRL       0.77   3.17   1.69  0.40  23.71
                     2020SHZ       0.62   2.12   1.18  0.21  17.92
                     2020KRL       1.14   4.75   3.13  0.43  13.65
                     2021SHZ       1.32   4.82   2.69  0.31  11.32
                     Mean          1.44   2.76   2.06  0.20   9.86
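
The CV column in Table 1 is the standard deviation expressed as a percentage of the mean. The following is a minimal Python sketch that recomputes it for the boll-weight (BW) rows, using only the rounded means and SDs printed in the table. The function name `coefficient_of_variation` is introduced here for illustration. Small differences from the reported values are expected, presumably because the published CVs were computed from unrounded statistics.

```python
# Boll-weight (BW) rows copied from Table 1:
# (environment, mean in g, SD in g, reported CV in %).
bw_rows = [
    ("2019SHZ", 5.60, 0.49, 8.66),
    ("2019KRL", 5.74, 0.58, 10.13),
    ("2020SHZ", 5.67, 0.46, 8.14),
    ("2020KRL", 5.69, 0.60, 10.63),
    ("2021SHZ", 5.73, 0.68, 11.90),
]


def coefficient_of_variation(mean, sd):
    """Return the coefficient of variation in percent: 100 * SD / mean."""
    return 100.0 * sd / mean


for env, mean, sd, reported in bw_rows:
    cv = coefficient_of_variation(mean, sd)
    # The recomputed value uses rounded inputs, so it may differ slightly.
    print(f"{env}: recomputed CV = {cv:.2f}%, reported CV = {reported:.2f}%")
```

In every BW row the recomputed CV agrees with the reported one to within 0.2 percentage points, which is consistent with rounding of the tabulated mean and SD.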

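Fig. 1

Correlation analysis and frequency distribution histograms of four yield-related traits. Diagonal panels show the frequency distribution of each trait's mean across the five environments; the upper triangle shows correlation coefficients and their significance levels; the lower triangle shows scatter plots between traits. BN: boll number per plant; BW: boll weight; LP: lint percentage; Yield: seed cotton yield. *, **, and *** indicate significant correlation at the 0.05, 0.01, and 0.001 levels, respectively.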

Table 2

Summary of QTLs identified in the genome-wide association analysis

Trait                Chr.  Peak SNP position (bp)  QTL region (bp)      P-value   Environment
BN (bolls plant⁻¹)   A03   108732262               107926223-108967905  4.28E-06  BN-2019SHZ
                     A05   4400157                 4049988-4980793      2.59E-05  BN-2019SHZ
BW (g)               A07   89462845                89012271-90448020    5.36E-08  BW-2020SHZ
                           88885938                                     1.08E-05  BW-2020KRL
                           90441327                                     1.95E-05  BW-2021SHZ
                           88885938                                     2.46E-06  BW-BLUP
                           88885938                                     1.37E-06  BW-Mean
LP (%)               D01   8257066                 7871209-9168549      6.39E-06  LP-2020KRL
Yield (kg)           D07   15784739                15694741-16158454    8.63E-07  Yield-2020SHZ
                     A05   16130212                15449114-16318936    2.53E-05  Yield-BLUP
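
The boll-weight entries in Table 2 can be summarized with a little arithmetic. The sketch below is illustrative only and uses just the values printed in the table. It converts the A07 interval to megabases, expresses each P-value as -log10(P) (the scale normally used on a Manhattan plot), and reports whether each peak SNP position falls inside the listed interval. The helper names `region_width_mb` and `neg_log10` are assumptions made for this example, and the positional check says nothing about how the authors delimited the interval.

```python
import math

# A07 boll-weight QTL interval from Table 2 (bp).
QTL_START, QTL_END = 89012271, 90448020

# Boll-weight rows from Table 2: (peak SNP position in bp, P-value, dataset).
bw_hits = [
    (89462845, 5.36e-08, "BW-2020SHZ"),
    (88885938, 1.08e-05, "BW-2020KRL"),
    (90441327, 1.95e-05, "BW-2021SHZ"),
    (88885938, 2.46e-06, "BW-BLUP"),
    (88885938, 1.37e-06, "BW-Mean"),
]


def region_width_mb(start, end):
    """Return the width of a genomic interval in megabases."""
    return (end - start) / 1e6


def neg_log10(p):
    """Return -log10(p), the usual y-axis scale of a Manhattan plot."""
    return -math.log10(p)


print(f"QTL width: {region_width_mb(QTL_START, QTL_END):.2f} Mb")
for pos, p, dataset in bw_hits:
    inside = QTL_START <= pos <= QTL_END  # purely positional check
    print(f"{dataset}: -log10(P) = {neg_log10(p):.2f}, peak inside interval: {inside}")
```

The computed width of about 1.44 Mb matches the 89.01-90.45 Mb interval quoted in the abstract.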

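Fig. 2

Manhattan plots of the genome-wide association analysis for yield-related traits. (a): boll number per plant, Shihezi 2019; (b): lint percentage, Korla 2020; (c): boll weight, Korla 2020; (d): boll weight, Shihezi 2020; (e): boll weight, Shihezi 2021; (f): mean boll weight; (g): BLUP values of boll weight; (h): seed cotton yield, Shihezi 2020; (i): BLUP values of seed cotton yield. Abbreviations are the same as in Table 1.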

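Fig. 3

Haplotype analysis and candidate gene mining of the boll-weight QTL on chromosome A07. (a): linkage disequilibrium within the boll-weight QTL interval; (b): haplotype analysis of the boll-weight QTL (t-test; ***, P < 0.001); (c): transcriptome data for genes carrying non-synonymous mutations within the boll-weight QTL interval. Gene IDs of the 19 candidate genes are listed on the left; values in the heatmap are FPKM values of ovule and fiber tissues at different developmental stages. dpa: days post anthesis.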
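
The haplotype comparison in Fig. 3b is described only as a t-test. As a rough sketch of how two haplotype groups could be compared, the Python below computes Welch's t statistic and its approximate degrees of freedom using the standard library. The choice of Welch's variant is an assumption, since the source does not state which t-test was used, and the boll weights are hypothetical values made up for illustration, not data from the study.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    na, nb = len(sample_a), len(sample_b)
    va, vb = variance(sample_a), variance(sample_b)  # sample variances (n - 1)
    se2 = va / na + vb / nb  # squared standard error of the mean difference
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


# Hypothetical boll weights (g) for two haplotype groups; NOT data from the paper.
hap_favorable = [6.1, 6.4, 5.9, 6.3, 6.2, 6.0]
hap_other = [5.5, 5.7, 5.4, 5.8, 5.6, 5.3]

t_stat, df = welch_t(hap_favorable, hap_other)
print(f"mean difference = {mean(hap_favorable) - mean(hap_other):.2f} g")
print(f"Welch t = {t_stat:.2f}, df = {df:.1f}")
```

Converting the statistic to a P-value requires the t distribution, which the standard library does not provide; a statistics package would be needed for that step.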

Supplementary Table 1

Summary of non-synonymous mutations in the candidate interval of the boll-weight QTL on chromosome A07

Chr. Start End Reference Alternate Gene ID
A07 89194219 89194219 A T GH_A07G2179
A07 89234256 89234256 C T GH_A07G2180
A07 89234408 89234408 G C GH_A07G2180
A07 89238319 89238319 G A GH_A07G2181
A07 89238478 89238478 G C GH_A07G2181
A07 89238508 89238508 A G GH_A07G2181
A07 89238519 89238519 C A GH_A07G2181
A07 89244445 89244445 G C GH_A07G2182
A07 89244493 89244493 G A GH_A07G2182
A07 89244494 89244494 C T GH_A07G2182
A07 89244830 89244830 A G GH_A07G2182
A07 89305519 89305519 G A GH_A07G2184
A07 89505855 89505855 T G GH_A07G2189
A07 89507731 89507731 G A GH_A07G2190
A07 89519739 89519739 G T GH_A07G2191
A07 89598121 89598121 T G GH_A07G2192
A07 89598233 89598233 T A GH_A07G2192
A07 89844356 89844356 G A GH_A07G2193
A07 89844389 89844389 C A GH_A07G2193
A07 89844432 89844432 C T GH_A07G2193
A07 89844467 89844467 T C GH_A07G2193
A07 89844501 89844501 C T GH_A07G2193
A07 89844551 89844551 G A GH_A07G2193
A07 89844599 89844599 G T GH_A07G2193
A07 89844600 89844600 A G GH_A07G2193
A07 89844641 89844641 G A GH_A07G2193
A07 89844860 89844860 G A GH_A07G2193
A07 89844980 89844980 G T GH_A07G2193
A07 89845057 89845057 T A GH_A07G2193
A07 89845101 89845101 G C GH_A07G2193
A07 89845146 89845146 A G GH_A07G2193
A07 89850561 89850561 A G GH_A07G2193
A07 89996775 89996775 G A GH_A07G2197
A07 89996897 89996897 C T GH_A07G2197
A07 89996929 89996929 A G GH_A07G2197
A07 89997083 89997083 G A GH_A07G2197
A07 90005923 90005923 A T GH_A07G2198
A07 90057893 90057893 T C GH_A07G2199
A07 90058035 90058035 T G GH_A07G2199
A07 90058102 90058102 T A GH_A07G2199
A07 90058125 90058125 A G GH_A07G2199
A07 90058134 90058134 G T GH_A07G2199
A07 90058140 90058140 C T GH_A07G2199
A07 90058175 90058175 A T GH_A07G2199
A07 90058178 90058178 C A GH_A07G2199
A07 90058232 90058232 A T GH_A07G2199
A07 90058292 90058292 T G GH_A07G2199
A07 90058320 90058320 C A GH_A07G2199
A07 90058380 90058380 C G GH_A07G2199
A07 90058385 90058385 G C GH_A07G2199
A07 90058520 90058520 T C GH_A07G2199
A07 90058524 90058524 G C GH_A07G2199
A07 90058548 90058548 C A GH_A07G2199
A07 90058565 90058565 A C GH_A07G2199
A07 90058602 90058602 T C GH_A07G2199
A07 90058673 90058673 C G GH_A07G2199
A07 90058683 90058683 G C GH_A07G2199
A07 90058719 90058719 A G GH_A07G2199
A07 90095575 90095575 G A GH_A07G2200
A07 90095636 90095636 G C GH_A07G2200
A07 90128297 90128297 G A GH_A07G2201
A07 90160058 90160058 C A GH_A07G2203
A07 90169457 90169457 G C GH_A07G2205
A07 90170915 90170915 C A GH_A07G2206
A07 90176489 90176489 T C GH_A07G2206
A07 90176534 90176534 G A GH_A07G2206
A07 90176573 90176573 T C GH_A07G2206
A07 90176808 90176808 G C GH_A07G2206
A07 90177633 90177633 C T GH_A07G2206
A07 90177788 90177788 A C GH_A07G2206
A07 90178114 90178114 A C GH_A07G2206
A07 90178237 90178237 T A GH_A07G2206
A07 90178390 90178390 C G GH_A07G2206
A07 90328539 90328539 A C GH_A07G2208
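
Rows in the format of Supplementary Table 1 can be tallied per gene to see where non-synonymous variants cluster. The sketch below is a minimal illustration that parses only the first seven rows of the table, copied verbatim. The variable names are chosen for this example and do not come from the paper.

```python
from collections import Counter

# First seven rows of Supplementary Table 1
# (Chr., Start, End, Reference, Alternate, Gene ID).
rows = """\
A07 89194219 89194219 A T GH_A07G2179
A07 89234256 89234256 C T GH_A07G2180
A07 89234408 89234408 G C GH_A07G2180
A07 89238319 89238319 G A GH_A07G2181
A07 89238478 89238478 G C GH_A07G2181
A07 89238508 89238508 A G GH_A07G2181
A07 89238519 89238519 C A GH_A07G2181
"""

variants_per_gene = Counter()
for line in rows.splitlines():
    chrom, start, end, ref, alt, gene_id = line.split()
    variants_per_gene[gene_id] += 1  # one non-synonymous variant per row

for gene_id, count in sorted(variants_per_gene.items()):
    print(f"{gene_id}: {count} non-synonymous variant(s)")
```

Pasting all rows of the table into `rows` extends the tally to every gene in the interval.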

Table 3

Protein annotations of candidate genes associated with boll weight on chromosome A07

Gene ID      Protein annotation
GH_A07G2180  Transmembrane protein
GH_A07G2181  Transmembrane protein
GH_A07G2182  beta-1,2-N-acetylglucosaminyl transferase
GH_A07G2184  SNF1-related protein kinase
GH_A07G2189  FASCICLIN-like arabinogalactan 2
GH_A07G2201  RING/U-box superfamily protein
GH_A07G2203  CONSTANS-like 9