Acta Agronomica Sinica ›› 2021, Vol. 47 ›› Issue (8): 1460-1471. doi: 10.3724/SP.J.1006.2021.04195


Identification of the candidate genes of soybean resistance to bean pyralid (Lamprosema indicata Fabricius) by BSA-Seq and RNA-Seq

ZENG Wei-Ying, LAI Zhen-Guang, SUN Zu-Dong*, YANG Shou-Zhen, CHEN Huai-Zhu, TANG Xiang-Min

1. Institute of Economic Crops, Guangxi Academy of Agricultural Sciences/Southwest Experimental Station of Maize-Soybean Intercrop, Ministry of Agriculture and Rural Affairs, Nanning 530007, Guangxi, China
Received: 2020-08-25 Accepted: 2021-01-13 Online: 2021-08-12 Published: 2021-02-18
Supported by: Natural Science Foundation of Guangxi Province, China (2017GXNSFDA198037); Science and Technology Development Fund of Guangxi Academy of Agricultural Sciences (Guinongke 2020YM116, 2015YT58)


Bean pyralid (Lamprosema indicata Fabricius) is an important leaf-feeding pest of soybean. Identifying insect-tolerance genes in soybean is of great significance for insect-tolerance breeding and genetic improvement of the crop. In this study, an F2 population of 303 individuals was constructed from the insect-resistant line Gantai-2-2 and the insect-susceptible line Wan 82-178. Thirty insect-resistant and 30 insect-susceptible F2 individuals were selected to construct two DNA pools, which were used for whole-genome resequencing. A total of 11,963,077 single nucleotide polymorphism (SNP) markers were identified in the two parental lines and the two mixed pools. Association analysis by the SNP-index method located 329 genes outside the 99% confidence interval. These genes were concentrated mainly in three regions: 5,601,065-5,865,237 bp on chromosome 7 (0.26 Mb), 2,975,110-6,336,096 bp on chromosome 16 (3.36 Mb), and 44,366,115-54,297,600 bp on chromosome 18 (9.93 Mb). Correlation analysis of the BSA-Seq and transcriptome sequencing (RNA-Seq) results showed that 12 genes were correlated between the two datasets. These 12 candidate genes, including CNGC4, WRKY transcription factor 16, AAP7, a serine/threonine protein kinase, and ZPR1B, were identified through bioinformatics analysis, differential expression analysis, and homologous annotation. This study lays an important foundation for analyzing the molecular mechanism of soybean resistance to bean pyralid and for cloning insect-resistance genes.
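The interval sizes quoted in the abstract can be reproduced with simple arithmetic. The short Python sketch below is only an illustrative check, not part of the authors' pipeline; the function name `region_size_mb` is hypothetical, and it assumes the size is taken as end minus start with 1 Mb = 1,000,000 bp.

```python
# Candidate regions as quoted in the abstract: (chromosome, start bp, end bp).
regions = [
    (7, 5_601_065, 5_865_237),
    (16, 2_975_110, 6_336_096),
    (18, 44_366_115, 54_297_600),
]


def region_size_mb(start_bp: int, end_bp: int) -> float:
    """Return the region length in megabases, rounded to two decimals.

    Assumption: length = end - start, and 1 Mb = 1,000,000 bp.
    """
    return round((end_bp - start_bp) / 1_000_000, 2)


# Map each chromosome to the size of its candidate region.
sizes = {chrom: region_size_mb(start, end) for chrom, start, end in regions}

for chrom, size in sizes.items():
    print(f"chromosome {chrom}: {size} Mb")
```

Under these assumptions the three values agree with the 0.26, 3.36, and 9.93 Mb stated in the abstract.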

Key words: soybean, bean pyralid, BSA-Seq, RNA-Seq, candidate genes

Table 1

Quality statistics of raw data

Sample | Raw reads | Raw bases | Clean reads | Clean bases | Clean data/raw data (%) | Clean_GC_rate (%) | [unlabeled] | [unlabeled]
Wan 82-178 | 438,210,500 | 65,731,575,000 | 432,474,442 | 64,871,166,300 | 98.69 | 35.57 | 96.50 | 92.13
Gantai-2-2 | 441,716,184 | 66,257,427,600 | 433,990,902 | 65,098,635,300 | 98.25 | 35.60 | 96.34 | 91.79
Highly susceptible pool | 206,898,280 | 31,034,742,000 | 202,406,834 | 30,361,025,100 | 97.83 | 35.54 | 96.44 | 92.01
Highly resistant pool | 210,122,884 | 31,518,432,600 | 206,805,618 | 31,020,842,700 | 98.42 | 35.46 | 96.29 | 91.70

Table 2

Matching of quality control data with reference genome

Sample | Clean reads | Mapping reads | Mapping rate (%) | Properly paired ratio (%) | Mean depth | Coverage ≥ 1× (%) | Coverage ≥ 5× (%) | Coverage ≥ 10× (%) | Coverage ≥ 20× (%)
Wan 82-178 | 434,152,986 | 433,232,376 | 99.79 | 99.73 | 65.78 | 96.67 | 96.12 | 95.67 | 94.62
Gantai-2-2 | 435,703,158 | 434,715,984 | 99.77 | 99.72 | 65.98 | 96.68 | 96.13 | 95.68 | 94.60
Highly susceptible pool | 203,202,650 | 202,820,220 | 99.81 | 99.76 | 30.78 | 95.75 | 94.55 | 93.29 | 83.88
Highly resistant pool | 207,33,600 | 207,143,542 | 99.76 | 99.71 | 31.44 | 95.76 | 94.56 | 93.32 | 84.44
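The mapping rates in Table 2 appear to be the ratio of mapping reads to clean reads. The sketch below recomputes them under that assumption; the definition is inferred from the numbers rather than stated in the source, and `mapping_rate` is a hypothetical helper. The Wan 82-178 clean-read count is read as 434,152,986, and the highly resistant pool is left out because its clean-read count is not fully legible.

```python
# (clean reads, mapping reads, reported mapping rate in %) taken from Table 2.
table2 = {
    "Wan 82-178": (434_152_986, 433_232_376, 99.79),
    "Gantai-2-2": (435_703_158, 434_715_984, 99.77),
    "Highly susceptible pool": (203_202_650, 202_820_220, 99.81),
}


def mapping_rate(clean_reads: int, mapped_reads: int) -> float:
    """Percentage of clean reads that mapped, rounded to two decimals.

    Assumption: mapping rate = mapping reads / clean reads * 100.
    """
    return round(100 * mapped_reads / clean_reads, 2)


# Recompute the rate for every sample and compare with the reported value.
recomputed = {
    sample: mapping_rate(clean, mapped)
    for sample, (clean, mapped, _reported) in table2.items()
}

for sample, (_clean, _mapped, reported) in table2.items():
    print(f"{sample}: recomputed {recomputed[sample]}%, reported {reported}%")
```

For these three samples the recomputed percentages match the reported ones to two decimals.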

Table 3

Annotation of polymorphism sites

Variation sites information | Wan 82-178 | [unlabeled] | Highly susceptible pool | Highly resistant pool
Intron | 324,202 | 324,333 | 230,606 | 231,332
Intergenic region | 2,589,457 | 2,588,540 | 1,851,943 | 1,867,897
Splicing | 612 | 620 | 449 | 444
Upstream | 192,755 | 192,753 | 133,797 | 136,203
Downstream | 162,024 | 161,574 | 113,186 | 114,671
Upstream/downstream | 12,038 | 12,088 | 8,296 | 8,733
5′ UTR | 23,658 | 23,746 | 68,071 | 16,818
3′ UTR | 29,843 | 29,595 | 20,949 | 21,579
Stop gain | 1,652 | 1,649 | 1,177 | 1,225
Stop loss | 270 | 276 | 197 | 204
Synonymous | 48,462 | 48,433 | 35,671 | 35,492
Non-synonymous | 67,206 | 67,040 | 49,333 | 49,653
SNP numbers | 3,487,363 | 3,485,674 | 2,483,855 | 2,506,185
Hom SNP number | 3,480,115 | 3,478,433 | 2,477,251 | 2,499,464
Het SNP number | 7,248 | 7,241 | 6,604 | 6,721
Hom SNP rate (%) | 99.79 | 99.79 | 99.73 | 99.73
Het SNP rate (%) | 0.21 | 0.21 | 0.27 | 0.27
Ts (transitions) | 2,264,913 | 2,264,110 | 1,606,918 | 1,617,797
Tv (transversions) | 1,218,400 | 1,217,609 | 872,951 | 884,443
Ts/Tv | 1.86 | 1.86 | 1.84 | 1.83
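The derived rows of Table 3 (Ts/Tv and the heterozygous SNP rate) follow from the count rows. The sketch below recomputes them, assuming Ts/Tv is the transition count divided by the transversion count and the heterozygous rate is heterozygous SNPs over total SNPs. The lists follow the column order of the table, and all variable names are illustrative.

```python
# Count rows from Table 3, one value per sample column (left to right).
snp_total = [3_487_363, 3_485_674, 2_483_855, 2_506_185]
het_snps = [7_248, 7_241, 6_604, 6_721]
transitions = [2_264_913, 2_264_110, 1_606_918, 1_617_797]
transversions = [1_218_400, 1_217_609, 872_951, 884_443]

# Transition/transversion ratio per column, rounded as in the table.
ts_tv = [round(ts / tv, 2) for ts, tv in zip(transitions, transversions)]

# Heterozygous SNP rate in percent per column, rounded as in the table.
het_rate = [round(100 * het / total, 2) for het, total in zip(het_snps, snp_total)]

print("Ts/Tv:", ts_tv)
print("Het SNP rate (%):", het_rate)
```

Both recomputed rows agree with the values printed in the table.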

Fig. 1

Delta SNP-index map of all chromosomes. The X-axis shows the chromosome number and the Y-axis shows the Delta SNP-index value. Dots of different colors represent the SNPs screened on different chromosomes, the red curve represents the Delta SNP-index value calculated with a sliding window, and the blue line represents the 99% confidence interval.
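To make the quantity plotted in Fig. 1 concrete, the following is a minimal Python sketch of a Delta SNP-index calculation with sliding-window averaging. It is not the authors' pipeline: the direction of the subtraction (resistant pool minus susceptible pool), the window and step sizes, the class and function names, and the toy read counts are all assumptions made for illustration, and the 99% confidence interval is not computed here.

```python
from dataclasses import dataclass


@dataclass
class SnpCounts:
    """Read counts at one SNP in the two pools (hypothetical record layout)."""
    pos: int      # position on the chromosome (bp)
    alt_r: int    # reads carrying the non-reference allele, resistant pool
    depth_r: int  # total reads covering the site, resistant pool
    alt_s: int    # reads carrying the non-reference allele, susceptible pool
    depth_s: int  # total reads covering the site, susceptible pool


def snp_index(alt_reads: int, depth: int) -> float:
    """SNP-index: fraction of reads that carry the non-reference allele."""
    return alt_reads / depth


def delta_snp_index(snp: SnpCounts) -> float:
    """Difference between the SNP-indices of the two pools.

    Assumption: resistant pool minus susceptible pool.
    """
    return snp_index(snp.alt_r, snp.depth_r) - snp_index(snp.alt_s, snp.depth_s)


def sliding_window_mean(snps, window: int, step: int):
    """Return (window start, mean Delta SNP-index) for each non-empty window."""
    if not snps:
        return []
    last_pos = max(s.pos for s in snps)
    result = []
    start = 0
    while start <= last_pos:
        values = [delta_snp_index(s) for s in snps if start <= s.pos < start + window]
        if values:
            result.append((start, sum(values) / len(values)))
        start += step
    return result


# Toy data, invented purely for illustration.
toy = [
    SnpCounts(100, 9, 10, 1, 10),
    SnpCounts(200, 8, 10, 2, 10),
    SnpCounts(1100, 5, 10, 5, 10),
    SnpCounts(1900, 6, 10, 4, 10),
]
windows = sliding_window_mean(toy, window=1000, step=1000)
for start, mean_delta in windows:
    print(f"window starting at {start} bp: mean Delta SNP-index {mean_delta:.2f}")
```

In a real analysis the windows would typically overlap and span far more SNPs; regions where the smoothed curve leaves the confidence band are the ones reported as candidate intervals.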

Fig. 2

GO enrichment of the candidate genes

Fig. 3

Top 20 pathway annotations of the candidate genes

Table 4

Information of candidate genes

No. | Gene ID | Gene annotation | Chromosome | Comparison group | Up or down a
1 | XM_003526153.4 | Cyclic nucleotide-gated ion channel 4 (CNGC4) | Chr. 6 | HR0/HR48 |
2 | XM_003533631.4 | Cellulose synthase A catalytic subunit 4 [UDP-forming] (CesA4) | Chr. 9 | HR0/HR48 |
3 | NM_001250658.2 | WRKY transcription factor 16 | Chr. 12 | HR0/HR48 | Up
4 | NM_001255802.1 | Gibberellin 2-beta-dioxygenase 8-like | Chr. 11 | HR0/HR48 | Up
5 | XM_003519602.4 | Probable amino acid permease 7 (AAP7) | Chr. 2 | HR0/HR48 | Up
6 | XM_026126143.1 | MDIS1-interacting receptor like kinase 2 | Chr. 16 | HR0/HR48 | Up
7 | XM_006601656.3 | Serine/threonine protein kinase-like protein | Chr. 18 | HR0/HR48 |
8 | NM_001248669.2 | Protein LITTLE ZIPPER 2-like (ZPR1B) | Chr. 18 | HS0/HS48 | Up
9 | XM_026126905.1 | Subtilisin-like protease Glyma18g48580 | Chr. 18 | HS0/HR0 |
10 | XM_026125778.1 | Putative disease resistance protein RGA3 | Chr. 15 | HS0/HR0 |
11 | XM_026126161.1 | Uncharacterized LOC100809946 | Chr. 16 | HS0/HR0 | Up
12 | XM_026127992.1 | Rhodanese-like domain-containing protein 10 | Chr. 3 | HS48/HR48 |
