Table 1 Genome-wide genomic variations in a large cotton population

From: Cotton pan-genome retrieves the lost sequences and genes during domestication and selection

| Variation type | Total (1913) | Gh cultivar (1623) | Ghlandrace (256) | GhImpUSO (438) | GhImpCHN (929) | Gb cultivar (261) | AD3-AD7 (26) |
|---|---|---|---|---|---|---|---|
| Bi-allele SNP^a | 19,246,497 | 9,546,748 | 9,265,438 | 4,766,399 | 3,761,448 | 19,473,033 | 32,878,758 |
| Splicing | 2172 | 1213 | 1149 | 652 | 554 | 2041 | 11,366 |
| Exonic | 315,404 | 179,665 | 172,718 | 103,126 | 89,208 | 316,146 | 776,644 |
| Intronic | 607,301 | 335,212 | 322,141 | 189,798 | 152,656 | 575,524 | 1,010,509 |
| UTR | 220,664 | 120,198 | 116,269 | 65,226 | 52,342 | 197,420 | 390,008 |
| Upstream | 869,678 | 448,709 | 432,640 | 238,937 | 169,788 | 789,898 | 984,811 |
| Downstream | 797,469 | 413,937 | 399,140 | 222,266 | 161,602 | 729,445 | 959,584 |
| Nonsynonymous | 195,883 | 111,686 | 107,143 | 63,008 | 52,853 | 177,474 | 420,190 |
| InDel (≤ 20 bp)^a | 4,815,125 | 3,971,277 | 3,744,299 | 1,672,195 | 1,726,445 | 3,366,481 | 7,625,077 |
| Splicing | 1202 | 1128 | 941 | 570 | 735 | 1104 | 2465 |
| Exonic | 31,661 | 27,238 | 28,815 | 12,807 | 14,826 | 26,455 | 65,677 |
| Intronic | 262,657 | 231,561 | 215,663 | 95,674 | 94,830 | 183,387 | 539,379 |
| UTR | 113,824 | 100,811 | 96,003 | 36,418 | 37,175 | 76,684 | 261,351 |
| Upstream | 578,086 | 497,660 | 413,192 | 201,122 | 226,848 | 400,965 | 927,134 |
| Downstream | 429,514 | 369,517 | 311,164 | 148,050 | 166,059 | 309,829 | 717,980 |
| Frameshift | 23,330 | 20,367 | 22,040 | 9798 | 11,029 | 19,603 | 42,328 |
| SV (> 50 bp) | 214,310 | 104,523 | 97,933 | 64,064 | 61,616 | 132,499 | 281,476 |
| Deletion^b | 32,099 | 22,340 | 9933 | 7029 | 23,559 | 13,982 | 15,484 |
| Duplication^b | 7576 | 5146 | 4766 | 1721 | NA | 3252 | 3718 |
| Inversion^b | 1112 | 724 | 615 | 310 | NA | 877 | 613 |
| Translocation^b | 357 | 240 | 188 | 167 | NA | 504 | 412 |
| CNV^c | 173,166 | 76,073 | 82,431 | 54,837 | 38,057 | 99,274 | 261,249 |

^a The 261 G. barbadense accessions were aligned to the “TM-1” reference genome. The G. barbadense population SNP and InDel calling results against the “3–79” reference genome are shown in Additional file 1: Table S5.

^b Structural variations (SVs) were genotyped in 742 cottons. The G. hirsutum TM-1 reference genome was used for detecting variations. The number of genotypes in each group is given in parentheses. “NA” represents the missing combined SVs. Duplications (DUP), inversions (INV), and translocations (TRA) were not included for the GhImpCHN population.

^c CNVs were identified in 742 cottons. Only variation in each chromosome was counted and further analyzed.