Table 10 Summary of data sets used

From: Promotech: a general tool for bacterial promoter recognition

| Bacterium | Genome accession | PubMed ID | NGS technology | #TSS | Genome length (bp) | T or V |
| --- | --- | --- | --- | --- | --- | --- |
| Escherichia coli str. K-12 substr. MG1655 | NC_000913.3 | 27748404 [43] | dRNA-seq | 278 | 4,641,652 | T |
| Escherichia coli str. K-12 substr. MG1655 | NC_000913.2 | 25266388 [44] | dRNA-seq | 2672 | 4,639,675 | T |
| Helicobacter pylori 26695 | NC_000915.1 | 20164839 [45] | dRNA-seq | 1907 | 1,667,867 | T |
| Helicobacter pylori 26695 | NC_000915.1 | 30169674 [46] | dRNA-seq | 449 | 1,667,867 | T |
| Campylobacter jejuni subsp. jejuni 81116 | NC_009839.1 | 30169674 [46] | dRNA-seq | 269 | 1,628,115 | T |
| Campylobacter jejuni subsp. jejuni NCTC 11168 | NC_002163.1 | 23696746 [47] | dRNA-seq | 1905 | 1,641,481 | T |
| Campylobacter jejuni RM1221 | NC_003912.7 | 23696746 [47] | dRNA-seq | 2167 | 1,777,831 | T |
| Campylobacter jejuni subsp. jejuni 81116 | NC_009839.1 | 23696746 [47] | dRNA-seq | 1944 | 1,628,115 | T |
| Campylobacter jejuni subsp. jejuni 81-176 | NC_008787.1 | 23696746 [47] | dRNA-seq | 2003 | 1,616,554 | T |
| Streptococcus pyogenes strain S119 | LR031521.1 | 30902048 [48] | dRNA-seq | 892 | 1,877,450 | T |
| Salmonella enterica subsp. enterica serovar Typhimurium SL1344 | NC_016810.1 | 22538806 [49] | dRNA-seq | 1873 | 4,878,012 | T |
| Chlamydia pneumoniae CWL029 | NC_000922.1 | 21989159 [50] | dRNA-seq | 530 | 1,230,230 | T |
| Shewanella oneidensis MR-1 | NC_004347.2 | 24987095 [51] | dRNA-seq | 4729 | 4,969,811 | T |
| Leptospira interrogans serovar Manilae isolate L495 | NZ_LT962963.1 | 28154810 [52] | dRNA-seq | 2865 | 4,614,703 | T |
| Streptomyces coelicolor A3(2) | NC_003888.3 | 27251447 [53] | dRNA-seq | 3570 | 8,667,507 | T |
| Mycolicibacterium smegmatis str. MC2 155 | NC_008596.1 | 30984135 [54] | dRNA-seq | 4054 | 6,988,209 | V |
| Lachnoclostridium phytofermentans ISDg | NC_010001.1 | 27982035 [55] | Cappable-seq | 1187 | 4,847,594 | V |
| Rhodobacter capsulatus SB 1003 | NC_014034.1 | – [56] | dRNA-seq | 5374 | 3,738,958 | V |
| Bacillus amyloliquefaciens XH7 | CP002927.1 | 26133043 [57] | dRNA-seq | 1064 | 3,939,203 | V |

  1. In the last column, T or V indicates whether the bacterium is reserved for training or validation, respectively. The table also lists the number of TSSs per bacterium, the genome length, the next-generation sequencing (NGS) technology used to obtain the TSSs, and the PubMed ID of the literature source. A missing PubMed ID means that the source manuscript was still in preparation at the time of this publication.