
Table 1 Summary of sequence correction models

From: Detecting transcriptionally active regions using genomic tiling arrays

| Model | Formalism | Number of parameters | Average R² | Average adjusted R² | Number of transcriptionally active regions |
| --- | --- | --- | --- | --- | --- |
| Uncorrected | NA | NA | NA | NA | 47,463 |
| GC | $\log I = \beta_0 + \beta_{GC}(N_C + N_G)$ | 2 | 0.0293 | 0.0284 | 52,384 |
| Nucleotide-specific | $\log I = \beta_0 + \beta_A N_A + \beta_C N_C + \beta_G N_G$ | 4 | 0.0412 | 0.0373 | 53,982 |
| Bilinear | $\log I = \beta_0 + \sum_{i=1}^{36} \delta_i \beta_{b(i)}$ | 41 = 36 + 4 + 1 | 0.0980 | 0.0604 | 61,731 |
| Full Position-specific | $\log I = \beta_0 + \sum_{i=1}^{36} \delta_{i,b(i)}$ | 109 = 36 × 3 + 1 | 0.1709 | 0.0703 | 71,400 |

Overview of the models used to relate probe sequence to signal intensity. The Full Position-specific model has both the highest R² and the highest adjusted R², which suggests that its better fit is not simply a result of overfitting. The rightmost column shows the number of probed loci classified as transcriptionally active, which ranges from 47,463 (uncorrected) to 71,400 (Full Position-specific) depending on the sequence model used. NA, not applicable.
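
The parameter counts in the table can be checked by writing out the regressors that each linear model would see for a single probe. The following is a minimal sketch, not the authors' implementation. It assumes 36-base probes (from the sums over i = 1 to 36), reads b(i) as the base at probe position i, and assumes T is the reference base absorbed into the intercept, which is consistent with the absence of a T term in the Nucleotide-specific model but is not stated in the table. All function names are hypothetical. The Bilinear model is omitted because it is not linear in its parameters.

```python
PROBE_LENGTH = 36        # implied by the sums over i = 1..36 in the table
BASES = "ACGT"
REFERENCE_BASE = "T"     # assumption: T is the baseline absorbed into beta_0
NON_REFERENCE = [b for b in BASES if b != REFERENCE_BASE]


def check_probe(probe):
    """Reject probes with the wrong length or with non-ACGT characters."""
    if len(probe) != PROBE_LENGTH:
        raise ValueError(f"expected {PROBE_LENGTH} bases, got {len(probe)}")
    if any(base not in BASES for base in probe):
        raise ValueError("probe may contain only A, C, G and T")


def gc_features(probe):
    """GC model regressors: [1, N_C + N_G], i.e. 2 parameters."""
    check_probe(probe)
    return [1.0, float(probe.count("C") + probe.count("G"))]


def nucleotide_features(probe):
    """Nucleotide-specific regressors: [1, N_A, N_C, N_G], i.e. 4 parameters."""
    check_probe(probe)
    return [1.0] + [float(probe.count(base)) for base in "ACG"]


def position_specific_features(probe):
    """Full Position-specific regressors: an intercept plus one 0/1
    indicator per (position, non-reference base), i.e. 36 * 3 + 1 = 109."""
    check_probe(probe)
    features = [0.0] * (1 + PROBE_LENGTH * len(NON_REFERENCE))
    features[0] = 1.0  # intercept
    for i, base in enumerate(probe):
        if base != REFERENCE_BASE:
            features[1 + i * len(NON_REFERENCE) + NON_REFERENCE.index(base)] = 1.0
    return features


example_probe = "ACGT" * 9  # an arbitrary 36-base sequence for illustration
print(len(gc_features(example_probe)))                 # 2
print(len(nucleotide_features(example_probe)))         # 4
print(len(position_specific_features(example_probe)))  # 109
```

Stacking these vectors over all probes would give the design matrix for an ordinary least-squares fit of log I. The choice of reference base affects only how the coefficients are interpreted, not the number of parameters.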