Fig. 6 | Genome Biology

Fig. 6

From: LightGBM: accelerated genomically designed crop breeding through ensemble learning

Interpretation of highly effective SNPs recognized by LightGBM. a Ten grades of SNP sets, selected by ranking IGs from high to low, are used as features to test the precision of LightGBM on DTT, PH, and EW, in parallel with the use of all 32,559 SNPs (All) and of the same number of SNPs randomly selected from the genome. For each grade of SNP number, fivefold cross-validation (CV) is performed to generate a distribution of correlations. The red horizontal lines represent the precision of rrBLUP on DTT, PH, and EW when all 32,559 SNPs are used for prediction. b Comparison of the IG distribution of the 384 selected highly effective SNPs with the GWAS signals for DTT across the 1428 maternal lines. c The 411 maternal lines are divided into AA and GG subgroups based on their genotype at the top SNP chr8.s_123039570 in the QTL of ZCN8. The two subgroups differ significantly in expression of the associated gene ZCN8 (p = 0.0116, t-test) and in DTT phenotype (p = 1.94e−06, t-test). d Comparison of the IG distribution of the 384 selected highly effective SNPs with the GWAS signals for PH across the 1428 maternal lines. e The 411 maternal lines are divided into CC and TT subgroups based on their genotype at the top SNP chr1.s_257839283 in the QTL of BRD1. The two subgroups differ significantly in expression of the associated gene BRD1 (p = 5.35e−04, t-test) and in PH phenotype (p = 3.63e−09, t-test).
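The comparison in panel a (top-ranked SNPs versus an equally sized random set, each scored by fivefold CV correlation) can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: the helper names (`top_k_by_ig`, `random_k`, `kfold_indices`, `cv_correlations`) and the `fit_predict` callback are hypothetical, the data layout (one dict of genotype codes per line) is an assumption, and any model, LightGBM included, would have to be supplied by the caller.

```python
import math
import random


def top_k_by_ig(ig_scores, k):
    """Return the k SNP names with the highest IG score (assumed precomputed)."""
    return sorted(ig_scores, key=ig_scores.get, reverse=True)[:k]


def random_k(snp_names, k, seed=0):
    """Return k SNP names drawn at random, as the size-matched control set."""
    return random.Random(seed).sample(sorted(snp_names), k)


def kfold_indices(n_samples, n_folds=5, seed=0):
    """Yield (train, test) index lists for a shuffled k-fold split."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


def pearson(x, y):
    """Pearson correlation between two equally long numeric sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cv_correlations(X, y, snps, fit_predict, n_folds=5, seed=0):
    """Correlation of predicted vs. observed phenotype in each held-out fold.

    X is a list of dicts (SNP name -> genotype code), y the phenotypes, and
    fit_predict(X_train, y_train, X_test) a caller-supplied (hypothetical)
    function that trains a model and returns predictions for X_test.
    """
    scores = []
    for train, test in kfold_indices(len(y), n_folds, seed):
        X_train = [[X[i][s] for s in snps] for i in train]
        X_test = [[X[i][s] for s in snps] for i in test]
        preds = fit_predict(X_train, [y[i] for i in train], X_test)
        scores.append(pearson(preds, [y[i] for i in test]))
    return scores
```

Calling `cv_correlations` once with the output of `top_k_by_ig` and once with `random_k` for the same `k` gives two sets of five correlations, which is the kind of distribution the panel plots for each grade.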