Systematic detection of putative tumor suppressor genes through the combined use of exome and transcriptome sequencing
- Qi Zhao†1,
- Ewen F Kirkness†2,
- Otavia L Caballero†1,
- Pedro A Galante3,
- Raphael B Parmigiani3,
- Lee Edsall4,
- Samantha Kuan4,
- Zhen Ye4,
- Samuel Levy5,
- Ana Tereza R Vasconcelos6,
- Bing Ren4,
- Sandro J de Souza3,
- Anamaria A Camargo3,
- Andrew JG Simpson1 and
- Robert L Strausberg1
© Zhao et al.; licensee BioMed Central Ltd. 2010
Received: 6 July 2010
Accepted: 25 November 2010
Published: 25 November 2010
To identify potential tumor suppressor genes, genome-wide data from exome and transcriptome sequencing were combined to search for genes with loss of heterozygosity and allele-specific expression. The analysis was conducted on the breast cancer cell line HCC1954, and a lymphoblast cell line from the same individual, HCC1954BL.
By comparing exome sequences from the two cell lines, we identified loss of heterozygosity events at 403 genes in HCC1954 and at one gene in HCC1954BL. The combination of exome and transcriptome sequence data also revealed 86 and 50 genes with allele-specific expression events in HCC1954 and HCC1954BL, which comprise 5.4% and 2.6% of genes surveyed, respectively. Many of the genes identified by loss of heterozygosity and allele-specific expression are known or putative tumor suppressor genes, such as BRCA1, MSH3 and SETX, which participate in DNA repair pathways.
Our results demonstrate that the combined application of high throughput sequencing to exome and allele-specific transcriptome analysis can reveal genes with known tumor suppressor characteristics, and a shortlist of novel candidates for the study of tumor suppressor activities.
Cancer arises from the accumulation of genetic and epigenetic changes that disrupt the normal regulatory controls in cells. Recently, next generation sequencing technology has been employed to identify variations in protein-coding sequences and genome structure for several types of cancers [1–9]. These studies have revealed the effectiveness of high throughput sequence analysis to identify somatic genomic alterations, such as point mutations, and structural variations, including gain and loss of chromosome regions. An important finding is that integrated analysis of the various somatic alterations is key for identifying genes that may drive cancer development and progression through oncogenic or tumor suppressor functions. Here, we combine the detection of two types of molecular events, loss of heterozygosity (LOH) and allele-specific expression (ASE), to identify genes with known and potential tumor suppressor characteristics.
The common feature of LOH and ASE is loss of expression from one allele, which has frequently been observed for tumor suppressor genes. In ASE, a dominant gene product is expressed from the selected allele. For some genes, subtle changes in expression level and balance between alleles could be physiologically significant. Haploinsufficiency of many tumor suppressor genes promotes tumorigenesis and metastasis .
ASE is classically associated with epigenomic regulation, and can be heritable. Two extreme examples are inactivation of genes on the X chromosome in female cells, and imprinting of autosomal genes . ASE can arise from epigenetic modification of the genome, including DNA methylation and histone modification [12, 13]. Genetic variations in the coding or non-coding regions of a gene are likely to influence these epigenetic controls . However, allelic differences in gene expression are variable among populations and among tissue types [15, 16], suggesting that ASE can be context specific with regard to cell type, cell differentiation status, and exposure to external stimuli. Recently, subtle differences in allelic expression have been detected for numerous human genes, and in a few cases, have been associated with a genetic predisposition to disease, including cancer [17, 18].
Previously, ASE events have been quantified genome-wide by hybridization-based [15, 19, 20] and sequencing-based methodologies. Recently, several studies have highlighted specific roles of ASE in oncogenesis, many as germline ASE [18, 21, 22]. Here, we have applied comprehensive sequence-based approaches using exome capture and transcriptome sequencing in a breast cancer cell line, HCC1954, to identify potential cancer-specific and somatically driven LOH and ASE events, and to discern their functional characteristics. This cell line, derived from a ductal breast carcinoma, is estrogen receptor negative, progesterone receptor negative and ERBB2 positive, and has been particularly well studied at the molecular level [2, 7, 23]. A matching control cell line, HCC1954BL, which was established from lymphoblast cells of the same patient, was studied in parallel. We demonstrate that combined analysis of exome and transcriptome sequences provides a dynamic image of tumor cells that is particularly relevant to tumor suppressor networks.
Application of exome sequencing to LOH detection
Table 1. Statistics of exome sequencing and read mapping
- Number of 454 reads
- Total base pairs
- Uniquely mapped reads
- Reads uniquely mapped to primary targets
- Mean target coverage
- Median target coverage
- Coverage enrichment by exome sequencing
- Total high-confidence (HC) SNVs (known SNVs)
- HC heterozygous SNVs (known SNVs)
- HC heterozygous SNVs in CDS (known SNVs)
To identify specific genes displaying LOH in HCC1954, we used more stringent criteria that required a heterozygous locus with variant allele frequency between 20% and 80% in HCC1954BL, together with homozygosity in HCC1954 (P < 0.001). In HCC1954BL, 8,203 heterozygous SNV loci were defined, with 7,848 in the coding sequence (CDS). LOH events are thus detected in 403 genes as revealed by 609 SNVs, among which 544 are known SNPs (Tables S1, S2 and S3 in Additional file 1). Most of the LOH genes are clustered together in large blocks as described above. For those single LOH genes that are isolated, we also required that the homozygous SNV in HCC1954 has been defined previously in dbSNP, that no conflicting allelic status is detected within 25 kb, and that the homozygosity of the SNV locus is supported by transcriptome reads. Genes with LOH are located on 15 chromosomes, with most on chromosomes 5 and 17, including BRCA1 (Figure 1a; Additional file 2). Using the same criteria, only one LOH gene was detected in HCC1954BL (RRAS2).
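The LOH criteria described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names and read-count inputs are hypothetical, homozygosity in the tumor line is assessed here with a one-sided cumulative binomial test, and the expected allele fraction `p_expected` defaults to 0.5 (the Materials and methods note that this value was adjusted for HCC1954).

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def is_loh(normal_ref: int, normal_alt: int,
           tumor_ref: int, tumor_alt: int,
           p_expected: float = 0.5, alpha: float = 0.001) -> bool:
    """Flag a locus as LOH: heterozygous in the normal line, homozygous in the tumor line.

    Assumptions (not taken from the source): inputs are per-allele read counts,
    and p_expected is the allele fraction expected if both alleles were retained.
    """
    normal_total = normal_ref + normal_alt
    tumor_total = tumor_ref + tumor_alt
    if normal_total == 0 or tumor_total == 0:
        return False  # no coverage, no call

    # Heterozygous in the normal line: variant allele frequency between 20% and 80%.
    vaf = normal_alt / normal_total
    if not 0.20 <= vaf <= 0.80:
        return False

    # Homozygous in the tumor line: the minor allele has significantly
    # fewer reads than expected by chance.
    minor_reads = min(tumor_ref, tumor_alt)
    return binom_cdf(minor_reads, tumor_total, p_expected) < alpha


print(is_loh(10, 10, 30, 0))   # deep tumor coverage, one allele absent
print(is_loh(10, 10, 5, 0))    # same pattern, but too few reads to reach P < 0.001
```

The second call illustrates why read depth matters: with only five tumor reads, complete absence of one allele is still compatible with chance at the 0.001 threshold.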
We compared the allelic status of SNPs that were defined in our LOH analysis with those that were genotyped by Affymetrix Genome-Wide Human SNP Array 6.0 [GEO:GSE13373]. In HCC1954BL, heterozygous SNP calls matched perfectly between the two platforms for all 345 known SNPs that were shared. Only one of 224 homozygous SNPs identified by SNP array was revealed as heterozygous by sequencing. For HCC1954, heterozygous SNP calls were also 100% consistent between the two platforms for all 172 SNPs that are shared. However, 29 of 270 (11%) homozygous SNPs, defined by SNP array, were identified as heterozygous by exome sequencing. Thus, there was a high level of consistency between the two platforms, with sequencing possibly providing greater sensitivity for cancer genomes that carry a wide spectrum of copy number variations.
Top categories of general molecular types
General molecular function
Number of genes
USP26, INPP5K, PTPRS, MAT2B
CDK7, DGKE, MAP2K4, PDGFRL
ATP2B3, SLC36A3, SIL1, ABCA7
BRCA1, FOXD4, SOX5, VEZF1
G-protein coupled receptor
GPR174, OR1A2, TAS2R7, GRM6
SEMA5A, IL31RA, ITGB3, OSMR
IL3, ERBB2IP, EDA, CXCL16
CCT8L2, CNGA2, GABRA6, GRIN3B
MGMT, PLCH1, GUCY1A3, PYGL
CTBP2, SMARCA4, BCLAF1, SPEN
FGFR2, FGFR4, IP6K2, TAOK1
SLC44A5, SLC25A5, SNX15, LBP
HLA-DQA1, HLA-A, TNFRSF10D
G-protein coupled receptor
mRNA allelotyping by transcriptome sequencing
High throughput sequencing of transcriptomes for HCC1954 and HCC1954BL was performed, with 14.0 Gbp and 13.6 Gbp generated by short read paired-end sequencing, respectively. Sequence reads were subsequently aligned to the RefSeq gene set  as well as to the human reference genome with CLCBio Genomic Workbench (see Materials and methods). With a cutoff of 1× average coverage across each gene, 14,397 and 14,251 genes were found to be expressed in HCC1954 and HCC1954BL, respectively. These numbers are comparable to previous transcriptome studies [26, 27]. The average base pair coverage for the detected transcriptome is approximately 120× for HCC1954 and 115× for HCC1954BL. For HCC1954, 7,173 transcripts displayed SNVs at a minimum of one locus per transcript, indicating that these genes are expressed from both alleles (see Materials and methods). The remaining 7,224 transcripts lack detectable allelic variation. These include many cases in which coverage is not sufficient to make a call for allelic variation. For HCC1954BL, 7,595 genes have detectable allelic variation within transcribed regions, while variants were not detectable in transcripts of 6,656 genes.
Allele-specific expression detection
Selected list of allele-specific expression genes detected in the HCC1954 cell line
Number of reads ratio major/minor
Known SNV ID
C-terminal binding protein 2
3 novel SNVs
Major histocompatibility complex, class I, A
TAO kinase 1
Plasminogen activator, urokinase
Integrin, beta 2
RAB3A, member RAS oncogene family
KIN, antigenic determinant of recA protein homolog (mouse)
Galactosidase, beta 1-like 2
Pleckstrin homology domain containing, family A member 6
Threonine synthase-like 2 (S. cerevisiae)
Lipopolysaccharide binding protein
Fibroblast growth factor receptor 4
Solute carrier family 44, member 5
Fibroblast growth factor receptor 2
A similar data mining process was performed for HCC1954BL. There were 7,848 SNVs in 4,441 genes identified by exome sequencing. Of these, 766 (9.7%) are novel. A total of 3,086 of the 7,848 SNVs were found in 1,918 genes, each of which was represented by at least 20 transcriptome reads. Comparison of SNVs in the exome and transcriptome data suggests that 50 genes are under ASE regulation as demonstrated by 117 SNVs (Table S6 in Additional file 1). The chromosomal distribution of the 1,918 candidate genes and the 50 ASE genes is shown in Figure 3b.
Biological categorization of the 86 ASE genes in HCC1954 shows that many of them are associated with cell-cell signaling and interactions, with 16 encoding cell surface proteins and five encoding extracellular matrix proteins. Of the 16 cell surface proteins, seven are transmembrane receptors, including kinases in the FGFR family and G-protein coupled receptors (Table 2).
For HCC1954 and HCC1954BL combined, approximately two-thirds of the ASE genes had a single SNV locus as supported by the exome data in their CDSs while the remainder had multiple exomic SNVs for ASE concordance (Table 3). In the latter cases, the most significant P-value of the ASE locus was used.
Twenty-two ASE genes are shared by both cell lines, and five of these are located on chromosome X. For all shared ASE genes that are not on the X chromosome, the same allele was preferentially expressed in both cell lines, suggesting that common genomic sequence variants are the controlling factors for these ASE events. For 24 genes that display ASE in HCC1954, there was no preferential expression from either allele in HCC1954BL. For 26 ASE genes in HCC1954, it was not possible to determine their status in HCC1954BL because of low or undetectable expression. The remaining 14 ASE genes in HCC1954 have no genotyping status in HCC1954BL due to low exome sequencing coverage, but are likely to be ASE genes in HCC1954BL since 93% of the exome genotypes are in dbSNP, and all have biased allele expression patterns detected in the transcriptome. Only three ASE genes are unique to HCC1954BL; these are expressed from both alleles in HCC1954.
As expected, chromosome X carries ASE genes most frequently in both cell lines. The other ASE genes are distributed across most of the autosomes (Figure 3). Clustering of ASE genes is not observed in the same genomic regions; thus, ASE events are more likely to be individually controlled. Chromosome X harbors none of the unique ASE genes in HCC1954, but two unique ASE genes in HCC1954BL, suggesting that there has been differential escape from X-inactivation between the two cell lines.
Genotyping by exome sequencing and allelotyping by transcriptome sequencing revealed additional genomic aberrations. For example, local genomic disruption at a locus may result in detection of a single allele from transcriptome sequencing. Indeed, in our previous report on transcriptome studies of the same HCC1954 cell line , we identified a genomic inversion event at the PHF20L1 gene locus. It was predicted that transcription of PHF20L1 would be impaired for the rearranged allele, leaving the other allele intact. Identification of PHF20L1 as a gene expressed from only one allele in this study agrees with our previous findings. This indirectly demonstrates that our strategy can detect a spectrum of ASE events in the genome.
We identified two additional genes in HCC1954, GPR56 and FAAH2, for which the transcriptome sequence data were ambiguous. Although each gene is heterozygous at two known SNP loci, only one SNP locus has monoallelic expression while the other distant SNP is expressed from both alleles. We speculate that either local genomic rearrangement or transcription from the opposite strand occurs in HCC1954. It is also possible that there are alternative transcript forms for these two genes, and only one form has unbalanced expression.
Experimental validation of allele-specific expression events
Interestingly, FGFR2, a kinase receptor gene, undergoes ASE in HCC1954 (Figure 4a). The FGFR2 gene is known to be expressed in multiple alternative splicing forms. It is transcribed in the form of FGFR2b in mammary epithelial cells, and FGFR2c in surrounding mesenchymal cells. After de novo assembly of the Illumina cDNA reads, FGFR2b was found to be the only isoform expressed in HCC1954. FGFR2 is heterozygous, as shown by the exonic SNV rs1047100, a synonymous SNV at V232 (GTA versus GTG), but it is transcribed as FGFR2b from only one allele (GTA), as revealed by mRNA reads at rs1047100 (Figure 4a).
Another validated ASE gene is MAP9 on chromosome 4, which encodes a microtubule-associated protein required for spindle function, mitotic progression, and cytokinesis (Figure 4b). FANCB, a member of the Fanconi anemia complementation group (FANC) on chromosome X, was also confirmed to be inactivated on one allele in HCC1954 (Figure 4c). Unequal peak heights between two genomic DNA alleles likely result from the pseudo-tetraploid genome status and copy number variation in HCC1954.
Exome and transcriptome sequencing captures a snapshot of the active genome in a cell population. In addition to revealing SNVs and relative gene expression levels in a sample, the combined data can be used to distinguish active from inactive alleles. By mining sequence data from exomes and transcriptomes, we have identified LOH events and ASE genes in the breast cancer cell line HCC1954 and a lymphoblast cell line from the same individual, HCC1954BL. Our approach demonstrates that the search for genome-wide allele-specific events is feasible with systematic application of sequencing technologies.
Because HCC1954 is pseudo-tetraploid with frequent copy number variation, similar numbers of sequence reads often gave lower average coverage of the minor allele in the HCC1954 exome than in the HCC1954BL exome. Thus, a lower number of high-confidence SNVs detected in HCC1954 is expected. This number would be expected to increase with even greater sequence coverage. After combining with transcriptome sequence data, the SNVs with a minimum of 20× coverage by transcriptome reads were used for ASE mining. We also observed greater variation in mRNA expression levels in the cancer cell line, yielding fewer SNVs with deep transcript coverage for ASE mining. The combined use of exome-capture and transcriptome sequencing focuses on SNVs in genes and captures novel SNVs that were absent from previously published array-based approaches [17, 19, 31, 32].
In general, the total number of heterozygous SNVs detected by the exome-capture sequencing is less than that identified by transcriptome sequencing. This can be attributed to heterozygous allelic variations residing in 5' and 3' UTRs of mRNAs that are not targeted by probes on the exome array. Expansion of targeted regions of the exome array to non-CDS exons would provide additional informative SNVs.
In recent years, experimental evidence has shown that haploinsufficiency of tumor suppressor genes can serve to drive the tumorigenic process. Genetic, epigenetic and environmental factors can modify this haploinsufficiency to promote the tumor phenotype. First, association between LOH and tumor susceptibility is significant only when several tumor suppressor genes are involved in the LOH events [10, 33]. Second, in addition to common tumor suppressor genes shared by many cancer types, like RB1 and TP53, many tumor suppressor genes are specific to a particular tumor type and/or the cell type from which the tumor originates. Deficiency of BRCA1 and BRCA2 is mainly found in breast and ovarian cancers thus far. Third, epigenetic silencing of tumor suppressor genes is achieved by different mechanisms, such as DNA methylation and histone modification. These observations suggest that additional tumor suppressor genes remain to be discovered for specific tumor types.
Selected list of LOH or ASE genes: known or putative tumor suppressor genes
Gene product and functional properties
Reported functional studies in cancer
Breast cancer 1, a nuclear phosphoprotein involved in maintaining DNA stability
Tumor suppressor function 
MutS homolog 3, a subunit of MutS beta involved in DNA mismatch repair
Genetic instability caused by loss of MSH3 in cancers 
Polycomb group ring finger 2, involved in protein-protein interaction and transcription repression
Tumor suppressor function 
Platelet-derived growth factor receptor-like, a cell surface tyrosine kinase receptor
Breakpoint cluster region
Putative tumor suppressor in meningiomas 
Desmocollin 3, a cell adhesion molecule in cadherin family
Epigenetic silencing of DSC3 is a common event in breast cancer 
Fibroblast growth factor receptor 2, a transmembrane tyrosine kinase
Hypermethylation of FGFR2 found in gastric cancer 
Myeloma overexpressed, a putative transforming gene
Epigenetically inactivated in esophageal squamous cell carcinomas 
Tumor necrosis factor receptor superfamily, member 10 d, a member of TNF-receptor superfamily
O-6-methylguanine-DNA methyltransferase, a DNA repair gene
FGFRs, which have been implicated in breast cancer development, are reported to be allele-specifically expressed for the first time in a breast cancer cell in this study. FGFR2 has been identified as a risk factor in breast cancer by association studies [30, 35–37]. Two intronic SNVs in FGFR2 have been reported to increase susceptibility to breast cancer by regulating the downstream gene expression level . FGFR2 was identified as a CAN gene by combined genomic studies in breast and colorectal cancers . Moreover, prostate and bladder cancers with reduced FGFR2b expression show poorer prognosis due to increased potential for invasion and metastasis [38, 39]. We can speculate that FGFR2 functions as a tumor suppressor in breast cancer, as well as FGFR4, for which functions are still unknown.
Our analysis of the combined effect of LOH and ASE in HCC1954 reveals additional genes that may have tumor suppressor or other functions within this breast cancer cell (summarized in Additional file 1). Recently, several studies have demonstrated the importance of comprehensive characterization of diverse molecular events toward discerning genes and pathways that potentially play a role in tumorigenesis. For example, gene activation can result from various events, such as point mutations that activate a protein product, gene amplification, and gene fusion, as well as epigenetic alteration. Here we demonstrate that the combined approach of exome sequencing and transcript analysis can reveal LOH and ASE events that can each result in haploinsufficiency for specific genes. ASE reflects various types of fluidic genomic alterations, including those that are epigenetic, and thus provides a unique insight to the changing status of cancer cells. This approach will further facilitate the process of identifying additional CAN genes and better define drivers of the tumorigenesis process. We note that genetic alterations in immortalized cell lines may not accurately reflect those changes in the cells from which they were derived. Nevertheless, the proof of principle study described here demonstrates that application of this approach to clinical samples such as tumor cells, stromal cells, fibroblasts, and infiltrating T-cells would likely provide additional definition to the significance of ASE in cancer. Our study demonstrates the feasibility of such approaches based on the ever-increasing power of next generation sequencing.
Materials and methods
Exome sequencing and mapping
The cell lines HCC1954 and HCC1954BL were obtained from ATCC. They were maintained in RPMI medium containing 10% fetal bovine serum, 2 mM L-glutamine and non-essential amino acids. Total DNA was isolated from the cell pellets using the DNeasy Blood and Tissue Kit (Qiagen, Valencia, CA, USA). Genomic DNA was treated with RNase Cocktail™ (Ambion, Austin, TX, USA), followed by phenol-chloroform extraction and precipitation of the aqueous phase in 1/10 volume 3 M sodium acetate and 100% ethanol.
Exome capture was performed using 5 μg of input DNA according to the manufacturer's protocol (Roche Nimblegen, Madison, WI, USA). Briefly, genomic DNA was nebulized for 1 minute using 45 psi of pressure. Sheared DNA fragments were subsequently cleaned with the DNA Clean and Concentrator-25 Kit (Zymo Research, Orange, CA, USA) and a fragment size distribution ranging from 300 to 500 bp was verified via Bioanalyzer (Agilent, Santa Clara, CA, USA). After end-polishing of the genomic fragments, the GS FLX Titanium adaptors were ligated to the sheared genomic fragments. Ligated fragments were next hybridized to the Nimblegen Sequence Capture 2.1 M exome array within Maui hybridization stations, followed by washing and elution of array-bound fragments from the arrays within elution chambers (Nimblegen). Captured fragments were next subjected to 27 rounds of PCR amplification using primers targeting the Nimblegen linkers. Following elution, the capture efficiency was evaluated via quantitative PCR reactions. Six full runs of 454 Titanium were performed for the captured fragments for each cell line. 454 reads were aligned to the human reference genome (hg18) using gsMapper. All raw reads have been deposited to the EBI Sequence Read Archive (SRA; submission ID ERA010917).
Transcriptome sequencing and mapping
Total RNA was isolated from the cell pellets using the RNeasy Mini Kit (Qiagen). Total RNA was treated with DNase I (New England Biolabs, Ipswich, MA, USA) and purified with Qiagen RNeasy columns (Qiagen). DNA-free RNA yield and purity were initially assessed by spectrophotometry. PolyA+ RNA was prepared from 500 μg of total RNA with oligo(dT) beads using the Oligotex mRNA Mini Kit (Qiagen). First-strand cDNA was prepared from 1 μg of poly(A)+ RNA with 200 pmol oligo random primers by using 300 units of Superscript II reverse transcriptase (Invitrogen, Carlsbad, CA, USA). Second-strand synthesis was performed in 20 μl at 16°C for 2 h after addition of 10 units of Escherichia coli DNA ligase, 40 units of E. coli DNA polymerase, and 2 units of RNase H (all from Invitrogen). T4 DNA polymerase (5 units) was added and incubated for 5 minutes at 16°C. Double-strand cDNA was purified by phenol-chloroform extraction and precipitation of the aqueous phase in 1/10 volume 3 M sodium acetate and 100% ethanol.
The Illumina GAII sequencing procedure was carried out for paired-end short read sequencing. The RefSeq gene set was retrieved from the NCBI website on 3 December 2009. The approximately 75-Mb dataset comprises 41,249 transcript entries. The longest alternative form for each gene was used as reference in the assembly process. Human reference genome build 36 was also used as the assembly reference. Solexa short reads were mapped to the references using the CLC Bio Genomic Workbench suite (CLC Bio, 8200 Aarhus N, Denmark). A stringent cutoff was used, requiring unique read mapping and allowing 2-bp mismatch for each read. Expression level was calculated for each gene as RPKM, the number of reads mapped per kilobase of exon model per million mapped reads. Transcriptome reads are accessible through the EBI-SRA (submission ID ERA011762).
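The RPKM definition given above corresponds to a one-line calculation. The following Python sketch is illustrative only; the function and parameter names are hypothetical and the example numbers are invented for demonstration, not taken from the study.

```python
def rpkm(reads_in_gene: int, exon_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of exon model per million mapped reads.

    RPKM = reads / (exon length in kb * total mapped reads in millions)
         = reads * 1e9 / (exon length in bp * total mapped reads)
    """
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and total mapped reads must be positive")
    return reads_in_gene * 1e9 / (exon_length_bp * total_mapped_reads)


# Invented example: 500 reads on a 2 kb exon model, 10 million mapped reads in total.
print(rpkm(500, 2_000, 10_000_000))
```

Normalizing by both transcript length and library size is what allows expression levels to be compared between genes and between the two cell lines.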
Single nucleotide variant calling
For exome sequencing, SNVs for variant alleles were drawn from the default high-confidence SNV calls by gsMapper (the 454HCDiffs.txt file), which are defined as variant alleles supported by at least three non-duplicated high-quality reads with at least 10% variant allele frequency. An annotated SNP file (snp130) was downloaded from NCBI to identify known SNPs. Only SNVs in the CDS were used for downstream LOH and ASE analysis.
For transcriptome sequencing, SNVs were called on all assembled contigs using CLCBio SNV detection tools. A minimum quality of 30 was required for the central SNV base and 15 required for the surrounding bases. A SNV for a minor allele required at least four reads or at least 30% variant allele frequency.
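The two read-support thresholds described above can be written as simple predicates. This Python sketch is a hedged illustration, not the gsMapper or CLC Bio implementation: the function names are hypothetical, the inputs are assumed to be counts of reads that already passed the stated quality and duplicate filters, and the base-quality checks themselves are not modeled.

```python
def passes_exome_filter(variant_reads: int, total_reads: int) -> bool:
    """Exome rule: at least three supporting reads AND at least 10% variant allele frequency."""
    if total_reads <= 0:
        return False
    # Integer arithmetic avoids floating-point edge cases at exactly 10%.
    return variant_reads >= 3 and 10 * variant_reads >= total_reads


def passes_transcriptome_filter(variant_reads: int, total_reads: int) -> bool:
    """Transcriptome rule: at least four supporting reads OR at least 30% variant allele frequency."""
    if total_reads <= 0:
        return False
    return variant_reads >= 4 or 10 * variant_reads >= 3 * total_reads


print(passes_exome_filter(3, 20))           # 3 reads, 15% frequency
print(passes_exome_filter(3, 100))          # 3 reads, but only 3% frequency
print(passes_transcriptome_filter(3, 9))    # below 4 reads, but 33% frequency
```

Note the difference in logic: the exome rule requires both conditions, whereas the transcriptome rule as stated accepts either one.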
Statistical significance (P-value) of LOH and ASE
A binomial distribution function was used to assess the significance of both LOH and ASE by calculating the probability that the observed reads are randomly distributed between the two alleles.
For biased read coverage: P = BINOMDIST(#reads for rare allele, #total reads, p_s, TRUE), where p_s is the probability of success in each trial. For a normal diploid genome like HCC1954BL, p_s is set to 0.5. However, the p_s value is adjusted for the HCC1954 genome based on variant allele frequency data from exome sequencing. Multiple-testing correction was also applied to the P-values at uneven allelic loci in transcriptome sequencing. We used the 2,534 SNVs that have the minimum 20× coverage in the transcriptome for the multiple-testing correction in HCC1954.
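The spreadsheet function BINOMDIST with its last argument set to TRUE returns the cumulative binomial probability, so the calculation above can be reproduced in a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, the read counts are invented, and because the text does not name the multiple-testing correction method, a Bonferroni correction is assumed here purely for illustration.

```python
from math import comb


def binomdist_cumulative(successes: int, trials: int, p_s: float) -> float:
    """Cumulative binomial probability P(X <= successes), X ~ Binomial(trials, p_s)."""
    return sum(
        comb(trials, k) * p_s**k * (1 - p_s) ** (trials - k)
        for k in range(successes + 1)
    )


def bonferroni(p_value: float, n_tests: int) -> float:
    """Bonferroni-adjusted P-value (ASSUMED method; the source does not specify one)."""
    return min(1.0, p_value * n_tests)


N_TESTS_HCC1954 = 2534  # SNVs with at least 20x transcriptome coverage, as stated in the text

# Invented example: 1 read for the rare allele out of 40 total, diploid expectation p_s = 0.5.
p = binomdist_cumulative(1, 40, 0.5)
p_adjusted = bonferroni(p, N_TESTS_HCC1954)
print(p, p_adjusted)
```

For a locus in HCC1954, `p_s` would be replaced by the allele fraction estimated from the exome data rather than 0.5, so that a copy-number imbalance in the genome is not mistaken for allele-specific expression.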
Validation of allele-specific expression
In general, genomic DNA flanking the SNV loci to be tested was amplified by using intronic primer pairs. The cDNA fragments were produced by RT-PCR using exonic primer pairs crossing adjacent exons. Sanger sequencing was applied to the amplified genomic DNA and cDNA. Total RNA and DNA from the cell pellets were isolated using the RNeasy Mini Kit and DNeasy Blood and Tissue Kit (Qiagen). Genomic DNA was treated with RNase Cocktail™ (Ambion), followed by phenol-chloroform extraction and precipitation of the aqueous phase in 1/10 volume 3 M sodium acetate and 100% ethanol. Total RNA was treated with DNase I (New England Biolabs) and purified with Qiagen RNeasy columns (Qiagen). DNA-free RNA yield and purity were assessed by spectrophotometry and denaturing agarose gels. A total of 0.5 to 1.0 μg of RNA was reverse transcribed into cDNA by using an Omniscript RT kit according to the manufacturer's protocol using oligo (dT)18 primers. PCRs from the genomic DNA and RT-PCR were undertaken using High-Fidelity Platinum Taq (Invitrogen) plus 10 pmol of each of the primers listed in Table S7 in Additional file 1. After gel purification, the amplicons were submitted to Sanger sequencing with the PCR primers.
Molecular functional network
Primary molecular functions and networks involved were analyzed with the IPA software developed by Ingenuity (Redwood City, CA, USA).
LOH: loss of heterozygosity
RT-PCR: reverse transcription PCR
SNP: single nucleotide polymorphism
SNV: single nucleotide variant
SRA: Sequence Read Archive
We thank Dr Jiaqi Huang and Dr Pauline Ng for technical assistance and discussions. This work is supported in part by the Hilton-Ludwig Cancer Metastasis Initiative, funded by the Conrad N Hilton Foundation and the Ludwig Institute for Cancer Research.
- Bignell GR, Greenman CD, Davies H, Butler AP, Edkins S, Andrews JM, Buck G, Chen L, Beare D, Latimer C, Widaa S, Hinton J, Fahey C, Fu B, Swamy S, Dalgliesh GL, Teh BT, Deloukas P, Yang F, Campbell PJ, Futreal PA, Stratton MR: Signatures of mutation and selection in the cancer genome. Nature. 2010, 463: 893-898. 10.1038/nature08768.
- Stephens PJ, McBride DJ, Lin ML, Varela I, Pleasance ED, Simpson JT, Stebbings LA, Leroy C, Edkins S, Mudie LJ, Greenman CD, Jia M, Latimer C, Teague JW, Lau KW, Burton J, Quail MA, Swerdlow H, Churcher C, Natrajan R, Sieuwerts AM, Martens JW, Silver DP, Langerød A, Russnes HE, Foekens JA, Reis-Filho JS, van't Veer L, Richardson AL, Børresen-Dale AL, et al: Complex landscapes of somatic rearrangement in human breast cancer genomes. Nature. 2009, 462: 1005-1010. 10.1038/nature08645.
- Pleasance ED, Stephens PJ, O'Meara S, McBride DJ, Meynert A, Jones D, Lin ML, Beare D, Lau KW, Greenman C, Varela I, Nik-Zainal S, Davies HR, Ordoñez GR, Mudie LJ, Latimer C, Edkins S, Stebbings L, Chen L, Jia M, Leroy C, Marshall J, Menzies A, Butler A, Teague JW, Mangion J, Sun YA, McLaughlin SF, Peckham HE, Tsung EF, et al: A small-cell lung cancer genome with complex signatures of tobacco exposure. Nature. 2010, 463: 184-190. 10.1038/nature08629.
- Pleasance ED, Cheetham RK, Stephens PJ, McBride DJ, Humphray SJ, Greenman CD, Varela I, Lin ML, Ordóñez GR, Bignell GR, Ye K, Alipaz J, Bauer MJ, Beare D, Butler A, Carter RJ, Chen L, Cox AJ, Edkins S, Kokko-Gonzales PI, Gormley NA, Grocock RJ, Haudenschild CD, Hims MM, James T, Jia M, Kingsbury Z, Leroy C, Marshall J, Menzies A, et al: A comprehensive catalogue of somatic mutations from a human cancer genome. Nature. 2010, 463: 191-196. 10.1038/nature08658.
- Parsons DW, Jones S, Zhang X, Lin JC, Leary RJ, Angenendt P, Mankoo P, Carter H, Siu IM, Gallia GL, Olivi A, McLendon R, Rasheed BA, Keir S, Nikolskaya T, Nikolsky Y, Busam DA, Tekleab H, Diaz LA, Hartigan J, Smith DR, Strausberg RL, Marie SK, Shinjo SM, Yan H, Riggins GJ, Bigner DD, Karchin R, Papadopoulos N, Parmigiani G, et al: An integrated genomic analysis of human glioblastoma multiforme. Science. 2008, 321: 1807-1812. 10.1126/science.1164382.
- Wood LD, Parsons DW, Jones S, Lin J, Sjöblom T, Leary RJ, Shen D, Boca SM, Barber T, Ptak J, Silliman N, Szabo S, Dezso Z, Ustyanksky V, Nikolskaya T, Nikolsky Y, Karchin R, Wilson PA, Kaminker JS, Zhang Z, Croshaw R, Willis J, Dawson D, Shipitsin M, Willson JK, Sukumar S, Polyak K, Park BH, Pethiyagoda CL, Pant PV, et al: The genomic landscapes of human breast and colorectal cancers. Science. 2007, 318: 1108-1113. 10.1126/science.1145720.
- Sjöblom T, Jones S, Wood LD, Parsons DW, Lin J, Barber TD, Mandelker D, Leary RJ, Ptak J, Silliman N, Szabo S, Buckhaults P, Farrell C, Meeh P, Markowitz SD, Willis J, Dawson D, Willson JK, Gazdar AF, Hartigan J, Wu L, Liu C, Parmigiani G, Park BH, Bachman KE, Papadopoulos N, Vogelstein B, Kinzler KW, Velculescu VE: The consensus coding sequences of human breast and colorectal cancers. Science. 2006, 314: 268-274. 10.1126/science.1133427.
- Beroukhim R, Mermel CH, Porter D, Wei G, Raychaudhuri S, Donovan J, Barretina J, Boehm JS, Dobson J, Urashima M, Mc Henry KT, Pinchback RM, Ligon AH, Cho YJ, Haery L, Greulich H, Reich M, Winckler W, Lawrence MS, Weir BA, Tanaka KE, Chiang DY, Bass AJ, Loo A, Hoffman C, Prensner J, Liefeld T, Gao Q, Yecies D, Signoretti S, et al: The landscape of somatic copy-number alteration across human cancers. Nature. 2010, 463: 899-905. 10.1038/nature08822.
- Chiang DY, Getz G, Jaffe DB, O'Kelly MJ, Zhao X, Carter SL, Russ C, Nusbaum C, Meyerson M, Lander ES: High-resolution mapping of copy-number alterations with massively parallel sequencing. Nat Methods. 2009, 6: 99-103. 10.1038/nmeth.1276.
- Payne SR, Kemp CJ: Tumor suppressor genetics. Carcinogenesis. 2005, 26: 2031-2045. 10.1093/carcin/bgi223.
- Knight JC: Allele-specific gene expression uncovered. Trends Genet. 2004, 20: 113-116. 10.1016/j.tig.2004.01.001.
- Li E, Beard C, Jaenisch R: Role for DNA methylation in genomic imprinting. Nature. 1993, 366: 362-365. 10.1038/366362a0.
- Carr MS, Yevtodiyenko A, Schmidt CL, Schmidt JV: Allele-specific histone modifications regulate expression of the Dlk1-Gtl2 imprinted domain. Genomics. 2007, 89: 280-290. 10.1016/j.ygeno.2006.10.005.
- Pickrell JK, Marioni JC, Pai AA, Degner JF, Engelhardt BE, Nkadori E, Veyrieras JB, Stephens M, Gilad Y, Pritchard JK: Understanding mechanisms underlying human gene expression variation with RNA sequencing. Nature. 2010, 464: 768-772. 10.1038/nature08872.
- Lo HS, Wang Z, Hu Y, Yang HH, Gere S, Buetow KH, Lee MP: Allelic variation in gene expression is common in the human genome. Genome Res. 2003, 13: 1855-1862. 10.1101/gr.885403.
- Yan H, Yuan W, Velculescu VE, Vogelstein B, Kinzler KW: Allelic variation in human gene expression. Science. 2002, 297: 1143. 10.1126/science.1072545.
- Zhang K, Li JB, Gao Y, Egli D, Xie B, Deng J, Li Z, Lee JH, Aach J, Leproust EM, Eggan K, Church GM: Digital RNA allelotyping reveals tissue-specific and allele-specific gene expression in human. Nat Methods. 2009, 6: 613-618. 10.1038/nmeth.1357.
- Valle L, Serena-Acedo T, Liyanarachchi S, Hampel H, Comeras I, Li Z, Zeng Q, Zhang HT, Pennison MJ, Sadim M, Pasche B, Tanner SM, de la Chapelle A: Germline allele-specific expression of TGFBR1 confers an increased risk of colorectal cancer. Science. 2008, 321: 1361-1365. 10.1126/science.1159397.
- Serre D, Gurd S, Ge B, Sladek R, Sinnett D, Harmsen E, Bibikova M, Chudin E, Barker DL, Dickinson T, Fan JB, Hudson TJ: Differential allelic expression in the human genome: a robust approach to identify genetic and epigenetic cis-acting mechanisms regulating gene expression. PLoS Genet. 2008, 4: e1000006. 10.1371/journal.pgen.1000006.
- Milani L, Gupta M, Andersen M, Dhar S, Fryknas M, Isaksson A, Larsson R, Syvanen AC: Allelic imbalance in gene expression as a guide to cis-acting regulatory single nucleotide polymorphisms in cancer cells. Nucleic Acids Res. 2007, 35: e34. 10.1093/nar/gkl1152.
- Yan H, Dobbie Z, Gruber SB, Markowitz S, Romans K, Giardiello FM, Kinzler KW, Vogelstein B: Small changes in expression affect predisposition to tumorigenesis. Nat Genet. 2002, 30: 25-26. 10.1038/ng799.
- Raval A, Tanner SM, Byrd JC, Angerman EB, Perko JD, Chen SS, Hackanson B, Grever MR, Lucas DM, Matkovic JJ, Lin TS, Kipps TJ, Murray F, Weisenburger D, Sanger W, Lynch J, Watson P, Jansen M, Yoshinaga Y, Rosenquist R, de Jong PJ, Coggill P, Beck S, Lynch H, de la Chapelle A, Plass C: Downregulation of death-associated protein kinase 1 (DAPK1) in chronic lymphocytic leukemia. Cell. 2007, 129: 879-890. 10.1016/j.cell.2007.03.043.
- Bignell GR, Santarius T, Pole JC, Butler AP, Perry J, Pleasance E, Greenman C, Menzies A, Taylor S, Edkins S, Campbell P, Quail M, Plumb B, Matthews L, McLay K, Edwards PA, Rogers J, Wooster R, Futreal PA, Stratton MR: Architectures of somatic genomic rearrangement in human cancer amplicons at sequence-level resolution. Genome Res. 2007, 17: 1296-1303. 10.1101/gr.6522707.
- Catalogue of Somatic Mutations in Cancer - COSMIC. [http://www.sanger.ac.uk/genetics/CGP/cosmic]
- NCBI Reference Sequence (RefSeq). [http://www.ncbi.nlm.nih.gov/RefSeq]
- Sugarbaker DJ, Richards WG, Gordon GJ, Dong L, De Rienzo A, Maulik G, Glickman JN, Chirieac LR, Hartman ML, Taillon BE, Du L, Bouffard P, Kingsmore SF, Miller NA, Farmer AD, Jensen RV, Gullans SR, Bueno R: Transcriptome sequencing of malignant pleural mesothelioma tumors. Proc Natl Acad Sci USA. 2008, 105: 3521-3526. 10.1073/pnas.0712399105.
- Jongeneel CV, Delorenzi M, Iseli C, Zhou D, Haudenschild CD, Khrebtukova I, Kuznetsov D, Stevenson BJ, Strausberg RL, Simpson AJ, Vasicek TJ: An atlas of human gene expression from massively parallel signature sequencing (MPSS). Genome Res. 2005, 15: 1007-1014. 10.1101/gr.4041005.
- The Single Nucleotide Polymorphism database (dbSNP). [http://www.ncbi.nlm.nih.gov/projects/SNP]
- Zhao Q, Caballero OL, Levy S, Stevenson BJ, Iseli C, de Souza SJ, Galante PA, Busam D, Leversha MA, Chadalavada K, Rogers YH, Venter JC, Simpson AJ, Strausberg RL: Transcriptome-guided characterization of genomic rearrangements in a breast cancer cell line. Proc Natl Acad Sci USA. 2009, 106: 1886-1891. 10.1073/pnas.0812945106.
- Katoh M: Cancer genomics and genetics of FGFR2. Int J Oncol. 2008, 33: 233-237.
- Gimelbrant A, Hutchinson JN, Thompson BR, Chess A: Widespread monoallelic expression on human autosomes. Science. 2007, 318: 1136-1140. 10.1126/science.1148910.
- Tuch BB, Laborde RR, Xu X, Gu J, Chung CB, Monighetti CK, Stanley SJ, Olsen KD, Kasperbauer JL, Moore EJ, Broomer AJ, Tan R, Brzoska PM, Muller MW, Siddiqui AS, Asmann YW, Sun Y, Kuersten S, Barker MA, De La Vega FM, Smith DI, et al: Tumor transcriptome sequencing reveals allelic expression imbalances associated with copy number alterations. PLoS One. 2010, 5: e9317. 10.1371/journal.pone.0009317.
- Hanby AM, Kelsell DP, Potts HW, Gillett CE, Bishop DT, Spurr NK, Barnes DM: Association between loss of heterozygosity of BRCA1 and BRCA2 and morphological attributes of sporadic breast cancer. Int J Cancer. 2000, 88: 204-208. 10.1002/1097-0215(20001015)88:2<204::AID-IJC9>3.0.CO;2-1.
- Leary RJ, Lin JC, Cummins J, Boca S, Wood LD, Parsons DW, Jones S, Sjöblom T, Park BH, Parsons R, Willis J, Dawson D, Willson JK, Nikolskaya T, Nikolsky Y, Kopelovich L, Papadopoulos N, Pennacchio LA, Wang TL, Markowitz SD, Parmigiani G, Kinzler KW, Vogelstein B, Velculescu VE: Integrated analysis of homozygous deletions, focal amplifications, and sequence alterations in breast and colorectal cancers. Proc Natl Acad Sci USA. 2008, 105: 16224-16229. 10.1073/pnas.0808041105.
- Meyer KB, Maia AT, O'Reilly M, Teschendorff AE, Chin SF, Caldas C, Ponder BA: Allele-specific up-regulation of FGFR2 increases susceptibility to breast cancer. PLoS Biol. 2008, 6: e108. 10.1371/journal.pbio.0060108.
- Easton DF, Pooley KA, Dunning AM, Pharoah PD, Thompson D, Ballinger DG, Struewing JP, Morrison J, Field H, Luben R, Wareham N, Ahmed S, Healey CS, Bowman R; SEARCH collaborators, Meyer KB, Haiman CA, Kolonel LK, Henderson BE, Le Marchand L, Brennan P, Sangrajrang S, Gaborieau V, Odefrey F, Shen CY, Wu PE, Wang HC, Eccles D, Evans DG, Peto J, et al: Genome-wide association study identifies novel breast cancer susceptibility loci. Nature. 2007, 447: 1087-1093. 10.1038/nature05887.
- Hunter DJ, Kraft P, Jacobs KB, Cox DG, Yeager M, Hankinson SE, Wacholder S, Wang Z, Welch R, Hutchinson A, Wang J, Yu K, Chatterjee N, Orr N, Willett WC, Colditz GA, Ziegler RG, Berg CD, Buys SS, McCarty CA, Feigelson HS, Calle EE, Thun MJ, Hayes RB, Tucker M, Gerhard DS, Fraumeni JF, Hoover RN, Thomas G, Chanock SJ: A genome-wide association study identifies alleles in FGFR2 associated with risk of sporadic postmenopausal breast cancer. Nat Genet. 2007, 39: 870-874. 10.1038/ng2075.
- Bernard-Pierrot I, Ricol D, Cassidy A, Graham A, Elvin P, Caillault A, Lair S, Broet P, Thiery JP, Radvanyi F: Inhibition of human bladder tumour cell growth by fibroblast growth factor receptor 2b is independent of its kinase activity. Involvement of the carboxy-terminal region of the receptor. Oncogene. 2004, 23: 9201-9211.
- Ricol D, Cappellen D, El Marjou A, Gil-Diez-de-Medina S, Girault JM, Yoshida T, Ferry G, Tucker G, Poupon MF, Chopin D, Thiery JP, Radvanyi F: Tumour suppressive properties of fibroblast growth factor receptor 2-IIIb in human bladder cancer. Oncogene. 1999, 18: 7234-7243. 10.1038/sj.onc.1203186.
- Everhard S, Tost J, El Abdalaoui H, Crinière E, Busato F, Marie Y, Gut IG, Sanson M, Mokhtari K, Laigle-Donadey F, Hoang-Xuan K, Delattre JY, Thillet J: Identification of regions correlating MGMT promoter methylation and gene expression in glioblastomas. Neuro Oncol. 2009, 11: 348-356. 10.1215/15228517-2009-001.
- Hibi K, Goto T, Mizukami H, Kitamura Y, Sakata M, Saito M, Ishibashi K, Kigawa G, Nemoto H, Sanada Y: MGMT gene is aberrantly methylated from the early stages of colorectal cancers. Hepatogastroenterology. 2009, 56: 1642-1644.
- Hibi K, Sakata M, Yokomizo K, Kitamura YH, Sakuraba K, Shirahata A, Goto T, Mizukami H, Saito M, Ishibashi K, Kigawa G, Nemoto H, Sanada Y: Methylation of the MGMT gene is frequently detected in advanced gastric carcinoma. Anticancer Res. 2009, 29: 5053-5055.
- Narod SA, Foulkes WD: BRCA1 and BRCA2: 1994 and beyond. Nat Rev Cancer. 2004, 4: 665-676. 10.1038/nrc1431.
- Haugen AC, Goel A, Yamada K, Marra G, Nguyen TP, Nagasaka T, Kanazawa S, Koike J, Kikuchi Y, Zhong X, Arita M, Shibuya K, Oshimura M, Hemmi H, Boland CR, Koi M: Genetic instability caused by loss of MutS homologue 3 in human colorectal cancer. Cancer Res. 2008, 68: 8465-8472. 10.1158/0008-5472.CAN-08-0002.
- Guo WJ, Zeng MS, Yadav A, Song LB, Guo BH, Band V, Dimri GP: Mel-18 acts as a tumor suppressor by repressing Bmi-1 expression and down-regulating Akt activity in breast cancer cells. Cancer Res. 2007, 67: 5083-5089. 10.1158/0008-5472.CAN-06-4368.
- Seitz S, Korsching E, Weimer J, Jacobsen A, Arnold N, Meindl A, Arnold W, Gustavus D, Klebig C, Petersen I, Scherneck S: Genetic background of different cancer cell lines influences the gene set involved in chromosome 8 mediated breast tumor suppression. Genes Chromosomes Cancer. 2006, 45: 612-627. 10.1002/gcc.20325.
- Komiya A, Suzuki H, Ueda T, Aida S, Ito N, Shiraishi T, Yatani R, Emi M, Yasuda K, Shimazaki J, Ito H: PRLTS gene alterations in human prostate cancer. Jpn J Cancer Res. 1997, 88: 389-393.
- Wozniak K, Piaskowski S, Gresner SM, Golanska E, Bieniek E, Bigoszewska K, Sikorska B, Szybka M, Kulczycka-Wojdala D, Zakrzewska M, Zawlik I, Papierz W, Stawski R, Jaskolski DJ, Och W, Sieruta M, Liberski PP, Rieske P: BCR expression is decreased in meningiomas showing loss of heterozygosity of 22q within a new minimal deletion region. Cancer Genet Cytogenet. 2008, 183: 14-20. 10.1016/j.cancergencyto.2008.01.020.
- Oshiro MM, Kim CJ, Wozniak RJ, Junk DJ, Muñoz-Rodríguez JL, Burr JA, Fitzgerald M, Pawar SC, Cress AE, Domann FE, Futscher BW: Epigenetic silencing of DSC3 is a common event in human breast cancer. Breast Cancer Res. 2005, 7: R669-680. 10.1186/bcr1273.
- Park S, Kim JH, Jang JH: Aberrant hypermethylation of the FGFR2 gene in human gastric cancer cell lines. Biochem Biophys Res Commun. 2007, 357: 1011-1015. 10.1016/j.bbrc.2007.04.051.
- Janssen JW, Imoto I, Inoue J, Shimada Y, Ueda M, Imamura M, Bartram CR, Inazawa J: MYEOV, a gene at 11q13, is coamplified with CCND1, but epigenetically inactivated in a subset of esophageal squamous cell carcinomas. J Hum Genet. 2002, 47: 460-464. 10.1007/s100380200065.
- Shivapurkar N, Toyooka S, Toyooka KO, Reddy J, Miyajima K, Suzuki M, Shigematsu H, Takahashi T, Parikh G, Pass HI, Chaudhary PM, Gazdar AF: Aberrant methylation of trail decoy receptor genes is frequent in multiple tumor types. Int J Cancer. 2004, 109: 786-792. 10.1002/ijc.20041.
- Hornstein M, Hoffmann MJ, Alexa A, Yamanaka M, Muller M, Jung V, Rahnenfuhrer J, Schulz WA: Protein phosphatase and TRAIL receptor genes as new candidate tumor genes on chromosome 8p in prostate cancer. Cancer Genomics Proteomics. 2008, 5: 123-136.
- Piperi C, Themistocleous MS, Papavassiliou GA, Farmaki E, Levidou G, Korkolopoulou P, Adamopoulos C, Papavassiliou AG: High incidence of MGMT and RARbeta promoter methylation in primary glioblastomas: association with histopathological characteristics, inflammatory mediators and clinical outcome. Mol Med. 2010, 16: 1-9. 10.2119/molmed.2009.00140.
- Taioli E, Ragin C, Wang XH, Chen J, Langevin SM, Brown AR, Gollin SM, Garte S, Sobol RW: Recurrence in oral and pharyngeal cancer is associated with quantitative MGMT promoter methylation. BMC Cancer. 2009, 9: 354. 10.1186/1471-2407-9-354.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.