Array-Based Molecular Tumor Phenotyping Techniques: Focus on Lung Cancer
What is Molecular Tumor Phenotyping and How is it Accomplished?
Molecular tumor phenotyping seeks to establish correlations between molecular signatures, such as biomarkers or expression signatures, and specific types of tumor (1). These correlations can be used for diagnostic, prognostic, or therapeutic applications. The techniques used to obtain the various molecular signatures fit within the broad categories of functional genomics, transcriptomics, and proteomics.
Briefly, genomics studies employ “genome-wide” techniques, such as DNA sequencing, to characterize the genome (2). DNA sequencing can identify biomarkers, such as single-nucleotide polymorphisms (SNPs), which can be linked to disease or drug response (1). In the field of transcriptomics, expression profiling techniques quantify transcript levels of genes on a genome-wide scale using high-throughput technologies such as DNA microarrays (4). The resulting expression profile can be correlated with disease, or used to evaluate drug response (4).
In proteomics studies, proteomics-based techniques identify biomarkers and protein expression signatures, which could be used to predict responses to drugs for individualized therapy (5). Genome-wide study of the proteome is still difficult, but the development of protein arrays may change this (6). Protein arrays are one of several techniques that will be reviewed in this section’s focus on future research and development in the field.
The accumulation of genomic alterations, such as point mutations, copy number alterations, or indels, leads to the development of cancer due to altered gene expression (1). Characterization of altered gene expression and detection of the underlying genomic aberrations has been invaluable for advances in cancer diagnosis, prognosis, and therapeutic approaches. Such characterization is accomplished through a plethora of techniques which fall within the domains of genomics, transcriptomics, or proteomics (2). This section reviews “array-based” molecular tumor phenotyping techniques as part of a larger analysis of developments in personalized medicine for lung cancer.
Array-Based Expression Profiling
An expression profile is a quantification of expressed genes at the transcript level. Differences between the expression profiles of transformed cancer cells and untransformed somatic cells are useful for diagnostic, prognostic, and therapeutic purposes (3). Using genome-wide expression analysis, such comparisons have identified a core set of common genes dysregulated in all human lung cancers, and sets of genes specifically dysregulated in certain types of lung cancer such as squamous cell lung carcinoma, small cell lung carcinoma, and adenocarcinoma (4).
Genome-wide expression analysis became feasible with the development of microarray technology (5). DNA microarrays are small, solid supports onto which DNA, cDNA, or oligonucleotide sequences from thousands of different genes are immobilized or synthesized such that one spot on the array corresponds to one gene (6). To obtain a comparative expression profile, transcripts from transformed and non-transformed cells are reverse transcribed into differentially fluorescently labelled cDNAs, and washed over the prepared array (7). After laser excitation, fluorescence from each spot on the array is recorded to produce a ratio of hybridization intensities, which corresponds to the relative abundance of the labelled cDNA from the two sources of transcripts (5).
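The intensity ratio described above is commonly expressed on a log2 scale, so that equal fold increases and decreases are symmetric around zero. The following is a minimal sketch of that calculation, not a full normalization pipeline. The gene names, intensity values, and the `floor` parameter are hypothetical and chosen only for illustration.

```python
import math

def log2_ratios(tumor, normal, floor=1.0):
    """Return log2(tumor / normal) for each spot.

    `floor` is an assumed lower bound that avoids division by zero
    and log of zero for very dim spots.
    """
    ratios = {}
    for gene in tumor:
        t = max(tumor[gene], floor)
        n = max(normal[gene], floor)
        ratios[gene] = math.log2(t / n)
    return ratios

# Hypothetical background-corrected intensities for three spots.
tumor = {"GENE_A": 4000.0, "GENE_B": 500.0, "GENE_C": 1200.0}
normal = {"GENE_A": 1000.0, "GENE_B": 2000.0, "GENE_C": 1200.0}

ratios = log2_ratios(tumor, normal)
for gene, value in ratios.items():
    print(gene, round(value, 2))
```

In this toy data, a positive value indicates higher signal in the tumor channel, a negative value indicates higher signal in the normal channel, and zero indicates no difference.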
Genes with similar expression patterns are grouped together using clustering methods, and represented in a “heat map” in which colour and intensity correspond to expression levels (8). Significance analysis of microarrays (SAM) is conducted to quantify the significance of the change in expression for each gene, and the accompanying false discovery rate (FDR) estimates the percentage of genes called significant that are expected to be false positives (9).
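To make the idea of controlling the FDR concrete, the sketch below applies the Benjamini-Hochberg procedure to a list of per-gene p-values. This is a stand-in chosen for brevity: SAM itself estimates the FDR from permutations of the sample labels, which is not reproduced here. The p-values and the 0.05 cutoff are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-gene p-values from a differential expression test.
p_values = [0.001, 0.008, 0.039, 0.041, 0.60]
q_values = benjamini_hochberg(p_values)

# Genes kept at an assumed FDR cutoff of 5%.
significant = [q <= 0.05 for q in q_values]
print(significant)
```

Note that the third and fourth genes have raw p-values below 0.05 but are not retained after adjustment, which illustrates why a multiple-testing correction matters when thousands of genes are tested at once.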
Microarrays offer several key advantages for expression analysis: they are high-throughput, relatively inexpensive, and able to directly compare two biological samples for pairwise analysis on a genome-wide scale (7). However, microarrays cannot distinguish alternatively spliced transcripts from the same gene, and are susceptible to cross-hybridization of different transcripts with high sequence similarity (2). Transcript quantification is relative between two samples, and is limited by possible probe saturation and by the technical limits of intensity detection by the scanner (2).
Array-Based Detection of CNVs
Specialized microarrays designed for genome characterization can identify biomarkers such as single-nucleotide polymorphisms (SNPs) or copy number variations (CNVs) (10). CNVs are segments of DNA that differ in copy number with respect to a reference genome, often caused by genomic alterations such as deletions, duplications, or translocations (10). Such alterations may cause mutations in genes implicated in tumor induction or progression, as is the case in small cell lung cancer, where deregulated induction of the focal adhesion pathway was found to be associated with aberrations in DNA copy number (11). Among other uses in lung cancer treatment, CNV patterns can distinguish different types of lung cancer, such as squamous cell carcinoma from adenocarcinoma (12).
Array comparative genomic hybridization (aCGH) can test an individual for established CNV patterns. Briefly, equal amounts of differentially fluorescently labeled genomic DNA from a test and a reference sample are competitively co-hybridized to an array containing DNA targets derived from most of the known genes and non-coding regions of the genome (13). The arrays are then scanned into image files, and spot intensities are measured for copy number analysis. The resulting ratio of the fluorescence intensities is proportional to the ratio of the copy numbers of DNA sequences in the test and reference genomes. An unequal ratio indicates a loss or gain of patient DNA at that genomic region (14).
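The gain or loss call described above can be sketched as a threshold on the log2 ratio of test to reference intensity at each array target. This is a simplified illustration: the intensity values and the ±0.3 threshold are assumptions, and real analyses typically normalize the data and segment neighbouring targets before calling.

```python
import math

def call_copy_number(test, reference, threshold=0.3):
    """Classify each array target as gain, loss, or neutral.

    `threshold` is an assumed cutoff on the log2(test / reference) ratio.
    """
    calls = []
    for t, r in zip(test, reference):
        log_ratio = math.log2(t / r)
        if log_ratio > threshold:
            calls.append("gain")
        elif log_ratio < -threshold:
            calls.append("loss")
        else:
            calls.append("neutral")
    return calls

# Hypothetical intensities for four consecutive genomic targets.
test_intensity = [1000.0, 1500.0, 500.0, 1050.0]
reference_intensity = [1000.0, 1000.0, 1000.0, 1000.0]

calls = call_copy_number(test_intensity, reference_intensity)
print(calls)
```

A ratio of 1.5 corresponds to three copies against a two-copy reference, and a ratio of 0.5 to a single-copy loss, assuming a pure tumor sample with no contaminating normal cells.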
aCGH is derived from chromosomal CGH, which differs primarily in that sample DNA is hybridized to metaphase chromosomes. aCGH offers several advantages for CNV detection over the cytogenetic approach, including high resolution, high throughput, high reproducibility, and precise mapping of aberrations. Cell culturing is not required, and less genomic DNA is needed (13). aCGH shares the limitations of any array-based technique, such as non-specific hybridization and artifacts introduced during scanning. However, the most severe limitation of aCGH is its inability to detect balanced chromosomal rearrangements such as translocations and inversions. This is especially concerning for diagnostic potential due to the numerous discoveries of balanced cancer-causing gene fusions (14). Such balanced translocations can be detected using SNP arrays (10).
Array-Based Detection of SNPs
SNPs are single-nucleotide variations that occur in at least 1% of the population and are found throughout the genome. SNPs vary among individuals, and can be linked to phenotypes such as cancer susceptibility and drug response (10). SNP microarrays can test an individual for established SNP patterns associated with diseases such as lung cancer. SNP array studies of non-smoking lung cancer patients have identified novel SNP patterns for diagnostic use (15).
Array-based detection of SNPs uses the same hybridization principles as the array-based techniques described above. However, in a SNP array, a single gene is represented by both “perfect match” probes and multiple immobilized “mismatch” probes, which differ by one or a few specific nucleotides to represent SNP patterns (16). Genomic DNA from a single sample hybridizes preferentially to the probes that match the SNP alleles carried by that individual. Fluorescence intensity is measured, and the resulting SNP pattern is screened for similarity with established SNP patterns associated with certain phenotypes (17). SNP genotyping by SNP array is less expensive and less time-consuming than sequencing methods, and can be used to study copy-number-neutral loss of heterozygosity. Limitations include those associated with all array-based techniques (18).
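As a rough illustration of how probe intensities translate into a genotype, the sketch below calls each SNP from the fraction of signal contributed by the probe for one allele. This is a deliberately simplified model, not the calling algorithm of any particular array platform. The allele labels "A" and "B", the intensity values, and the 0.25 and 0.75 cutoffs are all assumptions.

```python
def call_genotype(a_signal, b_signal, low=0.25, high=0.75):
    """Call a genotype from the signals of the allele A and allele B probes.

    `low` and `high` are assumed cutoffs on the A-allele signal fraction.
    Returns "NC" (no call) when there is no signal at all.
    """
    total = a_signal + b_signal
    if total <= 0:
        return "NC"
    a_fraction = a_signal / total
    if a_fraction >= high:
        return "AA"
    if a_fraction <= low:
        return "BB"
    return "AB"

# Hypothetical (allele A, allele B) probe intensities for four SNPs.
snp_signals = [(1800.0, 200.0), (950.0, 1050.0), (150.0, 1850.0), (0.0, 0.0)]

genotypes = [call_genotype(a, b) for a, b in snp_signals]
print(genotypes)
```

Under this model, a long run of homozygous calls in a tumor sample at positions that are heterozygous in matched normal tissue, without a change in total signal, would be consistent with copy-number-neutral loss of heterozygosity.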
In summary, array-based techniques have revolutionized research and treatment of not just lung cancer, but all cancers. Array-based techniques have proven superior to cytogenetic techniques for molecular tumor phenotyping due to their genome-wide and high-throughput capabilities (1). The transition of array-based molecular tumor phenotyping from “bench to bedside” is becoming a reality. The vast amounts of expression analysis and biomarker data will create challenges for data interpretation and for use in the clinic, and will demand competency in bioinformatics in the field of oncology (19).