as many as possible, the probes are designed to be nearly evenly
distributed across the genome [45] and mostly target SNPs
with relatively high minor allele frequency. Microarray platforms
have evolved rapidly, and current platforms contain up to two
million probes that interrogate the genome at <1000 resolution for
detection of SNPs as well as CNVs/CNAs, a ~200-fold increase
over the first SNP microarrays used for ALL [46]. On the
other hand, after genome variants were investigated in thousands
of individuals by next-generation sequencing, which we
describe in the next section, nearly 90 million SNPs have been
identified, with around 12 million shared across diverse
races/ethnicities. Non-singleton SNPs (i.e., identified in more
than one individual) that can alter protein coding (e.g., missense,
nonsense) were selected and genotyped with specific probes
integrated into a microarray named the exome array [47], which is
widely used for identifying potentially functional variants for
specific phenotypes through a GWAS strategy (e.g., ALL
susceptibility).
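The two selection criteria mentioned above, minor allele frequency
and non-singleton status, can be illustrated with a minimal sketch.
It assumes a single biallelic SNP whose genotype calls are given as
two-letter strings (e.g., "AG"), with None for a missing call; the
function names and this input format are hypothetical and are not
taken from any array vendor's software.

```python
from collections import Counter


def allele_counts(genotypes):
    """Count alleles over all non-missing diploid calls (assumed format: 'AG')."""
    counts = Counter()
    for call in genotypes:
        if call is not None:
            counts.update(call)  # adds both alleles of the call
    if len(counts) > 2:
        raise ValueError("expected a biallelic SNP")
    return counts


def minor_allele_frequency(genotypes):
    """Frequency of the less common allele; 0.0 if the site is monomorphic."""
    counts = allele_counts(genotypes)
    if len(counts) < 2:
        return 0.0
    return min(counts.values()) / sum(counts.values())


def is_non_singleton(genotypes):
    """True if the minor allele is carried by more than one individual."""
    counts = allele_counts(genotypes)
    if len(counts) < 2:
        return False
    # sorted() makes the choice deterministic when both alleles are tied
    minor = min(sorted(counts), key=counts.get)
    carriers = sum(1 for call in genotypes
                   if call is not None and minor in call)
    return carriers > 1


if __name__ == "__main__":
    calls = ["AA", "AG", "GG", "AA", None, "AG"]
    print(minor_allele_frequency(calls), is_non_singleton(calls))
```

In practice such filters are applied to millions of sites at once
with dedicated genotype-processing tools; the sketch only shows the
per-SNP logic.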
Moreover, a new type of microarray, named the tiling array, was
developed. This type of microarray probes sequences intensively
to characterize regions that have been sequenced but whose local
functions are largely unknown. Besides the normal functions of
cDNA/DNA microarrays, it can be used for transcriptome
mapping as well as for discovering DNA/protein interactions and
fine mapping the breakpoints of copy number variations [48].
However, owing to the high cost and, more importantly, the
development of next-generation sequencing, tiling array technology
has not been widely used and has been almost abandoned today.
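The idea of probing a region intensively can be sketched as laying
out probes at a fixed step along a contiguous interval. The
function name, the probe length of 25, and the step of 10 below are
illustrative assumptions, not the parameters of any particular
tiling array design.

```python
def tile_region(start, end, probe_length=25, step=10):
    """Return half-open (probe_start, probe_end) intervals tiling [start, end).

    A step smaller than probe_length gives overlapping probes; a step equal
    to probe_length gives end-to-end tiling. Values here are hypothetical.
    """
    if probe_length <= 0 or step <= 0:
        raise ValueError("probe_length and step must be positive")
    probes = []
    pos = start
    # keep only probes that fit entirely inside the region
    while pos + probe_length <= end:
        probes.append((pos, pos + probe_length))
        pos += step
    return probes


if __name__ == "__main__":
    for probe in tile_region(0, 60):
        print(probe)
```

A smaller step raises the mapping resolution but also the number of
probes, which is one way to see why such designs are costly.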
Collectively, transcriptome and genomic microarrays have been
widely used for molecular profiling of cancers including leukemia
[49, 50]. For instance, ALL-related aberrant gene expression,
mutations, genomic alterations, and their related pathways can be
revealed genome-wide. Also, cluster-based molecular subtypes or
treatment outcome-related alterations can be determined and
translated into individualized clinical treatment. Additionally,
germline variants can be detected to identify inherited
predispositions to diseases (e.g., cancer) as well as to treatment
outcomes [51, 52]. As ALL is one of the best models and examples
for translational genetic research, many novel findings have been
revealed for it, which have not only greatly broadened our
understanding of this disease but also contributed greatly to
improving the survival rate and quality of life of patients [53, 54].
3.2 Bioinformatics Analysis
With thousands of signals from genome-wide screening for gene
expression, SNP genotypes, and CNVs/CNAs, bioinformatics
analyses are needed in two main steps. The first one is
normalization of raw data, in which technical variation should be
removed [55]. Visualization of the data is indispensable for finding
the appropriate method for normalization [41]. The second one is
394 Heng Xu and Yang Shu