Cannabis sativa L. - Botany and Biotechnology


Next-Gen sequencing (high-throughput sequencing) has generated a plethora of
genetic information. Van Bakel et al. (2011) used a whole genome shotgun
(WGS) method with Illumina technology to sequence “Purple Kush” and two hemp
cultivars, ‘Finola’ and ‘USO-31.’ Van Bakel and colleagues also obtained
transcriptomes (cDNA libraries) from different tissues in these plants. Soon two
other Cannabis genomes were sequenced with WGS/Illumina machines, “Chemdawg”
and “LA Confidential” (Medicinal Genomics Corporation 2011).
Tejkalová (2015) utilized Cannabis genomes (van Bakel et al. 2011) for
SNP-calling and genotyping with the KASP/SNPline platform. Haplotypes based
on 57 SNP positions for 44 samples of “Sativa” and 77 of “Indica” were analyzed
with STRUCTURE. This probabilistic software identifies the optimal number of
clusters (K) to divide a population, based on allele frequencies. Testing K values
from one to nine, the haplotype data best fit K = 2 (two populations), but
STRUCTURE’s assignment of individuals into “Sativa” and “Indica” matched
poorly with their a priori identification.
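One simple way to quantify how well a two-cluster assignment matches a priori labels is the fraction of samples that agree under the better of the two possible cluster-to-label mappings, since cluster numbers are arbitrary. The sketch below is illustrative only: the function name and the six-sample data are hypothetical, and it is not the procedure reported by Tejkalová (2015).

```python
def best_two_cluster_agreement(clusters, labels):
    """Fraction of samples whose cluster (0 or 1) agrees with the a priori
    label, under whichever cluster-to-label mapping agrees best."""
    names = sorted(set(labels))
    if len(names) != 2 or not set(clusters) <= {0, 1}:
        raise ValueError("expected exactly two labels and clusters coded 0/1")
    # Mapping 1: cluster 0 -> names[0], cluster 1 -> names[1].
    direct = sum(names[c] == lab for c, lab in zip(clusters, labels))
    # Mapping 2 is the swap, so its matches are exactly the mismatches above.
    swapped = len(labels) - direct
    return max(direct, swapped) / len(labels)


# Hypothetical data: cluster assignments at K = 2 versus vendor labels.
clusters = [0, 0, 1, 1, 0, 1]
labels = ["Sativa", "Sativa", "Indica", "Indica", "Indica", "Sativa"]
agreement = best_two_cluster_agreement(clusters, labels)
print(f"best agreement: {agreement:.2f}")
```

A value near 0.5 would indicate that the clusters carry almost no information about the a priori labels; a value near 1.0 would indicate close correspondence.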
Sawler et al. (2015) used genotyping-by-sequencing (GBS), which utilizes
restriction enzymes to cut the genome into fragments that are sequenced as
short reads (WGS uses random fragmentation). They coupled ApeKI enzymes with
Illumina machines for SNP discovery and genotyping in fiber-type and drug-type
samples. GBS identified 14,031 SNPs for analysis, after quality filtering.
Drug-type strains were classified along a gradient of ancestry proportions
(percent “Sativa” vs. percent “Indica”) reported in online strain databases.
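The difference between restriction-based and random fragmentation can be illustrated with an in silico digest. The sketch below assumes the commonly cited ApeKI recognition site GCWGC (W = A or T), cut after the first G; the function name and the toy sequence are hypothetical, and a real GBS pipeline involves adapter ligation, size selection, and read filtering that are not modeled here.

```python
import re

# Assumed ApeKI site: G^CWGC, where W is A or T and ^ marks the cut.
APEKI_SITE = re.compile(r"(?=GC[AT]GC)")  # lookahead also finds overlapping sites


def apeki_digest(sequence):
    """Split a top-strand DNA string at every assumed ApeKI cut position."""
    seq = sequence.upper()
    # The cut falls one base into each site, after the leading G.
    cuts = [m.start() + 1 for m in APEKI_SITE.finditer(seq)]
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


toy = "AAGCAGCTTTTGCTGCAA"  # hypothetical sequence with two sites
fragments = apeki_digest(toy)
print(fragments)
```

Because the cut sites are fixed by the sequence, the same fragments are recovered from every sample, which is what lets GBS genotype the same loci across many individuals.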
Their PCA of genetic structure (SNP variations) using PLINK 1.9
clearly segregated 43 fiber-type samples from 81 drug-type samples. The clusters of
“Sativa” and “Indica” partially overlapped. Proportional ancestry in each sample
correlated moderately (r^2 = 0.36) with the first principal component (PC axis 1)
of genetic structure. Similar results were obtained with fastSTRUCTURE, where data
from all 124 samples best fit K = 2. The inability to separate “Sativa” and “Indica”
and the poor correlation with reported ancestry were due, in part, to counterfeit
strain names: In a comparison of 17 paired samples with the same strain name, six
pairs (35%) were dissimilar, and shared more genetic similarity with other strain
names.
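The comparison of PC axis 1 with reported ancestry can be sketched in a few lines. The example below is a minimal illustration, not the PLINK 1.9 workflow used by Sawler et al.: it codes genotypes as 0/1/2 allele counts, centers each SNP, finds the first principal component by power iteration, and computes r^2 against reported “Sativa” proportions. All data and function names are hypothetical.

```python
import math


def first_pc_scores(genotypes, iterations=200):
    """Scores of each sample on PC1 of a samples-by-SNPs genotype matrix."""
    n = len(genotypes)
    means = [sum(col) / n for col in zip(*genotypes)]
    X = [[g - m for g, m in zip(row, means)] for row in genotypes]  # center SNPs
    v = [float(j + 1) for j in range(len(means))]  # arbitrary non-zero start
    for _ in range(iterations):
        scores = [sum(x * w for x, w in zip(row, v)) for row in X]       # X v
        v = [sum(X[i][j] * scores[i] for i in range(n))                  # X^T (X v)
             for j in range(len(means))]
        norm = math.sqrt(sum(w * w for w in v))
        v = [w / norm for w in v]
    return [sum(x * w for x, w in zip(row, v)) for row in X]


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


# Hypothetical toy data: six samples, five SNPs, two well-separated groups.
genotypes = [
    [0, 0, 2, 2, 1], [0, 1, 2, 2, 0], [0, 0, 2, 1, 1],
    [2, 2, 0, 0, 1], [2, 1, 0, 0, 2], [2, 2, 0, 1, 1],
]
reported_sativa = [0.9, 0.8, 1.0, 0.1, 0.3, 0.0]  # hypothetical proportions

scores = first_pc_scores(genotypes)
r2 = r_squared(scores, reported_sativa)
print(f"r^2 between PC1 and reported ancestry: {r2:.2f}")
```

In this toy data set the groups are deliberately distinct, so r^2 is high; mislabeled samples of the kind described above would pull it down toward the moderate value reported.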
Sawler calculated the fixation index (FST) between subgroups based on
identity-by-state (IBS, implemented in PLINK). FST values range from 0 to 1; a
value of 0 indicates the subgroups interbreed freely; a value of 1 indicates the
subgroups are completely isolated from one another. The average FST between
fiber- and drug-type plants was 0.156, which is similar to the degree of genetic
differentiation in humans between Europeans and East Asians. Average FST
between fiber-type plants and “100% Sativa” was 0.161; FST between fiber-type
plants and “100% Indica” was 0.136; no comparison was made between “Sativa”
and “Indica.”
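The 0-to-1 range of FST can be made concrete with a small calculation. The sketch below uses the classic heterozygosity-based definition, FST = (HT - HS) / HT, for two equally weighted subgroups and biallelic SNPs. This is an assumption made for illustration; it is not the IBS-based estimator implemented in PLINK, and the allele frequencies are hypothetical.

```python
def wright_fst(freqs_a, freqs_b):
    """FST between two subgroups from per-SNP allele frequencies.

    HS is the mean expected heterozygosity within subgroups and HT the
    expected heterozygosity of the pooled population; both are summed
    over SNPs before taking the ratio.
    """
    h_s_total = 0.0
    h_t_total = 0.0
    for p_a, p_b in zip(freqs_a, freqs_b):
        p_bar = (p_a + p_b) / 2  # pooled frequency, equal subgroup weights
        h_s_total += (2 * p_a * (1 - p_a) + 2 * p_b * (1 - p_b)) / 2
        h_t_total += 2 * p_bar * (1 - p_bar)
    if h_t_total == 0:
        raise ValueError("FST is undefined when no SNP is polymorphic")
    return (h_t_total - h_s_total) / h_t_total


print(wright_fst([0.3, 0.6], [0.3, 0.6]))  # identical frequencies
print(wright_fst([0.0], [1.0]))            # fixed for different alleles
print(wright_fst([0.2], [0.6]))            # intermediate differentiation
```

Identical allele frequencies give 0, subgroups fixed for different alleles give 1, and intermediate frequency differences fall in between, which is the scale on which the values above are read.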
Medicinal Genomics Corporation (2015) used Reduced Representation Shotgun
(RRS) sequencing to identify 100,000–200,000 SNPs per strain. These data were
used to generate a nearest-neighbor tree with “Purple Kush,” ‘Finola,’ ‘USO-31,’
and 50 ganjanym strains. Henry (2015) utilized open-access RRS data to evaluate
28 strains, using Adegenet 2.0. K-partition optimized at K = 1. PCA clustering with


4 Cannabis sativa and Cannabis indica... 115
