9780521516358book.pdf

to be captured within a very short timescale. More recently, alternative methods of analysis, including approaches based on high-performance liquid chromatography, have gained in popularity, especially for DNA mutation analysis. Mass spectrometry is also increasingly used for nucleic acid analysis.

5.8 Molecular biology and bioinformatics

5.8.1 Basic bioinformatics

Bioinformatics is now an established and vital resource for molecular biology research and is also a mainstay of routine DNA analysis. Its growth has been driven by the increase in genetic sequence information and the need to store, analyse and manipulate the data. A huge number of sequences from a variety of organisms, including the human genome, are now stored in genetic databases. Indeed, the genetic information from various organisms is now an indispensable starting point for molecular biology research. The main primary databases include GenBank at the National Institutes of Health (NIH) in the USA, EMBL at the European Bioinformatics Institute (EBI) at Cambridge, UK and the DNA Database of Japan (DDBJ) at Mishima in Japan. These databases contain nucleotide sequences that are annotated to allow easy identification. There are also many other databases, such as secondary databases that contain information relating to sequence motifs, for example core sequences found in cytochrome P450 domains or DNA-binding domains. Importantly, all of the databases may be freely accessed over the internet. A number of these important databases and internet resources are listed in Table 5.4. Consequently, the new, expanding and exciting areas of bioscience research are those that analyse genome and cDNA sequence databases (genomics) and also their protein counterparts (proteomics). This is sometimes referred to as in silico research.

5.8.2 Analysing information using bioinformatics

One of the most useful bioinformatics resources is BLAST (Basic Local Alignment Search Tool), located at the NCBI (www.ncbi.nlm.nih.gov). This allows a DNA sequence to be submitted via the internet and compared with all the sequences contained within a DNA database. This is very useful since, once a nucleotide sequence has been deduced by, for example, Sanger sequencing, it is possible to identify similar sequences. Indeed, if human sequences are used and have already been mapped, it is possible to locate their position on a particular chromosome using NCBI Map Viewer. Further resources such as ORF (open reading frame) finder allow a search to be undertaken for open reading frames, i.e. sequences beginning with a start codon (ATG) and continuing with a significant number of ‘coding’ triplets before a stop codon is reached. There are a number of other sequences that may be used to define coding sequences; these include ribosome binding sites, splice site junctions, poly(A) polymerase sequences and promoter sequences that lie outside the coding


