17 Answers to Selected Exercises
ANSWER TO EXERCISE 1.1
<bio_sequence element_id="U83302" sequence_id="MICR83302"
organism_name="Colaptes rupicola" seq_length="1047" type="DNA"/>
<bio_sequence element_id="U83303" sequence_id="HSU83303"
organism_name="Homo sapiens" seq_length="3460" type="DNA"/>
<bio_sequence element_id="U83304" sequence_id="MMU83304"
organism_name="Mus musculus" seq_length="51" type="RNA"/>
<bio_sequence element_id="U83305" sequence_id="MIASSU833"
organism_name="Accipiter striatus" seq_length="1143" type="DNA"/>
ANSWER TO EXERCISE 1.2
<!ATTLIST bio_sequence
element_id ID #IMPLIED
sequence_id CDATA #IMPLIED
organism_name CDATA #IMPLIED
seq_length CDATA #IMPLIED
molecule_type (DNA | mRNA | rRNA | tRNA | cDNA | AA)
#IMPLIED>
This example was taken from the AGAVE DTD (AGAVE 2002). The actual
element has some additional attributes, and it differs in a few other ways as
well. For example, some of the attributes are restricted to NMTOKEN rather
than just CDATA. NMTOKEN specifies text made up entirely of name characters:
letters, digits, and a few other characters such as the underscore, hyphen,
period, and colon, with no whitespace. Programming languages such as Perl
restrict the names of variables and procedures in a similar way, additionally
requiring that a name begin with a letter or underscore, and many genomics
databases use a similar convention for their accession numbers and other
identifiers.
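The difference between an unrestricted string and a name-like token can be made concrete with a small check. In the XML 1.0 grammar, an NMTOKEN is one or more name characters, while a Name must also begin with a letter, underscore, or colon. The following is a minimal sketch, assuming only ASCII input; the function names is_nmtoken and is_name are hypothetical helpers introduced here for illustration, not part of the AGAVE DTD or of any XML library.

```python
import re

# ASCII-only approximation of the XML 1.0 productions (an assumption made
# for brevity); the full grammar also admits many non-ASCII letters.
NAME_START = r"[A-Za-z_:]"
NAME_CHAR = r"[A-Za-z0-9_:.\-]"

NMTOKEN_RE = re.compile(rf"{NAME_CHAR}+")
NAME_RE = re.compile(rf"{NAME_START}{NAME_CHAR}*")


def is_nmtoken(text: str) -> bool:
    """Return True if text consists of one or more ASCII name characters."""
    return NMTOKEN_RE.fullmatch(text) is not None


def is_name(text: str) -> bool:
    """Return True if text is an NMTOKEN that starts with a letter, '_' or ':'."""
    return NAME_RE.fullmatch(text) is not None


# Hypothetical sample values: two identifiers, a bare number, a phrase.
for candidate in ["U83302", "MICR83302", "83302", "Homo sapiens"]:
    print(candidate, is_nmtoken(candidate), is_name(candidate))
```

Under these assumptions, an accession number such as U83302 passes both checks, a purely numeric string passes only the NMTOKEN check, and an organism name containing a space passes neither, which is why attributes like organism_name remain CDATA.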