Mitochondria are the cellular organelles that are the site for the production of adenosine triphosphate (ATP), the molecule that provides the energy needed for many metabolic processes. Several lines of evidence support the theory that all mitochondria are descended from a free-living bacterium that formed a symbiotic relationship with a eukaryotic (nucleus-containing) cell in the distant past.
The human mitochondrial genome is haploid, and the number of identical copies of mitochondrial DNA per human cell ranges from 10,000 to 100,000 molecules. It contains 16,569 base pairs (bp) and encodes 37 genes. Twenty-eight of these genes are encoded by the heavy strand, and nine by the light strand. Of the 37 genes, a total of 24 specify a mature RNA product: 22 mitochondrial tRNA molecules and two mitochondrial rRNA molecules (a 12S rRNA and a 16S rRNA). The remaining 13 genes encode polypeptides, which are synthesized on mitochondrial ribosomes.
The human mtDNA genome has two general regions: the coding region and the control region. The coding region is responsible for the production of various biological molecules involved in the process of energy production in the cell. The control region is responsible for regulation of the mtDNA molecule. Two regions of mtDNA within the control region have been found to be highly polymorphic, or variable, within the human population. These two regions are termed Hypervariable Region I (HV1), which has an approximate length of 342 base pairs (bp), and Hypervariable Region II (HV2), which has an approximate length of 268 bp.
In humans, individuals inherit mitochondrial DNA strictly from their mothers. The mitochondria in mammalian sperm are usually destroyed by the egg cell after fertilization. Also, most mitochondria are present at the base of the sperm's tail, which is used for propelling the sperm cells. Sometimes the tail is lost during fertilization. In 1999 it was reported that paternal sperm mitochondria (containing mtDNA) are marked with ubiquitin to select them for later destruction inside the embryo. It has been reported that mitochondria can occasionally be inherited from the father in some species such as mussels. Paternally inherited mitochondria have additionally been reported in some insects such as fruit flies, honeybees, and periodical cicadas.
Mitochondrial DNA is known for maternal clonal inheritance, rapid evolutionary rate, lack of introns, absence of recombination events, and haploidy. Mitochondrial variations are linked to the origin of humans, and play a substantial role in forensics, degenerative diseases, cancers, and the aging process.
For example, forensic scientists typically turn to mtDNA for:
• identification of an individual when the recovered specimen contains too little useful DNA for nuclear DNA (nuDNA) analysis
• identification of remains using a maternal relative as a reference
• identification of species
Approximately 610 bp of mtDNA are currently sequenced in forensic mtDNA analysis. Recording and comparing mtDNA sequences would be difficult and potentially confusing if all of the bases were listed. Thus, mtDNA sequence information is recorded by listing only the differences with respect to a reference DNA sequence. By convention, human mtDNA sequences are described using the first complete published mtDNA sequence as a reference (Anderson et al. 1981). This sequence is commonly referred to as the Anderson sequence. It is also called the Cambridge reference sequence or the Oxford sequence. Each base pair in this sequence is assigned a number. Deviations from this reference sequence are recorded as the number of the position demonstrating a difference and a letter designation of the different base.
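The recording convention described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `differences_from_reference` is hypothetical, the two sequences are assumed to be already aligned and of equal length (so insertions and deletions are not handled), and the short sequences are invented toy data, not real mtDNA.

```python
def differences_from_reference(reference: str, sample: str,
                               start_position: int = 1) -> list[str]:
    """List substitutions as '<position><base>' relative to the reference.

    Assumption: both sequences are pre-aligned and equally long, so only
    base substitutions are reported (no insertions or deletions).
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    diffs = []
    pairs = zip(reference.upper(), sample.upper())
    for offset, (ref_base, sample_base) in enumerate(pairs):
        if ref_base != sample_base:
            # Record the numbered position and the base seen in the sample.
            diffs.append(f"{start_position + offset}{sample_base}")
    return diffs


# Toy 10-base sequences (hypothetical), numbered from position 101.
toy_reference = "ACCTGATCGA"
toy_sample = "ACCTAATCGG"
print(differences_from_reference(toy_reference, toy_sample, start_position=101))
# prints ['105A', '110G']
```

Only the two differing positions are stored, which is why a full profile can be written as a short list instead of hundreds of bases.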
Because mtDNA is inherited through the maternal line, the mtDNA sequences obtained from maternally related individuals, such as a brother and a sister or a mother and a daughter, will exactly match each other in the absence of a mutation. In humans, maternal lineage is examined by sequencing one or more of the hypervariable control regions (HVR1 or HVR2) of the mitochondrial DNA, as with a genealogical DNA test. HVR1 consists of about 440 base pairs. These 440 base pairs are then compared to the control regions of other individuals (either specific people or subjects in a database) to determine maternal lineage. Most often, the comparison is made to the revised Cambridge Reference Sequence (or Anderson sequence). The sharing of one sequence among maternal relatives is advantageous in missing-person cases, as any maternal relative of the missing individual can supply reference samples. However, mtDNA analysis is limited when compared with nuclear DNA analysis in that it cannot distinguish between individuals of the same maternal lineage or individuals who have the same mtDNA sequence by chance.
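As a rough sketch of such a comparison, each profile can be held as a set of differences from the reference sequence and matched against a collection of reference samples. Everything here is assumed for illustration: the function name `matching_profiles`, the sample names, and the difference lists are hypothetical, and complications of real casework (heteroplasmy, length variants, partial profiles) are ignored.

```python
def matching_profiles(query: set[str], database: dict[str, set[str]]) -> list[str]:
    """Return the names of entries whose difference list equals the query."""
    return sorted(name for name, profile in database.items() if profile == query)


# Hypothetical reference samples, each recorded as differences from the reference.
reference_samples = {
    "maternal_aunt": {"73G", "263G", "16223T"},
    "unrelated_donor": {"73G", "152C", "16311C"},
    "sibling": {"73G", "263G", "16223T"},
}
evidence = {"263G", "16223T", "73G"}

print(matching_profiles(evidence, reference_samples))
# prints ['maternal_aunt', 'sibling']
```

Both maternal relatives match the evidence profile equally well, which illustrates the limitation noted above: an mtDNA match points to a maternal lineage, not to one individual.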
In most cases mtDNA analysis involves the comparison of the mtDNA sequences of two or more specimens. The procedure for determining the sequence usually involves some version of the Sanger reaction. Investigators do not typically sequence the entire mtDNA molecule, but instead they examine a region that shows variation that is appropriate for their purposes. D-loop sequences are used to distinguish individual humans (Holland et al., 1995) and also may be used to separate very closely related species, while protein-coding genes are most often used to recognize species. Although there is little evidence that any of the protein-coding genes are (for a given number of base pairs) a more effective marker than any of the others, the need to compare new results with previous work has led to a concentration of studies on just a few regions.
For vertebrate species, this is the cytochrome b gene (Cyt b), whose forensic utility has been clearly recognized.
For insects, it is most common to sequence some or all of cytochrome oxidase subunits one and two (COI+II), although this is not the overwhelming choice that Cyt b is for those who study vertebrates.
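Species recognition from a protein-coding region such as Cyt b amounts to asking which reference sequence a query fragment most resembles. The sketch below assumes pre-aligned fragments of equal length and uses simple percent identity; the function names, species labels, and sequences are all hypothetical toy data. Actual studies rely on proper alignment tools and curated reference collections.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two aligned sequences agree."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and equally long")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def closest_species(query: str, references: dict[str, str]) -> tuple[str, float]:
    """Return the reference name with the highest identity to the query."""
    scores = {name: percent_identity(query, seq) for name, seq in references.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy 10-base fragments (hypothetical, not real Cyt b sequences).
toy_references = {
    "species_A": "ATGACCAACA",
    "species_B": "ATGGCTAATA",
}
print(closest_species("ATGACCAATA", toy_references))
# prints ('species_A', 90.0)
```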
These applications of mitochondrial DNA have led to the development of several mitochondrial databases:
MITOMAP: a comprehensive database for the human mitochondrial DNA (mtDNA), the first component of the human genome to be completely sequenced. MITOMAP uses the mtDNA sequence as the unifying element for bringing together information on mitochondrial genome structure and function, pathogenic mutations and their clinical characteristics, population-associated variation, and gene-gene interactions.
GOBASE (organelle genome database): contains all published mitochondrion-encoded sequences (approximately 913,000) and chloroplast-encoded sequences (approximately 250,000) from a wide range of eukaryotic taxa. For all sequences, information on related genes, exons, introns, gene products and taxonomy is available, as well as selected genome maps and RNA secondary structures.
MitoVariome: a freely accessible web application and database that lets human mitochondrial genome researchers study genetic variation in the mitochondrial genome through textual and graphical views, and that assigns haplogroups to data that users submit. Because it holds many kinds of variation features of the human mitochondrial genome, MitoVariome is useful for understanding the mitochondrial variation of each individual, haplogroup, or geographical location, and so for elucidating the history of human evolution.
HmtDB: a database of Human Mitochondrial Genomes, annotated with population data, and a set of bioinformatic tools, able to produce site-specific variability data and to automatically characterize newly sequenced human mitochondrial genomes. A query system for the retrieval of genomes and a web submission tool for the annotation of new genomes have been designed and will soon be implemented. The first release contains 1255 fully annotated human mitochondrial genomes.
V-MitoSNP: a web-based software platform that provides a user-friendly and interactive interface for mtSNP information, especially with regard to RFLP genotyping.
mtDNAmanager: provides systematic routines for mtDNA sequence data management and analysis via easily accessible Web interfaces, and thus should be very useful for population, medical and forensic studies that employ mtDNA analysis. mtDNAmanager can be accessed at http://mtmanager.yonsei.ac.kr.
Mitome: a specialized mitochondrial genome database designed for easy comparative analysis of various features of metazoan mitochondrial genomes such as base frequency, A+T skew, codon usage and gene arrangement pattern. A particular function of the database is the automatic reconstruction of phylogenetic relationships among metazoans selected by a user from a taxonomic tree menu, based on nucleotide sequences, amino acid sequences or gene arrangement patterns.
MitoRes: a specialized mitochondrial resource which has been developed to complement the other available mitochondrial databases in their biological utility and application. In particular, it tries to fill the void of a comprehensive resource of mitochondria-related sequences and, to this end, it collects and integrates data on gene, transcript and protein sequences of any metazoan species from the most accredited worldwide sources.
MitoDrome: a web-based database which provides genomic annotations about nuclear genes of Drosophila melanogaster that encode mitochondrial proteins.
MitoNuc: a database containing detailed information on sequenced nuclear genes coding for mitochondrial proteins in Metazoa. The MitoNuc database can be retrieved through SRS and is available via the web site http://bighost.area.ba.cnr.it/mitochondriome, which also offers other mitochondrial databases developed by the same group, the complete list of the sequenced mitochondrial genomes, links to other mitochondrial sites, and related information. The MitoAln database, which was related to MitoNuc in the previous release and reported the multiple alignments of the relevant homologous protein-coding regions, is no longer supported in the present release. In order to keep the links among MitoNuc entries for homologous proteins, a new field has been defined in the database: the cluster identifier, an alphanumeric code used to identify each cluster of homologous proteins.
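Some of the features that databases such as Mitome compare across genomes, for instance base frequency and A+T skew, are simple to compute from a sequence. The sketch below assumes that "A+T skew" means the commonly used AT-skew statistic, (A - T) / (A + T); the function names and the toy sequence are hypothetical, and nothing here reflects how any of the databases above implement these calculations.

```python
from collections import Counter


def base_frequencies(sequence: str) -> dict[str, float]:
    """Fraction of A, C, G and T among the unambiguous bases of a sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: counts[base] / total for base in "ACGT"}


def at_skew(sequence: str) -> float:
    """AT skew, assumed here to be (A - T) / (A + T)."""
    counts = Counter(sequence.upper())
    a_count, t_count = counts["A"], counts["T"]
    if a_count + t_count == 0:
        raise ValueError("sequence contains no A or T")
    return (a_count - t_count) / (a_count + t_count)


toy_sequence = "AAATTCGGAA"  # hypothetical: 5 A, 2 T, 1 C, 2 G
print(base_frequencies(toy_sequence))
print(at_skew(toy_sequence))
```

A positive value indicates more A than T on the strand examined, and a negative value the reverse.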
Anderson et al. Nature. 1981;290:457-465.
Gray et al. Curr Opin Genet Dev. 1999;9(6):678-687. Review.
Kogelnik et al. Nucleic Acids Res. 1998;26(1):112-115.
O'Brien et al. Nucleic Acids Res. 2009;37:D946-D950. Epub 2008 Oct 25.
Catalano et al. BMC Bioinformatics. 2006;7:36.
Sardiello et al. Nucleic Acids Res. 2003;31(1):322-324.
Attimonelli et al. Nucleic Acids Res. 2002;30(1):172-173.
Lee et al. Nucleic Acids Res. 2008;36:D938-D942.
Attimonelli et al. BMC Bioinformatics. 2005;6:S4.
Chuang et al. BMC Bioinformatics. 2006;7:379.
Lee et al. BMC Bioinformatics. 2008;9:483.
Lee et al. BMC Genomics. 2009;10:S12.