A genome is all the genetic information of an organism or cell.[1] It consists of nucleotide sequences of DNA (or RNA in RNA viruses). The nuclear genome includes protein-coding genes and non-coding genes, other functional regions of the genome such as regulatory sequences (see non-coding DNA), and often a substantial fraction of junk DNA with no evident function.[2][3] Almost all eukaryotes have mitochondria and a small mitochondrial genome.[2] Algae and plants also contain chloroplasts with a chloroplast genome.

The study of the genome is called genomics. The genomes of many organisms have been sequenced and various regions have been annotated. The first genome to be sequenced was that of the virus φX174 in 1977;[4] the first genome sequence of a prokaryote (Haemophilus influenzae) was published in 1995;[5] and the yeast (Saccharomyces cerevisiae) genome, sequenced in 1996, was the first eukaryotic genome to be completed.[6] The Human Genome Project was started in October 1990, and the first draft sequences of the human genome were reported in February 2001.[7]

Origin of the term

The term genome was created in 1920 by Hans Winkler,[8] professor of botany at the University of Hamburg, Germany. The website Oxford Dictionaries and the Online Etymology Dictionary suggest the name is a blend of the words gene and chromosome.[9][10][11][12] However, see omics for a more thorough discussion. A few related -ome words already existed, such as biome and rhizome, forming a vocabulary into which genome fits systematically.[13]

Definition

The term "genome" usually refers to the DNA (or sometimes RNA) molecules that carry the genetic information in an organism, but sometimes it is uncertain which molecules to include. For example, bacteria usually have one or two large DNA molecules (chromosomes) that contain all of the essential genetic material, but they also contain smaller extrachromosomal plasmid molecules that carry important genetic information. In the scientific literature, the term "genome" usually refers to the large chromosomal DNA molecules in bacteria.[14]

Nuclear genome

Eukaryotic genomes are even more difficult to define because almost all eukaryotic species contain nuclear chromosomes plus extra DNA molecules in the mitochondria. In addition, algae and plants have chloroplast DNA. Most textbooks make a distinction between the nuclear genome and the organelle (mitochondrial and chloroplast) genomes, so when they speak of, say, the human genome, they are referring only to the genetic material in the nucleus.[2][15] This is the most common use of "genome" in the scientific literature.

Ploidy

Most eukaryotes are diploid, meaning that there are two copies of each chromosome in the nucleus, but the "genome" refers to only one copy of each chromosome. Some eukaryotes have distinctive sex chromosomes, such as the X and Y chromosomes of mammals, so the technical definition of the genome must include both types of sex chromosome. For example, the standard reference genome of humans consists of one copy of each of the 22 autosomes plus one X chromosome and one Y chromosome.[16]
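
The counting behind this definition can be made concrete with a short sketch. The Python below is only an illustration: the function names are hypothetical, and the chromosome labels ("1" to "22", "X", "Y") are an assumed naming convention rather than anything prescribed by the text. It contrasts the set of chromosomes in the human reference genome with the chromosomes present in a single diploid nucleus.

```python
# Assumed labels: autosomes "1".."22" and the sex chromosomes "X" and "Y".
AUTOSOMES = [str(n) for n in range(1, 23)]
SEX_CHROMOSOMES = ["X", "Y"]


def reference_chromosomes():
    """One copy of each autosome plus one X and one Y, as in the reference genome."""
    return AUTOSOMES + SEX_CHROMOSOMES


def diploid_karyotype(sex_pair):
    """Chromosomes in one diploid nucleus: two copies of each autosome plus a sex pair.

    sex_pair is a 2-tuple such as ("X", "X") or ("X", "Y").
    """
    if len(sex_pair) != 2 or any(c not in SEX_CHROMOSOMES for c in sex_pair):
        raise ValueError("sex_pair must contain exactly two of 'X' or 'Y'")
    doubled_autosomes = [name for name in AUTOSOMES for _ in range(2)]
    return doubled_autosomes + list(sex_pair)


if __name__ == "__main__":
    print(len(reference_chromosomes()))         # 24 distinct chromosomes
    print(len(diploid_karyotype(("X", "Y"))))   # 46 chromosomes in the nucleus
```

The reference set therefore holds 24 distinct chromosomes, while a diploid nucleus holds 46; the difference is exactly the second copy of each autosome and the substitution of a sex pair for the X-plus-Y listing.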

Sequencing and mapping

Further information: Whole genome sequencing and Genome project

A genome sequence is the complete list of the nucleotides (A, C, G, and T for DNA genomes) that make up all the chromosomes of an individual or a species. Within a species, the vast majority of nucleotides are identical between individuals, but sequencing multiple individuals is necessary to understand the genetic diversity.
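
As a rough illustration of what "a list of nucleotides per chromosome" looks like in practice, the following Python sketch stores a genome as a mapping from chromosome name to a string over A, C, G, and T, and measures the fraction of positions at which two individuals agree. The chromosome names and the toy sequences are invented for the example, and the comparison assumes the sequences are already aligned and of equal length, which real genomes generally are not.

```python
DNA_ALPHABET = set("ACGT")


def is_valid_dna(sequence):
    """True if the sequence contains only the four DNA nucleotide letters."""
    return set(sequence.upper()) <= DNA_ALPHABET


def genome_identity(genome_a, genome_b):
    """Fraction of positions that are identical between two toy genomes.

    Each genome is a dict mapping chromosome name -> nucleotide string.
    Assumes both genomes have the same chromosomes with equal lengths.
    """
    if genome_a.keys() != genome_b.keys():
        raise ValueError("genomes must contain the same chromosomes")
    total = 0
    matches = 0
    for name, seq_a in genome_a.items():
        seq_b = genome_b[name]
        if len(seq_a) != len(seq_b):
            raise ValueError(f"length mismatch on chromosome {name}")
        total += len(seq_a)
        matches += sum(x == y for x, y in zip(seq_a, seq_b))
    if total == 0:
        raise ValueError("genomes are empty")
    return matches / total


# Hypothetical toy data: two individuals differing at one position on chr1.
individual_1 = {"chr1": "ACGTACGTAC", "chr2": "GGCATT"}
individual_2 = {"chr1": "ACGTACGTAT", "chr2": "GGCATT"}

if __name__ == "__main__":
    print(all(is_valid_dna(s) for s in individual_1.values()))
    print(genome_identity(individual_1, individual_2))  # 15 of 16 positions match
```

A single pairwise comparison like this only reveals the differences between two individuals, which is why many individuals must be sequenced to characterise the diversity of a species.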

In 1976, Walter Fiers at the University of Ghent (Belgium) was the first to establish the complete nucleotide sequence of a viral RNA genome (bacteriophage MS2). The next year, Fred Sanger completed the first DNA genome sequence: phage φX174, of 5386 base pairs.[17] The first bacterial genome to be sequenced was that of Haemophilus influenzae, completed by a team at The Institute for Genomic Research in 1995. A few months later, the first eukaryotic genome was completed, with sequences of the 16 chromosomes of budding yeast Saccharomyces cerevisiae published as the result of a European-led effort begun in the mid-1980s. The first genome sequence for an archaeon, Methanococcus jannaschii, was completed in 1996, again by The Institute for Genomic Research.[18]

The development of new technologies has made genome sequencing dramatically cheaper and easier, and the number of complete genome sequences is growing rapidly. The US National Institutes of Health maintains one of several comprehensive databases of genomic information.[19] The thousands of completed genome sequencing projects include those for rice, a mouse, the plant Arabidopsis thaliana, the puffer fish, and the bacterium E. coli. In December 2013, scientists first sequenced the entire genome of a Neanderthal, an extinct species of humans. The genome was extracted from the toe bone of a 130,000-year-old Neanderthal found in a Siberian cave.[20][21]

Viral genomes

Viral genomes can be composed of either RNA or DNA. The genomes of RNA viruses can be either single-stranded RNA or double-stranded RNA, and may contain one or more separate RNA molecules (segments); a genome with a single segment is called monopartite, and one with several segments is called multipartite. DNA viruses can have either single-stranded or double-stranded genomes. Most DNA virus genomes are composed of a single, linear molecule of DNA, but some are made up of a circular DNA molecule.[22]
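
The properties named above (nucleic acid, strandedness, number of segments, and linear or circular form) can be gathered into a small record type. The Python sketch below is one possible way to do this; the class and field names are hypothetical, the example values are invented rather than taken from any real virus, and applying the linear/circular field to RNA genomes is an assumption of the sketch, since the text only discusses it for DNA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViralGenome:
    nucleic_acid: str          # "RNA" or "DNA"
    strands: str               # "single" or "double"
    segments: int = 1          # number of separate molecules
    topology: str = "linear"   # "linear" or "circular"

    def __post_init__(self):
        # Reject values outside the categories described in the text.
        if self.nucleic_acid not in ("RNA", "DNA"):
            raise ValueError("nucleic_acid must be 'RNA' or 'DNA'")
        if self.strands not in ("single", "double"):
            raise ValueError("strands must be 'single' or 'double'")
        if self.segments < 1:
            raise ValueError("segments must be at least 1")
        if self.topology not in ("linear", "circular"):
            raise ValueError("topology must be 'linear' or 'circular'")

    @property
    def partition(self):
        """'monopartite' for one segment, 'multipartite' for more than one."""
        return "monopartite" if self.segments == 1 else "multipartite"

    def describe(self):
        prefix = "ss" if self.strands == "single" else "ds"
        return f"{prefix}{self.nucleic_acid}, {self.partition}, {self.topology}"


if __name__ == "__main__":
    # Invented examples, not descriptions of specific viruses.
    print(ViralGenome("RNA", "double", segments=10).describe())
    print(ViralGenome("DNA", "single", topology="circular").describe())
```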