Human Genome Project


 

The Human Genome Project (HGP) endeavored to map the human genome down to the nucleotide (or base pair) level and to identify all the genes present in it.

Goals

The goals of the original HGP were not only to determine all 3 billion base pairs in the human genome with a minimal error rate, but also to identify all the genes in this vast amount of data. Gene identification is still ongoing, although a preliminary count indicates about 25,000 genes in the human genome, far fewer than most scientists had predicted.

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Another goal of the HGP was to develop faster, more efficient methods for DNA sequencing and sequence analysis, and to transfer these technologies to industry.

Related Topics:
Sequencing - Sequence analysis

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

The sequence of human DNA is stored in databases available to anyone on the Internet. The U.S. National Center for Biotechnology Information and its sister organizations in Europe and Japan house the genome sequence in a database known as GenBank, along with sequences of known and hypothetical genes and proteins. Other organizations, such as the University of California, Santa Cruz, and Ensembl, present additional data and annotation, along with powerful tools for visualizing and searching them. Because the raw data is difficult to interpret unaided, computer programs have been developed to analyse it.

Related Topics:
DNA - Database - Internet - National Center for Biotechnology Information - University of California, Santa Cruz - Computer program

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

The process of identifying the boundaries between genes and other features in raw DNA sequence is called genome annotation and is the domain of bioinformatics. While expert biologists make the best annotators, their work proceeds slowly, and computer programs are increasingly used to meet the high-throughput demands of genome sequencing projects. The best current technologies for annotation make use of statistical models that take advantage of parallels between DNA sequences and human language, using concepts from computer science such as formal grammars.
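As a rough illustration of what locating gene boundaries in raw sequence involves, the sketch below scans a DNA string for open reading frames (a start codon followed in the same frame by a stop codon). This is a deliberately crude, rule-based stand-in, not the statistical or grammar-based models mentioned above and not the method used by any particular annotation pipeline. The function name `find_orfs`, the `min_codons` parameter, and the example sequence are all hypothetical.

```python
# Toy open-reading-frame (ORF) scan on the forward strand only.
# Assumption: a "gene candidate" is simply ATG ... stop codon in one frame.
# Real annotation tools also handle the reverse strand, introns, and
# statistical evidence; none of that is modelled here.

STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_orfs(sequence, min_codons=3):
    """Return (start, end, frame) for each ORF found.

    start is the 0-based index of the A in ATG; end is the index just
    past the stop codon. min_codons counts codons from the start codon
    up to, but not including, the stop codon.
    """
    seq = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None  # first ATG seen since the last stop in this frame
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


if __name__ == "__main__":
    example = "CCATGAAACCCGGGTAGTT"  # made-up sequence for illustration
    for start, end, frame in find_orfs(example):
        print(frame, start, end, example[start:end])
```

A scan like this over-predicts heavily on real genomes, which is one reason annotation relies on statistical models and expert review rather than simple pattern matching.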

Related Topics:
DNA - Genome annotation - Bioinformatics - Language - Formal grammar

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Every human has a unique genome sequence, so the data published by the HGP does not represent the exact sequence of any one individual's genome. It is the combined genome of a small number of anonymous donors. The HGP genome serves as a scaffold for future work on identifying differences between individuals. Most of the current effort in this area focuses on single nucleotide polymorphisms, positions where a single base differs from one person to another.
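To make the idea of a single-base difference concrete, the following minimal sketch compares a sample sequence against a reference and lists the positions where one base differs. It assumes the two sequences are already aligned and of equal length, so insertions and deletions are out of scope. The function name `find_single_base_differences` and the sequences are hypothetical, and finding a difference between two sequences is only a first step: calling it a polymorphism would also require evidence that the variant recurs in a population.

```python
def find_single_base_differences(reference, sample):
    """Return (position, reference_base, sample_base) for each mismatch.

    Assumption: both sequences are pre-aligned and the same length, so
    position i in one corresponds to position i in the other.
    Positions are 0-based.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(reference.upper(), sample.upper())
    return [
        (pos, ref_base, alt_base)
        for pos, (ref_base, alt_base) in enumerate(pairs)
        if ref_base != alt_base
    ]


if __name__ == "__main__":
    reference = "ACGTACGT"  # made-up reference fragment
    sample = "ACGTTCGT"     # made-up individual's fragment
    for pos, ref_base, alt_base in find_single_base_differences(reference, sample):
        print(pos, ref_base, alt_base)
```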

~ ~ ~ ~ ~ ~ ~ ~ ~ ~