Date: 1 March 2012 19:36
Hi all,
As you may recall, the Protein Data Bank in Europe (PDBe; http://pdbe.org) has launched a number of PDB archive browsers in the past two years. These allow users to explore and analyse what is in the PDB based on concepts and classifications they are familiar with, such as the EC system, chemical compounds, taxonomy or amino-acid sequences (see http://pdbe.org/browse for more information).
The most recent addition is a browser that is based on the GO system (http://pdbe.org/go). GO stands for Gene Ontology, a major bioinformatics initiative with the aim of standardizing the representation of gene and gene product attributes across species and databases (http://geneontology.org/). The SIFTS project (http://pdbe.org/sifts) maps GO annotations from UniProt to all proteins and protein fragments that occur in the PDB. These GO terms describe:
* molecular function, the elemental activities of a gene product at the molecular level (e.g., catalysis of free radical formation)
* cellular component, the localisation of a gene product in a cell or its extracellular environment (e.g., outer membrane-bounded periplasmic space)
* biological process, operations or sets of molecular events with a defined beginning and end, pertinent to the functioning of integrated living units: cells, tissues, organs, and organisms (e.g., neuron apoptosis)
To start exploring, surf to http://pdbe.org/go
In the left panel you can either:
- click on one of the three examples and then hit the Submit button
- start exploring the GO classification by expanding the molecular_function, cellular_component or biological_process term. Clicking on any of these will expand the classification to show the underlying terms, which can in turn be clicked to drill down further. A term shown on a grey background has no annotated proteins in the PDB.
- start typing a term in the input box (above the Submit button). Once you have typed a few characters, an auto-complete function will show you a list of all the matching GO terms. Select any one of these and hit the Submit button.
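The auto-complete behaviour described in the last option can be pictured with a minimal sketch. This is not the browser's actual implementation: the function name `suggest`, the three-character threshold and the case-insensitive substring match are all assumptions made for illustration, and the term list holds only the example terms mentioned in this post rather than the full GO vocabulary.

```python
# Illustrative only: a tiny list of GO term names taken from the examples
# in this post, not the full GO vocabulary.
GO_TERMS = [
    "catalysis of free radical formation",
    "outer membrane-bounded periplasmic space",
    "neuron apoptosis",
    "purine nucleotide biosynthetic process",
]


def suggest(typed, terms=GO_TERMS, min_chars=3):
    """Return the terms that contain the typed text, ignoring case.

    Nothing is suggested until at least min_chars characters are typed
    (the threshold is an assumption, not the browser's documented value).
    """
    query = typed.strip().lower()
    if len(query) < min_chars:
        return []
    return [term for term in terms if query in term.lower()]


print(suggest("pur"))
```

With this toy list, typing "pur" would suggest only "purine nucleotide biosynthetic process", while a two-character input would suggest nothing yet.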
Once you have selected a GO term of interest, the central panel of the browser will load all PDB entries that contain a protein (or protein fragment) annotated with that term. (The right panel contains more information about the GO term and how it fits into the GO classification; click on the image to see a larger version.) In the central panel, the "PDB entries" tab shows a simple list of the PDB entries. Other tabs provide different views of this set of entries, such as:
- which ligands are found in these entries?
- what folds are represented (CATH)?
- what quaternary structures occur (PISA)?
- what sequence families are present (Pfam)?
- from which taxa have structures been determined?
- who has determined these structures?
For instance, if you are interested in "purine nucleotide biosynthetic process", you may find that:
- there are 310 relevant structures in the PDB
- the most common ligands are Mg, SO4, PO4, GDP, K, CL, AMP and ADP
- the structures are mostly of the alpha/beta type (82%), with 45% of all domains adopting a 3-layer ABA sandwich fold
- 70% of the entries contain homo-oligomeric structures (and 50% of those are homodimers, but there are also 5 homohexameric structures)
- the set of entries covers 43 different Pfam families
- there are 11 proteins in 5 distinct entries from Yersinia pestis
- R.B. Honzatko is the most prolific depositor of PDB entries in this category (useful to know if you are looking for collaborators or referees)
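To put the percentages in the example above into rough entry counts, the arithmetic can be sketched as follows. The inputs are the figures quoted in the list; because those percentages are rounded, and assuming the 70% refers to all 310 entries, the resulting counts are approximate and are not values reported by the browser.

```python
# Figures quoted in the example above (percentages are rounded).
total_entries = 310
homo_oligomeric_pct = 70      # percent of all entries
homodimer_pct_of_those = 50   # percent of the homo-oligomeric entries

# Integer arithmetic keeps the estimates free of floating-point rounding.
homo_oligomeric = total_entries * homo_oligomeric_pct // 100
homodimers = homo_oligomeric * homodimer_pct_of_those // 100
homodimer_pct_overall = homo_oligomeric_pct * homodimer_pct_of_those // 100

print(homo_oligomeric, homodimers, homodimer_pct_overall)
```

Under these assumptions, roughly 217 of the 310 entries would be homo-oligomeric and roughly 108 of them homodimers, or about 35% of the whole set.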
As you can see, the GO browser can be used to explore many aspects of what is known in terms of 3D structures for proteins with a given function, role or localisation.
If you have any questions, comments or suggestions, please use the button marked "FEEDBACK" in the top right corner of any PDBe webpage.