What Protein Are You? Explore protein structure files in the public Protein Data Bank.
I. Introduction
The Protein Data Bank (PDB) stores all known three-dimensional (3D) structures of proteins in freely accessible datafiles. The datafile names for the more than 13,000 proteins in the PDB are all four characters long, beginning with a digit (usually 1) and usually followed by three letters. An example is 1EMB for the green fluorescent protein from jellyfish. The PDB is accessed at http://www.rcsb.org/pdb/.
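The filename format described above can be checked with a short pattern. This is a minimal sketch, not an official validator: the function name `looks_like_pdb_id` is hypothetical, and allowing digits as well as letters in the last three positions is an assumption based on the word "usually" in the description.

```python
import re

# Assumed shape: one leading digit, then three letters or digits.
# Accepting digits in the last three positions is an assumption.
PDB_ID_PATTERN = re.compile(r"[0-9][A-Za-z0-9]{3}")


def looks_like_pdb_id(text: str) -> bool:
    """Return True if text has the four-character shape of a PDB filename."""
    return PDB_ID_PATTERN.fullmatch(text.strip()) is not None


print(looks_like_pdb_id("1EMB"))  # True
print(looks_like_pdb_id("EMB1"))  # False
```

A string that passes this check only has the right shape. Whether a structure is actually stored under that name can only be found by searching the PDB.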
II. Protein Name
Go to the PDB web site. The PDB homepage has a "Site Search" box for a protein identifier (ID). Protein IDs are usually the digit "1" followed by three letters. Enter the digit "1" followed by your three initials and click the Explore button. There is an excellent chance that the PDB has a 3D protein structure stored under that filename.
If no protein structure has that filename, try a simple variation. If your name has more initials, use a different 3-letter combination. If you have just two initials, like MN, add a third initial between them, like MAN, or following them, like MNA. If you have three initials like JKL but get no hit with filename 1JKL, try 1JLA, 1JAL, or some other 3-letter variation. Eventually you'll get a hit. As a last resort, try the initials of a friend or family member (not a classmate).
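The variations suggested above can be listed in advance so that you have a set of filenames to try in order. The sketch below is one possible scheme, not the only one: the function name `candidate_ids` and the default filler letter "A" are hypothetical choices, and it assumes at least two initials. It covers reordering and the two-initial case, but not swapping in a new letter (as in 1JLA).

```python
from itertools import permutations


def candidate_ids(initials: str, filler: str = "A") -> list[str]:
    """List filenames to try: the digit 1 followed by three letters."""
    letters = initials.upper()
    if len(letters) == 2:
        # Two initials: add a third letter between them or after them.
        triples = [letters[0] + filler + letters[1], letters + filler]
    else:
        # Three or more initials: every ordered pick of three letters,
        # starting with the initials in their original order.
        triples = ["".join(p) for p in permutations(letters, 3)]

    candidates = []
    for triple in triples:
        pdb_id = "1" + triple
        if pdb_id not in candidates:  # skip repeats from duplicate letters
            candidates.append(pdb_id)
    return candidates


print(candidate_ids("MN"))   # ['1MAN', '1MNA']
print(candidate_ids("JKL"))  # ['1JKL', '1JLK', '1KJL', '1KLJ', '1LJK', '1LKJ']
```

Try the candidates in the order listed and stop at the first one that returns a structure.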
III. Protein Game
Once your search has found a protein, examine the initial page. Note the protein ID at the top of the page, followed by more information. You can download or display the full annotation, including the atomic coordinates of all the protein atoms, by clicking on the small icons to the right of the PDB ID. You can view displays of the protein 3D structure at the right of the main page or in the "Display files" option at the left. Explore these different methods of displaying 3D structure. Another way to display the 3D structure is to download the PDB file (described above) to your local computer's desktop and open it with the 3D molecular viewing program RasMol. RasMol and the browser plug-in Chime can be downloaded and installed from the Molecular Visualization Freeware page at http://www.umass.edu/microbio/rasmol/.
Read the file annotation and view the 3D structure to learn about your protein. When viewing, rotate the protein molecule and change display modes to highlight the polypeptide backbone and to distinguish subunits and ligands. Detailed instructions for these programs are in the tutorials and instruction manuals, also found at http://www.umass.edu/microbio/rasmol.
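A downloaded PDB file is plain text, so parts of it can also be read with a few lines of code. The sketch below counts the residues that have coordinates in each chain, using the fixed column positions of ATOM records (chain identifier in column 22, residue number in columns 23-26, insertion code in column 27). The function name `residues_per_chain` and the tiny sample text are hypothetical. The count covers only residues present in the coordinates, which can be fewer than the full sequence, and ATOM records may also describe DNA or RNA chains.

```python
from collections import defaultdict


def residues_per_chain(pdb_text: str) -> dict[str, int]:
    """Count distinct residues with ATOM records in each chain."""
    residues = defaultdict(set)
    for line in pdb_text.splitlines():
        if line.startswith("ENDMDL"):
            break  # multi-model files (e.g. NMR): read the first model only
        if line.startswith("ATOM") and len(line) >= 27:
            chain_id = line[21]                     # column 22
            residue_key = (line[22:26], line[26])   # columns 23-26 and 27
            residues[chain_id].add(residue_key)
    return {chain: len(keys) for chain, keys in residues.items()}


# Hypothetical four-line sample, truncated after the insertion code column.
sample = (
    "ATOM      1  N   MET A   1 \n"
    "ATOM      2  CA  MET A   1 \n"
    "ATOM      3  N   SER A   2 \n"
    "ATOM      4  N   GLY B   1 \n"
)
print(residues_per_chain(sample))  # {'A': 2, 'B': 1}
```

For a real file, read its contents into a string first and pass that string to the function.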
IV. Scoring
Score points for your protein in each of the categories below, based on features seen in the PDB file. If your protein has identical subunits, score the structural features of one subunit only. If a protein fits more than one description in a category, take the highest score (not the sum).
Type: Score 7 points for a membrane protein, 6 points for a DNA-binding protein (with DNA atoms in the file), 5 for a structural protein (not an enzyme or regulatory protein), 4 for a regulatory protein (not DNA-binding), 3 for a protein-binding protein, 2 for a small-molecule ligand-binding protein (not an enzyme), 1 for an enzyme, 0 for can't tell.
Species: Score 8 points for primate (not human), 7 for vertebrate (not mammalian), 6 for archaebacterial, 5 for human, 4 for mammalian, 3 for viral (any type), 2 for eukaryotic (not vertebrate), 1 for eubacterial, 0 for can't tell.
Subunits: Score 5 points for 5 or more subunits, 4 for 4, 3 for 3, 2 for 2, 1 for 1. If your PDB structure comes from a crystal unit cell with more than one protein molecule, just count the subunits of one molecule.
Size: Count the amino acids in the polypeptide chain of the largest subunit in the file. Score 8 points for 1000 or more, 7 for 800-1000, 6 for 600-800, 5 for 500-600, 4 for less than 100, 3 for 100-200, 2 for 400-500, and 1 for 200-400. A count that falls on a boundary shared by two ranges (for example, exactly 200) fits both, so take the higher score. The average number of amino acids per polypeptide chain is 280-300.
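The size ranges can be written as a lookup table. In this sketch the name `size_score` is hypothetical, and each range is treated as including both of its end points. A count on a shared boundary therefore matches two ranges, and the function applies the general rule of taking the highest score. That treatment of boundaries is an assumption.

```python
# (lowest count, highest count or None for no upper limit, points)
SIZE_RULES = [
    (1000, None, 8),
    (800, 1000, 7),
    (600, 800, 6),
    (500, 600, 5),
    (1, 99, 4),      # "less than 100"
    (100, 200, 3),
    (400, 500, 2),
    (200, 400, 1),
]


def size_score(n_residues: int) -> int:
    """Return the Size points for the largest subunit's residue count."""
    if n_residues < 1:
        raise ValueError("residue count must be at least 1")
    matches = [
        points
        for low, high, points in SIZE_RULES
        if n_residues >= low and (high is None or n_residues <= high)
    ]
    # A boundary value fits two ranges; take the highest score.
    return max(matches)


print(size_score(250))   # 1
print(size_score(200))   # 3
print(size_score(1000))  # 8
```

Note that the lowest score goes to the 200-400 range, which contains the average chain length given above.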
Secondary structure: Score 5 points for all beta-sheet (with random coil), 3 for all alpha-helix (with coil), and 1 for mixed alpha and beta.
Structure method: Score 5 points for a theoretical model, 3 for an NMR structure, 1 for an X-ray crystal structure, 0 for can't tell.
Prosthetic groups: If your protein has prosthetic groups like hemes or metal ions as integral parts of the structure, score 5 points for 5 or more, 4 for 4, 3 for 3, 2 for 2, 1 for 1.
Ligand: The protein structure may have been determined with a small molecule ligand, inhibitor, or substrate analog. Score 5 for yes, 3 for no, 1 for can't tell.
Mutant: The structure determined may be for a mutant or variant protein rather than wild type. The protein may have been engineered. Score 5 for non-engineered mutant, 3 for wild type, 1 for engineered mutant or variant, 0 for can't tell.
Cys and Trp: Add up the number of Cys and Trp residues in one subunit polypeptide. Score 7 points for zero, 6 for 1, 5 for 6 or more, 4 for 2, 3 for 5, 2 for 3, 1 for 4, 0 for can't tell.
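If you have the one-letter amino acid sequence of one subunit, the Cys and Trp count and its score can be worked out as follows. This is a minimal sketch: the function name `cys_trp_score` and the short example sequence are hypothetical, and the input is assumed to use the standard one-letter codes (C for Cys, W for Trp).

```python
# Points for each Cys + Trp total; 6 or more is handled separately.
CYS_TRP_POINTS = {0: 7, 1: 6, 2: 4, 3: 2, 4: 1, 5: 3}


def cys_trp_score(sequence: str) -> int:
    """Score one subunit from its one-letter amino acid sequence."""
    seq = sequence.upper()
    count = seq.count("C") + seq.count("W")
    return CYS_TRP_POINTS.get(count, 5)  # 6 or more scores 5 points


# Hypothetical fragment with two Cys and one Trp: total 3, score 2.
print(cys_trp_score("MKWCAC"))  # 2
```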
Add up the total from all ten categories. Generally speaking, the rarer, more unusual proteins in the PDB have the higher scores. Ask around among your friends and classmates for their protein names and scores.
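To keep track of the total, the ten category scores can be stored by name and summed. The category keys, the function name `total_score`, and the example values below are all hypothetical and chosen only for illustration.

```python
CATEGORIES = (
    "type", "species", "subunits", "size", "secondary_structure",
    "structure_method", "prosthetic_groups", "ligand", "mutant", "cys_trp",
)


def total_score(scores: dict[str, int]) -> int:
    """Add up the points from all ten categories."""
    missing = [name for name in CATEGORIES if name not in scores]
    if missing:
        raise ValueError(f"missing categories: {missing}")
    return sum(scores[name] for name in CATEGORIES)


# Made-up scores for an imaginary protein.
example = {
    "type": 1, "species": 2, "subunits": 1, "size": 1,
    "secondary_structure": 5, "structure_method": 1,
    "prosthetic_groups": 0, "ligand": 3, "mutant": 3, "cys_trp": 4,
}
print(total_score(example))  # 21
```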