The online WWW access to SCOP has been designed to facilitate both detailed searching of particular families and browsing of the whole database. To this end, there are several techniques for navigation, described in the sections that follow.
Every page in the database has several components. At the top is an array of buttons providing different functionality, as described below. Some buttons may be "inactive" because they are not appropriate for the current page. The button list is immediately followed by a title, indicating both the level within the hierarchy (e.g., "fold") and the name of the current item. Beneath this is the "lineage" of the current item: the groups to which it belongs at successively lower (more specific) levels of the hierarchy. Selecting any of the highlighted items brings you to that level of the tree.
It should be noted that not every level of the tree gets its own page. The levels which have pages are root, fold, superfamily, family, and species.
The main body of the page shows the entities which have been combined to form the current level. For example, the globin fold includes the globin and phycocyanin superfamilies. The most "normal" method of moving around the SCOP tree is simply clicking on highlighted children to view their children. As a special case, Protein Data Bank (PDB) codes link to the actual PDB files. The number in parentheses after an entry shows how many children will be found there. Children may have additional links to images, references, sequences, and interactive viewers. These are shown as small boxes after the text, and are described below.
At the bottom of each page in the database, there is a search box. You can search for any word (longer than 2 characters) that appears in a SCOP page (titles, comments, PDB entries, etc.), any SCOP domain identifier, and entries in SCOP corresponding to a given EC number. The search facility allows right truncation (append a '+' to a truncated word), searching for combinations of words (a '+' prefix returns the pages containing all words marked with '+'), and excluding words (add a '-' prefix to words you want to exclude).
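The search operators above can be illustrated with a small sketch that composes a query string to type into the search box. The function name build_query and its parameters are hypothetical and are not part of SCOP. The sketch assumes that terms are separated by spaces, and it only builds the text; it does not contact the database.

```python
def build_query(required=(), excluded=(), prefixes=(), optional=()):
    """Compose a search string from the operators described in the help text.

    Hypothetical helper: assumes terms are separated by single spaces.
    """
    words = list(required) + list(excluded) + list(prefixes) + list(optional)
    for word in words:
        # The help text says searchable words must be longer than 2 characters.
        if len(word) <= 2:
            raise ValueError(f"search word too short: {word!r}")

    terms = []
    terms += ["+" + w for w in required]   # '+' prefix: page must contain the word
    terms += ["-" + w for w in excluded]   # '-' prefix: page must not contain the word
    terms += [p + "+" for p in prefixes]   # trailing '+': right truncation
    terms += list(optional)                # plain words, no operator
    return " ".join(terms)


print(build_query(required=["globin", "heme"], excluded=["plant"]))
```

For example, passing prefixes=["glob"] yields the right-truncated term glob+.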
The buttons found on every page of the database can be broken down into three types: general, navigation, and display.
Home: Go to the SCOP introductory page.
Mail: Send a message to the SCOP database authors.
Help: Display this page.
Root: Go to the root of the tree.
Up: Move up one level in the tree (e.g., from the globin superfamily to the globin fold).
Expand and Collapse: Add or remove additional levels of detail on the current page.
The SCOP database provides hypertext links to a variety of different information sources. For speed and real-estate reasons, these links are provided in the form of very small boxes after the associated entry. Some links (particularly images) are "propagated" up the tree (e.g. an image for Actinidia chinensis Actinidin is shown next to the Cysteine protease superfamily).
A description of the various link-boxes follows:
Swiss-3DImage: These are links to entries in the Swiss-3DImage database maintained by Dr. Manuel C. Peitsch. Because these images are indexed by PDB entry name, and not by region, some images may show portions of a PDB file that are not appropriate to the fold currently being examined.
RasMol Script: These cause a local version of RasMol to be started to view the chosen PDB entry, with the relevant fold highlighted. For information on how to make this work, and an explanation of the colour scheme, see here.
Chime view: These cause the chosen PDB entry to be displayed by the Netscape Chemscape Chime Plug-in, with the relevant fold highlighted. For information on how to make this work, and an explanation of the colour scheme, see here.
NCBI Entrez Sequence Entries: These bring up the entry in the Entrez database (maintained by NCBI) associated with the current protein. In addition to providing the sequence information in a standardized format, these entries offer further links to relevant MEDLINE references.
Protein Data Bank Entries: The links to the Protein Data Bank (PDB) are simply the PDB identifiers of the entries, found at the lowest level of the hierarchy. As such, they have no representative icon. Following a link displays the header of the PDB entry in question, with links to a number of different optional views. These views can also be accessed directly via the PDB entry viewer from the MRC site and from mirror sites with a local copy of PDB files.
Nucleic Acids Database Entries: These retrieve entries from the Nucleic Acids Database (NDB), which contains coordinates, images, and experimental information about nucleic acid structures.
Protein Motions Database Entries: These retrieve entries from the Macromolecular Motions Database, which contains a classification of proteins which have internal movements, as well as descriptive text, images, and animations.
SCOP Cross-References: When a single PDB entry is divided into different folds, these icons provide links to the locations, in SCOP, of other regions of the PDB entry.
SCOP contains the domains of all PDB entries available at the time of the current release's construction. For each of these entries a coordinate file is available and can be displayed via the various graphical interfaces. The sequence of each protein chain has also been extracted.
The release also contains many literature references. These are structures that have been published in sufficient detail to be classified in SCOP, but where the coordinates are not yet available from PDB. Whereas PDB entries are identified by a 4-character code starting with a digit, these literature entries have been given 4-character codes consisting of the letter s followed by 3 digits, e.g. s149. This code is arbitrary and may change between releases. Entries included in SCOP in this way tend only to be those structures that are significantly different from any structure already in PDB.
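The difference between the two kinds of code can be sketched as a pair of patterns. This is only an illustration of the naming rule stated above: the function name classify_entry_code is hypothetical, and the sketch assumes that the three characters after the leading digit of a PDB code are letters or digits and that literature codes use a lowercase s.

```python
import re

# A PDB code: 4 characters, the first a digit (the rest assumed alphanumeric).
PDB_CODE = re.compile(r"[0-9][A-Za-z0-9]{3}")
# A literature code: the letter s followed by 3 digits, e.g. s149.
LITERATURE_CODE = re.compile(r"s[0-9]{3}")


def classify_entry_code(code):
    """Return 'literature', 'pdb', or 'unknown' for a 4-character entry code."""
    if LITERATURE_CODE.fullmatch(code):
        return "literature"
    if PDB_CODE.fullmatch(code):
        return "pdb"
    return "unknown"


for example in ("s149", "1hho", "abcd"):
    print(example, classify_entry_code(example))
```

Because literature codes are arbitrary and may change between releases, a check like this identifies the kind of entry only; it should not be used to store long-lived references to literature entries.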
In the absence of a PDB header, a pseudo header file has been constructed. These header files contain the reference; the SWISSPROT identifier for the sequence; the fragment of the sequence for which structural information exists (where known); and the domain classification of each part of this sequence fragment in SCOP. (Note that occasionally some of this information is incomplete where there was insufficient information about the sequence in the original reference.)
The filenames for the SCOP HTML database are currently arbitrary and subject to frequent change. Therefore, external users should not make links directly to files in the database. Rather, they should link to a search engine (such as the one at http:search.cgi). The fields which you can set are:
Note: The linking facilities of SCOP continue to be under active development, and the above features may soon be supplanted (but will not immediately disappear).
Because much of the information on these pages is in graphical form, we highly recommend users turn on image loading. The total size of all the icons used is only a fraction of even the smallest pages, so transfer time should be minimal. Moreover, the same images are used on all pages, so there will be no need to "reload" any of these pictures as you continue browsing.