In addition to this document there are also the following pages, describing particular features of scop:
The online WWW access to scop has been designed to facilitate both detailed searching of particular families and browsing of the whole database. To this end, there are a variety of different techniques for navigation:
Every page in the database has several components. At the top, there is an array of buttons providing different functionality, as described below. Some buttons may be "inactive" because they are not appropriate for the current page. The button list is immediately followed by a title, indicating both the level within the hierarchy (e.g., "fold") and the name of the current item. Beneath this, the "lineage" of the current item is shown: the groups to which the current item belongs, listed at successively lower (more specific) levels of the hierarchy. Selecting any of the highlighted items brings you to that level of the tree.
It should be noted that not every level of the tree gets its own page. The levels which have pages are root, fold, superfamily, family,
The main body of the page shows the entities which have been combined to form the current level. For example, the globin fold includes the globin and phycocyanin superfamilies. The most "normal" method of moving around the scop tree is simply clicking on highlighted children to view their successive children. As a special case, Protein Data Bank (PDB) codes link to the actual PDB files at Brookhaven National Laboratory. The number in parentheses after an entry shows how many children will be found there. Many children may have additional links to images, references, sequences, and interactive viewers. These are shown as small boxes after the text, and are described below.
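The lineage and child-count ideas described above can be illustrated with a small tree model. This is only a sketch: the `Node` class, its methods, and the name given to the root are hypothetical and are not taken from the scop software, and only levels named in the text are modelled.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """One entry in a scop-like hierarchy (hypothetical model, not scop's own code)."""
    level: str                                   # e.g. "root", "fold", "superfamily"
    name: str
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list)

    def add_child(self, level: str, name: str) -> "Node":
        """Create a child entry and attach it to this node."""
        child = Node(level, name, parent=self)
        self.children.append(child)
        return child

    def lineage(self) -> List["Node"]:
        """Ancestors of this node, ordered from the root down to the parent."""
        chain = []
        node = self.parent
        while node is not None:
            chain.append(node)
            node = node.parent
        return list(reversed(chain))

    def label(self) -> str:
        """Entry name followed by its number of children in parentheses."""
        return f"{self.name} ({len(self.children)})"


# The globin fold example from the text; the root name "scop" is an assumption.
root = Node("root", "scop")
fold = root.add_child("fold", "Globin")
globin_sf = fold.add_child("superfamily", "Globin")
fold.add_child("superfamily", "Phycocyanin")

print(fold.label())                                            # Globin (2)
print([f"{n.level}: {n.name}" for n in globin_sf.lineage()])   # ['root: scop', 'fold: Globin']
```

Storing a parent reference on each node keeps the lineage computation a simple walk towards the root, which mirrors how the lineage links on each page lead back up the tree.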
At the bottom of each page in the database, there is a search box. Enter a single keyword (append a '+' for right-truncation; otherwise you will only match a whole word), and you will retrieve either the page where that keyword is found, or a list of pages if it is found on more than one. (Note: if you see two boxes, one of which contains a URL like "http://..." then you are using an out-of-date viewer which doesn't understand forms with hidden fields.)
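The difference between a whole-word keyword and a right-truncated one can be sketched as follows. This is an illustration of the matching rule only, not the scop search engine itself: the function names and sample page titles are hypothetical, and case-insensitive matching is an assumption.

```python
import re
from typing import Iterable, List


def matches(keyword: str, text: str) -> bool:
    """Whole-word match, or word-prefix match when the keyword ends in '+'."""
    if keyword.endswith("+"):
        # Right-truncation: the keyword only has to start a word.
        pattern = r"\b" + re.escape(keyword[:-1])
    else:
        # Default: the keyword must match a complete word.
        pattern = r"\b" + re.escape(keyword) + r"\b"
    # Ignoring case is an assumption made for this sketch.
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def search(keyword: str, pages: Iterable[str]) -> List[str]:
    """Return the pages (here just titles) in which the keyword is found."""
    return [page for page in pages if matches(keyword, page)]


# Hypothetical page titles, for illustration only.
pages = ["Globin fold", "Globins and related proteins", "Phycocyanin superfamily"]

print(search("globin", pages))    # ['Globin fold']
print(search("globin+", pages))   # ['Globin fold', 'Globins and related proteins']
```

Without the '+', "globin" does not match "Globins", which is why right-truncation is useful when the exact word form is not known.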
The buttons found on every page of the database can be broken down into three types: general, navigation, and display:
Home: Go to the scop introductory page.
Mail: Send a message to the scop database authors.
Help: Display this page.
Root: Go to the root of the tree.
Up: Move up one level in the tree (e.g., from the globin superfamily to the globin fold).
Expand and Collapse: Add or remove additional levels of detail on the current page.
The scop database provides hypertext links to a variety of different information sources. For speed and real-estate reasons, these links are provided in the form of very small boxes after the associated entry. Some links (particularly images) are "propagated" up the tree (e.g. an image for Actinidia chinensis Actinidin is shown next to the Cysteine protease superfamily).
A description of the various link-boxes follows:
Swiss-3DImage: These are links to entries in the Swiss-3DImage database maintained by Dr. Manuel C. Peitsch. Because these images are indexed by PDB entry name, and not by region, some images may show portions of a PDB file which are not appropriate to the fold currently being examined.
RasMol Script: These cause a local version of RasMol to be started to view the chosen PDB entry, with the relevant fold highlighted. For information on how to make this work, and an explanation of the colour scheme, see here.
Chime view: These cause the chosen PDB entry to be displayed by the Netscape Chemscape Chime Plug-in, with the relevant fold highlighted. For information on how to make this work, and an explanation of the colour scheme, see here.
NCBI Entrez Sequence Entries: These bring up the entry in the Entrez database (maintained by NCBI) associated with the current protein. In addition to providing the sequence information in a standardized format, these entries offer further links to relevant MEDLINE references.
Protein Data Bank Entries: The links to the Protein Data Bank (PDB) are simply the PDB identifiers of the entries, found at the lowest level of the hierarchy. As such, they have no representative icon. Following one of these links displays the header of the PDB entry in question, with links to a number of different optional views. These views, which can also be accessed directly via the PDB entry viewer, include:
Nucleic Acids Database Entries: These retrieve entries from the Nucleic Acids Database (NDB), which contains coordinates, images, and experimental information about nucleic acid structures.
Protein Motions Database Entries: These retrieve entries from the Protein Motions Database, which contains a classification of proteins which have internal movements, as well as descriptive text, images, and animations.
Scop Cross-References: When a single PDB entry is divided into different folds, these icons provide links to the locations, in scop, of other regions of the PDB entry.
scop contains the domains of all PDB entries available at the time of the current release's construction. For each of these entries a coordinate file is available and can be displayed via the various graphical interfaces. The sequence of each protein chain has also been extracted and is contained in databases that can be searched for sequence similarity from here (see here for help, here for information about the different databases available for searching).
The release also contains many literature references. These are structures that have been published in sufficient detail to be classified in scop, but whose coordinates are not yet available from PDB. Whereas PDB entries are identified by a 4-character code starting with a digit, these literature entries have been given 4-character codes consisting of the letter s followed by 3 digits, e.g. s149. This code is arbitrary and may change between releases. Entries included in scop in this way tend to be only those structures that are significantly different from any structure already in PDB.
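The two identifier conventions can be told apart mechanically. The following is a minimal sketch, not part of scop: the `classify` function and its return values are hypothetical, and it assumes that the three characters after the leading digit of a PDB code are letters or digits and that the literature prefix is always a lowercase s.

```python
import re

# Assumption: a PDB code is a digit followed by three letters or digits.
PDB_CODE = re.compile(r"[0-9][A-Za-z0-9]{3}")
# Literature entries: the letter s followed by three digits, e.g. s149.
LITERATURE_CODE = re.compile(r"s[0-9]{3}")


def classify(code: str) -> str:
    """Return 'literature', 'pdb', or 'unknown' for a 4-character entry code."""
    if LITERATURE_CODE.fullmatch(code):
        return "literature"
    if PDB_CODE.fullmatch(code):
        return "pdb"
    return "unknown"


print(classify("s149"))   # literature
print(classify("1mbn"))   # pdb
print(classify("abcd"))   # unknown
```

Because literature codes are arbitrary and may change between releases, a check like this identifies only the kind of entry; it should not be used to store long-lived references to a particular literature entry.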
In the absence of a PDB header, a pseudo header file has been constructed (see here for an example: the header for s001). These header files contain the reference, the SWISS-PROT identifier for the sequence, the fragment of the sequence for which structural information exists (where known), and the domain classification of each part of this sequence fragment in scop. (Note that occasionally some of this information is incomplete where there was insufficient information about the sequence in the original reference.)
The sequences of these literature references have also been included in the sequence databases that can be searched for sequence similarity from here (see here for help, here for information about the different databases available for searching).
Because much of the information on these pages is in graphical form, we highly recommend users turn on image loading. The total size of all the icons used is only a fraction of even the smallest pages, so transfer time should be minimal. Moreover, the same images are used on all pages, so there will be no need to "reload" any of these pictures as you continue browsing.
The filenames for the scop HTML database are currently arbitrary and subject to frequent change. Therefore, external users should not make links directly to files in the database. Rather, they should link to a search engine (such as the one at http:search.cgi). The fields which you can set are: