Tim Hubbard, Centre for Protein Engineering (CPE), Medical Research Council (MRC) Centre, Cambridge, CB2 2QH. th@mrc-cpe.cam.ac.uk.
An alternative version of this page, with full images rather than icons embedded, can be accessed here
rmscript has been written to make it easier to obtain and display protein structures. It provides easy access to PDB (Brookhaven Protein Data Bank) format files and displays them using the 3-D viewer RasMol (Sayle, 1994). Specifically, it is designed to take advantage of information embedded in WWW (World Wide Web) pages that specifies a molecule, the parts that should be displayed and how they should be coloured, but which delegates obtaining the co-ordinate file of the molecule to the client computer.
scop (structural classification of proteins) is a WWW database that embeds information in this way to show protein folds and domain information within large proteins. The scop database classifies protein structures by their structural and evolutionary relationships (Murzin et al., 1995). It can be accessed using any WWW browser from <URL http://scop.mrc-lmb.cam.ac.uk/scop/>. With its new sequence alignment facility, this embedding also allows alignments to be highlighted in the structure they are similar to (Visual Sequence Comparison, VSC).
With rmscript installed, clicking on an icon or highlight in a WWW page is all that is required to cause a structure to be displayed as specified. The most important concept is that the actual PDB co-ordinate files (which can be very large) are not sent; instead, a very small set of instructions specifies which protein structure is to be displayed and in what way. The instructions are processed by rmscript, which is responsible for obtaining the PDB file required. In this way, views of proteins specified on distant WWW pages can be accessed rapidly, relying on local copies of the relevant structure files. An on-line version of a poster gives a number of examples.
At present this system can only be used with the RasMol 3-D viewer; however, a more general and powerful standard, annmm (Brenner and Hubbard, 1995), has already been proposed. Future versions of rmscript will take advantage of this.
Since the objective is displaying proteins in a 3-D viewer, a fast Macintosh with a colour screen is required. RasMac will run on 68030 machines, but generally the minimum is 68040 and preferably a PowerMac. On 680x0 machines, performance of RasMac will be poor without a co-processor.
To run satisfactorily, 16M memory or 8M + Ramdoubler is required. It is possible to run the system with 8M on a 680x0 system with some care, provided few system extensions are loaded - see Troubleshooting. Remember it will be necessary to have rmscript (500K+), Netscape (3M+), RasMac (1M+system memory) running at the same time. If you are using system 7.1, Finder Liaison (64K) will also be running. If you are using the compression options, MacGzip (512K) will be running. More than 8M will be required to run 2 copies of RasMac.
rmscript is supplied as an installer, which can be downloaded here. Double click on the installer program to install on your hard disk. The default installation will restart your Macintosh after completion, so save any files in open applications before starting installation. If AppleScript 1.1 is already installed on your machine, you can use the customise option to avoid reinstalling AppleScript and so avoid a restart. Version 0.92 of rmscript has been compiled with FaceSpan 2.0.1, and the non-Apple AppleScript Scripting Additions (OSAXEN) files are now embedded in the application. This should eliminate a few compatibility problems and means that if you choose not to install AppleScript, your existing AppleScript configuration will be unchanged.
Notes:
rmscript is preconfigured to fetch co-ordinate files via internet
access to PDB (USA), and comes with a single PDB file (5pti) in the
local file cache which can be displayed immediately. To change the
configuration, open the Setup window (figure 1).
![]()
Figure 1(a) Setup Window
For most users the Default Setup window will be sufficient to select 1) a nearby server of PDB files, 2) a home scop site, and 3) a local directory to which cached PDB files can be moved, if required. By default, internet access is enabled, the PDB file server selected is the Brookhaven database itself, and no local directory is selected or enabled.
Selecting the Advanced setup window, using the Setup Mode
popup, gives many more options (this window can be selected directly
by holding down option whilst selecting Setup in the File menu). This
should allow users to configure rmscript for use with a private
server of PDB files (by entering a URL string directly with 'xxxx'
indicating the PDB file ID), or to access copies of PDB files on a
network disk or CDROM.
![]()
Figure 1(b) Setup Window (Advanced)
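The 'xxxx' placeholder convention for a private server can be illustrated with a short sketch. This is not rmscript's own code (rmscript is an AppleScript/FaceSpan application); it is a minimal Python illustration that assumes the placeholder is substituted literally and that IDs are lowercased. The function name and the example host are hypothetical.

```python
def pdb_url(template: str, pdb_id: str) -> str:
    """Replace the 'xxxx' placeholder in a server URL template with a PDB ID."""
    pdb_id = pdb_id.lower()  # assumption: file names use lowercase IDs
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"not a 4-character PDB ID: {pdb_id!r}")
    if "xxxx" not in template:
        raise ValueError("template has no 'xxxx' placeholder")
    return template.replace("xxxx", pdb_id)


# Hypothetical private server; the host and path are invented for illustration.
EXAMPLE_TEMPLATE = "http://pdb.example.org/files/pdbxxxx.ent"

if __name__ == "__main__":
    print(pdb_url(EXAMPLE_TEMPLATE, "5PTI"))
```

With the example template, the ID 5pti maps to a file named pdb5pti.ent, matching the naming used in the Files window.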
Information about each option in these windows can be displayed using balloon help. The configuration setup is saved within the program file, meaning that a configured rmscript can be distributed to other users (provided, of course, they have first run the installer to place the helper programs on their hard disk).
While starting, rmscript will probably ask about the
location of programs that it needs to interact with (Netscape 1.1N,
RasMac v2.5 and Finder Liaison 1.1) the first time it is used (figure
2). This information should be saved after rmscript has been
quit and only asked for once. Note that any later version of Netscape
can be selected when the program asks about the location of 'Netscape
1.1N' and any later version of RasMac when the program asks about the
location of 'RasMac v2.5'.
![]()
Figure 2 Dialogue presented by AppleScript if it
can't locate required application
If it has been installed correctly, the Files window should
appear (figure 3), listing pdb5pti.ent as the only file. Clicking on
this should display header information. Double clicking on it should
start RasMol and display the protein showing the Carbon-alpha atom
trace with colouring by secondary structure. Note that on a slow Mac,
RasMol 2.5 takes several seconds just to load this file (RasMol 2.6
reads files from disk much faster).
![]()
Figure 3 Files Window (shows a typical list of
files after a number of downloads)
For transparent file display from instructions embedded in WWW
pages (e.g. scop), Netscape also needs to be configured to interpret the
MIME tag application/x-rasmol as a signal to send data
to rmscript. Probably the quickest way to configure Netscape
is to click on a rasmol icon in a scop page and then, when Netscape asks
what to do with the data being sent to it, select 'Launch Application'
and select the rmscript application from the file dialogue. If
Netscape has previously been configured for an old version of
rmscript, delete this first. Sometimes this new configuration
information refuses to save correctly in Netscape. One solution to
this may be to delete the Netscape preferences file and restart
Netscape.
rmscript comes configured with a temporary cache (System
Folder:PDB Cache Folder) and the internet address of PDB. Since PDB
files are large, when rmscript is quit, there is an option to
clean this cache (figure 4). Since you may wish to keep some files
longer than others, using the Setup window you can also define
a directory as a permanent cache. Use the Files window to
transfer files from the temporary to permanent cache.
![]()
Figure 4 Cache cleaning dialogue on
Quit.
Integration with the compression program MacGzip is built in to allow transparent decompression of compressed files transferred over the internet. A "compress all" option is therefore also provided, so you can choose to compress files rather than delete them.
If you have access to an AppleTalk network disk or a CDROM of PDB, either locally or on the network, this may be faster than using the internet, so rmscript can be configured to look for the file on this disk first. Since CDROMs go out of date very fast, a file that is not found there will still be fetched over the internet. Since most people's AppleTalk networks are quite slow, files will still be cached locally.
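The lookup order described above (local caches, then a network disk or CDROM, then the internet, with everything cached locally) can be sketched as follows. This is an illustrative Python outline, not rmscript's implementation: the function and parameter names are hypothetical, the order in which the two caches are checked is an assumption, and compressed files are ignored for brevity. The `fetch` callback stands in for the download that rmscript delegates to Netscape.

```python
import shutil
from pathlib import Path
from typing import Callable, Optional


def locate_pdb_file(
    pdb_id: str,
    temp_cache: Path,
    permanent_cache: Optional[Path],
    network_dir: Optional[Path],
    fetch: Callable[[str, Path], None],
) -> Path:
    """Return a local path to the PDB file, fetching it only if necessary."""
    name = f"pdb{pdb_id.lower()}.ent"

    # 1. Local caches (assumed order: permanent cache, then temporary cache).
    for cache in (permanent_cache, temp_cache):
        if cache is not None and (cache / name).exists():
            return cache / name

    temp_cache.mkdir(parents=True, exist_ok=True)
    target = temp_cache / name

    # 2. Network disk or CDROM: copy into the local cache, since the
    #    network may be slow.
    if network_dir is not None and (network_dir / name).exists():
        shutil.copy(network_dir / name, target)
        return target

    # 3. Fall back to the internet; fetch() must write the file to target.
    fetch(pdb_id, target)
    return target
```

A second request for the same ID returns the cached copy without calling `fetch` again, which is the behaviour that makes repeated views of the same structure fast.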
Clicking on the green rasmol icon in any scop page in
Netscape should activate rmscript and lead to that molecule
being displayed in RasMol. First rmscript will look for the
required PDB file. If it does not find a copy on your local disk, it
will request Netscape to download it over the network (how long this
takes and whether it is successful will depend on how busy the
internet is and how busy the site you are using as a source of files
is). Provided the download is successful, rmscript will start
RasMac if it is not already running and instruct it to display the
file (Figure 5 shows a download in progress and Figure 6 shows the
molecule that has been downloaded displayed in RasMol).
![]()
Figure 5 Download of pdb1ddt.ent in progress.
Foreground window is from Netscape, background window ['fetching
1ddt'] is from rmscript, indicating it is waiting for
Netscape.
![]()
Figure 6 pdb1ddt.ent displayed in RasMac after
download. The colour scheme is the standard one used for domains in
scop:
alpha helix (magenta);
beta strand (yellow);
turn residues (blue);
remaining residues, i.e. random coil (white);
parts of this PDB chain not in this domain (red-orange);
other chains in this PDB file (violet).
Ligands are displayed in space filling representation, coloured by atom.
After files have been downloaded they remain in a temporary cache directory (System Folder:PDB Cache Folder) in your system folder. When you quit rmscript you have an option to delete these files (move to the trash) and you are recommended to do this if you don't need the files, since they can be large and soon fill up your disk. To keep some files for longer, define a permanent cache directory and move them there (see above).
Netscape requires a lot of memory to run, and if you are using a PowerBook you may not always be connected to the network anyway, so rmscript also provides a simple interface via the Files window to display any of the files you have in either cache. Structures are displayed as a backbone trace coloured by secondary structure, similar to the format used in scop (see figure 6). Note, however, that whereas selecting a structure via scop may highlight only a single domain within one chain of a PDB file, selecting a structure from the Files window will colour all parts of all protein chains by secondary structure, and ligands will not be displayed.
Display instructions from scop only specify PDB files. However, you may have access to some private co-ordinate sets; if you put these in the permanent cache, you can use rmscript to manipulate them for display.
Since one of the hardest things about structure files is remembering their contents (PDB files have cryptic names), you can also keep some notes about each file for your own reference in the window below the file list. By default the contents are set to a combination of the HEADER and SOURCE lines of the PDB file, if they exist. These notes are saved when rmscript is quit and can be exported and imported to allow exchange with other users or import into a new version of rmscript.
The import/export format is one line per file with 2 fields separated by a tab:
filename <tab> Description
---------------------------------------------------
pdb5pti.ent	Bovine Pancreatic Trypsin Inhibitor
---------------------------------------------------
The filename field must omit any compression extension (pdb5pti.ent not pdb5pti.ent.gz) and the description field cannot contain tabs or newline characters.
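The import/export rules above are simple enough to sketch in a few lines. The following Python is an illustration of the format only, not rmscript's own code; the function names are hypothetical, and treating ".gz" as the only compression extension is an assumption based on the example given.

```python
def format_notes(notes: dict[str, str]) -> str:
    """Write notes as one 'filename<TAB>description' line per file."""
    lines = []
    for filename, description in notes.items():
        # The filename field must omit any compression extension.
        if filename.endswith(".gz"):
            filename = filename[: -len(".gz")]
        # The description may not contain tabs or newline characters.
        if any(ch in description for ch in "\t\r\n"):
            raise ValueError(f"description for {filename} contains a tab or newline")
        lines.append(f"{filename}\t{description}")
    return "\n".join(lines) + "\n"


def parse_notes(text: str) -> dict[str, str]:
    """Read the notes format back into a filename -> description mapping."""
    notes = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        filename, tab, description = line.partition("\t")
        if not tab or "\t" in description:
            raise ValueError(f"expected exactly two tab-separated fields: {line!r}")
        notes[filename] = description
    return notes


if __name__ == "__main__":
    exported = format_notes({"pdb5pti.ent.gz": "Bovine Pancreatic Trypsin Inhibitor"})
    print(exported, end="")
    print(parse_notes(exported))
```

Stripping the compression extension on export means the notes stay attached to a file whether or not it is currently compressed in the cache.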
Other rmscript commands, such as download (figure 7), can
be found in the File menu (figure 8).
![]()
Figure 8 File Menu
Balloon help is provided for most items in each window, explaining their function.
Windows remember the position they were moved to after rmscript is quit.
If you wish to cancel a download via Netscape (because it is too slow), click the rmscript cancel button before the Netscape one. This is because rmscript is waiting for Netscape to close the file, so if you do things in the wrong order the partially downloaded file will be displayed in RasMac. If done the correct way, rmscript will delete the partial file to avoid confusion.
This is an experimental option and might be risky (it could crash your Macintosh); however, it has been used without problems by some users.
If you wish to compare 2 structures it is necessary to run more than one copy of RasMol, since the program will only read one structure at a time. One of the key features of rmscop for unix is the ability to communicate with multiple copies of RasMol. On a Macintosh, double clicking on a program that is already running will not start a second copy running, instead it is necessary to make a copy of the program with a different name. Find RasMac and from the finder use apple-D (duplicate) to create "RasMac copy". You can rename this if you wish, although the name must begin with "RasMac" to be recognised by rmscript.
If 2 copies of rasmol are running, rmscript will display the
Select Viewer window (figure 9) to allow selection of a viewer.
![]()
Figure 9 Select Viewer Window
For hackers only: Note a subtle feature is the distinction between clicking 'Display'/double-clicking in the Files Window and using Apple-D in the File Menu. The former will check how many RasMac applications are running and, if more than one, present the Select Viewer window. The latter will display the Select Viewer if more than one RasMac was running when rmscript was last quit. This implements a shortcut, since it allows a second RasMac to be started without needing to find it on the hard disk!
To deal with a shortage of memory, try to launch everything in the right order. Start rmscript. Use its Files window to display a file in RasMol (this will also start Finder Liaison if required
This application appears to be quite stable; however, due to the interprocess communication used, it could hang your Macintosh in various circumstances (such as if you request a network download and are not connected to the network, or if you run out of memory). Do not use it initially unless you have saved everything from other applications that are open. Use at your own risk. Please report errors, crashes and bugs to th@mrc-lmb.cam.ac.uk with details of machine, OS version, PDB source and download method.
rmscript is a development of rmscop, which was written for unix machines as a wish script (written in the tcl/tk language). rmscop assumes that the entire PDB database (>1 Gigabyte of data in 1995 and more than 3300 files) is accessible on a local or NFS (network file system) hard disk. For most users of Macintoshes this is unrealistic, so rmscript has the added features of transparent file access from both the internet and network disks as well as local file caching. To fetch files over the internet, rmscript relies on the Apple event GetURL in Netscape 1.1N, so it could in principle use any other application that provides GetURL, although this has never been tested.
On Macintoshes rmscript is a FaceSpan™ application which uses the AppleScript extensions to the Macintosh™ OS with a number of freeware extensions to AppleScript from Glenn L. Austin and Mark Alldritt.
Macintosh and AppleScript are trademarks of Apple Computer Inc.
FaceSpan is a trademark of Software Designs Unlimited, Inc.
FileBusy OSAXEN is Copyright © 1994 Glenn L. Austin
Regular Expressions and processes OSAXEN are Copyright © Mark Alldritt
Finder Liaison is © 1993 Gregory H. Dow.
MacGzip 0.3β is macspd@ivo.cps.unizar.es's Macintosh port of Jean-loup Gailly's gzip 1.2.4.
RasMol is written by Roger Sayle and is available for Unix, Macintosh and Windows platforms. The Macintosh version is called RasMac and the PC version RasWin; however, throughout this document the generic name RasMol has been used, except in cases where the specific Macintosh version is being referred to.
Murzin, A., Brenner, S. E., Hubbard, T. J. P. and Chothia, C. (1995). scop: a structural classification of proteins database for the investigation of sequences and structures. JMB, 247, 536-540.
Brenner, S. E. and Hubbard, T. J. P. (1995). annmm: A Specification for Defining and Annotating Regions of Macromolecular Structures. Proceedings of ISMB'95, Cambridge, UK (see here)
Sayle, R. (1994). RasMol. WWW, ftp://ftp.dcs.ed.ac.uk/rasmol.