Since Phase 2 started we have been gaining ground and we are at #9 as of this post! Keep it up!
Everyone join DF!
So where do we go about getting the client loaded, and joining the matroxusers team? If there were some nice easy instructions for newbies like me then you might get some new faces
DM says: Crunch with Matrox Users@ClimatePrediction.net
-
Originally posted by GNEP
So where do we go about getting the client loaded, and joining the matroxusers team? If there were some nice easy instructions for newbies like me then you might get some new faces
June 13, 2003
Using the Foldtraj client
-------------------------
System Requirements
-------------------
The Distributed Folding client will run on just about any computer, but
there are a few minimal system requirements. Generally, your machine must
have:
233 MHz Pentium II processor (or equivalent) or faster
(see CompatiblityNotes.txt for details)
20 MB free hard drive space (and up to 100 MB additional temporary space)
128 MB RAM
An internet connection
Colour display (optional)
A supported operating system (Windows and most flavors of UNIX)
UNIX Installation
-----------------
UNIX users will then have to un-gzip and un-tar the package, like so:
> gzip -d distribfold*.gz
> tar xvf distribfold*.tar
A distribfold directory will be created with the required files in it. Go
into that directory, and type "./foldit" and hit enter to begin the program.
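If you would rather script the unpacking step, the two commands above can be approximated in Python. This is only a minimal sketch, assuming the download is a gzip-compressed tar archive; the function name unpack_client and any archive path you pass in are illustrative and not part of the client.

```python
import tarfile
from pathlib import Path


def unpack_client(archive_path, dest_dir="."):
    """Extract a .tar.gz archive (like gzip -d followed by tar xvf) into dest_dir.

    Returns the sorted list of member names found in the archive.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Mode "r:gz" decompresses and reads the tar stream in one step.
    with tarfile.open(archive_path, "r:gz") as tar:
        names = sorted(tar.getnames())
        tar.extractall(dest)
    return names
```

After it returns, the distribfold directory should sit under dest_dir, and you can start the client with ./foldit as described above.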
Windows Installation
--------------------
The downloaded file is a ZIP archive so simply use WinZip or PKUNZIP to decompress
it to an empty directory on your machine. The file should be called
distribfold-current-win9x.zip or something similar. Once this is done,
browse to the directory you unzipped the files to and double-click
on the "foldit" icon (the file "foldit.bat") to automatically open up a DOS box and
begin the program. You may wish to make a shortcut to the program on your
Desktop for quicker access.
What does it do?
----------------
The Foldtraj client is part of the Distributed Folding Project, an
experiment in distributed computing through the internet to help solve the
Protein Folding Problem, one of the computationally intensive problems facing
scientists of the 21st century. The program probabilistically generates
protein structures, and uploads them to our server every once in a while. A
small, colorful protein will build itself up on your screen as the program
generates structures so you can see how many you have made and how fast it is
progressing. When you want to use your computer again, simply hit 'Q' and the
program will end (or even better, leave it running in the background).
Signing up
----------
We require you register in order to participate. To register, simply visit:
and click on the High-Flyers link. Fill in your e-mail address, Organization
and a password. You may also optionally tell us a bit more about yourself
by filling in the appropriate fields, such as your name, address and
location. We assure you that this information will remain strictly
confidential and will not be used or sold to marketing agencies. On this
site you will also find links to a wealth of information about the
Distributed Folding Project, and the algorithm used by the Foldtraj client to
generate protein folds.
Getting your handle
-------------------
A handle is an 8 character code that is needed so that the server can track
structures made by your computer. Once you have registered, your handle
will appear on the web page, and you will shortly receive an e-mail
containing your handle. No two participants will have the same handle.
Make sure you keep this e-mail message for future reference.
Running the Program
-------------------
The first time you run the program, you will be asked to enter your handle.
This is the 8-character name that was e-mailed to you when you registered and
identifies any files your computer sends to our server as being generated by
you. Only proteins generated by a user with a valid handle will be accepted
by the server, so enter it exactly as shown in the e-mail you receive. You
may use this handle on as many machines as you like; it is simply a way of
identifying yourself in an anonymous fashion.
Following this, you will be asked a series of questions to help configure the
program and customize it to your needs. Simply answer each question as
appropriate, or if unsure just hit Enter to accept the default choices.
These options can be changed later as outlined towards the end of this
document. After answering the questions, the program will exit.
The next time you start the program it will remember your handle and settings
but if you need to change your handle for some reason, you can do so by
pressing 'C' when prompted just after starting the program, and then waiting a few
seconds. The program uploads any remaining protein folding data in its buffer when
it starts.
Changes to the Algorithm since Phase I
--------------------------------------
If you have been using the software previously, you will notice some changes to
the operation of the software now. The structures are now built in 'generations',
and the program always continues from the point where it left off, when it is
stopped and restarted. There are four steps to each generation:
1) Generate a set of structures (as in Phase I)
2) Minimize the energy of the best structure generated in the set
3) Upload the results to the server
4) Build a 'trajectory distribution' for the next generation based on the
structure from step 2
The trajectory distribution in step 4 is a map of protein conformational
space. In this case, it is designated in such a way that all structures
sampled in generation X are very similar in conformation to the best
structure (from step 2) in generation X-1. Thus while generation zero consists
of 10000 completely random conformations, the following generations then all
depend somewhat on the best structure out of these 10000. Generations other
than zero are much smaller, consisting of just 50 structures. This is because
all are fairly similar in conformation and less variation will occur. This
process repeats until generation 250 is reached, at which point the 'set' is
said to be complete, and the whole process starts over with a new generation
zero. This approach is more likely to lead to realistic, compact protein
shapes and is expected to find native-like conformations much faster than the
previous technique of brute force random sampling used in phase I.
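To make the generation scheme concrete, here is a toy sketch of one 'set' in Python. It is not the Foldtraj algorithm: a conformation is reduced to a single number, the energy function is invented, the energy minimization and upload steps are left out, and the names run_set, energy and spread are all assumptions made for illustration. Only the sizes (10000 random structures in generation zero, 50 near-neighbour structures afterwards, stopping at generation 250) come from the text above.

```python
import random


def energy(x):
    """Toy energy: lower is better; stands in for the real scoring function."""
    return (x - 3.0) ** 2


def run_set(generations=250, first_size=10000, later_size=50, spread=0.5, seed=0):
    """Run one 'set': a random generation zero, then near-neighbour generations.

    Returns the best conformation of each generation (generation 0 first).
    """
    rng = random.Random(seed)
    # Generation zero: completely random conformations.
    best = min((rng.uniform(-100.0, 100.0) for _ in range(first_size)), key=energy)
    history = [best]
    for _ in range(generations):
        # Later generations: sample only near the previous generation's best.
        candidates = [rng.gauss(best, spread) for _ in range(later_size)]
        best = min(candidates, key=energy)
        history.append(best)
    return history
```

Calling run_set() returns 251 values, one for each generation from 0 to 250. Note that the best of a generation can be slightly worse than the one before it, since the previous best is only used to centre the sampling.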
There are a number of side effects to the new approach that you should be
aware of, however. Firstly, only the best structure from each generation is
kept and uploaded, so uploading only occurs at the end of each generation
and never partway through (except for uploading buffered data of generations
which have already been completed, which can occur when you start up the
program). Also, at the end of each generation, a progress bar will appear
as the best structure is energy minimized. You may quit the program during
this process, and when you restart it will begin the energy minimization
from the start again but otherwise no work will be lost. The minimization
should only take 1-10 minutes depending on the structure and your computer
speed. Following this, another progress bar will appear as it builds the
trajectory distribution for the next generation. Again you may exit at any
time without problem, and it will start over at making the trajectory
distribution next time you run the program. This should generally take 2
or 3 minutes at most. Once complete, it will begin folding the new
generation of structures.
Note that because generations other than zero are not random, but are
confined to be 'near neighbour' structures to the previous best structure,
the protein chain has less freedom where it can build, and can thus
sometimes get 'stuck' for a while. When this occurs, a counter appears
at the top of the screen so you know it is still thinking. Eventually,
after a set number of tries, if it is still stuck, it will give up and
start building that structure again. If it repeatedly gets stuck, it
will automatically increase what we term 'laxness' parameters. This is
a set of three numbers which affect the tradeoff between structural
integrity and building speed. Thus when laxness is low, structures have
good geometry and few if any atomic overlaps, but could get stuck a lot.
When laxness is high, structures are able to avoid getting stuck by
permitting a few geometry violations or atomic overlap. Obviously the
goal is to keep laxness as low as possible while still avoiding getting
permanently stuck. Laxness will keep increasing until the protein
eventually gets unstuck. Laxness will decrease slightly at the start
of each new generation, but otherwise once increased, will retain the
new value for future structures as well to avoid further delays.
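The laxness behaviour described above can be pictured with a small toy model. Everything numeric here is an assumption: the readme does not say how many stalls count as 'repeatedly', how large each increase is, or how much laxness drops per generation, so stuck_limit, step and decay are made-up knobs, and LaxnessController is a hypothetical name rather than anything in the client.

```python
class LaxnessController:
    """Toy model: raise laxness after repeated stalls, relax it each generation."""

    def __init__(self, stuck_limit=3, step=1.0, decay=0.25):
        self.values = [0.0, 0.0, 0.0]   # the three laxness numbers (units invented)
        self.stuck_limit = stuck_limit  # consecutive stalls tolerated before loosening
        self.step = step                # how much each increase adds (assumption)
        self.decay = decay              # how much each generation removes (assumption)
        self._stalls = 0

    def report_stuck(self):
        """Call when a structure is abandoned after too many tries."""
        self._stalls += 1
        if self._stalls >= self.stuck_limit:
            self.values = [v + self.step for v in self.values]
            self._stalls = 0

    def report_built(self):
        """Call when a structure completes; laxness keeps its current value."""
        self._stalls = 0

    def start_generation(self):
        """Laxness decreases slightly at the start of each new generation."""
        self.values = [max(0.0, v - self.decay) for v in self.values]
```

The point of the sketch is the asymmetry: laxness jumps up only after repeated failures, stays put while structures build normally, and creeps back down once per generation.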
Lastly, as the protein is now following a directed pathway, you will be
able to view a 'protein folding movie' along with your best structure,
when you log on to the main web site, provided you have installed the
Cn3D structure viewing software, available from NCBI.
(http://www.ncbi.nlm.nih.gov/Structure/CN3D/cn3d.shtml)
According to the latest official figures, 43% of all statistics are totally worthless...
-
Joining a Team
--------------
If you wish to collaborate with others, you can now do so AFTER you have
registered. Simply login with your e-mail and password at our website, and you
will be able to select from a list of possible teams to join. You will also be
able to choose a Username for yourself, which will be used to identify you, along
with your Organization, on statistics pages and such. Each team has its own page
of statistics automatically generated for it every few minutes, so even if you are
by yourself you may want to create a team so you'll get a team page made for you.
Checking Progress
-----------------
Once you have registered, you may login using the handle or e-mail and
password you are registered with. From here, you will be able to view your
total contribution to-date to the project, and the overall progress. You may
also download the best structure you have created, and the true protein
structure, and view them with the molecular viewing program 'Cn3D', available
for download for most platforms at:
Cn3D ('see in 3-D') is a structure and sequence alignment viewer for NCBI databases that allows viewing of 3-D structures along with sequence and structure alignments.
This will allow you to see how you are doing compared to others. The
information available here may change over time, so check back often!
Communication between Client and Server
---------------------------------------
After every 'generation' of structures (several hours), information about the
structures generated is uploaded to our server. This includes their size,
similarity to the native structure, energy score and certain structural
characteristics. Also, the complete structure for the conformer which is
most similar to the native that was generated will be sent and stored on our
server. This ensures that all the best structures will be saved for later
inspection and analysis (unfortunately, we do not have space to store 10
billion complete protein structures!). At no time is any information ever
sent to or stored on your computer other than an OK or ERROR response from
the server when your computer sends us data. Also, the generated structures
and a log file are stored temporarily on your computer until they are sent
to the server, at which point they are automatically removed from your
machine. This amounts to approximately 20K of data uploaded to our server
approximately once every few hours. This should transfer in under ten
seconds even via today's slowest internet connections.
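As a rough check on the 'under ten seconds' figure, the ideal transfer time can be worked out directly. This is a back-of-the-envelope sketch: it takes '20K' to mean 20 KiB, ignores protocol overhead and latency, and the two modem speeds are my own picks for 'slowest connections', not figures from the readme.

```python
def transfer_seconds(payload_bytes, link_bits_per_second):
    """Ideal transfer time, ignoring protocol overhead and latency."""
    return payload_bytes * 8 / link_bits_per_second


payload = 20 * 1024  # "approximately 20K" taken as 20 KiB (assumption)
for name, bps in [("28.8k modem", 28_800), ("56k modem", 56_000)]:
    print(f"{name}: {transfer_seconds(payload, bps):.1f} s")
# prints:
# 28.8k modem: 5.7 s
# 56k modem: 2.9 s
```

Real uploads will take somewhat longer than these ideal numbers, but both stay comfortably inside the ten second estimate.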
The client will also check automatically for new versions of itself, and when
one becomes available, an update will automatically be downloaded and
installed (after obtaining your approval first). All updates are digitally
signed and encrypted to prevent unauthorized updates being sent. If an
update is downloaded and NOT digitally signed, you will be advised as such
and warned not to proceed with the update.
Otherwise, the new version of the client will then continue running without
further intervention required. If for some reason the update cannot be
downloaded or installed, you will be told, and either the program will
terminate, or it will continue and try again later. You can always manually
download the latest version at http://www.distributedfolding.org/
Possible Problems
-----------------
A constantly updated list of known bugs and workarounds can also be found at:
so check here first if you suspect there is a bug in your version of the
program. To check your version of the program, run the 'foldtrajlite'
executable, with no arguments, at a DOS or UNIX prompt. The first line
should have the compilation date in it.
Due to the nature of the program, an internet connection is required to send
structure data to our server. The program is smart enough to detect if the
network is down and cannot contact the server for some reason. In such cases,
it will buffer any such data and try to send the unsent data along with new
data after completing the next unit of work. Warning messages may also be
displayed on the client screen if network problems or other problems are
encountered.
Be careful to enter your handle exactly as given. If you make a mistake, the
client may refuse to run. Especially be sure not to confuse 0 with o,
and l with 1. If you find a mistake, you can change your handle as described
above under 'Running the Program'.
We have also observed that strange network problems can be caused by caching
http proxy servers. If your ISP uses one, it may possibly interfere with
communication to our server, so if you experience problems and can bypass
the proxy, doing so may fix the problem.
Using a Proxy Server
--------------------
If you don't know what a proxy server is, you probably don't need to read
this section. If your network is set up to block most outgoing traffic on
port 80, the client may not be able to connect successfully to our server
(for example, if you require a proxy server to browse the web). If you
suspect you are in this category, simply create a small text file called
"proxy.cfg" in the same directory as the client program (and this readme).
In it type two lines; the first line is the name of your proxy server and
the second is the proxy port (often 8080). If this file is found by the
client at startup, it will try to connect to our server using the proxy
specified. Here is a sample proxy.cfg:
proxy.foobar.com
8080
If your proxy server requires Basic Authentication, simply add your
username and password after the port number, like so:
proxy.foobar.com
8080
username
password
If your proxy server uses NTLM Authentication replace username above
with username.DOMAIN where DOMAIN is your NT Domain/workgroup name.
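If you generate proxy.cfg from a script, it can help to check the file before starting the client. The sketch below reads the layout described above (host, port, then optionally username and password). It is an illustration only: parse_proxy_cfg is a hypothetical helper, and skipping blank lines is my assumption, since the readme does not say how the client itself treats them.

```python
def parse_proxy_cfg(text):
    """Parse proxy.cfg contents: host, port, then optional username and password."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) not in (2, 4):
        raise ValueError("expected 2 lines (host, port) or 4 (plus username, password)")
    cfg = {"host": lines[0], "port": int(lines[1]), "username": None, "password": None}
    if len(lines) == 4:
        # For NTLM the username line would be written as username.DOMAIN.
        cfg["username"], cfg["password"] = lines[2], lines[3]
    return cfg
```

For the first sample above, parse_proxy_cfg("proxy.foobar.com\n8080\n") gives host proxy.foobar.com, port 8080 and no credentials.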
Non-interactive Auto-update
---------------------------
For your security, you will be asked for confirmation before any updates
are downloaded and installed on your computer. All updates are digitally
signed and so it is fairly safe to always allow digitally signed updates
to be installed. A malicious user would have to compromise the private
encryption key in order to "spoof" an update and make it appear to come
from us. Thus you have the option of allowing the client to automatically
accept and install digitally signed updates (and automatically refuse
unsigned ones). By default this feature is disabled and you must give
your consent for downloads to begin, because we feel the choice should be
that of the user. To enable this feature and allow automatic updating
without your intervention required, simply create a text file in the same
directory as the client, called "autoupdate.cfg" with the digit 1 on a
single line. If you change your mind, simply remove the file, and it will
revert back to its default behaviour the next time you run it.
-
Quiet Mode
----------
To run the client in quiet mode (i.e. no output whatsoever to the terminal),
simply edit the foldit script (foldit.bat on Windows) and add "-qt" (without
the quotes) to the end of the line containing "foldtrajlite -f protein -n
native". This may be useful if you are, for example, starting jobs remotely.
You can then terminate jobs by removing foldtrajlite.lock or hitting the 'Q'
key (the latter works on Windows only however). It is recommended you select
"E-mail me when updates are available" on the logged-in page at
http://www.distributedfolding.org if you are running in quiet mode, or else
you won't be aware when a new update is required. Alternatively, see
non-interactive auto-update above. To check progress while in quiet mode, a
file called "progress.txt" will be written to the directory where the program
is installed so you can monitor its progress still. You may customize this
further by adding '-g ###' where ### is a number, to the foldit script.
This is how often (in number of structures) you want this progress.txt to
be updated, to avoid excessive disk writing. If you put '-g 0' this will
disable the progress.txt output altogether.
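Editing the foldit script by hand is easy enough on one machine, but if you manage several it may be worth scripting. Here is a minimal sketch of the quiet-mode edit: it finds the line containing "foldtrajlite -f protein -n native" and either turns an existing -qf into -qt or appends -qt. The function name enable_quiet_mode is hypothetical, and the output always uses plain newline line endings, which is a simplification for foldit.bat.

```python
MARKER = "foldtrajlite -f protein -n native"


def enable_quiet_mode(script_text):
    """Return the script text with -qt set on the foldtrajlite line."""
    out = []
    for line in script_text.splitlines():
        if MARKER in line:
            tokens = line.split()
            if "-qf" in tokens:
                # Quiet mode explicitly off: flip it on.
                line = line.replace("-qf", "-qt")
            elif "-qt" not in tokens:
                # No quiet option yet: add it to the end of the line.
                line = line.rstrip() + " -qt"
        out.append(line)
    return "\n".join(out) + "\n"
```

Running it twice gives the same result as running it once, so it is safe to apply to a script that is already in quiet mode. Keep a backup of the original script before overwriting it.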
Disabling Network access/Running offline
----------------------------------------
If you are running on a machine behind a strong firewall, or a machine with
modem access to the internet, or no access at all, you may wish to tell the
client not to access the network. In this case, all attempts to access the
internet will be bypassed. Simply add '-i f' to the foldit script as
described above for other optional features. Eventually you will need to
either connect to the internet and remove the '-i f' option, or copy the
entire directory to another machine which is connected to the internet to
allow results to be uploaded. Then after the upload, copy the remnants
back to the original machine to a fresh directory to continue folding (and
delete the old directory from the offline machine now that it has been
uploaded).
Upload Only Option
------------------
If you add '-u t' to the foldit script, the client will perform all normal
startup procedures such as checking for new versions and uploading buffered
data, but then it will exit before starting to generate any structures.
This may be useful for people making 3rd party interfaces to the software
or writing other sorts of scripts. The 'average' user probably will not use
this option.
Changing the priority of the client [Advanced Users]
----------------------------------------------------
You can change the default priority of the client by modifying the foldit
script as described above for quiet mode.
Simply add the switch '-p ##' where ## is the desired priority. This can
range from -20 (extremely aggressive; will pre-empt most other tasks) to
the default of 20 (very passive, only runs when the CPU is idle). On UNIX
the full range from -20 to 20 is available. Windows defines three process
priority levels: Low, Normal and High. Within each of these are several
thread priority levels which allow 'fine tuning'. Thus the same range
[-20, 20] is used on Windows and mapped accordingly. Use 15 for Low, 0
for Normal and -15 for High on Windows. Other values will more finely
tune the priority further.
Changing Upload Frequency [Advanced Users]
------------------------------------------
You can no longer control the upload frequency; it is fixed to once per
generation of structures.
Running as a service [Windows NT/2000/XP only]
-----------------------------------------------
The Windows version of the client can be set up to run as a normal Windows NT
service, so that it will start automatically whenever your computer starts up
and run in the background until you shut down. Structures will be uploaded
every generation still, and buffered for later upload at shutdown (so your
shutdown procedure will not be delayed). It runs at low priority so other
tasks will push it out of the way and so it should not noticeably slow your
machine down. Be warned that it will only upload after making a generation
of structures (a few hours, depending on your machine speed) so make sure you
are connected to the internet when using the client as a service. Also,
updates will automatically be accepted (provided they are digitally signed) in
this mode (regardless of whether autoupdate.cfg is present or not) and a large
amount of data may be buffered if you do not connect to the network for a long
period of time. To check progress while running as a service, a file called
"progress.txt" will be written to the directory where the program is installed
so you can monitor its progress still.
To set up the program to work as a service:
1. Open a DOS prompt in the directory where you installed the client.
2. If you have not already done so, run the program once (by typing 'foldit'
in the directory where the software is installed) and enter your
8 character handle and answer the other configuration questions it
asks.
3. Type: foldtrajlite /install (and hit Enter)
4. The next time you reboot your machine it will start automatically
To remove the service:
1. Go to Administrative Tools -> Services
2. Find and Stop the service called Distributed Folding Project
3. Open a DOS prompt in the directory where you installed the client.
4. Type: foldtrajlite /remove (and hit Enter)
5. It will not be run on your machine automatically anymore
You may install a second instance of the service by changing step 3 above
from 'foldtrajlite /install' to 'foldtrajlite /install2'. Similarly, to
remove this second instance, change step 4 to 'foldtrajlite /remove2'.
Note that if you wish to run two instances, you must install two full
copies of the software in separate directories and install one using
/install and the other /install2. Not following this procedure correctly
will result in unpredictable behaviour. Also there is no benefit to
installing two copies of the service unless you are using a dual-processor
machine.
Configuring the service [Windows NT/2000/XP only]
-------------------------------------------------
You should have a SERVICE.CFG file in your client directory after
following the above steps. If it is not there, remove and re-install
the service. The first line of this file must contain a number
identifying whether it is copy 1 or copy 2 of the service. Next, you
may include any of the following options in any order if you wish:
useram=1       makes use of extra RAM for improved speed (see -rt option
               elsewhere in this readme); default is useram=0
priority=###   where -20<=###<=20 sets the priority of the task just like
               the -p option (20=passive, -20=aggressive); default is 20
progress=###   where 0<=###<=100 sets the update interval for the
               progress.txt file, like the -g option (0=disable);
               default is 50
connect=0      disables internet usage - use this if you are on dial-up
               or temporarily unable to access the internet for some
               reason; you must eventually remove this to upload your
               data, but your work will be buffered until then; works
               the same as the -if option; default is connect=1
uploadonly=1   behaves as the -ut option, instructing the program to
               upload all buffered work and then terminate. Do NOT
               combine with the connect=0 option! Default is
               uploadonly=0
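Before restarting the service it can be handy to sanity-check SERVICE.CFG. The sketch below applies the rules listed above: a copy number on the first line, then key=value options with the stated defaults and ranges, and a refusal to combine uploadonly=1 with connect=0. It is not the client's own parser; parse_service_cfg is a hypothetical helper, and treating unknown keys as errors is my assumption.

```python
DEFAULTS = {"useram": 0, "priority": 20, "progress": 50, "connect": 1, "uploadonly": 0}
RANGES = {"useram": (0, 1), "priority": (-20, 20), "progress": (0, 100),
          "connect": (0, 1), "uploadonly": (0, 1)}


def parse_service_cfg(text):
    """Parse SERVICE.CFG: first line is the copy number, then key=value options."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("file is empty")
    copy = int(lines[0])
    if copy not in (1, 2):
        raise ValueError("first line must be 1 or 2")
    cfg = dict(DEFAULTS, copy=copy)
    for ln in lines[1:]:
        key, _, value = ln.partition("=")
        key = key.strip().lower()
        if key not in RANGES:
            raise ValueError(f"unknown option: {key}")
        number = int(value)
        low, high = RANGES[key]
        if not low <= number <= high:
            raise ValueError(f"{key} must be between {low} and {high}")
        cfg[key] = number
    if cfg["uploadonly"] == 1 and cfg["connect"] == 0:
        raise ValueError("uploadonly=1 must not be combined with connect=0")
    return cfg
```

A file holding just the line "1" parses to copy 1 with every option at its default.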
Running Non-interactively or on a Cluster [Advanced Users]
----------------------------------------------------------
You may wish to run the client in a mode requiring absolutely no user
interaction. This is especially convenient if you want to run it on, for
example, a Beowulf Cluster. For your convenience we have made it easy to do
so. See also the section above on "Non-interactive auto-update".
To run on a cluster computer (and we encourage you to do so if you have
access to such a set of machines):
-
1. Decompress the package on one node as described above, and add proxy
information as described above if necessary. Run the program once by
typing 'foldit' at the command prompt, and enter your 8-character
handle that you received upon registration. Answer the other configuration
questions as well. Be sure to answer 'N' to 'Run in quiet mode (no output
to terminal)?' so you can see output for step 1a.
1a. Try running the client on the node to ensure it starts without any input
from you. If any updates are available, for example, you will have to
give it permission to download and install them. Once you have ensured your
client is up to date, manually edit the foldit script (foldit.bat on Windows)
and on the line containing 'foldtrajlite -f protein -n native', either change
the '-qf' option to '-qt', or add '-qt' (if there is no '-qf' present). Then
proceed to step 2.
2. Now distribute the directory to the other nodes of your cluster (presumably
you have some script(s) to efficiently do this). Each node will need a
complete copy of the client software.
3. Now start the "foldit" script on each machine (again, you presumably have
some scripts to do this efficiently). It will run until you stop it as
described in step 4, or until a mandatory update is released. In the latter
case the update will NOT be installed automatically (unless you have enabled
non-interactive autoupdate, described earlier); the client will simply exit
once you have performed step 4. You must then manually run the client on a
node, give permission to install the update, and then go back to step 2.
4. To halt the client on one or all the nodes, simply delete the file
"foldtrajlite.lock" which will be found in the same directory as all the other
files. The program will then terminate gracefully (could take up
to two minutes but normally only a couple of seconds, so please be patient).
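Step 4 is easy to automate if the client directories are all visible from one machine. The sketch below assumes exactly that (for example, each node's copy lives on a shared filesystem); stop_clients and the directory paths you pass in are illustrative only. All it does is delete foldtrajlite.lock wherever it finds one.

```python
from pathlib import Path


def stop_clients(client_dirs, lock_name="foldtrajlite.lock"):
    """Delete the lock file in each directory; return the directories that had one."""
    stopped = []
    for d in client_dirs:
        lock = Path(d) / lock_name
        if lock.exists():
            # Removing the lock asks that client to terminate gracefully.
            lock.unlink()
            stopped.append(str(d))
    return stopped
```

Directories without a lock file are skipped, so the returned list tells you which nodes actually had a client running. Remember that each client may take a little while to exit after its lock disappears.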
Making it run faster [Advanced Users]
-------------------------------------
The software can be made to run roughly twice as fast, at the cost of using
more memory. Normally the software uses about 25MB RAM while running, but
if you add the '-rt' option to the foldit script, the client will then use
up to 150MB RAM, but run twice as fast. Make sure you have enough free
RAM (and NOT virtual memory) before using this option or you will not benefit
from the speed improvement. If you are running the Windows NT/2000/XP
service mode of the client, edit the service.cfg file (you may have to
/remove and then /install the service if this file is missing in your
folding directory). The service.cfg normally contains a single line with a
number (to tell it whether it is copy #1 or copy #2 of the service). Add a
second line to this file with the text 'useram=1' (without the quotes) in
order to tell the service to use up to 150MB RAM as well.
Running a benchmark [Advanced Users]
------------------------------------
At the request of many users, we have incorporated a benchmarking scheme
directly into the client software. This is a standard test, which should
always return the same results, when run on the same machine, and serves
as a rough indicator as to the relative performance of the software on
that machine. To run the benchmark, simply open a DOS/UNIX window and
at the command prompt, type 'foldtrajlite -bench' and hit enter. Make
sure no other programs are running at the same time if you wish to get
an accurate reading. It will run for up to 15 minutes (depending how
slow or fast your machine is) and then output a table of four numbers
like so:
            Usr time   Sys time
            --------   --------
Maketrj        8.030      1.050
Foldtraj      47.200     13.460
These are timings, in seconds, with smaller being better (faster). Total
CPU time is the sum of Usr and Sys time. The Foldtraj timings correspond
to the building of actual structures, while the Maketrj timings are
somewhat less important and reflect the time taken to make new trajectory
distributions at the end of each generation.
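If you want to compare machines, the total CPU time per stage is the number to look at. Here is a small sketch that reads a table in the layout shown above and adds Usr and Sys time for each row; parse_benchmark is a hypothetical helper, and it assumes every data row is a name followed by two numbers. For the sample figures the totals come to 9.080 seconds for Maketrj and 60.660 seconds for Foldtraj.

```python
def parse_benchmark(table_text):
    """Return {stage: (usr, sys, total)} from benchmark table text."""
    results = {}
    for line in table_text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # header and dashed rule do not have three fields
        name, usr, sys_time = parts
        try:
            usr, sys_time = float(usr), float(sys_time)
        except ValueError:
            continue  # not a data row
        # Total CPU time is the sum of Usr and Sys time.
        results[name] = (usr, sys_time, round(usr + sys_time, 3))
    return results
```

Smaller totals are better, and the Foldtraj row is the one that matters most for everyday folding speed.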
About the Client
----------------
Sure, we could give you a big flashy 3-D interactive screensaver showing the
actual protein being built up and letting you rotate it around in 3-D space as
it is being built, and it would probably look really cool. In fact we have
done this for Windows (available on the same site you downloaded this client).
But we want to spend your CPU cycles where they count, in the actual
scientific computation. Hence we recommend that even Windows users use this
text-based client instead. Here we have opted for a more "abstract"
representation of the folding protein, using color ASCII art. While not
looking like a true protein, it should still give you an idea of what is going
on inside your computer's head while the program is running. Each structure
looks completely different, and each one is only a single sample of an
extremely enormous conformational space. As a bonus, since the client runs in
text mode, you can even run it through a Telnet session or a dumb terminal!
Final Words
-----------
Please visit the official Trajectory Directed Ensemble Sampling (TraDES) web
site at:
to learn more about protein folding and Foldtraj. Visit the Distributed
Folding Project web site for more info about this client:
Please direct any technical support questions, comments or suggestions to us at:
trades@mshri.on.ca
-
Oh and use dfGUI http://gilchrist.ca/jeff/dfGUI/
-
Thanks Guru! Might give it a try when I get a chance...
-
Thank you.
Ah, no wonder why we've dropped behind.
Since I'm unemployed right now, you've only got my free home machines. (Damn stupid graduation)
2xAMD MP 2200+ 1GB DDR W2K SP2 (Use RAM)
P4 1.8 1GB DDR WXPPro SP1 (Use RAM)
Hmmm... looks like I'm about to overtake Novdid by tonight if Novdid doesn't join the race again.
J1NG
-
Relax! These machines are only on when I'm awake. Which means only for 12 hours each day are they running. It'll take me at least till Friday to reach your current score.
Hmmm, there's something weird at the folding team statistics page. According to my login section there's 33 members of matroxusers, yet it only lists 31. Who's the other two?
And it looks like TomNuttall has joined in too!
J1NG