Tutorial:alltut
Modeling books in Dropbox
https://www.dropbox.com/sh/g3myhes5y1rd6sv/AAAW2FsReOuamNvVWHjqzEA9a?dl=0
Running Gaussian QM jobs on TACC
On Stampede, each node in the "normal" queue has 16 cores and 32 GB of memory. Each job can run for up to 2 days.
There are an additional 16 nodes in the "largemem" queue, each with 1 TB of memory and 32 cores.
Create a submission script for each job. Only the job name, the name of the standard output file, the number of cores, and the last line need to be changed for each job:
For example: qm.job:
#!/bin/bash
#----------------------------------------------------
# Example SLURM job script to run MPI applications on
# TACC's Stampede system.
#
# $Id: job.mpi 1580 2013-01-08 04:10:50Z karl $
#----------------------------------------------------
#SBATCH -J dmpwat # Job name
#SBATCH -o dmpwat.%j.out # Name of stdout output file (%j expands to jobId)
#SBATCH -p largemem # Queue name
#SBATCH -N 1 # Total number of nodes requested (16 cores/node)
#SBATCH -n 16 # Total number of mpi tasks/cores requested. If you want to run 2 QM jobs on one node, this needs to be 8.
#SBATCH -t 48:00:00 # Run time (hh:mm:ss) - 48 hours
#SBATCH -A TG-MCB100057 # <-- Allocation name to charge job against
# Load the Gaussian module and launch g09
module load gaussian
module list
g09 < DMPwat-mp2.com > DMPwat-mp2.out
To submit the job, log into Stampede:
ssh myusername@stampede.tacc.utexas.edu
It is best to run jobs in your work directory:
cdw
pwd
mkdir DMP
cd DMP
vi qm.job   # edit the content as in the example above
In the *.com file, specify the Nproc and mem accordingly. For example:
Nproc=8 and mem=16GB will allow you to run two QM jobs on each node in the "normal" queue.
Nproc=16 and mem=32GB will work for one QM job on each node in the "normal" queue.
Nproc=16 and mem=300GB will work for two QM jobs on each node in the "largemem" queue.
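For example, the top of a .com file for one of the 8-core runs might look like the sketch below (the route line and title are illustrative only; newer Gaussian versions spell the core count %NProcShared):

%nprocshared=8
%mem=16GB
#p mp2/6-311++G(2d,2p) opt

DMP-water complex (illustrative title)

0 1
(geometry lines follow, then a blank line)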
To submit the job
sbatch qm.job
To check the status
squeue -u myusername
To cancel the job
scancel jobid_from_squeue
You can write a script to submit several jobs via a loop, as in the sketch below.
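A minimal sketch of such a loop (assuming one directory per job, each containing its own qm.job; the directory names here are hypothetical):

#!/bin/bash
# submit the qm.job script found in each job directory
for d in DMP DEP DPP; do
    cd $d
    sbatch qm.job
    cd ..
done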
propka
Download from
https://github.com/jensengroup/propka-3.1
Installed on bme-pluto (python 2.6) and can run on node100, etc.
python setup.py install --user --install-scripts ~pren/bin
To run:
~pren/bin/propka31 1hpx.pdb
propka31 -help
Output has pka for each residue:
SUMMARY OF THIS PREDICTION
  Group        pKa   model-pKa   ligand atom-type
  ASP  15 X    2.90    3.80
  ASP  30 X    4.37    3.80
  ASP  31 X    4.02    3.80
From the above pKa values, you can estimate that at pH 7 all three ASP residues are negatively charged. At pH 4, the bottom two may be neutral.
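The reasoning step here (a standard relation, not from the original page) is the Henderson-Hasselbalch equation, pH = pKa + log10([A-]/[HA]): when the pH is well above a group's pKa (7 vs. 2.90-4.37 here) the deprotonated, negatively charged form dominates, while at pH 4 the two ASP with pKa near 4 have a substantial neutral fraction.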
Brownian Dynamics and nanoparticle
Current working directory for this project is: /users/gchattree/Research/brownian/
Email at gaurav.chattree@utexas.edu for questions.
-------------
Brownian Dynamics Tutorial Using Browndye:
Note:
There are two versions of Browndye installed.
Strangely, you will need them both:
/home/gchattree/Research/browndye/bin/ and /home/muxiao/software/browndye/bin/
I use the * character extensively to represent the parts of a file name that differ for each simulation. In general * = molecule name, but it varies depending on context.
Browndye manual is found: http://browndye.ucsd.edu/browndye/doc/manual.html
Browndye tutorial is found: http://browndye.ucsd.edu/browndye/thrombin-example/tutorial12.html
A sample directory can be found: /users/gchattree/Research/brownian/spring-2013/trastuzumab/gold-ninth-A1-1/
--
1. Download your two pdb's. Give them both succinct names.
2. Go to: http://kryptonite.nbcr.net/pdb2pqr_1.8/ to convert these pdb files to pqr.
-Note, many issues can occur here. Any residue label that has a letter in it, such as 99B, will be ignored. Change these residue labels to a number. 99B -> 999.
-HETATMs are also not usually recognized and have to be added to the pqr manually.
-Be sure to generate the APBS input file as well.
-Download both the *.in file and the *.pqr file
-Look over the *.pqr file to make sure everything looks okay.
-Delete all the lines containing just TER from the file
3. Now open the *.pqr file in vim and run the following command: :%s/\([0-9]\)-\([0-9]\)/\1 -\2/
-This adds a space before negative numbers in case columns were run together during pdb2pqr.
4. Certain nodes have APBS installed; one of these is bme-earth. On one of these nodes run: apbs *.in
-If this takes much too long or there is a memory error, edit the dime line to something smaller, such as: dime 61 61 61.
-Check the apbs output in pymol using the APBS visualizer tool. There is a chance that some of the potential map got cut off. In that case increase the size of the cglen line in the *.in file and re-run apbs.
-If you want more info about the keywords in the APBS input file, the APBS website has a good faq.
5. Now rename the APBS output file. I usually do something like: mv *.dx *-PE0.dx
6. Now run pqr2xml to create an atoms xml file as follows: /home/gchattree/Research/browndye/bin/pqr2xml < *.pqr > *-atoms.xml
7. Now the annoying part: you must create a pairs.xml file, deciding which pairs will count towards the reaction occurring. I use pymol's view-hydrogen-bonding tool to decide on these atoms.
-Example pair file can be found: /users/gchattree/Research/brownian/spring-2013/trastuzumab/gold-ninth-A1-1/pairs.xml
-You can also use the default command to create the file: /home/gchattree/Research/browndye/bin/make_rxn_pairs -mol1 molecule1.pqrxml -mol2 molecule2.pqrxml -ctypes contacts.xml -dist 5.0 > pairs.xml
8. Now you make the reactions file (*-rxns.xml). Basically here you just define what distance between each pair will constitute as binding, and how many bound pairs are needed to count as a reaction (keywords 'distance' and 'nneeded' respectively). Example command: /home/gchattree/Research/browndye/bin/make_rxn_file -pairs pairs.xml -distance 4.0 -nneeded 2 > mol0-mol1-rxns.xml
9. Now copy over the following input file and change the file names as necessary and make other edits as you see fit (such as changing 'n-threads'): /users/gchattree/Research/brownian/spring-2013/trastuzumab/gold-ninth-A1-1/input.xml
10. Now run /home/muxiao/software/browndye/bin/bd_top input.xml to create all the necessary input files.
-must use the latest bd_top binary which is found in muxiao's directory
-If this step takes too long you may have to kill it and alter the size of your system.
-IMPORTANT: You may want to check the *-surface.xml file after this runs. If there seems to be way too few atoms on the surface, surface_spheres may have to be run manually with a different probe size, and then bd_top should be rerun.
--visualize the surface as follows:
a. cat gold_A1-atoms.xml | surface_spheres -probe_radius 4.0 > gold_A1-surface.xml (in this example 4.0 is set as probe size. default is 1.6)
b. make_surface_sphere_list -surface gold_A1-surface.xml -rxn1 R-gold_A1-rxns.xml -spheres gold_A1-atoms.xml > gold_A1-surface-atoms-hard.xml
c. cat gold_A1-surface-atoms-hard.xml | make_soft_spheres > gold_A1-surface-atoms.xml
d. vim gold_A1-surface-atoms.xml
--in vim, run: :%s/<\// <\// and then :%s/>/> /
e. now run: /home/gchattree/Research/brownian/scripts/find_surface.py
-- open find_surface.py in vim first and edit the file names as needed
11. Now run the simulation: /home/gchattree/Research/browndye/bin/nam_simulation *-simulation.xml &
-Wait a few days.
Progress can be measured two ways:
A. Open the 'results.xml' file in vim
B. cat results.xml | /home/gchattree/Research/browndye/bin/compute_rate_constant
Option B provides information on the current state of the rate constant that is being calculated.
Output can be visualized as follows:
1. for i in {1..100}; do process_trajectories -traj traj.xml0.xml -index traj.xml0.index.xml -n $i | cat > ptraj.xml; xyz_trajectory -mol0 mol0-atoms.xml -mol1 mol1-atoms.xml -trajf ptraj.xml >> traj.xyz; done
2. vmd traj.xyz
Adjust the above command as needed. As written, it adds the first 100 end states calculated by the first thread to traj.xyz. All of these states can be viewed as a movie in vmd.
Creating the gold nanoparticle (AuNP) and conjugating molecules to the surface:
1. Go to /home/gchattree/Research/brownian/scripts/
2. Run creategold.py with two arguments: the diameter and the output file name.
-example: python creategold.py 40 gold_out.xyz
-gold_out.xyz should then be an AuNP about 40 angstroms in diameter
-This script can take quite a while to run depending on the size of the AuNP
-Optional: run either of the creategoldshell files to turn the AuNP into just a shell.
-Open the file, adjust the file names as needed, and then run it.
-The smooth creategoldshell file can take quite a while to run.
3. Use openbabel to convert from xyz to pdb
-example: babel -ixyz gold_out.xyz -opdb gold_out.pdb
4. Trim the molecule if it is too large using the trimgold.py script
-Open the file to adjust file names as necessary
-The trim_general.py script works for trimming molecules other than gold as well. It needs the *-pairs.xml file to make sure it does not trim an atom belonging to a pair.
5. Now run attach.py to attach molecules to the gold particle's surface
-Open the file and set all the parameters as necessary. There are many parameters to be set.
-It is currently set up for attaching 9 Trastuzumabs (A1) to a gold nanoparticle of diameter ~18nm
-A2 = Pertuzumab
-Open the output in pymol to make sure everything is okay.
6. Go to http://kryptonite.nbcr.net/pdb2pqr_1.8/ to convert the pdb files to pqr.
-All the gold atoms will be stripped during this step
-must be added manually
--How to add manually:
--a. Open the newly created AuNP *.pqr file and the original AuNP *.pdb file using vim's vsp command.
--b. Copy over all of the gold atom lines from the *.pdb file to the *.pqr file
--c. Select all of the gold atom lines in the *.pqr file using Shift+v and then run the following command: s/ 1.00 0.00/ 0.00 1.35/
---The gold atom lines should look like this now:
---HETATM 1 AU LIG 1 11.803 -22.201 31.558 0.00 1.35 Au
--d. The ids of the atoms are all wrong now, however. To find the new ids, open the script find_newIds.py, edit the parts marked by #set (such as file names and starting numbers), and then run it.
--e. Now do a copy-and-replace of these new ids over the old ones (or renumber directly, as in the sketch below).
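If you prefer to renumber in one pass, a hypothetical awk alternative (pqr files are whitespace-delimited, so the reflowed columns are usually acceptable, but verify the output; the file names here are placeholders):

# renumber the serial field (column 2) of every ATOM/HETATM record sequentially
awk '$1=="ATOM" || $1=="HETATM" { $2 = ++n } { print }' combined.pqr > renumbered.pqr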
Current Status:
The various runs can be found in /brownian/spring-2013/trastuzumab
As you can see, the current results suggest no significant change in the association constant using this AuNP conjugation.
Problems here involve how the surface is found. The surface probe size could be too big.
The results also don't make intuitive sense: we would expect greater association constants with so many antibodies in close proximity.
Where to go now:
A smaller gold shell (smaller diameter) so there is less surface area.
More trim (The more you trim the larger you have to make the probe size for the surface_spheres command)
Faster computer
Manually adjusting inside_points so that it can use multiple cores
Having many receptors attached, as on a cell membrane
pymol tips
To show the virus capsid, i.e. the whole biological assembly as opposed to a single asymmetric unit:
set all_states, 1
qchem
qchem 5
qchem 5 can be run on node40, 41, 42, 142 and g2-node38 (or node38).
We needed to install the openmpi-related rpms and set the LD path (see below): yum install -y *openmpi*
You can run multiple jobs on each node using 4-8 cores.
setenv QCSCRATCH /opt/scratch/qchemscratch
setenv QCLOCALSCR /scratch/qchemscratch
setenv QCMPI mpich
source /opt/qchem/5/qcenv.csh
setenv LD_LIBRARY_PATH /usr/lib64/openmpi-1.10/lib/:$LD_LIBRARY_PATH
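Then run a job in the usual way, e.g. (input/output names are placeholders; -nt threading works as in the 4.2 examples below):

qchem -nt 4 job1.in job1.out &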
qchem 4.2
Qchem (4.2) can be run on the following nodes (all cores):
node50-57, 8 cores each, 7 GB mem each. node53 and node55 are dead now (9/2016).
Also work on these nodes with large memory:
node35 node36 node37 node38 genome
The max memory that can be used on these nodes is set to 50GB for now. If you need more, please let me know.
Never set the memory in the qchem input to more than the actual memory (free -g) on each node.
(mem & disk are in MB):
In /opt/qchem/4.2/config/preference, MEM_TOTAL is set to 50GB
v 4.2
July 2015 (last version of the free upgrade): change the 4.1 setup below to run the multithreaded version. The MPICH version is in /opt/qchem/4.2mpi
setenv QCSCRATCH /opt/scratch/qchemscratch
setenv QCLOCALSCR /scratch/qchemscratch
setenv QCMPI mpich
source /opt/qchem/4.2/qcenv.csh
Then (see /home/pren/QuantumMechanics/qchem/ for example inputs)
qchem -nt 6 n60.in n60.out &
v 4.1
User manual: http://www.q-chem.com/qchem-website/manual_4-1.html
setenv QCSCRATCH /opt/scratch/qchemscratch
setenv QCLOCALSCR /scratch
#setenv PBS_NODEFILE /home/pren/qchemnode
setenv QCMPI mpich
source /opt/qchem/4.1/qcenv.csh
to run multi-thread:
qchem -nt 6 n60.in n60.out &
before 4.1
Source the following if you want to use version 4.0 (bash shell):
- QCHEM #qchem
QC=/work/opt/qchem4.0
export QC
export OMP_NUM_THREADS=1
QCAUX=$QC/aux;
export QCAUX
QCPLATFORM=LINUX_Ix86_64;
export QCPLATFORM
QCSCRATCH=/opt/scratch/qchemscratch;
export QCSCRATCH
export QCLOCALSCR=/scratch
export QCMPI=mpich
export QCRSH=/usr/bin/ssh
export PBS_NODEFILE=/home/pren/qchemnode
noclobber=""
export PATH=/work/opt/qchem4.0/bin/:$PATH
if [ -e $QC/bin/qchem.setup.sh ] ; then
$QC/bin/qchem.setup.sh
fi
Source the following if you want to use version 3.2:
setenv QC /opt/qchem-para
if (-e "$QC"/bin/qchem.setup) source $QC/bin/qchem.setup
setenv QCAUX $QC/aux
setenv QCPLATFORM LINUX_Ix86_64
setenv QCSCRATCH /opt/scratch/qchemscratch
setenv QCLOCALSCR /scratch
setenv QCMPI mpich
setenv QCRSH /usr/bin/ssh
setenv noclobber ""
setenv PBS_NODEFILE /home/pren/qchemnode
To run (example inputs in /home/pren/QuantumMechanics/qchem):
qchem n60.in n60.out &
qchem -pbs -np 7 2huw-sp.in 2huw-sp.out &
mpich2
mpdboot debug
Cleanup first:
/usr/local/sbin/rcom "/opt/mpich2-fc13-p26/bin/mpdcleanup"
Do this from nova; it will clean up your mpd on each node. Or:
ssh nodexxx "/opt/mpich2-fc13-p26/bin/mpdcleanup"
~/.mpd.conf has the user's secretword; chmod 600 ~/.mpd.conf
Run df -h to make sure the disk is not full.
Stop iptables (the firewall).
Make sure the hosts file has all the nodes.
Make sure you can ssh into the nodes.
On the node where you have trouble, you can check with "mpdtrace -l" whether it is clean (error: "mpdtrace: cannot connect to local...").
Network
Make sure you can ssh. The hosts file has all the nodes; the top line should be "127.0.0.1 localhost.bme.utexas.edu localhost".
Disable the firewall:
/sbin/chkconfig --list iptables
Disable NetworkManager!
/sbin/chkconfig --list NetworkManager
/sbin/chkconfig --level 345 NetworkManager off
/etc/init.d/NetworkManager stop
/etc/init.d/network restart
You may need to add to resolv.conf
nameserver 10.0.0.1
Make sure the hostname is correct
less /etc/sysconfig/network
mpd &   # here you may see error messages about hostname, IP, etc.
You may have to reboot.
manual
Syntax
mpdboot [ -n <#nodes> ] [ -f <hostsfile> ] [ -h ] [ -r <rshcmd> ] \
        [ -u <user> ] [ -m <mpdcmd> ] [ --loccons ] [ --remcons ] \
        [ -s ] [ -d ] [ -v ] [ -1 ] [ --ncpus=<ncpus> ]
or
mpdboot [ --totalnum=<#nodes> ] [ --file=<hostsfile> ] [ --help ] \
        [ --rsh=<rshcmd> ] [ --user=<user> ] [ --mpd=<mpdcmd> ] \
        [ --loccons ] [ --remcons ] [ --shell ] [ --debug ] \
        [ --verbose ] [ -1 ] [ --ncpus=<ncpus> ]
Arguments
-h, --help                  Display help message
-d, --debug                 Print debug information
-v, --verbose               Print extra verbose information; show the rshcmd attempts
-n <#nodes>, --totalnum=<#nodes>    Number of nodes in mpd.hosts on which daemons start
-r <rshcmd>, --rsh=<rshcmd>         Specify remote shell to start daemons and jobs
-f <hostsfile>, --file=<hostsfile>  Path/name of the file that contains the list of machine names on which daemons start
-1                          Remove the restriction of starting only one mpd per machine
-m <mpdcmd>, --mpd=<mpdcmd>         Specify the full path name of mpd on the remote hosts
-s, --shell                 Specify shell
-u <user>, --user=<user>    Specify user
--loccons                   Do not create local MPD consoles
--remcons                   Do not create remote MPD consoles
--ncpus=<ncpus>             How many processors to use on the local machine (the rest run on other nodes)
Use /opt/mpich2 or /opt/mpich2-fc7 for parallel MPI jobs.
If you have trouble with mpdboot, it is typically because the disk is full, which is caused by an egregious amount of error messages from previous MPI jobs. Or see the debug list under MPICH2 above.
Please always do this when you run a parallel job using MPI, to get rid of the MPI error messages:
"mpirun ....your program... 2> /dev/null"
The 2 means standard error, which is redirected to /dev/null (a black hole) instead of the system log.
If you have trouble running mpdboot (which is a script that starts mpd), you can manually start mpd, following the mpich2 installation guide:
mpd &          # starts the local daemon (i.e. on node18)
mpdtrace -l    # makes the local daemon print its host and port in the form <host>_<port>
Then log into the second machine and do:
mpd -h <hostname> -p <port> &
Note you can add --ncpus=4 to mpd on both nodes as well.
Here is an example (put /opt/mpich2-fc7/bin in your PATH). Log into node18:
mpd --ncpus=4 --listenport=55551 &
log into node19
mpd -h node18 -p 55551 --ncpus=4 &
log into node21
mpd -h node18 -p 55551 --ncpus=4 &
Try mpdtrace to see which nodes are in the mpd ring. You should see node18, node19 and node21. Note node18 is the head node here, and every other node connects to it. 55551 is a port number I picked, but you can change it; ncpus is also a variable. mpd -h gives more info.
Gaussian
https://biomol.bme.utexas.edu/tinkergpu/index.php?title=Tinkergpu:Software-list#Gaussian
Simulation
Chemistry
compute cd spectrum from QM
Orca offers CIS and RDFT but the spectrum part is broken (author notified; it will be fixed in the next release). G03 offers TDDFT: #B3LYP/6-31+G* TD(NSTATE=60)
The NSTATE can be determined by a quick test run. The initial state can also be changed (see g03 manual).
The output (log file) will list the different states (eV/nm) and rotational strength R. /opt/software/CHEM/cdspectrum/ has the software to fit the data into a spectrum.
vi x   (get the rotational strength R from the g03 log file: copy the log to x and delete everything but the velocity R)
grep Singlet-A td.log | cut -c35-44 > x2   (get the eV)
paste x2 x > td.rot
vi td.rot   (add a header according to cdspectrum; see righttd.rot)
/opt/software/CHEM/cdspectrum/plotspecTM < td.rot
see example in:
/home/pren/cd-qm/helix/g03/631+g and /home/pren/cd-qm/dpd (the gpt file shows the conversion from eV to nm)
GABEDIT can be used to extract and visualize the CD results.
Docking
DATABASE
- The small molecule databases for Docking or Virtual Screening have been put in /opt/database directory
ChemBridge: /opt/database/ChemBridge
1. EXPRESS-Pick™ Database: EXPRESS-Pick October 2008.sdf or EXPRESS-Pick October 2008.db
2. KINASet Database (ATP analogue): KINASet.sdf or KINASet.db. This is the same as kinaset in ChemBridge. KINASet: New_custom_Kinaset_3D_05072007 as an sdf, db or zip file (/opt/database/chembridge/kinaset)
3. NOVACore/PHARMACophore Combined Database: NCL-PCL October 2008.sdf or NCL-PCL October 2008.db
4. GPCR and KINACore Library databases: GPCR September2008.zip
5. FOCUSCore Library database: FocusCore_September2008.zip
Diversity sets (non ATP analogue):
6. NCI (National Cancer Institute) Diversity Set: NCI-Open_09-03.sdf.gz (/opt/database/NCI-diversity)
7. Maybridge Chemical Company (England): MayBridge.sdf (/opt/database/maybridge)
- pdbbind: /opt/reprints/benchmark/pdbbind/all-core-10-2008.xls
- binding DB (Gilson)
Small Molecule Preparation
If you need to convert the structures from 2D to 3D, add hydrogens, and optimize:
1) You can use babel to add hydrogens, generate the 3D structure, and optimize. An example is in /opt/database/chembridge/DIVERSet-06/2d-3d-test/
2) Alternatively, you can use the OpenEye software to do this. Chunli can help you with this.
3) Or use my script in /opt/database/sdfbatch.pl
The first option is probably the best. Let me know if you have questions.
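A sketch of option 1 (file names are placeholders; flags may differ between Open Babel versions, so check babel -H first):

# add hydrogens (-h) and generate 3D coordinates (--gen3d) for a whole sdf library
babel -isdf library_2d.sdf -osdf library_3d.sdf -h --gen3d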
GOLD
2022 GOLD CCDC license and download. I have not installed this since the 2020 version is still working.
https://utexas.app.box.com/folder/153544625646
Updated license info (as of 2022) can be found at: http://legacy.lib.utexas.edu/chem/scifinder/ccdc.html
2020
GOLD Tutorial
Before running GOLD from the command line, it is recommended to follow a tutorial using the graphical interface to gain familiarity with how GOLD works.
A tutorial for docking two ligands into a protein while taking flexibility into consideration is located here:
A tutorial for ligand binding pose prediction is located here:
https://www.ccdc.cam.ac.uk/support-and-resources/ccdcresources/GOLD-tutorial-PosePrediction.pdf
Remove old env in your bashrc or cshrc if you have one (so far I don't think we need it for 2020 version)
Activate GOLD the first time you use it:
/opt/CCDC/CCDC2020/CSD_2020/bin/ccdc_activator -k 570769-C9BD90-4986BB-904E59-D56C9B-025394
If you want to use the graphical interface for activating or setting up a GOLD job initially (running the job requires no GUI), you need an X server on your local computer.
Mac has XQuartz. On Windows, use MobaXterm, which comes with X (or you can install VcXsrv and use it with Ubuntu). Make sure you use ssh -Y when logging into a node.
You can check that X is working by typing xeyes and hitting enter. You should get a pair of googly eyes (which may be in another window).
/opt/CCDC/CCDC2020/CSD_2020/bin/ccdc_activator_gui
& enter key from above (570769-C9BD90-4986BB-904E59-D56C9B-025394)
Running graphical interface (e.g. to generate gold.conf)
ssh -Y nova
then
/opt/CCDC/CCDC2020/Discovery_2020/bin/gold
Once you set up the job/create the conf file on nova, you can run the job on other nodes (below)
GOLD user guide: https://www.ccdc.cam.ac.uk/support-and-resources/ccdcresources/GOLD_User_Guide.pdf
Running GOLD from the command line (you can use the gold.conf from above, or manually create/edit a conf file):
/opt/CCDC/CCDC2020/Discovery_2020/bin/gold_auto gold.conf &
2018 update
(installation and activation same as 2016)
setenv GOLD_DIR "/opt/CCDC2018/GoldSuite_2018/GOLD"
setenv PVM_ROOT "/opt/CCDC2018/GoldSuite_2018/GOLD/pvm3"
setenv PVM_RSH "/usr/bin/ssh"
setenv CCDC_CSD_LICENCE_FILE "/opt/CCDC2018/CSD_2018/csd/csd_license.dat"
setenv LD_LIBRARY_PATH /opt/CCDC2018/GoldSuite_2018/c_linux-64/lib/:$LD_LIBRARY_PATH
To activate on a node or workstation: run /opt/CCDC2018/GoldSuite_2018/bin/gold, enter site 771 and confirmation code 1f9986, plus your email.
After installation, you may run the GUI (hermes) on a workstation such as water.bme.utexas.edu
2016
Installation: /opt/CCDC/installation/csds-linux-x64.run. Manual in /opt/CCDC.
Activate on a node (e.g. node101) by running "/opt/CCDC/GoldSuite_2016/bin/gold" (using X windows if you do so remotely). Windows will pop up asking you to enter the site number (771) and the purchase number/confirmation code to register online. This confirmation number changes each spring (1f9986 for 2018). The code is now available from the x-ray lab web site: https://sites.utexas.edu/chem-x-ray/ (right panel)
Once activated, specify the license file: /opt/CCDC/CSD_2016/csd/csd_licence.dat
To run:
Notes from Dave Broadwell
You can split the jobs and combine the results using the script discussed in the link above. (Once all the jobs have been launched, a script is generated that may be used to collate all the results once the dockings have completed.)
Gold Tutorial:
All scripts and example gold.hosts and gold.conf files are available here
First, you need to initialize a file structure for docking. Run the makedirs.sh script, making sure that the gold.hosts file (explained above) exists in the current working directory, as well as the gold.conf. The gold.conf file should not contain a ligand line; this is created by the makedirs.sh script.
Second, run makerunscript.sh with an argument equal to the total number of cores. This will create a file run.sh that you will run to start job execution.
Start up the master daemon on node36 by entering nohup /opt/CCDC/GoldSuite_2016/GOLD/pvm3/lib/pvmd hostfile &, with hostfile being the hostfile described above (not gold.hosts).
To check that the daemon is active, run /opt/CCDC/GoldSuite_2016/GOLD/pvm3/lib/pvm and enter conf in the resulting command prompt. You should see a list of all the hosts provided to docking.
Run nohup ./run.sh. This starts docking. Expect docking to take 1-1.5 weeks.
To analyze, cat */bestranking.lst to a new file. This collects all ligand scores.
A script is provided (extracthigh.py) that takes all lines in the collected bestranking file that score above a limit. To use it, run: python extracthigh.py inputfile.txt outputfile.txt minimum_score (e.g. 90).
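Concretely, the two steps might look like this (output names are placeholders):

cat */bestranking.lst > bestranking_all.txt
python extracthigh.py bestranking_all.txt tophits.txt 90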
revised TI3 usage
The queuing script is inefficient. Instead, run jobs on each individual node. Example code:
for dir in */; do
    cd $dir
    cwd=$(pwd)
    echo $cwd
    check=0
    sleep 30
    limit=70
    while [ "$check" -eq 0 ]; do
        # NOTE: the command feeding these pipes was lost in the original page;
        # it printed a per-node CPU-usage table, one node per line
        # (get_node_loads below is a placeholder for it)
        load9=$(get_node_loads | cut -f2 | sed -n 1p | awk '{printf "%.0f\n", $1}')
        load3=$(get_node_loads | cut -f2 | sed -n 2p | awk '{printf "%.0f\n", $1}')
        if [ "$load9" -lt $limit ]; then
            ssh compute-0-9 /apps/gold/goldsuite-5.2/bin/gold_auto $cwd/gold.conf
            echo "9"
            check=2
        elif [ "$load3" -lt $limit ]; then
            ssh compute-0-3 /apps/gold/goldsuite-5.2/bin/gold_auto $cwd/gold.conf
            echo "3"
            check=2
        else
            sleep 360
        fi
    done
    cd ..
done
notes on gold inputs
Gold will not accept ligand files written in sdf format; they need to be mol2 files.
For the gold.conf file, make sure every path is a full path; don't use the ~ shortcut.
AUTODOCK
<a href="Research%3ADrugDesign">autodock 4</a>
SSH
- Start SSH
- Create a public and a private key.
- Click "Generate New"
- Change DSA to RSA
- Private Key Stats
- Filename: id_rsa (default, but can put whatever in here)
- Passphrase: (Optional, used to access the private key)
- Put the key into the computers that are to be remotely accessed.
- After putting the keys into those other computers, click "Configure"
- Click on "Profiles" in the console, and enter name of computer.
- bme-(planetname).bme.utexas.edu (All planets but Mercury)
- To copy the public key, open console, go to the home directory, and then click the .ssh directory.
- Open a file called authorized_keys.
- Copy the public key to that file, make sure it is 1 line. Be sure to add "ssh-rsa " to the front of the key.
TACC portal
Submit a job on lonestar
ssh username@lonestar.tacc.utexas.edu
showq -u #check your job
Put your job in the work directory: cdw or cd $WORK
bsub < job_script &
example script /home1/00433/prentacc/EXAMPLE/run
To use TINKER:
Before running a tinker job, do "module load mkl", or add it to your source file.
Regular tinker binary: /home1/00433/prentacc/tinker_6.2.06_tacc/source
OSRW tinker binary: /home1/00433/prentacc/tinker-osrw/source-osrw-2-17s
Example job_script (change BOLD part for each job):
Submit a series of jobs in TACC
You can create a series of jobs that depend on previous steps being completed on Lonestar or Ranger. I found this very useful when you have a few steps but can submit them all at the same time, or when you have a job that would last a few days but gets killed every 24 hours. By defining the dependency between jobs, you don't have to resubmit your jobs manually every day. It also greatly reduces waiting time, because your jobs are automatically placed in the waiting list as soon as the previous ones finish. To specify a job dependency, use the "-hold_jid" tag in your qsub job script.
login2$ qsub -help | grep hold
[-hold_jid job_identifier_list] define jobnet interdependencies [-hold_jid_ad job_identifier_list] define jobnet array interdependencies
For example, suppose you have just submitted a job with a job id of "67121". Assuming you have your restart input files in place for the next step, you can specify the "-hold_jid" tag in your qsub job script "job2", as in the following case ("#$ -hold_jid 67121"). Then you can submit the second job right away to the scheduler: qsub < job2. The second job will wait until the first job is finished (or killed) to begin execution or to be placed in the waiting queue. You can also have job3 (day3), job4 (day4), job5 (day5), ..., each dependent on the previous one.
job2:
#!/bin/tcsh
#$ -V
#$ -cwd
#$ -N min3
#$ -A A-biomod
#$ -j y
#$ -o alphA
#$ -pe 12way 24
#$ -q development
#$ -l h_rt=01:00:00
#$ -m be
#$ -hold_jid 67121
......
......
GROMACS
REMD
<a href="Research%3ARemd">REMD HOWTO</a>
GROMACS tutorial
<a href="http://www.dddc.ac.cn/embo04/#lecture">http://www.dddc.ac.cn/embo04/#lecture</a> IED for visualize major mode of motion: <a href="http://mccammon.ucsd.edu/ied/">http://mccammon.ucsd.edu/ied/</a>
creating custom residue
Pyrrole as an example: /home/pren/gromacs-top-pyr/readme-pyr
AMBER Tutorial
Basic MD
Set up parameters for new ligands and unnatural residues
MM-PBSA for binding: http://ambermd.org/tutorials/advanced/tutorial3/
AMBER-GPU
See the AMBER-GPU page for installation, nodes, and additional information.
To run AMBER-GPU, first include in your bashrc file:
source /opt/intel/composer_xe_2013.2.146/bin/compilervars.sh intel64
export PATH=$PATH:/usr/local/cuda-6.5/bin
export LD_LIBRARY_PATH=/usr/local/cuda-6.5/lib64:$LD_LIBRARY_PATH
export AMBERHOME=/opt/software/amber14-gpu
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:$AMBERHOME/lib
alias python=/usr/bin/env\ python2.6
export PYTHONPATH=/usr/bin/python2.6
export PATH=/opt/mpich2-1.4.1p1/bin/mpicc/:/opt/mpich2-1.4.1p1/bin:$PATH
export CUDA_HOME=/usr/local/cuda-6.5
export DO_PARALLEL='mpirun -np 1'
The export DO_PARALLEL statement tells AMBER how many GPUs to use. We have three GPUs in sugar, but in order not to double up on one GPU, just use one GPU per job.
You can find out whether the GPUs are being used with the command nvidia-smi (assuming you have the above variables in your bashrc or equivalent file). In order to run on a particular GPU, just set the environment variable:
export CUDA_VISIBLE_DEVICES="#"   # where # is the GPU number (0, 1, 2 for sugar)
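For instance, to pin a run to GPU 1 (a sketch; the file names are placeholders, while the pmemd.cuda flags are the standard AMBER ones):

export CUDA_VISIBLE_DEVICES="1"
$AMBERHOME/bin/pmemd.cuda -O -i mdprod.in -o prod.out -p sys.prmtop -c equil.rst7 -r prod.rst7 -x prod.nc &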
For node105 (possibly 106 and 107 too) use /opt/software/amber14-gpu-earth/ rather than /opt/software/amber14-gpu/
The executables can be found in: /opt/software/amber14-gpu/bin/
An example command line is given below (taken from /home/davbell/software/amber/test/cuda/dhfr/Run.dhfr):
$DO_PARALLEL $sander -O -o $output -c md12.x -r restrt -x mdcrd < dummy || goto error
Further tutorials can be found through the AMBER manual: http://ambermd.org/doc12/Amber14.pdf and on the website http://ambermd.org/gpus/
MM-PBSA free energy estimation tutorial
a. Theory preparation
A very good tutorial can be found at http://ambermd.org/tutorials/advanced/tutorial3/. I suggest you first read this tutorial to get a basic understanding of MM-PB/GBSA. The following steps will guide you to b) set up the complex system, c) run MD simulations on our GPU cards using pmemd.cuda, and d) estimate the complex binding affinity.
b. System setup
System setup in AMBER can be done in either the xleap or the tleap program. The former provides a graphical interface and is thus easier for beginners. Here we will use tleap to set the system up (i.e., generate the necessary files for the AMBER run).
Copy the leaprc.ff14SB (this is the force field file we will be using) to the current working directory as leaprc.
cp $AMBERHOME/dat/leap/cmd/leaprc.ff14SB ./leaprc
Append the following commands to the leaprc file.
Notes: these commands generate the complex prmtop and rst7 files in gas phase (COMP.prmtop & COMP.rst7), add Na+ and Cl- ions up to the salt concentration we want (addions commands), add solvent molecules (TIP3P water here) and generate the complex prmtop and rst7 files in explicit water (COMP_wat.prmtop & COMP_wat.rst7), and generate the ligand and protein prmtop and rst7 files (LIG.prmtop & LIG.rst7; PRO.prmtop & PRO.rst7).
COMP = loadpdb COMP.pdb
saveamberparm COMP COMP.prmtop COMP.rst7
addions COMP Na+ 0
addions COMP Cl- 0
addions COMP Cl- number (see notes *)
addions COMP Na+ number (see notes *)
solvateoct COMP TIP3PBOX 8.0
saveamberparm COMP COMP_wat.prmtop COMP_wat.rst7
savepdb COMP COMP_wat.pdb
LIG = loadpdb LIG.pdb
saveamberparm LIG LIG.prmtop LIG.rst7
PRO = loadpdb PRO.pdb
saveamberparm PRO PRO.prmtop PRO.rst7
quit
Execute the tleap command to generate the set of prmtop and rst7 files:
tleap
* After we neutralize the system, the number of ions needed to reach the salt concentration must be calculated; a rough way to do this is sketched below.
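One quick estimate (an approximation I am adding, not from the original page): scale the water count by the target concentration relative to the ~55.5 M molarity of pure water. For example, for 0.15 M salt:

nwat=9000    # number of waters tleap added (read it from the tleap output)
conc=0.15    # target salt concentration in mol/L
awk -v n=$nwat -v c=$conc 'BEGIN { printf "ion pairs to add: %d\n", n*c/55.5 }'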
The above steps help us generate one individual system for the next runs. Here is a python script to automatically produce multiple systems. Modifications are very possibly needed for your own cases.
#!/usr/bin/env python
# set up AMBER input files for use in MM-PBSA free energy calculations
# files that need to be prepared:
# 1. leaprc file
# 2. complex pdb file
# Chengwen Liu

import sys,os

def prepareInputs(complexPdbFile):
    lines = file(complexPdbFile).readlines()
    filename = complexPdbFile.split(".pdb")[0]
    f1 = open("Integrin_%s.pdb"%filename, "w")
    if "I1" in complexPdbFile:
        nAtoms = 1557
    if "I2" in complexPdbFile:
        nAtoms = 1544
    for n in xrange(nAtoms):
        f1.write(lines[n])
    f1.close()
    f2 = open("Collagen_%s.pdb"%filename, "w")
    for n in xrange(nAtoms, len(lines)):
        f2.write(lines[n])
    f2.close()
    f3 = open("temp", "w")
    f3.write("COMP = loadpdb %s\n"%complexPdbFile)
    f3.write("saveamberparm COMP %s.prmtop %s.rst7\n"%(filename, filename))
    f3.write("addions COMP Na+ 0\n")
    f3.write("addions COMP Cl- 0\n")
    f3.write("solvateoct COMP TIP3PBOX 8.0\n")
    f3.write("saveamberparm COMP %s_wat.prmtop %s_wat.rst7\n"%(filename, filename))
    f3.write("savepdb COMP %s_wat.pdb\n"%filename)
    f3.write("LIG = loadpdb Collagen_%s.pdb\n"%filename)
    f3.write("saveamberparm LIG Collagen_%s.prmtop Collagen_%s.rst7\n"%(filename, filename))
    f3.write("PRO = loadpdb Integrin_%s.pdb\n"%filename)
    f3.write("saveamberparm PRO Integrin_%s.prmtop Integrin_%s.rst7\n"%(filename, filename))
    f3.write("quit\n")
    f3.close()
    os.system("cat leap temp >leaprc")
    os.system("tleap")
    # do the HMassRepartition and generate a new prmtop file
    f4 = open("parmed.in", "w")
    f4.write("HMassRepartition\n")
    f4.write("parmout %s_wat_heavyH.prmtop\n"%filename)
    f4.write("go\nquit\n")
    f4.close()
    os.system("/opt/software/ParmEd-master/parmed.py -p %s_wat.prmtop -i parmed.in"%filename)
    return

# Main
currDir = os.getcwd()
files = os.listdir(currDir)
for eachPdb in files:
    if ".pdb" in eachPdb:
        filename = eachPdb
        os.system("mkdir %s/%s"%(currDir, filename.split(".pdb")[0]))
        os.system("cp %s/leap %s/%s"%(currDir, currDir, filename.split(".pdb")[0]))
        os.chdir("%s/%s"%(currDir, filename.split(".pdb")[0]))
        os.system("cp %s/%s ."%(currDir, filename))
        prepareInputs(filename)
c. Run MD simulations with pmemd.cuda
Routinely in AMBER, we run minimization, heating, equilibration, cooling, equilibration, and a final production run in the NVT ensemble. Here are the input and run files I am using for the collagen-integrin system.
min.in
initial minimization prior to MD
 &cntrl
  imin = 1, maxcyc = 6000, ncyc = 2500,
  cut = 12
 /
heat360.in
200ps MD
 &cntrl
  imin = 0, irest = 0, ntx = 1, ntb = 1,
  cut = 10.0, ntc = 2, ntf = 2,
  tempi = 0.0, temp0 = 360.0,
  ntt = 3, gamma_ln = 1.0,
  nstlim = 100000, dt = 0.002, ioutfm = 1,
  ntpr = 250, ntwx = 250, ntwr = 10000
 /
md200ps360.in
200ps of MD
 &cntrl
  imin = 0, irest = 1, ntx = 7, ntb = 2,
  pres0 = 1.0, ntp = 1, taup = 2.0,
  cut = 10.0, ntr = 0, ntc = 2, ntf = 2,
  tempi = 360.0, temp0 = 360.0,
  ntt = 3, gamma_ln = 1.0,
  nstlim = 100000, dt = 0.002, ioutfm = 1,
  ntpr = 250, ntwx = 250, ntwr = 10000
 /
cool300.in
200ps MD
 &cntrl
  imin = 0, irest = 0, ntx = 1, ntb = 1,
  cut = 10.0, ntc = 2, ntf = 2,
  tempi = 360.0, temp0 = 300.0,
  ntt = 3, gamma_ln = 1.0,
  nstlim = 100000, dt = 0.002, ioutfm = 1,
  ntpr = 250, ntwx = 250, ntwr = 10000
 /
md200ps300.in
200ps of MD
 &cntrl
  imin = 0, irest = 1, ntx = 7, ntb = 2,
  pres0 = 1.0, ntp = 1, taup = 2.0,
  cut = 10.0, ntr = 0, ntc = 2, ntf = 2,
  tempi = 300.0, temp0 = 300.0,
  ntt = 3, gamma_ln = 1.0,
  nstlim = 100000, dt = 0.002, ioutfm = 1,
  ntpr = 250, ntwx = 250, ntwr = 10000
 /
mdprod.in
100ns MD NVT
 &cntrl
  imin = 0, irest = 0, ntx = 1, ntb = 1,
  cut = 10.0, ntc = 2, ntf = 2,
  tempi = 300.0, temp0 = 300.0,
  ntt = 3, gamma_ln = 1.0,
  nstlim = 25000000, dt = 0.004, ioutfm = 1,
  ntpr = 1000, ntwx = 1000, ntwr = 1000
 /
Run script:
#!/bin/bash
export filename=your_file_name
source ~pren/amber-git/amber17/bashrc.amber16
export CUDA_VISIBLE_DEVICES="0"   # or 1
$AMBERHOME/bin/pmemd.cuda -O -i min.in -o $filename\_min.out -c $filename\.rst7 -p $filename\.prmtop -r $filename\_min.rst7
$AMBERHOME/bin/pmemd.cuda -O -i heat360.in -o $filename\_heat360.out -c $filename\_min.rst7 -p $filename\.prmtop -r $filename\_heat360.rst7 -x $filename\_heat360.nc
$AMBERHOME/bin/pmemd.cuda -O -i md200ps360.in -o $filename\_md200ps360.out -c $filename\_heat360.rst7 -p $filename\.prmtop -r $filename\_md200ps360.rst7 -x $filename\_md200ps360.nc
$AMBERHOME/bin/pmemd.cuda -O -i cool300.in -o $filename\_cool300.out -c $filename\_md200ps360.rst7 -p $filename\.prmtop -r $filename\_cool300.rst7 -x $filename\_cool300.nc
$AMBERHOME/bin/pmemd.cuda -O -i md200ps300.in -o $filename\_md200ps300.out -c $filename\_cool300.rst7 -p $filename\.prmtop -r $filename\_md200ps300.rst7 -x $filename\_md200ps300.nc
$AMBERHOME/bin/pmemd.cuda -O -i mdprod.in -o $filename\_md1.out -c $filename\_md200ps300.rst7 -p $filename\_heavyH.prmtop -r $filename\_md1.rst7 -x $filename\_md1.nc
After the production run, we may want to delete the explicit water molecules from the trajectory file (*.mdcrd or *.nc), which saves hard disk storage. In this step the ptraj (or cpptraj) program is used.
Here is the trajin file
parm COMP_wat.prmtop
trajin COMP_wat_md1.nc
strip :WAT:Na+:Cl-
trajout COMP_wat_md1_traj.nc
cpptraj -i trajin
d. MM-PB/GBSA calculation
We will use MMPBSA.py to estimate the binding affinity; the entropy contribution is not computed here because normal mode analysis is too expensive for biomolecules. (See the theory preparation section tutorials for more details.)
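For orientation, the quantity MMPBSA.py estimates is the standard MM-PB/GBSA decomposition (this equation is background I am adding, not from the original page):

dG(bind) ~ <dE(MM)> + <dG(solv, polar; PB or GB)> + <dG(solv, nonpolar; SA)> - T*dS

where each d term is (complex - receptor - ligand), averaged over the trajectory snapshots, and the -T*dS entropy term is the one omitted here.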
pbsa.in file
MMPBSA from JCTC paper -CW
&general
  startframe=1, endframe=25000, keep_files=0,
  interval=50, debug_printlevel=2, strip_mask=:WAT:Cl-:Na+,
/
&gb
  igb=2, saltcon=0.100
/
&pb
  istrng=0.1, exdi=80, indi=1.0, inp=2,
  cavity_surften=0.0378, cavity_offset=0.5692,
  fillratio=4, scale=2.0, linit=1000,
  prbrad=1.4, radiopt=1,
/
Run file
#!/bin/bash
export filename=your_file_name
source /home/pren/amber-git/amber17/ambercpu/amber.sh
$AMBERHOME/bin/MMPBSA.py -O -i pbsa.in -o results_mmpbsa.dat -sp $filename\.prmtop -cp $filename\.prmtop -rp Integrin_$filename\.prmtop
e. MM-PB/GBSA results
See section a (theory preparation) for how to read and understand the MMPBSA results.
CHARMM tutorial
<a href="http://www.ch.embnet.org/MD_tutorial/">http://www.ch.embnet.org/MD_tutorial/</a> <a href="http://www.charmming.org/charmming/">http://www.charmming.org/charmming/</a> <a href="http://www.charmm.org/ubbthreads/ubbthreads.php?ubb=cfrm">http://www.charmm.org/ubbthreads/ubbthreads.php?ubb=cfrm</a>
Movie making
You can use trjconv to convert a gromacs trajectory to a xxx.pdb (many frames), and then use the pdb to make movies. One option for trjconv is -pbc, which can be used to keep the molecules in the box. One of these may work for you:
- -pbc mol. This puts the center of mass of each molecule in the box.
- -pbc nojump. Here the first frame (-s xxx.tpr) is used as the reference, and any jumps out of the box are removed. This only works if the first frame is whole in the box.
Amber has similar utilities to convert a trajectory to pdb.
To make movies look slower, you can use more frames (smaller steps between frames), decrease the number of frames per second, or increase the delay in the animated gif if you use UnFREEz under Windows.
- Use the vmd movie maker. You can decrease the frames per second (the default of 24 is already the lowest; see the Format/compression setting).
<a href="http://www.ks.uiuc.edu/Training/Tutorials/vmd/tutorial-html/node3.html#SECTION00034200000000000000">http://www.ks.uiuc.edu/Training/Tutorials/vmd/tutorial-html/node3.html#SECTION00034200000000000000</a> <a href="http://ambermd.org/tutorials/basic/tutorial2/section9.htm">http://ambermd.org/tutorials/basic/tutorial2/section9.htm</a>
- Use pymol on a Mac: you can directly export to a mov file.
- Use pymol on Windows or Linux: you can create a series of png files from "export movie", and then use a program such as UnFREEz (http://www.whitsoftdev.com/unfreez/) or VirtualDub (http://virtualdub.sourceforge.net/) to create an animated gif file. You can change the delays in UnFREEz. An animated gif is small but usable on the web or in ppt.
to turn on ray tracing for each png file:
viewport 320,240   (decide how big the picture is)
set ray_trace_frames=1
set cache_frames=0
mpng mymov   (or go to File/export movie)
Additional pymol commands: http://pymol.sourceforge.net/newman/user/S0300movies.html
Examples:
mset 1 x30     # creates a 30 frame movie consisting of state 1 played 30 times.
mset 1 -30     # creates a 30 frame movie: states 1, 2, ... up through 30.
mset 1 -30 -2  # 58 frames: states 1, 2, ..., 29, 30, then 29, 28, ..., 4, 3, down to 2
mset 1 6 5 2 3 # 5 frames: states 1, 6, 5, 2, 3 in that order.
Discovery Studio & Pipeline Pilot
Latest version
DS 3.5 PP 8.5.3
Server
# network access problem from outside lab
The default route (ip route) showed 10.0.0.200 as the default gateway, in addition to 146.6.132.1.
Use "route del default" twice to remove all the defaults, then add one with "/sbin/route add default gw 146.6.132.1 eth0".
Also removed the gateway 10.0.0.200 from ifcfg-eth1.
- PP installation dir on pharma is /home/qwang/software/Accelrys/PP_install
Web access: pharma.bme.utexas.edu:9943 (https), pharma.bme.utexas.edu:9944 (http). Admin accounts: scitegicadmin and/or root (same password).
- DS installation is done through the PP installation script. A single license file containing both PP and DS is required to install both of them in an integrated installation.
Such a license file looks like:
INCREMENT License_Holder msi 2016.120 30-dec-2016 uncounted 62B406C87B7EAC123E26 VENDOR_STRING="200531:218555:5:University of Texas at Austin" HOSTID=ANY TS_OK INCREMENT scitegic_ppclient msi 2016.120 30-dec-2016 uncounted A2949618D29B97720F7F VENDOR_STRING=599622[25854] HOSTID=ANY TS_OK ... INCREMENT DSCollection msi 2016.120 30-dec-2016 uncounted 42B4168820BAED55C170 VENDOR_STRING=614772[25554] HOSTID=ANY TS_OK INCREMENT DSDeveloperClient msi 2016.120 30-dec-2016 uncounted 527406E85FF8986EF951 VENDOR_STRING=614772[25554] HOSTID=ANY TS_OK INCREMENT scitegic_core msi 2016.120 30-dec-2016 uncounted 82C4C62864E053BDFFDF VENDOR_STRING=614772[25554] HOSTID=ANY TS_OK ...
Once the installation is done, there should be two new directories (DiscoveryStudio35 and LicensePack) created at the same level as your PP install dir (PP_install in our case).
- Then, another license file is required to activate the DS component. Sending a request form like the one below to Accelrys will get us this file.
COMPANY: <your company's name>
CONTACT: <your name>
EMAIL: <email address to which license file should be sent>
PRODUCT LIST
Enter the Accelrys products for which you are requesting a license (include order number if available):
<Product Name> <Order Number>
You should not need to edit any of the remaining portion of the form.
COMPUTER: pharma.bme.utexas.edu
HOST ID: MAC address
OS TYPE: Linux
End Accelrys Software License File Request Form
Such a license file should look like this:
SERVER pharma.bme.utexas.edu MACaddress 1715
(This line defines the port and server name that we can use to connect from client machines.)
VENDOR msi PORT=1720
(NB: This port number is crucial for clients connecting to the server. It is used by the license managing program to communicate between client and server, so we need to make sure it is open! The DS installation manual does NOT even mention it. We were stuck at this point for quite a while.)
INCREMENT License_Holder msi 2016.120 30-DEC-2016 999 2D42D7207F6D64AE0EBC "200531:218557:1:University of Texas at Austin"
INCREMENT DS_ADMET_User_MP msi 2016.120 30-DEC-2016 1 ADB287204B698029EC64 "599619[28071]"
INCREMENT DS_MCSS_User_MP msi 2016.120 30-DEC-2016 1 9DC29750CCAE2EF8B007 "599619[28071]"
...
- It is important to make sure the system uses the older ethX network interface naming convention rather than the newer emX names. Otherwise, the DS installer will not be smart enough to extract the MAC address of the server, so the license server won't work. Dell machines and Fedora 15 or newer usually follow the emX naming scheme, so they need to be changed to ethX. A method that works for our pharma server can be found at http://www.sysarchitects.com/em1_to_eth0:
1. Add biosdevname=0 to the kernel boot arguments in /etc/grub.conf.
2. Rename /etc/sysconfig/network-scripts/ifcfg-em1 to /etc/sysconfig/network-scripts/ifcfg-eth0, changing the line DEVICE="em1" to DEVICE="eth0".
3. Delete /etc/udev/rules.d/70-persistent-net.rules.
4. Reboot.
- DS license server
Sometimes the license server may go down. To restart, go to /home/qwang/software/Accelrys/LicensePack/linux/bin
and run ./lp_server -s to start the server (-h for help, -w for status).
Client
Once the above is done, we can download and install a DS client on a Linux or Windows machine from our server's website by following these instructions:
- Go to pharma.bme.utexas.edu:9944 and click on the Discovery Studio link at the bottom right corner. This will bring you to the DS website. Then simply click on the Install/Upgrade Discovery Studio Client link to install either the Windows or the Linux version. This installs a free version of the DS client.
- To enable the full version, click on the 'Enable additional features' button at the bottom right corner of the client window. Then choose the connect-to-a-license-server option, and enter the port number 1715 and the server name pharma.bme.utexas.edu in the appropriate boxes. The window should then tell you whether it is connected, and the 'Enable additional features' button should have changed to 'Server: none'. You will also notice many more options appear (though still as grayed menu items) in the left panel of each module.
- If you click on these grayed options, a pop-up window asks for a server name. Just enter pharma.bme.utexas.edu, and then use your username and password to log in. Now you should have the full power of the software!
Schrodinger and GLIDE Docking
We have 20 perpetual Glide licenses on the Ti3D cluster and 2 perpetual ligprep licenses. (See Box Sync/LAB/software/ti3d glide.)
We have 50 annual GLIDE licenses on the Ren cluster, which expire in 2015(?).
Prepping Protein and Grid
Preparation of the protein and grid can be accomplished using the Maestro interface, available at /opt/Schrodinger_Suites_2013-2/maestro. Protein preparation and grid preparation instructions are available at https://www.schrodinger.com//AcrobatFile.php?type=supportdocs&type2=&ident=252. The final output of the grid generation is a .zip file. This file contains the protein coordinates as well as other data. No other protein input is necessary.
Ligand Preparation
The ZINC server has a discrepancy in the size of the drug-like in-stock library depending on whether one selects .mol2 or .sdf output. The latest molecules appear to be only in the .sdf output. To download the drug-like in-stock library, download the file under UNIX SDF at http://zinc.docking.org/subsets/drugs-now. In a terminal, run this file using csh. This will download the library as many .sdf files, which can be decompressed using gunzip. Each .sdf file corresponds to part of the library.
Do not attempt to run ligprep using Maestro or the bme-nova nodes; we are not licensed for ligprep there. To run ligprep, you need to connect to a remote server in Kong Ren's lab (see Pengyu for information). Once connected, one can run ligprep from /opt/schrodinger2013/ligprep. For a complete manual on ligprep, refer to http://isp.ncifcrf.gov/files/isp/uploads/2010/07/lp23_user_manual.pdf. An example of simple usage is:
ligprep -g -epik -n 1:23 -isd inputfile.sdf -omae outputfile.mae
-g keeps the current chiral state (important for ZINC, as not all chiralities are available). -epik uses Epik for tautomerization. -n 1:23 tells the server to only run ligprep on molecules 1-23. If one wants to run multiple instances of ligprep, one can use the same input and use -n to divide the molecules between processes, as in the sketch below. -isd indicates the input file is in .sdf format, and -omae indicates .mae output format. The output file(s) are now ready for docking.
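For instance, to split a 10,000-molecule file across four ligprep processes (the ranges and file names are illustrative; the flags are the ones described above):

#!/bin/bash
# divide molecules 1-10000 into four equal -n ranges, one ligprep process each
for i in 0 1 2 3; do
    lo=$((i*2500+1))
    hi=$(((i+1)*2500))
    /opt/schrodinger2013/ligprep -g -epik -n $lo:$hi -isd inputfile.sdf -omae out_part$i.mae &
done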
Docking using Glide
Do not attempt to dock using Maestro's interface; it cannot submit jobs to the nodes. Instead, start Glide using the command line interface. Before starting a job, one needs to make a .in file. An example .in file is:
USECOMPMAE YES
GRIDFILE "/users/mh43854/Desktop/glide_prepped/glide-grid_1.zip"
LIGANDFILE "/users/mh43854/Desktop/glide_prepped/ligprep_4-out.maegz"
PRECISION "SP"
I don't know the function of USECOMPMAE. GRIDFILE should point to the prepared protein grid file (a .zip file). LIGANDFILE points to the output of ligprep, and PRECISION selects the degree of precision: HTVS, SP, or XP, in increasing order of precision. To run Glide, connect to an available node and run the command:
/opt/software/Schrodinger_Suites_2014/glide MY_INFILE.in
All options are written into the .in file, so no further options are necessary. By default, log and output files are written to the current working directory. The output is one large .mae file readable by Maestro, containing all poses as well as score information.
license server (root required)
Try
/opt/goldsuite-5.2/bin/gold_licence status
/opt/software/Schrodinger_Suites_2014/licadmin stat
to check what is running.
In bashrc on NOVA:
export SCHRODINGER="/opt/software/Schrodinger_Suites_2014"
export tmpdir=/scratch
To run on ti3d:
ssh into one of the compute nodes (names accessible via ganglia cpu-user)
export SCHRODINGER="/apps/schrodinger/2010_update_1/"
Then cd $SCHRODINGER
/opt/software/Schrodinger_Suites_2014/licadmin start -c ./license
You may have to stop GOLD server first:
/opt/goldsuite-5.2/bin/gold_licence stop (this now works on uranus)
The Schrodinger license_key file was generated for bme-nova and uses the MAC of em1 ($SCHRODINGER/machid).
Client (node50-56)
https://www.schrodinger.com//AcrobatFile.php?type=supportdocs&type2=&ident=455
See /opt/software/Schrodinger_Suites_2014/schrodinger.hosts
To run Schrodinger on selected nodes requires a HOST file; the schrodinger.hosts file above serves as a sample.
I (David Bell) experienced trouble logging into Schrodinger as the root user; it seems the chosen nodes had trouble communicating when logged in as root. Finally, Schrodinger employs a GUI that requires an X display server (not BME-mars).
PSI4
Install
howto (2019/6/7)
On node38, follow the instructions at https://admiring-tesla-08529a.netlify.com/installs/v132/ (installer)
Use
Psi4 is available at /opt/software/CHEM/PSI4/PSI4new.
Most nodes use CentOS 6; source /opt/software/CHEM/PSI4/.bashrc.psi4.centos6
For CentOS 7 (node36), source /opt/software/CHEM/PSI4/.bashrc.psi4.centos7
The only difference is the glibc usage: CentOS 6 requires an external, newer version of glibc, while CentOS 7 has a newer version pre-installed.
How to calculate the energy components of a dimer
To run PSI4:
a) First, you need to define some parameters and settings in a '.inp' file for PSI4 to recognize:
#psi4
memory 10 gb

molecule Pyridine_Ethene {
0 1
N    1.381382191   -0.000233477    0.131463739
C    0.679350788   -1.140239457    0.092079660
H    1.258719604   -2.054962232    0.125883611
C   -0.709722319   -1.193114069    0.006664265
H   -1.214087679   -2.148561633   -0.025308510
C   -1.421613574    0.000133427   -0.040816901
H   -2.500696146    0.000257571   -0.109169733
C   -0.709401197    1.193175380    0.006521983
H   -1.213511626    2.148747839   -0.025528309
C    0.679651666    1.139956229    0.091893028
H    1.259260730    2.054510898    0.125502478
--
0 1
C    0.016268721    0.666423181    3.116599871
H    0.926742724    1.225908900    2.957485028
H   -0.893278777    1.228827408    3.273560375
C    0.016601405   -0.666264119    3.116732110
H    0.927337096   -1.225346603    2.957721677
H   -0.892686689   -1.229088890    3.273821261
units angstrom
no_reorient    # important for SAPT in psi4, default
symmetry c1    # important for SAPT in psi4, default
}

set {
  basis aug-cc-pVDZ
  scf_type DF
  freeze_core True
}

energy('sapt2+')
b). Second, you need to call PSI4:
#!/bin/sh
export PSI_SCRATCH=/scratch/YourUserName
if [ ! -d "$PSI_SCRATCH" ]
then
  mkdir $PSI_SCRATCH
fi
source /opt/intel/compilerpro-12.0.0.084/bin/compilervars.sh intel64
my_psi4=/home/xzhu216/psi4/psi4-install
$my_psi4/bin/psi4 -n 8 -i 2901_26UracilUracilpipi100_TZ.inp -o 2901_26UracilUracilpipi100_TZ.out
By defining PSI_SCRATCH, the user tells PSI4 to store its temporary data in this directory. Without it, temporary data will be stored in PSI4's default directory, which may not be big enough.
Molpro 2010
Installation Info
a) Make a ~/.molpro dir and copy the token file into it.
b) Configured with the following options:
./configure -icc -ifort -mpp -mppbase /opt/mpich2-fc13-p262/include/ -blaspath /opt/intel/compilerpro-12.0.0.084/mkl/lib/intel64/ -lapackpath /opt/intel/compilerpro-12.0.0.084/mkl/lib/intel64/ -prefix /home/qwang/local
c) For some unknown reason, the generated CONFIG file contains an invalid path to mpich2, so we need to manually correct it to:
-I/opt/mpich2-fc13-p262/include -L/opt/mpich2-fc13-p262/lib
d) The final installation step installs the program in a new dir under -prefix, and copies the token file to -prefix/lib automatically.
e) The full guide to Molpro can be found at: http://www.molpro.net/info/current/doc/manual/node4.html
f) Example of input files, including SAPT calculations can be found at: /home/qwang/software/Molpro/examples
g) To run Molpro, please make sure you source the MKL and Intel compiler libraries (/opt/intel/compilerpro-12.0.0.084/).
An example script to run Molpro would be:
#!/bin/sh
export TMPDIR=/scratch/dir_name_you_prefer
export LD_LIBRARY_PATH=/opt/mpich2-fc13-p262/lib:$LD_LIBRARY_PATH
source /opt/intel/compilerpro-12.0.0.084/bin/compilervars.sh intel64
mpich2_home=/opt/mpich2-fc13-p262
molpro_home=/home/qwang/local/molprop_2010_1_Linux_x86_64_i8/
if [ ! -d "$TMPDIR" ]
then
  mkdir $TMPDIR
fi
$mpich2_home/bin/mpiexec -np 1 $molpro_home/bin/molpro -n 4 -s h2odimer_sapt_hf.com
MOLPRO 2013
An example script (if you have trouble, remove your bashrc or cshrc)
#!/bin/sh
export TMPDIR=/scratch/pren
export LD_LIBRARY_PATH=/opt/mpich2-fc13-p262/lib:$LD_LIBRARY_PATH
# this doesn't seem to be necessary
#source /opt/intel/compilerpro-12.0.0.084/bin/compilervars.sh intel64
mpich2_home=/opt/mpich2-fc13-p262
#molpro_home=/home/qwang/local/molprop_2010_1_Linux_x86_64_i8/
molpro_home=/opt/MOLPRO/install/nightly/molprop_2012_1_Linux_x86_64_i8/
#$mpich2_home/bin/mpdboot -f mpd.hosts -n 1 --ncpus=8 &
$mpich2_home/bin/mpd &
if [ ! -d "$TMPDIR" ]
then
  mkdir $TMPDIR
fi
#$mpich2_home/bin/mpiexec -np 1 $molpro_home/bin/molpro -n 4 -s h2odimer_sapt_hf.com
#$molpro_home/bin/molpro -n 4 -s h2odimer_sapt_hf.com
#$mpich2_home/bin/mpiexec -np 1 $molpro_home/bin/molpro -n 4 -s ch2cl2_pbe0_opt_ion_pot.com >&/dev/null &
$molpro_home/bin/molpro -n 4 -s ion_pot.com > log &
ion_pot.com is the input file containing the structures of the molecules and the options for the calculations.
!about 1.5G mem
memory,200,M
gthresh,energy=1.d-8,orbital=1.d-8,grid=1.d-8
symmetry,nosym
orient,noorient
angstrom
geometry={
1, O,, -0.525329794 -0.050971084 -0.314516861
2, H,, -0.942006633 0.747901631 0.011252816
3, H,, 0.403696525 0.059785981 -0.073568368
}
basis=avdz
ks,pbe0
!optg

ks,pbe0
!orbprint,5

!ground state
e_wat=energy

!one electron less
set,nelec=9
ks,pbe0
!orbprint,5

ip_wat=(energy-e_wat)
T-WHAM, TWHAM
Scripts to compute T-wham free energy:
Matlab script to draw the phi-psi map for dialanine
<a href="http://biomol.bme.utexas.edu/~yues/remd/">http://biomol.bme.utexas.edu/~yues/remd/</a>
<a href="http://biomol.bme.utexas.edu/~yues/remd/energy">http://biomol.bme.utexas.edu/~yues/remd/energy</a> has the energies extracted at different temperatures. <a href="http://biomol.bme.utexas.edu/~yues/remd/dihedral">http://biomol.bme.utexas.edu/~yues/remd/dihedral</a> has the phi psi angles for all snapshots. <a href="http://biomol.bme.utexas.edu/~yues/remd/map">http://biomol.bme.utexas.edu/~yues/remd/map</a> has Jenny's matlab codes (.m files) with T-wham algorithm. All the other files in <a href="http://biomol.bme.utexas.edu/~yues/remd/">http://biomol.bme.utexas.edu/~yues/remd/</a> are trajectories from AMBER.
SAPT2008/ATMOL1024
A compiled version of SAPT2008 can be found at: /home/qwang/software/SAPT2008
Installation Info
a) Compiled with gcc4.1.2 compilers installed at /home/qwang/software/gcc-4.1.2-install
b) Higher versions of gcc installed on the nodes do NOT work for ATMOL1024.
c) Examples of running SAPT with ATMOL can be found at /home/qwang/software/SAPT2008/examples/ATMOL1024
d) Useful links:
<a href="http://www.physics.udel.edu/~szalewic/SAPT/manual.html">http://www.physics.udel.edu/~szalewic/SAPT/manual.html</a> <a href="http://www.theochem.ru.nl/~pwormer/atmol/integw.html">http://www.theochem.ru.nl/~pwormer/atmol/integw.html</a>
Intel Compiler
V13 Fortran and C compilers were installed (1/2014). The installation directory for both ifort and icc is "/opt/intel/composer_xe_2013.2.146". The bin in this directory is also linked from composer_xe_2013/.
The MKL library is included within.
csh or tcsh:
source /opt/intel/composer_xe_2013.2.146/bin/compilervars.csh intel64
bash:
source /opt/intel/composer_xe_2013.2.146/bin/compilervars.sh intel64
Open MM
Installation
needs to be done on a node with a GPU and CUDA (e.g. node 201)
may need to install ccmake (http://www.cmake.org/), depending on the version present on the machine (or do yum install cmake as root; may also need swig, flex, etc.)
download the OpenMM source from https://simtk.org/project/xml/downloads.xml?group_id=161
make a build folder
source the bash script /opt/software/bash/bashrc.openmm
run ccmake from inside the build directory (ccmake <path to source directory>)
press "c" on the keyboard to configure
turn on the desired features (such as CUDA); both CUDA directories should point to /usr/local/cuda-6.5
press "g" to generate
make
make install
make PythonInstall
perform the tests inside the build directory (make test)
cd into ~pren/openMM/dhfr-test/ and run simulateMTS.py from within this directory. Expect performance of around 3 ns/day (observed on node 201).
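A quick way to confirm that the Python bindings and the CUDA platform came out of the build correctly (a minimal check I am adding here, not part of the original recipe; assumes the simtk namespace of OpenMM ~6.x):
# Sanity check for the OpenMM Python install.
import simtk.openmm as mm

print(mm.version.version)  # installed OpenMM version string
# List the compiled platforms; "CUDA" should appear on a GPU node such as node 201.
for i in range(mm.Platform.getNumPlatforms()):
    print(mm.Platform.getPlatform(i).getName())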
Softcore implementation
notes:
in ccmake, you need to set the C++ compiler flag -std=c++0x. This is needed to enable the use of tuples.
changed /plugins/amoeba/openmmapi/include/openmm/AmoebaVdwForce.h to have a lambda particle descriptor
changed /plugins/amoeba/openmmapi/src/AmoebaVdwForceImpl.cpp to use the soft-core vdW
changed /plugins/amoeba/platforms/cuda/src/AmoebaCudaKernels.cpp so that the SigmaEpsilon array contains a 3rd variable (lambda)
Input Setup
- 1) You need a PDB file and an XML file that contains the parameters. Examples are in ~pren/openMM/dhfr-test/. OpenMM matches amoeba2013.xml against the PDB file and assigns parameters by residue name.
- 2) Or OpenMM can use a "system" XML file that contains the parameters for all particles. Example: /home/pren/OpenMM/Output-2fs/7.jac-dhfr/mutual-1e-5/; see input.xml and input.pdb. A sketch of both routes is given below.
A script for generating this (not sure exactly which one of the above):
python processTinkerForceField.py amoebapro13.prm
You need to modify the residueFinal.xml location. See /home/pren/OpenMM/Accuracy/7.jac-dhfr/mutual-1e-6.
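For reference, a minimal sketch of the two input routes in the OpenMM Python API (OpenMM ~6.x, simtk namespace; the file names come from the examples above, while the nonbonded and polarization settings are illustrative assumptions):
# Sketch of the two input routes described above.
from simtk.openmm import app, XmlSerializer, LangevinIntegrator
from simtk import unit

# Route 1: force-field XML + PDB; parameters are assigned by residue name.
pdb = app.PDBFile('input.pdb')
forcefield = app.ForceField('amoeba2013.xml')
system = forcefield.createSystem(pdb.topology,
                                 nonbondedMethod=app.PME,            # illustrative choices
                                 nonbondedCutoff=0.8*unit.nanometer,
                                 polarization='mutual',
                                 mutualInducedTargetEpsilon=1e-5)

# Route 2: a serialized "system" XML that already contains every particle's parameters.
with open('input.xml') as f:
    system = XmlSerializer.deserialize(f.read())

integrator = LangevinIntegrator(298.15*unit.kelvin, 1.0/unit.picosecond, 1.0*unit.femtosecond)
simulation = app.Simulation(pdb.topology, system, integrator)
simulation.context.setPositions(pdb.positions)
Route 2 still needs the PDB for the topology and positions, which is why the example directory ships both input.xml and input.pdb.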
---2014--- Xiao, I am able to run OpenMM/AMOEBA on the GPU (BME-SUGAR) now. You will need my ~/.bashrc file to set up all the env vars and paths.
The OpenMM source (precompiled binary) is in /opt/software/OpenMM6.1-Linux64/. It is installed into /usr/local/openmm and /opt/software/python27/appdata/canopy-1.4.1.1975.rh5-x86_64/lib/python2.7/site-packages/simtk/.
The two examples are /home/pren/OpenMM/zzz/simulate.py and simulateMTS.py. The latter uses a larger time step.
A couple of things we need to work on with Lee-Ping:
1) The TINKER/OpenMM interface we need to convert TINKER xyz/key files to OpenMM input (XML). The above examples use a PDB plus amoeba2013.xml, and OpenMM assigns the parameters; however, for complicated systems this does not work.
2) The soft-core stuff. TINKER uses "ligand -100 110" to specify a solute (atom indices between 100 and 110), and VDW-LAMBDA and ELE-LAMBDA to scale the vdW and electrostatic interactions between the ligand and the rest. The vdW soft core is implemented in ehal.f and ehal3.f (energy) and ehal1.f (force); search for vlambda. The electrostatic part simply scales the multipoles and polarizability by "elambda". This means we turn off the ligand's intramolecular electrostatics when decoupling it, so recharging in the gas phase is necessary (but cheap). A key-file fragment illustrating these keywords is shown below.
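For illustration, a hypothetical TINKER key-file fragment using the keywords named above (the atom range and lambda values are made-up examples, not from a real calculation):
# decouple the solute, atoms 100 through 110 (example range)
ligand -100 110
# scale solute-environment vdW and electrostatics; 1.0 = fully coupled
vdw-lambda 0.5
ele-lambda 0.5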
Peter suggests: The easiest way for you to do it is probably to start from the existing AmoebaVdwForce class and modify it to use a soft core potential. The relevant code is in amoebaVdwForce2.cu for the CUDA platform, and AmoebaReferenceVdwForce.cpp for the Reference platform. It should be quite straightforward to change.
---2014, from Lee-Ping--- (1) You can use Peter's "pdbfixer" code (https://github.com/SimTk/pdbfixer) to set up any PDB you want (you start with the PDB ID and end up with a simulation-ready PDB file), and simulateAmoeba.py should run it properly.
However, if you keep molecules or residues that are not supported in the force field XML file, it will crash. There is a script to convert TINKER .key/.prm files into OpenMM .xml files (https://www.dropbox.com/s/7urbdgl0bja6hmd/processTinkerForceField.py?dl=0); I believe Mark Friedrichs wrote the code when he implemented AMOEBA in OpenMM.
(2) To use the multiple timestep Langevin integrator, paste the code for MTSVVVRIntegrator() into simulateAmoeba.py.
Then instead of LangevinIntegrator(298.15*U.kelvin, 1.0/U.picosecond, 1.0*U.femtosecond) use MTSVVVRIntegrator(298.15*U.kelvin, 1.0/U.picosecond, 2.0*U.femtosecond, system).
The extra "system" argument is needed because the MTSVVVRIntegrator partitions the forces in the system into different groups (slow and vast), and applies a different time step to the force groups.
Generate structures of orientations
In the AMOEBA parameter-determination process, we often adjust the vdW parameters to match the total interaction energy, so QM data for dimers in several different orientations are needed. Here I provide a Python script that automatically (or semi-automatically) generates dimers with different orientations.
The basic idea of obtaining a rotated coordinate (x',y',z') is to first construct a rotation matrix, for which a rotation axis and a rotation angle must be specified, and then simply take the dot product of the rotation matrix and the original coordinate (x,y,z).
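For example, a quick sanity check of this idea (using the rotMatrix function defined in the script below): rotating the point (1,0,0) by 90 degrees about the z axis should give (0,1,0).
import numpy as np
# assumes rotMatrix from the script below has already been defined
R = rotMatrix([0.0, 0.0, 1.0], np.pi/2.0)    # 90-degree rotation about the z axis
print(np.dot(R, np.array([1.0, 0.0, 0.0])))  # -> approximately [0. 1. 0.]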
Here is the script used for chlorobenzene-water. It can easily be modified for your own case once you understand how it works.
Python script
#!/usr/bin/env python
'''Rotate the first molecule (water) about an axis by theta radians.'''
'''Chengwen Liu'''
import math
import sys
import numpy as np

def rotMatrix(axis, theta):
    """
    Return the rotation matrix associated with counterclockwise rotation
    about the given axis by theta radians.
    """
    axis = np.asarray(axis)
    axis = axis/math.sqrt(np.dot(axis, axis))
    a = math.cos(theta/2.0)
    b, c, d = -axis*math.sin(theta/2.0)
    aa, bb, cc, dd = a*a, b*b, c*c, d*d
    bc, ad, ac, ab, bd, cd = b*c, a*d, a*c, a*b, b*d, c*d
    return np.array([[aa+bb-cc-dd, 2*(bc+ad), 2*(bd-ac)],
                     [2*(bc-ad), aa+cc-bb-dd, 2*(cd+ab)],
                     [2*(bd+ac), 2*(cd-ab), aa+dd-bb-cc]])

def rotate_water(txyz):
    """Rotate the water molecule; keep chlorobenzene fixed."""
    lines = open(txyz).readlines()   # open() rather than file() for Python 2/3
    coord = []
    for line in lines[1:]:
        data = line.split()
        coord.append([float(data[2]), float(data[3]), float(data[4])])
    # reference points (0-based indices into the atom list)
    Cl_p = np.array(coord[9])
    C6_p = np.array(coord[5])
    C7_p = np.array(coord[6])
    C8_p = np.array(coord[7])
    Wox_p1 = np.array(coord[0])
    Wh1_p1 = np.array(coord[1])
    Wh2_p1 = np.array(coord[2])
    # ring bond vectors (C7C6_v and C7C8_v are defined but not used below)
    C7C6_v = C6_p - C7_p
    C7C8_v = C8_p - C7_p
    C6C8_v = C8_p - C6_p
    # rotation angles: 0, 20, 40, ..., 120 degrees in steps of pi/9
    theta_list = [np.pi*(1.0/9.0)*float(i) for i in range(7)]
    axis1 = C6C8_v
    for theta in theta_list:
        R = rotMatrix(axis1, theta)
        Wox_p2 = np.dot(R, Wox_p1)
        Wh1_p2 = np.dot(R, Wh1_p1)
        Wh2_p2 = np.dot(R, Wh2_p1)
        # we rotate about the axis across the Cl atom, so translate the water
        # back along the O-Cl direction to preserve the O-Cl distance
        trans_v = (Wox_p2 - Cl_p)/np.linalg.norm(Wox_p2 - Cl_p) \
                  * (np.linalg.norm(Wox_p2 - Cl_p) - np.linalg.norm(Wox_p1 - Cl_p))
        Wox_p2 = Wox_p2 - trans_v
        Wh1_p2 = Wh1_p2 - trans_v
        Wh2_p2 = Wh2_p2 - trans_v
        filename = "%.1f" % (theta*180.0/np.pi)
        # "OP" = out-of-plane rotation about axis1
        ifile = open("chlorobenzH20_OP_%sdeg.txyz" % filename, 'w')
        ifile.write(lines[0])
        # keep columns 0:6 (index, element) and 49: (type, connectivity) verbatim
        ifile.write("%s%12.6f %12.6f %12.6f%s\n" % (lines[1][:6], Wox_p2[0], Wox_p2[1], Wox_p2[2], lines[1][49:-1]))
        ifile.write("%s%12.6f %12.6f %12.6f%s\n" % (lines[2][:6], Wh1_p2[0], Wh1_p2[1], Wh1_p2[2], lines[2][49:-1]))
        ifile.write("%s%12.6f %12.6f %12.6f%s\n" % (lines[3][:6], Wh2_p2[0], Wh2_p2[1], Wh2_p2[2], lines[3][49:-1]))
        for i in range(4, len(lines)):
            ifile.write(lines[i])
        ifile.close()

rotate_water(sys.argv[1])
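To run it on the initial Tinker xyz file (the script and input file names here are just example names):
python rotate_water.py chlorobenz_water.txyz
This writes chlorobenzH20_OP_0.0deg.txyz through chlorobenzH20_OP_120.0deg.txyz, one file per 20-degree increment of the rotation angle.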
I rotated about the axis parallel to C6-C8 and passing through the Cl atom (see the Tinker xyz file below).
Initial Tinker file
15
 1  O   4.553365  -0.050957   0.011596    1    2   3
 2  H   5.042109   0.300726  -0.736075    2    1
 3  H   5.043243   0.270403   0.772051    2    1
 4  C  -3.077858   0.019185   0.016973  402    5  10  11
 5  C  -2.362505   1.222155  -0.004935  403    4   6  12
 6  C  -0.963557   1.214342   0.010611  401    5   7  13
 7  C  -0.282885  -0.007338  -0.024152  404    6   8   9
 8  C  -0.986495  -1.215973   0.010115  405    7
 9  C  -2.385288  -1.197092  -0.005379  401    7  10  14
10  Cl  1.451131  -0.025174  -0.010370  403    4   9  15
11  H  -4.164013   0.029438   0.010388  407    4
12  H  -2.890909   2.171654   0.010734  406    5
13  H  -0.401633   2.143467  -0.001102  408    6
14  H  -0.441889  -2.155321  -0.001783  408    9
15  H  -2.931533  -2.136433   0.009921  406   10
Here are the rotated structures.