Tutorial:tinkertut

From biowiki
Jump to navigation Jump to search

Purposes

This page provide tutorials on using AMOEBA force field via Tinker (CPU) and Tinker-OpenMM (GPU) programs.

Tutorials

Some simple tutorials about Tinker I used in my teaching below. Note you can directly download tinker executables from https://dasher.wustl.edu/tinker/ (Windows, mac. linux) for the following tutorials.

You can either use command lines in Windows CMD window or Linux terminals, or use FFX interface for some exercises. Actually applications always use command lines to operate in Linux OS. If you are not familiar with Linux, this is the oppurtunity to practice.

 https://biomol.bme.utexas.edu/~pren/courses/tinker-tut/

Below you will find more detailed instructions and discussions.

Input files

You can find examples of input files in tinker distribution under tinker/bench, tinker/example or tinker/test. Two files are required to run TINKER calculations: *.xyz and *.key For example, open butane.xyz and butane.key in tinker/example/ (available in all official distributions) to see what’s inside.

*.xyz file

  • The first number on the 1st line is how many atoms total.
  • There may be a second line that specify the box dimensions if the system is periodic (newer format since tinker 6)
  • The first column is the atomic index
  • The second column is the atomic symbol
  • The 3 -5th columns are x,y,z coordinates in Angstrom
  • The 6th is the “atom type” defined in the *.key file. This is the index tinker uses to assign parameters from the key/parameter file.
  • The 7th – last columns are lists of atoms that are connected to the current atom

*.key file

The key file may have all the actual parameters or a link to the actual parameters file specified in the first line. The parameters specify the bond, angle, torsion, vdW and electrostatic interactions between atoms based on the “atom type”. If you see an error related to OMP, please set the OPENMP-THREADS in the key file to a number less than the # of CPU cores on your computer. This sets how many CPU cores are used in the parallel execution.

For example, in protein.key below, borrowed from tinker/bench/bench7.key, the first line specific the actual parameters are contain in the amoebapro13.prm. More examples can be found in tinker/bench, tinker/example, or tinker/test.

 

parameters           $TINKERDIR/params/amoebapro13.prm
# 
# the amoebapro13.prm is the AMOEBA protein force field in tinker/params/. If you have a ligand you can add the parameters below. More info below
#
# verbose                                  # printing info for every step, for debugging mostly
#
#randomseed            123456789
integrator            respa                #this multi-time step integrator allows TINKER to use 2 fs time step
#HEAVY-HYDROGEN                    #This will increase H atom mass automatically so 3 or 3.5 fs time step can be used.                          
                                   #Kinetics will  be affected               
neighbor-list                              # this below requires your box is twice the cutoff plus 2-3 Ang.
                                          # If your box is too small for vdw cutoff but OK for Ewald, you can use "mpole-list" here.
#openmp-threads    16                       # how many core you want to use on the node.
#
#  Define the Periodic Box and Cutoffs
#
a-axis                62.23
vdw-cutoff            12.0
vdw-correction
#
#  Set Parameters for Ewald Summation
#
ewald
ewald-cutoff          7.0                  # we use such small cutoff because dipole/quadrupole die off faster than point charge.
# pme-grid  64 64 64                         #If speed is a concern, set this mannually to be slightly bigger than simulation box size. E.g. use 64 is box is 62.23 (1.2x is the best but slower). 
                                           Must be even with factors of only 2, 3 and 5. see source/kewald.f for a list. Sometimes the default grid size is conservative 
                                  

#fft-package           FFTW
#
#  Set Parameters for Induced Dipole Convergence
#
#polarization OPT3                         #OPT4 is mre accurate but OPT3 is faster
polar-eps             0.001                # the induced dipole convergence threshold
polar-predict
#
# Example of overwriting the parameters in the .prm file. The bond parameters between 
# atom classes 1 and 4 are redefined below, which
# will overwrite those already in the amoebapro13.prm. 
#this is only for illustration purpose
# bond          1    4          200.00     1.1

How to generate input files of your own

proteins, nucleic acids, common organics

In tinker/params, you can find pre-existing parameters for certain molecular systems, both AMOEBA, amber, charmm, opls, mmff, and mm2/3.

PDB to xyz

If you have a pdb file you can convert it tinker xyz file by specifying a prm file above. It will remove the heteroatoms such as ligand, which you can use POLTYPE to generate parameters for.

First you need to determine the protonation state of charged residue. Tools like propka can do this quikly: http://nbcr-222.ucsd.edu/pdb2pqr_2.0.0/

Once you decided the protonation state, make sure the residue name in PDB file match the protonation state (list below). For example, ASH is the neutral form of ASP. Then you can run "pdbxyz.x"

'GLY' 'ALA' 'VAL' 'LEU' 'ILE' 'SER' 'THR' 'CYS' 'CYX' 'CYD' 'PRO' 'PHE' 'TYR' 'TYD' 'TRP' 'HIS' 'HID' 'HIE' 'ASP' 'ASH' 'ASN' 'GLU' 'GLH' 'GLN' 'MET' 'LYS' 'LYD' 'ARG' 'ORN' 'AIB' 'PCA' 'UNK'

UNK means unownk; AIB, ORN and PCA  are modified AA.

https://www.ebi.ac.uk/pdbe/entry/pdb/3q9g/modified/ORN 

http://www.ebi.ac.uk/pdbe/entry/pdb/4ua7/modified/PCA

Terminal residue names:

COH ACE NH2 NME  FOR

HETATM (residue name) recoginzed by pdbxyz:

HOH K CA MG NA CL

AMOEBA library

For AMOEBA, please use amoeba09.prm for common small molecules, amoebapro13.prm for proteins. Nucleic acid parameters coming soon (end of 2017). More information here: http://biomol.bme.utexas.edu/tinker-openmm/index.php/TINKER-OPENMM:Amoeba

If you have a protein, you use "pdbxyz" to convert it to tinker xyz file. It will ask you for a key file (1bty.key below) or you can create a key file with "parameters $TINKERDIR/params/amoebapro13.prm" in it:

pdbxyz 1bty.pdb -k ./1bty.key

AMOEBA for a new ligand

Note that pdbxyz recognize proteins, water (res name HOH and some ions). The ligand (benzamidine above) is stripped. For that you need to derive your own parameters. For AMOEBA this can be done using POLTYPE: https://biomol.bme.utexas.edu/wiki/index.php/Research_ex:Poltype

Input: ligand.sdf or ligand.pdb (for example, you can truncate the ligand "BEN" out of the 1bty.pdb) 
Output: will produce the xyz and corresponding key files (ttt.xyz and ttt.key):

If you will merge the ligand xyz file with another molecule, make sure you set the suitable range for atom types (an option for pOLTYPE) so that they won't overlap. See the "Check the results section" on POLTYPE website before using them. Use "analyze.x" to make sure the xyz and key files work correctly: "analyze.x ttt.xyz -k ttt.key ep"

build a solvent box

arbirary solvent

Use xyzedit.x to build a water (or any solvent) box starting from a water monomer (e.g. tinker/test/water.xyz). You can specify many monomers to add and how big the box is. Type xyzedit.x at the command line and you will be asked to enter relevant inputs (or you may type all the parameters in one line).

Make sure edit the key file to add box size and Ewald related keywords (see http://biomol.bme.utexas.edu/tinker-openmm/index.php/TINKER-OPENMM:Tinker-tut#.2A.key_file)

prebuilt water box

Some prebuilt waterboxs: http://biomol.bme.utexas.edu/~pren/downloads/waterbox

Larger boxes can be created as supercells of smaller boxes, e.g. this following command will create a box 64x larger than the orignial box.

#!/bin/bash
echo -e "5\n4\n\n" | crystal.x watersmall.xyz -k tinker.key

Combining two xyz files

Tinker "xyzedit.x" program (option 20) can be used to combine two xyz files into one (matching key file for each xyz is required). You can merge the key or parameter files by appending one to the other, but please make sure the atom types are not overlapping between the two.

For the above 1BTY example, poltype will translate/rotate the ligand in standard orientation. You may need a script to copy the coordinates from ligand.pdb into ttt.xyz (hwich you get out of the POLTYPE run).

Then you can use xyzedit.x to combine 1bty.xyz and lig.xyz into one xyz file with the two molecules orient/position as in PDB. Merge the key file by appending ttt.key (except the first line which contains header already in amoebapro13.prm of 1bty.key) to 1bty.key; again avoid overlapping atom types between the ligand and protein.

Soaking solute in solvent

 Tinker "xyzedit.x" is an interactive program (option 20) that can be used for this as well. You just need a molecule.xyz and waterbox.xyz along with matching key files (actual names do not matter).

Some water files (monomers, boxes) are in tinker/example or test folder. One can make a water box of different size using this program too (starting from one water or a cluster of water and use option 19). Option 21 can add ions to solvent box. To distiguish between AMOEBA or fixed charged FF simulation files, check the key file to see which prm file it is pointing to.

Or you can use this utility: http://www.ime.unicamp.br/~martinez/packmol/home.shtml

On renlab cluster, there is "packmol" installed at /home/liuchw/Softwares/packmol-20.14.4/packmol

Two examples for generating a cubic box:

example 1: generate a pure liquid box, using water as an example. In the real cases, one could calculate the number of molecules and the length of the cube according to desired density.

filetype tinker
output water-box.xyz
tolerance 2.0

structure Wat.xyz
number 216
inside cube 0. 0. 0. 18.6
end structure

example 2: soak a MeOH molecule into the above box. We first let MeOH be in the center of the box with coordinates (9,9,9), and then soak with water in 18.6 A cubic box.

filetype tinker
output liquid-box.xyz
tolerance 2.0

structure MeOH.xyz
number 1
center
fixed 9.0 9.0 9.0 0.0 0.0 0.0
end structure

structure Wat.xyz
number 216
inside cube 0. 0. 0. 18.6
end structure

In the above two examples, all one needs are the "Wat.xyz" and "MeOH.xyz" files. After the box is generated, it is straightforward to use Tinker to minimize the structure.

After construct the simulations, relax the system using minimize and dynamics (restraint the solute first), with proper restraints and heating. See MD setup for details

Check your xyz/key files and common errors

Once you have a pair of xyz and key (and prm file the key uses one), you can use "analyze" to do some basic check

analyze xxx.xyz -k xxx.key e" will print energy and its components.

If there are any error related to missing parameters, you need to fix them. 

"analyze xxx.xyz -k xxx.key em" prints total charge and dipole moments etc.

The total charge should match your expectation (0, -1, +1) etc
If this is a box of solute in water, you should add neutralizing ions (K+ or Cl-) and then "additional" 0.1 mM KCl. So the next charge should be 0!

"analyze xxx.xyz -k xxx.key ep" prints all parameters

Check if every atom has a multipole (number of multipole matches no of atoms) 
check if every atom has vdw (if an atom has 0.000 0.000 for vdw R and eps, that means the vdw parameters are missing; need to add) 
check polarization groups are making sense. If you see every atom is its own group, this is likely wrong. Typically we keep each function group (e.g. benzyl) as one group. 

analyze with EL will print out large interactions, for example large ele or vdw interactions of two atoms that are too close (in early PDB structure this can happen)

analyze with "ED" option (debug) will print out all individual interactions and lot more information. The output is of course huge if you do this to a large bix, but you can use vdwterm only, bondterm only, angleterm only. multpoleterm only, polarizeterm only, etc to check a specific energy component. Or use bondterm none, angleterm none....ect to turn off some interactions you don't want to check. A full list of xxxTERM can be found in tinker/source/prmkey.f(grep TERM).

Induced dipole not converging error

If analyze give you error related to induced dipole not converging, the structure is bad. If you see "induced dipole not converging" error in the beginning of your MD, it is also due to bad structures (atoms too close). The structure needs refinement, or missing parameters above. To refine the structure

1. minimize the structure. you can turn off polarization (polarizeterm none) or even permanent ele completely (multipoleterm none & polarizeterm none) first to minimize using vdw to move close atoms apart. Then turn on multipole and then induce back

2 For MD simulations of protein/RNA/DNA in water, run MD first at low T with pro/rna/dna "heavy atoms" restrained (e.g. restrain-position -1 1000 5.0 or restrain-position 2 , , , 1.0) to let water & counter ions to relax for few ns. Then gradually heat up and remove the restraints. Note "restrain-position -1 1000 0.0" or "restrain-position 2 , , , 0.0" DOENOT mean 0 restraint (k=0.0) on atoms 1 to 1000 or atom 2!!!  "0.0" is replaced by 100.0 in tinker by default. If you don't want restraint, comment out/remove these lines.

3. But if you see this error randomly over few hundreds of ps, which may happen for tough systems of large/many charges e.g DNA/RNA, you can add "USOLVE-CUTOFF 0.0" option to key. This will change the precondition behavior and make induced dipole solver more stable. It is not used by default because for "easy" systems like boxes of water, this makes MD little slower (more iterations to converge induced dipoles). If still have stability issue you may further add "USOLVE-DIAG xx" where xx is 2.0 by default and a smaller number towards 1 (e.g. 1.5) will make more stable but slightly slower.

4. If system is rather big, use "polar-eps 0.01" or 0.001 during system setup and 0.0001 for production.

Modified PRO & NA residues

Build a structure with modified residues

Modify residues in a PDB file.

Input: PDB files of the biopolymer, an original residue and a modified residue. The original residue must be present in the biopolymer, and the modified residue must share at least 3 atoms with the original residue with identical coordinates. The modified residue is usually obtained by manually modifying the original residue.

Output: A PDB file with specified residues modified to the new residue. link

# change residues 2 and 14 in dna15.pdb from res_A to res_pA
morphling.py -i dna15.pdb -o pdna15.pdb -t0 res_A.pdb -t1 res_pA.pdb -n 2,14

Convert pdb to xyz

link

More on running Tinker and tinker9 (GPU)

Manual

https://tinkerdoc.readthedocs.io/en/latest/

Compiling & running:

Software:tinkergpu

Command line

Tinker programs can be run interactively, which is the best way to learn what are the required inputs. Tinker programs can also run in background with all parameters specified, for the purpose of automation:

analyze.x ttt.xyz -k ttt.key ep
dynamic.x bench7 100000 3.0 6.0 2 298.0 N > ben7.log &
# the bench7.xyz above can be found in tinker official distribution inside tinker/bench/
#2-fs time step for MD here is ok because of the "integrator respa" in the key file

Or for tinker9


#!/bin/bash
#source /home/liuchw/.bashrc.tinker9
#export TINKER9=/home/pren/tinker9/tinker9_sugar/build/
#export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.8/targets/x86_64-linux/lib/
export CUDA_DEVICE_ORDER=PCI_BUS_ID
export CUDA_VISIBLE_DEVICES=0 # device number; can use 1 or 2 if there are multiple GPU cards
$TINKER9/tinker9 dynamic bench7.xyz -k bench7.key 5000 2.0 1000.0 4 298.15 1.0 N > out &

Pause and resume MD

Tinker creates .dyn file that contains coordinates and velocities needed for restart MD. To stop MD, simply create a .end file (e.g. touch myrun.end) in the folder where MD is running. At the next time MD frame was written, the end file will signal Tinker to stop. To resume later simply rerun dynamics with the presence of the .dyn file. Note the output file of dynamics does not resume the count of MD steps/frames.

Additional notes for "tinker-openmm"

  • It s recommended to use the respa inetgrator and 2-fs time step
  • "heavy-hydrogen" in key file allows a 3-fs time step (not reommended)
  • Bussi thermostat
  • Only MC barostat is available for now. We are adding virial/Langevin piston pressure to openmm.

How to specify different ensembles for MD simulation

For TINKER-openmm

Tinker-Openmm is no longer supported since 2021. Use tinker9 for GPU MD (see Tinker GPU)


Not many options available. 

For NVT, use Bussi for thermostat, RESPA integrator (2fs)

For NPT, use Montecarlo for Barostat, Bussi thermostat. Verlet (1 fs) is safer than RESPA for use with MC barostat if very accurate density is needed.

See below for keyword syntax.

Available thermostat and barostat in TINKER (2021)

Thermostats:

METHOD                     KEYWORD (lines you add to .key file)
Bussi-Parrinello           thermostat bussi
Berendsen                  thermostat berendsen
Andersen Stochastic        thermostat andersen
Nose-Hoover                thermostat nose-hoover

There are 3 barostats available via the “barostat” keyword:

METHOD                     KEYWORD
Berendsen                  barostat berendsen
Monte-Carlo                barostat montecarlo
Lagevin                    Barostat Langevin


The above thermostat and barostats are available for the Verlet, Beeman and RESPA integrators, and can be used in combination with those integrators. These integrators are available via:

METHOD                     KEYWORD
Verlet                     integrator verlet
Beeman                     integrator beeman (tinker 8 CPU only)
RESPA                      integrator respa

Note that the defaults are Bussi for thermostat, Berendsen for barostat, and Beeman for integrator.

Then there are two special integrators, a stochastic one, and a Nose-Hoover that does NPT. The stochastic integrator uses a kind of Langevin temperature bath for thermostating, and does listen to the barostat keyword. The Nose-Hoover integrator uses a separate code branch and does only NPT with Nose-Hoover methods following Martyna-Tuckerman-Klein. You can get these via:

METHOD                      KEYWORD
Stochastic                   integrator stochastic (no need for other T control)
Nose-Hoover NPT             integrator nose-hoover (no need other keywords for T or P; starting structures need to reasonable)

Recommended NVT and NPT combinations

NVT keywords combination in the key file (2fs time step)

thermostat bussi 
integrator RESPA

Preferred/Recommended NPT (2fs, relative isotropic and homogenous systems)

archive (dcd-archive to produce compressed traj for large systems; need .xyz or .pdb to render in VMD)
thermostat bussi
barostat MonteCarlo
integrator RESPA
vdw-cutoff 9 (12 for amoeba)
vdw-correction
Ewald
Ewald-cutoff 9.0 (7.0 for amoeba)
neighbor-list
polar-eps 0.0001 (only for amoeba)
polar-predict (only for amoeba)

Recommended NPT (2fs, anisotropic systems such as membrane, or large local volume fluctuation in protein folding. Slower than above for virial calculations)

Documentation: https://tinkerdoc.readthedocs.io/en/latest/text/feature/integrator-langevin-piston.html#implementation-2 Semi-isotropic pressure control not finished yet

barostat Langevin
Integrator RESPA   #or Verlet/1 fs; T control not needed
a-axis 40 #change this to your actual box size; b or c can be different from a
vdw-cutoff 12
vdw-correction
Ewald
Ewald-cutoff 7.0
neighbor-list
polar-eps 0.0001
polar-predict

 

Alternative NPT (1fs, built in P and T control)

integrator nose-hoover

Recommended keywords (add somewhere in the .key file) for NVT

integrator respa
thermostat bussi

Command:

dynamic xxx.xyz 100000 2.0 1.0 2 298  (100000 steps, 2.0 fs time step, dump structure every 1.0 ps, option 2 is NVT, 298 is the target T)

It is also possible to combine "integrator Beeman" or "thermostat Berendsen” or "thermostat Andersen". But RESPA allows large time steps (2.0 or 2.5 fs) than Beeman. Berendsen thermostat does not provide canonical ensemble fluctuation.

NVE

integrator respa

command:

dynamic xxx.xyz 100000 2.0 1.0 1

No need for thermostat or barostat of course. It is best to use smaller time step such as 1.0fs to conserve energy better. May even use smaller polar-eps (10^-6) than default (10^-5) in the key file.

Non periodic system, e.g. gas molecules not in a box

integrate stochastic

If there is no box (a-axis) in the key file, or no box dimensions in the den file, the system is non-periodic. This will set the stochastic temperature control along with the stochastic MD integration. For gas phase molecular cluster (very few atoms), the recommended time step is 0.1 fs.

Free energy calculations

Theory about free energy calculation: http://alchemistry.org/wiki/Bennett_Acceptance_Ratio

About AMOEBA softcore and BAR, read our book chapter:

PL BAR softcore.pdf

MD setup-initial equilibration

Determine protonation state of ionizable groups (ASP, GLU, LYS, ARG, HIS). Use propka here http://nbcr-222.ucsd.edu/pdb2pqr_2.0.0/ for proteins. For ligand, you may use pka predicction tool from Chemaxon.

  • Add water and counter ions (neutralize system) so that the density is ~1.0 g/cc. The distance between protein and box wall should be 10-15 A. Check/remove extra water in the binding pocket as necessary (keep the crystal water molecules in the pocket)
  • In the key file, set PME-grid to be 1.2x box size in Ang. For example, if box size is 55, “pme-grid 64 64 64” is enough. The default of Tinker is usually more conservative. See kewald.f for allowed grid values. Add “neighbor-list”, “polar-eps 0.01”, “vdw-cutoff 12”, "vdw-correction" “integrator respa”, “ewald”, “ewald-cutoff 7.0”  to .key file. You can even turn off polarization initially (polarizeterm NONE) during EQ and add it back (polar-eps 0.001 or tighter for production)
  • Minimize the box before MD. If you see errors related to polarization (induced dipole not converge), do this in two steps: first minimize with electrostatic (multipoleterm NONE) and polarization turned off (polarizeterm NONE in .key file), to ~5.0 or lower; then minimize again with ele then ele+polarization back on to ~2.0 or lower. You may use position-restraints (HEAVY ATOMS only) if you don't want your solute to undergo dramtic changes. (RESTRAIN-POSITION -1 200 50.0, means restrain atoms 1 to 200 using a force constant of K=50 kcal/mol). 
  • MD relaxation. You may use GPU for MD now (dynamic_omm.x). Use repsa integrator and 2fs time step. Add position-restraints to restrain protein & ligands (HEAVY ATOMS only) in the key file initially; For metal ion-protein binding or weak binding ligand, we also suggest to use ~3 distance restraints between ion/ligand and first shell atoms. This is to prevent water disrupt the initial solute structure during the equlibration. 
    • Use 2 fs for all MD below. If you want to use 3fs (e.g for large systems), you need to "heavy-hydrogen" in the key file. Another way to spede up is to use the OPTx: "polarization OPT4" or "polarization OPT3" OPT3 is faster but bigger error.
    • With the positional/distance restraints, run ~2ns  NVT MD to gradually (e.g. exponentially) heat up the system from 10K  to 298K or whatever expt T should be. Water/ions are relaxed after this step.
    • Run NVT at 298K or expt T for ~2ns while gradually turning off all the position and distance restraints on protein-ligands.
    • Run 1.5 ns NPT to compute the average density/box size (ignore the first 300ps). For GPU, only "barostat MonteCarlo" is available. 
    • NVT MD for ~2ns with box fixed at the average lengths from above.
    • Check protein RMSD (esp. around binding pocket) from crystal structure after every step above. If any step gives large RMSD, redo that (and previous step) with longer/slower MD to correct the problems.
  • MD production run. For alchemical free energy, this involves setting the ligand group and various lambda values for ele and vdw to scale the interactions between ligand and surrounding (see below BAR section).

Hydration or binding using BAR

Alchemical free energy calculations are available in TINKER, Tinker9 GPU (TINKER-OpenMM no longer supported). One needs to specify the ligand or solute using the group keyword in the key file (example below). The lambda scaling schedule can be specified by user, automated by the "bar.x" in TINKER. For each set of lambda value (scaling the interaction betweem ligand and enviroemnt and inside ligand), one needs to perform one MD simulation. The bar.x is then used to analyze the dG between neighboring steps i and j, using the arc files from MD simulation i and j. The total free energy is then sum of 1-2, 2-3, ...N-1 and N. See this reference for examples: J Comput Chem. 2017 Sep 5;38(23):2047-2055

Note GPU MD and bar code is much faster than the CPU one 


Running BAR code in Tinker

To compute free energy between two states (i and j), the bar.x (CPU or GPU) will need two trajectories (and two key files of course) of the same simulation length (frames). the two states can be neighboring lambda states in HFE or binding.

See above (and PNAS 2008 paper) for BAR equations.

Step 1 (bar option 1) is to generate two files, each contain the energy of its own state and perturbed energy (Eii, Eij). The second energy is using arc i and key of state j (bar will do this for you).

tinker9 bar 1 arc1 300 arc2 300 N > barfile &

"tinker 9 bar" is for GPU. You can replace with CPU version bar.x (slower). the arc1 and arc2 can be in different folders with their matching key files.

Example barfile produced from this step: https://github.com/leucinw/autoBAR/blob/main/examples/Phenol-HFE/liquid/liquid-e000-v000.bar.

The output below shows 500 frame used, 300K, for frame 1, energy of Eii, Eij and volume of the system (since we can do NPT where volume changes)

 500    300.00  comments
      1          -6396.8707        -6396.7580        27379.1152

The second section of this same file will have Ejj, Eji, volume

Step 2 (bar option 2) Use the barfile produced from above to compute dG , dH and dS

tinker9 bar 2 {barfile} {startsnapshot} {totalsnapshot} 1 {startsnapshot} {totalsnapshot} 1 > {enefile}

Note this step is much faster since it uses energy from above. You can also set to use a subset of the total frames, e.g. to exam how the dG converge with longer simulaitons. Example out: https://github.com/leucinw/autoBAR/blob/main/examples/Phenol-HFE/liquid/liquid-e000-v000.ene

HFE

Common keywords for hydration free energy calculation (suggest NPT with MC barostat or Langevin piston barostat)

ligand                  -1 24, 26    #negative means range: atoms 1 to 24 plus 26
a-axis 40
vdw-cutoff 12
vdw-correction
Ewald
Ewald-cutoff 7.0
vdw-lambda              1.00
ele-lambda              0.50
barostat Monte Carlo   #NPT. Use Langevin NPT if care e.g. membrane 
thermostat Bussi
Integrator RESPA # 2-fs time step for solution; 1 for gas
neighbor-list
polar-eps 0.0001
polar-predict
#openmp-threads    16 #this is for CPU
Vdw-annihilation      #this is to help sampling conformation at low lambda by removing intrmol vdw interactions; not needed for rigid solute.

The ele and vdw schedules are found in our previous paper. Also recommended as below. Note you can add or remove steps based dG results and error bars.

Run NPT for each lambda below. 2-5 ns for each lambda typically depending how complex your system (eg, 2ns is sufficient for K+ in water but a large ligand mayneed 5ns). Use BAR-NPT to analyze free energy between neighboring lambda (ignore first 200 ps). The toal FE is the sum of each step. You can check cumulative free energy convergence as a function of simulation time. Add more MD if needed.

Repeat the set of simulations for ligand in solution and ligand itself in gas phase (not necessary if gas energy is 0 like single ion). The gas phase part should use the matching keywords as above (but no a-axis, Ewald, and time step for constant T MD is 0.1fs).

Typical lambda schedule:

Ele L	Vdw L
0	0
0	0.4
0	0.5
0	0.525
0	0.55
0	0.575
0	0.6
0	0.625
0	0.65
0	0.675
0	0.7
0	0.725
0	0.75
0	0.775
0	0.8
0	0.9
0	1
0.1	1
0.2	1
0.3	1
0.4	1
0.5	1
0.6	1
0.7	1
0.8	1
0.9	1
 1     1

binding free energy

For host-guest binding, the simulation process similiar to above but involve two sets: one is host-guest-water, and the other guest-water. In both cases, guest is the ligand that is being scaled. For host-guest-water, one also applies a bond restraint between host and guest. This restraint can be turned off in one or two steps for lambda=1 but should be kept when L<1.

One way is to set the restraint K=0 when Lele, Lvdw=1, then K=90% when Lele=0.9/Lvdw=1, then K=100% when Lele=80%/Lvdw=1..... K=100% for all rest of L including L=0. 
The goal is for L=1 (host-guest full interaction), there is no restraint. when L=0, restraint is at 100%. The L=0 state seems "incorrect" due to the restraint but can be corrected (below).

This "bond" is between group of guest/ligand and group of host atoms. It is best to minimize the distance between the centers of the two groups (for sampling). For example, if you are simulating an ion binding to a spherical cavity like CPP, you can pick 3 atoms on the host whose center is roughly the center of the ion; the restraint will be between the ion and this group. A correction is needed to "remove" the effect of this restraint and standard volume that goes into the final binding free energy.Tinker/utiity/freefix.f can be used to calculate the correction. Note this correction is typically positive (make binding energy less negative) since restraint leads to overestimation of binding.

*The correction for lambda=0 from RTln(C0*V), where C0=1/1660 A^3 and V=integrate{4*pi*r^2*exp[-k*(r-r0)^2/RT]dr}. k=15 kcal/mol in the example below. If you are using a harmonic restraint (example below) and the equilibrium r0 is not 0, numerical integration is necessary. A good reference is JACS v126, NO. 24, 2004.

Brandon setup script can also set up protein-lig distance restraint now. Basically select a central heavy atom on ligand (or a function group) and another group of nearby 3-4 Ca from protein.

 Francis wrote a program to pick restraints based on Boresch paper. It can include additional angle and torsion restraints. However we found it is best to just use the simple distance restraints. The additional restraints may introduce bias towards ligand or host dynamics if not picked carefully.

Tinkergpu:get_rot_rest


If you use distance restraint when lambda =1 for both ele and vdw, you can remove the effect of restraint from FEP or BAR; or you can avoid this correction if the restraint strength is set to 0 when the lambda =1 (gradually turned on when lambda-> 0 for both vdw and ele).

ligand                  -1 24, 26    #negative means range: atoms 1 to 24 plus 26
vdw-lambda              1.00
ele-lambda              0.50
GROUP 1 -1 3 //ligand group
GROUP 2 100, 102, 138 //protein group
RESTRAIN-GROUPS  1  2  15.0  2.0 2.0
Vdw-annihilation
#lower bound & upper bound (2.0, 2.0) do not have to be the same. 15 is the force constant.

Each set of lambda corresponds to one separate MD simulations of 5-10ns or longer. Save MD frames at every 3-5 ps. Use bar.x (bar_omm.x for GPU) to analyze the free energy dG between the neighboring lambda values.

Unlike HFE, no need for gas-phase simulation since the end states of the ligand-water vs. ligand-protein cancels.

Some scripts fron CW: https://github.com/leucinw/ComputTools/tree/master/bardemo 

ion HFE steps

To compute ion hydration FE,

  • prepare ion.xyz and waterbox.xyz. One key file containing all parameters (amoeba09.prm has ion and bunch of ions). If you have a new ion, add its parameters including multipoles, polarizability/damping, vdw
  • water box should be ~50 A. A list of prebuilt water box (you many need to change atom types in xyz if you are using different key) https://biomol.bme.utexas.edu/tinkergpu/index.php?title=Tinkergpu:Tinker-tut#prebuilt_water_box
  • use xyzedit to soak ion in water (see tutorial above about building and combining xyz)
  • Run NPT for ~1ns to relax. See tutorial on this page about dynamics on GPU NPT keywords (using Langevin or MonteCarlo)
  • Take the last relaxed structure to create inputs for HFE. Best to create a series of folders, one for each lambda with inputs and run scripts. See here for lambda schedule https://biomol.bme.utexas.edu/tinkergpu/index.php?title=Tinkergpu:Tinker-tut#HFE
  • Get a list of nodes from Google spreadsheet of lab cluster. Ask for it if you can't find ~10 GPU nodes.
  • each MD for ~2ns (add more if needed)
  • Use a script to send jobs to each node. You can have 2 jobs concurrent on one GPU and rotate your jobs (~50 lambdas) through.

hydration and binding free energy examples

Please unzip the files and see README for instructions.

HFE and binding free energy calc using Brandon's script

Set up from complex PDB and ligand xyz/key (POLTYE): https://github.com/bdw2292/AMOEBAAnnihilator/blob/main/README_HELP.MD 

  • create conda env according to instructions on the annihilator github page
  • make .allpurpose.bashrc something like /home/eew947/.allpurpose.bashrc with your username replacing mine everywhere in it
  • ssh to node you want to start from
  • source ~/.allpurpose.bashrc
  • prepareannihilator.ini
  • if you want to use daemon, you have to add the keyword for it in annihilator *ini and have it already running before starting annihilator
  • start annihilator by nohup yourpathtoannihilatorhere/AMOEBAAnnihilatorModules/amoebaannihilator.py &

OSRW

Only implemented in tinker CPU. GPU version is under development.

Visualization

Force Field Explorer

By Mike Schneider and Jay Ponder, http://dasher.wustl.edu Can visualize the xyz and arc (MD trajectory) files; create and start TINKER calcualtions

VMD

Choose TINKER format when open a xyz file. Trajectory file (.arc) also works.

Pymol

Sometimes the xyz file can not be displayed correctly

Other resources

https://sites.google.com/site/biomolsimstw/workshop-materials/tutorials

Please email me if you have tutorials related to AMOEBA or Tinker you would like to share.

AMOEBA Force Field Papers:

AMEOBA FF:

  1. Water model: Ren, P. Y.; Ponder, J. W., Polarizable atomic multipole water model for molecular mechanics simulation. Journal of Physical Chemistry B 2003, 107 (24), 5933-5947.
  2. Small molecules: Ren, P.; Wu, C.; Ponder, J. W., Polarizable Atomic Multipole-based Molecular Mechanics for Organic Molecules. J Chem Theory Comput 2011, 7 (10), 3143-3161.
  3. Proteins: Shi, Y.; Xia, Z.; Zhang, J.; Best, R.; Wu, C.; Ponder, J. W.; Ren, P., The Polarizable Atomic Multipole-based AMOEBA Force Field for Proteins. J Chem Theory Comput 2013, 9 (9), 4046-4063.
  4. DMP/TMP/Base & nucleic acids: 
  • Zhang, C.; Lu, C.; Wang, Q.; Ponder, J. W.; Ren, P., Polarizable Multipole-Based Force Field for Dimethyl and Trimethyl Phosphate. J Chem Theory Comput 2015, 11 (11), 5326-39.
  • Zhang, C.; Bell, D.; Harger, M.; Ren, P., Polarizable Multipole-Based Force Field for Aromatic Molecules and Nucleobases. J Chem Theory Comput 2017, 13 (2), 666-678.
  • Zhang, C.; Lu, C.; Jing, Z.; Wu, C.; Piquemal, J. P.; Ponder, J. W.; Ren, P., AMOEBA Polarizable Atomic Multipole Force Field for Nucleic Acids. J Chem Theory Comput 2018, 14 (4), 2084-2108.

AMOEBA+ model (water published, small and biomolecules in progress)

  1. Liu, C.; Piquemal, J. P.; Ren, P., AMOEBA+ Classical Potential for Modeling Molecular Interactions. J Chem Theory Comput 2019.
  2. Liu, C.; Piquemal, J. P.; Ren, P., Implementation of Geometry-Dependent Charge Flux into the Polarizable AMOEBA+ Potential. J Phys Chem Lett 2019, 11, 419-426.

AMOEBA+NN model: incorporating Neural Networks

Wang, Y., et al. (2024). "Incorporating Neural Networks into the AMOEBA Polarizable Force Field." The Journal of Physical Chemistry B 128(10): 2381-2388.


Lectures on Force Field, AMOEBA, AMOEBA+

Under Adv Mol Modeling menu:

 Link

More Explicit Free Energy Calc Steps - Brandon Walker

Old/obsolete description here: Obsolete description here