Tutorial:amber
MM-PBSA free energy estimation tutorial
a. Theory preparation
A very good tutorial can be found in http://ambermd.org/tutorials/advanced/tutorial3/. I will suggest you to first read this tutorial so you can get a basic understanding of MM-PB/GBSA. The following steps will guide you to b) set up the complex system, c) run MD simulations on our GPU cards using pmemd.cuda, and d) estimate the complex binding affinity.
b. System setting up
For AMBER env setting up, please refer to [1]
System setting up in AMBER can be done in either xleap or tleap program. The former provides a graphical interface thus it is easier for the beginners. Here we will use tleap to set the system up (generate the necessary files for AMBER run).
Copy the leaprc.ff14SB (this is the force field file we will be using) to the current working directory as leaprc.
cp $AMBERHOME/dat/leap/cmd/leaprc.ff14SB ./leaprc
Append the following commands to the leaprc file.
- Notes: these commands are helping us to generate the complex prmtop and rst7 files in gas phase (COMP.prmtop & COMP.rst7), add Na+ and Cl- ions to the salt concentration we wanted (addions commands), add solvent molecules (TIP3P water here) and generate complex prmtop and rst7 files in explicit water (COMP_wat.prmtop & COMP_wat.rst7), generate the ligand and protein prmtop and rst7 files (LIG.prmtop & LIG.rst7; PRO.prmtop and PRO.rst7)
COMP = loadpdb COMP.pdb saveamberparm COMP COMP.prmtop COMP.rst7 addions COMP Na+ 0 addions COMP Cl- 0 addions COMP Cl- number (see notes *) addions COMP Na+ number (see notes *) solvateoct COMP TIP3PBOX 8.0 saveamberparm COMP COMP_wat.prmtop COMP_wat.rst7 savepdb COMP COMP_wat.pdb LIG = loadpdb LIG.pdb saveamberparm LIG LIG.prmtop LIG.rst7 PRO = loadpdb PRO.pdb saveamberparm PRO PRO.prmtop PRO.rst7 quit
Execute tleap command to generate the bunch of prmtop and rst7 files
$AMBERHOME/bin/tleap
- Notes: After we neutralize the system, the number of ions needed to reach the salts concentration should be calculated.
The above steps help us generate one individual system for the next runs. Here is a python script to automatically produce multiple systems. Modifications are very possibly needed for your own cases.
#!/usr/bin/env python #Generate AMBER input files for use in MM-PBSA free energy calculations # files need to be prepared: # 1. leaprc file # 2. complex pdb file #Chengwen Liu import sys,os def prepareInputs(complexPdbFile): lines = file(complexPdbFile).readlines() filename = complexPdbFile.split(".pdb")[0] f1 = open("Integrin_%s.pdb"%filename, "w") if "I1" in complexPdbFile: nAtoms = 1557 if "I2" in complexPdbFile: nAtoms = 1544 for n in xrange(nAtoms): f1.write(lines[n]) f1.close() f2 = open("Collagen_%s.pdb"%filename, "w") for n in xrange(nAtoms, len(lines)): f2.write(lines[n]) f2.close() f3 = open("temp", "w") f3.write("COMP = loadpdb %s\n"%complexPdbFile) f3.write("saveamberparm COMP %s.prmtop %s.rst7\n"%(filename, filename)) f3.write("addions COMP Na+ 0\n") f3.write("addions COMP Cl- 0\n") f3.write("solvateoct COMP TIP3PBOX 8.0\n") f3.write("saveamberparm COMP %s_wat.prmtop %s_wat.rst7\n"%(filename, filename)) f3.write("savepdb COMP %s_wat.pdb\n"%filename) f3.write("LIG = loadpdb Collagen_%s.pdb\n"%filename) f3.write("saveamberparm LIG Collagen_%s.prmtop Collagen_%s.rst7\n"%(filename, filename)) f3.write("PRO = loadpdb Integrin_%s.pdb\n"%filename) f3.write("saveamberparm PRO Integrin_%s.prmtop Integrin_%s.rst7\n"%(filename, filename)) f3.write("quit\n") f3.close() os.system("cat leap temp >leaprc") os.system("tleap") #do the HMassRepartition and generate a new prmtop file f4 = open("parmed.in", "w") f4.write("HMassRepartition\n") f4.write("parmout %s_wat_heavyH.prmtop\n"%filename) f4.write("go\nquit\n") f4.close() os.system("/opt/software/ParmEd-master/parmed.py -p %s_wat.prmtop -i parmed.in"%filename) return #Main currDir = os.getcwd() files = os.listdir(currDir) for eachPdb in files: if ".pdb" in eachPdb: #print eachPdb filename = eachPdb os.system("mkdir %s/%s"%(currDir,filename.split(".pdb")[0])) os.system("cp %s/leap %s/%s"%(currDir, currDir,filename.split(".pdb")[0])) os.chdir("%s/%s"%(currDir, filename.split(".pdb")[0])) os.system("cp %s/%s ."%(currDir, filename)) prepareInputs(filename)
Note that in the above script I also generated a prmtop file using parmed.py (*_wat_heavyH.prmtop). With this file, we can use 4 fs time step in the production MD simulations (where we only want to sample the conformations and do not care about the kinetics, which is wrong with 4fs time step)
c. Run MD simulations with pmemd.cuda
Routinely in AMBER, we will run minimization, heating, equilibration, cooling, equilibration and final production run in NVT ensemble. Here are the input and run files I am using in collagen-integrin system.
min.in
initial minimization prior to MD &cntrl imin = 1, maxcyc = 6000, ncyc = 2500, cut = 12 /
heat360.in
200ps NPT from 300 K to 360 K &cntrl imin = 0, irest = 0, ntx = 1, ntb = 1, cut = 10.0, ntc = 2, ntf = 2, tempi = 0.0, temp0 = 360.0, ntt = 3, gamma_ln = 1.0, nstlim = 100000, dt = 0.002, ioutfm = 1, ntpr = 250, ntwx = 250, ntwr = 10000 /
md200ps360.in
200ps NPT at 360 K &cntrl imin = 0, irest = 1, ntx = 7, ntb = 2, pres0 = 1.0, ntp = 1, taup = 2.0, cut = 10.0, ntr = 0, ntc = 2, ntf = 2, tempi = 360.0, temp0 = 360.0, ntt = 3, gamma_ln = 1.0, nstlim = 100000, dt = 0.002, ioutfm = 1, ntpr = 250, ntwx = 250, ntwr = 10000 /
cool300.in
200ps NPT from 360 K to 300 K &cntrl imin = 0, irest = 0, ntx = 1, ntb = 1, cut = 10.0, ntc = 2, ntf = 2, tempi = 360.0, temp0 = 300.0, ntt = 3, gamma_ln = 1.0, nstlim = 100000, dt = 0.002, ioutfm = 1, ntpr = 250, ntwx = 250, ntwr = 10000 /
md200ps300.in
200ps NPT at 300 K &cntrl imin = 0, irest = 1, ntx = 7, ntb = 2, pres0 = 1.0, ntp = 1, taup = 2.0, cut = 10.0, ntr = 0, ntc = 2, ntf = 2, tempi = 300.0, temp0 = 300.0, ntt = 3, gamma_ln = 1.0, nstlim = 100000, dt = 0.002, ioutfm = 1, ntpr = 250, ntwx = 250, ntwr = 10000 /
After the above steps, it is possible that our system is in its equilibrium at 300 K. So we will run NVT simulations for a long time (100 ns here)
mdprod.in
100ns NVT simulations &cntrl imin = 0, irest = 0, ntx = 1, ntb = 1, cut = 10.0, ntc = 2, ntf = 2, tempi = 300.0, temp0 = 300.0, ntt = 3, gamma_ln = 1.0, nstlim = 25000000, dt = 0.004, ioutfm = 1, ntpr = 1000, ntwx = 1000, ntwr = 1000 /
These will be running sequencially. Here is a script:
#!/bin/bash export filename=your file name source ~pren/amber-git/amber17/bashrc.amber16 export CUDA_VISIBLE_DEVICES="0" #or 1 if you have a second card $AMBERHOME/bin/pmemd.cuda -O -i min.in -o $filename\_min.out -c $filename\.rst7 -p $filename\.prmtop -r $filename\_min.rst7 $AMBERHOME/bin/pmemd.cuda -O -i heat360.in -o $filename\_heat360.out -c $filename\_min.rst7 -p $filename\.prmtop -r $filename\_heat360.rst7 -x $filename\_heat360.nc $AMBERHOME/bin/pmemd.cuda -O -i md200ps360.in -o $filename\_md200ps360.out -c $filename\_heat360.rst7 -p $filename\.prmtop -r $filename\_md200ps360.rst7 -x $filename\_md200ps360.nc $AMBERHOME/bin/pmemd.cuda -O -i cool300.in -o $filename\_cool300.out -c $filename\_md200ps360.rst7 -p $filename\.prmtop -r $filename\_cool300.rst7 -x $filename\_cool300.nc $AMBERHOME/bin/pmemd.cuda -O -i md200ps300.in -o $filename\_md200ps300.out -c $filename\_cool300.rst7 -p $filename\.prmtop -r $filename\_md200ps300.rst7 -x $filename\_md200ps300.nc $AMBERHOME/bin/pmemd.cuda -O -i mdprod.in -o $filename\_md1.out -c $filename\_md200ps300.rst7 -p $filename\_heavyH.prmtop -r $filename\_md1.rst7 -x $filename\_md1.nc
After the production run, we may want to delete the explicit water molecules and salts ions (NaCl) (note: other metal ions, if there exist, in your binding site should not be deleted, because they are part of the protein) in the trajectory file (*.mdcrd or *.nc), which helps us to save the hard disk storage. In this step the ptraj (or cpptraj) program will be used.
Here is a input file named "trajin"
parm COMP_wat.prmtop trajin COMP_wat_md1.nc strip :WAT:Na+:Cl- trajout COMP_wat_md1_traj.nc
cpptraj -i trajin
Here is a python script to automatically process the systems that production runs have been done
#!/usr/bin/env python #delete water and NaCl #Chengwen Liu import os currDir = os.getcwd() def delWater(): currDir = os.getcwd() lines = file("filelist").readlines() for line in lines: filename = line.split("\n")[0] if os.path.isfile(currDir+"/"+filename+"/%s_wat_md1.out"%filename)==True : lines1 = file(currDir+"/"+filename+"/%s_wat_md1.out"%filename).readlines() if "Total wall time" in lines1[-1] and os.path.isfile(currDir+"/"+filename+"/%s_wat_md1_traj.nc"%filename)==False: os.chdir(currDir+"/"+filename) ofile = open("trajin", "w") ofile.write("parm %s_wat.prmtop\n"%filename) ofile.write("trajin %s_wat_md1.nc\n"%filename) ofile.write("strip :WAT:Na+:Cl-\n") ofile.write("trajout %s_wat_md1_traj.nc\n"%filename) ofile.close() os.system("cpptraj -i trajin") os.system("rm -f %s_wat_md1.nc"%filename) return #Main delWater()
d. MM-PB/GBSA calculation
We will use MMPBSA.py to estimate the binding affinity, where the entropy contribution is not computed due to the expensive cost of normal mode analysis calculations for biomolecules. (See theory preparation section tutorials for more details)
pbsa.in file
MMPBSA from JCTC paper -CW &general startframe=1, endframe=25000, keep_files=0, interval=50, debug_printlevel=2, strip_mask=:WAT:Cl-:Na+, / &gb igb=2, saltcon=0.100 / &pb istrng=0.1, exdi=80, indi=1.0, inp=2, cavity_surften=0.0378, cavity_offset=0.5692, fillratio=4, scale=2.0, linit=1000, prbrad=1.4, radiopt=1, /
Run file
#!/bin/bash export filename=your file name source /home/pren/amber-git/amber17/ambercpu/amber.sh $AMBERHOME/bin/MMPBSA.py -O -i pbsa.in -o results_mmpbsa.dat -sp $filename\.prmtop -cp $filename\.prmtop -rp Integrin_$filename\.prmtop
e. MM-PB/GBSA results
See the tutorial in a. Theory preparation section for how to read/understand the MMPBSA results.