Tutorial:amber

From biowiki
Jump to navigation Jump to search

MM-PBSA free energy estimation tutorial

Author: Chengwen Liu
Latest update: Dec. 6th, 2017

a. Theory preparation

A very good tutorial can be found in http://ambermd.org/tutorials/advanced/tutorial3/. I will suggest you to first read this tutorial so you can get a basic understanding of MM-PB/GBSA. The following steps will guide you to b) set up the complex system, c) run MD simulations on our GPU cards using pmemd.cuda, and d) estimate the complex binding affinity.


b. System setting up 

For AMBER env setting up, please refer to [1]

System setting up in AMBER can be done in either xleap or tleap program. The former provides a graphical interface thus it is easier for the beginners. Here we will use tleap to set the system up (generate the necessary files for AMBER run).

Copy the leaprc.ff14SB (this is the force field file we will be using) to the current working directory as leaprc.

 cp $AMBERHOME/dat/leap/cmd/leaprc.ff14SB ./leaprc

Append the following commands to the leaprc file.

  • Notes: these commands are helping us to generate the complex prmtop and rst7 files in gas phase (COMP.prmtop & COMP.rst7), add Na+ and Cl- ions to the salt concentration we wanted (addions commands), add solvent molecules (TIP3P water here) and generate complex prmtop and rst7 files in explicit water (COMP_wat.prmtop & COMP_wat.rst7), generate the ligand and protein prmtop and rst7 files (LIG.prmtop & LIG.rst7; PRO.prmtop and PRO.rst7)
 COMP = loadpdb COMP.pdb
 saveamberparm COMP COMP.prmtop COMP.rst7
 addions COMP Na+ 0
 addions COMP Cl- 0
 addions COMP Cl-  number (see notes *)
 addions COMP Na+ number (see notes *)
 solvateoct COMP TIP3PBOX 8.0
 saveamberparm COMP COMP_wat.prmtop COMP_wat.rst7
 savepdb COMP COMP_wat.pdb
 LIG = loadpdb LIG.pdb
 saveamberparm LIG LIG.prmtop LIG.rst7
 PRO = loadpdb PRO.pdb
 saveamberparm PRO PRO.prmtop PRO.rst7
 quit

Execute tleap command to generate the bunch of prmtop and rst7 files

  $AMBERHOME/bin/tleap
  • Notes: After we neutralize the system, the number of ions needed to reach the salts concentration should be calculated.

The above steps help us generate one individual system for the next runs. Here is a python script to automatically produce multiple systems. Modifications are very possibly needed for your own cases.

#!/usr/bin/env python
#Generate AMBER input files for use in MM-PBSA free energy calculations
# files need to be prepared:
# 1. leaprc file
# 2. complex pdb file
#Chengwen Liu

import sys,os 

def prepareInputs(complexPdbFile):
  lines = file(complexPdbFile).readlines()
  filename = complexPdbFile.split(".pdb")[0]
  f1 = open("Integrin_%s.pdb"%filename, "w")
  if "I1" in complexPdbFile:
    nAtoms = 1557
  if "I2" in complexPdbFile:
    nAtoms = 1544
  for n in xrange(nAtoms):
    f1.write(lines[n])
  f1.close()

  f2 = open("Collagen_%s.pdb"%filename, "w")
  for n in xrange(nAtoms, len(lines)):
    f2.write(lines[n])
  f2.close()
  
  f3 = open("temp", "w")
  f3.write("COMP = loadpdb %s\n"%complexPdbFile)
  f3.write("saveamberparm COMP %s.prmtop %s.rst7\n"%(filename, filename))
  f3.write("addions COMP Na+ 0\n")
  f3.write("addions COMP Cl- 0\n")
  f3.write("solvateoct COMP TIP3PBOX 8.0\n")
  f3.write("saveamberparm COMP %s_wat.prmtop %s_wat.rst7\n"%(filename, filename))
  f3.write("savepdb COMP %s_wat.pdb\n"%filename)
  f3.write("LIG = loadpdb Collagen_%s.pdb\n"%filename)
  f3.write("saveamberparm LIG Collagen_%s.prmtop Collagen_%s.rst7\n"%(filename, filename))
  f3.write("PRO = loadpdb Integrin_%s.pdb\n"%filename)
  f3.write("saveamberparm PRO Integrin_%s.prmtop Integrin_%s.rst7\n"%(filename, filename))
  f3.write("quit\n")
  f3.close()
  os.system("cat leap temp >leaprc") 
  os.system("tleap") 
 
  #do the HMassRepartition and generate a new prmtop file 
  f4 = open("parmed.in", "w")
  f4.write("HMassRepartition\n")
  f4.write("parmout %s_wat_heavyH.prmtop\n"%filename)
  f4.write("go\nquit\n")
  f4.close()
  os.system("/opt/software/ParmEd-master/parmed.py -p %s_wat.prmtop -i parmed.in"%filename)
  return

#Main
currDir = os.getcwd()
files = os.listdir(currDir)
for eachPdb in files:
  if ".pdb" in eachPdb:
    #print eachPdb 
    filename = eachPdb
    os.system("mkdir %s/%s"%(currDir,filename.split(".pdb")[0]))
    os.system("cp %s/leap %s/%s"%(currDir, currDir,filename.split(".pdb")[0]))
    os.chdir("%s/%s"%(currDir, filename.split(".pdb")[0]))
    os.system("cp %s/%s ."%(currDir, filename))
    prepareInputs(filename)

Note that in the above script I also generated a prmtop file using parmed.py (*_wat_heavyH.prmtop). With this file, we can use 4 fs time step in the production MD simulations (where we only want to sample the conformations and do not care about the kinetics, which is wrong with 4fs time step) 

c. Run MD simulations with pmemd.cuda

Routinely in AMBER, we will run minimization, heating, equilibration, cooling, equilibration and final production run in NVT ensemble. Here are the input and run files I am using in collagen-integrin system.

min.in

initial minimization prior to MD 
 &cntrl
  imin   = 1,
  maxcyc = 6000,
  ncyc   = 2500,
  cut    = 12
 / 

heat360.in

200ps NPT from 300 K to 360 K
 &cntrl
  imin   = 0,
  irest  = 0,
  ntx    = 1,
  ntb    = 1,
  cut    = 10.0,
  ntc    = 2,
  ntf    = 2,
  tempi  = 0.0,
  temp0  = 360.0,
  ntt    = 3,
  gamma_ln = 1.0,
  nstlim = 100000, dt = 0.002,
  ioutfm = 1,
  ntpr = 250, ntwx = 250, ntwr = 10000
 /

md200ps360.in

200ps NPT at 360 K
 &cntrl
  imin = 0, irest = 1, ntx = 7,
  ntb = 2, pres0 = 1.0, ntp = 1,
  taup = 2.0,
  cut = 10.0, ntr = 0,
  ntc = 2, ntf = 2,
  tempi = 360.0, temp0 = 360.0,
  ntt = 3, gamma_ln = 1.0,
  nstlim = 100000, dt = 0.002,
  ioutfm = 1,
  ntpr = 250, ntwx = 250, ntwr = 10000
 /

cool300.in

200ps NPT from 360 K to 300 K 
 &cntrl
  imin   = 0,
  irest  = 0,
  ntx    = 1,
  ntb    = 1,
  cut    = 10.0,
  ntc    = 2,
  ntf    = 2,
  tempi  = 360.0,
  temp0  = 300.0,
  ntt    = 3,
  gamma_ln = 1.0,
  nstlim = 100000, dt = 0.002,
  ioutfm = 1,
  ntpr = 250, ntwx = 250, ntwr = 10000
 /

md200ps300.in

200ps NPT at 300 K
 &cntrl
  imin = 0, irest = 1, ntx = 7,
  ntb = 2, pres0 = 1.0, ntp = 1,
  taup = 2.0,
  cut = 10.0, ntr = 0,
  ntc = 2, ntf = 2,
  tempi = 300.0, temp0 = 300.0,
  ntt = 3, gamma_ln = 1.0,
  nstlim = 100000, dt = 0.002,
  ioutfm = 1,
  ntpr = 250, ntwx = 250, ntwr = 10000
 /

After the above steps, it is possible that our system is in its equilibrium at 300 K. So we will run NVT simulations for a long time (100 ns here)  

mdprod.in

100ns NVT simulations 
 &cntrl
  imin   = 0,
  irest  = 0,
  ntx    = 1,
  ntb    = 1,
  cut    = 10.0,
  ntc    = 2,
  ntf    = 2,
  tempi  = 300.0,
  temp0  = 300.0,
  ntt    = 3,
  gamma_ln = 1.0,
  nstlim = 25000000, dt = 0.004,
  ioutfm = 1,
  ntpr = 1000, ntwx = 1000, ntwr = 1000
 /

These will be running sequencially. Here is a script:

#!/bin/bash
export filename=your file name

source ~pren/amber-git/amber17/bashrc.amber16 
export CUDA_VISIBLE_DEVICES="0" #or 1 if you have a second card

$AMBERHOME/bin/pmemd.cuda -O -i min.in -o $filename\_min.out -c $filename\.rst7 -p $filename\.prmtop -r $filename\_min.rst7 
$AMBERHOME/bin/pmemd.cuda -O -i heat360.in -o $filename\_heat360.out -c $filename\_min.rst7 -p $filename\.prmtop -r $filename\_heat360.rst7 -x $filename\_heat360.nc 
$AMBERHOME/bin/pmemd.cuda -O -i md200ps360.in -o $filename\_md200ps360.out -c $filename\_heat360.rst7 -p $filename\.prmtop -r $filename\_md200ps360.rst7 -x $filename\_md200ps360.nc 
$AMBERHOME/bin/pmemd.cuda -O -i cool300.in -o $filename\_cool300.out -c $filename\_md200ps360.rst7 -p $filename\.prmtop -r $filename\_cool300.rst7 -x $filename\_cool300.nc 
$AMBERHOME/bin/pmemd.cuda -O -i md200ps300.in -o $filename\_md200ps300.out -c $filename\_cool300.rst7 -p $filename\.prmtop -r $filename\_md200ps300.rst7 -x $filename\_md200ps300.nc 

$AMBERHOME/bin/pmemd.cuda -O -i mdprod.in -o $filename\_md1.out -c $filename\_md200ps300.rst7 -p $filename\_heavyH.prmtop -r $filename\_md1.rst7 -x $filename\_md1.nc 

After the production run, we may want to delete the explicit water molecules and salts ions (NaCl) (note: other metal ions, if there exist, in your binding site should not be deleted, because they are part of the protein) in the trajectory file (*.mdcrd or *.nc), which helps us to save the hard disk storage. In this step the ptraj (or cpptraj) program will be used. 

Here is a input file named "trajin"

 parm COMP_wat.prmtop
 trajin COMP_wat_md1.nc
 strip :WAT:Na+:Cl-
 trajout COMP_wat_md1_traj.nc
Here is run command
cpptraj -i trajin

Here is a python script to automatically process the systems that production runs have been done

#!/usr/bin/env python
#delete water and NaCl 
#Chengwen Liu

import os

currDir = os.getcwd()
def delWater():
  currDir = os.getcwd()
  lines = file("filelist").readlines()
  for line in lines:
    filename = line.split("\n")[0]
    if os.path.isfile(currDir+"/"+filename+"/%s_wat_md1.out"%filename)==True :
      lines1 = file(currDir+"/"+filename+"/%s_wat_md1.out"%filename).readlines()
      if "Total wall time" in lines1[-1] and os.path.isfile(currDir+"/"+filename+"/%s_wat_md1_traj.nc"%filename)==False:
        os.chdir(currDir+"/"+filename)
        ofile = open("trajin", "w")
        ofile.write("parm %s_wat.prmtop\n"%filename)
        ofile.write("trajin %s_wat_md1.nc\n"%filename)
        ofile.write("strip :WAT:Na+:Cl-\n")
        ofile.write("trajout %s_wat_md1_traj.nc\n"%filename)
        ofile.close()
        os.system("cpptraj -i trajin")
        os.system("rm -f %s_wat_md1.nc"%filename)
  return
#Main
delWater()

d. MM-PB/GBSA calculation

We will use MMPBSA.py to estimate the binding affinity, where the entropy contribution is not computed due to the expensive cost of normal mode analysis calculations for biomolecules. (See theory preparation section tutorials for more details)

pbsa.in file

MMPBSA from JCTC paper -CW
 &general
  startframe=1, endframe=25000,
  keep_files=0, interval=50, debug_printlevel=2,
  strip_mask=:WAT:Cl-:Na+,
  /
  &gb
   igb=2, saltcon=0.100
  /
&pb
  istrng=0.1, exdi=80, indi=1.0,
  inp=2, cavity_surften=0.0378, cavity_offset=0.5692,
  fillratio=4, scale=2.0,
  linit=1000, prbrad=1.4, radiopt=1,
/

Run file

#!/bin/bash
export filename=your file name
source  /home/pren/amber-git/amber17/ambercpu/amber.sh
$AMBERHOME/bin/MMPBSA.py -O -i pbsa.in -o results_mmpbsa.dat -sp $filename\.prmtop -cp $filename\.prmtop  -rp Integrin_$filename\.prmtop

e. MM-PB/GBSA results

See the tutorial in a. Theory preparation section for how to read/understand the MMPBSA results.