Tutorial:kturn tut

From biowiki
Jump to navigation Jump to search

Purpose

This page documents the standard protocol for preparing and running molecular dynamics (MD) simulations of RNA k-turn systems (and other similar small biomolecular solutes) using AMOEBA in Tinker9. It takes a starting PDB structure through system preparation, equilibration, and production MD. Umbrella sampling / pulling is not covered here.

Overview

  1. Obtain a clean starting PDB
  2. Standardize the PDB via a pdbxyzxyzpdb round-trip, which also generates the .seq file
  3. Solvate + neutralize + build a key file with Tinker-GUI
  4. Energy minimization (minimize9)
  5. Heating: 5 temperatures with strong position restraints on the solute (NVT)
  6. Restraint phase-down: gradually relax position restraints at 298 K (NVT)
  7. Production: unrestrained NPT MD

Force field: AMOEBA (amoebabio18.prm). Engine: Tinker8 and Tinker9 (dynamic9, minimize9, pdbxyz, xyzpdb).

1. PDB preparation

Source PDB

Start from a crystal / cryo-EM structure in PDB format. A good test / example system for this protocol is PDB 5FJ0 — a 27-nt RNA k-turn. Download the file in PDB format (on RCSB: Download Files → PDB Format); the direct link is files.rcsb.org/download/5FJ0.pdb. The mmCIF format will not work with pdbxyz.

Before running anything:

  • Decide the protonation states of ionizable residues (for RNA/DNA the standard assumption is deprotonated phosphates; for proteins use PROPKA or similar).
  • Remove or keep heteroatoms intentionally (crystallographic waters, buffer ions, ligands). pdbxyz will silently drop anything it doesn't recognize.
  • Decide what to do with missing atoms/residues. If the structure is incomplete, fix it externally first (e.g. with PDBFixer) before the steps below.

pdbxyz → xyzpdb round-trip

The goal of this step is to produce a PDB in the exact format that xyzpdb will emit from then on, so that any future PDB written from an MD trajectory (for visualization, diffing, or restart) is line-for-line comparable to the starting reference.

pdbxyz  myfile.pdb  -k tinker.key
xyzpdb  myfile.xyz  -k tinker.key

Where tinker.key points at the AMOEBA parameter file:

parameters /path/to/Tinker8/latest/params/amoebabio18.prm

pdbxyz outputs two files:

  • myfile.xyz — the Tinker-format coordinate file
  • myfile.seq — a sequence / base-pair map of the solute. This file is required whenever xyzpdb is run.

xyzpdb requires the .seq file to share the same prefix as the .xyz — there is no command-line option to point at a specific .seq. If you call xyzpdb myfile.xyz, Tinker looks for myfile.seq in the same directory. If you rename or move the .xyz, rename / copy the .seq to match.

Output files are not overwritten. Tinker writes to the next available numeric suffix:

myfile.pdb      (original)
myfile.pdb_2    (first xyzpdb output)
myfile.pdb_3    (next xyzpdb output)
…

Same convention for .xyz_2, .xyz_3, etc. The canonical PDB for this system is the first xyzpdb output (myfile.pdb_2). Use that — not the raw crystal PDB — as your reference structure from this point on when comparing to simulation frames.

Why do the round-trip:

  • The crystal PDB contains headers (HEADER, TITLE, SOURCE, COMPND, REMARK, MASTER, CONECT…) and metadata that xyzpdb does not emit.
  • Hydrogens added by pdbxyz get standardized names/ordering.
  • Atom numbering, chain ID handling, and record termination all match the Tinker convention after the round-trip.

Without this step, a diff between the input PDB and a later xyzpdb output is dominated by format noise rather than real coordinate differences.

2. Solvation and neutralization (Tinker-GUI)

System building is handled by cli.py in Tinker-GUI at /path/to/Tinker-GUI. The program takes a YAML file and produces a minimization-ready .xyz / .key pair, plus intermediate files in temp/.

config.yaml

Example for an RNA k-turn system with neutralizing Mg²⁺/Na⁺/Cl⁻, no added bulk salt, 12 Å buffer:

tinker_path: /path/to/Tinker8/latest/bin
amoeba_prm:  /path/to/Tinker8/latest/params/amoebabio18.prm
output_prefix: minimize
solutes:
  nucleic_acid:
    - myfile.pdb            # use the round-tripped PDB from step 1
solvent:
  name: water
ions:
  neutralizers:
    - Mg2+
    - Na+
    - Cl-
  salts:
    names:
      - Mg2+
      - Na+
      - Cl-
    concentrations:
      - 0.0
      - 0.0
      - 0.0
box:
  type: rectangular          # cubic or rectangular
  buffer: 12.0               # Å between solute and box wall
  # size: [60, 60, 60]       # if set, overrides buffer

Key fields:

  • output_prefix — the prefix for the final output files. If set to X, Tinker-GUI writes X_final.xyz and X_final.key. In the rest of this page we use minimize, giving minimize_final.xyz / minimize_final.key.
  • neutralizers — Tinker-GUI adds whichever of these are needed to zero out the net charge. For RNA (negative backbone) you'll typically need Mg²⁺ and/or Na⁺.
  • salts.concentrations — set to 0.0 for pure counter-ion neutralization. Use non-zero values (mol/L) to add bulk salt on top of neutralization.
  • buffer: 12.0 — distance from solute to the nearest box face. 12–15 Å is standard; 15 Å is safer for larger conformational changes.

Running Tinker-GUI

python /path/to/Tinker-GUI/cli.py -c config.yaml

Outputs:

  • minimize_final.xyz — solvated, neutralized starting structure
  • minimize_final.key — matching key file with box dimensions and parameter path
  • temp/ — intermediate files (solute-only xyz, water box, per-ion-addition snapshots, logs)

Rerunning: if Tinker-GUI errors or you need to change config and re-run, always clear the intermediate directory first:

rm -rf temp

Otherwise Tinker-GUI may pick up stale intermediates and silently produce a mismatched output. The same applies if you need to rebuild from scratch for any reason — delete temp/.

Sanity check

Before proceeding, verify the system is sensible:

analyze  minimize_final.xyz  -k minimize_final.key  em
  • Net charge should be 0 (within rounding). If it isn't, see below.
  • Total dipole should be finite (no NaNs).
analyze  minimize_final.xyz  -k minimize_final.key  ep
  • Every atom should have multipole and vdW parameters. Missing parameters on any atom means an unrecognized residue / atom name that pdbxyz couldn't map.

Fixing a non-zero net charge

If analyze ... em reports a non-zero net charge, either the neutralizer list in config.yaml was wrong, or the counts Tinker-GUI added don't quite cancel the solute charge. Don't re-run the full pipeline — edit the .xyz directly. Two operations cover every case:

Converting a water to an ion (when net charge is negative and you need another cation):

  1. Find a water molecule whose oxygen is far from the solute (e.g. ≥ 5 Å away).
  2. Delete both hydrogen lines of that water.
  3. Change the oxygen's atom type (last integer before the connectivity columns) to the desired ion's AMOEBA type — e.g. for Na⁺ in amoebabio18 the oxygen type (349) becomes Na⁺ type (352).
  4. Clear the oxygen line's connectivity entries (an ion has no bonds).
  5. Update the atom count on line 1 of the file to match the new total (decrease by 2 for every water → ion conversion).
  6. Do not renumber the remaining atom indices — Tinker is tolerant of gaps.

Deleting an ion (when net charge is positive and you have one more cation than needed):

  1. Delete the ion's atom line.
  2. Decrease the atom count on line 1 by 1.
  3. Do not renumber — leave the gap in the indices.

After either operation, re-run analyze ... em to confirm the net charge is now 0. (There's also a script in ~/scripts/ that does the water → Na⁺ replacement in bulk while respecting distance filters — useful when many waters need replacing to reach a target salt concentration.)

3. Key file structure

All Tinker9 MD runs (minimize9, dynamic9) read a plain-text .key file. Every key file in this protocol has the same three-part layout:

  1. Default block — identical keywords in every key file (force field, integrator, cutoffs, thermostat/barostat). Only the polar-eps value inside this block changes per step.
  2. Box axes — same three a-axis / b-axis / c-axis lines in every key file for a given system, but the numeric values come from whatever Tinker-GUI wrote into minimize_final.key for your box.
  3. Step-specific terms — differ per step: the RESTRAIN-POSITION line is present in some key files and absent in others, with varying force constant.

Default block

parameters /path/to/Tinker8/latest/params/amoebabio18.prm
integrator respa
archive

ewald
ewald-cutoff 7.0

vdw-cutoff 12.0
vdw-correction

neighbor-list
polar-predict

polar-eps <value>

thermostat bussi
barostat MonteCarlo

Brief rationale:

  • parameters …amoebabio18.prm — AMOEBA 2018 biomolecular parameter set (covers proteins and nucleic acids).
  • integrator respa — multi-timestep integrator; required for a 2 fs time step to be safe.
  • archive — write each dumped frame to .arc. Without this, Tinker9 only keeps the final frame.
  • ewald / ewald-cutoff 7.0 — real-space electrostatic cutoff for PME (long-range is handled in reciprocal space).
  • vdw-cutoff 12.0 + vdw-correction — vdW cutoff and analytic long-range correction.
  • neighbor-list — pair-list speed-up, assumes box ≳ 2 × cutoff.
  • polar-predict — reuse the previous step's induced-dipole solution as the initial guess (fewer solver iterations per step).
  • polar-eps <value> — induced-dipole convergence threshold. This value is the one entry in the default block that changes per step (see table below).
  • thermostat bussi — Bussi velocity-rescaling thermostat (correct canonical fluctuations).
  • barostat MonteCarlo — MC barostat for NPT. Inert during NVT runs, safe to leave in every key file.

Box axes (default but system-specific)

Add directly below the default block:

a-axis  XX.XXXX
b-axis  YY.YYYY
c-axis  ZZ.ZZZZ

Copy the numeric values verbatim from minimize_final.key (Tinker-GUI writes them there after solvation). The same three lines go into every key file for a given system. For a cubic box all three values are equal; for a rectangular box they differ.

If minimization isn't converging, it's possible that the box size entered is slightly too small. Increase each axis by 1-4 to prevent atoms overlapping across the periodic box boundaries.

Step-specific terms

What actually changes between the key files in this protocol: the polar-eps value inside the default block, and the presence / force constant of the RESTRAIN-POSITION line.

Key file polar-eps RESTRAIN-POSITION line
minimize.key 0.01 RESTRAIN-POSITION -1 N 50.0
heating.key 0.01 RESTRAIN-POSITION -1 N 50.0
restraint_10.key 0.01 RESTRAIN-POSITION -1 N 10.0
restraint_1.key 0.01 RESTRAIN-POSITION -1 N 1.0
restraint_0.1.key 0.01 RESTRAIN-POSITION -1 N 0.1
restraint_none.key 0.01 (omit — no restraint line)
production.key 0.001 (tunable, see Production section) (omit)

N = solute atom count (see Minimization section for the full convention — do not include added water / ions / kept crystal waters unless you explicitly want them restrained).

Always add the RESTRAIN-POSITION line (when present) with an inline comment noting what it restrains, so the key file is self-documenting:

RESTRAIN-POSITION -1 1000 50.0   # solute atoms 1-1000, FC = 50 kcal/mol/Ų

Example: complete heating.key

parameters /path/to/Tinker8/latest/params/amoebabio18.prm
integrator respa
archive

ewald
ewald-cutoff 7.0

vdw-cutoff 12.0
vdw-correction

neighbor-list
polar-predict

polar-eps 0.01

thermostat bussi
barostat MonteCarlo

RESTRAIN-POSITION -1 1000 50.0   # solute atoms 1-1000, FC = 50 kcal/mol/Ų

a-axis  62.2300
b-axis  62.2300
c-axis  62.2300

Every other key file in the protocol is built by changing only the polar-eps value and / or the RESTRAIN-POSITION line, keeping everything else identical.

4. Minimization

Performed by minimize9 with a gradient tolerance of 0.5 kcal/mol/Å, using a key file that includes a position restraint on the solute:

RESTRAIN-POSITION  -1  N  50.0

N = number of atoms in the original PDB (the solute only — the RNA). RESTRAIN-POSITION -1 N 50.0 restrains atoms 1 through N, leaving everything after atom N (added water, neutralizing ions, bulk salt) free to move. The force constant 50.0 is in kcal/mol/Ų.

If the source PDB already contained waters or ions that you kept through pdbxyz, set N to just the RNA atom count, not the RNA + original-waters count — otherwise those waters / ions are held in place too (a 1000-atom RNA plus 15 crystal waters gives a pdbxyz .xyz with 1045 atoms, but you should still set N = 1000 so the crystal waters can move).

Command:

minimize9  minimize_final.xyz  -k minimize.key  0.5

The trailing 0.5 is the gradient tolerance (RMS gradient at which minimization stops, kcal/mol/Å). 0.5 is a good default; if minimization is unusually slow or grinds without converging, loosen it to e.g. 2.0 — larger value = looser convergence = fewer iterations.

Why a restraint during minimization: without it, small clashes between water/ion placement and the solute can pull the solute geometry off the reference structure before dynamics even starts. The restraint also suppresses any collapse of the experimentally-determined conformation during the initial energy drop.

5. Heating (restrained, NVT)

Five sequential NVT runs at increasing temperatures. All use the same key file (heating.key) with RESTRAIN-POSITION -1 N 50.0 on the solute and polar-eps 0.01.

Step Temperature (K) Steps dt (fs) Total (ps)
Heating_to_10K 10 200 000 2.0 400
Heating_to_75K 75 200 000 2.0 400
Heating_to_150K 150 200 000 2.0 400
Heating_to_225K 225 200 000 2.0 400
Heating_to_298K 298 200 000 2.0 400

Command used for each temperature:

dynamic9  pre_production.xyz  -k heating.key  200000  2.0  10.0  2  <T>

Argument order: nsteps dt_fs write_interval_ps ensemble(2=NVT) T_K. NVT does not take a pressure argument — do not append 1.0 at the end.

The output trajectory pre_production.arc accumulates across all five heating steps (Tinker9 appends to an existing .arc). Total heating time: 2 ns across 5 temperature rungs.

Why this schedule:

  • Starting at 10 K avoids the initial velocity kick that would shake apart a minimized-but-cold system.
  • Going straight to 298 K routinely produces "induced dipole not converging" errors in AMOEBA because of transient bad contacts. Gradual heating in 5 rungs drains that risk.
  • The strong restraint (FC = 50) keeps the solute essentially frozen while solvent and ions thermalize.

6. Restraint phase-down (NVT, 298 K)

Four sequential NVT runs at 298 K, stepping the solute position-restraint force constant down:

Step FC (kcal/mol/Ų) Steps Total (ps)
Restraint_10 10.0 200 000 400
Restraint_1 1.0 200 000 400
Restraint_0.1 0.1 200 000 400
Restraint_None — (no restraint line) 200 000 400

Each step uses a separate key file (restraint_10.key, restraint_1.key, restraint_0.1.key, restraint_none.key) that differs only in the RESTRAIN-POSITION line (absent in the last one).

Command (NVT, no pressure argument):

dynamic9  pre_production.xyz  -k restraint_<FC>.key  200000  2.0  10.0  2  298

Why step down rather than release at once: an abrupt release at FC = 50 → 0 can cause a sudden "spring release" where the solute snaps to a lower-energy conformation that is different from the experimental reference. A 50 → 10 → 1 → 0.1 → 0 sequence lets the solute relax gradually while solvent / ion distribution tracks it.

After this stage the system is fully equilibrated at 298 K, unrestrained, at the constant-volume box from solvation. 1.6 ns total across the 4 steps.

7. Production (NPT)

The production run is NPT at 298 K, 1 atm, unrestrained:

  • barostat MonteCarlo, thermostat bussi
  • integrator respa, 2 fs timestep
  • polar-eps — tunable, see below

Starting frame: extract the final frame of the pre-production trajectory into production.xyz:

xyz_linecount=$(wc -l < pre_production.xyz)
tail -n "$xyz_linecount" pre_production.arc > production.xyz

Then launch production (NPT, so a pressure argument is required):

dynamic9  production.xyz  -k production.key  5000000  2.0  10.0  4  298  1.0

Argument order (NPT): nsteps dt_fs write_interval_ps ensemble(4=NPT) T_K P_atm.

Production length is case-dependent. 5 000 000 steps × 2 fs = 10 ns is an arbitrary default that's fine for a first pass; adjust nsteps based on the system and the scientific question. Short equilibration-like checks can be 5 ns; publication-quality sampling for slow degrees of freedom may need 50–100 ns or more.

polar-eps tradeoff

polar-eps sets the convergence threshold for the induced-dipole solver each step. It directly trades simulation speed against energy / force accuracy.

polar-eps Relative speed Notes
0.01 fastest Often sufficient for structural observables and general equilibration
0.001 medium Reasonable middle ground for routine production
0.0001 slow High accuracy; use when energetics / forces matter (e.g. free energy work)
0.00001 slowest Very precise; rarely needed for structural work

More zeros = tighter convergence = more solver iterations per step = slower. Pick based on the tradeoff: 0.01 is often good enough, 0.0001 is very precise. Default to something around 0.001 unless you know you need tighter.

Why NPT for production: the box dimensions from the solvation step correspond to a density that is close to but not exactly 1.0 g/cc at 298 K under the AMOEBA potential. NPT lets the box settle to the correct density. Skipping this and running NVT with the as-built box can leave the system slightly compressed or stretched, which biases structural observables.

8. Running on a GPU

All Tinker9 MD above runs on a single GPU. Select which GPU before launching:

export CUDA_VISIBLE_DEVICES=0

The integer is the GPU index on that node. Change it (1, 2, …) to select a different GPU on multi-GPU nodes.

Typical launch on a GPU node:

ssh nodeXXX
cd /path/to/project
export CUDA_VISIBLE_DEVICES=0
nohup bash masterscript.sh > masterscript.log 2>&1 &

Notes and common pitfalls

  • Induced dipole not converging during heating almost always means the starting structure has a bad contact. Re-check minimization went to a reasonable gradient, and that the solvation step didn't place a water atop a solute atom. You can temporarily add USOLVE-CUTOFF 0.0 to the heating key if convergence is marginal; remove for production.
  • RESTRAIN-POSITION force constant is in kcal/mol/Ų, not kcal/mol. Don't change the numbers without understanding the scale.
  • Restarting a crashed run: Tinker9 resumes from .dyn if present. To restart from scratch, remove both .dyn and .arc (otherwise the new trajectory appends to the old one).
  • Stopping a running MD cleanly: touch pre_production.end (or production.end). Tinker9 checks for this file each frame and stops gracefully, leaving a valid .dyn to resume from.
  • Always rm -rf temp before re-running Tinker-GUI.

Example masterscript

A reference Bash script that runs minimization → heating → restraint phase-down → production using the key-file and convention described above. Edit TINKER_BIN and the parameter path at the top to match your installation.

#!/usr/bin/env bash
set -euo pipefail

# -------------------------------------------------------------------
# Configuration
# -------------------------------------------------------------------
TINKER_BIN="/path/to/Tinker9/latest/build"
AMOEBA_PRM="/path/to/Tinker8/latest/params/amoebabio18.prm"

COUNT_FILE="./minimize_final.xyz"
INIT_KEY="./minimize_final.key"

if [[ ! -f "$COUNT_FILE" ]]; then
    echo "Error: $COUNT_FILE not found." >&2; exit 1
fi
if [[ ! -f "$INIT_KEY" ]]; then
    echo "Error: $INIT_KEY not found." >&2; exit 1
fi

# N = number of solute atoms in the original PDB (from the solute-only xyz
# produced by Tinker-GUI in temp/). Adjust manually if you want to restrain
# a different range.
first_line="$(head -n 1 ./temp/minimize.xyz)"
N="$(echo "$first_line" | grep -oE '[0-9]+' | head -n 1)"
if [[ -z "$N" ]]; then
    echo "Error: could not read atom count from ./temp/minimize.xyz" >&2
    exit 1
fi

# -------------------------------------------------------------------
# Pull a-axis / b-axis / c-axis lines from the starting key file
# -------------------------------------------------------------------
get_line_or_empty() {
    local key="${1}"
    grep -i -E "^[[:space:]]*$key\b" "$INIT_KEY" | head -n1 || true
}
A_AXIS="$(get_line_or_empty 'a-axis')"
B_AXIS="$(get_line_or_empty 'b-axis')"
C_AXIS="$(get_line_or_empty 'c-axis')"

append_axes() {
  local outfile="$1"
  [[ -n "$A_AXIS" ]] && echo "$A_AXIS" >> "$outfile"
  [[ -n "$B_AXIS" ]] && echo "$B_AXIS" >> "$outfile"
  [[ -n "$C_AXIS" ]] && echo "$C_AXIS" >> "$outfile"
}

# -------------------------------------------------------------------
# Key file generator
# -------------------------------------------------------------------
write_key_file() {
  local outfile="$1"
  local restraint_val="$2"
  local polar_eps="$3"
  local add_restrain="$4"

  cat > "$outfile" <<EOF
parameters $AMOEBA_PRM
integrator respa
archive

ewald
ewald-cutoff 7.0

vdw-cutoff 12.0
vdw-correction

neighbor-list
polar-predict

polar-eps $polar_eps

thermostat bussi
barostat MonteCarlo

EOF

  if [[ "$add_restrain" == "yes" && -n "$restraint_val" ]]; then
    echo "RESTRAIN-POSITION -1 $N $restraint_val" >> "$outfile"
  fi
  append_axes "$outfile"
}

# filename              FC     polar-eps  restraint?
write_key_file "heating.key"         "50.0"  "0.01"    "yes"
write_key_file "minimize.key"        "50.0"  "0.01"    "yes"
write_key_file "restraint_10.key"    "10.0"  "0.01"    "yes"
write_key_file "restraint_1.key"     "1.0"   "0.01"    "yes"
write_key_file "restraint_0.1.key"   "0.1"   "0.01"    "yes"
write_key_file "restraint_none.key"  ""      "0.01"    "no"
write_key_file "production.key"      ""      "0.001"   "no"    # tune polar-eps to taste

# -------------------------------------------------------------------
# Minimization
# -------------------------------------------------------------------
echo "--- Energy minimization ---"
"$TINKER_BIN/minimize9" minimize_final.xyz -k minimize.key 0.5

cp minimize_final.xyz pre_production.xyz

# -------------------------------------------------------------------
# Heating (NVT, no pressure arg)
# -------------------------------------------------------------------
echo "--- Heating (5 rungs) ---"
for T in 10 75 150 225 298; do
    "$TINKER_BIN/dynamic9" pre_production.xyz -k heating.key \
        200000 2.0 10.0 2 "$T"
done

# -------------------------------------------------------------------
# Restraint phase-down (NVT, 298 K)
# -------------------------------------------------------------------
echo "--- Restraint phase-down ---"
for FC in 10 1 0.1 none; do
    "$TINKER_BIN/dynamic9" pre_production.xyz -k "restraint_${FC}.key" \
        200000 2.0 10.0 2 298
done

# -------------------------------------------------------------------
# Production (NPT, with pressure arg)
# -------------------------------------------------------------------
echo "--- Production ---"
xyz_linecount=$(wc -l < pre_production.xyz)
tail -n "$xyz_linecount" pre_production.arc > production.xyz

"$TINKER_BIN/dynamic9" production.xyz -k production.key \
    5000000 2.0 10.0 4 298 1.0

echo "--- Done ---"