Tutorial:kturn tut

From biowiki
Revision as of 16:57, 22 April 2026 by Brendan (talk | contribs)
Jump to navigation Jump to search

Purpose

This page documents the standard protocol for preparing and running molecular dynamics (MD) simulations of RNA k-turn systems (and other similar small biomolecular solutes) using AMOEBA in Tinker9. It takes a starting PDB structure through system preparation, equilibration, and production MD. Umbrella sampling / pulling is not covered here; see the umbrella sampling / REUS pages for that.

The protocol is implemented by ~/scripts/base_masterscript.sh (copied into each project as masterscript.sh). This page describes what that script does, the steps before it runs, and the reasoning behind each choice.

Overview

  1. Obtain a clean starting PDB
  2. Standardize the PDB via a pdbxyzxyzpdb round-trip
  3. Solvate + neutralize + build a key file with Tinker-GUI/cli.py
  4. Energy minimization (minimize9)
  5. Heating: 5 temperatures with strong position restraints on the solute (NVT)
  6. Restraint phase-down: gradually relax position restraints at 298 K (NVT)
  7. Production: unrestrained NPT MD

Force field: AMOEBA (amoebabio18.prm). Engine: Tinker9 (dynamic9, minimize9).

1. PDB preparation

Source PDB

Start from a crystal / cryo-EM structure in PDB format. Before running anything:

  • Decide the protonation states of ionizable residues (for RNA/DNA the standard assumption is deprotonated phosphates; for proteins use PROPKA or similar).
  • Remove or keep heteroatoms intentionally (crystallographic waters, buffer ions, ligands). pdbxyz will silently drop anything it doesn't recognize.
  • Decide what to do with missing atoms/residues. If the structure is incomplete, fix it externally first (e.g. with PDBFixer) before the steps below.

pdbxyz → xyzpdb round-trip

The goal of this step is to produce a PDB in the exact format that xyzpdb will emit from then on, so that any future PDB written from an MD trajectory (e.g. for visualization, diffing, or restart) is line-for-line comparable to the starting reference.

pdbxyz  input.pdb  -k tinker.key
xyzpdb  input.xyz  -k tinker.key

Where tinker.key points at the AMOEBA parameter file:

parameters /home/yw24267/programs/tinker/Tinker8/2501/params/amoebabio18.prm

The resulting input.pdb (overwriting the input, so back up the original) is the canonical PDB for this system. Use this as your reference structure from this point on — not the crystal PDB — when comparing to simulation frames.

Why do the round-trip:

  • The crystal PDB contains headers (HEADER, TITLE, SOURCE, COMPND, REMARK, MASTER, CONECT…) and metadata that xyzpdb does not emit.
  • Hydrogens added by pdbxyz get standardized names/ordering.
  • Atom numbering, chain ID handling, and record termination match the Tinker convention after the round-trip.

Without this step, a diff between the input PDB and later xyzpdb output is dominated by format noise rather than real coordinate differences.

2. Solvation and neutralization (Tinker-GUI)

System building is handled by cli.py in Tinker-GUI. The CLI takes a YAML config and produces a minimized-ready .xyz / .key pair plus intermediates in temp/.

config.yaml

Example for an RNA k-turn system with neutralizing Mg²⁺/Na⁺/Cl⁻, no added bulk salt, 12 Å buffer:

tinker_path: /home/yw24267/programs/tinker/Tinker8/2501/bin
amoeba_prm:  /home/yw24267/programs/tinker/Tinker8/2501/params/amoebabio18.prm
output_prefix: minimize
solutes:
  nucleic_acid:
    - myrna.pdb             # use the round-tripped PDB from step 1
solvent:
  name: water
ions:
  neutralizers:
    - Mg2+
    - Na+
    - Cl-
  salts:
    names:
      - Mg2+
      - Na+
      - Cl-
    concentrations:
      - 0.0
      - 0.0
      - 0.0
box:
  type: rectangular          # cubic or rectangular
  buffer: 12.0               # Å between solute and box wall
  # size: [60, 60, 60]       # if set, overrides buffer

Key fields:

  • output_prefix: minimize — the downstream masterscript expects minimize_final.xyz / minimize_final.key, so keep this as minimize.
  • neutralizers — the CLI adds whichever of these are needed to zero out the net charge. For RNA (negative backbone) you'll typically need Mg²⁺ and/or Na⁺.
  • salts.concentrations — set to 0.0 for pure counter-ion neutralization. Use non-zero values (mol/L) to add bulk salt on top of neutralization.
  • buffer: 12.0 — distance from solute to the nearest box face. 12–15 Å is standard; 15 Å is safer for larger conformational changes.

Running the CLI

python /home/bae969/programs/Tinker-GUI/cli.py -c config.yaml

Outputs:

  • minimize_final.xyz — solvated, neutralized starting structure
  • minimize_final.key — matching key file with box dimensions and parameter path
  • temp/ — intermediate files (solute-only xyz, water box, per-ion-addition snapshots, logs). Do not delete temp/minimize.xyz — the masterscript reads it to determine how many solute atoms to restrain.

Sanity check

Before proceeding, verify the system is sensible:

analyze  minimize_final.xyz  -k minimize_final.key  em
  • Net charge should be 0 (within rounding). If not, the neutralizer list in config.yaml needs adjustment.
  • Total dipole should be finite (no NaNs).
analyze  minimize_final.xyz  -k minimize_final.key  ep
  • Every atom should have multipole and vdW parameters. Missing parameters on any atom means an unrecognized residue / atom name that pdbxyz couldn't map.

3. Minimization

Performed by minimize9 with a gradient tolerance of 0.5 kcal/mol/Å using a key file that includes a position restraint on the solute:

RESTRAIN-POSITION  -1  N  50.0

where N is the number of solute atoms (read by the masterscript from temp/minimize.xyz). The restraint keeps the solute near its starting coordinates while water and ions relax around it. The force constant 50.0 is in kcal/mol/Ų.

The exact command run by the masterscript:

minimize9  minimize_final.xyz  -k minimize.key  0.5

Why a restraint during minimization: without it, small clashes between water/ion placement and the solute can pull the solute geometry off the reference structure before dynamics even starts. The restraint also suppresses any collapse of the experimentally-determined conformation during the initial energy drop.

4. Heating (restrained, NVT)

Five sequential NVT runs at increasing temperatures, all with the same key file (heating.key, FC = 50 kcal/mol/Ų on all solute atoms, polar-eps 0.01):

Step Temperature (K) Steps dt (fs) Total (ps)
Heating_to_10K 10 200 000 2.0 400
Heating_to_75K 75 200 000 2.0 400
Heating_to_150K 150 200 000 2.0 400
Heating_to_225K 225 200 000 2.0 400
Heating_to_298K 298 200 000 2.0 400

Command used for each:

dynamic9  pre_production.xyz  -k heating.key  200000  2.0  10.0  2  <T>  1.0

Argument order: nsteps dt_fs write_interval_ps ensemble(2=NVT) T_K P_atm.

The output trajectory pre_production.arc accumulates across all five heating steps (Tinker9 appends to an existing .arc). Total heating time: 2 ns across 5 temperature rungs.

Why this schedule:

  • Starting at 10 K avoids the initial velocity kick that would shake apart a minimized-but-cold system.
  • Going straight to 298 K routinely produces "induced dipole not converging" errors in AMOEBA because of transient bad contacts. Gradual heating in 5 rungs drains that risk.
  • The strong restraint (FC = 50) keeps the solute essentially frozen while solvent and ions thermalize.

5. Restraint phase-down (NVT, 298 K)

Four sequential NVT runs at 298 K, each stepping the solute position-restraint force constant down:

Step FC (kcal/mol/Ų) Steps Total (ps)
Restraint_10 10.0 200 000 400
Restraint_1 1.0 200 000 400
Restraint_0.1 0.1 200 000 400
Restraint_None 200 000 400

Each step uses a separate key file (restraint_10.key, restraint_1.key, restraint_0.1.key, restraint_none.key) that differs only in the RESTRAIN-POSITION line (absent in the last one).

Command:

dynamic9  pre_production.xyz  -k restraint_<FC>.key  200000  2.0  10.0  2  298  1.0

Why step down rather than release at once: an abrupt release at FC = 50 → 0 can cause a sudden "spring release" where the solute snaps to a lower-energy conformation that is different from the experimental reference. A 50 → 10 → 1 → 0.1 → 0 sequence lets the solute relax gradually while solvent/ion distribution tracks it.

After this stage the system is fully equilibrated at 298 K, unrestrained, at the constant-volume box from solvation. 1.6 ns total across the 4 steps.

6. Production (NPT)

The production run is NPT at 298 K, 1 atm, unrestrained, with a tighter polarization convergence:

  • polar-eps 0.0001 (vs 0.01 during equilibration)
  • barostat MonteCarlo, thermostat bussi
  • integrator respa, 2 fs timestep

Starting frame: the masterscript extracts the final frame of the pre_production.arc trajectory (the last xyz_linecount lines) into production.xyz, then launches production from there:

tail -n "$xyz_linecount" pre_production.arc > production.xyz
dynamic9  production.xyz  -k production.key  20000000  2.0  10.0  4  298  1.0

Argument order (NPT): nsteps dt_fs write_interval_ps ensemble(4=NPT) T_K P_atm.

20 000 000 steps × 2 fs = 40 ns production. Frames are written every 10 ps (4 000 frames total). Adjust nsteps for the system / question — 20–100 ns is typical.

Why NPT for production: the box dimensions from the solvation step correspond to a density that is close to but not exactly 1.0 g/cc at 298 K. NPT lets the box settle to the correct density under the AMOEBA potential. Skipping this and running NVT with the as-built box can leave the system slightly compressed or stretched, which biases structural observables.

Why tighter polar-eps: 0.01 is fine for equilibration (faster, noise is averaged out over the short run). For production the induced-dipole error should be as small as practical because it enters every observable; 1e-4 is the routine choice and 1e-5 is worth it if energetics are the main observable.

7. Running the whole pipeline

Copy ~/scripts/base_masterscript.sh into the project directory as masterscript.sh. It assumes:

  • minimize_final.xyz and minimize_final.key are present in the current directory (outputs of step 2).
  • temp/minimize.xyz is present (so N = solute atom count can be read).
  • Tinker9 binaries are at /home/yw24267/programs/tinker/Tinker9/latest/build (edit TINKER_BIN if yours differ).
  • The correct GPU is selected via CUDA_VISIBLE_DEVICES.

Typical launch on a GPU node:

ssh nodeXXX
cd /path/to/project
export CUDA_VISIBLE_DEVICES=0
nohup bash masterscript.sh > masterscript.log 2>&1 &

What the script generates on the fly (overwriting each run):

  • heating.key, minimize.key (FC = 50)
  • restraint_10.key, restraint_1.key, restraint_0.1.key
  • restraint_none.key, production.key

All derived from minimize_final.key — it reads the a-axis / b-axis / c-axis / pme-grid lines from there and appends them, then adds or omits the RESTRAIN-POSITION line and sets the appropriate polar-eps.

Summary of wall-clock and file sizes

Stage MD time Runtime (rough, single GPU) Output
Minimization minutes minimize_final.xyz_2
Heating (5 × 400 ps) 2.0 ns ~few hours pre_production.arc (appended)
Restraint phase-down 1.6 ns ~few hours pre_production.arc (appended)
Production (20 M steps) 40 ns ~1–2 days production.arc

Runtime scales with system size and GPU generation; the numbers above are order-of-magnitude for a ~700-atom solute in a ~60 Å water box on a modern GPU.

Notes and common pitfalls

  • Induced dipole not converging during heating almost always means the starting structure has a bad contact. Re-check minimization went to a reasonable gradient, and that the solvation step didn't place a water atop a solute atom. You can temporarily add USOLVE-CUTOFF 0.0 to the heating key if convergence is marginal; remove for production.
  • Net charge non-zero after solvation means the neutralizers list in config.yaml didn't cover the real net charge. Re-run with the right ion set.
  • RESTRAIN-POSITION force constant is in kcal/mol/Ų, not kcal/mol. The masterscript embeds this unit convention; do not change the numbers without understanding the scale.
  • Do not delete temp/ until after the masterscript has read temp/minimize.xyz. If you've deleted it, re-running Tinker-GUI/cli.py regenerates it.
  • Restarting a crashed run: Tinker9 resumes from .dyn if present. To restart from scratch, remove both .dyn and .arc (otherwise the new trajectory appends to the old one).
  • Stopping a running MD cleanly: touch pre_production.end (or production.end). Tinker9 checks for this file each frame and stops gracefully, leaving a valid .dyn to resume from.

See also