Software:GB Param
Pavel
crsparam.f in tinker-bat
Bradley
http://biomol.bme.utexas.edu/wiki/index.php/Research:Dna#GAY-BERNE_PARAMETERIZATION
/home/others/hamiltonba/dev/crs-param-cpp/trunk
Brad has programs for generating all-atom dimer configurations at various distances. At each distance the molecule is rotated about the symmetry axis to get a Boltzmann-averaged all-atom energy for that distance.
/home/other/hamiltonab/nuc/vdw2
He has one script for each configuration: face, T-shape, and cross.
Besides rotation, the face-face configuration also needs to be averaged over flipping the molecules. For a non-benzene molecule, e.g. a base ring, the flipped orientation cannot be sampled by rotation alone. xbase may be newer than base.pl.
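The Boltzmann averaging described above can be sketched as follows. This is only an illustration, not Brad's scripts: the function names, the kT value (roughly 298 K in kcal/mol), and the choice to pool the flipped and unflipped rotations into a single average are all assumptions.

```python
import math

# Assumed value: kT in kcal/mol at about 298 K. The original scripts may
# use a different temperature or energy unit.
KT_KCAL_PER_MOL = 0.593


def boltzmann_average(energies, kt=KT_KCAL_PER_MOL):
    """Return the Boltzmann-weighted mean of a list of energies."""
    if not energies:
        raise ValueError("need at least one energy")
    e_min = min(energies)  # shift by the minimum so exp() cannot overflow
    weights = [math.exp(-(e - e_min) / kt) for e in energies]
    total = sum(weights)
    return sum(w * e for w, e in zip(weights, energies)) / total


def face_face_average(rotated, flipped_rotated, kt=KT_KCAL_PER_MOL):
    """Assumption: rotations of the flipped molecule are pooled with the
    rotations of the unflipped one before averaging."""
    return boltzmann_average(list(rotated) + list(flipped_rotated), kt)
```

Low-energy orientations dominate the average, so a single strongly repulsive rotation contributes almost nothing at a given distance.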
Kelly
This is based on Brad's work: /home/others/kstanton/dev/crs-param-cpp/trunk
Required files:
The parameterization program: coarse_grain.exe
The Config file: CrsParamConfig.ini
The post parameterization plot script: param_results_plot.txt
To check out the parameterization source, compile the program, and run it:
1. Check out the source using Subversion:
svn co svn+ssh://Bme-Earth/subversion/crs-param-cpp
Another useful command is
svn list svn+ssh://Bme-Earth/subversion
which shows the directory structure of the Subversion repository on Bme-Earth.
2. Run make in the trunk directory. This should compile the source.
3. Edit the config file CrsParamConfig.ini and make any changes. It is important to provide the correct directory for the Lennard-Jones data if running from weighted Lennard-Jones data, or the correct archive file if running from xyz archive data.
3.1 Make sure that the Lennard-Jones data is present in the specified directory and that the energy-versus-distance file with the name specified in the config file is also in that directory. If you would like the results plot after the simulation, make sure that param_results_plot.txt is located in the same directory, as it will be run on completion of parameterization.
4. Run the coarse_grain.exe program to obtain optimized parameters for the structures provided.
Program Flow
1. Configs are initialized from the config file.
2. The archive file is split into xyz files.
3. An array (a vector container object) of molecule objects is created. Each molecule object represents an xyz file containing a pair of geometries for which energy values are tested.
4. Each molecule object is initialized from the xyz files.
5. Analyze is called to get the all-atom energy.
5.1 Alternatively, instead of using an archive, the energy-versus-distance data can be obtained from Lennard-Jones data.
6. xyzeul is called to get the Gay-Berne coordinates and angles.
7. All of this information is stored in the molecule object.
8. The array of molecules is then compared against the outlier cutoff energy, and all molecule pairs with too large an energy are discarded.
9. The optimizer is then called with a minimization function and the scaled initial guess parameters (guesses and scales from the config file).
10. Minimization is performed on a least-squares basis over all geometries: the squared difference between the Gay-Berne energy and the all-atom energy is summed over all geometries. Egb (the GB calculation) is invoked from a molecule-object method that calculates the GB energy at the current parameters on the fly. Scaling is also applied within the optimizer.
11. The RMS is calculated and the final parameters are written to stdout.
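Steps 8, 10, and 11 can be illustrated with a minimal sketch. The function names are hypothetical, the toy energy function is only a stand-in and not the Gay-Berne potential, and treating "too large" as strictly greater than the cutoff is an assumption.

```python
import math


def drop_outliers(geometries, threshold):
    """Keep (geometry, all_atom_energy) pairs at or below the cutoff.
    Assumption: a pair is an outlier when its energy exceeds the cutoff."""
    return [(g, e) for g, e in geometries if e <= threshold]


def sum_squared_error(params, geometries, model_energy):
    """Least-squares objective: squared model-minus-reference energy,
    summed over all retained geometries."""
    return sum((model_energy(params, g) - e) ** 2 for g, e in geometries)


def rms_error(params, geometries, model_energy):
    """Root-mean-square deviation over the retained geometries."""
    return math.sqrt(
        sum_squared_error(params, geometries, model_energy) / len(geometries)
    )


def toy_energy(params, distance):
    # Hypothetical stand-in for the Egb call; NOT the Gay-Berne potential.
    return params[0] / distance


# Each entry pairs a geometry (here just a distance) with a reference energy.
data = [(1.0, 2.0), (2.0, 1.0), (4.0, 50.0)]
kept = drop_outliers(data, threshold=10.0)
```

An optimizer would then vary `params` to minimize `sum_squared_error(params, kept, ...)` and report `rms_error` at the final parameters.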
Config file
You must specify initial guesses for all the Gay-Berne parameters. These are doubles.
Ex. dInitE0 = 0.5
You can exclude a parameter from optimization by setting its optimize flag to boolean false (0).
Ex. bFindE0Flag = 0 #don’t optimize this
Ex. bFindE0Flag = 1 #optimize this
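For scripts that need to inspect these settings, a minimal reader might look like the sketch below. It assumes one "key = value" pair per line with "#" starting a comment, as in the examples above, and it infers the value type from the key prefix (d for double, b for boolean). That prefix rule is a guess from the key names, not something taken from the program source.

```python
def parse_config_text(text):
    """Parse 'key = value  #comment' lines into a dict.
    Assumption: 'dXxx' keys hold doubles and 'bXxx' keys hold 0/1 flags."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comment
        if not line or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        typed_prefix = len(key) > 1 and key[1].isupper()
        if typed_prefix and key[0] == "d":
            settings[key] = float(value)
        elif typed_prefix and key[0] == "b":
            settings[key] = value == "1"
        else:
            settings[key] = value  # leave anything else as a string
    return settings


example = """dInitE0 = 0.5
bFindE0Flag = 0 #don't optimize this
"""
settings = parse_config_text(example)
```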
strLJEnergyFile
strLJEnergyDir
If you are using Lennard-Jones data, for example from Brad's program, put the energy-versus-distance information in the file named by strLJEnergyFile in the config file. This file must reside in the directory named by strLJEnergyDir in the config file. The same directory must also contain the corresponding xyz files, named by their atomic separation distance. Ex. d.10.9.xyz, where 10.9 is the separation distance in angstroms. The energy-versus-distance file is tab delimited, distance then energy, with each line terminated by a carriage return.
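A quick way to sanity-check such a directory before a run is sketched below. The helper names are hypothetical and the matching tolerance is an assumption. The sketch only relies on the layout described above: tab-delimited distance and energy per line, and xyz files named like d.10.9.xyz.

```python
def parse_energy_table(text):
    """Parse tab-delimited 'distance<TAB>energy' lines.
    splitlines() accepts carriage-return, newline, or both as line ends."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        distance, energy = line.split("\t")
        rows.append((float(distance), float(energy)))
    return rows


def distance_from_xyz_name(filename):
    """Extract 10.9 from a name like 'd.10.9.xyz'."""
    if not (filename.startswith("d.") and filename.endswith(".xyz")):
        raise ValueError("unexpected xyz file name: " + filename)
    return float(filename[len("d."):-len(".xyz")])


def match_xyz_files(rows, filenames, tol=1e-6):
    """Pair each (distance, energy) row with its xyz file name, or None
    when no file with that separation distance is present."""
    by_distance = {distance_from_xyz_name(n): n for n in filenames}
    matched = []
    for dist, energy in rows:
        name = next(
            (n for d, n in by_distance.items() if abs(d - dist) <= tol), None
        )
        matched.append((dist, energy, name))
    return matched
```

Any row that comes back paired with None points to a missing or misnamed xyz file in the strLJEnergyDir directory.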
The arc file is the archive of geometries to be tested (if testing using an archive)
The TINKER key file is needed so that analyze knows whether to use AMOEBA or whichever all-atom parameter set you want.
The scaling values are scaling factors for the optimizer.
Ex. dScaleDw = 7.0e1
Fminimum is an optimizer parameter (I don’t know what this does).
Fgrdmin is the minimum gradient goal for the optimizer. It will stop when it reaches this.
fDelta is the size of the delta value used in calculating the numerical gradients with respect to the parameters.
dOutlierEnergyThreshold is the cutoff energy of the all-atom model for excluding a given geometry as an outlier.
bUseLJdata: 1 specifies to use the included Lennard-Jones data in the required format;
0 specifies to obtain parameters using analyze and an archive file (this requires analyze in the same directory).
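To make the roles of fDelta and Fgrdmin concrete, here is a minimal sketch of a numerical gradient and a stopping test. It uses a forward difference and a largest-component test. Both are assumptions, since the optimizer may use a central difference or a gradient norm instead, and the function names are hypothetical.

```python
def numerical_gradient(func, params, delta):
    """Forward-difference gradient of func at params.
    'delta' plays the role of fDelta (assumed forward difference)."""
    base = func(params)
    grad = []
    for i in range(len(params)):
        shifted = list(params)
        shifted[i] += delta  # perturb one parameter at a time
        grad.append((func(shifted) - base) / delta)
    return grad


def gradient_goal_reached(grad, grad_min):
    """Assumed stopping rule: every gradient component is within the
    Fgrdmin-style goal in absolute value."""
    return max(abs(g) for g in grad) <= grad_min
```

A delta that is too large gives a biased gradient, while one that is too small amplifies round-off error, so fDelta may need tuning alongside the scaling factors.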
Needed to compile and run
Source files (.c, .cpp, and .h)
Libtinker.a
Makefile
Needed to run
Executable
Config file
Xyzeul (no longer required)
Tinker.key
Amoeba
Analyze (only if not using Lennard-Jones data)
Specified archive file (only if not using Lennard-Jones data)
Lennard-Jones data directory with the correct file (only if using Lennard-Jones data)
For output graphs (the Gbvsallatom plot script is also needed)
Needed to debug
Fortran and C++ source files
Gdb
Eclipse (technically not required but manually using gdb without an IDE interface can be time consuming)
Compiled code with the -g option on (should be turned on in the makefile)
First compile libtinker.a with -g, then compile the parameterization code with -g
This is currently on in the makefile and the libtinker.a in the crs-param-cpp directory is already compiled with -g
Scripting
Currently all of the input to the parameterization program comes from either the config file or the archive file. This means that Perl scripts, etc. must modify the config file to specify a different archive or different initial guesses. Because it may be useful to specify initial guesses or archive files on the command line, this may be added.
All output is sent as a stream to stdout, with the exception of the GB versus all-atom energy-versus-distance profile plot, which is produced as well.
Output for long jobs can be redirected to a file using the ">" operator.
Ex. coarse_grain.exe > output.txt
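Since scripts currently have to edit the config file between runs, a small helper along these lines may be handy. It is a sketch only: the function name is hypothetical, and it assumes the "key = value #comment" line layout shown in the config examples.

```python
def set_config_value(text, key, value):
    """Return config text with the value of 'key' replaced.
    Any trailing '#comment' on that line is preserved."""
    out = []
    found = False
    for line in text.splitlines():
        body, sep, comment = line.partition("#")
        name = body.split("=", 1)[0].strip()
        if "=" in body and name == key:
            new_line = f"{key} = {value}"
            if sep:
                new_line += " #" + comment
            out.append(new_line)
            found = True
        else:
            out.append(line)
    if not found:
        raise KeyError(key)  # refuse to silently add unknown keys
    return "\n".join(out) + "\n"
```

A driver script could read CrsParamConfig.ini, call this once per changed guess, write the file back, and then launch coarse_grain.exe with its output redirected as shown above.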
Future Directions
Integration with Brad's RNA program to obtain Gay-Berne values for biologically important molecules. (Done)
Fortran 90 ?