Talk:Mod:Hunt Research Group/ChemShell

Input for ChemShell

To run a QM/MM optimisation, three input files are required:

 opt.chm
 cluster.pun
 ff.dat

ff.dat : the forcefield in ChemShell format - The ff.dat file generated in MolCluster needs to be edited!
opt.chm: ChemShell input file.
cluster.pun: coordinate, atom_charges and connectivity records

All input files are explained in the Explaining ChemShell files page from the Hunt Research Group wiki home page. The ff.dat is also explained in the Force Field Parameters page.

As part of generating the "cut" cluster using MolCluster you will have generated a range of files, e.g. check the directory "cluster_1". The files you need now are cluster_1.chm, ff.dat and opt.chm.

Note that MolCluster has not generated cluster.pun, but it has generated cluster_n.chm, the coordinates file, which is used to generate cluster.pun.

Log in to CX1 and copy your cluster directories over, then ...

To generate cluster.pun from cluster_n.chm, load ChemShell and then run cluster_n.chm directly on the CX1 login shell:

 module load chemshell mpi
 chemsh.x cluster_n.chm

A successful run generates the file cluster.pun, and the screen output will look like this:

Initialising ChemShell 3.5.0 on linux
c_create/======================================== Tstep:    0.1 Ttot:    0.1 ==
ChemShell exiting code 0

UPDATE June 2017

Giuseppe has installed chemshell onto my HPC account, so I'm not using the chemshell code on the HPC available to everyone.

To generate the cluster.pun file, copy the cluster_1.chm file generated by MolCluster to the HPC. Create a new file called cluster_pun_generate and paste the following into it.

export TCLROOT=/work/$USER/tcl/tcl8.4.20___gcc_4.4.7/
export TCLLIBPATH=/work/$USER/ChemShell/chemsh-3.5.0___intel-suite__2016.3___tcl8.4.20___gcc_4.4.7/tcl/
export TCL_LIBRARY=/work/$USER/tcl/tcl8.4.20___gcc_4.4.7/lib
export LD_LIBRARY_PATH=$TCL_LIBRARY:$LD_LIBRARY_PATH

/work/klw14/ChemShell/chemsh-3.5.0___intel-suite__2016.3___tcl8.4.20___gcc_4.4.7/bin/chemsh.x cluster_*.chm

The lines in this script are specific to my file set up so make sure you are calling the correct files.

You then need to give yourself permission to run the script: "chmod u+x cluster_pun_generate"

To run the script and generate the cluster.pun file, type "./cluster_pun_generate"

UPDATE END

NB: check the connectivity in the cluster.pun

add some more information on this

if you have only water molecules there should be no issue, but if you have a solvated species spurious "connectivity" may occur.

in the CuSO4 example ....

search for /conn lines to remove, total number to change

EXAMPLE SECTION OF FILE

edit the opt.chm according to the system under study
open opt.chm and edit the 'qm_theory' options (nproc, scfconv, g98_mem, charge, multiplicity, basis set, method etc.)

Set nproc to one less than the number of processors requested from PBS. MolCluster generates defaults for the other options. maxcyc should relate to the number of degrees of freedom: g09 suggests 3N+20, so for 51 atoms this is 3×51+20 = 173. Memory is given in bytes, so 1,000,000 is 1 MB.
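
The rules of thumb above can be written down as a small helper. This is only a sketch: the function name and the argument names are made up for illustration, the 16 PBS processors in the example are an assumed value, and the memory conversion simply follows the convention stated on this page (1,000,000 = 1 MB).

```python
def suggested_settings(n_atoms, pbs_ncpus, mem_mb):
    """Return opt.chm values following the rules of thumb on this page."""
    if pbs_ncpus < 2:
        raise ValueError("need at least 2 PBS processors so that nproc >= 1")
    return {
        "nproc": pbs_ncpus - 1,         # one less than requested from PBS
        "maxcyc": 3 * n_atoms + 20,     # the 3N+20 suggestion
        "g98_mem": mem_mb * 1_000_000,  # page convention: 1,000,000 = 1 MB
    }


# 51 atoms, an assumed 16 processors requested from PBS, 640 MB of memory
print(suggested_settings(n_atoms=51, pbs_ncpus=16, mem_mb=640))
```

For 51 atoms this reproduces the maxcyc of 173 quoted above.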

There are two different pages to look at for finding QM keywords: (1) a general page for all qm_theory options that can be used in ChemShell, and (2) a page specific to Gaussian. I can't seem to provide links!

Page 1 can be found by looking on the ChemShell user manual homepage and clicking QM interfaces under the Energy/Gradient Evaluators heading on the left of the screen.

Page 2 can be found by clicking gaussian, after following the step for page 1.

you will need to add conn and mxexcl options manually to this file after the mm_theory.

mxexcl depends on the QM region. In the ChemShell manual it is described as "Allocation parameter for excluded atom list, may need to be increased for qm/mm calculations with a large qm region"; please refer to the ChemShell manual for more details (link).

ChemShell has a list of all the atoms, and those in the QM region need to be excluded from the MM computation. This number needs checking!! This relates to the number ...

conn=cluster.pun tells chemshell to read the connectivity from the cluster.pun file
for our example you will need to change ...

    qm_theory=gaussian : { nproc=15 maxcyc=200 scfconv=5 basis=631gdp g98_mem=640000000 charge=0 mult=2 hamiltonian=b3lyp } \
     mm_theory=dl_poly : { mm_defs=ff.dat \
     conn=cluster.pun \
     mxexcl=500 \

NB: please make sure that there is no space left after the backslash in every line. If there is any space after the '\', the job will be terminated
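
Trailing spaces are invisible in most editors, so it can help to check opt.chm before submitting. The following is a minimal sketch (the function name is made up for illustration) that reports every line where a space or tab follows the final backslash.

```python
import re
from typing import List

# A backslash followed by at least one space or tab, then end of line.
_BAD_CONTINUATION = re.compile(r"\\[ \t]+$")


def lines_with_space_after_backslash(text: str) -> List[int]:
    """Return the 1-based numbers of lines with whitespace after a trailing backslash."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if _BAD_CONTINUATION.search(line)
    ]


# The second line has a stray space after the backslash.
sample = "mm_defs=ff.dat \\\nconn=cluster.pun \\ \nmxexcl=500 \\\n"
print(lines_with_space_after_backslash(sample))
```

To check a real file, pass the contents of opt.chm, e.g. `lines_with_space_after_backslash(open("opt.chm").read())`; an empty list means no offending lines were found.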

The submit script, submit_opt.sh, to run the ChemShell optimisation is here (link). Alternatively, if ChemShell is installed on your HPC account, you can use the submission script here.

This script submits the job. Note that only the xx file is redirected while the job is running; all other files are only copied over at termination.
add a comment re maxcycle being changed: the wall time is a hard limit and will kill the job. For CuSO4 + water (QM + x active + y frozen), a maxcycle of x and a wall time of y are a good option.

Don't forget that different queues have different walltimes, and the more processors you use the "more" time you have.

to submit the job

qsub submit_opt.sh

The ChemShell optimisation creates a set of checkpoint files, Gaussian files and 'path' files along with the output 'opt.out'.
load 'path_active.xyz' in VMD to follow the optimisation

To restart a job

Rename 'opt.out' to 'opt.out-n' (n=1,2,3..), otherwise the previous opt.out will be overwritten and the data will be lost. Please keep the format 'opt.out-n', since the python script that analyses the data reads this file format.
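
The renaming step is easy to get wrong after several restarts, so here is a minimal sketch of how it could be automated. Both function names are hypothetical; the only thing taken from this page is the 'opt.out-n' naming scheme.

```python
import re
from pathlib import Path


def next_backup_name(existing_names, base="opt.out"):
    """Return 'opt.out-n' with n one higher than any backup already present."""
    pattern = re.compile(re.escape(base) + r"-(\d+)")
    used = []
    for name in existing_names:
        match = pattern.fullmatch(name)
        if match:
            used.append(int(match.group(1)))
    return f"{base}-{max(used, default=0) + 1}"


def back_up_output(directory=".", base="opt.out"):
    """Rename opt.out in `directory` to the next free opt.out-n and return the new path."""
    folder = Path(directory)
    source = folder / base
    if not source.exists():
        raise FileNotFoundError(source)
    target = folder / next_backup_name([p.name for p in folder.iterdir()], base)
    source.rename(target)
    return target


print(next_backup_name(["opt.out", "opt.out-1", "opt.out-2"]))
```

Calling `back_up_output()` in the job directory before resubmitting performs the actual rename.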

open opt.chm

Increase maxcycle at the end of the file and add the 'restart = yes \' command as the second-to-last line:

     list_option = full \
     maxcycle = 1500 \
     dump = 1 \
     restart = yes \
     result = cluster_opt.pun
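
The same edit can be scripted. The sketch below makes two assumptions that are not confirmed by this page: that opt.chm contains exactly one line starting with 'maxcycle =' and one starting with 'result =', as in the fragment above, and that the maxcycle line is never the last line of the command (so it always ends in a backslash). The function name is made up for illustration.

```python
import re


def prepare_restart(chm_text, new_maxcycle):
    """Raise maxcycle and add a 'restart = yes' line before the 'result =' line."""
    lines = chm_text.splitlines()
    has_restart = any(re.match(r"\s*restart\s*=", line) for line in lines)
    out = []
    for line in lines:
        indent = line[: len(line) - len(line.lstrip())]
        if re.match(r"\s*maxcycle\s*=", line):
            # Assumption: maxcycle is not the final option, so keep the backslash.
            line = f"{indent}maxcycle = {new_maxcycle} \\"
        if re.match(r"\s*result\s*=", line) and not has_restart:
            out.append(f"{indent}restart = yes \\")
        out.append(line)
    return "\n".join(out) + "\n"


tail = (
    "     list_option = full \\\n"
    "     maxcycle = 500 \\\n"
    "     dump = 1 \\\n"
    "     result = cluster_opt.pun\n"
)
print(prepare_restart(tail, 1500))
```

Running it twice does not add a second restart line, so it is safe to reuse for later restarts.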

Edit the submit script to read the checkpoint files before submitting the job (link).

Analysis of the ChemShell optimisation

A python utility has been developed by Vincent to extract the various contributions to the total QM/MM energy, atom-atom distances and other parameters from the ChemShell output

Among the files generated, 'n_Cu_OW_first_solvation_shell_init_and_final_dist.txt' lists the number of each of the water oxygens in the first solvation shell (here the first solvation shell of Cu, along with the distance of each Ow from Cu), where 'n' is the cluster number (link).

To trace back the particular water molecule in the 'n_Cu_OW_first_solvation_shell_init_and_final_dist.txt' to the DL_POLY HISTORY file, first map it to the Ow atom number in the opt.chm
To map the Ow number to that in opt.chm, go to 'active_atoms' in opt.chm

active_atoms = { 1 2 3 4 5 6 7 8 9 13 14 15 16 17 18 31 32 33 52 53 54 55 56 57 58 59 60 61 62 63 67 68 69 73 74 75 76 77 78 79 80 81 85 86 87 88 89 90

Bring the cursor to '{'.
Say, for example, the Ow number from 'n_Cu_OW_first_solvation_shell_init_and_final_dist.txt' is 163: type '163' and press 'w'.
This will give the Ow number in opt.chm (e.g. 365). To go back to '{', enter 163 and press 'b'.
The cluster folder has 'atom_no_mapping.txt', created by MolCluster, which contains a list of 'orig_atom_no' and 'new_atom_no'. 'orig_atom_no' is the number in the HISTORY file and 'new_atom_no' is the corresponding atom number in the opt.chm.
Open 'atom_no_mapping.txt' and map the atom number '365' to the 'orig_atom_no' list.
E.g. if the 'orig_atom_no' is '643', use '643' in the script to draw the path of the centre of mass of a molecule throughout an animation (link).
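
The two lookups above (count along the active_atoms list, then look the result up in atom_no_mapping.txt) can also be done without an editor. This is a sketch under stated assumptions: the counting keystrokes look like a vi-style word motion, so the n-th entry after '{' is taken to be the target, and atom_no_mapping.txt is assumed to hold two whitespace-separated integer columns, orig_atom_no first and new_atom_no second. The real file layout may differ, and both function names are hypothetical.

```python
import re


def nth_active_atom(chm_text, n):
    """Return the n-th (1-based) entry of the active_atoms list in opt.chm text."""
    match = re.search(r"active_atoms\s*=\s*\{([^}]*)", chm_text)
    if match is None:
        raise ValueError("no active_atoms list found")
    atoms = [int(token) for token in re.findall(r"\d+", match.group(1))]
    if not 1 <= n <= len(atoms):
        raise IndexError(f"active_atoms has {len(atoms)} entries, asked for {n}")
    return atoms[n - 1]


def orig_atom_number(mapping_text, new_atom_no):
    """Look up orig_atom_no for a given new_atom_no (assumed column order: orig, new)."""
    for line in mapping_text.splitlines():
        fields = line.split()
        # Rows that are not two integers (e.g. a header) are skipped.
        if len(fields) >= 2 and fields[0].isdigit() and fields[1].isdigit():
            if int(fields[1]) == new_atom_no:
                return int(fields[0])
    raise KeyError(new_atom_no)


chm = "active_atoms = { 1 2 3 4 5 6 7 8 9 13 14 15 }"
mapping = "orig_atom_no new_atom_no\n643 365\n"  # made-up file content
print(nth_active_atom(chm, 10))
print(orig_atom_number(mapping, 365))
```

With the real files, the number from the distance file goes into `nth_active_atom`, and its result goes into `orig_atom_number` to obtain the HISTORY atom number.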