Optimising charged molecules in electric fields
This page deals with the optimisation of charged molecules - in this case the ions of which ionic liquids are composed - in the presence of an electric field. The main problems are:
- When optimising in internal coordinates, rotations and translations are removed. Hence, the orientation of the electric field with respect to the standard orientation of the molecule stays fixed.
- The optimisation is best performed in cartesian coordinates to allow for the movements induced by the force the field excerts on the atoms.
- Translation of the molecule along the electric field is not desired and has to be turned off manually.
- Since there are always residual forces on the atoms with the field, the optimization never 'finishes' with a normal termination and has to be inspected manually.
The solution is to use the manual optimization kit, whose use and design is explained below. If you find any errors, please report them to F. Philippi in MSRH 601.
Getting Started
First start by downloading the FieldKit.
Then follow the procedure:
- Unzip it and put everything in a working directory on hpc of your choice. Then adjust the number of CPUs, mem, private queue etc in FIELD.pbs. Don't change the 'XXXX'.
- Make sure the 'Element_Table' file is there, too. It should look like this.
- Execute the 'distribute' script with 'bash distribute'. It asks you to enter the individual system names without file extension, in my case the names would be e.g. TFSI_cis, TFSI_trans etc. If you just want to generate new .pbs files, comment out the three lines that change 'submitter' and 'Element_Table'.
- When you're done, simply type 'done' and press return. The 'distribute' script then prepares the separate .pbs files, the element tables and a file called submitter which can be used later to submit the jobs all at once.
- What it doesn't do is adjust the atom numbers in the element tables, which has to be done manually once. While doing that, also check that the format is valid - fortran will crash otherwise.
- Check that the .xyz files are there and in the gaussian format without the atom number and comment line.
- Change the HEADER.com file according to your needs.
- Now just submit everything by typing 'bash submitter' into the command line.
If neither of us has made a mistake, then your jobs should be running now. While everything is running, take a look at the output section to see what you will get (should expect) as output.
The content of FieldKit
This section explains what all these files and scripts are doing. The FieldKit also contains a file named 'procedure' with a rough outline of what to do.
FIELD.pbs
This is probably the most important file of all, and the first one you should take a look at when aiming at changing the procedure. The XXXX dummy is replaced by the names of your systems by the script 'distribute'. I will use the file 'TFSI_cis.pbs' as an example. It begins with the usual PBS section:
#!/bin/bash #PBS -l walltime=72:00:00 #PBS -l select=1:ncpus=32:mem=124gb:avx=TRUE #PBS -N TFSI_TS #PBS -j oe
This particular one is intended to end up as a general72 job on cx1. Then the gaussian module is loaded, and the name of the file is defined:
module load gaussian INFILE=TFSI_cis
Be aware that the appropriately named files should be present, like Element_Table_TFSI_cis, as they are copied to the working directory:
cp $PBS_O_WORKDIR/$INFILE.xyz $TMPDIR/opt.xyz cp $PBS_O_WORKDIR/Center.f90 $TMPDIR cp $PBS_O_WORKDIR/Element_Table_TFSI_cis $TMPDIR/Element_Table cp $PBS_O_WORKDIR/HEADER.com $TMPDIR cd $TMPDIR
The last step in the preparation:
echo "EXECUTION IN "$TMPDIR > FORTRANLOG date >> FORTRANLOG cat /dev/null > Energies head -1 Element_Table > NATOM head -1 Element_Table >> NATOM gfortran Center.f90
Here, the following files are created:
- FORTRANLOG - which contains start time, end time, and the length of the center-of-mass vector of each iteration
- Energies - contains the energies in Hartree
- the a.out executable together with its required modules is produces by compiling Center.f90 with the GNU fortran compiler.
After that, the actual optimization takes place in the loop:
for i in {1..10}
do
./a.out >> FORTRANLOG
cat HEADER.com opt.xyz > $INFILE.com
g09 < $TMPDIR/$INFILE.com > $TMPDIR/$INFILE.log
newzmat -ichk -oxyz opt
grep -m 1 "E(" $INFILE.log | cut -c 26-39 >> Energies
done
After shifting the center of mass of the molecule to the origin of the cartesian coordinate system, HEADER.com and opt.xyz are concatenated, producing the working comfile with which gaussian is invoked. By that, opt.chk is generated (see HEADER.com), which is then converted to opt.xyz so the cycle can start again. Additionally, the Energy after the optimization is reported to 'Energies'.
Finally, everything is finalised and copied:
date >> FORTRANLOG mkdir $PBS_O_WORKDIR/$INFILE mkdir $PBS_O_WORKDIR/optimized cp $TMPDIR/$INFILE.log $PBS_O_WORKDIR/$INFILE cp $TMPDIR/FORTRANLOG $PBS_O_WORKDIR/$INFILE cp $TMPDIR/Energies $PBS_O_WORKDIR/$INFILE mv opt.xyz TFSI_cis.xyz cp $TMPDIR/TFSI_cis.xyz $PBS_O_WORKDIR/optimized
The gaussian logfile, 'FORTRANLOG' and 'Energies' are collected in separate folders in the working directory, whereas the output structures (in gaussian format, not the real format) of all the ions are collected in one folder named 'optimized'.
HEADER.com
This is the gaussian input file, but without the cartesian coordinates, which are added by the FIELD.pbs script as stated above. This is the standard form:
%nprocshared=48 %mem=246GB %chk=opt.chk # opt=(maxcycles=10,cartesian) iOP(1/28=1) field=X+50 MP2/cc-pVTZ int=ultrafine symmetry=none ELECTRIC FIELD OPTIMIZATION -1 1
The first two lines you have to manually adjust to match your FIELD.pbs script. Don't change the name of the checkpoint file unless you know what you are doing. If you want, you can preorient your files using a dipole moment vector to make the calculation converge faster. The overlay 1/28 controls which translations and rotations are used in the optimisation process. The overlay has been tested on a single methoxide anion that has been intentionally misoriented, whereas the following behaviour was observed:
- If using iop_(1/28)=0, then the Molecule stays in the input orientation. This is surprising since if, as the manual suggests, nothing is removed during coordinate transformations, then everything should be included, leading the molecule to translate. Curiously, the same behaviour is obtained with iop_(1/28)=6, but not with -2, which should equal 0 according to the manual (with -2 in gaussian 09, the optimization becomes very unstable and overshoots. In g16, the behaviour is indeed identical to 0).
- 'Removing' one coordinate by setting iop_(1/28) to one led to the molecule translating, and only rotating marginally. iop_(1/28)=1 is the desired mode of operation, since when only one coordinate is removed, it has to be the right one when (if) the translation is absent. The latter is usually the case after a few iterations, when the orientation approaches equilibrium.
- With iop_(1/28)=2, the methoxide still translates, although the rotation happens faster.
- Setting iop_(1/28) to 3,4 or 5, the desired behaviour is obtained: MeO- aligns with the electric field without traveling along it.
- It has to be pointed out here that omitting the cartesian option is not helpful at all.
Apart from iop_(1/28)=-2, the behaviour is the same for g09 and g16. Note that for transition states, iOP(1/28=0) has to be requested instead of iOP(1/28=1).
'distribute' and 'submitter'
The file 'distribute' is a bash script that asks for the names of the systems and then changes the XXXX wildcards in FIELD.pbs accordingly, writing the result into distinct pbs scripts. To easily submit the generated files, it also prepares a new bash script called 'submitter', which looks like this:
#!/bin/bash qsub FSI_cis.pbs qsub FSI_trans.pbs qsub TFSI_cis.pbs qsub TFSI_trans.pbs
By just typing 'Bash submitter' to the command line, you are submitting all the files as separate jobs. That might prove valuable if you want to change their demand separately, e.g. in my case, FSI can be run in the general queue with 62GB, whereas TFSI requires much more computational resources. If you now that your system(s) allow for it, you can consider using an array job.
The 'Element_Table'
The element table is required by the fortran centering script. Here is an example for a valid element table:
15 H 1.00 N 14.0067 O 15.999 C 12.011 F 18.998 S 32.06 X 0.0
This is actually the file 'Element_Table_TFSI_cis' produced by 'distribute'. The first line is the number of atoms, which you have to adjust by yourself. Then follow the Elements exactly as they appear in HEADER.com, with their masses given in the same row as second entry. These masses are the ones used for the center procedure and have to be provided as floats, i.e. '0.0', but not '0'. The last line is a dummy atom. This tells fortran to not read further, and has the additional advantage that dummy atoms can be used. They have no influence on the center procedure since their mass is set to zero.
Centering the Molecule
The file Center.f90 contains source code written in FORTRAN 90 that serves the purpose of subtracting the vector of the center of mass of all the atoms. For that, the masses are required, which are read from a file with the filename 'Element_Table'. The input is in the form of a file 'opt.xyz', which will be overwritten by the corrected xyz file. If no errors are encountered, then the only standard output (usually UNIT 6) is the length of the vector that is subtracted.
The output
After successful execution of the above script, the following four files will be reported into the appropriate subfolders of the working directory.
xyz file
This is a file with just the coordinates of the atoms and their element symbol. It will be located in a subfolder 'optimized'. Below is TFSI_cis.xyz as an example. Again, note the absence of the first two lines.
N -0.357366585080 -0.898171688763 0.203702564151 S 0.177178976496 -0.543087336453 1.676074559819 S 0.532894281061 -0.936854691043 -1.144888319458 C -0.028020803200 1.333610444064 1.817643351233 C -0.696437115173 -0.070243634980 -2.286534060030 F -0.170331408827 -0.026073703769 -3.532456495289 F 0.301741522370 1.732104137265 3.067927971787 O 0.635336215505 -2.296412831703 -1.696451557835 O 1.734625901504 -0.083537867534 -1.185172705530 O -0.786792557139 -1.047230419472 2.656562088502 O 1.616538018609 -0.758987320134 1.905915808875 F -1.292723619825 1.708718187068 1.582646080687 F -0.916139900149 1.194035021477 -1.888498049800 F 0.773401486267 1.988586701397 0.956411457591 F -1.867335419317 -0.711009961024 -2.351481561715
FORTRANLOG
Here is an example of how your FORTRANLOG could look like:
EXECUTION IN /var/tmp/pbs.2445418[1].cx1 Mon 4 Mar 20:07:50 GMT 2019 2.40722704 0.231841758 0.113191992 0.143129453 0.700006068 0.827584445 0.202516750 1.55257192E-02 3.18809599E-02 5.43816872E-02 1.49539197E-02 3.00093030E-04 0.132388204 4.78270315E-02 2.46396363E-02 3.36616440E-03 9.67344549E-03 2.82673282E-03 1.05692418E-02 1.56903919E-02 2.96897665E-02 Mon 4 Mar 21:57:39 GMT 2019
It gives you the starting and ending point of the optimization and the absolute value of the shift during the center procedure. The shift is relatively large at the beginning, indicating a badly or differently centered molecule, and diminishes rapidly. Also, sometimes the Value is relatively large, like 0.13 in the middle of the optimization process. This indicates that the molecule translated, and that the wrong coordinate had been removed by IOp(1/28). Since this overlay is chosen to be one, a low value (at least below 0.1 Å) shows that only the translation along the field had been removed, confirming that the procedure is working properly.
Energies
Even more important to decide whether the optimization is finished is the 'Energies' file, which should look like this:
-1351.70813031 -1351.70926443 -1351.70973841 -1351.71004072 -1351.71021741 -1351.71029994 -1351.71033094 -1351.71034684 -1351.71034954 -1351.71034951 -1351.71034951 -1351.71034951 -1351.71034951 -1351.71034951 -1351.71034951
It gives you the calculated energy at the end of each iteration. Here, the optimization is finished after the 10th cycle.
XXXX.log
Also, you will of course receive the usual gaussian log file. However, it will always terminate with an Error since there are residual forces on the atoms. Take a look at it anyway to confirm that there is nothing pathological like 'Input Errors', 'Error in internal coordinate system' or 'galloc: could not allocate memory'.
Helpful Add-Ons
Last but not least, this section contains two supplementary parts of the kit that you might find useful.
The Array Job
There is a pbs script available for array jobs of this type. It is advisable to use the array job, as long as your jobs don't have strongly differing requirements in terms of CPU and RAM usage. Please also note that this array job collects the intermediate geometries in a trajectory file.
Preorienting Molecules
It is sensible to preorient the molecules so that their electrical dipole moment is aligned with the electric field, which speeds up the calculation by doing most of the rotation. There is currently a script written in FORTRAN 90 available doing just that. To use the script, you first need the dipole moments, which are available from gaussians multipole analysis. You could, for example, use a command like
grep -e Tot= systemnumber_*_.log > dipoles
which takes the dipole moment out of your log files and writes them into a separate file. Be aware that only the last dipole moment in a logfile with normal termination is sensible to use. If you want to use a script, then it is sensible to add the number of systems to the first line. As an example may serve the following DIPOLES file:
6 FSI_cis -6.7730 -2.9602 -9.6133 Tot= 12.1265 FSI_trans -6.6079 -3.1530 -10.0511 Tot= 12.4351 FSI_TS -6.3151 -3.0996 -9.5853 Tot= 11.8898 TFSI_cis -3.3579 0.1484 -1.7588 Tot= 3.7935 TFSI_trans -2.7663 0.0002 -1.2194 Tot= 3.0231 TFSI_TS -3.5951 0.0018 -1.6231 Tot= 3.9445
The 'X=', 'Y=' and 'Z=' are removed here. You can also keep them, but have to add dummies into the read section of the script.