Multidimensional Scans of Internal Coordinates
The potential energy surface of a molecule is a function in 3N-6 dimensions. Thus, the PES is virtually impossible to obtain for reasonably sized molecules and even more impossible to interpret. Sometimes a subset of the PES is sufficient, with all the other degrees of freedom being chosen as to minimize the energy. 2D scans, which can be easily visualized by contour plots, are usually a sensible choice.
Introduction
A simple 2D Scan of the backbone dihedrals of the FSI anion at RB3LYP/6-31+G(d,p) is shown in the picture below as an example.
Many of the degrees of freedom are relatively stiff or of minor importance for the conformational space, like the bond lengths in FSI. However, the procedure provided here can easily be changed to scan e.g. an angle and a dihedral, two angles, or every desired combination. Naturally, also scans in 3 or more dimensions can be implemented by only minor changes in the input files. However, as the computational effort scales as 𝒪(2n), the limit of feasibility is reached very quickly.
Performing the 2D scan
Start by downloading and extracting this .zip file. The following step-by-step manual explains how to use the files.
Creating the Input Structure
Create a suitable inputfile after the provided template, including the Link 0 commands:
%nprocshared=4 %mem=2gb %chk=PLATZHALTER.chk
The wildcard preceding .chk will be replaced by a bash script later. The route section should include opt=modredundant:
# opt=(Modredundant) Rb3lyp/6-31+g(d,p)
This serves the purpose of introducing the constraints, as can be seen from the following z-matrix:
N S 1 1.69478113 S 1 1.69478113 2 132.47526929 C 2 1.80000000 1 109.185464 3 D4213 C 3 1.80000000 1 109.189243 2 D5312 F 5 1.34618746 3 108.791335 1 174.509706 F 4 1.34618746 2 108.802903 1 174.556496 O 3 1.47726612 5 99.42046041 6 -64.86454114 O 3 1.47726612 5 105.36186599 6 58.944352 O 2 1.47726612 4 105.36186599 7 58.984361 O 2 1.47726612 4 99.42046041 7 -64.86454114 F 4 1.34618746 2 111.879718 7 107.569662 1 F 5 1.34618746 3 111.881954 6 107.569147 1 F 4 1.34618746 7 107.965588 12 108.456943 -1 F 5 1.34618746 6 107.962763 13 108.457201 -1 Variables: D4213 = XXXX D5312 = YYYY D 4 2 1 3 F D 5 3 1 2 F
It is of crucial importance that the z-matrix is constructed carefully by hand, which might take a few attempts. A badly constructed z-matrix leads to problems for example the oxygen atoms not following the change in the dihedrals, or even clashes between atoms. The last two lines fix the desired dihedrals, and the wildcards 'XXXX' and 'YYYY' will both be iteratively replaced, producing a matrix of .com files for the scan. This is superior to the relaxed scan in gaussian (not to be confused with the 'scan' keyword, which produces rigid scans), which doesn't allow for parallelization of the grid elements, but rather sweeps through. Additionally, artefacts are sometimes encountered when atoms get caught:
This is not a problem with a good z-matrix.
Preparing the Gaussian Input Files
The wildcards are replaced with the desired values by the bash script '2DSCAN'. Before executing it, the system name at the beginning has to be modified, as well as the range of values to scan. '2DSCAN' also changes the wildcards in 'gau2D.pbs' and 'gauMP2.pbs', and creates a list called 'list' with the file names.
Optimizing the other Dimensions
Omitting the 'opt' keyword would lead to a rigid scan, whereas here, a relaxed scan is desired. To this end, every point in the grid is separately optimised as distinct gaussian job, keeping the two chosen internal coordinates constant. Only one array job is submitted by invoking 'qsub gau2D.pbs', which takes full advantage of the cluster. When all the jobs are finished, they should be checked by running 'bash read_and_check'. This serves the purpose of identifying issues, hence it might be beneficial to adjust the conditions to the problems usually encountered. After running this script, a file called 'aFAILED' containing all the failed jobs will be present. These jobs can then be resubmitted, e.g. by just changing 'list' to 'aFAILED' in gau2D.pbs and adjusting the number of array members to the number of lines in 'aFAILED'. This procedure should be repeated until 'aFAILED' is empty, with the exception of inherent problems that are tolerable.
Performing higher-level SP calculations
The optimized structures have to be extracted first to perform a higher level single point calculation on top of the previous constrained optimization. The script 'read_prepare' serves this purpose, and also prepares a new list called 'MP2list'. It makes use of the gaussian 'newzmat' utility, which means that the gaussian module has to be loaded. Link 0 commands and the route section can be found and altered in 'read_prepare', too. When every file is ready, the single point calculations can be started with 'qsub gauMP2.pbs'. Again, the results should be checked with 'read_and_check_MP2', resubmitting everything which is reported into 'aFAILED'.
Reading results from the files
The two additional files 'read_MP2' and 'read_RB3LYP' read the desired information from the output logfiles. In the example, the total dipole moment and the total energies are of interest. Take care to read the right energy, especially for the MP2 calculations:
DVALUE=$(tac $FILENAME | grep -m 1 Tot= | cut -c 90-)
echo $ZAEHLER" "$D9START" "$DVALUE >> B3LYPRESULTS_DIPOLES
EUVALUE=$(tac $FILENAME | grep -m 1 "E(" | cut -c 26-39)
echo $ZAEHLER" "$D9START" "$EUVALUE >> B3LYPRESULTS_ENERGIES
DVALUE=$(tac $FILENAME | grep -m 1 Tot= | cut -c 90-)
echo $ZAEHLER" "$D9START" "$DVALUE >> MP2RESULTS_DIPOLES
EUVALUE=$(tac $FILENAME | grep -m 1 "EUMP2" | cut -c 39-)
echo $ZAEHLER" "$D9START" "$EUVALUE >> MP2RESULTS_ENERGIES
Here, the order of lines is reverted first, followed by an extraction of the first search result, followed by cutting out the desired information. $ZAEHLER and $D9START iterate over the two variables, thus the produced files can be directly important into visualization software like Matlab or Origin.
Tidying up
The default procedure produces over 30000 per ion, namely 5329 files of each .chk, .log, and 2*5329 .com and job log files. It is advisable to archive the output in a condensed form. Usually, it is sufficient to keep the structures as .xyz files, which can be extracted using the newzmat utility. This can also be done with the script 'preparexyz'. The files to be archieved can be combined and compressed, followed by deleting everything that is no longer needed:
zip logfiles.zip *.log zip xyz.zip *.xyz rm *.o* rm *.log rm *.chk rm *.xyz