Jump to content

Multidimensional Scans of Internal Coordinates

From ChemWiki

The potential energy surface of a molecule is a function in 3N-6 dimensions. Thus, the PES is virtually impossible to obtain for reasonably sized molecules and even more impossible to interpret. Sometimes a subset of the PES is sufficient, with all the other degrees of freedom being chosen as to minimize the energy. 2D scans, which can be easily visualized by contour plots, are usually a sensible choice.

Introduction

A simple 2D Scan of the backbone dihedrals of the FSI anion at RB3LYP/6-31+G(d,p) is shown in the picture below as an example.

Many of the degrees of freedom are relatively stiff or of minor importance for the conformational space, like the bond lengths in FSI. However, the procedure provided here can easily be changed to scan e.g. an angle and a dihedral, two angles, or every desired combination. Naturally, also scans in 3 or more dimensions can be implemented by only minor changes in the input files. However, as the computational effort scales as 𝒪(2n), the limit of feasibility is reached very quickly.

Performing the 2D scan

Start by downloading and extracting this .zip file. The following step-by-step manual explains how to use the files.

Creating the Input Structure

Create a suitable inputfile after the provided template, including the Link 0 commands:

%nprocshared=4
%mem=2gb
%chk=PLATZHALTER.chk

The wildcard preceding .chk will be replaced by a bash script later. The route section should include opt=modredundant:

# opt=(Modredundant) Rb3lyp/6-31+g(d,p)

This serves the purpose of introducing the constraints, as can be seen from the following z-matrix:

N  
S   1   1.69478113          
S   1   1.69478113           2   132.47526929         
C   2   1.80000000           1   109.185464    3   D4213    
C   3   1.80000000           1   109.189243    2   D5312    
F   5   1.34618746           3   108.791335    1   174.509706   
F   4   1.34618746           2   108.802903    1   174.556496   
O   3   1.47726612           5   99.42046041   6   -64.86454114       
O   3   1.47726612           5   105.36186599  6   58.944352    
O   2   1.47726612           4   105.36186599  7   58.984361    
O   2   1.47726612           4   99.42046041   7   -64.86454114       
F   4   1.34618746           2   111.879718    7   107.569662      1
F   5   1.34618746           3   111.881954    6   107.569147      1
F   4   1.34618746           7   107.965588    12  108.456943     -1
F   5   1.34618746           6   107.962763    13  108.457201     -1
Variables:
D4213 		  =		   XXXX
D5312 		  =		   YYYY

D 4 2 1 3 F
D 5 3 1 2 F

It is of crucial importance that the z-matrix is constructed carefully by hand, which might take a few attempts. A badly constructed z-matrix leads to problems for example the oxygen atoms not following the change in the dihedrals, or even clashes between atoms. The last two lines fix the desired dihedrals, and the wildcards 'XXXX' and 'YYYY' will both be iteratively replaced, producing a matrix of .com files for the scan. This is superior to the relaxed scan in gaussian (not to be confused with the 'scan' keyword, which produces rigid scans), which doesn't allow for parallelization of the grid elements, but rather sweeps through. Additionally, artefacts are sometimes encountered when atoms get caught:

This is not a problem with a good z-matrix.

Preparing the Gaussian Input Files

The wildcards are replaced with the desired values by the bash script '2DSCAN'. Before executing it, the system name at the beginning has to be modified, as well as the range of values to scan. '2DSCAN' also changes the wildcards in 'gau2D.pbs' and 'gauMP2.pbs', and creates a list called 'list' with the file names.

Optimizing the other Dimensions

Omitting the 'opt' keyword would lead to a rigid scan, whereas here, a relaxed scan is desired. To this end, every point in the grid is separately optimised as distinct gaussian job, keeping the two chosen internal coordinates constant. Only one array job is submitted by invoking 'qsub gau2D.pbs', which takes full advantage of the cluster. When all the jobs are finished, they should be checked by running 'bash read_and_check'. This serves the purpose of identifying issues, hence it might be beneficial to adjust the conditions to the problems usually encountered. After running this script, a file called 'aFAILED' containing all the failed jobs will be present. These jobs can then be resubmitted, e.g. by just changing 'list' to 'aFAILED' in gau2D.pbs and adjusting the number of array members to the number of lines in 'aFAILED'. This procedure should be repeated until 'aFAILED' is empty, with the exception of inherent problems that are tolerable.

Performing higher-level SP calculations

The optimized structures have to be extracted first to perform a higher level single point calculation on top of the previous constrained optimization. The script 'read_prepare' serves this purpose, and also prepares a new list called 'MP2list'. It makes use of the gaussian 'newzmat' utility, which means that the gaussian module has to be loaded. Link 0 commands and the route section can be found and altered in 'read_prepare', too. When every file is ready, the single point calculations can be started with 'qsub gauMP2.pbs'. Again, the results should be checked with 'read_and_check_MP2', resubmitting everything which is reported into 'aFAILED'.

Reading results from the files

The two additional files 'read_MP2' and 'read_RB3LYP' read the desired information from the output logfiles. In the example, the total dipole moment and the total energies are of interest. Take care to read the right energy, especially for the MP2 calculations:

DVALUE=$(tac $FILENAME | grep -m 1 Tot= | cut -c 90-)
echo $ZAEHLER" "$D9START" "$DVALUE >> B3LYPRESULTS_DIPOLES
EUVALUE=$(tac $FILENAME | grep -m 1 "E(" | cut -c 26-39)
echo $ZAEHLER" "$D9START" "$EUVALUE >> B3LYPRESULTS_ENERGIES

DVALUE=$(tac $FILENAME | grep -m 1 Tot= | cut -c 90-)
echo $ZAEHLER" "$D9START" "$DVALUE >> MP2RESULTS_DIPOLES
EUVALUE=$(tac $FILENAME | grep -m 1 "EUMP2" | cut -c 39-)
echo $ZAEHLER" "$D9START" "$EUVALUE >> MP2RESULTS_ENERGIES

Here, the order of lines is reverted first, followed by an extraction of the first search result, followed by cutting out the desired information. $ZAEHLER and $D9START iterate over the two variables, thus the produced files can be directly important into visualization software like Matlab or Origin.

Tidying up

The default procedure produces over 30000 per ion, namely 5329 files of each .chk, .log, and 2*5329 .com and job log files. It is advisable to archive the output in a condensed form. Usually, it is sufficient to keep the structures as .xyz files, which can be extracted using the newzmat utility. This can also be done with the script 'preparexyz'. The files to be archieved can be combined and compressed, followed by deleting everything that is no longer needed:

zip logfiles.zip *.log
zip xyz.zip *.xyz
rm *.o*
rm *.log
rm *.chk
rm *.xyz