
Mod:Hunt Research Group/database

From ChemWiki

File Storage

Navigate to the correct database folder and either select the appropriate molecule folder or create a new one. The appropriate files should be stored within this folder.

Structures

All structures should be stored as log files from standalone frequency calculations. Because of their size, no .chk files or linked Opt Freq log files should be stored unless the optimisation is particularly interesting. Different conformers should be named sequentially (e.g. A, B, C... or 1, 2, 3...) with the lowest energy conformer labelled as the 'A' or '1' conformer.

Scans

Potential energy scans should be stored as the log files of the resulting scan. Where a scan has been completed over multiple jobs, these files should be grouped together within a separate, appropriately named subfolder. For example, 6a and 6b could be the two 180° fragments of a full 360° scan of the CCNC dihedral within the molecule.

File naming

File names should be clear and searchable, so don't worry about the length of the file name, only its content.

File name formats should roughly follow: StructureID_JobType_Method_BasisSet_OtherDetails.log

For example: VX_A_Freq_B3LYP_6311gdp_GD3BJ_SMD.log

  • VX_A is the structure ID; this is the lowest energy conformer, hence it is labelled A.
  • Freq is the job type, showing that the log file corresponds to a standalone frequency calculation.
  • B3LYP is the DFT functional used; methods should be upper case if possible.
  • 6311gdp refers to the use of the 6-311G(d,p) basis set. All letters within the basis set should be lower case and no special characters, e.g. +, * or (, should be used within the file name.
  • GD3BJ indicates the use of the GD3BJ empirical dispersion correction within the method.
  • SMD shows that the job was run in the solvent phase using SMD. In contrast, no specifier or "GP" indicates a gas-phase structure.
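The naming convention above can be sketched as a small helper. This is only an illustration: the function names build_log_name and sanitise_basis_set are hypothetical, and stripping every non-alphanumeric character from the basis set is an assumption about how "no special characters" should be applied (note that it does not distinguish, for example, 6-311G(d,p) from 6-311+G(d,p)).

```python
import re


def sanitise_basis_set(basis_set):
    """Lower-case the basis set and drop every non-alphanumeric character.

    Assumption: special characters such as +, *, ( and , are simply removed.
    """
    return re.sub(r"[^0-9a-z]", "", basis_set.lower())


def build_log_name(structure_id, job_type, method, basis_set, other_details=()):
    """Assemble StructureID_JobType_Method_BasisSet_OtherDetails.log."""
    parts = [
        structure_id,
        job_type,
        method.upper(),  # methods are upper case where possible
        sanitise_basis_set(basis_set),
    ]
    parts.extend(other_details)  # e.g. dispersion and solvent-model tags
    return "_".join(parts) + ".log"


# Reproduces the example name given in the text above.
print(build_log_name("VX_A", "Freq", "B3LYP", "6-311G(d,p)", ["GD3BJ", "SMD"]))
```

Forcing the method to upper case will not suit every functional name, so treat that step as optional.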

Database file

All structures stored on the database should be accompanied by a completed spreadsheet detailing the files stored. Ideally there will be one spreadsheet file per folder. If more are required, it is likely that your structures can be separated into two different molecular subfolders. The database template can be downloaded from File:DB Temp.xlsx and an example can be viewed at File:DB Example.xlsx.

Header

Filling in the header section of the database requires:

  • A Molecule/Group identifier, which is simply the molecule name or similar.
  • The Method, which is the model chemistry used, e.g. B3LYP/6-311G(d,p) + GD3BJ.
  • The Phase, which states whether the calculations were done in the gas phase or the solvent phase.

SMD Parameters

If the calculation has been done in the gas phase then 'N/A' can simply be entered and the boxes greyed out.

However, if the calculation has been carried out in the solvent phase then the SMD parameters used must be listed. The parameter categories are the SMD parameters used for ionic liquids (ILs), and the values should be entered below each of the parameter names.

In the special case that water has been used, Solvent=Water can be entered after the SMD parameters cell and the parameters greyed out.

Names and identifiers

The first three columns of the table require the names and identifiers of the structures:

  • Filename is simply the full name of the file in the database folder.
  • Thesis/Paper Reference relates to any identifier used for the structure within the corresponding thesis / paper. If the structure has not been mentioned directly within a publication then leave this blank.
  • Personal Naming allows you to enter your personal reference for the structure e.g. an older filename or number.

Thermodynamic parameters

The next 9 columns are concerned with the thermodynamic parameters of the structure. These are:

  • Energy of the structure in atomic units, which is simply the SCF Done energy from the Freq file.
  • ΔE, the energy change relative to the lowest energy conformer. This is recorded in kJ mol⁻¹ to 2 d.p. and should be highlighted green when a value is entered, by conditional formatting already present in the database.
  • Gibbs free energy in atomic units. This is the thermally corrected Gibbs free energy pulled from the thermodynamic section of the frequency log file.
  • ΔG, the free energy change relative to the lowest energy conformer. Like ΔE, this is recorded to 2 d.p. in kJ mol⁻¹ and should automatically be highlighted green when a value is filled in. The highlighting of these two columns is there to show any changes in the ranking of the conformers between the two values.
  • Thermally corrected enthalpy in atomic units. Again, this can be found in the thermodynamic section of the Freq file.
  • ΔH, the enthalpy change relative to the lowest energy conformer, recorded to 2 d.p. in kJ mol⁻¹.
  • Entropy term of the Gibbs free energy (TS) in atomic units, which can be calculated from the Gibbs free energy and enthalpy values (TS = H − G).
  • TΔS, the change in the entropy term relative to the lowest energy conformer, recorded to 2 d.p. in kJ mol⁻¹.
  • Zero-point energy (ZPE) in atomic units.
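The columns above could be filled in with a short script along the following lines. This is a rough sketch, not the group's own tooling: the function names are hypothetical, the regular expressions assume the usual Gaussian frequency-log line labels ("SCF Done", "Sum of electronic and thermal Free Energies", "Sum of electronic and thermal Enthalpies", "Zero-point correction"), and the conversion factor of about 2625.50 kJ mol⁻¹ per Hartree is hard-coded. Check the extracted values against the log file before entering them.

```python
import re

HARTREE_TO_KJ_MOL = 2625.4996  # approximate value of 1 Hartree in kJ/mol

# Assumed Gaussian log line labels for each spreadsheet column.
PATTERNS = {
    "E": r"SCF Done:\s+E\(\S+\)\s+=\s+(-?\d+\.\d+)",
    "G": r"Sum of electronic and thermal Free Energies=\s+(-?\d+\.\d+)",
    "H": r"Sum of electronic and thermal Enthalpies=\s+(-?\d+\.\d+)",
    "ZPE": r"Zero-point correction=\s+(-?\d+\.\d+)",
}


def parse_thermo(log_text):
    """Return E, G, H, ZPE and TS (all in Hartree) from the text of a Freq log."""
    values = {}
    for key, pattern in PATTERNS.items():
        matches = re.findall(pattern, log_text)
        if not matches:
            raise ValueError(f"could not find {key} in log text")
        values[key] = float(matches[-1])  # keep the last occurrence in the file
    values["TS"] = values["H"] - values["G"]  # rearranged from G = H - TS
    return values


def relative_table(conformers):
    """Build rows of relative values (kJ/mol, 2 d.p.) sorted by dE.

    conformers maps a conformer name to the dict returned by parse_thermo.
    The reference is the conformer with the lowest SCF energy.
    """
    ref = min(conformers.values(), key=lambda v: v["E"])
    rows = []
    for name, v in conformers.items():
        row = {"name": name}
        for key, label in (("E", "dE"), ("G", "dG"), ("H", "dH"), ("TS", "TdS")):
            row[label] = round((v[key] - ref[key]) * HARTREE_TO_KJ_MOL, 2)
        rows.append(row)
    return sorted(rows, key=lambda r: r["dE"])
```

A typical use would be to read each log file with open(path).read(), pass the text to parse_thermo, collect the results in a dict keyed by conformer name, and hand that dict to relative_table.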

Low frequencies

The 6 low frequencies are recorded across the next 6 columns. These are pulled directly from the frequency log file.

Conditional formatting is already present within the spreadsheet to highlight the frequencies. Frequencies which are highly negative (< -15.0 cm⁻¹) will be highlighted in red, and comments should be made on any potential reasons for them, if known.
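As an illustration of the check described above, the following sketch pulls the six low frequencies from the text of a frequency log and lists any that fall below the -15.0 threshold. It assumes the values sit on the first line beginning "Low frequencies ---", as in typical Gaussian output; the function names are hypothetical.

```python
import re

RED_THRESHOLD = -15.0  # frequencies below this are highlighted red in the spreadsheet


def parse_low_frequencies(log_text):
    """Return the numbers on the first 'Low frequencies ---' line of a log."""
    for line in log_text.splitlines():
        if line.strip().startswith("Low frequencies ---"):
            numbers = line.split("---", 1)[1]
            # The pattern also separates values printed without a space between them.
            return [float(x) for x in re.findall(r"-?\d+\.\d+", numbers)]
    raise ValueError("no 'Low frequencies ---' line found")


def flag_negative(frequencies, threshold=RED_THRESHOLD):
    """Return the frequencies that are more negative than the threshold."""
    return [f for f in frequencies if f < threshold]
```

Any value returned by flag_negative is a candidate for a note in the Comments column.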

Comments

Comments made can be:

  • Anything interesting about the file or structure e.g. an interesting optimisation
  • Reasons for any oddities seen in the low frequencies or the thermodynamic parameters

Formatting

  • One table should be used per set of structures / conformers and also per method used.
  • Copy and paste the header and table headings block onto a new sheet or further down on the same sheet for a new set of structures.
  • This may break up the cell range over which the conditional formatting operates. In that case, the ranges can be changed by:
    • Clicking on the Conditional Formatting option under the Home tab.
    • Clicking Manage Rules.
    • Selecting to show formatting rules for the whole sheet from the dropdown list. Changes can then be made to the cell ranges.
    • If this is too unfamiliar then just highlight manually.
  • The structures should be ordered by their ΔE values. If they are not already ordered in this way:
    • Highlight the full range of structures across all of the columns in the table.
    • Right click and select Sort then Custom Sort.
    • Order the structures by column E, the ΔE column.