Jump to content

GFP : Green Fluorescent Protein

From ChemWiki

Creating an input file

This is a step-by-step guide to creating a .com input file for ONIOM calculations using GFP. We've prepared things using amber as the low layer of theory.

1. Download the text .pdb file for GFP from http://www.rcsb.org/pdb/explore/explore.do?structureId=1W7S
2. Open GaussView 4.1 and open the .pbd file (crtl+o). The "drag and drop" open does not work with .pdb files. You must specify that hydrogens are added: GaussView recognizes amino acid sequences from its library and assigns the right atom types and charges accordingly.
3. Save the file as a .com (ctrl+s).
4. Open the .com in TextEdit, and in the route section replace the method and basis set with the keyword "amber".
5. Open this new .com with GaussView 4.1 which will attempt to assign as many atom types and charges as possible.
6. Save this assigned file as a com (ctrl+s).

This was the basic part. Now we want to assign the atom types GaussView missed:

7. Export the .com file from step 6 to a cluster folder.
8. Open an X11 terminal window and navigate to the .com file location. Open using the nedit command.
9. In the nedit window, toggle the statistics line on with alt+a. This will allow lines without atom types or atom charges to be located easily for future use.
10. Search for "- " in nedit. Note down the lines with missing atom types. The atom numbers can be easily deduced from the line numbers.
11. Cut and paste the lines with missing atom types to a new input file.
12. Open the new input file with GaussView 3.0. Manually assign the missing atom types if possible. If it is not possible to tell (e.g.: the atom is terminal but this may be because the input was cut and paste), remember the approximate location and open the .com file from step 6. Manually navigate to the region of interest. Make sure you toggle labels off when navigating to speed the process up!
13. Update the nedit file with the assigned atom types, save.
14. By way of review there are 4 unassigned atom types on lines 480, 4141, 4877, 5964. The correct assignments are N-NB-, O-O-, S-S-, N-NB- respectively.

And finally we want to assign the missing charges:

15. In the route section of the updated .com file on the cluster, specify the route section as "# amber=qeq". Make sure you include a minimum memory of 4000mb and also a suitable location for the .chk file.
16. Run the input file on the cluster.
17. You will encounter an error message concerning unknown bond stretches and bends. To correct this, specify "amber=(qeq,hardfirst) in the route section and input the bond stretches and bends at the end of the input as usual.
18. Now run the input file. Hopefully this will work and you will get the missing charges.
19. Open the .log file in an nedit window using X11 as before and copy all the missing charges. Paste into another nedit window displaying the input file from step 17 and save. You now have a suitable input file for running ONIOM calculations.

Files from different stages of input setup

Raw input files

The files below have no charges on the chromophore.

Here are two files, one with GFP with a surrounding water matrix and one with just GFP (chromophore in high level)
Media:gfp_water_1w7s_ready_to_use.gjf
Media:gfp_nowater_1w7s_ready_to_use.gjf

Ready-to-use input files

Media:gfp_nowater_charges.gjf
Media:gfp_water_charges.gjf

Input file with an additional amino acid in the high region (and 1 water molecule)

Media:gfp_water_charges_amino_acid.gjf

Calculations

  • gfp_water_charges_complete_1.com : Chromophore well defined (hydrogens, bonds... OK), with the water environment

Single point energy successful

Input, Output :Media:input_spe_1.gjf

  • gfp_water_charges_complete_opt_1.com : Optimisation of the previous system.

Optimisation successful

Input, Output :Media:input_opt_1.gjf

  • gfp_water_charges_complete_opt_1_freq_1.com : Frequency calculation of the previous optimisation

Input, Output : Media:input_freq_1.gjf

But one imaginay frequency, so not a minimum but a transition state

  • gfp_water_charges_complete_2.com : Chromophore well defined, with the water environment and the additional amino acid plus one water molecule in high level of accuracy.

Single point energy calculation successful

Input, Output :Media:input_spe_2.gjf

  • gfp_water_charges_complete_opt_2.com : Optimisation of the previous system

Problem of job killed, calculation requires a memory of 14000mb and a minimum of 500h

Input, Output:Media:input_opt_2.gjf

  • gfp_nowater_charges_complete_energy_1.com : chromophore well defined and just the chromophore in the high level.

single point energy successful.

  • gfp_nowater_charges_complete_energy_2.com : test of oniom(b3lyp/6-31g(d):amber=hardfirst)=scalecharge=555500 (no change concerning this and embedcharge)
  • gfp_nowater_charges_complete_opt_1.com : optimisation of the previous system

opt successful

Media:gfp_nowater_charges_complete_opt_1.gjf

  • gfp_nowater_charges_complete_opt_1_freq_1.com : frequency calculation of the previous optimised geometry

Media:gfp_nowater_charges_complete_opt_1_freq_1.gjf

No imaginary frequencies, but ModelModes produced fewer modes than expected (only 40). So, using a script written by David Mendive Tapia, the output can be cut as required to create space to display the remaining modes: Media:jobscript_live.log. This script prints just the geometry and frequency in the output. Change the .log extension to .sh

  • gfp_nowater_charges_complete_opt_1_freq_1_red : ModelModes
  • gfp_nowater_charges_complete_opt_1_freq_2_red : Freq calculation
  • gfp_nowater_charges_complete_opt_model_1.com : Optimisation with the option of printing the geometry of the model system

Successful

Media:gfp_nowater_charges_complete_opt_model_1.gjf

  • gfp_nowater_charges_complete_opt_model_1_freq_1.com : Frequency calculation of the model geometry, but failed because of the "page layout", so gaussian did not recognize this way of presenting the geometry.

Media:gfp_nowater_charges_complete_opt_model_1_freq_1.gjf

  • gfp_nowater_charges_complete_opt_model_1_freq_2.com : Optimised geometry changed, 9 imaginary frequencies obtained

Media:gfp_nowater_charges_complete_opt_model_1_freq_2.gjf

To run some frequency calculation on the model, use the IOp(1/33=1) keyword in order to print the model geometry.

Aurelie's comments

Optimisation

First I run optimisations with the file without water : optimisation = no problem, frequency calculation = problem...
Once the optimised geometry found I checked if there was some imaginary frequency, so I ran with the keyword Freq=Modelmodes which enables me to get a file with a reasonable size and the information that I want to get. I was lucky because I had no imaginary frequency, but I just got 40 modes with the keyword Freq=ModelModes, instead of 3*36-6 = 102, so there is a problem.
I imagined two possibilities : because the calculation was very big it crashed when I wanted to print the final output even if I restrained what it had to print. Or there is a limitation in he code and so just the 40 last modes can be printed.
So I tried to run an other calculation with the optimised geometry by asking the geometry of the model with IOp(1/33=1), once the file obtained, I had a look at the output and the geometry of the model was behind the part below :

                            At end of L120:                       
 ---------------------------------------------------------------------
 Center     Atomic      Atomic             Coordinates (Angstroms)
 Number     Number       Type             X           Y           Z
 ---------------------------------------------------------------------
      1          7           6       23.050591    3.469428   -1.008981

So that was strange because when I did this previously (using IOp(1/33=1)) I got the atoms outside the model with a -1, and here I have for the atoms outside the model 6 and for the model some big number(as a code), as below

 ---------------------------------------------------------------------
 Center     Atomic      Atomic             Coordinates (Angstroms)
 Number     Number       Type             X           Y           Z
 ---------------------------------------------------------------------
.
.
.
.
    473          6    20001005       -0.375481    1.652856    0.482589
    474          7    20001016       -0.273771    1.012963    1.610015
    475          6    20001005        0.594142   -0.055349    1.376775
    476          6    20001005        1.027366   -0.007835   -0.040053
    477          8    20001024        1.707451   -0.776975   -0.709324
    478          7    20001018        0.428530    1.145189   -0.536877
    479          6    20001001        0.563225    1.567653   -1.931383

So just take the geometry and hope for a frequency calculation on the model did not work, so I took the geometry and delete all the atoms which did not correspond to the model system, and replace the big code number by 0.


Back to ONIOM for biomolecules

Back to ONIOM tutorial (G03)