Mod:Hunt Research Group/new gf script

A new gf script

This is a python script which does two jobs:

Edits the .com files so that
1. the .chk filename matches the .com filename,
2. the number of processors requested (%nprocs) is correct,
3. the memory requested for Gaussian (%mem) is correct.
Submits the job(s) (you can use wildcards in the command) using the standard runscripts ([link]) stored in a folder in your home directory.

Note: the editing step overwrites the first 3 lines of the .com file, so those lines must exist and must come BEFORE your method line. It does not matter what they contain (they can be blank). If your method line is within the first 3 lines it will be overwritten and the job will not work.
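The requirement above can be checked before running the script. The following is a minimal sketch, assuming that the method (route) line is the first line starting with `#`. The function names `route_line_index` and `safe_to_overwrite` are hypothetical and are not part of gf.py.

```python
def route_line_index(lines):
    """Return the index of the first line starting with '#', or None if there is none."""
    for i, line in enumerate(lines):
        if line.lstrip().startswith("#"):
            return i
    return None


def safe_to_overwrite(lines, n_header=3):
    """True if the file has at least n_header lines and the route line lies below them."""
    idx = route_line_index(lines)
    return len(lines) >= n_header and (idx is None or idx >= n_header)


# Example: the route line is on line 4, so lines 1-3 can be overwritten safely.
example = ["%nprocs=1\n", "%mem=1GB\n", "%chk=old.chk\n", "# b3lyp/6-31g(d) opt\n"]
print(safe_to_overwrite(example))
```

To check a real input you would pass the result of `readlines()` on the .com file to `safe_to_overwrite`.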

How to set up for Gaussian16

Go to /rds/general/user/username/home/bin and type vi gf.py. Press i, then copy and paste in the following code:

(instructions continue below the code - ignore the instructions in the code for now, they are just to help out if you are not looking at this page).


import os
import glob
import sys

############
#HOW TO RUN#
############

# 0. Save this script in /rds/general/user/username/home/bin as gf.py 

# 1. Enter your username here:

username = "ab1234"

# 2. Save the runscripts from https://wiki.ch.ic.ac.uk/wiki/index.php?title=Mod:Hunt_Research_Group/new_gf_script in a directory called /rds/general/user/username/home/bin/runscripts 

# 3. Go to the directory where your .com files are and type in: 
# python /rds/general/user/username/home/bin/gf.py version queuename nprocs inputfiles

# For example:
# python /rds/general/user/username/home/bin/gf.py a01 pqph 20 "01.com"     will submit job 01.com on 20 processors to the pqph queue to run on gaussian16 version a01.
# python /rds/general/user/username/home/bin/gf.py a01 pqph 40 "*.com"      will submit all the .com files in the directory to run on 40 processors in the pqph queue to run on gaussian16 version a01. 

#####################
#Setting the options#
#####################

#First variable sets which version of gaussian to use - use a01 usually.

version = sys.argv[1]
print("version = " + version)

#Second variable sets the queue - pqph

queue = sys.argv[2]
print("queue = " + queue)

#Third variable sets the number of processors:
#20, or 40

nprocs = sys.argv[3]
print("nprocs = " + nprocs)

#Last variable is the input files. You can use wildcards but you need to add "quotes" around the input 

job_input = sys.argv[4]
print("job_input= " + job_input)
jobs = []
for job in glob.glob(job_input):
    jobs.append(job)

print("list of jobs: ")
print(jobs)

#######################
#Finding the runscript#
#######################

runscript = "/rds/general/user/" + username + "/home/bin/runscripts/" + version + "_" + queue + "_" + nprocs
print("   ")
print("runscript = " + runscript)

########################
#Editing the input file#
########################

#Getting the correct nprocs and mem lines for the .com files:

gauss_nprocs_line = "undefined"
gauss_mem_line = "undefined"

if nprocs == "20":
    gauss_nprocs_line = "%nprocs=20\n"
    gauss_mem_line = "%mem=58000MB\n"
elif nprocs == "40":
    gauss_nprocs_line = "%nprocs=40\n"
    gauss_mem_line = "%mem=120000MB\n"
else:
    print("Error: nprocs (argument 3) may only be 20 or 40.")
    sys.exit(1)  #stop here so that no .com file is overwritten with "undefined" lines

print("gauss_nprocs_line = " + gauss_nprocs_line)
print("gauss_mem_line = " + gauss_mem_line)

#Editing the files:

for job in jobs:

    #Getting the chk line to overwrite the input file

    #os.path.splitext strips only the final extension, so file names containing extra dots are handled
    name = os.path.splitext(job)[0]
    chk_line = "%chk=" + name + ".chk\n"
    print(name + " chk_line = " + chk_line)

    #Overwriting the first 3 lines of the input file

    with open(job, "r") as file:
        file_lines = file.readlines()
        #print(file_lines)
    file_lines[0] = gauss_nprocs_line
    file_lines[1] = gauss_mem_line
    file_lines[2] = chk_line
    #print file_lines
    with open(job, "w") as file:
        file.writelines(file_lines)

    #####################
    #Submitting the jobs#
    #####################

    #Getting the terminal to do this: "qsub -N jobname -v in=name runscript"

    print("qsub -N " + name + " -v in=" + name + " " + runscript)
    os.system("qsub -N " + name + " -v in=" + name + " " + runscript)

Then change the username = line at the top to your own username, press Esc, and type :wq followed by Enter to save and quit.

Make a directory in /rds/general/user/username/home/bin/ called runscripts and put these scripts in it: runscripts
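Once the runscripts are in place, it can be worth confirming that the file gf.py will look for actually exists before submitting anything. The sketch below rebuilds the path using the same naming convention as the script; the helper name `runscript_path` is hypothetical, and `ab1234` is the placeholder username from the script.

```python
import os


def runscript_path(username, version, queue, nprocs):
    """Build the runscript path using the version_queue_nprocs convention from gf.py."""
    return ("/rds/general/user/" + username + "/home/bin/runscripts/"
            + version + "_" + queue + "_" + str(nprocs))


path = runscript_path("ab1234", "a01", "pqph", 20)
print("runscript = " + path)

# Warn early rather than letting qsub fail on a missing file.
if not os.path.isfile(path):
    print("Warning: runscript not found: " + path)
```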

To use

Go to the directory where your .com files are and type in: python /rds/general/user/username/home/bin/gf.py version queuename nprocs inputfiles

For example:

python /rds/general/user/username/home/bin/gf.py a01 pqph 20 "01.com"

will submit job 01.com on 20 processors to the pqph queue to run on gaussian16 version a01.

python /rds/general/user/username/home/bin/gf.py a01 pqph 40 "*.com"

will submit all the .com files in the directory to the pqph queue, each running on 40 processors with gaussian16 version a01.
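The quotes around the wildcard matter. Without them the shell expands *.com before Python starts, so the script's fourth argument would hold only the first matching filename and only that job would be submitted. With quotes, the script receives the literal pattern and expands it itself using glob. A small sketch of that expansion, using a temporary directory and made-up file names:

```python
import glob
import os
import tempfile

# Create a scratch directory containing three empty, hypothetical files.
workdir = tempfile.mkdtemp()
for fname in ("01.com", "02.com", "notes.txt"):
    open(os.path.join(workdir, fname), "w").close()

# This is what gf.py does with the quoted pattern it receives.
pattern = os.path.join(workdir, "*.com")
jobs = sorted(os.path.basename(p) for p in glob.glob(pattern))
print(jobs)  # ['01.com', '02.com']
```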

Possible options

The version option may be a01.

The queue option may be pqph.

The nprocs option can be 20 or 40.

Depending on which of these you use, a different runscript from the runscripts folder is called. The runscripts included by default are:

a01_pqph_20 a01_pqph_40

This is every combination of the above options. To use an option that is not provided, make the runscript you want and name it using the same convention (version_queue_nprocs). For a different nprocs option you must also edit the script slightly, in the section headed "Editing the input file", under "Getting the correct nprocs and mem lines for the .com files": copy and paste an elif block and change the nprocs and the memory requested to whatever you want, following the existing blocks as a pattern.
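If you expect to add several nprocs options, one alternative to copying elif blocks is to keep the settings in a dictionary, so that a new option is a single new entry. This is a sketch of that alternative, not the code in gf.py; the 20 and 40 entries are copied from the script above, and the names `MEM_FOR_NPROCS` and `link0_lines` are hypothetical.

```python
# Memory request for each supported processor count (values taken from gf.py).
MEM_FOR_NPROCS = {
    "20": "58000MB",
    "40": "120000MB",
}


def link0_lines(nprocs):
    """Return the (%nprocs line, %mem line) pair for a processor count given as a string."""
    if nprocs not in MEM_FOR_NPROCS:
        raise ValueError("nprocs may only be one of: " + ", ".join(sorted(MEM_FOR_NPROCS)))
    return "%nprocs=" + nprocs + "\n", "%mem=" + MEM_FOR_NPROCS[nprocs] + "\n"


gauss_nprocs_line, gauss_mem_line = link0_lines("40")
print("gauss_nprocs_line = " + gauss_nprocs_line)
print("gauss_mem_line = " + gauss_mem_line)
```

Raising an error for an unknown value also stops the script before any input file is touched.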

How to set up for g09

Go to /rds/general/user/username/home/bin and type vi gf.py. Press i, then copy and paste in the following code:

(instructions continue below the code - ignore the instructions in the code for now, they are just to help out if you are not looking at this page).


import os
import glob
import sys

############
#HOW TO RUN#
############

# 0. Save this script in /rds/general/user/username/home/bin as gf.py 

# 1. Enter your username here:

username = "rr1210"

# 2. Save the runscripts from https://wiki.ch.ic.ac.uk/wiki/index.php?title=Mod:Hunt_Research_Group/new_gf_script in a directory called /rds/general/user/username/home/bin/runscripts 

# 3. Go to the directory where your .com files are and type in: 
# python /rds/general/user/username/home/bin/gf.py version queuename nprocs inputfiles

# For example:
# python /rds/general/user/username/home/bin/gf.py d01 pqph 12 "01.com"      will submit job 01.com on 12 processors to the pqph queue to run on gaussian version d01.
# python /rds/general/user/username/home/bin/gf.py d01 pqchem 32 "*.com"      will submit all the .com files in the directory to run on 32 processors in the pqchem queue to run on gaussian version d01. 

#####################
#Setting the options#
#####################

#First variable sets which version of gaussian to use - use d01 usually or b01 for getting latest nbo.

version = sys.argv[1]
print("version = " + version)

#Second variable sets the queue - pqph or pqchem

queue = sys.argv[2]
print("queue = " + queue)

#Third variable sets the number of processors, as standard the options are the same as those defined by Tricia on the wiki:
#12, 24, 32, 40, 48

nprocs = sys.argv[3]
print("nprocs = " + nprocs)

#Last variable is the input files. You can use wildcards but you need to add "quotes" around the input 

job_input = sys.argv[4]
print("job_input= " + job_input)
jobs = []
for job in glob.glob(job_input):
    jobs.append(job)

print("list of jobs: ")
print(jobs)

#######################
#Finding the runscript#
#######################

runscript = "/rds/general/user/" + username + "/home/bin/runscripts/" + version + "_" + queue + "_" + nprocs
print("   ")
print("runscript = " + runscript)

########################
#Editing the input file#
########################

#Getting the correct nprocs and mem lines for the .com files:

gauss_nprocs_line = "undefined"
gauss_mem_line = "undefined"

if nprocs == "12":
    gauss_nprocs_line = "%nprocs=12\n"
    gauss_mem_line = "%mem=45000MB\n"
elif nprocs == "24":
    gauss_nprocs_line = "%nprocs=24\n"
    gauss_mem_line = "%mem=58000MB\n"
elif nprocs == "32":
    gauss_nprocs_line = "%nprocs=32\n"
    gauss_mem_line = "%mem=58000MB\n"
elif nprocs == "40":
    gauss_nprocs_line = "%nprocs=40\n"
    gauss_mem_line = "%mem=120000MB\n"
elif nprocs == "48":
    gauss_nprocs_line = "%nprocs=48\n"
    gauss_mem_line = "%mem=250000MB\n"
else:
    print("Error: nprocs (argument 3) may only be 12, 24, 32, 40 or 48.")
    sys.exit(1)  #stop here so that no .com file is overwritten with "undefined" lines

print("gauss_nprocs_line = " + gauss_nprocs_line)
print("gauss_mem_line = " + gauss_mem_line)

#Editing the files:

for job in jobs:

    #Getting the chk line to overwrite the input file

    #os.path.splitext strips only the final extension, so file names containing extra dots are handled
    name = os.path.splitext(job)[0]
    chk_line = "%chk=" + name + ".chk\n"
    print(name + " chk_line = " + chk_line)

    #Overwriting the first 3 lines of the input file

    with open(job, "r") as file:
        file_lines = file.readlines()
        #print(file_lines)
    file_lines[0] = gauss_nprocs_line
    file_lines[1] = gauss_mem_line
    file_lines[2] = chk_line
    #print file_lines
    with open(job, "w") as file:
        file.writelines(file_lines)

    #####################
    #Submitting the jobs#
    #####################

    #Getting the terminal to do this: "qsub -N jobname -v in=name runscript"

    print("qsub -N " + name + " -v in=" + name + " " + runscript)
    os.system("qsub -N " + name + " -v in=" + name + " " + runscript)

Then change the username = line at the top to your own username, press Esc, and type :wq followed by Enter to save and quit.

Make a directory in /rds/general/user/username/home/bin/ called runscripts and put these scripts in it: runscripts

To use

Go to the directory where your .com files are and type in: python /rds/general/user/username/home/bin/gf.py version queuename nprocs inputfiles

For example:

python /rds/general/user/username/home/bin/gf.py d01 pqph 12 "01.com"

will submit job 01.com on 12 processors to the pqph queue to run on gaussian version d01.

python /rds/general/user/username/home/bin/gf.py d01 pqchem 32 "*.com"

will submit all the .com files in the directory to the pqchem queue, each running on 32 processors with gaussian version d01.

Possible options

The version option may be either d01 or b01.

The queue option may be either pqph or pqchem.

The nprocs options can be 12, 24, 32, 40 or 48.

Depending on which of these you use, a different runscript from the runscripts folder is called. The runscripts included by default are:

b01_pqchem_12 b01_pqchem_24 b01_pqchem_32 b01_pqchem_40 b01_pqchem_48 b01_pqph_12 b01_pqph_24 b01_pqph_32 b01_pqph_40 b01_pqph_48 d01_pqchem_12 d01_pqchem_24 d01_pqchem_32 d01_pqchem_40 d01_pqchem_48 d01_pqph_12 d01_pqph_24 d01_pqph_32 d01_pqph_40 d01_pqph_48

This is every combination of the above options. To use an option that is not provided, make the runscript you want and name it using the same convention (version_queue_nprocs). For a different nprocs option you must also edit the script slightly, in the section headed "Editing the input file", under "Getting the correct nprocs and mem lines for the .com files": copy and paste an elif block and change the nprocs and the memory requested to whatever you want, following the existing blocks as a pattern.
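The naming convention means the full list of default g09 runscripts can be generated from the three option lists. The sketch below does this with itertools.product. It is only an illustration of the convention, for example for checking which files should be in your runscripts folder, and the variable names are hypothetical.

```python
import itertools

versions = ["b01", "d01"]
queues = ["pqchem", "pqph"]
nprocs_options = ["12", "24", "32", "40", "48"]

# One runscript name per (version, queue, nprocs) combination.
runscripts = ["_".join(combo)
              for combo in itertools.product(versions, queues, nprocs_options)]

print(len(runscripts))  # 20
print(" ".join(runscripts))
```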