Mod:Hunt Research Group/pimpQSUB
Back to the main wiki-page
How to... speed up your job queing
This is a little function that reduce the time spent in sending job to the hpc. It has also some useful auto-correction part that automatically sets your chk, your processors and your memory in the input file if not coherent with the terminal input. Write in the hpc terminal
vi ~/bin/gf
and copy and paste this:
#!/bin/bash ######################################################################################### # Default Variables # ######################################################################################### DIR=$PWD SFILE="$HOME/bin/.rng" PFILE="$HOME/bin/.presets" WLOG="$HOME/bin/.wlog" WLOGf="$HOME/bin/.wulog" CUE="pqph" CORE=12 MEM=47988 WALLT="119:59:00" QUIET=1 FRC=0 CORRECTION=1 RED='\033[0;31m' NC='\033[m' YELLOW='\033[1;33m' BLUE='\033[1;34m' OPT=0 ######################################################################################### # Debugging Function # ######################################################################################### debug(){ echo "1.PWD $PWD" echo "2.# $#" echo "3.* $*" echo "4.OPTARG ${OPTARG}" echo "5.OPTIND ${OPTIND}" echo "6.CUE ${CUE}" echo "7.CORE ${CORE}" echo "8.MEM ${MEM}" echo "9.WALLTIME ${WALLT}" echo "10.MAXDISK ${MAXDISK}" } ######################################################################################### # Help output # ######################################################################################### usage(){ echo -e "gaussian function v1.2 ${RED}NAME${NC} gfunc ${RED}SYNOPSIS${NC} gf [jobfilename.com] ${RED}OPTIONS${NC} ${BLUE}-q${NC} [cue] set the cue for the job, default is pqph ${BLUE}-c${NC} [cores] sets the number of cores, 8 is set as default ${BLUE}-m${NC} [memory] sets the quantity of memory to use (MB or GB) ${BLUE}-w${NC} [walltime] set the walltime ${BLUE}-g${NC} [gaussian version] sets the version of gaussian in use (e.g. d01) ${BLUE}-s${NC} silent send directly the job ${BLUE}-p${NC} [preset] preset = [0-9] load a preset show list the saved preset set open the editor to set the preset ${BLUE}-d${NC} [max disk] set the maxdisk ${BLUE}-n${NC} no correction no correction of the input file ${BLUE}-l${NC} [select] select = all work ID (e.g. '7197851') prints the log of the jobs sent ${BLUE}-h${NC} help ${RED}DESCRIPTION${NC} This function sends the jobs to the HPC. It also corrects the settings of your file automatically. In the input file the checkpoint filename is set equally to the input filename, the number of cores is set coherently to the input as the memory. This automatic correction can be disabled by -n option This function relies on a modified version of a script files given me by Claire (thanks Claire) that have to be placed in ~/bin. Next function that is projected to be added is the correction of the input file settings even if they are not written at all in the input file. Enjoy!!" } bytechunker(){ local __resultvar=${1} if [[ ${1} =~ .*"MB" ]]; then local MBYTES=${1%MB}; elif [[ $1 =~ .*"GB" ]] then local MBYTES=${1%GB}; let "MBYTES=MBYTES*1000"; else local MBYTES=${1-"800000"}; fi echo $MBYTES; eval $__resultvar="'$MBYTES'" } ######################################################################################### # Creating/Verifing the existence of some files used by the function # ######################################################################################### if [[ ! -d ~/bin ]] then mkdir ~/bin fi if [[ ! -f ${SFILE} ]] then echo '#!/bin/sh # submit jobs to the que with this script using the following command: # rng4 is this script # jobname is a name you will see in the qstat command # name is the actual file minus .com etc it is passed into this script as ${in%.com} # # qsub rng -N jobname -v in=name # batch processing commands #PBS -l walltime=119:59:00 #PBS -lselect=1:ncpus=12:mem=48000MB:tmpspace=400gb #PBS -j oe #PBS -q pqph #PBS -m ae # load modules # module load gaussian/g09-d01 # check for a checkpoint file # # variable PBS_O_WORKDIR=directory from which the job was submited. test -r $PBS_O_WORKDIR/${in%.com}.chk if [ $? -eq 0 ] then echo "located $PBS_O_WORKDIR/${in%.com}.chk" cp $PBS_O_WORKDIR/${in%.com}.chk $TMPDIR/. else echo "no checkpoint file $PBS_O_WORKDIR/${in%.com}.chk" fi # # run gaussian # g09 $PBS_O_WORKDIR/${in} cp $TMPDIR/${in%.com}.chk /$PBS_O_WORKDIR/. cp $TMPDIR/${in%.com}.wfx /$PBS_O_WORKDIR/. # cp *.chk /$PBS_O_WORKDIR/pbs_${in%.com}.chk # test -r $TMPDIR/fort.7 # if [ $? -eq 0 ] # then # cp $TMPDIR/fort.7 /$PBS_O_WORKDIR/${in%.com}.mos # fi # exit' > "${SFILE}" chmod a+x "${SFILE}" fi if [[ ! -f ${PFILE} ]] then echo '## Here is where to list the preset for gaussian calculations ## Each line is a preset and it is written in this way : ## [CUE];[CORES];[MEMORY];[WALLTIME];[GAUSSIAN VERSION];[MAXDISK] ## e.g pqph;8;14400MB;119:59:00;d01;800GB ## ## ##----------------------------------------------------------------------------- ## ## presets starts from next line pqph;12;48000MB;119:59:00;d01;400GB' > "${PFILE}" fi ######################################################################################### # Retriving inputted options # ######################################################################################### while getopts fhnsc:d:g:l:m:p:q:w: OPTION do if [[ $OPT -lt $OPTIND ]] then let "OPT=OPTIND - 1" fi case $OPTION in d) if [[ $OPTARG =~ .*"MB" ]]; then MAXDISK=${OPTARG%MB}; elif [[ $OPTARG =~ .*"GB" ]] then MAXDISK=${OPTARG%GB}; let "MAXDISK=MAXDISK*1000"; else MAXDISK=${OPTARG-"16000"}; fi ;; p) if [[ $OPTARG =~ [0-99] ]] then i=0; while read VAR do let "m=i-9" if [[ $m == $OPTARG ]] then echo $VAR IFS=";" read -ra PRESET <<< "$VAR" CUE=${PRESET[0]}; CORE=${PRESET[1]}; if [[ ${PRESET[2]} =~ .*"MB" ]]; then MEM=${PRESET[2]%MB}; elif [[ ${PRESET[2]} =~ .*"GB" ]] then $MEM=${PRESET[2]%GB} let "MEM=MEM*1000"; else MEM=${PRESET[2]-"16000"}; fi WALLT=${PRESET[3]}; GAUSS=${PRESET[4]}; if [[ ${PRESET[5]} =~ .*"MB" ]]; then MAXDISK=${PRESET[5]%MB}; elif [[ ${PRESET[5]} =~ .*"GB" ]] then MAXDISK=${PRESET[5]%GB}; let "MAXDISK=MAXDISK*1000"; else MAXDISK=${PRESET[5]-"16000"}; fi fi ((i++)) done < $PFILE; elif [[ $OPTARG == "show" ]] then i=1; while read VAR do if [[ $i -gt 9 ]] then IFS=";" read -ra PRESET <<< "$VAR" let "m=i-9" echo "$m - cue: ${PRESET[0]} cores:${PRESET[1]} memory: ${PRESET[2]} walltime: ${PRESET[3]} gaussian version: ${PRESET[4]} max disk: ${PRESET[5]}"; fi ((i++)) done < $PFILE exit elif [[ $OPTARG == "set" ]] then vi $PFILE exit fi WHITESTRIPES="Charged preset : $CUE $CORE $MEM $WALLT $GAUSS $MAXDISK"; ;; h) usage; exit; ;; q) CUE=${OPTARG-"pqph"}; ;; c) CORE=${OPTARG-"8"}; ;; f) FRC=1; ;; m) if [[ $OPTARG =~ .*"MB" ]]; then MEM=${OPTARG%MB}; elif [[ $OPTARG =~ .*"GB" ]] then MEM=${OPTARG%GB}; let "MEM=MEM*1000"; else MEM=${OPTARG-"16000"}; fi ;; n) CORRECTION=0; ;; g) GAUSS=${OPTARG}; ;; w) WALLT=${OPTARG}; ;; s) QUIET=1; ;; l) if [[ ${OPTARG} == 'all' ]] then cat $WLOG; else sed -n "/${OPTARG}/p" $WLOG; fi ;; ?) usage; exit; ;; esac done if [[ $# -eq 0 ]] then usage exit fi shift $OPT if [[ ! -f $GAUSS && -f "${GAUSS}.com" ]] then $GAUSS="${GAUSS}.com" fi ######################################################################################### # Correcting the script file # ######################################################################################### if [[ -n $MAXDISK ]] then sed -i "s/#PBS -lselect=1:ncpus=.*/#PBS -lselect=1:ncpus=${CORE}:mem=${MEM}MB:tmpspace=${MAXDISK}MB/" ${SFILE} else sed -i "s/#PBS -lselect=1:ncpus=.*/#PBS -lselect=1:ncpus=${CORE}:mem=${MEM}MB/" ${SFILE} fi sed -i "s/#PBS -l walltime=.*/#PBS -l walltime=${WALLT}/" ${SFILE} if [ "$CUE" != "PUBLIC" ] then grep -q "#PBS -q .*" ${SFILE} && sed -i "s/#PBS -q .*/#PBS -q ${CUE}/" "${SFILE}" || sed -i "14s/^/#PBS -q ${CUE}\n/" ${SFILE} else grep -q "#PBS -q .*" ${SFILE} && sed -i "/#PBS -q .*/d" ${SFILE} fi if [[ -n $GAUSS ]] then sed -i "s/module load gaussian\/g09-.*/module load gaussian\/g09-${GAUSS}/" ${SFILE} fi ######################################################################################### # Pushing the job to the HPC # ######################################################################################### while [ $# -gt 0 ] do if [[ ! -f ${1} && -f "${1}.com" ]] then set -- "${1}.com" fi NMFL=${1%.com} if [[ ! -f ${1} ]] then echo "\"${1}\" does not exists." exit elif [[ ! ${1} =~ .*'.com' ]] then echo "\"${1}\" is not a com file." exit fi if [[ $CORRECTION -eq 1 ]] then CD=$(pwd) CD=${CD//\//\\\/}'\/' GMEM=$((${MEM} * 15 / 20)) gmem=$(grep -i -c "%mem=.*/" ${1}) gpshar=$(grep -i -c "%nprocshared=.*/" ${1}) gchk=$(grep -i -c "%chk=./" ${1}) echo $gmem $gpshar $gchk grep -q "%mem=" ${1} && sed -i "s/%mem=.*/%mem=${GMEM}MB/" ${1} || sed -i "1s/^/%mem=${GMEM}MB\n/" ${1} grep -q "%nprocshared=" ${1} && sed -i "s/%nprocshared=.*/%nprocshared=${CORE}/" ${1} || sed -i "1s/^/%nprocshared=${CORE}\n/" ${1} sed -i "s/%chk=.*chk/%chk=${1%com}chk/" ${1} # sed -i "s/%chk=.*chk/%chk=$CD${1%com}chk/" ${1} if [[ -n $MAXDISK ]] then sed -i "s/maxdisk=.*B/maxdisk=${MAXDISK}MB/" ${1} fi fi if [[ $QUIET -eq 1 ]] then response="y" else echo -e "${RED}--------------------------------------------------------------------------------\n\n${NC}" more ${1} echo -e "${RED}--------------------------------------------------------------------------------\n\n${NC}" echo -e $YELLOW$WHITESTRIPES$NC echo -e "qsub -N ${NMFL:0:15} -v in=${1} ~/bin/.rng" echo "Cores: $CORE; Memory: $MEM; Cue: $CUE; Walltime: $WALLT; Gaussian-version: $GAUSS; Maxdisk: $MAXDISK" echo -e "${RED}------------------------------Are you sure? [y/N]-------------------------------${NC}" read -r -p " " response fi if [[ $response =~ ^([yY][eE][sS]|[yY])$ ]] then if [[ $FRC=1 ]] then out=$(qsub -p 100 -N${NMFL:0:15} -v in=${1%.com} ~/bin/.rng); else out=$(qsub -N "${NMFL:0:15}" -v in="${1%.com}" ~/bin/.rng); fi cp="$(pwd)" if [[ -n $out ]] then echo "$(date -u +'%d/%m - %H:%M') | $out | $NMFL | ${cp#/work/gd2613/jobs/}" >> $WLOG echo "$(date -u +'%d/%m - %H:%M') | $out | $NMFL | ${cp#/work/gd2613/jobs/}" >> $WLOGf fi echo $out echo -e "${YELLOW}\n $(date -u +'%H:%M') - Work sent \n${NC}" else echo -e "${YELLOW}\n Work aborted \n${NC}" fi shift done
write and quit
:wq
and then set the file as executable file
chmod +x gf
I've reduced the function name to gf, more similar to gv and shorter. A complete guide to the function is printed writing gf or gf ? or gf help in the terminal. The old rng file are no more need.
I also suggest you to add this other 3 alias if you want. I found really useful when modding your .bashrc as convulsively as I do.
alias bhrc="cd ~; cp .bashrc .bashrc.old;vi .bashrc; loadbh; cd ~-" alias bhrc.old="cd ~;cp .bashrc.old .bashrc; loadbh; cd ~-" alias loadbh="source ~/.bashrc"
bhrc and gfunc have become my favorite commands quite soon!! And bhrc.old is my panic button, a perfect backup plan if you mess up something in the bashrc file.
Using The Script
To submit a job using the default parameters of the script simply type
gf jobname.com
where jobname.com is the name of the Gaussian input file. Note it is possible to use wildcards with this script, for example to simultaneously submit all input files in your current directory type
gf *.com
You can also submit jobs using different parameters (e.g to use a different number of processors) by using a .presets file. This file will be created in ~/bin the first time you run a job. An example of a .presets file is
## Here is where to list the preset for gaussian calculations ## Each line is a preset and it is written in this way : ## [CUE];[CORES];[MEMORY];[WALLTIME];[GAUSSIAN VERSION];[MAXDISK] ## e.g pqph;8;14400MB;119:59:00;d01;800GB ## ## ##----------------------------------------------------------------------------- ## ## presets starts from next line pqph;4;7500MB;119:59:00;d01;400GB pqph;16;64000MB;119:59:00;d01;400GB pqph;20;128000MB;119:59:00;d01;400GB
The uncommented lines contain adjustable parametes. For example the first uncommented line tell the script to submit to the pqph queue using 4 processors, 7500MB memory, 120 hours walltime, Gaussian version d01 and 400GB temporary space. To use one these presets type one of the following commands
gf -p0 jobname.com gf -p1 jobname.com gf -p2 jobname.com
Where jobname.com is the name of your input file. For the example .presets file; p0 uses the 4 core/7500MB parameters (the first uncommented line), p1 uses the 16 core/64000MB parameters (2nd uncommented line) and p2 uses the 20 core/128000MB parameters (3rd uncommented line).