Mod:Hunt Research Group/hpc


Running jobs on the Imperial HPC

Back to the main wiki-page

The aim of this wiki is to get new users set up on the Imperial HPC. It covers:

  • An introduction to the Imperial HPC
  • Logging in to the HPC
  • Setting up your HPC environment (.bashrc)
  • Job submission
  • Managing your jobs

Before going through this wiki make sure that you:

  • Have an HPC account
  • Are on the HPC Gaussian users list

Introduction

HPC systems are usually composed of a cluster of nodes (computers). Just like your laptop/desktop, each node has CPUs (cores/processors), disk space and memory (RAM). The Imperial HPC has several clusters: CX1 (general), CX2 (high-end parallel jobs) and AX4 (big data). You will be using CX1 for your work.

Upon logging in you will find yourself on one of the login nodes. These nodes act as a gateway to the actual compute nodes (where your jobs will be run) and are good for file transfers, small job testing and setting up software. Don't use the login nodes to run your jobs (it slows them down for everyone!).

From the login node, you will submit your jobs to the compute nodes on CX1. Job submission is handled by a scheduler. Imperial uses PBS as its scheduler, but others exist and all operate in a similar way with similar syntax. The scheduler's job is to run your job (non-interactively) on an appropriate compute node so that the available resources are used efficiently. When you submit a job you tell the scheduler which queue to send it to and the resources you want (number of processors, memory and walltime); the scheduler will do the rest.

Logging on

To log in to the Imperial HPC from a Linux/Unix (Mac) system, a secure shell client (ssh) can be used:

  1. Open a terminal window
  2. Type the line:

ssh -XY username@login.cx1.hpc.imperial.ac.uk
Replacing username with your own username, e.g. th194

  3. If this is your first time logging in you will be asked to accept the host; type yes to do so.
  4. Enter your password when prompted

You are now on one of the HPC login nodes!

Notes:

  • The -XY flags in the ssh command enable X11 forwarding, which lets graphical programs (such as GaussView) display on your local screen

Setting up your .bashrc

Similarly to setting up your local Mac environment, you can use a .bashrc to set variables for your HPC bash environment. Below are the steps to do so and an example .bashrc with some useful aliases in it. As you progress you can edit your .bashrc (always remember to source it to activate any updates), and if you think of any particularly useful lines then let the group know!

  1. Using your favourite text editor, create the file .bashrc in your home directory
  2. Copy and paste the script below into the new file

  3. Initialise your .bashrc by executing the command:
    source ~/.bashrc

Initial .bashrc to copy:

#!/bin/sh

# Change the prompt
   export PS1="[\$USER@\h]\$PWD \$ "

# Bash history commands
   HISTSIZE=100
   PATH=$PATH:~/bin
   EDITOR=vi
   export EDITOR

# alias definitions
   alias force="grep -i 'Maximum Force'"
   alias dist="grep -i 'Maximum Disp'"
   alias energy="grep -i 'SCF Done:'"

   alias q='qstat'
   alias qs="qstat -q pqph"
   alias qq="qstat -q"
   alias gv="module load gaussian gaussview; gaussview"

The script mainly contains alias definitions. The most important ones for the HPC are probably the gv alias, which allows easy loading of GaussView, and the qstat aliases. Once you are more familiar with bash and your HPC use, feel free to edit your .bashrc to suit you.

You should now see that your command line prompt has changed (if it hasn't, the above hasn't worked). The prompt shows that you are logged on to one of the login nodes on the HPC (`user@login-#-internal`, where # is the number of the login node).

Job Submission

To introduce you to the HPC we are going to run a test Gaussian calculation. To do this we need the necessary input files for the calculation (.com and .chk files) and a way to submit our Gaussian job to CX1 to run. Remember, the PBS scheduler manages running the job on the compute nodes. To run our job successfully, PBS needs to know the resources and programs that our job requires. We put this information in a runscript (or jobscript), which is essentially a set of instructions on how to run the job.

Therefore, to run our job we need:

  • Input files (.com/.chk)
  • A runscript to tell PBS how to run our job

Input Files

  1. In your home directory, set up a folder for the test job and cd into this folder.
    If you already have a job you want to run on the HPC then we will use that file. This file is likely to be located somewhere on your local machine, in which case:
  2. Open a new terminal window and cd to the directory where your .com file is located
  3. Copy the file to your new directory on the HPC, which can be done with the command:
    scp test.com username@login.hpc.ic.ac.uk:/rds/general/user/username/home
    This command is a secure copy and should be familiar from the unix cp command. Make sure you edit the destination to be the directory for your test job, put your shortcode instead of username and change the name of the file from test.com if it is different.
  4. Enter your password at the prompt
  5. If the copy was successful, your test .com file should now be located in your directory on the HPC.
  6. If the job requires a .chk file then repeat the process for this file
  7. A file created on your Mac will not run on the HPC as it stands; it needs some additional information.
  8. You need to add a %mem= line for how much memory is required and a %nprocshared= line for how many processors are required. The following is the first part of a test.com file set up for the HPC:
%chk=test.chk
%nprocshared=12
%mem=45000MB
# hf/3-21g geom=connectivity

Title Card Required

0 1
 C

Checking the .com file

We now need our input files within the directory alongside a runscript (described in the next section). The resources requested in the runscript must match those entered at the top of your .com file.

  1. Open your .com file
  2. %chk should give the name of the checkpoint file, which must have the same base name as the .com file
  3. %nprocshared is the number of processors requested and must match the number requested in the runscript. Edit it to 12.
  4. %mem should be slightly less than the memory requested in the runscript. Edit it to 46000MB. (A quick way to check these lines is shown after this list.)
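A quick way to check these Link 0 lines without opening the file is to print the lines beginning with % (assuming your input file is called test.com, as in the example above):

    grep '^%' test.com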

NB: There is an easier way to access your files on the HPC, which is to mount the HPC filesystem locally; this is almost like creating a tunnel between the two machines so that they can see each other directly. See the bottom of the page for information on how to do this later.

Runscript

As mentioned before the runscript contains all of the instructions to successfully run our job. The runscript usually contains PBS directives, which tell PBS the resources our job needs, and then a list of commands executing the job.

Modules

To run our job we will be using Gaussian. Firstly, check that you are registered as a user in the Gaussian group: if you are not, the job will fail to run as you will not be able to execute Gaussian. If you are not on the list then email Tricia to get added.

Gaussian and other programmes, such as GaussView, are available on the HPC as modules. To use a module you have to load it first; an example of a module load command was in the .bashrc file earlier (the gv alias).

Useful commands:

module avail: Lists the modules available on the HPC. The names of the modules are usually the programme name and the version (e.g. gaussian/g09-d01)
module load: Loads a required module; only once loaded can the program be used (e.g. module load gaussian/g09-d01 loads Gaussian into your local environment)

We will load Gaussian for use in our runscript.
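It is worth trying the module commands interactively on a login node first; a minimal example, assuming the gaussian/g09-d01 module appears in the module avail listing:

  module avail gaussian        # list the available gaussian modules
  module load gaussian/g09-d01 # load a specific version
  module list                  # confirm which modules are currently loaded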

Computational Resources

To tell PBS the resources our job needs we use special PBS directives. These are lines in the script which start with #PBS. Resource requests are denoted by the flag -l and then the resource itself. These can be:

walltime=[hhh:mm:ss]: The amount of real time the job requires to run (there is usually a limit to the walltime available, and this changes for each queue).
select=[integer]: The number of nodes our job needs to run on.
ncpus=[integer]: The number of processors on each node (ncpus is the name used in the select statement, as in the runscript below).
mem=[integer|GB/MB]: The amount of memory required.
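Put together, the resource requests sit at the top of the runscript as #PBS lines. A minimal sketch with example values (the select line matches the test runscript later on; pick a walltime appropriate to your queue):

#PBS -l walltime=24:00:00
#PBS -l select=1:ncpus=12:mem=47000MB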

Queues

We also need to tell PBS where to submit our job, this is the queue. The PBS directive to set the queue is:

-q [queue name]

There are several queues which you may have access to. A queue will have a set amount of resources assigned to it and different limits (e.g. on walltime). The number of people using a queue determines how busy it is and therefore how long your job may wait before it runs. Specifying resources efficiently will help jobs run sooner on the queue. Queues include:

  • pqph (various, see below): this is the Hunt group queue and runs on the nodes listed below.
      Each user can have a maximum of 12 running jobs.
      To help balance usage please have a maximum of 20 jobs running or queued.
  • pqchem (42 nodes): this is the chemistry department queue.

Script

The script below is an example runscript.

  • The script starts with "#!/bin/sh"; without this the job will always go to the queue "short" instead of the queue asked for.
  • The next part of the script is the PBS directives discussed above, which set the resources and variables needed.
  • The module for Gaussian is then loaded.
  • The script then checks whether a .chk file exists and, if so, copies it over to the temporary working directory on the compute node.
  • The final section executes when the job has completed and copies the output files (e.g. the checkpoint file) back to the directory the job was submitted from.

The script needs to be placed in the directory you are running the job from:

  1. Open a new file 'rs12' and copy the below into it:
#!/bin/sh

# submit jobs to the queue with this script using the following command:
# qsub -N jobname -v in=name rs12
# rs12 is this script
# jobname is a name you will see in the qstat command
# name is the input file name minus the .com extension; it is passed into this script as ${in}

# batch processing commands
#PBS -l walltime=119:59:00
#PBS -l select=1:ncpus=12:mem=47000MB
#PBS -j oe
#PBS -q pqph

# load modules
#
  module load gaussian/g09-d01

# check for a checkpoint file
#
# variable PBS_O_WORKDIR=directory from which the job was submitted.
   test -r $PBS_O_WORKDIR/${in}.chk
   if [ $? -eq 0 ]
   then
     echo "located $PBS_O_WORKDIR/${in}.chk"
     cp $PBS_O_WORKDIR/${in}.chk $TMPDIR/.
   else
     echo "no checkpoint file $PBS_O_WORKDIR/${in}.chk"
   fi   
#
# run gaussian
#
  g09 $PBS_O_WORKDIR/${in}.com
#
# job has ended copy back the checkpoint file
# check to see if there are other external files like .wfn or .mos and copy these as well
#
  cp $TMPDIR/${in}.chk $PBS_O_WORKDIR/.
# exit
  • Everyone should START by using this script, and not the automated submission script (see later)

We now have compatible input files and a runscript. We are ready to submit our job!

Submitting the Job

The instructions to submit a job are the same as those at the top of the runscript. We run the command: qsub -N jobname -v in=name rs12

  • qsub is the PBS command to submit the job.
  • jobname is the name you will see for your job in the qstat command
  • name is the actual file minus .com etc it is passed into this script as ${in}
  • rs12 is the name of the runscript

the run script must be in the same directory as your job!

  1. Run the command with the appropriate substitutions

If successful, a job number (XXXXXXX.cx1) should be printed to the terminal; this is your jobID, which PBS assigns to each submitted job.
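As a worked example, if the input file copied over earlier is test.com and the runscript is rs12, the submission might look like this (test_job is just a label you choose so you can recognise the job in qstat):

    qsub -N test_job -v in=test rs12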

Monitoring your Job

Now that your job has been submitted you can monitor it using the command qstat, which gives you the status of your jobs in the queues. Useful commands are:

qstat to list your running and queued jobs
qstat -q to get a list of all queues
qstat -f to get a full printout of the information for all your queued jobs

To delete a job from the queue you can use the command:

qdel [jobID] to remove a job from the queue
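For example, to remove a job with the (hypothetical) jobID 1234567.cx1 returned at submission:

    qdel 1234567.cx1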

In your .bashrc there were aliases set for some of these options; typing 'q' in the terminal should produce the same result as 'qstat'. The status of your job in the queue will be shown as either Q (waiting to run) or R (running), along with the run time so far.

Keep checking your job until it has run. If successful, the .log file should be copied back to your working directory; check this to see whether your job completed correctly. You will also find a file with the extension .o[jobID]; this contains the merged output and error files for your job. If there has been an error it will be detailed in this file, along with the resources requested and used by your job.

ONLY once you have used the queuing script for some time

  • use the gf script created by Giacommo, which makes it easier to submit jobs to the HPC.
  • The link below contains the script and instructions for using it.
https://wiki.ch.ic.ac.uk/wiki/index.php?title=Mod:Hunt_Research_Group/pimpQSUB

Memory needed to run

  • Gaussian is greedy and will exceed the allocated memory.
  • Each processor needs a Gaussian executable, which takes about 8 MW (or 12 MW for MP2 frequencies).
      MW is a megaword, the unit in which Gaussian allocates memory; 1 MW is about 8.4 MB.
      So each processor needs 8*8.4, approximately 68 MB, just to run:
      12 proc jobs require 12*68 = 816 MB just to run
      16 proc jobs require 16*68 = 1088 MB just to run
      20 proc jobs require 20*68 = 1360 MB just to run
      24 proc jobs require 24*68 = 1632 MB just to run
      40 proc jobs require 40*68 = 2720 MB just to run
      48 proc jobs require 48*68 = 3264 MB just to run
      So when allocating memory inside the Gaussian job you must reduce the memory by at least this amount.
  • It is therefore best to reduce the memory by about 100 MB * (number of processors) inside the Gaussian input.
  • You also need some overhead within the PBS script.
  • Memory can be given in binary units: 251 GB (binary) is really 251,000 * 1,048,576 bytes, which is about 264 GB (decimal).
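A tiny sketch to turn the rule of thumb above into a %mem value for a given PBS memory request (the numbers here are just illustrative):

# suggest a Gaussian %mem value from the PBS memory request (in MB) and the processor count
pbs_mem_mb=47000
nproc=12
echo "%mem=$(( pbs_mem_mb - 100 * nproc ))MB"   # prints %mem=45800MB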

pqph Resources

Current pqph resources:

  • You can check the current queue resources and status here: pqph queue status
  • Currently, pqph consists mainly of 40 proc/124GB nodes and a couple of 48 proc/256GB nodes.


Recommended specifications

For running Gaussian jobs on pqph it is recommended to use just two job sizings. These mean that either a full node is used or just half of a node, allowing a second job to be run on the other half. The sizings are only applicable to Gaussian jobs, which can't be run across nodes; you may want to use multiple nodes or alternative job sizings for codes which are parallelised across nodes.


Small/medium jobs:

  • Run jobs using half of a 40 processor node and half of the memory allowance (64GB).
  • PBS script input:
#PBS -l walltime=72:00:00
#PBS -l select=1:ncpus=20:mem=64000MB
  • Gaussian .com file input:
%nprocshared=20
%mem=60000MB


Medium/large jobs:

  • Run jobs using a full 40 processor node and the full memory allowance (128GB).
  • PBS script input:
#PBS -l walltime=72:00:00
#PBS -l select=1:ncpus=40:mem=128000MB
  • Gaussian .com file input:
%nprocshared=40
%mem=122000MB


If you need to use the larger (48 proc) nodes for more expensive calculations:

  • Run jobs using a full 48 processor node and the full memory allowance (256GB).
  • PBS script input:
#PBS -l walltime=72:00:00
#PBS -l select=1:ncpus=48:mem=256000MB
  • Gaussian .com file input:
%nprocshared=48
%mem=256000MB


  • Add tmpspace=400 only for large disk jobs, to ensure you are put on a node with enough disk!
  • Note that this requires you to include maxdisk=400gb in your Gaussian input.

NOTE: the queuing system does not check disk allocations. When requesting large disk jobs remember to request all of the processors on a node, even if you are not using all of them. For large jobs the maximum disk space you can request is 800GB on the 12 processor nodes.
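As a rough sketch of where these two settings go (the exact tmpspace syntax and units are an assumption here, so check the RCS documentation; maxdisk is a standard Gaussian route-section keyword, and the method/basis shown is just a placeholder):

#PBS -l select=1:ncpus=40:mem=128000MB:tmpspace=400gb

and in the Gaussian .com file route line:

# b3lyp/6-31g(d) opt maxdisk=400GB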

More details if you seem to be having memory or disk issues

  • Normal jobs
      need 2*N^2 W of memory; multiply by 8.4 to convert MW to MB (1,048,576 B = 1 MB)
      so 300 basis functions will need 180000 W = 0.18 MW = 1.5 MB in addition to the above requirements
      require 2*O*N^2 W of disk to run, where O = number of occupied orbitals and N = number of basis functions
  • MP2 jobs
      work best with %mem and maxdisk defined
      in-core requires N^4/4 divided by 1,000,000 MW of memory
      so 400 basis functions will need 6400 MW = 53760 MB = 54 GB of memory per node, which is unlikely!
      semi-direct requires 2*O*N^2 W of memory and N^3 W of disk
      so N = 476 basis functions and O = 56 occupied orbitals will need
      25.4 MW = 214 MB of memory
      and 108 MW = 906 MB of disk (this is not actually true, it will need much more, probably around 1800 MB of disk per processor!)
      so the total memory for an 8 proc MP2 frequency job will be
      12*8*8.4 = 807 MB to run and 8*214 = 1712 MB for the calculation plus some extra 400 MB = 3019 MB = 3.3 GB
      Gaussian does not like the GB directive, so give %mem in MB
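The rules of thumb above can be wrapped into a tiny helper for your own system; this is only a sketch using the 2*O*N^2 (memory) and N^3 (disk) word counts and 1 MW ≈ 8.4 MB from above:

# estimate semi-direct MP2 memory and disk from basis functions (N) and occupied orbitals (O)
mp2est () {
  awk -v N="$1" -v O="$2" 'BEGIN {
    mem_mw  = 2*O*N*N / 1e6      # memory in MW (2*O*N^2 words)
    disk_mw = N*N*N   / 1e6      # disk in MW (N^3 words)
    printf "memory: %.1f MW (~%.0f MB)\n", mem_mw,  mem_mw*8.4
    printf "disk:   %.1f MW (~%.0f MB)\n", disk_mw, disk_mw*8.4
  }'
}
mp2est 476 56    # reproduces the worked example above (~25 MW memory, ~108 MW disk)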

Checkpoint and other files

The checkpoint file should have exactly the same name as the input file.
For jobs that may exceed the walltime, specify the full path of the checkpoint file, for example:
%chk=/work/phunt/tmp/filename.chk
This means the checkpoint file will be written into your personal work directory; it may slow the job down.
This is also the reason /work is sometimes very slow on CX1, so only do this as an exception!


Resources

The Imperial Research Computing Service has an HPC wiki with useful information, including an introduction to shell scripting, modules and job management:

https://wiki.imperial.ac.uk/display/HPC/High+Performance+Computing

The RCS also runs several courses throughout the year, including introductions to Linux, HPC, Python and more advanced topics. Upcoming courses can be viewed at:

https://www.imperial.ac.uk/admin-services/ict/self-service/research-support/rcs/training/

Next steps

  • Mount
  • Alias shortcut for logging in
  • Keypair page
  • Once you are comfortable with and understand the job submission process, then the automatic job script which ... can be used

Other information (may be out of date)

3.1 CPMD:
https://www.ch.ic.ac.uk/wiki/index.php/Image:Runcpmd_md.sh
3.2 DL-POLY:
https://www.ch.ic.ac.uk/wiki/index.php/Image:Mpirun.sh
Note: You'll not be able to see the output until the job finishes: the directory /tmp/pb.XXX isn't accessible to you because it is on the private disk of the node running the job.
To get DL-POLY to terminate before the job hits the walltime limit and is killed, you need to run it through a program called pbsexec, for example:
pbsexec mpiexec DLPOLY.X
This will kill DLPOLY 15 minutes before the walltime limit, giving your script time to transfer files back to $work.