Mod:Hunt Research Group/program

How to build a Fortran program (Learning notes by Ling)

Resources:

Fortran Course: http://personalpages.manchester.ac.uk/staff/david.d.apsley/lectures/fortran/INDEX.htm

Day 1

Exercise 1 - write a program to read in the coordinates and store them in memory, and print the coordinates stored in memory out.

The best way is to start with a program, learn the basic keywords, and improve little by little. The simplest thing is to start writing a program to read in the coordinates and store them in memory.

We will write a program in fortran90, which is much better than fortran 77 (e.g. it has dynamic memory allocation, i.e. it allows to occupy and release memory blocks during execution, depending on the program needs).

Step 1: Skeleton

Create a file called e.g. "coordinates.f90". The first statement in the program will be: program coordinates. The last line will be: end program coordinates. Then compile using: gfortran coordinates.f90. This is the skeleton of the program.

 PROGRAM	COORDINATES
   ! Program reading in the coordinates and storing them in memory                                                                                                                                                                                                                    
 END PROGRAM COORDINATES

Step 2: Preamble

a. define an array that will contain the atomic position. It will have two indices, one specifying the atom, the other the coordinates (x, y, z). So for instance the indices (1,1) will give the x coordinates of atom 1, the indices (2,3) the z coordinate of atom 2 and so on.

So first line in the main body of the code is

 implicit none ! This is for good programming style. It tells the compiler not to make assumptions on the type of the variable, i.e. we will have to tell the compiler, for each variable or array that we define, whether it is an integer, a real, a complex, and so on.

b. in order to define the array containing the coordinates, we need to know how many atoms we are going to read. Suppose we know that we are going to read 100 atoms. Then we define a variable, as integer, called NumAtoms, like this:

 integer, parameter :: NumAtoms

I have added the keyword "parameter": this means that the value of this "variable" is fixed once for all, and will never be modified during the program.

use gfortran to compile, I got:

 Error: PARAMETER at (1) is missing an initializer

The reason is that we defined NumAtoms as a parameter, so we have to give its value once and for all, like this:

 integer, parameter :: NumAtoms = 100

Now we can define the array that will contain the atomic coordinates. We can call it atoms. It will be an array of type real, because atomic positions are real numbers:

 real, dimension(NumAtoms,3) :: atoms

Assuming that we will be reading from a file containing lines like (atom x y z), we also need to define an array containing the atomic symbols for each atom read, let us call it symbols. This will be of type "character", like this:

 character*2, dimension(NumAtoms) :: symbols

"*2" means that the atomic symbol can be as long as 2 character. We do not need the second dimension for this array, because, for instance symbol(45) will give us the atomic symbol of atom 45.

c. define an integer variable for the loop on the atoms (will be clear later). We call it "i", like this:

integer :: i

So this is the end of the preamble, where the type and dimensions of the variable/arrays are defined. This sequence of instruction has always to go before any other instruction in the code. If you need to define a new variable later on, you have to put it in this section.

Step 3: start reading the coordinates frm a file.

a. define a loop over the lines in the file, assuming that each line contains (symbol, x, y, z), and that there are NumAtoms lines in the file:

 do i = 1, NumAtoms

b. read each line in turn and store symbols and position in the array symbols and atoms respectively, like this:

 read(5,*) symbols(i), atoms(i,:)

5 means that the code expects to read from "standard input". The * means that the code is expecting a formatted file (i.e. not a binary file) with no particular format (it could be character, numbers, etc.). This means that you can either type in the coordinates, or redirect from a file, like this:

./a.out < coordinate_file

(Last two characters should be a colon and a close parenthesis - It means that for each atom, the code will read all the values allowed by the second dimension in the array, in this case from 1 to 3, i.e. x, y, z.)

c. print the coordinates stored in memory out:

 print *, symbols(i), atoms(i,:)

or do pretty printing with "write", for instance like this:

 write(6,'(a,3f20.6)') symbols(i), atoms(i,:)

6 means that it is going to print to standard output that you can redirect like this:

./a.out < inputfile > outputfile

3 means that the output format will contain three times the format "f20.6". "f20.6" means that a real number is printed of length 20 with 6 decimal digits.

If you want you can also specify a name of your choice for the output file, in this way:

 open(unit=7,file="some_name")

Then write to it like this,

 write(7,'(a,3f20.6)') symbols(i), atoms(i,:)

And when your are finished, close it, like this:

 close(7)

Then you would execute the program like this:

./a.out < inputfile

and the result would be in a new file called "some_name". The same trick can also be applied to the read statement.

Note: (1) need put open and close outside the loop. The file has to be open, then NumAtoms line are written to it during the loop. When the loop is finished, the file has to be closed. If we put open and close inside the loop, it would restart from the first line at every new value of i, and the output would only contain one line. (2) have to change write(6... with write(7... , because the file unit that you are using now is 7. unit=6 is only used for the standard output.

 unit 5 - the code expects to read from "standard input". The * means that the code is expecting a formatted file (i.e. not a binary file) with no particular format (it could be character, numbers, etc.). This means that you can either type in the coordinates, or redirect from a file, like this: ./a.out < coordinate_file
 unit 6 - print to standard output that you can redirect like this: ./a.out < inputfile > outputfile
 unit 7 - also specify a name of your choice for the output file, like this:./a.out < coordinate_file

You can have more that one output file open at a given time. Each one will be identified by a unit number (7, 8, ...). And you can decide which file to write to at a given time using write(7.... or write(8..., etc.

d. close the loop:

 end do

e. gfortran coordinates.f90

./a.out < 63waterCl.xyz > print.out

Code:

 PROGRAM COORDINATES
   implicit none
   integer, parameter :: NumAtoms=100
   real, dimension(NumAtoms,3) :: atoms
   character*2, dimension(NumAtoms) :: symbols
   integer :: i
   open(unit=7,file="some_name")
   do i = 1, NumAtoms
    read(5,*) symbols(i), atoms(i,:)
    write(7,'(a,3f20.6)') symbols(i), atoms(i,:)
   end do
   close(7) 
 END PROGRAM COORDINATES

Exercise 2 - compute centre of mass of molecules

Step 1: specify the masses for each kind of atoms.

a. How many atomic species do you have?

5: C N H B F

we only need to specify the masses for 5 atoms, which will be used for computing the centre of mass. These can be defined as parameters at the beginning of the code, for instance:

 real, parameter :: mH = 1.008

Exercise 3 - MSD program

The basic formula that we want to code is (11) at this page http://www.fisica.uniud.it/~ercolessi/md/md/node36.html

MSD = <|r(t)-r(0)|²>

where <...> denotes here averaging over all the atoms (or all the atoms in a given subclass). Care must be taken to avoid considering the "jumps" of particles to refold them into the box when using periodic boundary conditions as contributing to diffusion. (We will take periodic boundary conditions into account when computing the centres of mass, so we will avoid the jumps. - is this right?)

Now, in that formula the vectors r(t) and r(0) will be the positions of the centre of mass of bmim for instance at time t and at time t=0. Or they could be the centres mass of the anions, or the centres of the rings, etc.

Step 1: read the coordinates at time 0, then at time t1, t2, and so on, i.e. to read all steps in the trajectory (HISTORY).

The trajectory has all 216 bmim first, followed by 216 NO3. We can start by reading only a subset of coordinates (e.g. for the first 10 or 50 steps), and test the code on those steps. Then extend later to the full trajectory.

Now first problem is that the HISTORY file contains, for each step, first a series of lines that we should ignore (i.e. read them and then forget about them). Then, for each atom, a line with atomic symbol, a number, atomic mass, and another (not useful) number (charge, not need it for the RDFs and MSD, but it could be useful fur future analysis). And then, in the following line, the x, y, z.

a. modify the program in Exercise 1, so as to read in the coordinates from HISTORY for more than one iterations.

There are a few things that we should take into account. For instance, the HISTORY file starts with a blank line. We should remember this.

So, if we start from the old coordinates.f90, we can first of all add a dummy reading for the first blank line, like this:

 read(5,*) ! Read initial blank line

This will be outside all other loops, because it only occurs once.

Then we will write a big external loop on the number od steps. Inside this loop, there will be a loop on the number of molecules, then loop on the number of cations or anions in the molecule, similar to the old one. So we can rename the indices, to avoid confusion, and use the loops:

 do step = 1, NumSteps
   do molecule = 1, NumMolecules
     do atomCation = 1, NumAtomsCation
     end do
     do atomAnion = 1, NumAtomsAnion
     end do
   end do
 end do

step, molecule, atomCation, atomAnion have to be defined as integers. NumSteps NumMolecules, NumAtomsCation and NumAtomsAnions are parameters.

Note we close the loop on atomCation before starting the new loop on AtomAnion.

So now, inside the fist loop, we have to read some (5) lines that we do not actually need.

For starting, you can put NumSteps to 1, and try to read only the first step.

This time we will not need to store the atomic symbols, so we do not need these arrays. We can however keep a dummy character variable "symbol" for when we read, like this:

 character*2 :: symbol

to do: continue with reading the initial lines for each iterations. It might be slightly tricky to do it right.

Day 2

Exercise 1: a program to convert CPMD TRAJECTORY file into .xyz so that it could be visualised by VMD.

Day 3

Exercise 1: Putting molecules together

Purpose: in DL-POLY, we cannot just change the cell size (to be a larger box) and leave all the atom coordinates the same. When the atoms are outside the box, they can be translated inside the box. The real problem is when there is a bond on the cell boundary. If we translate the atom of the bond from outside to inside the cell, and then we change the cell size, it is as if we were stretching or shrinking the bond. This obviously gives a lot of problems with any constraints imposed on the structure, or on the stability of the structure itself. It is not just a question of putting all the molecules inside the cell, but also to make sure that bond lengths, etc. are conserved when we modify the cell size.

what i wanted to do is e.g. the simulation started and finished at a volume using NVT wiht cell size a. and the ending pressure was large. now i want to make the pressure smaller so increase the cell size and restart a new cal (with NVT or NPT) with a new cell size b. b > a. But this CONFIG file would cause some problems. Because even if all the atoms are in the box, the molecules are not "assembled". For instance, you could have an atom belonging to a molecule near a cell boundary, just inside the cell, and another just outside. If all the atoms are put back to the cell, the second atom will end up being very far from the first, near the opposite boundary of the cell. But this is no problem, because PBCs will take into account that atom 1 and atom 2 are in fact very close. However, if we decide to modify the cell size, PBCs will be applied with the new value of the cell, so they will no longer give atom1 and atom2 at the correct distance in the reconstructed molecule. So when you try to impose a constraint on that bond, the constraint algorithm will fail.

Ruth's code: I have no idea what the variable written out are, and whether this piece of code is supposed to work as part of DLPOLY, or as a standalone code. It seems to me though that the loop starting "do iatm=2,numsit(imoltyp)" may be doing what you want. xxx, yyy and zzz are vectors of coordinates. The loop reconstruct molecules. This is for those molecules whose bonds cross the cell boundaries. The routine rlpairimages evidently does this.

Day 4

Excise 1: a script that checks VDW in FF (DL-POLY) automatically. It can also be modified to compute new parameters using the combination rules, in case some values in the sigma/epsilon given in the paper were incorrect.