Resgrp:comp-photo-version control new gdv

From ChemWiki

Introduction

This page will explain how to import a new version of gdv into the version control system. For general help on the version control system look here.

How to import a new version

I'll assume you've got a tarball from Gaussian, in their usual format.

First, get an up-to-date copy of the gaussian-inc-versions repository. Either use hg clone to get a new copy or hg pull into an existing repository. Make sure it's updated to the latest version on the raw branch (using hg update raw, or hg update -C raw if necessary).

Next, untar the tarball somewhere and process it using the old-to-new.sh script in the repository.

$ ls
gdv  new-to-old.sh  old-to-new.sh  pillage.sh
$ mkdir ../h29
$ cd ../h29
$ tar zxvf /work/scliffor/gaussian/gdvh29p.tgz
$ ls
gdv

(As ever, when using old-to-new.sh, you'll need to make sure you have gau-fsplit in your path. I usually just copy the binary from the latest compiled version to my ~/bin directory.)

Now convert the code to our new format.

$ ../gaussian-inc-versions/old-to-new.sh new-style
doing gdv/allxc.inc
doing gdv/amber98.prm
doing gdv/amber.prm
doing gdv/archlib.F
doing gdv/arctmp.F
doing gdv/basis
doing gdv/blas-generic.F
doing gdv/bsd
...
doing gdv/view
doing gdv/vuind.inc
doing gdv/wrappers.F
doing gdv/wrmat.F
doing gdv/xcind.inc
$ ls new-style/gdv/
allxc.inc       commonz.inc         ertgen.inc   l1002  l1112  l302  l405  l701  l904              mdarch      rwfcopy.F
amber98.prm     copychk.F           fffcom.inc   l1003  l112   l303  l502  l702  l905              mdl1        rwfdump.F
amber.prm       cphfutil            fhello.F     l1004  l113   l305  l503  l703  l906              mdutil      separ.inc
archlib         crdparams.inc       filecom.inc  l101   l114   l306  l504  l705  l908              mm2.prm     tests
arctmp.F        crendx.F            flddsc.inc   l1014  l115   l307  l506  l709  l909              mm.F        trajgen.F
basis           cubegen.F           fldloc.inc   l102   l116   l308  l508  l715  l913              newzmatF    uff.prm
blas-generic.F  cubman.F            formchk.F    l103   l117   l309  l509  l716  l914              ntran.inc   unfchk.F
bsd             Default.Route.save  freqchk.F    l105   l118   l310  l510  l717  l915              oplsaa.prm  utilam
c86dv.F         demofc.F            freqmem.F    l106   l120   l311  l511  l718  l916              osutil      util.hlp
chkchk.F        dftba.prm           gauopt.F     l107   l121   l314  l601  l801  l918              pluck.F     utilnz
chkmove.F       dftbpar.prm         gauoptl.F    l108   l122   l315  l602  l802  l920              prparm.inc  view
cktoig.F        dinautil            gautraj.F    l109   l123   l316  l604  l804  l921              putil       vuind.inc
commonb2.inc    doc                 gdrgen.F     l110   l124   l317  l607  l809  l922              putil.hlp   wrappers
commonb.inc     dreiding.prm        ghelp.F      l1101  l125   l318  l608  l810  l923              qppar.inc   wrmat.F
commonlab.inc   dummy.F             ghelp.hlp    l1102  l126   l319  l609  l811  l924              qpstat.inc  xcind.inc
commonlp2.inc   dummy-link.F        grate.F      l111   l127   l320  l610  l901  l930              rdmat.F
commonmol2.inc  dummy-links.F       ham506.F     l1110  l202   l401  l611  l902  l9999             reform.F
commonmol.inc   dummy-narch.F       l1           l1111  l301   l402  l612  l903  lapack-generic.F  repall.inc

It's worth having a look here to make sure nothing looks too odd. This looks OK, so we press on. Change back to the gaussian-inc-versions directory, delete the gdv directory, and replace it with the gdv directory you just created.

$ cd ../gaussian-inc-versions
$ rm -fr gdv
$ cp -pr ../h29/new-style/gdv/ .

Now we have the new code, in our format, in the repository. Our job now is to prepare a commit that will correctly represent the changes that have been made by Gaussian.

The first task is to check that our ignore filter isn't hiding any changes we want to include. We can do this with hg stat -i. If you do this in this case, you'll see that the only things being ignored are the log files of test jobs, which we don't keep in the repository. Sometimes you'll see that Gaussian have added a library file (a .a file). If this happens, add that particular file now using hg add.

Next we check the output of hg stat -u, which shows unknown (new) files. Firstly, we're checking here to see if there are any files that aren't being ignored but should be. Examples include executables or compiled routines for which we have the source, Gaussian outputs such as .log or .chk files, temporary files (such as Emacs filename~ files), and so on. Anything that can be recreated from other things in the gdv directory is probably ignorable. Add suitable lines to the .hgignore file as appropriate, or just delete the file if you feel an ignore entry is not needed.

In the output of hg stat -u in this case you'll notice that the test000.com files have all been renamed to test0000.com files. Mercurial initially reports this as a deletion (of test000.com) and an addition (of test0000.com). We can get Mercurial to recognise these as renames[1] instead with the following command (the -n flag makes addremove do a dry run). We'll do the tests subdirectory separately, as it's probably going to be fairly straightforward.

$ hg addremove -s 50 -n gdv/tests | less

The -s N option instructs Mercurial to compare files and report renames where the contents are at least N% similar. If you set N too low you'll see many false positives; too high and you may miss renames. Note that the percentage is based on the number of lines changed, so changes of even a few characters to a small file can produce quite low similarity percentages. If you leave -s off it will still report those that are 100% the same. In this case you'll see (after a while) a huge list of lines like:

recording removal of gdv/tests/com/test370.com as rename to gdv/tests/com/test0370.com (100% similar)
...
recording removal of gdv/tests/test983d-m-ref.cube as rename to gdv/tests/test0983d-m-ref.cube (83% similar)

Now you have to engage your knowledge of Gaussian! In almost every instance here in the tests directory it's clear that these are genuine renames, with some minor modifications. If in doubt, suspend the addremove command and do something like:

$ hg cat gdv/tests/test983d-m-ref.cube | diff - gdv/tests/test0983d-m-ref.cube | less

which will show the nature of the changes. Here you'll see that for reasons that probably only make sense to someone not using version control, Gaussian have very slightly altered some of the values in this .cube file.

You can play with the -s N parameter until you get a list that contains only correct renames, with no false positives. (Any false positives that slip through can be undone before the commit, and false negatives can be added in later.) When you're happy, run the addremove command without the -n flag.
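When the dry-run list is long, the 100% matches are rarely the interesting ones. The following is a minimal sketch of a filter that keeps only the partial matches and sorts them with the least similar first. The function name list_partial_renames is hypothetical (it is not part of Mercurial or of the repository scripts), and the parsing assumes the "recording removal of ... as rename to ... (N% similar)" line format shown above.

```shell
# Hypothetical helper: read `hg addremove -n -s N` output on stdin and
# print only the renames that are less than 100% similar, lowest first.
# Assumes the "recording removal of A as rename to B (N% similar)" format.
list_partial_renames() {
    grep 'as rename to' \
        | grep -v '(100% similar)' \
        | sed -E 's/^recording removal of (.*) as rename to (.*) \(([0-9]+)% similar\)$/\3 \1 -> \2/' \
        | sort -n
}
```

It could be used as hg addremove -s 50 -n gdv/tests | list_partial_renames | less, so that the most doubtful renames appear at the top.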

I decided to go with -s 50. Once the command finishes (and it can take quite a while, particularly for low values of N) you can check what's been done with hg stat -a -C, which shows files that have been added, and additionally, whether they are marked as copies. If they are copies they will appear like this:

$ hg stat -a -C
A gdv/tests/check-of
A gdv/tests/ckelap
A gdv/tests/com/test0000.com
  gdv/tests/com/test000.com
A gdv/tests/com/test0001.com
  gdv/tests/com/test001.com
...
A gdv/tests/com/test1036.com
  gdv/tests/com/test872.com
A gdv/tests/com/test1037.com
  gdv/tests/com/test872.com
A gdv/tests/com/test1038.com
...

which shows gdv/tests/com/test0000.com has been marked as a copy of gdv/tests/com/test000.com, etc. Note that gdv/tests/com/test1036.com and gdv/tests/com/test1037.com have been marked as copies of gdv/tests/com/test872.com, which has already been marked as the origin of gdv/tests/com/test0872.com. In certain circumstances this might be legitimate, with one file being renamed and copied (or just copied) but in this case, using hg diff I decide that however it's arisen, I don't want to record it. I remove the adds and then redo them, without marking them as renames:

$ hg revert gdv/tests/com/test1036.com gdv/tests/com/test1037.com
$ hg stat gdv/tests/com/test1036.com gdv/tests/com/test1037.com
? gdv/tests/com/test1036.com
? gdv/tests/com/test1037.com
$ hg add gdv/tests/com/test1036.com gdv/tests/com/test1037.com
$ hg stat -C gdv/tests/com/test1036.com gdv/tests/com/test1037.com
A gdv/tests/com/test1036.com
A gdv/tests/com/test1037.com
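Cases like test872.com, where one file is recorded as the origin of several added files, are easy to miss in a long listing. As a rough sketch, the following filter picks them out of the hg stat -a -C output. The name find_shared_sources is hypothetical, and the parsing assumes the layout shown above: added files start with "A " and the copy source follows on a line indented by two spaces.

```shell
# Hypothetical helper: read `hg stat -a -C` output on stdin and print each
# copy source that is recorded against more than one added file.
# Copy sources are the lines indented by two spaces.
find_shared_sources() {
    awk '/^  / { count[substr($0, 3)]++ }
         END   { for (src in count) if (count[src] > 1) print src }' \
        | sort
}
```

For example, hg stat -a -C | find_shared_sources prints one line per shared source; each one then needs the same hg diff inspection as described above before you decide whether to keep it.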

We now apply this same methodology to the remaining unknown files. We are trying to get to a situation where the output of hg stat -du is empty. The -u flag shows files that have appeared in the gdv directory but are unknown to Mercurial; these are files added or renamed by Gaussian. The -d flag shows files that are known to Mercurial but are no longer in the gdv directory.

If you run hg addremove on the gdv directory it will try to re-add things in tests, so be careful. There's no harm in re-adding files you've already added, but you don't want to re-record renames you've manually removed from the list (for example, hg addremove -s 50 gdv will mark gdv/tests/com/test1036.com as a copy of gdv/tests/com/test872.com again). Avoid this by using the -X flag to addremove:

$ hg addremove -s 40 -n -X 'glob:gdv/tests/**' gdv

If I use the above command I get a long list of files being removed and added and then, at the end, a list of files that are being recorded as renames. It's worth checking the list of additions and deletions, particularly scripts in the bsd directory. In this case there are some scripts, but they all check out OK. The list of renames is:

...
recording removal of gdv/l718/quairc.F as rename to gdv/dinautil/quairc.F (100% similar)
recording removal of gdv/l718/quarot.F as rename to gdv/dinautil/quarot.F (100% similar)
recording removal of gdv/l718/rotirc.F as rename to gdv/dinautil/rotirc.F (100% similar)
recording removal of gdv/l120/getlfo.F as rename to gdv/utilam/getlfo.F (100% similar)
recording removal of gdv/mdutil/lshift1.F as rename to gdv/mdutil/lshft1.F (73% similar)
recording removal of gdv/l718/difrot.F as rename to gdv/dinautil/difrot.F (97% similar)
recording removal of gdv/utilam/iiabs.F as rename to gdv/utilam/ivabs.F (82% similar)
recording removal of gdv/l1/decotp.F as rename to gdv/l1/decjob.F (77% similar)
recording removal of gdv/l1/decotp.F as rename to gdv/l1/decmth.F (47% similar)

This is where some of your Gaussian intuition comes in handy. Moving files from link subdirectories to utility subdirectories is a common and expected thing. If we inspect the gdv/utilam/iiabs.F to gdv/utilam/ivabs.F rename we see that the files are the same apart from two one-character changes. Inspecting the final two renames shows that the file gdv/l1/decotp.F is indeed being split into two new files. Therefore I go ahead and run the command without the -n flag.
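A source that appears in more than one rename line, like gdv/l1/decotp.F above, usually means a file has been split or copied, and deserves a closer look. A minimal sketch for picking these out of the dry-run output follows. The function name split_sources is hypothetical, and the parsing again assumes the "recording removal of ... as rename to ..." line format.

```shell
# Hypothetical helper: read `hg addremove -n -s N` output on stdin and
# print each removed file that is recorded as the source of more than
# one rename (for example, a file that has been split in two).
split_sources() {
    sed -n -E 's/^recording removal of (.*) as rename to .*$/\1/p' \
        | sort \
        | uniq -d
}
```

Running hg addremove -s 40 -n -X 'glob:gdv/tests/**' gdv | split_sources before the real run gives a short list of files to inspect by hand.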

The final task before we go ahead and commit this changeset is to check the makefile for any new targets. If these correspond to new executables we will need to add them to the .hgignore list.

$ hg diff gdv/bsd/i386.make | less

shows us that there are two new links:

...
@@ -212,7 +212,7 @@
         l105.exe    l106.exe    l107.exe    l108.exe    l109.exe \
         l110.exe    l111.exe    l112.exe    l113.exe    l114.exe    l115.exe \
         l116.exe    l117.exe    l118.exe    l120.exe    l121.exe \
-        l122.exe    l123.exe    l124.exe    l125.exe    l202.exe
+        l122.exe    l123.exe    l124.exe    l125.exe    l126.exe    l127.exe    l202.exe

 exe3:    l301.exe    l302.exe    l303.exe    l305.exe    l306.exe \
         l307.exe    l308.exe    l309.exe    l310.exe    l311.exe \ 
...

and two new utilities:

...
@@ -597,6 +597,12 @@
 unfchk: unfchk.o
        $(RUNF77) $(FFLAGS) -o unfchk unfchk.o $(EXTOBJ) $(EXTRAS) $(GAULIB) $(LIBS)

+gdrgen: gdrgen.o
+       $(RUNF77) $(FFLAGS) -o gdrgen gdrgen.o $(EXTOBJ) $(EXTRAS) $(GAULIB) $(LIBS)
+
+trajgen: trajgen.o
+       $(RUNF77) $(FFLAGS) -o trajgen trajgen.o $(EXTOBJ) $(EXTRAS) $(GAULIB) $(LIBS)
+
 rdmat: rdmat.o
        $(RUNF77) $(FFLAGS) -o rdmat rdmat.o $(EXTOBJ) $(EXTRAS) $(GAULIB) $(LIBS)
...

The links are already covered by a pattern in the .hgignore file, so we only need to add the two new utilities. Afterwards:

$ hg diff .hgignore
diff -r 8db6c9367c60 .hgignore
--- a/.hgignore Thu Aug 02 11:17:35 2012 +0100
+++ b/.hgignore Mon Mar 04 17:58:09 2013 +0000
@@ -32,6 +32,7 @@
 gdv/gauopt
 gdv/gauoptl
 gdv/gautraj
+gdv/gdrgen
 gdv/gdv
 gdv/gdvlb
 gdv/ghelp
@@ -46,5 +47,6 @@
 gdv/rwfdump
 gdv/tags
 gdv/testrt
+gdv/trajgen
 gdv/unfchk
 gdv/wrmat
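To double-check that no new utility has been forgotten, a small sketch like the one below can report which names still lack an entry. The function name missing_ignores is hypothetical, and it assumes each utility is listed on its own line as gdv/NAME, as in the diff above; it does not understand the pattern that covers the link executables.

```shell
# Hypothetical helper: print each utility name that has no "gdv/NAME"
# line in the given ignore file.
# Usage: missing_ignores IGNORE_FILE NAME...
missing_ignores() {
    ignore_file=$1
    shift
    for name in "$@"; do
        # -x matches whole lines only, -F treats the pattern as a fixed string
        grep -qxF "gdv/$name" "$ignore_file" || echo "$name"
    done
}
```

For this import, missing_ignores .hgignore gdrgen trajgen should print nothing once both lines have been added.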

This, then, represents Gaussian as it came from Gaussian Inc., converted to our new format. Once you are satisfied, commit the revision with hg ci. There is a preferred format for the log message for these raw branch commits; look at one of the previous commits using hg log -v --rev N | less for an example. The log message I will use for this particular commit is:

Import of Gaussian version H29p.  See below for details.

rm -fr gdv
cp -pr ../h29p/new-style/gdv .
hg addremove -s 50 gdv/tests
hg revert gdv/tests/com/test1036.com gdv/tests/com/test1037.com
hg add gdv/tests/com/test1036.com gdv/tests/com/test1037.com
hg addremove -s 40 -X 'glob:gdv/tests/**' gdv
and added gdrgen and trajgen to .hgignore.

The idea is that someone should be able to replicate what you've done just by referring to the log message.

Congratulations! You've just added a revision to the raw branch. Now tag it with the version identifier:

$ hg tag h29p-raw

Next, we must merge this version into our local branch, which represents a compilable version of Gaussian in our new format. Generally this only requires changes to our local makefiles; however, these changes can require considerable effort.

$ hg up local
1995 files updated, 0 files merged, 1228 files removed, 0 files unresolved
Added tags for local versions
$ hg merge raw
merging .hgignore
merging .hgtags
4 files to edit
...

At this point this guide will get a little hazy, because I used vim and you may be using some other editor. After you type hg merge raw you may be presented with a graphical merge tool, a text-based merge tool, a particular mode of your favourite text editor, a particular mode of a hated text editor, or indeed nothing. You can customise the tool that is used if you want. For the purposes of this guide, quit out of the tool or editor without making any changes. If you are using vim, use the :cq command, which causes vim to return an error to Mercurial, telling it that the merge is not finished. Again for the purposes of this guide, run hg resolve -a -u to mark all the conflicting files as unresolved.

You should now be able to see:

$ hg resolve -l
U .hgignore
U .hgtags
U gdv/bsd/uber-i386.make
U gdv/bsd/uber-sandyb.make

Let's now resolve these four files. First .hgignore:

$ hg resolve .hgignore
merging .hgignore

This will return immediately having automatically merged the files and produced no conflicts. Easy enough. If you examine the file you will see that our two new executables have been added. Next:

$ hg resolve .hgtags
merging .hgtags

For me this command starts vimdiff with windows showing the tags file as it exists in the two parent revisions of the merge, plus another window where I can edit the final merged version; this window will have the title gdv/bsd/.hgtags, and the others will contain words like "base", "other", and "orig". Although the tags file will generally look after itself, I like to keep things neat, so I edit the file so that the new tag, h29p-raw, exists in the final version.

Now to the more difficult files:

$ hg resolve gdv/bsd/uber-i386.make

Again, for me this opens up vimdiff. Here we have some work to do because gdv/bsd/uber-i386.make has been marked as a copy of gdv/bsd/i386.make, which inevitably changes with every new version of Gaussian. I'll go through this carefully here, but generally you'll have to use your own experience with make and Gaussian to guide you through the process. If necessary you can quit out of the merge tool (use :cq for vim-based editors; this marks the file as unresolved) and examine the changes to gdv/bsd/i386.make using hg diff.

The first difference I see involves INCDIRX. However, inspection shows that this has not changed in the new version of gdv/bsd/i386.make. The next change is to the definition of DIMENSX. Here we see that Gaussian have added a flag "-DSTUPID_ATLAS", changed the expansion of "-DDEFMAXCOORDINFO" from 16 to 32, and changed " -DSETCORE_OK" to " -DSETCDMP_OK". These are legitimate changes that I add to our local makefile.

The next part we come to is the part of the local makefile that says "output from set-mflags here". This is a series of compiler and other flags that are emitted by a Gaussian utility during the normal build phase. We have to capture those flags directly into the makefile. To do this you will need to do something like:

$ cd gdv
$ ln -s uber-i386.make bsd/gdv.make
$ make build-tools
$ PATH=$PATH:.:./bsd gdvroot=/tmp/scliffor/repos/gaussian-inc-versions/ bsd/set-mflags
GAUDIM=2500 CSIZE=2097152 CSIZEW=128 OPTOI= MMODEL='-mcmodel=medium' SPECFLAG= NKSEC=-DDEFKSEC=256 I8FLAG=-i8 R8FLAG=-r8 I8CPP1=-DI64 I8CPP2=-DP64 I8CPP3=-DPACK64 I8CPP4=-DUSE_I2 MACHTY=k8-64 GAULIBU=util.a BLAS1=bsd/libf77blas-amd64.a BLAS2=bsd/libatlas-amd64.a

That last line is space-separated and not terminated with a newline. If you paste it into a file and replace the spaces with newlines, you can compare it with the flags already in the makefile; in this case you'll see that nothing has changed.
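Replacing the spaces with newlines is a one-liner. The sketch below wraps it in a function so the result can be piped into a file or into diff. The name split_flags is hypothetical, and it assumes, as in the output above, that no flag value contains a space.

```shell
# Hypothetical helper: read the single-line set-mflags output on stdin
# and print one NAME=value flag per line, dropping any empty lines.
# Assumes no flag value contains a space.
split_flags() {
    tr ' ' '\n' | sed '/^$/d'
}
```

Piping the set-mflags command above through split_flags and saving the result gives a file that can be compared against the flags already recorded in the local makefile.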

The next changes that we care about come in the definitions of the make targets exe1 and UTILLIST. As we noticed previously, Gaussian have added two new links and two new executables: l126.exe, l127.exe, gdrgen, and trajgen. Further down we must include the commands needed to make the two new executables.

Then there is a change to the commands needed to make gau-cpp and finally there are some alterations to the lists of subroutines that need special optimisation flags. I include all of these changes.

When you're done, save the merged file and quit the merge tool. If the merge tool or editor that you are using exits with a success code then Mercurial will mark the file as resolved.

Do the same for gdv/bsd/uber-sandyb.make. The changes will be pretty much the same.

Because there are new executable targets we must add them to our master Makefile, gdv/Makefile. Edit this file and you will see that the declarations of the targets mirror the gdv/bsd/*.make files. Add l126.exe, l127.exe, gdrgen, and trajgen to the file. Further down, after the part of the makefile that looks like:

.INTERMEDIATE: unfchk.o
unfchk: unfchk.o $(GAULIB)
    $(RM) $@
    $(MAKE) -f $(GAU_DIR)/bsd/gdv.make MAKE='$(MAKE)' $(EXTRAMAKEFLAGS) $@

copy these five lines (including the blank one) and then change all occurrences of unfchk to gdrgen. Repeat for trajgen. Finally find the lines:

.SECONDARY: $(filter-out $(ml125.o), $(patsubst %.F, %.o, $(wildcard l125/*.F)))
l125.a: $(filter-out $(ml125.o), $(patsubst %.F, %.o, $(wildcard l125/*.F)))
    $(AR) rv $@ $?

copy them (again including the blank line), replacing l125 with l126. Repeat for l127.
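Both of these copy-and-rename steps are plain text substitutions, so they can be sketched with sed rather than done by hand. The function name clone_stanza is hypothetical, and it assumes the old name contains no characters that are special in a sed pattern, which holds for names like unfchk and l125.

```shell
# Hypothetical helper: read a makefile stanza on stdin and print a copy
# with every occurrence of OLD replaced by NEW, followed by a blank line.
# Usage: clone_stanza OLD NEW < stanza
# Assumes OLD has no characters that are special in a sed pattern.
clone_stanza() {
    sed "s/$1/$2/g"
    echo
}
```

With the unfchk stanza pasted into a scratch file, clone_stanza unfchk gdrgen and clone_stanza unfchk trajgen print the two new stanzas; the same goes for l125 with l126 and l127. Any tab characters at the start of the command lines are passed through unchanged, which matters because make requires them. Review the output before pasting it into gdv/Makefile.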

After this we are hopefully left with the new version of Gaussian in our new format with our local makefiles to build that new format. The best way to test this is to build it! I did this and it builds successfully. Now run the test suite (yeah right).

A quick check with hg stat -u will usually show some .orig files. These are usually left behind by the merge process, and can be safely removed. If anything else shows up you must decide whether to add it to the .hgignore file or delete it. You are now ready to commit this revision, so do so with an appropriate log message, and tag it (hg tag h29p-local).
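If you want to see the leftover .orig files on their own before deleting anything, a sketch along these lines may help. The function name list_orig_files is hypothetical; it simply walks the working directory with find and skips Mercurial's own .hg directory.

```shell
# Hypothetical helper: list merge leftovers (*.orig) under a directory
# (default: the current one), skipping Mercurial's .hg metadata directory.
list_orig_files() {
    find "${1:-.}" -name .hg -prune -o -type f -name '*.orig' -print | sort
}
```

Run it from the top of the repository, check the list against the hg stat -u output, and then remove the files by hand.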

And now we're ready for the final merge. Update to the default branch and do the merge:

$ hg update default
$ hg merge local

This is very much the same as the previous merge from raw to local. What you're aiming for on the default branch is a version of Gaussian that will build and run efficiently on the group's local systems (namely CX1). This means that the makefiles should contain the necessary optimisations for the CX1 architecture. It may also mean that you may have to make changes to the code to fix runtime or compile time bugs. For instance, when we started using a new version of the PGI compiler it exposed a bug in Gaussian that prevented any jobs using more than one processor.

I have made some changes to the compiler flags of both of the uber-* makefiles.

It's worth compiling the code again at this point and running a few test jobs. You can use these compiled executables in the /home/gaussian-devel/versions directory.

Once you are satisfied that the code is fine, push your changes back to the master repository (no harm in making a backup of the master beforehand). In the local-dev-master repository, pull the changes from the gaussian-inc-versions repository and let people know that the changes are now available. They can pull and then merge the changes as I've described here.

  1. You may be wondering why we care about renames. Simply put, if changes from Gaussian that include a rename or a copy must be merged with your own changes to those files (or vice versa), then it helps Mercurial to know about the renames and copies. Otherwise, for a rename it only sees a file being deleted and another file being created.