Running Gaussian Jobs on The Grid

At a glance

  • Gaussian is located in /share/apps/g03/g03
  • Simple Gaussian job header:
    g03root="/share/apps/g03"
    GAUSS_SCRDIR="/tmp/"
    export g03root GAUSS_SCRDIR
    . $g03root/g03/bsd/g03.profile

Running a Job

Please begin by reading the general tutorial on Using the Grid; this page covers that material in much less detail.

Now I'm going to assume that you can connect to the grid, and are at a normal bash prompt. We will need to do the following steps:

  • Create a directory to run the job in
  • Create/get the Gaussian input file
  • Set up a job file to run Gaussian on that input file
  • Submit this job to the queue

Additionally, we will watch the job as it runs.

Create a Directory and get a Job File

I am going to make a directory called "gaussian_job" (of course, you will want to build a directory hierarchy that helps keep your research organized). Then I am going to copy one of the test jobs out of the Gaussian directory and use it. The first test is a simple single-point calculation on a water molecule.

[user@hnode ~]$ mkdir gaussian_job
[user@hnode ~]$ cd gaussian_job
[user@hnode gaussian_job]$ cp /share/apps/g03/g03/tests/com/test000.com .
[user@hnode gaussian_job]$ ls
test000.com
[user@hnode gaussian_job]$ head test000.com
# SP, RHF/STO-3G punch=archive trakio scf=conventional

Gaussian Test Job 00
Water with archiving

0 1
O
H 1 0.96
H 1 0.96 2 109.471221

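The input file has a simple fixed shape: a route section (the line starting with #), a blank line, a title, another blank line, a charge and multiplicity line, the geometry, and a final blank line. If you would rather write your own input than copy a test job, a Bash heredoc works well. Here is a minimal sketch using the same water geometry (the file name water.com is just an example):

# Write a minimal Gaussian input: route line, blank line, title,
# blank line, charge/multiplicity, geometry, trailing blank line.
cat > water.com << 'EOF'
# SP RHF/STO-3G

Water single point

0 1
O
H 1 0.96
H 1 0.96 2 109.471221

EOF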

Now we must create our job file. I use the text editor Vim to do this, but any editor will work. Our job file will be a simple Bash script with three parts.

First, the #$ -cwd directive tells our queuing software that we want to run the program and put all output files in the directory we run qsub from (qsub is how we submit jobs; it is explained in the next step).

Second, we set up the Gaussian environment. I like to do this manually, because then you can set your scratch directory as appropriate. You can copy the stock Gaussian environment from this page (see the "At a glance" section at the top). If you need more space, you might point your GAUSS_SCRDIR to a different directory; a sketch of this follows the job file below, and I will do a full demo of it later.

Finally, we run g03, using bash's input redirection to get input from the file "test000.com" and bash's output redirection to send our output to a file called "output.txt".

[user@hnode gaussian_job]$ cat job.sge
#!/bin/bash
#$ -cwd

g03root="/share/apps/g03"
GAUSS_SCRDIR="/tmp/"
export g03root GAUSS_SCRDIR
. $g03root/g03/bsd/g03.profile

g03 < test000.com > output.txt

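If you need more scratch space than /tmp/ offers, you can give each job its own scratch directory on a larger disk and remove it when the run finishes. A sketch of such a job file follows; the path /state/partition1 is only a guess at a local scratch area (ask your administrator for the right one), and $JOB_ID is set by the queue when the job runs:

#!/bin/bash
#$ -cwd

g03root="/share/apps/g03"
# Hypothetical local scratch area; substitute your cluster's real path.
GAUSS_SCRDIR="/state/partition1/$USER/$JOB_ID"
mkdir -p "$GAUSS_SCRDIR"
export g03root GAUSS_SCRDIR
. $g03root/g03/bsd/g03.profile

g03 < test000.com > output.txt

# Clean up the scratch files once the run is done.
rm -rf "$GAUSS_SCRDIR"

For the rest of this tutorial we will stick with the simple /tmp/ version above.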

Now, this file "job.sge" represents everything that the computer needs to do, so we can hand it off to the queuing software to have it run. We do this by running qsub as follows:

[user@hnode gaussian_job]$ qsub job.sge
Your job 22586 ("job.sge") has been submitted


Now we wait for a little while. We can run qstat and watch our job go from waiting in the queue (state "qw") to running ("r") and finally disappear from the list.

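Rather than retyping qstat, you can let the standard watch utility rerun it for you, here refreshing every 10 seconds (press Ctrl-C to quit):

[user@hnode gaussian_job]$ watch -n 10 qstat

Once qstat no longer lists the job, it has finished: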
[user@hnode gaussian_job]$ qstat
[user@hnode gaussian_job]$ echo "No output means I have no jobs in the queue"
No output means I have no jobs in the queue
[user@hnode gaussian_job]$ ls
fort.7 job.sge job.sge.e22586 job.sge.o22586 output.txt test000.com
[user@hnode gaussian_job]$ tail output.txt
TO HEAR SUCH TUNES AS KILLED THE COW.
PRETTY FRIENDSHIP 'TIS TO RHYME
YOUR FRIENDS TO DEATH BEFORE THEIR TIME.
MOPING, MELANCHOLY MAD:
COME PIPE A TUNE TO DANCE TO, LAD.

-- A. E. HOUSMAN
Job cpu time: 0 days 0 hours 0 minutes 1.4 seconds.
File lengths (MBytes): RWF= 13 Int= 1 D2E= 0 Chk= 9 Scr= 1
Normal termination of Gaussian 03 at Tue Nov 20 11:29:30 2007.
[user@hnode gaussian_job]$


Success! Our job ran to completion. The job.sge.o22586 and job.sge.e22586 files hold the standard output and error streams captured by the queue; since we redirected Gaussian's output to output.txt ourselves, they should be essentially empty.

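A quick way to confirm that a run completed, without reading the whole output file, is to search for Gaussian's closing line:

[user@hnode gaussian_job]$ grep "Normal termination" output.txt
Normal termination of Gaussian 03 at Tue Nov 20 11:29:30 2007.
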
Multiple CPU Jobs

We may want to reserve an entire node to run one of our jobs, so that we can use the full memory of that node. To do this we need to tell the queue that we want to run a parallel job using a shared-memory parallel environment.

Parallel jobs are run by specifying a parallel environment with the -pe option. There are many different parallel environments for different circumstances, such as running large MPI jobs. The SharedMem parallel environment requests multiple CPUs that must all be on the same node. Since each node has 4 CPUs, we can reserve all 4 CPUs on a node with the following directive:

#$ -pe SharedMem 4


This means that the queue will reserve all four slots that could otherwise hold other jobs, and so effectively reserves an entire node for us.

Other parallel environments will allow the user to reserve multiple CPUs on multiple nodes, so that networked jobs can run. However, we will not need to do this with Gaussian.

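One caveat: reserving the slots only keeps other jobs off the node; Gaussian itself still runs on a single CPU unless you ask it to use more. In Gaussian 03 that is done with a %NProcShared Link 0 line at the top of the input file (check the Gaussian documentation for the exact keyword your version expects). For example, to let our test job use all four CPUs, the top of test000.com would become:

%NProcShared=4
# SP, RHF/STO-3G punch=archive trakio scf=conventional

Gaussian Test Job 00
Water with archiving

0 1
O
H 1 0.96
H 1 0.96 2 109.471221
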
Walkthrough

First we need to create a job file.

job.sge

#!/bin/bash
#$ -cwd
#$ -pe SharedMem 4

g03root="/share/apps/g03"
GAUSS_SCRDIR="/tmp/"
export g03root GAUSS_SCRDIR
. $g03root/g03/bsd/g03.profile

g03 < test000.com > output.txt

Run as usual:

[user@hnode test03]$ qsub job.sge
Your job 29100 ("job.sge") has been submitted


Now if we check on the job as it runs, we see that it is using 4 slots on one of the nodes:

[user@hnode test03]$ qstat
job-ID    prior    name    user  ... queue                      slots
-------------------------------- ... ----------------------------------
29100     0.00000  job.sge user  ... NEBL.cq@compute-2-9.local    4


And we check our results:

[user@hnode test03]$ tail output.txt
5712,0.,0.3836558\PG=C02V [C2(O1),SGV(H2)]\\@
The archive entry for this job was punched.


IT CANNOT BE MY BEAUTY, FOR I HAVE NONE; AND IT CANNOT BE MY WIT,
FOR HE HAS NOT ENOUGH TO KNOW THAT I HAVE ANY.
-- CATHARINE SEDLEY, PUZZLED ABOUT WHY SHE WAS MISTRESS TO JAMES II
Job cpu time: 0 days 0 hours 0 minutes 2.7 seconds.
File lengths (MBytes): RWF= 13 Int= 1 D2E= 0 Chk= 9 Scr= 1
Normal termination of Gaussian 03 at Fri Nov 30 13:15:27 2007.
[user@hnode test03]$