The results and ramblings of research


Automating PBS jobs: Ideas and a Tutorial

I run many of my jobs on clusters managed by PBS. Here I describe some of my experiences automating job submission, monitoring, and deletion.

  • Deleting Jobs: My favorite script is the one that deletes jobs, especially when I have to remove every job I have submitted. Put the commands below in a script file and make it executable. The script uses qstat -a | grep username to list all jobs from a specific user (me) and passes each job ID to qdel. Naming the job submission scripts according to a pattern is especially useful, because you can then remove a subset of jobs by changing the filter to qstat -a | grep username | grep pattern.
qstat -a | grep username | awk '{print $1}' > 2.txt
# qstat -a may truncate the server part of the job ID, so pass only the numeric part to qdel
for LINE in $(cat 2.txt); do qdel "${LINE%%.*}"; done
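The filtering step can also be wrapped in a small function, which makes it easy to check what would be deleted before calling qdel. This is only a minimal sketch: the function name job_ids_matching is hypothetical, and it assumes the usual qstat -a layout with the job ID in column 1, the user in column 2, and the job name in column 4. Long user names may be truncated in that output, so the user argument may need to be shortened to match.

```shell
#!/bin/bash
# job_ids_matching USER [PATTERN]   (hypothetical helper)
# Reads qstat -a style output on standard input and prints the numeric job IDs
# of USER's jobs. If PATTERN is given, only jobs whose name contains it are kept.
job_ids_matching() {
    local user="$1" pattern="${2:-}"
    awk -v user="$user" -v pat="$pattern" '
        $2 == user && (pat == "" || index($4, pat) > 0) {
            sub(/\..*$/, "", $1)   # drop the ".server" suffix from the job ID
            print $1
        }'
}

# Dry run: print the qdel commands instead of executing them.
# qstat -a | job_ids_matching username pattern | while read -r id; do echo qdel "$id"; done
```

Remove the echo in the last line once the list looks right.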
  • Submitting sets of Jobs: I often have to submit many jobs in which a single program is run with different parameters. Automating this is just as easy. Make a file called job, mark it executable, and add the following. It writes one PBS job script per parameter combination, each of which runs an R program, and submits it.
#!/bin/bash
# Directory variables are joined directly to file names, so each must end in "/"
export dir=progdir
export jdir=jobdir
export file1=input1
export file2=input2
export testDir=dir1
export outFileName=outputFileName
export Rprogram=ProgramName.R
export outDir=outputDir
export resDir=resDir
export ITRCOUNT=10
export type=CA_TX
for ROWS in 5000 10000 25000 50000 100000
do
        COLS=1 # starting column count (assumed value; set to suit your data)
        while [ "$COLS" -le 26 ]
        do
                for ITRSTART in 1 11 21 31 41
                do
                        fName="$outFileName""$ROWS"_"$ITRSTART"_"$ITRCOUNT"_ # name prefix of the PBS job script
                        jobFile="$jdir$fName$COLS" # full path of the PBS job script
                        echo '#!/bin/bash' > "$jobFile" # ">" starts the job script from an empty file
                        echo "#PBS -l nodes=1:nehalem:ppn=1" >> "$jobFile" # request 1 nehalem node and 1 processor on that node
                        echo "#PBS -l walltime=64:00:00" >> "$jobFile" # request 64 hours of compute time
                        echo "#PBS -l mem=16gb" >> "$jobFile" # request 16 GB of memory
                        echo "#PBS -N TopK${type}_${ROWS}_${COLS}" >> "$jobFile" # name of the job
                        echo "#PBS -m abe" >> "$jobFile" # send mail on abort, begin and end
                        echo "#PBS -M username" >> "$jobFile" # mail to username
                        # This is important: copy all input files to the node's
                        # local hard drive, since this improves compute performance
                        echo "cp $testDir$file1 \$TMPDIR" >> "$jobFile"
                        echo "cp $testDir$file2 \$TMPDIR" >> "$jobFile"
                        echo 'cd $PBS_O_WORKDIR' >> "$jobFile"
                        echo "# run the program" >> "$jobFile"
                        echo "module load R" >> "$jobFile"
                        echo "R --slave --args \$TMPDIR/$file1 \$TMPDIR/$file2 $ROWS $COLS $resDir$outFileName $outDir $type $ITRSTART $ITRCOUNT < $dir$Rprogram" >> "$jobFile"
                        qsub "$jobFile" # submit job to PBS
                done
                COLS=$((COLS + 1)) # step of 1 is assumed; change if needed
        done
done
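Writing the job script one echo at a time works, but a here-document keeps the template easier to read. The following is a minimal sketch, assuming a hypothetical helper write_job_script that takes the script path, the job name, and the command to run. The resource requests are copied from the script above, without the site-specific nehalem node property.

```shell
#!/bin/bash
# write_job_script FILE JOBNAME COMMAND   (hypothetical helper)
# Writes a PBS job script to FILE. Variables preceded by a backslash are left
# for the compute node to expand; the others are filled in now.
write_job_script() {
    local script="$1" name="$2" cmd="$3"
    cat > "$script" <<EOF
#!/bin/bash
#PBS -l nodes=1:ppn=1
#PBS -l walltime=64:00:00
#PBS -l mem=16gb
#PBS -N $name
cd \$PBS_O_WORKDIR
$cmd
EOF
}

# Example call from inside the loops (not executed here):
# write_job_script "$jdir$fName$COLS" "TopK${type}_${ROWS}_${COLS}" "module load R"
# qsub "$jdir$fName$COLS"
```

Because the here-document delimiter is unquoted, $name and $cmd are substituted when the script is written, while \$PBS_O_WORKDIR ends up as a literal $PBS_O_WORKDIR in the job script.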
  • Querying PBS: These are the one-liners I use to monitor jobs and nodes. In the qstat -a output, column 10 holds the job state (R for running, Q for queued).
$ qstat -a | grep username -> get all jobs for username
$ qstat -a | grep username | wc -l -> get number of jobs for username
$ qstat -a | grep username | awk '{print $10}' | grep R | wc -l -> get number of running jobs for username
$ qstat -a | grep username | awk '{print $10}' | grep Q | wc -l -> get number of queued jobs for username
$ pbsnodes -a | grep "state = free" -> get all free nodes
$ pbsnodes -a | grep "state = free" | wc -l -> get number of free nodes
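The running and queued counts can be collected in a single pass over the qstat output. This is a rough sketch rather than a standard PBS tool: count_by_state is a made-up name, and it again assumes that qstat -a prints the user in column 2 and the job state in column 10.

```shell
#!/bin/bash
# count_by_state USER   (hypothetical helper)
# Reads qstat -a style output on standard input and prints one line per job
# state for USER, in the form "STATE COUNT", sorted by state letter.
count_by_state() {
    local user="$1"
    awk -v user="$user" '
        $2 == user { n[$10]++ }                  # tally jobs per state letter
        END { for (s in n) print s, n[s] }' | sort
}

# Typical use on the cluster (not executed here):
# qstat -a | count_by_state username
```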

Written by anujjaiswal

December 10, 2011 at 12:04 am