Automating PBS jobs: Ideas and a Tutorial
I use PBS and clusters to run a number of my jobs. I will detail some of my experiences in automating job submission, monitoring and deletion.
- Deleting Jobs: My favorite script is the job deletion script, which is especially handy when you have to remove all the submitted jobs. Make a script file called jobdel.sh and make it executable. Naming the job submission scripts consistently is especially useful, because you can then remove sets of jobs using a pattern. For example, I currently use qstat -a | grep username to get all jobs from a specific user (me). However, if the scripts are named following a pattern, the script below can be modified to use qstat -a | grep username | grep pattern and remove only a subset of jobs.
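    #!/bin/bash
    # job IDs (column 1) of all jobs owned by username
    qstat -a | grep username | awk '{print $1}' > 2.txt
    # "r" presumably completes an ID that qstat -a truncates
    for LINE in $(cat 2.txt); do
        qdel "$LINE""r"
    done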
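The pattern-based variant described above can be sketched as a small filter. This is only a sketch under assumptions: list_job_ids is a hypothetical helper name, and it assumes the usual qstat -a layout in which column 1 is the job ID and column 2 is the username. It reads the qstat output on stdin, so it can be tried on saved output before touching real jobs.

```shell
#!/bin/bash
# list_job_ids USER [PATTERN]  (hypothetical helper, not part of PBS)
# Reads `qstat -a` output on stdin and prints the job ID (column 1) of every
# line whose username column (column 2, an assumed layout) equals USER and
# which contains PATTERN as a plain substring. With no PATTERN, all of
# USER's jobs are listed.
list_job_ids() {
    local user="$1" pattern="${2:-}"
    awk -v user="$user" -v pat="$pattern" \
        '$2 == user && (pat == "" || index($0, pat) > 0) { print $1 }'
}

# Dry run: print the qdel commands instead of executing them.
#   qstat -a | list_job_ids username TopK | while read -r id; do echo qdel "$id"; done
```

Once the printed commands look right, removing the echo turns the dry run into the actual deletion.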
- Submitting sets of Jobs: I often have to submit numerous jobs in which a single program is run with different parameters. Automating this is again easy. Make a file called jobsubmit.sh and mark it executable. Open it and add the following, which writes one PBS job script per parameter combination and submits it to run an R program.
    #!/bin/bash
    # placeholder settings; directory values need a trailing "/" because paths are built by concatenation
    export dir=progdir
    export jdir=jobdir
    export file1=input1
    export file2=input2
    export testDir=dir1
    export outFileName=outputFileName
    export Rprogram=ProgramName.R
    export outDir=outputDir
    export resDir=resDir
    export ITRCOUNT=10
    export type=CA_TX

    for ROWS in 5000 10000 25000 50000 100000
    do
        COLS=1
        while [ $COLS -le 26 ]
        do
            for ITRSTART in 1 11 21 31 41
            do
                fName="${outFileName}${ROWS}_${ITRSTART}_${ITRCOUNT}_"
                outName="${fName}${COLS}.txt"
                job="${jdir}${fName}${COLS}"
                # make PBS jobscript, starting from an empty file
                cp /dev/null "$job"
                echo '#!/bin/bash' >> "$job"
                # request 1 nehalem node and 1 processor on that node
                echo "#PBS -l nodes=1:nehalem:ppn=1" >> "$job"
                # request 64 hrs of compute time
                echo "#PBS -l walltime=64:00:00" >> "$job"
                # request 16GB memory
                echo "#PBS -l mem=16gb" >> "$job"
                # name of the job
                echo "#PBS -N TopK${type}_${ROWS}_${COLS}" >> "$job"
                # mail to username on abort, begin and end
                echo "#PBS -m abe" >> "$job"
                echo "#PBS -M username" >> "$job"
                # this is important: copy all files to the node's hard drive, since this improves compute performance
                echo "cp $testDir$file1 \$TMPDIR" >> "$job"
                echo "cp $testDir$file2 \$TMPDIR" >> "$job"
                echo 'cd $PBS_O_WORKDIR' >> "$job"
                echo "# run the program" >> "$job"
                echo "module load R" >> "$job"
                echo "R --slave --args \$TMPDIR/$file1 \$TMPDIR/$file2 $ROWS $COLS $resDir$outName $outDir $type $ITRSTART $ITRCOUNT < $dir$Rprogram" >> "$job"
                # submit job to PBS
                qsub "$job"
            done
            COLS=$((COLS+1))
        done
    done
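The same job script can also be written in one go with a here-document instead of a chain of echo lines, which keeps the generated file readable inside the generator. The following is a minimal sketch of that alternative, not the script I actually run: write_jobscript is a hypothetical helper name, and the resource requests are simply copied from the values used above.

```shell
#!/bin/bash
# write_jobscript PATH JOBNAME COMMAND  (hypothetical helper)
# Writes a minimal PBS job script to PATH. Unescaped variables inside the
# here-document are expanded now, while the script is generated; variables
# escaped as \$NAME are written literally and expanded later on the node.
write_jobscript() {
    local path="$1" jobname="$2" command="$3"
    cat > "$path" <<EOF
#!/bin/bash
#PBS -l nodes=1:ppn=1
#PBS -l walltime=64:00:00
#PBS -l mem=16gb
#PBS -N $jobname
cd \$PBS_O_WORKDIR
$command
EOF
}

# Example (single quotes keep $TMPDIR literal until the job runs):
#   write_jobscript job_5000_1 TopKCA_TX_5000_1 'R --slave --args $TMPDIR/input1 5000 1 < ProgramName.R'
#   qsub job_5000_1
```

Because the text of COMMAND is inserted as is and not expanded a second time, anything that should be resolved on the compute node, such as $TMPDIR, has to be passed in single quotes.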
- Querying PBS
    $ qstat -a | grep username                                   -> get all jobs for username
    $ qstat -a | grep username | wc -l                           -> get number of jobs for username
    $ qstat -a | grep username | awk '{print $10}' | grep R | wc -l   -> get number of running jobs for username
    $ qstat -a | grep username | awk '{print $10}' | grep Q | wc -l   -> get number of queued jobs for username
    $ pbsnodes -a | grep "state = free"                          -> get all free nodes
    $ pbsnodes -a | grep "state = free" | wc -l                  -> get number of free nodes
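The running and queued counts above can be collected in a single pass with awk. This is a rough sketch: count_states is a hypothetical helper name, and it assumes the same qstat -a layout as the commands above, with the username in column 2 and the job state in column 10.

```shell
#!/bin/bash
# count_states USER  (hypothetical helper)
# Reads `qstat -a` output on stdin and prints one "STATE COUNT" line for each
# job state found among USER's jobs (for example R for running, Q for queued).
# Assumes column 2 is the username and column 10 is the state.
count_states() {
    local user="$1"
    awk -v user="$user" \
        '$2 == user { n[$10]++ } END { for (s in n) print s, n[s] }' | sort
}

# Usage:
#   qstat -a | count_states username
```

Matching the username against column 2 only, instead of grepping the whole line, avoids counting jobs whose name happens to contain the username.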