By default all jobs are scheduled on the sol queue. Jobs are scheduled based on their resource requests, estimated runtime, and user priorities. You can also specify the test queue, at your own risk.
commonHG: s04-s33 & u01-u26
largeHP: t01-t02
commonHP: s04-s33 & u01-u26
testHP: u34-u35
If a job has a soft run time, -We, or hard run time, -W, of 59 minutes or less, it is considered ‘short’.
-We HOURS:MINUTES or -W HOURS:MINUTES
largeHG hosts if they request enough memory to be eligible.
largeHG hosts are for jobs that need a lot of memory.
internet access on largeHG hosts.
Internet hosts
internetHG hosts are for jobs that need internet access.
internet, can run on internetHG hosts.
-R "select[internet]".
bsub command, as in bsub -sla Pipeline, to be guaranteed a certain amount of the cluster (so long as that portion isn’t in use by anyone else in the same SLA group).
entire host is reserved for Pipeline until Pipeline reaches its SLA.
luna:
Pipeline gets 40% of commonHG, 50% of internetHG, and 50% of largeHG.
Short (short jobs auto attach to this) gets 20% of commonHG if there are no priority jobs.
-R "span[hosts=1]" Jobs that request multiple processors span a single host.
-R "rusage[iounits=1]" The maximum iounits per host is 10. IOUNITS are an arbitrary measure of the amount of reading/writing that the job incurs.
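The resource options above can be combined into a single -R string. The following is a minimal sketch, not a site-provided tool: `build_bsub` is a hypothetical helper that only prints a command line instead of submitting it, `myjob` is a placeholder command, and `-n` (the number of job slots) is a standard bsub option that this page does not otherwise describe.

```shell
# Dry-run helper (hypothetical): prints a bsub command line, does not submit.
# Usage: build_bsub SLOTS IOUNITS INTERNET(yes|no) COMMAND...
build_bsub() {
    slots=$1; iounits=$2; internet=$3; shift 3

    # The text above gives 10 as the maximum iounits per host
    if [ "$iounits" -gt 10 ]; then
        echo "iounits must be 10 or less" >&2
        return 1
    fi

    # Reserve I/O units and keep all slots on one host
    res="rusage[iounits=${iounits}] span[hosts=1]"
    if [ "$internet" = "yes" ]; then
        # Restrict the job to hosts that have the internet resource
        res="select[internet] ${res}"
    fi

    printf 'bsub -n %s -R "%s" %s\n' "$slots" "$res" "$*"
}

build_bsub 4 2 yes myjob
# prints: bsub -n 4 -R "select[internet] rusage[iounits=2] span[hosts=1]" myjob
```

Printing the command first makes it easy to check the quoting before pasting it into a terminal.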
-o file. To redirect output from the command itself, put quotes around the command passed to bsub. For example: bsub -We 1 -J jobName -o output_file.txt "ls -al 1> redirect_file.txt"
bsub -w is the wait option, as in bsub -w "post_done($PREV_JOBNAME)"
bsub command.
Use post_done to hold jobs, instead of done, which may start too quickly. If holding on multiple jobs with very similar names, -w "post_done($PREV_JOBNAME*)" should work, unless you have one. This will only let the job run if the $PREV_JOBNAME job completed with exit status 0 and completed its post_done processes.
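As an illustration of chaining two jobs with post_done, here is a dry-run sketch. `submit_chain` is a hypothetical helper that prints the two submissions rather than running them; the job name `align_sample1` and the commands `step1` and `step2` are placeholders.

```shell
# Dry-run sketch: prints the two bsub commands instead of submitting them.
# Usage: submit_chain JOBNAME FIRST_COMMAND SECOND_COMMAND
submit_chain() {
    # The first job gets a name so that later jobs can refer to it.
    # Its command is quoted so that any redirection stays inside the job.
    printf 'bsub -J %s -o %s.out "%s"\n' "$1" "$1" "$2"

    # The second job is held until the first has reached post_done
    printf 'bsub -w "post_done(%s)" -J %s_next -o %s_next.out "%s"\n' \
        "$1" "$1" "$1" "$3"
}

PREV_JOBNAME=align_sample1
submit_chain "$PREV_JOBNAME" "step1 > step1.txt" "step2"
```

The `_next` suffix for the second job name is only a convention chosen for this sketch.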
bsub sleep 30
Submits a basic sleep job (sleeps for 30 seconds).
bsub -J jobname -We 0:30 -R "select[internet]" myjob
Submits a job named "jobname" with an estimated runtime of 30 minutes, selecting for hosts with internet.
bsub -m commonHG -R "rusage[mem=20]" myjob
Submits the job only to hosts in host group commonHG, with 20 GB of memory requested.

First, the user must `export LSB_JOB_REPORT_MAIL=Y` in the terminal from which they are going to submit their job.
Then they use bsub -u <emailaddress@site.com> -N
The -N means email the job output (people usually write it to a file using -o) at the end of the job. This is what the e-mail will look like:
    Job was submitted from host by user in cluster .
    Job was executed on host(s) , in queue , as user in cluster .
    was used as the home directory.
    was used as the working directory.
    Started at Tue May 24 11:14:37 2016
    Results reported on Tue May 24 11:14:50 2016

    Your job looked like:

    ------------------------------------------------------------
    # LSBATCH: User input
    sleep 13
    ------------------------------------------------------------

    Successfully completed.

    Resource usage summary:

    CPU time : 0.07 sec.
    Total Requested Memory : -
    Delta Memory : -
    Run time : 13 sec.
    Turnaround time : 14 sec.

    The output (if any) follows:
NOTE before using this: If you have LSB_JOB_REPORT_MAIL=Y exported and do not use -u or -N (and you don’t have -o or -oo), a message gets sent to you at /var/mail/username, and it exists only on the host that you ran the job on. To change it back, just export LSB_JOB_REPORT_MAIL=N after you are done! If users don’t, there is potential to flood the /var/mail directories on the hosts with junk!
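One way to avoid forgetting to reset the variable is to export it only inside a subshell for a single submission. This is a sketch under that assumption, not an official recipe: `submit_with_mail` is a hypothetical wrapper, and `BSUB` is a variable invented here so that the command can be swapped out for a dry run.

```shell
# Sketch: submit one job with mail reporting enabled.
# The export happens inside a subshell ( ... ), so the variable does not
# stay set in the terminal afterwards.
# Usage: submit_with_mail EMAIL [bsub options] COMMAND...
submit_with_mail() {
    email=$1; shift
    (
        export LSB_JOB_REPORT_MAIL=Y
        # BSUB is a hypothetical override; it defaults to the real bsub
        "${BSUB:-bsub}" -u "$email" -N "$@"
    )
}

# Dry run: substitute echo for bsub to see the arguments that would be passed
BSUB=echo
submit_with_mail emailaddress@site.com -o out.txt sleep 13
# prints: -u emailaddress@site.com -N -o out.txt sleep 13
```

Unset `BSUB` (or set it to `bsub`) to submit for real.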
Memory Request Rules:
– Both a soft memory limit, -R "rusage[mem=GB]", and a hard memory limit, -M GB, should be requested.
– If neither is requested, the default soft limit is 8 GB and the default hard limit is 16 GB.
– If hard is requested but soft is not: soft = hard
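The memory rules above can be written out as a small function. This is only an illustration of the stated rules, not the scheduler's actual code; `resolve_mem` is a hypothetical name, and the case where soft is requested without hard is not covered by the rules, so the sketch reports it as unspecified.

```shell
# Usage: resolve_mem SOFT_GB HARD_GB   (pass "" for a limit that was not requested)
resolve_mem() {
    soft=$1; hard=$2
    if [ -z "$soft" ] && [ -z "$hard" ]; then
        soft=8; hard=16          # neither requested: defaults from the rules
    elif [ -z "$soft" ]; then
        soft=$hard               # hard only: soft = hard
    elif [ -z "$hard" ]; then
        hard=unspecified         # soft only: not stated in the rules above
    fi
    echo "soft=${soft} hard=${hard}"
}

resolve_mem "" ""    # prints: soft=8 hard=16
resolve_mem "" 32    # prints: soft=32 hard=32
resolve_mem 20 40    # prints: soft=20 hard=40
```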
Runtime Request Rules for short jobs:
– If soft runtime -We hour:minute is set and hard -W hour:minute is not, hard runtime = 2x soft runtime
– If hard runtime is set, soft runtime does not need to be.
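The runtime rules can be checked with a little arithmetic. The sketch below assumes runtimes are written as HOURS:MINUTES or as plain minutes (as in the -We 1 example earlier); the function names are hypothetical, and the 59-minute cutoff for short jobs is the one stated earlier on this page.

```shell
# Convert [HOURS:]MINUTES to total minutes
to_minutes() {
    case "$1" in
        *:*) h=${1%%:*}; m=${1##*:} ;;
        *)   h=0; m=$1 ;;
    esac
    # Strip leading zeros so that e.g. "08" is not read as an octal number
    h=$(printf '%s' "$h" | sed 's/^0*//')
    m=$(printf '%s' "$m" | sed 's/^0*//')
    echo $(( ${h:-0} * 60 + ${m:-0} ))
}

# Hard runtime implied by a soft runtime: 2x the soft runtime, printed as H:MM
implied_hard() {
    total=$(( $(to_minutes "$1") * 2 ))
    printf '%d:%02d\n' $(( total / 60 )) $(( total % 60 ))
}

# A job counts as short when its runtime is 59 minutes or less
is_short() {
    [ "$(to_minutes "$1")" -le 59 ]
}

implied_hard 0:30    # prints: 1:00
if is_short 0:30; then echo "short"; else echo "long"; fi    # prints: short
```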
Note:
– There is no hard runtime for long jobs. A job is considered long if there is no runtime specified.
– If the soft memory limit is less than the small host threshold (376 GB), the job is long (>60 minutes), and internet is not requested, the job will only be submitted to "commonHG".
– The test queue is exempt from these rules.
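The commonHG note can be expressed as a simple predicate. This is a sketch of the rule as written, not the scheduler's implementation: `common_only` is a hypothetical name, the runtime is passed in minutes, and the sketch follows the note's ">60 minutes" literally, so how a job of exactly 60 minutes is treated is left open.

```shell
SMALL_HOST_THRESHOLD_GB=376

# Usage: common_only SOFT_MEM_GB RUNTIME_MINUTES INTERNET
#   RUNTIME_MINUTES: "" when no runtime was specified
#   INTERNET: yes | no
# Succeeds (exit status 0) when the job would only be submitted to commonHG.
common_only() {
    mem=$1; runtime=$2; internet=$3

    # A job is long if no runtime is specified or the runtime is over 60 minutes
    if [ -z "$runtime" ] || [ "$runtime" -gt 60 ]; then
        long=yes
    else
        long=no
    fi

    [ "$mem" -lt "$SMALL_HOST_THRESHOLD_GB" ] &&
        [ "$long" = yes ] &&
        [ "$internet" = no ]
}

if common_only 20 "" no; then
    echo "commonHG only"
else
    echo "rule does not apply"
fi
# prints: commonHG only
```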