LSF scheduler information specific to the luna cluster

By default all jobs are scheduled on the sol queue. Jobs are scheduled based on their resource requests, estimated runtime, and user priorities. You can also specify the test queue, at your own risk.

LSF host groups:

LSF host partitions

Long vs. Short Jobs

Large memory hosts

Internet hosts

Service Level Agreement Guarantees

Current defaults for jobs:

Use post_done rather than done to hold a job on an earlier one; with done, the dependent job may start too quickly. If holding on multiple jobs with very similar names, -w "post_done($PREV_JOBNAME*)" should work, unless you have one. The dependent job will then run only if the $PREV_JOBNAME job completed with exit status 0 and finished its post_done processes.
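The dependency described above could be set up roughly as follows. This is only a sketch: the job-name prefix `align` and the scripts `align.sh` and `merge.sh` are hypothetical placeholders, and the guard around bsub is there so the snippet can be tried on a machine without LSF.

```shell
#!/bin/sh
# Hypothetical job-name prefix shared by the jobs we want to wait for.
PREV_JOBNAME="align"

# Dependency expression: wait for every job whose name starts with "align".
DEPENDENCY="post_done(${PREV_JOBNAME}*)"

if command -v bsub >/dev/null 2>&1; then
    # Upstream jobs, named so the wildcard matches them (scripts are placeholders).
    bsub -J "${PREV_JOBNAME}_1" ./align.sh sample1
    bsub -J "${PREV_JOBNAME}_2" ./align.sh sample2
    # Keep the expression quoted so the shell does not expand "*" or
    # trip over the parentheses.
    bsub -J merge -w "$DEPENDENCY" ./merge.sh
else
    echo "bsub not found; dependency expression would be: $DEPENDENCY"
fi
```

Quoting the whole expression is the important part; an unquoted `*` or parenthesis would be interpreted by the shell before bsub sees it.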

Examples

How to send an email at the end of a job:

First, the user must run `export LSB_JOB_REPORT_MAIL=Y` in the terminal from which they will submit the job.
Then they submit with bsub -u <emailaddress@site.com> -N
The -N option sends the job report by email when the job finishes (people usually write the job output to a file using -o). The email will look like this:

Job was submitted from host by user in cluster .
Job was executed on host(s) , in queue , as user in cluster .
was used as the home directory.
was used as the working directory.
Started at Tue May 24 11:14:37 2016
Results reported on Tue May 24 11:14:50 2016
Your job looked like:
------------------------------------------------------------
# LSBATCH: User input
sleep 13
------------------------------------------------------------
Successfully completed.
Resource usage summary:
CPU time : 0.07 sec.
Total Requested Memory : -
Delta Memory : -
Run time : 13 sec.
Turnaround time : 14 sec.
The output (if any) follows:

*NOTE before using this: if LSB_JOB_REPORT_MAIL=Y is exported and you pass neither -u nor -N (and neither -o nor -oo), the message is delivered to the local mailbox /var/mail/username, and only on the host that you ran the job on. To change it back, export LSB_JOB_REPORT_MAIL=N after you are done. If users do not, the /var/mail directories on the hosts are likely to fill up with junk.
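Put together, the email workflow might look like the sketch below. The address, the script name `my_job.sh`, and the output file pattern are placeholders, and resetting the variable right after submission assumes the setting is read when the job is submitted.

```shell
#!/bin/sh
EMAIL="user@example.org"      # placeholder address
JOB_SCRIPT="./my_job.sh"      # placeholder job script

# Turn on mail reports for jobs submitted from this shell.
export LSB_JOB_REPORT_MAIL=Y

if command -v bsub >/dev/null 2>&1; then
    # -u sets the recipient, -N mails the job report at the end of the job,
    # -o writes the job output to a file (%J is replaced by the job ID).
    bsub -u "$EMAIL" -N -o "job.%J.out" "$JOB_SCRIPT"
else
    echo "bsub not found; would run: bsub -u $EMAIL -N -o job.%J.out $JOB_SCRIPT"
fi

# Switch mail reports back off so later jobs do not fill /var/mail
# (assumption: the variable only matters at submission time).
export LSB_JOB_REPORT_MAIL=N
```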


LSF Rules:

Memory Request Rules:
– Request both a soft memory limit, -R "rusage[mem=GB]", and a hard memory limit, -M GB.
– If neither is requested, the defaults are 8 GB (soft) and 16 GB (hard).
– If hard is requested but soft is not: soft = hard
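The memory defaults above can be written out as a small helper. This is an illustration of the stated rules, not the scheduler's actual implementation; the function name `effective_mem` is made up, and the case where only a soft limit is given is not covered by the rules, so the sketch just prints `-` for the hard limit there.

```shell
#!/bin/sh
# effective_mem SOFT_GB HARD_GB
# Prints "soft hard" in GB after applying the rules above.
# Pass an empty string for a limit that was not requested.
effective_mem() {
    soft="$1"
    hard="$2"
    if [ -z "$soft" ] && [ -z "$hard" ]; then
        # Neither requested: cluster defaults.
        soft=8
        hard=16
    elif [ -z "$soft" ]; then
        # Hard requested without soft: soft = hard.
        soft="$hard"
    fi
    # "-" marks a hard limit the rules above do not define.
    echo "$soft ${hard:--}"
}

effective_mem "" ""     # neither limit requested
effective_mem "" 32     # only the hard limit requested

# An explicit request for both limits would look something like:
#   bsub -R "rusage[mem=8]" -M 16 ./my_job.sh
```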

Runtime Request Rules for short jobs:
– If soft runtime -We hour:minute is set and hard -W hour:minute is not, hard runtime = 2x soft runtime
– If hard runtime is set, soft runtime does not need to be.
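As a worked example of the 2x rule, the helper below doubles a soft runtime given in hour:minute form. The function name `hard_runtime` is hypothetical and the sketch handles only the hour:minute format shown above.

```shell
#!/bin/sh
# hard_runtime HOUR:MINUTE
# Prints the hard runtime implied by a soft runtime when -W is not given
# (hard runtime = 2x soft runtime). Input must be in hour:minute form.
hard_runtime() {
    soft="$1"
    hours=${soft%%:*}
    minutes=${soft#*:}
    # Strip one leading zero so values such as 08 are not read as octal.
    hours=${hours#0}
    minutes=${minutes#0}
    total=$(( (${hours:-0} * 60 + ${minutes:-0}) * 2 ))
    printf '%d:%02d\n' $((total / 60)) $((total % 60))
}

hard_runtime 1:45    # soft runtime of 1 h 45 min
hard_runtime 0:30    # soft runtime of 30 min

# A submission that sets only the soft runtime might look like:
#   bsub -We 1:45 ./my_job.sh
```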

Note:
– There is no hard runtime for long jobs. A job is considered long if no runtime is specified.
– If a job's soft memory limit is below the small-host threshold (376 GB), the job is long (>60 minutes), and it does not request internet access, it will only be submitted to "commonHG".
– The test queue is exempt from these rules.
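The commonHG condition in the notes combines three checks, which the sketch below spells out. The function name and argument order are invented for illustration, the runtime is given in minutes, and the test-queue exemption is not modelled.

```shell
#!/bin/sh
# restricted_to_common_hg SOFT_MEM_GB RUNTIME_MIN INTERNET
# Prints "yes" if, by the notes above, the job would only go to commonHG.
# RUNTIME_MIN may be empty (no runtime specified, which counts as long);
# INTERNET is "yes" or "no".
restricted_to_common_hg() {
    soft_mem_gb="$1"
    runtime_min="$2"
    internet="$3"
    if [ "$soft_mem_gb" -lt 376 ] &&
       { [ -z "$runtime_min" ] || [ "$runtime_min" -gt 60 ]; } &&
       [ "$internet" != "yes" ]; then
        echo yes
    else
        echo no
    fi
}

restricted_to_common_hg 8 "" no      # small, long, no internet
restricted_to_common_hg 400 "" no    # above the small-host threshold
```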