...
Nyx currently has 8 compute nodes, na01-na08 (288 CPUs in total), with Gclisi Isilon as the main file system. The login node is 'nyx'.
...
5. To check jobs that have finished with DONE or EXIT status, use "bhist -l JobID" or "bhist -n 0 -l JobID"; bacct is also available. "bjobs -l JobID" shows only RUNNING and PEND jobs.
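For example, with a hypothetical JobID of 123456:

bhist -l 123456         # full history of a finished job
bhist -n 0 -l 123456    # search all bhist event log files, not just the current one
bacct -l 123456         # accounting statistics for the job
bjobs -l 123456         # detailed status, but only while the job is RUNNING or PEND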
Queue
The Nyx cluster uses LSF (Load Sharing Facility) 10.1FP6 from IBM to schedule jobs. The default queue, ‘general’, includes all Nyx compute nodes.
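To inspect the queues and their limits, the standard LSF bqueues command can be used:

bqueues                 # list all queues with their job counts
bqueues -l general      # detailed limits and configuration of the 'general' queue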
Job Resource Control Enforcement in LSF with cgroups
LSF 10.1 makes use of Linux control groups (cgroups) to limit the CPU cores and memory that a job can use. The goal is to isolate the jobs from each other and prevent them from consuming all the resources on a machine. All LSF job processes are controlled by the Linux cgroup system. If a job's processes on a host use more memory than it requested, the job will be terminated by the Linux cgroup memory sub-system.
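A minimal sketch of the enforcement (./my_job.sh is a hypothetical script; memory units are GB, as described in the configuration notes below):

bsub -M 4 ./my_job.sh   # if the job's processes together use more than 4 GB, the cgroup memory subsystem kills the job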
LSF Configuration Notes
Memory (-M or -R "rusage[mem=**]") is a consumable resource, specified in GB per slot/task (-n).
LSF will terminate any job that exceeds its requested memory (-M or -R "rusage[mem=**]").
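As a worked example of the per-slot accounting (./my_job.sh is a placeholder):

bsub -n 4 -R "rusage[mem=6]" ./my_job.sh   # reserves 4 slots x 6 GB = 24 GB of memory in total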
Job Default Parameters
Queue name: general
Operating System: CentOS 7
Number of slots (-n): 1
Maximum Walltime (job runtime): N/A
Memory (RAM): 2 GB
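Given these defaults, a bare submission (./my_job.sh is a placeholder) is roughly equivalent to requesting the general queue with one slot and 2 GB of memory:

bsub ./my_job.sh   # roughly the same as: bsub -q general -n 1 -M 2 ./my_job.sh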
Job Submission
To submit a job with one slot and 6 GB of RAM to the Nyx nodes:
bsub -n 1 -R "rusage[mem=6]" -J test
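A slightly fuller sketch with job output capture (my_script.sh and the output file names are placeholders; %J is expanded by LSF to the JobID):

bsub -n 1 -R "rusage[mem=6]" -J test -o test.%J.out -e test.%J.err ./my_script.sh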