Juno Cluster Guide
The new juno cluster is currently available for testing only. To participate in the test, please request "Access another cluster" from http://hpc.mskcc.org/compute-accounts/account-change-request/. juno currently has 216 CPUs. The login node, juno, runs CentOS 7.3. Compute nodes jx01-jx10 run CentOS 7, and node ju14 runs CentOS 6. All juno nodes have access to the solisi Isilon file systems; CentOS 7 nodes also have access to /juno GPFS storage.
Differences in LSF Configuration between Juno and Luna
- We reserve ~12GB of RAM per host for the operating system and GPFS on Juno CentOS 7 hosts.
- On each jx## node (CentOS 7, GPFS installed), 240GB of RAM is available for LSF jobs.
- On each ju## node (CentOS 6, no GPFS), 250GB of RAM is available for LSF jobs.
- When specifying RAM for LSF jobs, specify GB of RAM per slot/task on Juno, or per job on luna.
- All jobs must have -W (maximum execution Walltime) specified on juno. Please do not use -We on juno.
- There is no /swap on CentOS 7 nodes. Memory usage is enforced by cgroups so jobs never swap. A job will be terminated if memory usage exceeds its LSF specification.
- To check jobs that are DONE or EXIT, use "bhist -l JobID" or "bhist -n 0 -l JobID"; bacct is also available. "bjobs -l JobID" only shows RUNNING and PEND jobs.
- There is no iounits resource on juno.
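The per-slot versus per-job memory convention above changes how much RAM a multi-slot job actually reserves. The following is an illustrative arithmetic sketch only (the `total_mem_gb` function is hypothetical, not an LSF command):

```shell
#!/bin/sh
# Total RAM reserved by: bsub -n SLOTS -R "rusage[mem=MEM_GB]"
# On juno, mem is interpreted per slot/task; on luna, per job.
total_mem_gb() {  # usage: total_mem_gb MEM_GB SLOTS CLUSTER
    if [ "$3" = "juno" ]; then
        echo $(( $1 * $2 ))   # per-slot: multiply by slot count
    else
        echo "$1"             # per-job: the value is the total
    fi
}

total_mem_gb 6 4 juno   # 4 slots x 6 GB = 24 GB total reserved on juno
total_mem_gb 6 4 luna   # 6 GB total on luna
```

So the same `-R "rusage[mem=6]" -n 4` request reserves four times as much memory on juno as on luna; adjust the mem value accordingly when moving jobs between clusters.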
Queues
The juno cluster uses LSF (Load Sharing Facility) 10.1 FP6 from IBM to schedule jobs. The default LSF queue, ‘general’, includes all juno compute nodes.
Job Resource Control Enforcement in LSF with cgroups
LSF 10.1 makes use of Linux control groups (cgroups) to limit the CPU cores and memory that a job can use. The goal is to isolate the jobs from each other and prevent them from consuming all the resources on a machine. All LSF job processes are controlled by the Linux cgroup system. If a job's processes on a host use more memory than it requested, the job will be terminated by the Linux cgroup memory sub-system.
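To see the memory limit the cgroup system has placed on your job, you can inspect the cgroup filesystem from inside the job. A minimal sketch, assuming standard mount points (cgroup v1, as on CentOS 7, exposes memory.limit_in_bytes; cgroup v2 exposes memory.max — the exact path of a job's cgroup depends on the LSF configuration):

```shell
#!/bin/sh
# Print whichever cgroup memory-limit files are readable at the
# standard mount points. Inside an LSF job, the job's own cgroup
# directory (not shown here) holds the job-specific limit.
for f in /sys/fs/cgroup/memory/memory.limit_in_bytes /sys/fs/cgroup/memory.max; do
    if [ -r "$f" ]; then
        echo "$f: $(cat "$f")"
    fi
done
```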
LSF Configuration Notes
Memory (-M or -R "rusage[mem=**]") is a consumable resource, specified as GB per slot/task (-n).
LSF will terminate any job that exceeds its requested memory (-M or -R "rusage[mem=**]").
All jobs should specify Walltime (-W); otherwise a default Walltime of 6 hours is applied.
LSF will terminate any job that exceeds its Walltime.
The maximum Walltime for the general queue is 744 hours (31 days).
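The notes above can be collected into a batch job script. The following is a sketch only; the job name, resource values, and program are placeholders, not site requirements:

```shell
#!/bin/bash
#BSUB -J example_job           # job name (placeholder)
#BSUB -n 2                     # number of slots/tasks
#BSUB -W 2:00                  # Walltime; required on juno, max 744:00 in general
#BSUB -R "rusage[mem=4]"       # GB of RAM per slot on juno (8 GB total here)
#BSUB -o out.%J                # stdout file (%J expands to the job ID)
#BSUB -e err.%J                # stderr file

./my_program                   # placeholder for the actual workload
```

Submit it with `bsub < script.sh`. If the job's processes exceed 8 GB on a host, the cgroup memory subsystem terminates it; if it runs past 2 hours, LSF terminates it.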
Job Default Parameters
Queue name: general
Operating System: CentOS 7.3
Number of slots (-n): 1
Walltime (maximum job runtime): 6 hours
Memory (RAM) : 2GB
Job Submission
By default, jobs submitted on juno run only on CentOS 7 nodes. Users can specify CentOS 7 nodes (with GPFS), CentOS 6 nodes (Isilon only), or either type.
To submit a job to the default CentOS 7 nodes:

```
bsub -n 1 -W 1:00 -R "rusage[mem=2]"
```
To submit a job to CentOS 6 nodes:

```
bsub -n 1 -app anyOS -W 1:00 -R "rusage[mem=2]" -R "select[type==CentOS6]"
```
To submit a job to nodes running either CentOS 6 or CentOS 7:

```
bsub -n 1 -app anyOS -W 1:00 -R "rusage[mem=2]"
```
To submit a job to nodes with NVMe /fscratch:

```
bsub -n 1 -R "fscratch" ....
```