The new Juno cluster is now available only for testing. Please request "Access another cluster" from http://hpc.mskcc.org/compute-accounts/account-change-request/ if you want to participate in the test. Juno resources are specifically for investigators associated with the Center for Molecular Oncology. If you have an account on luna, you have access to juno.

Juno currently has 2556 CPUs. The login node is 'Juno', and runs CentOS 7 with GPFS as the main file system.

All nodes run CentOS 7 with GPFS, except node ju14, which runs CentOS 6 without GPFS. All Juno nodes have access to the new GPFS /juno storage.

All Juno nodes also have access to the Solisi Isilon file systems. CentOS 7 nodes also have access to /juno GPFS storage.


Configuration change log

New nodes jx01-34 added to the cluster. 

New nodes jy01-03 added to the cluster. These nodes don't have NVMe /fscratch. To request a node with an /fscratch partition: bsub -n 1 -R fscratch ...

Slides from November 2019 User Group: HPC-User-Group-2019-10.pdf

Slides from March 21, 2019 User Group: Juno_UG_032019_final.pdf

Slides from Sep 27, 2018 User Group meeting (updated Nov 28): Juno_UG_092018_final.pdf

...

December 17, 2018: The new queue "control" has been added. Please check "Queues".

November 28, 2018: The default OS type is CentOS07. Please check "Job Submission".

Differences in LSF Configuration between

...

  1. We reserve ~12GB of RAM per host for the operating system and GPFS on Juno CentOS 7 hosts.
  2. On each jx## node (CentOS 7, GPFS installed), 240GB of RAM is available for LSF jobs.
  3. On each ju## node (CentOS 6, no GPFS), 250GB of RAM is available for LSF jobs.
  4. When specifying RAM for LSF jobs, specify GB of RAM per task (slot) on juno (unlike luna, where RAM is specified per job).
  5. All jobs must have -W (maximum execution Walltime) specified on juno. Please do not use -We on juno.
  6. There is no /swap on CentOS 7 nodes. Memory usage is enforced by cgroups so jobs never swap. A job will be terminated if memory usage exceeds its LSF specification.
  7. To check jobs with status DONE or EXIT, use "bhist -l JobID" or "bhist -n 0 -l JobID". bacct is also available. "bjobs -l JobID" only shows RUN and PEND jobs.
  8. There is no iounits resource on juno.
  9. A CMOPI SLA is configured on juno. The loan policies are: 100% of resources for 90-minute jobs from non-SLA users, and 75% of resources for 240-minute jobs from non-SLA users.
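
Items 2 to 4 above suggest a quick sanity check before submitting: because memory is requested per slot on juno, a job's footprint on a host is the slot count times the per-slot value. The sketch below is plain shell arithmetic and does not call LSF. The function names are hypothetical, and it assumes all slots land on a single jx## host with 240GB available for jobs.

```bash
#!/usr/bin/env bash
# Hypothetical helpers, not part of LSF or the Juno configuration.

# Total memory (GB) a job asks for when RAM is given per slot, as on juno.
juno_total_mem_gb() {
    local slots=$1 mem_per_slot_gb=$2
    echo $(( slots * mem_per_slot_gb ))
}

# Succeeds if the request fits in the 240GB available on one jx## host.
# Assumes all slots are placed on a single host.
juno_fits_jx_host() {
    local total
    total=$(juno_total_mem_gb "$1" "$2")
    [ "$total" -le 240 ]
}

# Example: 8 slots at 4GB each is a 32GB request.
juno_total_mem_gb 8 4
if juno_fits_jx_host 8 4; then
    echo "fits on one jx## host"
fi
```

On luna the same "mem" value would describe the whole job, so a request copied between the two clusters may need to be divided or multiplied by the slot count.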

Queues

...

Juno and luna

Please see here: http://mskcchpc.org/display/CLUS/Juno+vs.+Luna

Queues

The Juno cluster uses LSF (Load Sharing Facility) 10.1FP8 from IBM to schedule jobs. The Juno cluster has two queues: general and control. The default queue, ‘general’, includes all Juno compute nodes.

The control queue has no walltime limit and has one node with 144 oversubscribed slots. It should be used only for monitoring or control jobs (jobs that do not use real CPU and memory resources).

To submit a job to the control queue:

Code Block
languagebash
bsub -n 1 -q control -M 1

 

Job Resource Control Enforcement in LSF with cgroups

...

Maximum Walltime (job runtime): 6 hours

Memory (RAM): 2GB


Short vs. Long Jobs and Node Availability

Juno has CMOPI and DEVEL SLAs. When CMOPI/DEVEL jobs are not filling their assigned nodes, 100% of those job slots are available to non-CMOPI jobs with a duration under 2 hours, 75% of slots are available to jobs under 4 hours, and 50% of slots are available to jobs under 31 days.

Nodes assigned to other SLAs are available to non-SLA jobs up to 6 hours.
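
The tiers for idle CMOPI/DEVEL slots described above can be sketched as a small lookup. This is an illustration only: the function name is hypothetical, "under" is read as strictly less than, and returning 0 for jobs of 31 days or longer is an assumption, since the text does not cover that case.

```bash
#!/usr/bin/env bash
# Hypothetical helper: maps a job's walltime (in minutes) to the share of idle
# CMOPI/DEVEL slots it may borrow, following the tiers described above.
# Boundary handling ("under" read as strictly less than) is an assumption.
juno_loan_percent() {
    local minutes=$1
    if   [ "$minutes" -lt 120 ];   then echo 100   # under 2 hours
    elif [ "$minutes" -lt 240 ];   then echo 75    # under 4 hours
    elif [ "$minutes" -lt 44640 ]; then echo 50    # under 31 days
    else                                echo 0     # assumed: not covered by the text
    fi
}

juno_loan_percent 90     # 1.5 hour job
juno_loan_percent 180    # 3 hour job
juno_loan_percent 1440   # 1 day job
```

In practice this means a shorter -W value can make more nodes eligible for a non-SLA job.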

Job Submission 

By default, jobs submitted on juno run only on CentOS 7 nodes (with GPFS). Users can specify CentOS 6 nodes (Isilon only), CentOS 7 nodes (with GPFS), or either type.

To submit a job to CentOS 7 nodes use either of these formats:

Code Block
languagebash
bsub -n 1 -W 1:00 -R "rusage[mem=2]"
bsub -n 1 -W 1:00 -app anyOS -R "select[type==CentOS7]" -R "rusage[mem=2]"

To submit a job to CentOS 6 nodes:

Code Block
languagebash
bsub -n 1 -W 1:00 -app anyOS -R "select[type==CentOS6]" -R "rusage[mem=2]"

...

Code Block
languagebash
bsub -n 1 -W 1:00 -app anyOS -R "rusage[mem=2]"
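
The OS-selection options used in the examples above can be gathered into one helper for scripts that submit to different node types. This is a minimal sketch: the function name and the target keywords (centos7, centos6, any) are hypothetical, and treating "-app anyOS" without a select string as "either OS type" is an assumption based on the examples. It only prints text and does not call bsub.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints the OS-selection options shown in the examples
# above for a given target. It only echoes text and does not call bsub.
juno_os_options() {
    case "$1" in
        centos7) echo '-app anyOS -R "select[type==CentOS7]"' ;;
        centos6) echo '-app anyOS -R "select[type==CentOS6]"' ;;
        any)     echo '-app anyOS' ;;   # assumed to mean "either OS type"
        *)       echo "unknown target: $1" >&2; return 1 ;;
    esac
}

# Print a command line for review before running it by hand.
echo "bsub -n 1 -W 1:00 $(juno_os_options centos6) -R \"rusage[mem=2]\""
```

The helper builds a string for display; a script that executes the command directly would be safer keeping the options in a bash array so the quoting survives.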



To submit a job to nodes with NVMe /fscratch:

...