...

  • Access is via ssh only.
  • No Protected Health Information (PHI) is allowed on any HPC system currently.
  • Users must have a working knowledge of Linux and high performance computing, and be mindful of the impact of their jobs on the usability of compute resources for others. Users who negatively impact compute resources will be restricted, have their jobs terminated, or lose access.
  • We strictly require all users to have their own accounts and do not allow users to share their logins.
  • Users should transfer their data onto the cluster, transfer results off, and clean up when they are finished. Cluster storage is for active working data only. Please talk to SKI Computing about storage of results and long-term archiving of data.
  • All compute jobs must run on the compute nodes through the scheduler (Torque, SGE, or LSF). Computations on login nodes will be terminated without notice, and repeat offenders will be restricted or removed. SSH to nodes is enabled for all users by default for log inspection, process monitoring, and debugging. It is NOT intended for running or testing applications outside of the scheduler, which poses a risk to the system and affects all other users. Users will be warned initially, but repeat offenders will have SSH access to nodes removed. Users are required to use interactive sessions for such activity. If direct compute access outside of the scheduler is absolutely required, users are asked to submit a request and will be guided to more appropriate compute servers.
  • We require notification when users leave MSK so we can terminate accounts and free the space for other users (see “Closing Accounts”, below).
  • For other systems, all management and consultation is billed in addition to the Laboratory Subscription Fee.
  • Please report any problems to hpc-request@cbio.mskcc.org.
  • When requesting an account, users must provide an active email address. If at any point the email address on file is no longer active, the account will be deactivated. It is the user’s responsibility to update their email address by emailing hpc-request@cbio.mskcc.org.
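
As an illustration of the scheduler rule above, the following is a minimal sketch of how a batch or interactive job request might be assembled for LSF, one of the three schedulers named. It only builds the command line and does not submit anything. The helper names, the default core count, run-time limit, and log file name are assumptions made for this example, not site settings; queue names and resource requirements for these clusters are not covered here.

```python
def build_bsub_command(script, cores=1, walltime="1:00", log="job.%J.out"):
    """Return the argument list for an LSF batch submission.

    Hypothetical helper: nothing is executed here.
    -n  number of job slots requested
    -W  run-time limit in [hours:]minutes
    -o  file that receives the job output (%J expands to the job ID)
    """
    return ["bsub", "-n", str(cores), "-W", walltime, "-o", log, script]


def build_interactive_command(cores=1, walltime="1:00", shell="/bin/bash"):
    """Return the argument list for an interactive LSF session.

    Hypothetical helper: -Is requests an interactive job with a
    pseudo-terminal, so testing happens on a compute node under the
    scheduler instead of over a direct SSH login.
    """
    return ["bsub", "-Is", "-n", str(cores), "-W", walltime, shell]


if __name__ == "__main__":
    # Print the commands instead of running them.
    print(" ".join(build_bsub_command("./align.sh", cores=4, walltime="2:00")))
    print(" ".join(build_interactive_command(cores=2)))
```

The script name `./align.sh` is a placeholder. Passing the returned list to `subprocess.run` on a login node would submit the job; the sketch stops short of that so it can be read and tried anywhere.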

Data Backup

  • Lilac Cluster:
    • /home is replicated locally but is not backed up. Daily snapshots are taken and retained for a rolling 4-day window.
    • /data/<group> is replicated locally but is not backed up. It has a 4-day snapshot window.
    • /warm is non-replicated, low-cost storage. It is not backed up and has a 4-day snapshot window.
    • /archive is non-replicated, low-cost storage. It is backed up to a second, geographically separate copy and has a 4-day snapshot window.
  • Luna Cluster:
    • /home is duplicated to an off-site replica. In the event of a complete failure of the primary storage array, current /home data can be recovered.
    • Other data is snapshotted on-site, and files can be recovered for up to 2 days. After 2 days, deleted files are unrecoverable. Additionally, there is NO off-site backup of non-home data, and no recoverability in case of catastrophic failure.
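
The snapshot windows above can be turned into a quick check of whether a deleted file is likely still recoverable. This is a rough sketch under a simplifying assumption: a file counts as recoverable if it was deleted no longer ago than the snapshot window for its storage area. Actual recoverability also depends on whether a snapshot captured the file before it was deleted. The table keys and the function name are made up for this example, and Luna /home is left out because no snapshot window is stated for it.

```python
from datetime import datetime, timedelta

# Snapshot windows in days, copied from the list above.
# The (cluster, area) keys are labels invented for this sketch.
SNAPSHOT_WINDOW_DAYS = {
    ("lilac", "/home"): 4,
    ("lilac", "/data"): 4,
    ("lilac", "/warm"): 4,
    ("lilac", "/archive"): 4,
    ("luna", "non-home"): 2,
}


def is_probably_recoverable(cluster, area, deleted_at, now):
    """Return True if the deletion falls inside the snapshot window.

    Simplification: assumes a snapshot containing the file was taken
    before the deletion, which is not guaranteed for short-lived files.
    """
    window = timedelta(days=SNAPSHOT_WINDOW_DAYS[(cluster, area)])
    return now - deleted_at <= window


if __name__ == "__main__":
    now = datetime(2020, 1, 10, 12, 0)
    deleted = now - timedelta(days=3)
    print(is_probably_recoverable("lilac", "/warm", deleted, now))
    print(is_probably_recoverable("luna", "non-home", deleted, now))
```

With the example dates, a file deleted three days earlier is inside the 4-day Lilac window but outside the 2-day Luna window.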

...