Use of MSK HPC resources assumes a basic working knowledge of Unix/Linux operating systems. If you are a novice and would like additional assistance getting started, please let us know and we’ll do our best to provide the basic training needed to use our compute resources fairly and efficiently.
- Access is via SSH only (see the login sketch after this list).
- No Protected Health Information (PHI) is allowed on any HPC system currently.
- Users must have a working knowledge of Linux and high performance computing, and be mindful of the impact of their jobs on the usability of compute resources for others. Users who negatively impact compute resources will be restricted, have their jobs terminated, or lose access.
- We strictly require all users to have their own accounts and do not allow users to share their logins.
- Users should transfer their data onto the cluster, transfer results off, and clean up when they are finished (see the transfer sketch after this list). Cluster storage is for active working data only. Please talk to SKI Computing about storage of results and long-term archiving of data.
- All compute jobs must run on the compute nodes through the LSF scheduler (see the submission sketch after this list). Computations found running on login nodes will be terminated without notice, and repeat offenders will be restricted or removed. SSH access to compute nodes is enabled for all users by default for process monitoring, log inspection, and debugging; it is NOT intended for running or testing applications outside of the scheduler, which puts the system at risk and affects all other users. Users will be warned initially, but repeat offenders will lose SSH access to the nodes. Use interactive sessions for such activity instead. If direct compute access outside of a scheduler is absolutely required, please submit a request and you will be guided to more appropriate compute servers.
- We require notification when users leave MSK so we can terminate accounts and free the space for other users (see “Closing Accounts” below).
- For other systems, all management and consultation are billed in addition to the Laboratory Subscription Fee.
- Please report any problems to
- When requesting an account, users must provide an active email address. If at any point the email address on file is no longer active, the account will be deactivated. It is the user’s responsibility to keep their email address current by emailing
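For reference, a minimal login sketch. The hostname below is a placeholder, not an actual cluster address; use the login node address provided when your account is created.

```bash
# Connect to a cluster login node over SSH (the only supported access path).
# "lilac-login.example.org" is a placeholder hostname.
ssh <username>@lilac-login.example.org
```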
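A sketch of the transfer-and-cleanup workflow described above. The hostname and paths are illustrative only, not actual cluster locations.

```bash
# Stage input data onto cluster storage (hostname and paths are placeholders).
rsync -avP ./my_dataset/ <username>@lilac-login.example.org:/data/<group>/my_dataset/

# Pull results back off when the analysis is finished...
rsync -avP <username>@lilac-login.example.org:/data/<group>/my_dataset/results/ ./results/

# ...and clean up, since cluster storage is for active working data only.
ssh <username>@lilac-login.example.org 'rm -rf /data/<group>/my_dataset'
```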
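And a minimal LSF submission sketch. `bsub` is the standard LSF submission command, but the job name, script, resource values, and memory units below are illustrative; site defaults may differ.

```bash
# Submit a batch job to the compute nodes through the LSF scheduler:
# 4 slots, a 2-hour runtime limit, a memory reservation, and an output
# file named with the job ID (%J).
bsub -J myjob -n 4 -W 02:00 -R "rusage[mem=4]" -o myjob.%J.out ./run_analysis.sh

# For testing and debugging, request an interactive session on a compute
# node instead of SSHing to a node directly.
bsub -Is -n 1 -W 01:00 /bin/bash
```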
Data Backup
- Lilac Cluster:
- /home is replicated locally but not backed up. Daily snapshots are taken and retained on a 4-day rolling window.
- /data/<group> is replicated locally. No backups are taken; snapshots are retained on a 4-day rolling window.
- /warm is inexpensive, non-replicated storage. It is not backed up, and snapshots are retained on a 4-day rolling window.
- There is NO offsite backup of non-home data, and no recoverability in case of catastrophic failure.
- Juno Cluster:
- /home is duplicated to an off-site replica. In the event of a complete failure of the primary storage array, current /home data can be recovered from the replica.
- Other data is snapshotted on-site, and deleted files can be recovered for up to 2 days; after 2 days, they are unrecoverable.
- There is NO offsite backup of non-home data, and no recoverability in case of catastrophic failure.
Closing Accounts
Please complete the account close request form to notify us of your intent to close your HPC account.
An account is not considered closed until all data has been removed from /home. As long as data remains in a user’s /home directory, the user’s PI will continue to be charged for that account and /home quota.
It is the responsibility of the user to delete their /home data; the HPC group does not delete user data. If a user is unable to move or delete data from their /home directory, the HPC group can reassign files to another user or share them with the group. Please note any request to reassign files in the account close request form.