Condo HPC Service

The information on this page is largely superseded by our new service structure, but is being preserved for posterity.

Overview

With the CRC's condo service, a Principal Investigator (PI) invests in compute and storage hardware that is placed in the CRC data center and added to the HPC cluster. Participants in the condo service share unused portions of the resource with each other and with non-invested users (such as students or occasional users) who do not pay a fee for access.

Condo membership is granted to a PI who buys compute nodes to plug into existing infrastructure. Unused portions of these resources (i.e., compute cycles) are shared among other condominium users until the primary user requires access to the resource. A queue management system gives vested PIs top priority on the hardware they have purchased, for a defined period (described below), whenever the PI needs the resource. We use wall-time limits and/or preemption to interrupt other users' jobs as needed to give vested PIs access to their share.
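As an illustration only, the priority behavior described above can be sketched in Python. This is not the CRC's scheduler configuration: the `Job` class, the `jobs_to_preempt` function, and the "largest guest job first" ordering are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class Job:
    job_id: str
    user: str
    cores: int


def jobs_to_preempt(running, owner_group, node_cores, cores_needed):
    """Return the guest jobs to interrupt so that an owner job needing
    `cores_needed` cores can start on a condo node with `node_cores` cores.

    Hypothetical policy for illustration; the real scheduler may differ.
    """
    free = node_cores - sum(job.cores for job in running)
    # Only jobs from users outside the owning PI's group can be interrupted.
    guests = [job for job in running if job.user not in owner_group]
    victims = []
    # Assumed ordering: interrupt the largest guest jobs first.
    for job in sorted(guests, key=lambda j: j.cores, reverse=True):
        if free >= cores_needed:
            break
        victims.append(job)
        free += job.cores
    if free < cores_needed:
        # The remaining cores are held by the owner's own jobs,
        # so interrupting guests would not help; preempt nothing.
        return []
    return victims


running = [Job("a", "guest1", 4), Job("b", "alice", 4), Job("c", "guest2", 2)]
victims = jobs_to_preempt(running, owner_group={"alice"}, node_cores=12, cores_needed=6)
print([job.job_id for job in victims])
```

In this sketch a job belonging to the owning group is never interrupted, which mirrors the statement that vested PIs have top priority on their own hardware.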

Common infrastructure elements such as the environmentally regulated data center, network connectivity, equipment racks, management and technical staff, etc. allow the PI to focus time and energy on research.

Benefits

  • The condo HPC model presents researchers with much greater flexibility and power, coupled with greatly reduced overhead and management requirements as compared to owning and operating individual, standalone clusters.
  • Access to Condominium and Free hardware within the CRC cluster
  • Enhanced security through restricted physical access

Buy In

New hardware purchases

There are two account options for new hardware purchases: one in which the PI retains ownership, and one in which the CRC becomes the owner.

PI Ownership

A PI will have priority access to any hardware purchased for the period of the hardware's standard warranty. During this time, any hardware problems will be corrected as soon as possible given the warranty terms. After the warranty has expired, the PI will have standard shared access to the compute nodes, and the hardware will be supported on a Best Effort basis until it suffers complete failure. Once a node has reached End of Life due to failure or obsolescence, it will be removed from service and either returned to the PI or surplussed. Hardware can be returned to the PI at any time.

PIs will receive Condominium Accounts for each member of their research group, each of which will have priority scheduling on the PI's condo compute nodes. Condominium Accounts are described below. If the PI chooses to pay for additional Standard or Satellite accounts, those can also have priority scheduling on the condo hardware at the PI's request.

CRC Ownership

PIs will be given Standard or Satellite accounts of equivalent value to the hardware purchased (based on current rates, subject to annual updates), and those accounts will have priority scheduling on the PI's condo compute nodes for the duration of the warranty period. The CRC will correct any hardware problems as soon as possible given the warranty terms. At the end of the warranty period, the CRC will maintain the condo compute nodes on a Best Effort basis until the hardware suffers systemic failure. Hardware cannot be returned to the PI.

Minimum specifications for nodes:

  • Rack chassis (<= 4U) with rails
  • Processor physical cores: 8 (with hyperthreading), 12 (no hyperthreading)
  • System RAM: 64 GB
  • Boot hard drive: >= 120 GB (mirrored boot hard drives preferred)
  • Storage hard drives combined capacity: >= 8 TB
  • Networking: 2x 1 GbE Ethernet (10 GbE SFP+ Ethernet preferred)

Additional optional hardware:

  • GPU: Nvidia 1080 Ti or better

Please contact the CRC for assistance with hardware specifications before placing an order. We will not be responsible for misconfigured servers that we did not assist in ordering.
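As a rough pre-order sanity check, the minimum specifications for new nodes listed above could be encoded as follows. This is a minimal sketch, not a CRC tool: `NodeSpec`, its field names, and `spec_problems` are hypothetical, and passing this check does not replace contacting the CRC before ordering.

```python
from dataclasses import dataclass


@dataclass
class NodeSpec:
    rack_units: int          # chassis height in U
    has_rails: bool
    physical_cores: int
    hyperthreading: bool
    ram_gb: int
    boot_drive_gb: int
    storage_tb: float        # combined capacity of storage drives
    ethernet_ports: int      # ports at 1 GbE or faster


def spec_problems(node):
    """Return a list of ways `node` misses the minimums for new condo nodes."""
    problems = []
    if node.rack_units > 4 or not node.has_rails:
        problems.append("chassis must be 4U or smaller and include rails")
    # 8 physical cores are enough with hyperthreading, 12 without.
    min_cores = 8 if node.hyperthreading else 12
    if node.physical_cores < min_cores:
        problems.append(f"needs at least {min_cores} physical cores")
    if node.ram_gb < 64:
        problems.append("needs at least 64 GB of system RAM")
    if node.boot_drive_gb < 120:
        problems.append("boot drive must be at least 120 GB")
    if node.storage_tb < 8:
        problems.append("storage drives must total at least 8 TB")
    if node.ethernet_ports < 2:
        problems.append("needs at least two 1 GbE Ethernet ports")
    return problems


candidate = NodeSpec(rack_units=2, has_rails=True, physical_cores=8,
                     hyperthreading=False, ram_gb=32, boot_drive_gb=240,
                     storage_tb=16, ethernet_ports=2)
for problem in spec_problems(candidate):
    print(problem)
```

The example candidate fails on two counts: it has 8 cores without hyperthreading (12 required) and only 32 GB of RAM.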

Previous hardware purchases

If a PI has existing compute or storage hardware and is interested in adding it to the CRC cluster, there are two alternatives:

1) The PI retains ownership and maintains the hardware. In this case, the PI (and those they deem appropriate) will have priority access to the server(s) for as long as they fund hardware maintenance costs. These costs include things such as replacing failed components (hard drives, power supplies, etc.). PIs (and other members of their research group) will be granted Condominium accounts.

2) The PI transfers ownership of the hardware to the CRC. In this case, the CRC will fund hardware maintenance on a Best Effort basis, replacing ancillary components such as hard drives, fans, etc. The PI will not have priority access, but will be granted Condominium accounts.

The hardware must meet these minimum specifications:

  • Rack chassis (<= 4U) with rails
  • Processor physical cores: 8
  • System RAM: 64 GB
  • Boot hard drive: 120 GB
  • Networking: 2x 1 GbE Ethernet

Condominium Accounts

Condominium Accounts are granted to PIs and to those individuals (students, faculty, staff) for whom a condominium PI requests access to their compute nodes. These accounts have the following privileges and limits:

  • Access only to Condominium and Free hardware, which includes condominium nodes and fully depreciated CRC hardware.
  • Access to the partition associated with that PI's condo nodes (if the PI has priority access to nodes).
  • Access to the 'volatile' partition on the cluster, which includes all condominium nodes. Jobs running in the volatile partition are subject to preemption by Condo account holders. Maximum of 10 concurrent jobs.
  • Access to the 'free' partition on the cluster.
  • Maximum of 1 TB data usage. Data is stored redundantly, but not backed up. No file recovery services offered. Condominium users who also have Standard or Satellite accounts do not have these limitations (data is backed up). Condominium accounts' data can be backed up if the PI provides hardware with such capability.

Free Accounts

Members of the UI research community who would like access to the CRC compute resources for academic purposes will be provided a Free account at no charge. PIs, postdocs, and graduate students can submit account requests directly; undergraduate students must be sponsored by a faculty member or instructor. Undergraduate accounts will automatically terminate at the end of each academic semester unless otherwise requested. All Free accounts have the following restrictions:

  • Access only to Free hardware - which includes fully depreciated CRC hardware. Standalone servers: jayne, slartibartfast
  • Access only to the 'free' partition on the cluster. Jobs running in the free partition are subject to preemption by Condo account holders. Maximum of 10 concurrent jobs.
  • Maximum of 100 GB data usage. Data is stored redundantly, but not backed up. No file recovery services offered.
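The partition access and limits for Condominium and Free accounts can be summarized as a small lookup table. The sketch below is illustrative only: the names `PARTITION_JOB_CAPS`, `QUOTA_GB`, `can_submit`, and `within_quota` are hypothetical. It assumes that the 10-job maximum applies to the partition it is listed with, and that 1 TB means 1000 GB. It does not model the exemption for Condominium users who also hold Standard or Satellite accounts.

```python
# Concurrent-job caps per partition; None means this page states no cap.
PARTITION_JOB_CAPS = {
    "condominium": {"volatile": 10, "free": None},
    "free": {"free": 10},
}

# Data usage limits in GB (assumption: 1 TB is treated as 1000 GB).
QUOTA_GB = {"condominium": 1000, "free": 100}


def can_submit(account_type, partition, running_in_partition, pi_partition=None):
    """Return True if one more job may be submitted to `partition`.

    `pi_partition` is the (hypothetical) name of the partition tied to the
    PI's own condo nodes, if the PI has priority access to any.
    """
    caps = dict(PARTITION_JOB_CAPS[account_type])
    if account_type == "condominium" and pi_partition is not None:
        caps[pi_partition] = None  # no cap is stated for the PI's own partition
    if partition not in caps:
        return False  # this account type has no access to the partition
    cap = caps[partition]
    return cap is None or running_in_partition < cap


def within_quota(account_type, usage_gb):
    """Return True if `usage_gb` is within the account type's data limit."""
    return usage_gb <= QUOTA_GB[account_type]


print(can_submit("free", "volatile", 0))          # Free accounts cannot use 'volatile'
print(can_submit("condominium", "volatile", 9))   # below the 10-job cap
print(within_quota("free", 150))                  # over the 100 GB limit
```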

List of Free Hardware

  • Standalone Servers: slartibartfast, jayne
  • Cluster nodes: n001-n016

List of Condominium Hardware

  • n106, n107

updated 2018-11-07

Previous version available here