Guidelines for users of BlueBEAR

This page outlines the registration process and the mechanisms that govern use of the BlueBEAR High Performance Computing (HPC) service. These mechanisms are designed to ensure fair distribution of BlueBEAR's large but finite resources across the whole user community. They will be revised periodically in light of users' experience with the cluster and as the quantity and variety of work grows.

The BlueBEAR management team welcomes comments about these mechanisms or any other aspect of the system at bearinfo@contacts.bham.ac.uk. Please note that problems and requests for support should be submitted through the IT Service Desk. (Using the Service Desk ensures that all calls are logged, prioritised, referred to the right person and followed up, and that statistics on problems and service requests are recorded and analysed.)

Service Provision

There are fundamentally two kinds of service that a large cluster such as BlueBEAR could provide:

  • Capability Computing, which is a service provided to a small number of users who place long-term, intensive demands on system resources
  • Capacity Computing, which is a service provided to a large and diverse user base, none of whom place long-term, intensive demands on system resources

BlueBEAR provides the latter: a Capacity Computing service.

Jobs on the cluster are controlled by a scheduling system in order to optimise the throughput of the service and ensure that the large but finite resources are effectively utilised. This means that there are some types of work, for example tasks that require repeated access to specific nodes, that are not appropriate for this service. IT Services may be able to offer alternative facilities for this type of work; contact the IT Service Desk in the first instance to discuss any such requirements. The scheduling system is configured to offer an equitable distribution of resources over time to all users, thereby fulfilling its remit to provide best value for the whole University in supporting the breadth of the University's research portfolio.

Service Registration

See the registration help page for information about registering for the BlueBEAR service.

Accessing the cluster

Access is provided via the Secure Shell (SSH) protocol. All users should log in to bluebear.bham.ac.uk, which is the single point of entry to the system. For the security of the service, access is limited to machines with a University IP address. The University provides a Remote Access Service that lets you connect to resources on the campus network from off-site, including BlueBEAR. (Please use the Service Desk to request remote access if you require it.) For those connecting from Windows machines there is more detailed information on using PuTTY and Exceed.

Compute Resources

All jobs are under the control of the Slurm scheduling system. The scheduling system implements a "fair-share" policy based on the requirements of users and system managers with locally-defined policies. These policies are determined by the Service Policy Committee and as such are subject to regular reviews.

The maximum time limit on a job (referred to as its walltime) is currently 10 days. On a large cluster such as this, failures of individual nodes are to be expected, and it is the responsibility of the user to make long-running jobs resilient against such failures, for example by writing restart files during calculations.
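
As an illustration of the restart-file approach, the following is a minimal sketch in Python and not a BlueBEAR-specific recipe. The function names, the JSON file format, the checkpoint interval and the toy calculation (summing integers) are all assumptions made for the example; a real application would save whatever state it needs to resume. The restart file is written to a temporary file and then renamed, so a failure during the write should not leave a half-written restart file behind.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to path via a temporary file and an atomic rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
        handle.flush()
        os.fsync(handle.fileno())
    # os.replace swaps the new file into place in a single step.
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or a fresh starting state if no restart file exists."""
    if os.path.exists(path):
        with open(path) as handle:
            return json.load(handle)
    return {"step": 0, "total": 0}


def run(path, n_steps, checkpoint_every=100, stop_after=None):
    """Sum the integers 0..n_steps-1, resuming from the restart file if present.

    stop_after is a hypothetical hook that simulates a node failure by
    abandoning the run once that step is reached.
    """
    state = load_checkpoint(path)
    while state["step"] < n_steps:
        if stop_after is not None and state["step"] >= stop_after:
            return None  # simulated failure: work since the last checkpoint is lost
        state["total"] += state["step"]
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state["total"]
```

If a job running this sketch is interrupted and resubmitted with the same restart file path, it repeats only the steps completed since the last checkpoint instead of starting again from the beginning. The checkpoint interval is a trade-off between the cost of writing the file and the amount of work lost on failure.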

The login nodes are provided so that users can prepare and submit jobs; they are not intended for running applications. Applications running on the login nodes will be terminated if their resource usage is excessive.

Applications

As is common on high performance computer systems, software applications are provided by means of Environment Modules. This allows us to provide multiple applications, as well as multiple versions of these applications, without them conflicting. A list of the available commercial and open source applications (including information on how to load them and the support levels we offer) can be found on the Applications pages. Requests for new applications, alternative versions of existing applications, or additional software should be directed to the IT Service Desk.

Some applications are provided by default by the Linux operating system on each node and can be used without loading a corresponding module. Please note, however, that the versions of these applications tend to be relatively old, and in some cases we provide modules that allow you to run newer versions; git is a prime example of this.

For commercial software applications IT Services may provide a limited number of licences so that they are available to all users of the service. It is important to understand that, with the limited funds available, IT Services can only provide a limited number of licences and cannot guarantee availability of these licences at any specific time. If guaranteed availability is required, for example for a long-term or time-critical project, IT Services recommend that users purchase their own licences for use on this system; these can then be reserved for nominated users.

User Developed Code

A range of compilers, libraries and software development tools is provided to allow users to write their own code to run on the system. For those developing their own code, please see the information on the FOSS and IOMKL toolchains.

Storage Space

See the BlueBEAR Storage page for information about the allocation of disk space and for information about mapping BlueBEAR storage to other computers.

Data Policies

Users are reminded that they must adhere to the Campus Conditions of Use, Data Protection legislation and the University of Birmingham's Security Policy. 

Operational Status

A summary of the operational state of the BlueBEAR cluster will always be available on the IT Services status page. This also includes any scheduled maintenance or downtime.

Scheduled Downtime

IT Services reserve the right to take the cluster down for maintenance during the JANET at-risk period (Tuesdays, 07:00 - 09:00) and during major maintenance windows for the datacentre, of which there are usually three per year. Scheduled downtime dates will be published well in advance. The JANET at-risk slots are very rarely used.

Contacting Us

Problems, requests and queries should all be directed to the IT Service Desk.


Last modified: 20 March 2019