How to use DKRZ facilities

Which system to use for what?

Workflows in climate modelling research are complex and comprise, in general, a number of different tasks, such as model formulation and development (including debugging, platform porting, and performance optimization), generation of input data, performing model simulations, postprocessing, visualization and analysis of output data, long-term archiving of the data, and documentation and publication of results. The DKRZ hardware and software infrastructure is designed to accomplish these tasks efficiently. The graphic below gives a schematic overview of the DKRZ systems.

[Figure: schematic overview of the DKRZ systems]

  1. For a more detailed description of the different systems shown in the picture and basic software installed on these systems click here.
  2. Below we describe the basic facts you should know before starting to use the systems:

Login

To start working at DKRZ you need to log in to one of our computers:

  • blizzard is the main "work horse" of DKRZ,
  • passat is an additional system for serial jobs and interactive usage,
  • lizard and halo are mainly used for pre- and post-processing of data and high-end visualization, respectively.

All three servers are directly connected to the same "GPFS" file system (see below).

  • tornado is mainly used for special code development and has its own filesystem.
        (Please take into account that tornado will be decommissioned in July 2012.)

Login to all systems is done via the Secure Shell (ssh) network protocol:

ssh -X <userid>@<system>.dkrz.de

where <userid> stands for your login name provided by the master user or DKRZ Beratung and <system> for one of blizzard, lizard, halo or tornado.
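
As a minimal sketch of the login pattern above, the following helper only prints the ssh command for a given user id and system; it does not open a connection. The function name dkrz_login_cmd and the user id k123456 are hypothetical and serve purely as illustration.

```shell
# Print the ssh command for one of the systems named above.
# Usage: dkrz_login_cmd <userid> <system>
dkrz_login_cmd() {
    userid=$1
    system=$2
    case "$system" in
        blizzard|lizard|halo|tornado)
            printf 'ssh -X %s@%s.dkrz.de\n' "$userid" "$system"
            ;;
        *)
            echo "unknown system: $system" >&2
            return 1
            ;;
    esac
}

# Example with the hypothetical user id "k123456":
dkrz_login_cmd k123456 blizzard
# prints: ssh -X k123456@blizzard.dkrz.de
```

The -X option requests X11 forwarding, which is what allows graphical programs started on the remote system to display on your local screen.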

Code development and compiling

On blizzard and tornado, after logging in you find yourself on a "login node". The login nodes serve as an interface to the compute nodes of the cluster and, on blizzard, also to a test node (p249). Login nodes are intended for file editing and compilation of source code, as well as for submission, monitoring and canceling of batch jobs. They can also be used for simple serial processing tasks that are neither time- nor memory-intensive. Routine data pre- and post-processing and visualization, however, must be performed in batch mode on the compute nodes or interactively on the Linux machine lizard or the visualization server Halo. For interactive testing and debugging of parallel jobs we provide the test node p249. Access to p249 is granted upon request to Beratung. All of these machines (login as well as compute nodes) except tornado share the same file system ("GPFS"), so you can access the same data from each of them without copying any files.

To maintain different software versions, DKRZ employs the module environment. It is worthwhile to familiarize yourself with the module command to make your interactive work more comfortable. The most important module sub-commands are avail, list, load, unload, switch and show. For example, to get a list of all available modules type:

module avail
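
The other sub-commands follow the same pattern. The sketch below walks through a typical sequence; the module names examplelib/1.0 and examplelib/2.0 are hypothetical placeholders, and the run wrapper with its DRY_RUN switch is an illustrative helper (not part of the module environment) so that the sequence can be inspected on a machine where the module command is not installed.

```shell
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 on a
# DKRZ login node, where the module command exists, to execute them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# "examplelib/1.0" and "examplelib/2.0" are hypothetical module names.
run module avail                                  # all available modules
run module list                                   # currently loaded modules
run module load examplelib/1.0                    # load a module
run module show examplelib/1.0                    # show what the module changes
run module switch examplelib/1.0 examplelib/2.0   # replace one version by another
run module unload examplelib/2.0                  # unload it again
```

Use the names reported by module avail in place of the placeholders.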

Model production

In contrast to the login nodes, the compute nodes are only accessible via the job scheduling system and are reserved for production jobs. Job submission is accomplished using a job command file (see Batch Jobs for blizzard and Batch Jobs for tornado). The job command file is a shell script containing the model run script and directives that specify the resources required for the job. You should carefully read the section on Batch Jobs, since an appropriate setup of the resource requirements might decrease the idle time until your job starts executing.

All file systems available on the login nodes are also visible on the compute nodes. However, the environment can be somewhat different.

Storage management (GPFS Filesystem)

DKRZ uses a storage approach based on three levels; see the detailed description of the parallel file system GPFS on blizzard and the detailed description of the parallel file system Lustre on tornado:

  • HOME, i.e. /pf/<first letter of user id>/<user>, is the file system you log into. It is backed up and should be used to store source code, scripts, and important results.
  • SCRATCH, i.e. /scratch/<first letter of user id>/<user>, is the file system available for temporary use. To prevent the file system from overflowing, old data is deleted automatically; the granted retention period is 14 days. In the interest of others please use less than 20000 GB.
  • WORK, i.e. /work/<project>, is project-based (see the section "Resources management" below) and provides disk space for large amounts of data, but it is not backed up. It can be used, e.g., for writing raw model output and for processing data accessible to all project members.
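
The path conventions listed above can be sketched as a small shell helper. The function names dkrz_paths and old_files, the user id k123456 and the project name ab0123 are hypothetical. The old_files helper additionally assumes that the file modification time is a reasonable proxy for the age that the automatic SCRATCH clean-up considers, which this page does not state.

```shell
# Print the three storage locations for a user id and a project name.
dkrz_paths() {
    user=$1
    project=$2
    first=$(printf '%s' "$user" | cut -c1)   # first letter of the user id
    echo "HOME=/pf/$first/$user"
    echo "SCRATCH=/scratch/$first/$user"
    echo "WORK=/work/$project"
}

# List regular files below directory $1 that were last modified more than
# $2 days ago. Assumption: modification time approximates the age used by
# the automatic clean-up; the actual criterion is not documented here.
old_files() {
    find "$1" -type f -mtime +"$2"
}

# Hypothetical user id and project name:
dkrz_paths k123456 ab0123

# On a DKRZ system one might then look for SCRATCH files approaching the
# 14-day limit, for example:
# old_files /scratch/k/k123456 10
```

Files reported by old_files that are still needed should be moved to WORK or to the tape archive before they are removed.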

For long-term storage please use the tape archive HPSS (see below).

Pre- and post-processing including visualization

For data pre- and post-processing, the Linux machine lizard and the visualization server Halo are available. Access to these machines is via ssh. They provide a large set of analysis and visualization software, e.g. MATLAB, Mathematica, FERRET, NCL, and R.

The Linux visualization cluster Halo offers a comfortable computing environment for processing and visualization of large amounts of data. The Halo system is primarily designed for interactive 3D visualization of your results. Its resources can be requested and checked via the reservation system.

Archiving (HPSS tape archive)

For backup, storage, and long-term archiving of your data, a data archive is available. The connection to this HPSS tape archive is established via the File Transfer Protocol (pftp) within the DKRZ/ZMAW network. External access from other sites is granted read-only via the machine "xtape" using sftp or gridftp.

Resources management

Project-oriented resources management is implemented at DKRZ. For this purpose all users are assigned to one (or more) projects. To check which projects you belong to, use our DKRZ online services. After logging in with your DKRZ <userid> and <password> you can change your password or login shell, edit your user profile, check resources usage, and manage projects (project administrators only).
If you belong to more than one project, choose the one to be charged.

The resources available within each project (CPU time, disk space in the parallel file systems, and storage space in the HPSS tape archive) are allocated yearly by the Scientific Steering Committee.

Service and support

DKRZ staff are pleased to support you at all stages of your work, such as porting your applications to the "blizzard" or "tornado" clusters, performance analysis and optimization, visualization, data reorganization in the HPSS tape archive, etc. Furthermore, every year DKRZ holds a number of introductory and advanced courses on the usage of DKRZ facilities, parallel programming with MPI and OpenMP, and visualization. Please do not hesitate to contact us.
