

Tape Archive - HPSS filesystems (High Performance Storage System)

For details on the new Hierarchical Storage Management (HSM) System, please take a look into

A time schedule for the migration from HPSS to HSM is given in the FAQ.


All relevant data created and post-processed on DKRZ systems will eventually be stored in our on-site and off-site tape archives. The tape archive performs best if data is appropriately packaged before uploading. See Preparing Data for Archiving for more information on bundling and compression.
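
As an illustration of the packaging step, the following is a minimal sketch that packs one directory tree into a single tar file. It assumes a standard tar (GNU or BSD); the helper name bundle_dir and the paths in the usage comment are hypothetical and not part of any DKRZ tool.

```shell
# Pack one directory tree into a single tar file so that the archive
# receives one large file instead of many small ones.
# bundle_dir is a hypothetical helper, not a DKRZ tool.
bundle_dir() {
    src_dir=$1      # directory to pack
    archive=$2      # tar file to create
    parent=$(dirname "$src_dir")
    name=$(basename "$src_dir")
    # -C keeps the paths inside the archive relative to the parent directory
    tar -cf "$archive" -C "$parent" "$name" || return 1
    # list the archive once to confirm it is readable before uploading
    tar -tf "$archive" > /dev/null
}

# Usage (paths are placeholders):
#   bundle_dir "$SCRATCH/experiment_run" "$SCRATCH/experiment_run.tar"
```

The resulting tar file can then be uploaded as one file. Compression is left out here on purpose, since whether it pays off depends on the data.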

The location to which you upload your data determines the applicable "class of service":

/hpss/arch/<prj>
  • reference data (single copy on tape)
  • lifetime: 1 year after project expiration
  • preferred file size: 10 GB - 100 GB
  • maximum file size: 500 GB

/hpss/double/<prj>
  • like "arch", but with a second copy on a separate tape
  • the project's quota (arch) is charged with twice the amount of data
  • to move data from "arch" to "double", please contact DKRZ-Beratung
  • preferred file size: 10 GB - 100 GB
  • maximum file size: 500 GB

LTA (doku)
  • long-term archive for documentation data (with second copy)
  • data can be accessed by all registered DKRZ and CERA users
  • lifetime: up to 10 years after project expiration
  • preferred file size: 10 GB - 100 GB
  • maximum file size: 500 GB
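
The size limits listed above can be checked before an upload. The sketch below classifies a size given in bytes; it assumes that 1 GB means 10^9 bytes, which this page does not state, and classify_size is a hypothetical helper name.

```shell
GB=1000000000   # assumption: 1 GB = 10^9 bytes; the unit is not defined on this page

# Classify a file size (in bytes) against the preferred range (10-100 GB)
# and the maximum (500 GB). classify_size is a hypothetical helper name.
classify_size() {
    bytes=$1
    if [ "$bytes" -gt $(( 500 * GB )) ]; then
        echo "too_large"          # above the 500 GB maximum
    elif [ "$bytes" -gt $(( 100 * GB )) ]; then
        echo "above_preferred"    # allowed, but larger than preferred
    elif [ "$bytes" -lt $(( 10 * GB )) ]; then
        echo "below_preferred"    # allowed, but smaller than preferred
    else
        echo "preferred"
    fi
}

# Usage: classify_size $(wc -c < my_data_file)
```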

Please note:
If you try to write a file larger than 500 GB, it might be truncated. The smallest accounting unit is 1 GB, so the charge for each file is rounded up to the next full GB.
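
The rounding rule can be written as a small calculation. This is only a sketch: it assumes 1 GB = 10^9 bytes and that an empty file is also charged 1 GB, neither of which is confirmed here; charged_gb is a hypothetical name.

```shell
GB=1000000000   # assumption: 1 GB = 10^9 bytes

# Accounted size in GB for a single file: round up to the next full GB,
# with a minimum of 1 GB.
charged_gb() {
    bytes=$1
    gb=$(( (bytes + GB - 1) / GB ))      # integer ceiling division
    if [ "$gb" -lt 1 ]; then gb=1; fi    # assumption: empty files count as 1 GB
    echo "$gb"
}

charged_gb 1             # a 1-byte file, prints 1
charged_gb 1500000000    # a 1.5 GB file, prints 2
```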

Your project's file list in HPSS tape archive

A daily updated file called "_PROJECT.<prj>.file-list.GIGA", containing a list of all files stored in your project, is located in /hpss/arch/<prj> or in /hpss/double/<prj> and can be fetched from HPSS via pftp.

To get an overview of your project's space consumption in the HPSS archive, use our DKRZ online services.

Access via pftp

The tape archive is accessed with the pftp client program. It is available on mistral and on several other servers on campus.
Kerberos is the preferred authentication method. Legacy password authentication is also supported.
After changing to your project directory /hpss/arch/<prj>/... you can upload files to the tape archive with the put and mput commands. Downloads are initiated with get or mget.
The pftp command is currently only a stub; it will be expanded gradually and more options will be added.

Please note that you should avoid pipes when using pftp.

Example of getting a data file from HPSS (uploading works the same way with put):
mlogin100$ cd $SCRATCH
mlogin100$ pftp
Connected to ...
ftp> cd /hpss/arch/<prj>/path/to/mydir
ftp> ls             # show files on HPSS directory
ftp> get my_data_file
ftp> !ls            # show files on local machine (mistral)
ftp> quit
221 Goodbye.

mlogin100$ ls

Recursive operations with pftp

pftp allows certain operations to work recursively. These operations are mget, mput, and mdelete. For instance, with recursive mput you can put an entire directory tree with its files into the archive:

ftp> prompt
Interactive mode off.
ftp> recursive mput mydirectory
257 MKD command successful.
257 MKD command successful.

Do not upload large numbers of small files. Remember that each file is accounted with at least 1 GB no matter how small.
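
Before a recursive upload, it can help to count how many files in the directory tree are smaller than 1 GB, since each of them would be accounted as a full GB. The following sketch assumes 1 GB = 10^9 bytes and uses a hypothetical helper name, count_small_files; file names containing newlines are not handled.

```shell
# Count regular files below 1 GB in a directory tree.
# Assumes 1 GB = 10^9 bytes; count_small_files is a hypothetical helper.
count_small_files() {
    # -size -Nc matches files smaller than N bytes (no rounding to larger units);
    # tr removes the padding that some wc implementations add
    find "$1" -type f -size -1000000000c | wc -l | tr -d ' '
}

# Usage: count_small_files mydirectory
```

If the count is large, bundling the directory into one tar file first reduces both the number of files and the accounted volume.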

In a similar fashion, you can also delete an entire directory recursively in the archive:

ftp> recursive mdel mydirectory
250 DELE command successful.
250 RMD command successful.
250 DELE command successful.

Be careful in non-interactive mode not to delete files you still need. There is no backup for files in the archive.

Passwordless access

Kerberos is the preferred way to use pftp in a job script.
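
A job script may want to stop early if no valid Kerberos ticket is available. The sketch below relies on klist -s, which prints nothing and reports the ticket state through its exit status; the helper name have_valid_ticket is hypothetical, and how tickets are obtained or renewed at DKRZ is not covered here.

```shell
# Return 0 if a valid Kerberos ticket is present, non-zero otherwise.
# have_valid_ticket is a hypothetical helper name.
have_valid_ticket() {
    # return 2 if the klist command is not available at all
    command -v klist > /dev/null 2>&1 || return 2
    # klist -s is silent; its exit status is 0 only for a readable, unexpired ticket cache
    klist -s
}

# Usage inside a job script:
#   have_valid_ticket || { echo "no valid Kerberos ticket" >&2; exit 1; }
```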

Simplified packing and file transfer via packems

Packing and archiving multiple data files to HPSS, as well as retrieving them, is simplified by the package "packems". It consists of four command line programs:

  • a pack-&-archive tool "packems",
  • a list-archived-content tool "listems",
  • a retrieve-&-unpack tool "unpackems" and
  • a Kerberos-ticket-status-checker "tapeinit".

Internally, packems and unpackems generate Makefiles, which contain calls of pftp and tar. tapeinit masks calls of klist and kinit. Authentication via Kerberos has to be enabled. 

The packems package was a joint development of the MPI-M and the DKRZ. It is available to all DKRZ users via the module "packems" on Mistral. The full documentation and exemplary command line calls are provided in the MPI-M Redmine:

Note: The packems package was particularly developed for the usage with the HPSS currently used at DKRZ. To account for the higher archiving demands expected for the new supercomputer HLRE-4 "Levante" (expected to be operational in late 2021), HPSS will be replaced by a new Hierarchical Storage Management (HSM) System in the second quarter of 2021, which will offer new (and different) command line tools and features. However, retrieval and unpacking of data archived with packems will be maintained for the new HSM. Details on the time schedule of the HSM migration are given in our HSM FAQ.