Tape Archive - HPSS filesystems (High Performance Storage System)

All relevant data created and post-processed on DKRZ systems will eventually be stored in our on-site and off-site tape archives. The tape archive delivers its best performance if data is appropriately packaged before uploading. See Preparing Data for Archiving for more information on bundling and compression.
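As a rough illustration of packaging before upload, the following sketch bundles a directory into a single tar file using Python's standard tarfile module. The function name bundle_directory and its parameters are hypothetical and not part of any DKRZ tooling; whether compression pays off depends on your data.

```python
import tarfile
from pathlib import Path


def bundle_directory(src_dir, archive_path, compress=False):
    """Pack src_dir (recursively) into one tar file and return its size in bytes.

    Hypothetical helper for illustration only; not a DKRZ-provided tool.
    """
    src = Path(src_dir).resolve()
    # "w" writes a plain tar file, "w:gz" a gzip-compressed one.
    mode = "w:gz" if compress else "w"
    with tarfile.open(archive_path, mode) as tar:
        # Store paths relative to the directory name, not as absolute paths.
        tar.add(src, arcname=src.name)
    return Path(archive_path).stat().st_size
```

A call such as bundle_directory("run_output", "run_output.tar") would produce one file that can then be uploaded with put; the directory and file names here are placeholders.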

The location to which you upload your data determines the applicable "class of service":

/hpss/arch/<prj>
    reference data (single copy on tape)
    lifetime: 1 year after project expiration
    preferred file size: 10 GB - 100 GB
    maximum file size: 500 GB

/hpss/double/<prj>
    like "arch", but with a second copy on a separate tape;
    the project's quota (arch) will be charged with twice the amount of data
    preferred file size: 10 GB - 100 GB
    maximum file size: 500 GB
    If you want to move data from "arch" to "double", please contact DKRZ-Beratung.

LTA (doku)
    long-term archive for documentation data (with second copy);
    data can be accessed by all registered DKRZ and CERA users
    lifetime: up to 10 years after project expiration
    preferred file size: 10 GB - 100 GB
    maximum file size: 500 GB

Please note:
If you try to write a file larger than 500 GB, it might be truncated. The smallest accounting unit is 1 GB, so the charge for each file is rounded up to the next full GB.
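The accounting rule can be sketched as follows. This is only an illustration under stated assumptions: it takes 1 GB to mean 10^9 bytes (the page does not say whether decimal or binary units are meant), and the function names are hypothetical.

```python
GB = 10**9                # assumption: decimal gigabytes, not GiB
MAX_FILE_SIZE = 500 * GB  # maximum file size stated above


def accounted_gb(size_bytes):
    """GB charged for one file: rounded up to a full GB, at least 1 GB."""
    if size_bytes < 0:
        raise ValueError("file size must not be negative")
    # Integer ceiling division avoids floating-point rounding problems.
    return max(1, (size_bytes + GB - 1) // GB)


def exceeds_limit(size_bytes):
    """True if a file is larger than the 500 GB maximum and may be truncated."""
    return size_bytes > MAX_FILE_SIZE


def total_accounted_gb(sizes):
    """Sum of the per-file charges for a collection of file sizes in bytes."""
    return sum(accounted_gb(s) for s in sizes)


small_files = [10 * 10**6] * 1000               # 1000 files of 10 MB each
print(total_accounted_gb(small_files))          # 1000
print(total_accounted_gb([sum(small_files)]))   # 10 (same data as one 10 GB file)
```

Under these assumptions, the same 10 GB of data is charged a hundred times more when stored as 1000 small files than as a single file. A real bundle would be slightly larger because of archive overhead.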

Your project's file list in HPSS tape archive

A file named "_PROJECT.<prj>.file-list.GIGA", updated daily, contains a list of all files stored in your project. It is located in /hpss/arch/<prj> or in /hpss/double/<prj> and can be fetched from HPSS via pftp.

To get an overview of your project's space consumption in the HPSS archive, use our DKRZ online services.

Access via pftp

The tape archive is accessed with the pftp client program. It is available on mistral and on several other servers on campus.
Kerberos is the preferred authentication method. Legacy password authentication is also supported.
After changing to your project directory /hpss/arch/<prj>/... you can upload files to the tape archive with the put and mput commands. Downloads are initiated with get or mget.
The pftp command is only a stub at the moment; it will be extended gradually and more options will be added.

Please note that you should avoid pipes when using pftp.

Example session for getting a data file from HPSS (use put instead of get to upload a file):
mlogin100$ cd $SCRATCH
mlogin100$ pftp
Connected to ...
ftp> cd /hpss/arch/<prj>/path/to/mydir
ftp> ls             # show files on HPSS directory
ftp> get my_data_file
ftp> !ls            # show files on local machine (mistral)
ftp> quit
221 Goodbye.

mlogin100$ ls

Recursive operations with pftp

pftp allows certain operations to work recursively. These operations are mget, mput, and mdelete. For instance, with a recursive mput you can put an entire directory tree with its files into the archive:

ftp> prompt
Interactive mode off.
ftp> recursive mput mydirectory
257 MKD command successful.
257 MKD command successful.

Do not upload large numbers of small files. Remember that each file is accounted as at least 1 GB, no matter how small it is.

In a similar fashion, you can also delete an entire directory recursively in the archive:

ftp> recursive mdel mydirectory
250 DELE command successful.
250 RMD command successful.
250 DELE command successful.

Be careful in non-interactive mode not to delete files you still need. There is no backup for files in the archive.

Passwordless access

Kerberos is the preferred way to use pftp in a job script. For the time being we also support .netrc files. Be aware that this method is inherently insecure because the password is stored in plain text.

The format of a ~/.netrc file is as follows, with one line per host:

machine <hostname> login <user> password <password>

This file must have 600 permissions (chmod 600 ~/.netrc). Special characters have to be escaped with a backslash or quoted.
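The following sketch shows one way to create such a file with the required permissions and to read it back with Python's standard netrc module. The helper names, the host name archive.example.org and the credentials are hypothetical placeholders. The sketch does not escape special characters, so it assumes a password without spaces, quotes or backslashes.

```python
import netrc
import os
import stat


def write_netrc(path, host, user, password):
    """Write a one-entry netrc file readable by the owner only.

    Hypothetical helper: it OVERWRITES any existing file at path and does
    not escape special characters in the password.
    """
    line = f"machine {host} login {user} password {password}\n"
    # Create the file with mode 600 from the start ...
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fh:
        fh.write(line)
    # ... and enforce it in case the file already existed with other modes.
    os.chmod(path, 0o600)


def has_safe_permissions(path):
    """True if the permission bits of path are exactly 600."""
    return stat.S_IMODE(os.stat(path).st_mode) == 0o600


def read_login(path, host):
    """Return (user, password) for host, or None if there is no entry."""
    entry = netrc.netrc(path).authenticators(host)
    if entry is None:
        return None
    login, _account, password = entry
    return login, password
```

For example, write_netrc(os.path.expanduser("~/.netrc"), "archive.example.org", "<user>", "<password>") would replace an existing ~/.netrc, so back up any entries you still need first.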