FAQs & known issues
- blizzard: How can I login to the system, change my password and login shell?
- blizzard: How can I check my disk quota?
- blizzard: How can I access my GPFS data from outside DKRZ/ZMAW ?
- blizzard: How can I choose which account to use, if I am subscribed to more than one project ?
- blizzard: Why doesn't my LoadLeveler job start running?
- blizzard: Why does my job wait so long before being executed?
- or: Why is my job being overtaken by other jobs in the queue?
- blizzard: How can I run a short MPI job using up to 4 nodes?
- blizzard: How can I see on which nodes my job was running, e.g. to tell this to beratung in case of errors?
- blizzard: How can I get a stack trace if my program crashes?
- HPSS: How can I get information about my project's data in the tape archive?
- HPSS: How can I use the HPSS tape archive without typing my password every time, e.g. in scripts or jobs?
- HPSS: How can I use HPSS archive in scripts/jobs?
blizzard: How can I login to the system, change my password and login shell?
Login to the system via:
ssh <user>@blizzard.dkrz.de
Change your password and/or login shell via DKRZ online.
blizzard: How can I check my disk quota?
Check your individual quota on the $HOME file system /pf:
wquota -u <uid> (e.g. wquota -u a201234)
and your project quota on the project file system $WORK:
wquota -j <prj> (e.g. wquota -j ab0123)
$SCRATCH usage can be checked here.
blizzard: How can I access my GPFS data from outside DKRZ/ZMAW ?
You can use either sftp ...
sftp <user>@blizzard.dkrz.de
or gridftp from blizzard ...
globus-url-copy file:///work/<prj>/<file> gsiftp://server.example.com/tmp/
or gridftp from outside:
globus-url-copy file:///tmp/testfile gsiftp://gridftp.dkrz.de/scratch/u/u123456
Detailed information: GridFTP on blizzard.
blizzard: How can I choose which account to use, if I am subscribed to more than one project ?
Just insert the following line into your job script:
#@ account_no = <Project> (e.g.: #@ account_no = ab0123)
Your default account is stored in the file $HOME/.acct. If you do not specify an account in your job script, your computer time consumption is charged to this default account.
blizzard: Why doesn't my LoadLeveler job start running?
llq -s <jobid>
gives information about status of jobs (may e.g. help if you want to find out why the scheduler does not start your job).
Hint: Check if tasks_per_node and task_affinity match:
#@ tasks_per_node = 64 requires #@ task_affinity = cpu(1)
#@ task_affinity = core(1) comes with #@ tasks_per_node = 32
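The hint above can be checked mechanically before submitting. The following is a minimal sketch, not a DKRZ tool: `check_affinity` is a hypothetical helper that reads the two keywords from a job script and reports whether they form one of the two pairings named in the hint. It assumes each directive sits on its own `#@ keyword = value` line without trailing comments. Other combinations may be perfectly legitimate; the sketch only knows the two from the hint.

```shell
# Hypothetical helper (not part of LoadLeveler): compare tasks_per_node and
# task_affinity in a job script against the two pairings from the hint.
check_affinity() {
    script=$1
    # Take the value after "=" from the last matching directive line
    # and strip all whitespace from it.
    tasks=$(sed -n 's/^#[[:space:]]*@[[:space:]]*tasks_per_node[[:space:]]*=[[:space:]]*//p' "$script" | tail -n 1 | tr -d '[:space:]')
    affinity=$(sed -n 's/^#[[:space:]]*@[[:space:]]*task_affinity[[:space:]]*=[[:space:]]*//p' "$script" | tail -n 1 | tr -d '[:space:]')
    case "$tasks/$affinity" in
        "64/cpu(1)"|"32/core(1)")
            echo "ok: tasks_per_node=$tasks matches task_affinity=$affinity" ;;
        *)
            # Not one of the two pairings from the hint; worth a second look.
            echo "check: tasks_per_node=$tasks with task_affinity=$affinity"
            return 1 ;;
    esac
}
```

Usage: `check_affinity myjob.cmd`, where `myjob.cmd` stands for your own job script.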
blizzard: Why does my job wait so long before being executed?
or: Why is my job being overtaken by other jobs in the queue?
First make sure your job can be executed at all.
If your job is valid, it may still be queued for a long time and/or be overtaken ...
- ... by jobs submitted later that have a higher priority (usually because they have used less of their share than your job).
- ... by jobs with lower priority that are sufficiently small and specify a wall clock limit that allows them to be considered for backfilling.
For a more detailed description of scheduling policies on blizzard please refer to blizzard -> scheduling.
blizzard: How can I run a short MPI job using up to 4 nodes?
Use job class 'express' by inserting this line into your job script:
#@ class = express
blizzard: How can I see on which nodes my job was running, e.g. to tell this to beratung in case of errors?
Insert this line into your Job script:
uniq $LOADL_HOSTFILE
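Note that `uniq` only collapses adjacent duplicate lines, so it gives a clean list only when identical host names follow each other in the file. A slightly more defensive variant is sketched below. `job_hosts` is a hypothetical helper name, and the sketch assumes that `$LOADL_HOSTFILE` contains one host name per line.

```shell
# Hypothetical helper: print every host of the current job exactly once.
# Assumes the host file lists one host name per line.
job_hosts() {
    # Use the first argument if given, otherwise fall back to $LOADL_HOSTFILE.
    hostfile=${1:-${LOADL_HOSTFILE:-}}
    if [ -z "$hostfile" ] || [ ! -r "$hostfile" ]; then
        echo "no readable host file (is LOADL_HOSTFILE set?)" >&2
        return 1
    fi
    # sort -u also removes duplicates that are not adjacent.
    sort -u "$hostfile"
}
```

Inside a job script, calling `job_hosts` without an argument reads `$LOADL_HOSTFILE`; the output ends up in the job's standard output file.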
blizzard: How can I get a stack trace if my program crashes?
The official way to find the location where your program crashed is to run it in a debugger like dbx or inspect a core file with the debugger. A quick way to get the stack trace without the need for a debugger is to insert the following lines at the beginning of your program:
include 'fexcp.h'
call signal(11, xl__trce)
Then compile your program with
-g
or
-qlinedebug
If you want to know the absolute path of the source files mentioned in the trace then add the option
-qfullpath
to your compiler options.
If you now run your program and it crashes because of a segmentation violation, it will generate output like this:
$ ./tracetest
Signal received: SIGSEGV - Segmentation violation
Traceback:
Offset 0x00000080 in procedure dostuff, near line 24 in file tracetest.f90
Offset 0x0000004c in procedure tracetest, near line 12 in file tracetest.f90
--- End of call chain ---
Real debuggers will allow you to get much more information in case the problem is not easily identified.
HPSS: How can I get information about my project's data in the tape archive?
A list of all files is located in your project's home directory /hpss/arch/<prj>.
To get an overview of your project's quota and consumption of space in HPSS tape archive use our DKRZ online services.
HPSS: How can I use the HPSS tape archive without typing my password every time, e.g. in scripts or jobs?
Create a file called .netrc in your HOME directory that only you can read and write:
blizzard1% ls -la $HOME/.netrc
-rw------- ..... /home/dkrz/<uid>/.netrc
blizzard$ cat $HOME/.netrc
machine tape login <uid> password <password>
Be aware that special characters like "/" or "," should be escaped with a backslash, e.g.
machine tape login u123456 password Null\,nix
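One way to create the file with the right permissions from the start is sketched below. `write_netrc` is a hypothetical helper, not a DKRZ tool. It takes the target path as an argument so it can be tried on a scratch file first, and it assumes the password you pass in is already backslash-escaped as described above.

```shell
# Hypothetical helper: write a one-line .netrc that only the owner can access.
# Arguments: target file, user id, password (already backslash-escaped).
write_netrc() {
    netrc=$1
    user=$2
    password=$3
    (
        # In this subshell, newly created files get no group/other permissions.
        umask 077
        printf 'machine tape login %s password %s\n' "$user" "$password" > "$netrc"
    )
    # Also tighten the mode in case the file existed before with looser rights.
    chmod 600 "$netrc"
}
```

Usage: `write_netrc "$HOME/.netrc" <uid> '<password>'`. Single quotes keep the shell from interpreting the backslashes. Note that this overwrites an existing file at that path.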
HPSS: How can I use HPSS archive in scripts/jobs?
Assuming you have a ~/.netrc file, you can use pftp in scripts as follows:
#!/client/bin/ksh
export DATA_DIR=/hpss/arch/<prj>/<mydir>
echo "DATA_DIR= $DATA_DIR"
export filename=test.txt
#
pftp << ENDE
cd $DATA_DIR
bin
dir
get $filename
quit
ENDE
#
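The script above does not check whether the transfer succeeded. A minimal sketch of such a check follows. `fetch_from_hpss` and the `PFTP_CMD` variable are hypothetical names introduced for this example; `PFTP_CMD` only exists so that the client command can be swapped out for a dry run. The sketch judges success purely by whether a non-empty local file exists afterwards, which is an assumption and not a documented pftp feature.

```shell
# Hypothetical helper: fetch one file from the archive and verify it arrived.
# Arguments: archive directory, file name.
fetch_from_hpss() {
    data_dir=$1
    filename=$2
    # Refuse to run if a local copy exists, since a stale file would
    # otherwise make the check below pass.
    if [ -e "$filename" ]; then
        echo "$filename already exists locally" >&2
        return 2
    fi
    # PFTP_CMD is an assumption of this sketch; it defaults to pftp.
    ${PFTP_CMD:-pftp} << ENDE
cd $data_dir
bin
get $filename
quit
ENDE
    # Success here only means: a non-empty local file is present.
    if [ -s "$filename" ]; then
        echo "retrieved $filename"
    else
        echo "failed to retrieve $filename" >&2
        return 1
    fi
}
```

Usage in a job script: `fetch_from_hpss /hpss/arch/<prj>/<mydir> test.txt || exit 1`.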
