last modified Mar 06, 2018 07:59 AM

Bibliometrical Phase

This section describes the post-project phase: long-term data archival and reuse of the data. During long-term archival, the data and information collected within the project are stored (see the CMIP6 Guides). Afterwards the data curation phase starts, in which the data and metadata are maintained and reuse of the data begins, e.g. by IPCC DDC users accessing the Reference Data Archive of AR6.


Step 0 - Preparation for Long-Term Data Archival

Before long-term archival begins, general information taken from the CMIP6 web pages is entered into the long-term archive, e.g. the project description, templates for summaries, and descriptions of the provided variables (short names, long names and associated CF Standard Names).

Step 1 - Create Use Metadata from the ESGF

Long-term archival starts on the snapshot date, 15 October 2020, agreed with the IPCC, TGICA and IPCC Working Group I.

Detailed, fine-grained information is retrieved from the ESGF Search API and stored in the metadata database of the long-term archive at DKRZ. This information is written into the file headers during the production phase and extracted during ESGF data publication. The information provided by the ESGF Search API is mapped to the local database schema and linked to the information entered during Step 0.
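The mapping from Search API results to a local schema could look roughly like the following Python sketch. It is not the DKRZ implementation: the index node host, the column names in FIELD_MAP and the sample record are hypothetical, and the sketch assumes the Solr-style JSON output of the Search API, in which many facet values arrive as single-element lists.

```python
from urllib.parse import urlencode

# Hypothetical index node; replace with a real ESGF index node.
SEARCH_URL = "https://esgf-node.example.org/esg-search/search"

# Hypothetical mapping: ESGF facet name -> column in the local metadata database.
FIELD_MAP = {
    "instance_id": "dataset_id",
    "institution_id": "institution",
    "source_id": "model",
    "activity_id": "mip",
    "experiment_id": "experiment",
}


def build_query(**facets):
    """Build a dataset-level search URL for CMIP6 with optional facet filters."""
    params = {
        "project": "CMIP6",
        "type": "Dataset",
        "format": "application/solr+json",
    }
    params.update(facets)
    return SEARCH_URL + "?" + urlencode(params)


def map_record(doc):
    """Translate one search result document into a row of the local schema."""
    row = {}
    for esgf_key, column in FIELD_MAP.items():
        value = doc.get(esgf_key)
        # Facet values are often single-element lists; unwrap them.
        if isinstance(value, list):
            value = value[0] if value else None
        row[column] = value
    return row


# Made-up sample document, shaped like one entry of response["response"]["docs"].
sample_doc = {
    "instance_id": "CMIP6.CMIP.INST-A.MODEL-X.historical.r1i1p1f1.Amon.tas.gn.v1",
    "institution_id": ["INST-A"],
    "source_id": ["MODEL-X"],
    "activity_id": ["CMIP"],
    "experiment_id": ["historical"],
}
row = map_record(sample_doc)
```

Keeping the mapping in a single dictionary makes it easy to review against the database schema and to extend when further facets are needed.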

Step 2 - Data Archival and Adding Ancillary Metadata

In this second step two parallel processes are carried out:

  • The data in the data pool are physically archived at DKRZ and moved to tape in the HPSS.
  • The metadata are enriched with information from the ancillary metadata repositories within CMIP6: ES-DOC for model and simulation descriptions, Citation Service content, and other accessible information provided by CMIP6 repositories for ancillary data or by the modeling centers.

Step 3 - Finalize Data Archival: Technical Quality Assurance

As the final step of long-term data archival, the data and metadata are quality checked in the Technical Quality Assurance step. This procedure was applied successfully to the CMIP5 data archival (see the QC level 3 details). The main checks ensure consistency between data and metadata, conformance, and data accessibility. Because the metadata come from several different locations and sources, the consistency and conformance checks are especially important. Once the Technical Quality Assurance step has been completed successfully, the long-term archival is finished.
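A consistency check across metadata sources can be illustrated with a short Python sketch. It only shows the idea of comparing the same field across sources and does not reproduce the actual QC level 3 procedure; the source names, field names and values are hypothetical.

```python
def find_inconsistencies(records_by_source, keys):
    """Return {key: {source: value}} for every key whose values differ."""
    problems = {}
    for key in keys:
        # Collect the value of this key from every source that provides it.
        values = {
            source: record[key]
            for source, record in records_by_source.items()
            if key in record
        }
        if len(set(values.values())) > 1:
            problems[key] = values
    return problems


# Hypothetical metadata for one dataset, gathered from three sources.
records = {
    "esgf": {"institution": "INST-A", "model": "MODEL-X"},
    "es_doc": {"institution": "INST-A", "model": "MODEL-X"},
    "citation": {"institution": "INST-A", "model": "MODEL-Y"},
}
report = find_inconsistencies(records, ["institution", "model"])
```

Here report contains only the "model" entry, since the citation record disagrees with the other two sources.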

Step 4 - Data Publication and Citation / IPCC Data Distribution Centre

The long-term archived data form the AR6 Reference Data Archive of the IPCC Data Distribution Centre (IPCC DDC). The archived data are therefore added to the IPCC DDC web pages.

Stable collections of the IPCC DDC data are registered with DataCite DOIs at the same granularities as used for the CMIP6 citation of evolving data, i.e.:

  • model/MIP data: all data contributed by an institution with one model to an individual MIP
  • experiment data: all data contributed by an institution with one model to a CMIP6 experiment

After a DOI has been assigned, it is documented on the IPCC DDC web page. Finally, the data-to-data relation to the DDC data is added to the CMIP6 citation of the evolving data, and the updated citation information is published.

The IPCC DDC data in the long-term archive is also published in the ESGF and thus made accessible for ESGF users.

Step 5 - Data Curation and Data Reuse Phase

After data archival and data publication, the data curation and data reuse phase starts. In this phase the metadata are updated with new information, e.g. references to papers that use CMIP6 data (obtained via Scholix services) or errata information. Information on the data is provided for harvesting by secondary portals, e.g. OpenAIRE or EUDAT.