Enhancing Data Archiving and Stewardship Efforts
Encouraging full curation of historical data and promoting data availability through traceability and metadata enhancement. Recommendations for data centers to improve coordination and interoperability for efficient data utilization.
Presentation Transcript

Data archiving and stewardship
Presentation of initial draft by Martine De Mazière and Tove Svendby
Alberto Redondas and Victoria Sofieva, Rapporteurs

Data centers
Recommendations from ORM-11:
Continue to encourage data providers to submit or link to established databases to avoid a proliferation of databases, and especially to avoid the loss of data after the end of a measurement or (inter)calibration campaign or project, and to enable possible reprocessing of the data.
Further target enhanced linkage among data centres. This requires that data centres coordinate more and make further progress with the exchange of metadata and interoperability.
The ORM-11 recommendation is up to date: the creation of central data portals (e.g., at the World Data Centres) that provide visibility and linkage to the ensemble of existing data centres for ozone research-related data would enhance the possibility of making synergistic use of all the data and, as such, increase the effectiveness and valorization of data acquisition efforts.
Enhanced coordination and collaboration between data centers on data formats, and especially on metadata and on data availability and discovery, would be useful.
ORM-12 recommendation: WMO World Data Centers to urgently advance data interoperability.

Traceability and metadata
The delegates encourage instrument-based central processing with storage of the raw data and metadata, which allows reproducibility and reprocessing and improves uncertainty evaluation and data harmonization.
The delegates emphasized the importance of including all metadata that are required for data use and reprocessing. This is especially important when a data conversion (for example, from a pressure grid to an altitude grid) to the standards of the data centers has been applied.
The format and content of the metadata (dataset- or community-specific) should be negotiated. GCOS recommendations on metadata should be considered.
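
The point about grid conversions can be illustrated with a minimal Python sketch. It is not a method prescribed by the delegates: the isothermal barometric formula, the 7 km scale height, the reference pressure, the field names, and the sample numbers are all assumptions chosen for illustration. What matters is that the record keeps the raw pressure grid and a description of the conversion next to the converted grid, so the data can be reprocessed later.

```python
import math

# Assumed constants for illustration only; a real conversion would use
# measured temperature profiles or the data center's own standard.
SCALE_HEIGHT_KM = 7.0
REFERENCE_PRESSURE_HPA = 1013.25


def pressure_to_altitude_km(pressure_hpa):
    """Approximate altitude with the isothermal formula z = -H * ln(p / p0)."""
    return -SCALE_HEIGHT_KM * math.log(pressure_hpa / REFERENCE_PRESSURE_HPA)


def convert_profile(pressures_hpa, values):
    """Return the converted profile together with the raw grid and the
    metadata needed to trace or undo the conversion."""
    return {
        "altitude_km": [pressure_to_altitude_km(p) for p in pressures_hpa],
        "values": list(values),
        # Raw grid is kept unchanged so the conversion can be redone.
        "raw_pressure_hpa": list(pressures_hpa),
        "conversion": {
            "method": "isothermal barometric formula",
            "scale_height_km": SCALE_HEIGHT_KM,
            "reference_pressure_hpa": REFERENCE_PRESSURE_HPA,
        },
    }


# Made-up sample profile (hypothetical values, not real measurements).
record = convert_profile([1013.25, 500.0, 100.0], [30.0, 50.0, 400.0])
```

With the conversion parameters stored in the record, a later user can check which assumptions went into the altitude grid or rebuild it from `raw_pressure_hpa` with a better method.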

Encouraging full curation of data
Digitize and curate historical data: it is important; it is a big effort, and resources should be allocated.
Long-term data preservation: ground-based and satellite data; curation of metadata; curation of data processing software.
Experience for the future: save enough information for future data reprocessing.
Strongly encourage the full curation of data, including historical data. In particular, the curated data should include all metadata and ancillary data.
Address the need to allocate resources for digitizing and curating historical data for ozone and related species, as well as for ancillary data (e.g., laboratory spectroscopic data, station information), where available and before the information and knowledge get lost, in order to include the data in modern database systems.

Data availability and traceability
Data availability must be implemented according to FAIR data principles. This is supported by the assignment of a DOI and a data license to the data sets.
Data publishing with an associated DOI should be encouraged to provide data to the scientific community and to give recognition to scientists and the funding agencies for providing the data. This may also offer a good solution for the archiving (including traceability) of model output or single data or versioning of data processing codes.
An open data policy is recommended, but with the requirement to give appropriate credit to the data originator. A way must be found to ensure that these credits are given, as they are often taken as a key performance indicator for funding agencies.
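
A data center could check that each data set carries the identifiers needed for credit and traceability before publishing it. The Python sketch below is one hypothetical way to do that. The list of required fields, the field names, and the example record (including its placeholder DOI) are assumptions for illustration, and the regular expression only tests the common shape of a DOI name; it does not check that the DOI is registered.

```python
import re

# Hypothetical minimum set of fields; an actual list would be agreed
# between data centers and data providers.
REQUIRED_FIELDS = ("title", "doi", "license", "originator", "version")

# Common shape of a DOI name: "10.", a numeric prefix, "/", a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def missing_or_invalid(record):
    """List required fields that are absent, empty, or malformed."""
    problems = [name for name in REQUIRED_FIELDS if not record.get(name)]
    doi = record.get("doi")
    if doi and not DOI_PATTERN.match(doi):
        problems.append("doi (malformed)")
    return problems


def citation(record):
    """Build a credit line that names the originator, version, DOI, and license."""
    return (
        f'{record["originator"]}: {record["title"]}, '
        f'version {record["version"]}, '
        f'https://doi.org/{record["doi"]} ({record["license"]})'
    )


# Placeholder record; the DOI and names are not real.
example = {
    "title": "Example total ozone record",
    "doi": "10.99999/placeholder",
    "license": "CC-BY-4.0",
    "originator": "Example Institute",
    "version": "1.0",
}
```

Calling `missing_or_invalid(example)` returns an empty list for a complete record, and `citation(example)` yields a ready-made credit line that users can copy when they acknowledge the data originator.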

Data format
A user-friendly data format is recommended.
A decision about a common data format and metadata standard would facilitate the exploitation of data retrieved from different data centres. Several common data standards, like netCDF-CF or GEOMS HDF, are used by several Earth observation communities (e.g., satellite data providers and the climate modelling community) and are supported by a number of tools for extracting and visualizing the data.
It is most important that the formats enable a good structuring of the data and metadata; the packaging of data and metadata, whether it is netCDF, HDF, or something else, is less important, as there are many tools available to convert from one to another.
Data centers should make the data available in different standard formats or provide the appropriate conversion tools.
More and sustainable resources should be allocated to the data centres.
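
The distinction between structure and packaging can be shown with a small Python sketch. It uses JSON and CSV from the standard library as stand-ins for netCDF and HDF so that it stays self-contained; the data set layout, the station name, and the values are hypothetical. The same structured record (metadata plus named columns) is written to both packagings and read back unchanged.

```python
import csv
import io
import json

# Hypothetical structured data set: metadata and data travel together.
dataset = {
    "metadata": {"station": "EXAMPLE", "variable": "total_ozone", "units": "DU"},
    "columns": ["time", "total_ozone"],
    "rows": [["2020-01-01T12:00:00Z", 310.5], ["2020-01-02T12:00:00Z", 305.0]],
}


def to_json(ds):
    return json.dumps(ds, sort_keys=True)


def from_json(text):
    return json.loads(text)


def to_csv(ds):
    """Write metadata as '# key=value' header lines, then the data table."""
    buf = io.StringIO()
    for key, value in sorted(ds["metadata"].items()):
        buf.write(f"# {key}={value}\n")
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(ds["columns"])
    writer.writerows(ds["rows"])
    return buf.getvalue()


def from_csv(text):
    """Read the packaging written by to_csv (two columns: time, value)."""
    metadata, data_lines = {}, []
    for line in text.splitlines():
        if line.startswith("# "):
            key, _, value = line[2:].partition("=")
            metadata[key] = value
        elif line:
            data_lines.append(line)
    reader = csv.reader(data_lines)
    columns = next(reader)
    rows = [[time, float(value)] for time, value in reader]
    return {"metadata": metadata, "columns": columns, "rows": rows}
```

Because both packagings round-trip to the same structure, converting between them is mechanical; the effort that cannot be automated is agreeing on what the metadata and columns should contain.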

Other recommendations
Satellite overpass data coincident with ground-based network stations must be readily available. There is progress in addressing this (e.g., tools at EVDC).
Potentially, the same overpass dataset can be created with models/reanalyses (e.g., NDACC stores MERRA-2 overpass data). A service through Copernicus?
Campaign data should be stored together with metadata and potentially also made publicly available.
Calibration data must be preserved.
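
As an illustration of what an overpass data set involves, the Python sketch below selects satellite pixels that fall within a given distance of a ground station. It is not the method used by EVDC or NDACC: the 100 km radius, the station coordinates, the pixel records, and the function names are assumptions, and only the spatial criterion is shown (a real selection would also apply a time window).

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def select_overpasses(station, pixels, max_distance_km=100.0):
    """Keep pixels within max_distance_km of the station and record the distance."""
    selected = []
    for pixel in pixels:
        distance = great_circle_km(station["lat"], station["lon"],
                                   pixel["lat"], pixel["lon"])
        if distance <= max_distance_km:
            selected.append({**pixel, "distance_km": distance})
    return selected


# Hypothetical station and satellite pixels (made-up values).
station = {"name": "EXAMPLE", "lat": 28.3, "lon": -16.5}
pixels = [
    {"lat": 28.5, "lon": -16.4, "total_ozone": 290.0},
    {"lat": 35.0, "lon": -16.5, "total_ozone": 310.0},
]
overpasses = select_overpasses(station, pixels)
```

Storing `distance_km` with each selected pixel keeps the coincidence criterion traceable, in line with the metadata recommendations above.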

Achievements
Enhanced and more timely availability of ground-based, satellite and modelling data through several data centers (NDACC, WOUDC, ESA, NASA, ACTRIS, EUBREWNET, Copernicus CDS, CAMS and others).
Central data processing systems are being taken further in several monitoring networks, such as EUBREWNET, for selected NDACC-type data, for Pandora data in the framework of ACTRIS, and in the FRM4DOAS programme.
Progress has been made on enhanced linkage among data centres.
Progress has been made on data publishing with an associated digital object identifier (DOI).
Data centres have made progress in providing data in several accepted standard formats and in providing different data versions.