Implementing Data Validation in National Accounts with Eurostat
Eurostat's ESA 2010 Validation Task Force has been instrumental in setting up validation checks, resolving recurrent validation problems, and enhancing the data validation process for National Accounts. The project involves participants from NSIs, Central Banks, and data users, focusing on refining the data validation tool and ensuring the accuracy and reliability of economic data.
Presentation Transcript
Data validation practical use case: National Accounts IPA Course: Data Validation in the ESS 18-19 May 2017 Daniel SURANYI Eurostat Directorate C 1 Eurostat
Presentation Outline
Background and milestones; ESA 2010 Validation Task Force; Structural and content validation service; Target implementation dates; Questions and discussion
Background and milestones
2012: agreed ESTAT-OECD-ECB validation checks reflected in ESTAT-OECD Protocol; 2014: implementation of validation checks in internal systems for ESA 2010; 2014: NAWG/DMES agree to set up ESA 2010 validation Task Force; Recurrent validation problems progressively shift from coding to content-related checks
ESA 2010 TF on data validation: Mandate
Review validation checks performed in Eurostat; Clarification of methodological or practical aspects; Validation rules for an internal or external pre-validation tool; Collection and dissemination of associated metadata
TF Participants
Participants represented seven NSIs (AT, DE, DK, GR, IE, SE, UK), three Central Banks (BE, FR, IT) and two main data users (ECB and OECD); IT, SI and PL participated via correspondence; Participated in the workshop for ESTAT grants on validation (DK, GR, NL, SI); ESTAT links with representatives to cover ESA 2010 TP
Current validation process
Tool: Pre-transmission checks
STEP 1: Structural & basic data checks (NSIs & Central Banks): Format; Codes; Embargo + "F" flag; Hole in series
Sources: SDMX; eDAMIS
Revisions; A vs Q; Horizontal checks; Balances; CLVs for ref year; Zeros & -ve values
STEP 2: Structural & basic data checks
EU aggregates & dissemination
STEP 3: Statistical and economic plausibility checks
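Two of the checks named on this slide, "hole in series" and "A vs Q", can be sketched in a few lines. The following is a minimal illustration only: it assumes a series is a plain mapping from period label to value and that the variable is a flow whose four quarters should sum to the annual figure. The function names, the period label format and the tolerance are hypothetical and are not taken from Eurostat's tool.

```python
import math


def find_holes(periods, values):
    """Return periods with no value that lie between two reported values.

    `periods` is an ordered list of period labels; `values` maps a period
    to a number, or to None / NaN when nothing was reported. Gaps before
    the first or after the last reported value are not counted as holes.
    """
    def is_missing(period):
        v = values.get(period)
        return v is None or (isinstance(v, float) and math.isnan(v))

    reported = [i for i, p in enumerate(periods) if not is_missing(p)]
    if not reported:
        return []
    first, last = reported[0], reported[-1]
    return [p for p in periods[first:last + 1] if is_missing(p)]


def annual_vs_quarterly(annual, quarterly, tolerance=0.5):
    """Compare each annual value with the sum of its four quarters.

    Returns {year: annual minus quarterly sum} for years where the absolute
    difference exceeds `tolerance` (an assumed threshold). Years with
    incomplete quarterly data are skipped rather than flagged.
    """
    mismatches = {}
    for year, annual_value in annual.items():
        quarters = [quarterly.get(f"{year}-Q{q}") for q in range(1, 5)]
        if any(q is None for q in quarters):
            continue
        diff = annual_value - sum(quarters)
        if abs(diff) > tolerance:
            mismatches[year] = diff
    return mismatches
```

Skipping leading and trailing gaps keeps the hole check from flagging series that simply start late or have not yet been updated; whether that matches the production rule is an open assumption.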
NAPS-S: Objective of the project
Re-design of the National Accounts statistical Production System using Eurostat corporate (CSPA compliant) services for managing and validating incoming data files.
Official statistics: the challenge
Official reference; Cross-domain usage; More timely policies; Commercial providers; GDP T+30; Shrinking resources
Stovepipe production: the reality
Collect; Process; Analyse; Disseminate; Survey A; Survey B; Survey B
Stovepipe production: the reality
Customised for a specific domain; Conventions used within domains / surveys; Hampering cross-domain usage; Leading to low level of transparency; Not possible to share IT tools efficiently; Difficult to share data across domains / organisations; Difficult to measure quality
Example: National Accounts production in Eurostat
Business process
Example: National Accounts production in Eurostat
Implementation: FAME; Oracle RDBMS; Oracle OLAP
Target: flexible use of statistical services
Intermediate step: partial orchestration
Target: flexible use of statistical services
Architecture for cross-domain usage; Standards used across domains / surveys; Enabling cross-domain usage; Leading to transparency; Encouraged to share IT tools efficiently; Facilitates sharing data across domains / organisations; Possible to measure quality (process, data)
The big picture: using standards
Statistical production: GSIM (reference information model); GSBPM (process step categories); CSPA (service specification); SDMX (data modelling); VTL (validation expressions)
Connections from data providers
1) Connect to the repository: National Content Validation; National Structural Validation; Statistical Service A; Statistical Service B; National software; VTL; SDMX Registry; Common Repository
Connections from data providers
2a) Use ESS service (replicated): ESS Content Validation; ESS Structural Validation; Content Validation; Structural Validation; Statistical Service A; Statistical Service B; VTL; SDMX Registry; Common Repository
Connections from data providers
2a) Use ESS service (shared): Content Validation; Structural Validation; Statistical Service A; Statistical Service B; VTL; SDMX Registry; Common Repository
Connections from data providers
3) Connect to the process: Structural Validation; Content Validation; Statistical Service A; Statistical Service B; SDMX Registry; VTL; Common Repository
Structural Validation (SDMX Registry)
  SDMX compliance: Valid SDMX-ML file; Coded according to the DSD; Mandatory fields present; Correct data types
  Basic logical checks: Dataflow definition; Sender ID and REF_AREA; Table ID is present; Value "NaN" and OBS_STATUS; EMBARGO_DATE and CONF_STATUS; PRICES and REF_YEAR_PRICE
Content Validation (VTL; Repository?)
  General plausibility and consistency (within file), basic content checks: Missing or unexpected series; Hole in series; Zero values; Negative values; Additivity of breakdowns; Outliers; Consistency between prices; Unadjusted and adjusted series
  Advanced plausibility and consistency (across files), cross-domain or source checks: Revisions; Quarterly versus Annual; Same series across tables; Balance of Payments; Trade statistics; Labour market statistics; Data published by NSI or IO
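As an illustration of one basic logical check (value "NaN" and OBS_STATUS) and one basic content check (additivity of breakdowns), the sketch below shows how such rules might look in code. It is not the VTL used by the service. The record layout, the function names, the tolerance and the set of status codes treated as "missing" are all assumptions made for the example, as is the reading that a NaN value and a "missing" status must always occur together.

```python
import math

# Assumption: OBS_STATUS codes that mark an observation as missing.
MISSING_STATUS = {"M", "L"}


def check_nan_status(observations):
    """Flag observations where the value and OBS_STATUS disagree.

    Each observation is a dict with the keys "series", "period", "value"
    (a float, possibly NaN) and "obs_status". An observation is flagged
    when the value is NaN without a missing status, or the status says
    missing while a numeric value is present.
    """
    problems = []
    for obs in observations:
        value_missing = math.isnan(obs["value"])
        status_missing = obs.get("obs_status") in MISSING_STATUS
        if value_missing != status_missing:
            problems.append((obs["series"], obs["period"]))
    return problems


def check_additivity(total, components, rel_tol=1e-3):
    """Return True when the components add up to the total within rel_tol.

    The relative tolerance is an assumed allowance for rounding.
    """
    return math.isclose(total, sum(components), rel_tol=rel_tol)
```

A relative rather than absolute tolerance is used so the same rule can be applied to aggregates of very different magnitude; an actual rule set may well define tolerances per table or per unit.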
Validation Roadmap: NAPS-S
Structural: development of service Q3-4/2015; pilot with countries Q1/2016; corrections & deployment Q2/2016; go-live Q3-4/2016. Comments: based on modified version of EDIT
Content: development of service Q1/2017; pilot with countries 03-04/2017; corrections & deployment Q2-3/2017; go-live Q4/2017. Comments: looping in of VTL language
Phase 1 Setup
All countries; Regular production; Selected TF members; Eurostat Process Manager; EDAMIS; Structural Validation Svc; PoC
Step 1: Call Structural Validation Svc. If OK: deliver to "reduced" production system. If Not OK: deliver report to EDAMIS feedback channel
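The Step 1 routing (call the structural validation service, then deliver either the file to production or the report to the feedback channel) can be sketched as follows. This is an illustration under stated assumptions: the three callables stand in for the validation service, the "reduced" production system and the EDAMIS feedback channel, and their names and signatures are hypothetical, not real Eurostat interfaces.

```python
from dataclasses import dataclass, field


@dataclass
class ValidationReport:
    """Outcome of a validation call: overall verdict plus messages."""
    ok: bool
    messages: list = field(default_factory=list)


def route_file(data_file, validate, deliver_to_production, send_feedback):
    """Validate a file and route it according to the result.

    `validate` returns a ValidationReport. On success the file goes to
    production; otherwise the report goes back to the data provider.
    Returns the name of the branch taken.
    """
    report = validate(data_file)
    if report.ok:
        deliver_to_production(data_file)
        return "production"
    send_feedback(data_file, report)
    return "feedback"
```

Passing the services in as callables keeps the routing logic independent of how each service is actually reached, which is in the spirit of the service-based design described earlier in the deck.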
Conval Workflow?
All countries; Regular production; Selected TF members; Eurostat Process Manager; EDAMIS; Content Validation Svc; PoC; Structural Validation; Structural ???; Warnings ???
Step 2: Call Content Validation Svc
o Supporting metadata / footnotes inside the SDMX message
o Advanced validation, e.g. visualisation
o Judgement call: error or warning
If OK: deliver to "reduced" production system. If Not OK: deliver report to EDAMIS feedback channel
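The "judgement call: error or warning" point can be made concrete with a small sketch. The slide leaves open how warnings are handled, so the policy below is only one possible reading: findings are graded by the size of a relative deviation, and only errors make a file "Not OK". The thresholds, names and the policy itself are hypothetical.

```python
from enum import Enum


class Severity(Enum):
    WARNING = "warning"
    ERROR = "error"


def classify(relative_deviation, warn_at=0.01, error_at=0.10):
    """Grade a finding by the size of its relative deviation.

    Returns None below `warn_at`, WARNING from `warn_at` up to `error_at`,
    and ERROR from `error_at` upwards. Both thresholds are assumptions.
    """
    size = abs(relative_deviation)
    if size >= error_at:
        return Severity.ERROR
    if size >= warn_at:
        return Severity.WARNING
    return None


def file_outcome(findings):
    """Return "Not OK" if any finding is an error, otherwise "OK".

    Under this assumed policy, warnings alone do not block delivery.
    """
    if any(f is Severity.ERROR for f in findings):
        return "Not OK"
    return "OK"
```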
Demonstration: Scenario 3
Use EDAMIS to transmit data; Data provider perspective; Use Eurostat process manager; Eurostat workflow defined; IS4STAT Input Hall: Eurostat process monitor
More information
National Accounts Conval Webinar
o Scope of the content validation service
o High level setup
o Full validation workflow
o Sample report and implementation timeline
https://circabc.europa.eu/d/a/workspace/SpacesStore/605d7bcd-5975-4835-ab3d-7e54a54215f4/ESA%20VALTF%20Conval%20Webinar_0.mp4
Daniel SURANYI; Department: ESTAT.C; Email: daniel.suranyi@ec.europa.eu