Data Validation Techniques for SDG Implementation


Explore the importance of data validation in Sustainable Development Goal (SDG) initiatives. Learn about structural and content validation, ensuring dataset accuracy and adherence to SDG frameworks. Validate against SDG DSD and dataflows for reliable reporting. Understand SDG content constraints and constraints matrices for data accuracy. Enhance your understanding of SDG data management through validation techniques.

  • Data Validation
  • SDG
  • Sustainable Development Goals
  • Data Management
  • Content Constraints


Presentation Transcript


  1. SDG Data Validation

  2. Diagram of SDG artefacts
     [Diagram: Codelists (CL_FREQ, CL_PRODUCT); Concept Scheme (SDG_CONCEPTS); DSD (SDG); Dataflows (DF_SDG_GLC, DF_SDG_GLH); Content Constraints (CN_SDG_GLC, CN_SERIES_SDG_GLC, CN_SDG_GLH, CN_SERIES_SDG_GLH)]
     Statistics Division

  3. SDMX Validation
     • Structural validation: ensures that all dimensions and mandatory attributes are in place and have valid values, and that the dataset does not contain duplicates.
     • Content validation: in addition to structural validation, ensures that more complex data relationships are also observed.

  4. Validation against the SDG DSD
     Validating a dataset against the SDG DSD achieves structural validation. It ensures that the dataset has all dimensions and mandatory attributes in place, and that all concepts have correct values according to the DSD.
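The structural checks described above (dimensions and mandatory attributes present, values valid, no duplicates) can be sketched roughly as follows. This is only an illustration: the `DIMENSIONS` and `MANDATORY_ATTRIBUTES` dictionaries are a hypothetical, heavily reduced stand-in for the real SDG DSD, the code values other than those quoted in the slides are placeholders, and `TIME_PERIOD` is assumed to be the time key. An actual SDMX validator would read the DSD itself.

```python
# Hypothetical, reduced stand-in for a DSD: concept -> allowed codes.
# The real SDG DSD has many more dimensions and much larger codelists.
DIMENSIONS = {
    "FREQ": {"A"},
    "REPORTING_TYPE": {"N", "G"},
    "SERIES": {"SP_DYN_ADKL"},
    "SEX": {"F", "M", "_T"},
}
MANDATORY_ATTRIBUTES = {"UNIT_MEASURE": {"PER_1000_POP"}}


def validate_structure(rows):
    """Return a list of error messages; an empty list means the rows passed."""
    errors = []
    seen_keys = set()
    concepts = {**DIMENSIONS, **MANDATORY_ATTRIBUTES}
    for i, row in enumerate(rows):
        # Every dimension and mandatory attribute must be present and valid.
        for concept, allowed in concepts.items():
            value = row.get(concept)
            if value is None:
                errors.append(f"row {i}: missing {concept}")
            elif value not in allowed:
                errors.append(f"row {i}: invalid {concept}={value}")
        # Two observations with the same dimension key and period are duplicates.
        key = tuple(row.get(d) for d in DIMENSIONS) + (row.get("TIME_PERIOD"),)
        if key in seen_keys:
            errors.append(f"row {i}: duplicate observation")
        seen_keys.add(key)
    return errors
```

Calling `validate_structure` on a list of observation dictionaries returns one message per problem found, so a dataset can be rejected or reported on as a whole.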

  5. Validation against an SDG dataflow
     • Countries should always use the Country Global Dataflow, DF_SDG_GLC, for SDG data that they exchange.
     • SDG Content Constraints are attached to the global dataflows.
     • When a dataset is validated against a global dataflow, the Content Constraints are applied.
     • Structural validation is carried out as in the case of the DSD.
     • In addition, relationships between the dimensions are validated using the Content Constraints.

  6. SDG Content Constraints
     • Cube region constraints: CN_SDG_GLC and CN_SDG_GLH
       Ensure that the reporting type is correct for each dataflow: N for DF_SDG_GLC, and G for DF_SDG_GLH.
     • Series constraints: CN_SERIES_SDG_GLC and CN_SERIES_SDG_GLH
       Ensure that the dataset has valid combinations of dimensions. Valid disaggregation is provided for each SDG series.
     • The Content Constraint Matrix helps visualize the content constraints and apply them in mapping SDG series.
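As a rough illustration of the cube region rule above, the sketch below flags observations whose reporting type does not match the dataflow they are validated against. The dataflow IDs and the codes N and G come from the slides; the dimension name `REPORTING_TYPE` and the function name are assumptions made for the example.

```python
# Reporting type required by each global dataflow, as stated in the slides.
EXPECTED_REPORTING_TYPE = {"DF_SDG_GLC": "N", "DF_SDG_GLH": "G"}


def check_reporting_type(dataflow_id, rows):
    """Return the indexes of rows whose reporting type is wrong for the dataflow."""
    expected = EXPECTED_REPORTING_TYPE[dataflow_id]
    # REPORTING_TYPE is an assumed dimension ID for "reporting type".
    return [i for i, row in enumerate(rows) if row.get("REPORTING_TYPE") != expected]
```

For example, validating national data against DF_SDG_GLC would flag any row that carries G instead of N.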

  7. SDG Content Constraints Matrix
     • Informal representation of SDG series content constraints in CSV/Excel.
     • Can be used to determine how to correctly map an SDG series.

  8. Content Constraint Matrix: Columns
     • SERIES
     • Series name [for convenience, ignored in validation]
     • Unit Measure* (attribute)
     • Unit Multiplier* (attribute)
     • SEX, AGE, URBANISATION, COMPOSITE_BREAKDOWN, EDUCATION_LEV, DISABILITY_STATUS, OCCUPATION, INCOME_WEALTH_QUANTILE, PRODUCT, ACTIVITY
     * In SDMX 2.1, validation of attributes is not supported. Values for Unit of Measure and Unit Multiplier are listed for correct mapping, and will be enforced on import to SDG Lab.

  9. Content Constraint Matrix: Values
     • Allowed disaggregation codes are listed for each series.
     • One or more codes are separated with a semicolon (;), for example Y15T19;Y10T14.
     • The special value ALL means there are no restrictions on the corresponding dimension, i.e. all values are allowed.
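The cell conventions above (semicolon-separated codes, with ALL meaning no restriction) could be interpreted along these lines. The helper names `parse_cell` and `is_allowed` are hypothetical and only serve the illustration.

```python
def parse_cell(cell):
    """Return None when the cell is ALL (no restriction), else the set of allowed codes."""
    text = cell.strip()
    if text == "ALL":
        return None
    # Codes are separated by semicolons; empty fragments are ignored.
    return {code.strip() for code in text.split(";") if code.strip()}


def is_allowed(cell, code):
    """True if the code is permitted by the matrix cell."""
    allowed = parse_cell(cell)
    return allowed is None or code in allowed
```

With the example cell from the slide, `is_allowed("Y15T19;Y10T14", "Y15T19")` is true, while any age code is accepted when the cell is ALL.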

  10. Content Constraint Matrix: example series
     Matrix columns: SERIES, Name, UNIT_MEASURE, UNIT_MULT, SEX, AGE, URBANISATION, COMPOSITE_BREAKDOWN, EDUCATION_LEV, DISABILITY_STATUS, OCCUPATION, INCOME_WEALTH_QUANTILE, PRODUCT, ACTIVITY
     Series: SP_DYN_ADKL, Adolescent birth rate (per 1,000 women aged 15-19 and 10-14 years) [3.7.2]
     Allowed concept values:
     • Unit of measure: PER_1000_POP
     • Unit multiplier: 0
     • Sex: F
     • Age: Y15T19 and Y10T14
     • Composite Breakdown: _T;MS_MIGRANT;MS_NOMIGRANT;MS_EUMIGRANT;MS_NONEUMIGRANT
     • Product: _T
     • Activity: _T
     • Urbanisation, Disability status, Occupation, Income or Wealth Quantile: all values allowed
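To show how the example row might be applied when mapping data, the sketch below checks one observation against the allowed values listed for SP_DYN_ADKL. The row dictionary only contains the dimensions whose values the slide states explicitly (EDUCATION_LEV is left out for that reason), and the function name and sample codes in the usage note are illustrative assumptions.

```python
# Allowed values for SP_DYN_ADKL as summarised on the slide.
# EDUCATION_LEV is omitted because the slide summary does not state its value.
SP_DYN_ADKL_ROW = {
    "SEX": "F",
    "AGE": "Y15T19;Y10T14",
    "URBANISATION": "ALL",
    "COMPOSITE_BREAKDOWN": "_T;MS_MIGRANT;MS_NOMIGRANT;MS_EUMIGRANT;MS_NONEUMIGRANT",
    "DISABILITY_STATUS": "ALL",
    "OCCUPATION": "ALL",
    "INCOME_WEALTH_QUANTILE": "ALL",
    "PRODUCT": "_T",
    "ACTIVITY": "_T",
}


def find_violations(matrix_row, observation):
    """Return the dimensions whose observed code is not allowed by the matrix row."""
    violations = []
    for dimension, cell in matrix_row.items():
        if cell.strip() == "ALL":
            continue  # ALL places no restriction on this dimension
        allowed = {code.strip() for code in cell.split(";")}
        if observation.get(dimension) not in allowed:
            violations.append(dimension)
    return violations
```

An observation with `SEX` set to `F` and `AGE` set to `Y15T19` passes, while one reporting a different sex or age code would be listed as a violation for those dimensions.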

  11. SDG Lab validation
     • SDG Lab validates each dataset before importing it, including the unit of measure and unit multiplier. Invalid datasets are rejected.
     • SDMX Content Constraints help map the series and validate the data before the dataset is uploaded to the SDG Lab.

  12. Thank you.
