Open Data Interface (ODI) Project Overview and Components


Open Data Interface (ODI) is a database system, in development since 2008, for ingesting, processing, storing, and retrieving space environment data and metadata in a MySQL (MariaDB) database. The distribution comprises a server, a client, and dataset definitions. The current contractor team is DH Consultancy and Solar Analytics. ODI supports approximately 330 datasets and can be extended with user-added functionality. Key features are the database structure, the server software, and tools for calculating geographic and magnetic coordinates for spacecraft data.


Uploaded on Sep 21, 2024



Presentation Transcript


  1. ODI: Open Data Interface. ESA FP Days, ESTEC, 16/12/2019. D. Heynderickx, DH Consultancy, Belgium; P. Wintoft, Solar Analytics, Sweden

  2. Project overview
     - Open Data Interface (ODI) is a database system for ingesting, processing, storing and retrieving space environment (and other) data and metadata in a MySQL (MariaDB) database.
     - Development started in 2008 (Swedish Institute of Space Physics, DH Consultancy), with continuous development and maintenance since then.
     - Current contractor team: DH Consultancy (D. Heynderickx), Solar Analytics (P. Wintoft).
     - Name of the TO in TEC-EES: Hugh Evans.
     - Currently supports ~330 datasets.
     - Extensible with user-added functionality.
     - Available for download at the European Space Software Repository (https://essr.esa.int/): server, client, datasets.

  3. ODI components
     - ODI server
       - Database configuration and setup
       - Data file download and parser scripts
       - Calculation of geographic and magnetic coordinates
       - Processing hooks
       - Definition of metadata
     - ODI client
       - REST, Excel interfaces to the database
       - APIs in various programming languages
     - Dataset definitions
       - Configuration files, download and parser scripts for 330 space environment datasets

  4. ODI server software (I)
     - PHP engine for database communication and process flow
     - Database setup and configuration script
       - Create database and user accounts on the MySQL server
       - Store configuration parameters (location of data files, SPACE-TRACK credentials, user email, ...)
     - Download scripts
       - Built-in support for wget
       - Dataset specific downloads (e.g. SFTP)
     - Data parsers
       - NASA/GSFC CDFexport tool to ingest CDF data
       - Python tool to ingest NetCDF data
       - Dataset specific parser scripts
     - cron/quartz setup for automated download and ingestion
     - User may add functionality triggered by hooks on creation, download and ingestion

  5. ODI server software (II)
     - Coordinates for Earth orbiting spacecraft
       - Download of TLEs from https://space-track.org
       - Generation of spacecraft coordinates (GEI) using NASA/JPL SPICE library (http://naif.jpl.nasa.gov/)
     - Magnetic coordinates for spacecraft in Earth's magnetosphere
       - UNILIB library for calculation of L, L*, MLT, ...
       - For fixed or record varying pitch angle(s)
       - IGRF + OPQ field models
     - A single command triggers the whole processing suite (data download, ingestion, pre- and/or post-processing), manually or as a cron/quartz job, e.g.:
         ./get_ingest.php goes_gp_mag_1m_rt

  6. Database structure (diagram; not reproduced in this transcript)

  7. How to define a new dataset
     - Create a configuration file configuration.txt
     - Create a CDF type skeleton file
     - If required, write download and/or parser scripts
     - If required, write processing hook scripts
     - Run ./create_dataset.php <dataset name> to set up the SQL data table, create table columns and insert metadata
     - Run ./get_ingest.php <dataset name> to ingest the data files, with optional download and processing hooks
     - If required, run ./cron_install.php to add an entry in crontab

     PROBA1/SREM L2 configuration file:

         raw_data_dir=SREM-DC/proba1/srem/L2
         file_name_pattern=SREMPROBA1_PACC_*_L2.cdf.gz
         platform=PROBA1
         platform_type=satellite
         instrument=SREM
         skeleton_file=SREM_PACC_L2.skt
         settings_file=SREM_PACC_L2.set
         availability=public
         download_script=wget_generic
         wget_cut_dirs=3
         wget_url=http://srem.psi.ch/datarepo/L2/proba1/
         cron_schedule=30 4 * * *
         #quartz_schedule=0 30 4 * * ?
         indexed_columns=epoch odi_unilib_l
         SPACETRACK_satnum=26958
         UNILIB_PREFIX=
         UNILIB_CALCULATE_LSTAR=false

     DSCOVR real time data configuration file:

         online_data=true
         platform=DSCOVR
         platform_type=satellite
         instrument=FC
         skeleton_file=FC_RT.skt
         parser_file=../../parser/DSCOVR_rt.php

     DSCOVR real time data parser script:

         $rname = "http://services.swpc.noaa.gov/products/solar-wind/plasma-6-hour.json";
         $json = file_get_contents($rname);
         $data = json_decode($json);
         $tfile = fopen("load.tmp", "w");
         array_shift($data);  // drop the first row (column names)
         foreach ($data as $row) {
             // split the time tag at the decimal point; read the milliseconds
             // before $epoch is reduced to the date-time string
             $epoch = explode(".", $row[0]);
             $ms = intval($epoch[1]);
             $epoch = $epoch[0];
             $cdf_epoch = date_to_cdfepoch($epoch, $ms);
             $values = array();
             for ($i = 1; $i < count($row); $i++) {
                 if (is_null($row[$i])) $values[] = -999.9;  // fill value for missing data
                 else $values[] = $row[$i];
             }
             $values = implode(",", $values);
             fprintf($tfile, "%20.3f,%s,%3d,%s,NULL%s", $cdf_epoch, $epoch, $ms, $values, LF);
         }
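The DSCOVR parser on this slide turns each JSON row into one CSV line keyed by CDF epoch. The following Python sketch mirrors that flow under stated assumptions: it takes CDF_EPOCH to be milliseconds since 0000-01-01 00:00:00 (the CDF standard definition) and assumes that ODI's `date_to_cdfepoch` returns that quantity. The function names, the sample JSON values, and the simplified output format (no trailing NULL column, no fixed-width padding) are illustrative and are not part of ODI.

```python
import json
from datetime import datetime, timezone
from typing import List

# Milliseconds between 0000-01-01 (CDF_EPOCH origin) and 1970-01-01 (Unix origin).
CDF_EPOCH_UNIX_OFFSET_MS = 62_167_219_200_000
FILL_VALUE = -999.9  # same fill value as the PHP parser on the slide


def date_to_cdf_epoch(date_str: str, ms: int = 0) -> float:
    """Convert 'YYYY-MM-DD HH:MM:SS' (UTC) plus milliseconds to CDF_EPOCH."""
    dt = datetime.strptime(date_str, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    return float(int(dt.timestamp()) * 1000 + ms + CDF_EPOCH_UNIX_OFFSET_MS)


def rows_to_csv(json_text: str) -> List[str]:
    """Turn a JSON array of rows (first row = column names) into CSV lines."""
    rows = json.loads(json_text)[1:]  # skip the header row
    lines = []
    for row in rows:
        # Time tags look like '2019-12-16 00:00:00.000'.
        date_part, _, frac = row[0].partition(".")
        ms = int(frac[:3].ljust(3, "0")) if frac else 0
        values = [FILL_VALUE if v is None else v for v in row[1:]]
        fields = ["%.3f" % date_to_cdf_epoch(date_part, ms), date_part, str(ms)]
        fields += [str(v) for v in values]
        lines.append(",".join(fields))
    return lines


# Made-up sample in the shape of the real-time plasma stream (header + data rows).
SAMPLE = (
    '[["time_tag","density","speed","temperature"],'
    '["2019-12-16 00:00:00.000","3.21","389.4","71234"],'
    '["2019-12-16 00:01:00.000",null,"390.1","70110"]]'
)

if __name__ == "__main__":
    for line in rows_to_csv(SAMPLE):
        print(line)
```

Keeping the epoch conversion in its own function means the same row loop can be reused for other JSON streams that share the time tag format.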

  8. Metadata browser
     - PHP script to render a web page with metadata information on datasets and variables
     - Uses the ODI PHP client -> very simple script

  9. ODI processing hooks
     - User defined process hooks
       - Scripts (batch, shell or PHP) started by triggers
       - Post-creation, pre/post download and ingestion
       - Generic: run on all datasets
       - Dataset specific scripts
     - Post-ingest example: copy GOES GEI and magnetic coordinates from one dataset to others

  10. ODI processing hooks
      - EMU data processing (GALEM)
        - Post-ingest hook
        - Copies GEI and magnetic coordinates from level 0 datasets into l1, l2 datasets
      - NGRM data processing (SSA P3-SWE-XXI)
        - Post-ingest hook
        - Copies data from raw dataset to level 0 science dataset
        - Merges in spacecraft state vectors from the OEM dataset
        - Calculates magnetic coordinates using the ODI UNILIB tool
      - VALIRENE data cleaning and calibration
        - Post-creation hook
        - Copies data from a raw dataset
        - Merges in cleaning flags
        - Applies calibration factors
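The hook mechanism described on the two slides above (generic hooks that run for every dataset, dataset specific hooks, fired before and after download and ingestion) can be pictured with a small Python sketch. This is not ODI code: ODI hooks are batch, shell or PHP scripts, and the class name, the trigger names (`pre_download`, `post_ingest`, ...) and the stage order used here are assumptions made only for illustration.

```python
from collections import defaultdict
from typing import Callable, List, Optional

Hook = Callable[[str], None]  # a hook receives the dataset name


class HookRegistry:
    """Maps trigger names to generic and dataset-specific hooks (hypothetical)."""

    def __init__(self) -> None:
        self._generic = defaultdict(list)   # trigger -> hooks for all datasets
        self._specific = defaultdict(list)  # (trigger, dataset) -> hooks

    def register(self, trigger: str, hook: Hook, dataset: Optional[str] = None) -> None:
        if dataset is None:
            self._generic[trigger].append(hook)
        else:
            self._specific[(trigger, dataset)].append(hook)

    def fire(self, trigger: str, dataset: str) -> None:
        # Generic hooks first, then the ones registered for this dataset only.
        hooks = self._generic.get(trigger, []) + self._specific.get((trigger, dataset), [])
        for hook in hooks:
            hook(dataset)


def run_get_ingest(dataset: str, hooks: HookRegistry) -> List[str]:
    """Walk through the download and ingest stages, firing pre/post hooks."""
    log = []
    for stage in ("download", "ingest"):
        hooks.fire("pre_" + stage, dataset)
        log.append(stage + ":" + dataset)  # placeholder for the real work
        hooks.fire("post_" + stage, dataset)
    return log


if __name__ == "__main__":
    registry = HookRegistry()
    registry.register("post_ingest", lambda ds: print("generic post-ingest for", ds))
    registry.register(
        "post_ingest",
        lambda ds: print("copy coordinates from", ds, "to related datasets"),
        dataset="goes_gp_mag_1m_rt",
    )
    run_get_ingest("goes_gp_mag_1m_rt", registry)
```

Keying the dataset specific hooks on the (trigger, dataset) pair keeps the generic hooks untouched when a new dataset adds its own processing step.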

  11. ODI client software
      - HTTP/REST (server/client), JSON output
      - HAPI (Heliophysics Application Programmer's Interface: https://github.com/hapi-server/data-specification) server/client
      - Metadata browser
      - Java SE and MySQL Connector/J JDBC driver
      - APIs for PHP, Java, IDL, Matlab, Jython, Python
        - Standardised procedure syntax
        - Outputs in language specific objects
      - Excel interface
      - IDL example (RENELLA LARB model production):
          oDB = OBJ_NEW('ODI_JDBC')
          oDB->connect
          query = 'SELECT * FROM dataset_sampex_pet_ref0 WHERE ...'
          oDB->query, query, nrows=nrows
          res = oDB->getRows()
        res is an IDL structure of arrays.
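As an illustration of the HAPI interface listed on this slide, the sketch below builds a data request URL with the query parameters defined in version 2.x of the HAPI specification (`id`, `time.min`, `time.max`, `parameters`); version 3 of the specification renames them to `dataset`, `start` and `stop`. The server address is a placeholder, and using an ODI table name as the HAPI dataset id, as well as the parameter names, is an assumption.

```python
from typing import Optional, Sequence
from urllib.parse import urlencode


def hapi_data_url(
    server: str,
    dataset: str,
    start: str,
    stop: str,
    parameters: Optional[Sequence[str]] = None,
) -> str:
    """Build a HAPI 2.x 'data' request URL for one dataset and time range."""
    query = {"id": dataset, "time.min": start, "time.max": stop}
    if parameters:
        # HAPI expects a comma-separated list of parameter names.
        query["parameters"] = ",".join(parameters)
    return server.rstrip("/") + "/hapi/data?" + urlencode(query)


if __name__ == "__main__":
    # Placeholder server; dataset id and parameter names are assumed.
    print(
        hapi_data_url(
            "https://odi.example.org",
            "dataset_sampex_pet_ref0",
            "2003-01-01T00:00:00Z",
            "2003-01-02T00:00:00Z",
            ["fpdu_1", "odi_unilib_l"],
        )
    )
```

The response to such a request is a CSV or JSON stream, so the URL can be passed to any HTTP client without a HAPI-specific library.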

  12. Datasets supported in ODI distribution (~330)
      - ACE: archive and real time EPAM, SWEPAM, MAG, SIS
      - DSCOVR: real time IMF and plasma data (JSON streams)
      - GOES: archive (SMS01-GOES15) and real time data
      - SREM: PROBA1, Integral, GioveB, Rosetta, Herschel, Planck
      - Magnetic and solar indices (Kp, Dst, F10.7, ISN, OMNI, ...)
      - Interplanetary particle datasets: HELIOS, IMP8, Voyager, Pioneer, Wind
      - Radiation belt missions: AZUR/EI-88, GPS/CXD, S3-3/PT, CRRES/MEA/HEEF/PROTEL, UARS/PEM, SAMPEX/PET, NOAA/POES/SEM2, XMM/ERMD, PROBA-V/EPT, RBSP/HOPE/MAGEIS/REPT/RPS, HIMAWARI/SEDA
      - Proprietary datasets: MIR/REM, STRV1B/REM, AMPTE/UKS, EQUATOR-S, ISEE1/WIM/KED, Meteosat/SEM, Galileo/EMU, TSX-5/CEASE

  13. ODI applications
      - ESA projects
        - HIERRAS, SEPEM, SEDAT, SPENVIS, SAAPS, JHelioViewer, ESPREM, SAWS-ASPECS, SREN
        - RENELLA, VALIRENE, ecIRENE, PEM, SRREM, GALEM
        - SSA: P2-SWE-II (I-ESC SGIArv), VSWMC, P2-SWE-XIII (SaRIF), P3-SWE-XXI (NGRM)
      - EC FP7 projects
        - SEPServer
        - SPACECAST, SPACESTORM
        - EURISGIC

  14. SQL tips and tricks
      - Why SQL for time series data?
        - Removes dependence on data files: read once and forget
        - CDF epoch is the primary key -> very fast data selection by time range
        - SQL provides very powerful and efficient data processing and retrieval functionality
        - Applications can be developed independent of dataset
      - Time averaging:
          SELECT FLOOR(cdf_epoch/86400.0E3) AS day, AVG(FPDU_1)
          FROM dataset_sampex_pet_h
          WHERE ...
          GROUP BY day ASC
      - (L, α0) binning:
          SELECT AVG(fpdu_1),
                 FLOOR(odi_unilib_l*100)/100 AS l,
                 FLOOR(odi_unilib_alpha_eq) AS alpha
          FROM dataset_sampex_pet_h_ref0
          WHERE fpdu_1>=0 AND fpdu_quality_esa_1=0
          GROUP BY l, alpha
      - Complex queries
      - Queries combining datasets
      - VALIRENE data selection query (PROBA1/SREM counts) for IRENE validation:
          SELECT cdf_epoch,
                 odi_position_1 AS X, odi_position_2 AS Y, odi_position_3 AS Z,
                 odi_unilib_l AS mcl, odi_unilib_b_calc AS mcb, odi_unilib_alpha_eq AS mca0,
                 30.0 AS deltat, 90.0 AS pitchangle,
                 countrate_1 AS f1, countrate_2 AS f2, countrate_3 AS f3,
                 countrate_4 AS f4, countrate_5 AS f5, countrate_6 AS f6,
                 countrate_7 AS f7, countrate_8 AS f8, countrate_9 AS f9,
                 countrate_10 AS f10, countrate_11 AS f11, countrate_12 AS f12,
                 countrate_13 AS f13, countrate_14 AS f14, countrate_15 AS f15
          FROM dataset_proba1_srem_pacc_v0
          WHERE countrate_1>=0 AND countrate_2>=0 AND countrate_3>=0
            AND countrate_4>=0 AND countrate_5>=0 AND countrate_6>=0
            AND countrate_7>=0 AND countrate_8>=0 AND countrate_9>=0
            AND countrate_10>=0 AND countrate_11>=0 AND countrate_12>=0
            AND countrate_13>=0 AND countrate_14>=0 AND countrate_15>=0
            AND (odi_unilib_l>1 AND odi_unilib_l<5
                 AND DEGREES(ATAN2(odi_position_2, odi_position_1))>-120
                 AND DEGREES(ATAN2(odi_position_2, odi_position_1))<90
                 AND DEGREES(ACOS(odi_position_3/SQRT(POW(odi_position_1, 2)+POW(odi_position_2, 2)+POW(odi_position_3, 2))))>70)
            AND cdf_epoch>=datetocdfepoch('2003-01-01 00:00:00')
            AND cdf_epoch<=datetocdfepoch('2003-12-31 23:59:59.999')
          ORDER BY cdf_epoch ASC
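The time averaging query on this slide works because the CDF epoch is in milliseconds, so dividing by 86400.0E3 and taking the integer part yields a day index to group on. A minimal sketch of the same idea follows, using an in-memory SQLite table as a stand-in for the MariaDB data table. The table name, the sample rows and the `fpdu_1 >= 0` filter (borrowed from the binning query) are assumptions, and `CAST(... AS INTEGER)` replaces `FLOOR` because SQLite does not always provide math functions; for positive epochs the two give the same result.

```python
import sqlite3
from typing import Iterable, List, Tuple

MS_PER_DAY = 86_400_000.0  # CDF epoch is in milliseconds


def daily_means(rows: Iterable[Tuple[float, float]]) -> List[Tuple[int, float]]:
    """Average flux per day; rows are (cdf_epoch_ms, flux) pairs."""
    con = sqlite3.connect(":memory:")
    # Hypothetical stand-in for an ODI data table keyed on the CDF epoch.
    con.execute("CREATE TABLE dataset_demo (cdf_epoch REAL PRIMARY KEY, fpdu_1 REAL)")
    con.executemany("INSERT INTO dataset_demo VALUES (?, ?)", rows)
    result = con.execute(
        "SELECT CAST(cdf_epoch / ? AS INTEGER) AS day, AVG(fpdu_1) "
        "FROM dataset_demo WHERE fpdu_1 >= 0 "
        "GROUP BY day ORDER BY day",
        (MS_PER_DAY,),
    ).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    t0 = 63113904000000.0  # CDF epoch of 2000-01-01 00:00:00
    sample = [
        (t0, 1.0),
        (t0 + 3_600_000.0, 3.0),       # one hour later, same day
        (t0 + MS_PER_DAY, 5.0),        # next day
        (t0 + MS_PER_DAY + 1000.0, -1.0),  # negative fill value, filtered out
    ]
    print(daily_means(sample))
```

Changing the divisor changes the bin width, for example 3600.0E3 for hourly averages.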

  15. Future work
      - Updates: client interfaces, NetCDF parser, REST and HAPI interfaces, documentation
      - Generic parser for CSV files
      - Enhance functionality for ingesting FITS and PDS data files
      - Parsing of SPASE metadata (http://spase-group.org)
      - Support for new datasets (e.g. GOES-R)
      - Support for JSON data types
      - Software maintenance
      - Dataset maintenance
      - User support / help desk
      - Any suggestions?
