Updates and Enhancements in ERDDAP Version 2.10 by Bob Simons at NOAA NMFS SWFSC ERD Monterey, CA


ERDDAP version 2.10 brings new features and improvements including updated libraries, support for unsigned integer types, revised testing system, enhanced search functionalities, and new visualization options for users. The update addresses challenges such as missing values, and offers better support for various data types. These advancements aim to provide more efficient data access and analysis capabilities for users.


Uploaded on Sep 29, 2024



Presentation Transcript


  1. New ERDDAP Features in v2.10 Bob Simons DOC / NOAA / NMFS / SWFSC / ERD Monterey, CA bob.simons@noaa.gov

  2. ERDDAP v2.10 Almost done Feb. 2020. Last step: update libraries, including netcdf-java (finally to v5.x) One big change ... ROI/ROE - Ideally: low effort, high value. vs. high effort, but necessary.

  3. Most effort by far: Unsigned Integer Types E.g., Byte's range is -128 to 127. UByte's range is 0 to 255. Full support internally. Support as much as possible elsewhere. Problem #1: Java doesn't support them, so I wrote code to simulate them. PrimitiveArrays, e.g., UByteArray Problem #2: Double can't contain all Long and ULong values. E.g., calculating min/max value. Solution: PAOne. Everywhere! Problem #3: ERDDAP data types can't just be Java data types. Solution: PAType. Everywhere! Problem #4: .nc3 files don't support unsigned integer types and OPeNDAP 2.0 has only partial support. The version number jump is because of this.
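Problems #1 and #2 on this slide can be illustrated outside ERDDAP. The sketch below is Python, not ERDDAP's Java code, and the function names `signed_byte_to_ubyte` and `survives_double` are hypothetical. It shows the common trick of masking a signed 8-bit value with 0xFF to read it as unsigned, and why a 64-bit double (53 bits of mantissa) cannot hold every Long or ULong value.

```python
def signed_byte_to_ubyte(b: int) -> int:
    """Reinterpret a signed 8-bit value (-128..127) as unsigned (0..255)."""
    if not -128 <= b <= 127:
        raise ValueError("not a signed byte")
    return b & 0xFF


def survives_double(n: int) -> bool:
    """True if the integer n is unchanged after a round trip through a double."""
    return int(float(n)) == n


LONG_MAX = 2**63 - 1    # largest signed 64-bit value
ULONG_MAX = 2**64 - 1   # largest unsigned 64-bit value

print(signed_byte_to_ubyte(-1))      # the bit pattern 0xFF read as unsigned
print(survives_double(2**53))        # still exact
print(survives_double(2**53 + 1))    # first integer a double cannot represent
print(survives_double(LONG_MAX), survives_double(ULONG_MAX))
```

This is presumably why min/max calculations cannot simply be done in doubles once Long and ULong data are involved.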

  4. No missing_value or _FillValue? Technically not a netcdf-java 5.x issue, but some variables sometimes have no mv, e.g., RGB values. ERDDAP always had default mv for integer types: MAX_VALUE, e.g., 127 for Byte. Now it is optional. Required changes everywhere.
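A minimal sketch of what "optional" means here, in Python rather than ERDDAP's Java. The function `to_doubles` is hypothetical. It maps a value to NaN only when a missing value is actually defined, so a variable such as an RGB component can use its full range.

```python
import math
from typing import List, Optional

BYTE_MAX = 127  # the former default missing value for Byte, per the slide


def to_doubles(values: List[int], missing_value: Optional[int]) -> List[float]:
    """Convert integers to floats, mapping the missing value (if any) to NaN."""
    result = []
    for v in values:
        if missing_value is not None and v == missing_value:
            result.append(math.nan)
        else:
            result.append(float(v))
    return result


# With a missing value defined, 127 is treated as "no data".
with_mv = to_doubles([1, 127], BYTE_MAX)
# With no missing value, 127 is an ordinary data value.
without_mv = to_doubles([1, 127], None)
```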

  5. And Revised Testing System Now, easy for me to run all non-interactive or interactive tests.

  6. New Features for Users

  7. Search Search terms in double quotes are interpreted as .json strings (with \-encoded characters). E.g. "institution=NOAA\n" won't match "institution=NOAA NCEI". Thanks to Dan Nowacki (USGS).
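The example on this slide can be reproduced with any JSON parser. The sketch below is Python and assumes, for illustration only, that the searchable text holds one `name=value` pair per line; the `metadata_*` strings are made up.

```python
import json


def decode_search_term(quoted: str) -> str:
    """Decode a double-quoted search term as a JSON string."""
    return json.loads(quoted)


# The raw string keeps the backslash, as a user would type it.
term = decode_search_term(r'"institution=NOAA\n"')

# Hypothetical searchable text for two datasets.
metadata_ncei = "institution=NOAA NCEI\ntitle=Example A\n"
metadata_noaa = "institution=NOAA\ntitle=Example B\n"

print(term in metadata_ncei)  # the newline after NOAA prevents a match
print(term in metadata_noaa)
```

Ending the term with `\n` anchors the match to the end of the attribute's value.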

  8. Advanced Search Now strict for non-.html requests. Thanks to Rich Signell (USGS). Bigger and more accurate world map. Thanks to John Maurer (PacIOOS).

  9. Make A Graph Three new marker types: Borderless Filled Square | Circle | Up Triangle. Thanks to Marco Alba (EMODnet Physics), who provided the code. Draw land mask: outline | off. Thanks to John Maurer (PacIOOS).

  10. "files" System Supports plainFile type responses. E.g., a directory listing as a plainFile type (e.g., .csv or .json). Thanks to Kyle Wilcox (Axiom).

  11. Properly Percent Encoded URLs generated by Data Access Form (.html) and Make A Graph (.graph) are now fully percent encoded. Harder to read but technically correct. Better for security. Doesn't require relaxedQueryChars="[]|" in server.xml. Thanks to Antoine Queric (IFREMER).
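To see what full percent encoding does to the characters mentioned on this slide, here is a small Python sketch using the standard library. The constraint string is a hypothetical griddap-style example, not taken from a real dataset, and the helper would be applied to individual pieces of a query, not to the separators between them.

```python
from urllib.parse import quote, unquote


def percent_encode(text: str) -> str:
    """Percent-encode every character except letters, digits, and _ . - ~"""
    return quote(text, safe="")


# The three characters that otherwise need relaxedQueryChars in server.xml.
print(percent_encode("[]|"))

# Hypothetical griddap-style constraint, for illustration only.
constraint = "sst[(2020-01-01T00:00:00Z)][(30):(40)]"
encoded = percent_encode(constraint)
print(encoded)

# Decoding restores the original text.
print(unquote(encoded) == constraint)
```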

  12. Byte Range Requests (e.g., from netcdf-java tools) Now forbidden for .nc and .hdf files. Horribly inefficient. Error-prone clients. Use (Open)DAP requests instead.

  13. Float and Double Representation When appropriate: more rounded in more places, e.g., 32.10000000000000002 might now appear as 32.1 (thanks to Kyle Wilcox (Axiom)), or floats appearing as floats (32.18562), not doubles (32.18561968534027972).
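The float-versus-double effect is easy to reproduce. The Python sketch below is not ERDDAP's code: `widen_float32` and `format_as_float32` are hypothetical helpers, and printing 7 significant digits is an assumed display rule that works for this value, not a general guarantee.

```python
import struct


def widen_float32(x: float) -> float:
    """Round x to the nearest 32-bit float, then return it as a double."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def format_as_float32(x: float) -> str:
    """Format with 7 significant digits, roughly a float's precision."""
    return f"{x:.7g}"


widened = widen_float32(32.18562)
print(repr(widened))               # shows extra digits from widening
print(format_as_float32(widened))  # shows the value as a float
```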

  14. New Features for Administrators

  15. Biggest feature for admins: Script SourceNames (AKA Derived Variables) Lets you make new variables based on existing variables in the source files in EDDTableFrom...Files, EDDTableFromDatabase, and EDDTableFromFileNames datasets, e.g., <sourceName>=Math2.anglePM180(row.columnDouble("lon"))</sourceName> Access to 5 classes of static methods (including Math2, String2, and Calendar2). May be an expression or a multi-line script, with variables. Safe and secure. Thanks to Bob Simons, Kevin O'Brien (PMEL), Roland Schweitzer (PMEL), John Maurer (PacIOOS), and the Apache JEXL library. <sourceName>=NaN</sourceName> is now allowed. Can be useful for certain types of advanced searches. Thanks to Mathew Biddle (BCO-DMO/IOOS).
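For readers unfamiliar with the example expression, the Python sketch below shows the kind of derived value it produces: a longitude converted to the -180 to 180 range. The function `angle_pm180` is a stand-in written for this illustration, and its handling of the boundary (180 maps to -180) is an assumption, not a statement about Math2.anglePM180.

```python
def angle_pm180(degrees: float) -> float:
    """Map an angle in degrees into the half-open range [-180, 180)."""
    return (degrees + 180.0) % 360.0 - 180.0


# A source column of longitudes in 0..360, and the derived column.
source_lon = [0.0, 90.0, 190.0, 359.0]
derived_lon = [angle_pm180(lon) for lon in source_lon]
print(derived_lon)
```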

  16. "files" System Rewritten to add new functionality and wider support, including new support in EDDGridAggregateExistingDimension, EDDGridFromEDDTable, EDDGridFromErddap, EDDGridFromEtopo, EDDGridSideBySide, EDDTableFromEDDGrid, and EDDTableFromErddap datasets. Changed to encourage always making source files accessible for all datasets: in setup.xml, <defaultAccessibleViaFiles>true</defaultAccessibleViaFiles>, and GenerateDatasetsXml no longer adds an <accessibleViaFiles>false</accessibleViaFiles> tag. "files" is the preferred system for many use cases.

  17. Support for .nc4 and .hdf5 Groups GenerateDatasetsXml for EDDGridFromNcFiles asks for group. Gets global metadata from group and parent (root) group. sourceName, e.g., aGroup/aVar becomes destinationName, e.g., aGroup_aVar (suggested), since almost all output file types don't support groups. Thanks to Charles Carleton (DREN) and Jessica Hausman (JPL).

  18. EDDGridFrom...Files Now okay if some files don't have some of the variables. Appears as all missing values. This allows aggregation of dissimilar files. (I should have done this long ago.) Thanks to Dale Robinson (CoastWatch, West Coast).

  19. EDDTableFrom...Files quickRestart quickRestart is now faster for nc-related file types because makeExpected() now just reads metadata. Thanks to Jessica Austin (Axiom).

  20. DimensionsCSV GenerateDatasetsXml for EDDGridFromNcFiles (Unpacked) asks for DimensionsCSV, so you can make a dataset with variables which share the same, specified dimensions. (I should have done this long ago.)

  21. EDDGridFromNcFilesUnpacked Now Standardizes units to avoid problems when different files have slightly different units, e.g., "meter per second", "meters/second", "m.s^-1", and "m s-1" all become "m s-1".
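The effect described on this slide can be pictured as mapping spelling variants to one canonical string. The Python sketch below uses a plain lookup table built from the slide's four examples. That is a simplification chosen for illustration and is not how ERDDAP implements units standardization; `standardize_units` is a hypothetical name.

```python
# The canonical form and its variants, taken from the slide's example.
CANONICAL = "m s-1"
VARIANTS = {"meter per second", "meters/second", "m.s^-1", "m s-1"}


def standardize_units(units: str) -> str:
    """Return the canonical form for known variants, else the input unchanged."""
    return CANONICAL if units.strip() in VARIANTS else units


per_file_units = ["meter per second", "meters/second", "m.s^-1", "m s-1"]
standardized = {standardize_units(u) for u in per_file_units}
print(standardized)  # one units string, so the files can be aggregated
```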

  22. Pseudo High Precision Times E.g., "2020-05-22T01:02:03.456000000Z". Thanks to Yibo Jiang (JPL).
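One reading of "pseudo" is that the underlying value has millisecond precision and the remaining digits are zero padding. Under that assumption (not stated on the slide), the Python sketch below reproduces the slide's example string. The function name `to_pseudo_nanos` is hypothetical.

```python
from datetime import datetime, timezone


def to_pseudo_nanos(dt: datetime) -> str:
    """Format a time with 9 decimal digits, of which only 3 carry information."""
    utc = dt.astimezone(timezone.utc)
    base = utc.strftime("%Y-%m-%dT%H:%M:%S")
    millis = utc.microsecond // 1000
    return f"{base}.{millis:03d}000000Z"


stamp = to_pseudo_nanos(
    datetime(2020, 5, 22, 1, 2, 3, 456000, tzinfo=timezone.utc)
)
print(stamp)
```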

  23. EDDTableFromAsciiFiles and EDDTableFromColumnarAsciiFiles <skipHeaderToRegex>\*\*\* END OF HEADER.*</skipHeaderToRegex> skips header lines until regex is found. <skipLinesRegex>#.*</skipLinesRegex> skips all lines matching the regex. Thanks to Eli Hunter (Rutgers University).
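The two settings can be sketched as a line filter. The Python below is an illustration, not ERDDAP's implementation: `filter_lines` is a hypothetical function, and it assumes that each regex must match a whole line and that the line matching `skipHeaderToRegex` is itself discarded.

```python
import re
from typing import List, Optional


def filter_lines(
    lines: List[str],
    skip_header_to_regex: Optional[str] = None,
    skip_lines_regex: Optional[str] = None,
) -> List[str]:
    """Drop the header (through the matching line) and any skippable lines."""
    header_re = re.compile(skip_header_to_regex) if skip_header_to_regex else None
    skip_re = re.compile(skip_lines_regex) if skip_lines_regex else None
    in_header = header_re is not None
    kept = []
    for line in lines:
        if in_header:
            # Assumption: the matching line ends the header and is dropped too.
            if header_re.fullmatch(line):
                in_header = False
            continue
        if skip_re is not None and skip_re.fullmatch(line):
            continue
        kept.append(line)
    return kept


sample = [
    "Station notes, free text",
    "*** END OF HEADER ***",
    "time,temp",
    "# sensor recalibrated",
    "2020-01-01,10.5",
]
data = filter_lines(sample, r"\*\*\* END OF HEADER.*", r"#.*")
print(data)
```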

  24. CF DSG Validity Checking Improved. E.g., Are any variables in cdm_trajectory_variables also in cdm_profile_variables? Thanks to Micah Wengren (IOOS).
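The example check on this slide amounts to a set intersection. A minimal Python sketch follows, assuming each attribute holds a comma-separated list of variable names; the function `shared_variables` and the sample attribute values are hypothetical.

```python
from typing import Dict, List


def shared_variables(global_attributes: Dict[str, str], attr_a: str, attr_b: str) -> List[str]:
    """Return variable names listed in both comma-separated attributes."""
    def names(attr: str) -> set:
        raw = global_attributes.get(attr, "")
        return {name.strip() for name in raw.split(",") if name.strip()}

    return sorted(names(attr_a) & names(attr_b))


attributes = {
    "cdm_trajectory_variables": "trajectory_id, platform",
    "cdm_profile_variables": "profile_id, time, platform",
}
overlap = shared_variables(
    attributes, "cdm_trajectory_variables", "cdm_profile_variables"
)
print(overlap)  # a non-empty result would flag the dataset as invalid
```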

  25. Simplify Used to make StringArray into e.g., IntArray. Faster and uses less memory. Thanks to Unidata.
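As a rough picture of what simplifying does, the Python sketch below picks the narrowest type that can hold every string in a column. It is an illustration under simple assumptions (signed integer types only, then double, then String), and `simplest_type` is a hypothetical name, not ERDDAP's method.

```python
from typing import List


def simplest_type(strings: List[str]) -> str:
    """Name the narrowest type that can represent every string in the list."""
    if not strings:
        return "String"
    try:
        ints = [int(s) for s in strings]
    except ValueError:
        ints = None
    if ints is not None:
        lo, hi = min(ints), max(ints)
        for name, bits in (("byte", 8), ("short", 16), ("int", 32), ("long", 64)):
            if -(2 ** (bits - 1)) <= lo and hi <= 2 ** (bits - 1) - 1:
                return name
    try:
        for s in strings:
            float(s)
        return "double"
    except ValueError:
        return "String"


print(simplest_type(["1", "2", "3"]))
print(simplest_type(["70000", "-5"]))
print(simplest_type(["1.5", "2"]))
print(simplest_type(["a", "2"]))
```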

  26. Better Documentation For creating datasets from files in AWS S3 buckets. Thanks to Micah Wengren (IOOS). Better system for checking for broken links. Lots of little improvements.

  27. Lots of Small Changes and Bug Fixes

  28. Read about ERDDAP and try it out http://coastwatch.pfeg.noaa.gov/erddap/ Download and install ERDDAP http://coastwatch.pfeg.noaa.gov/erddap/download/setup.html Thank you! Questions? Comments? Suggestions? bob.simons@noaa.gov
