SpatioTemporal Adaptive Resolution Encoding (STARE): A Versatile Data Store Leveraging HDF Virtual Object Layer

 
STARE-PODS: A VERSATILE DATA STORE LEVERAGING THE HDF VIRTUAL OBJECT LAYER FOR COMPATIBILITY
 
Michael L Rilee (1,2), Kwo-Sen Kuo (1,3), James Gallagher (4), James Frew (5), Niklas Griessbaum (5), Edward Hartnett (6), Robert Wolfe (1), Gerd Heber (7), Siri Jodha Khalsa (8)
 
 
(1) NASA Goddard Space Flight Center, Greenbelt, Maryland, USA
(2) Rilee Systems Technologies LLC, Derwood, Maryland, USA
(3) Bayesics LLC, Bowie, Maryland, USA
(4) OPeNDAP, Inc., Narragansett, Rhode Island, USA
(5) University of California, Santa Barbara, California, USA
(6) Ed Hartnett Consulting, Boulder, CO, USA
(7) The HDF Group, Champaign, IL, USA
(8) Coloradio Associates for Science and Technology LLC, Boulder, CO, USA
 
2020 ESIP Summer Meeting
2020 July 22
 
STARE
Proposal No. 17-ACCESS17-0039
Federal Award ID No. 80NSSC18M0118
 
SpatioTemporal Adaptive Resolution Encoding (STARE)
 
STARE-PODS for scalable Analysis Ready Data (ARD)
 
Diverse low-level Earth Science data (ESD) requires special treatment to co-align and combine for integrative analysis.
The SpatioTemporal Adaptive Resolution Encoding (STARE) provides a unifying indexing scheme to combine geo-located ESD.
STARE-partitioned ESD enables a Parallel Optimized Data Store (PODS).
HDF’s Virtual Object Layer (VOL) and Virtual Data Set (VDS) technologies can provide familiar front-ends to data in STARE-PODS.
STARE-PODS unifies access to diverse data with minimum duplication.
 
STARE-PODS is a proposal to NASA/ACCESS-19 currently under review.
 
STARE Basics
 
 
Existing native array & memory indexing impedes integration and processing.
 
 
Two swath sections, A and B, overlap with the region of interest (ROI) outlined in black, with data on
separate computational nodes (numbered).
 
Parallel & distributed indexing based on native array partitioning leads to extra data movement, breaking SCALABILITY.
 
Higher-res nadir
 
Lower-res wing
Region of interest
 
STARE: Encoding locations in a recursive spatial quad-tree
 
STARE Temporal indexing is similar but based on calendrical periods.
 
A tilted root polyhedron (0th level)
First refinement level (1st level)
STARE Spatial ‘Trixels’ encoded as 64-bit integers
 
Parallel Store, SciDB…
Worker Nodes 1-4
Chunk 1: ffc0-ffcc
Chunk 2: ffd0-ffdc
Chunk 3: ffe0-ffec
Chunk 4: fff0-fffc
 
Level 3: N3333
Bits 1 1 11 11 11 11 -> 0xffc (right justified)
N3333 -> ffc0000000000000 @level 3 (left justified)

Level 4: N33330-N33333 (child bits 00, 01, 10, 11)
N33330 -> ffc0000000000000 @level 4
N33333 -> fff0000000000000 @level 4

Level 5: N333300-N333333 (child bits 0000, 0011, 1100, 1111 shown)
N333300 -> ffc0000000000000 @level 5
N333333 -> fffc000000000000 @level 5

Figure labels: Levels 3, 4, 5; “Chunks” @level 5
 
STARE Spatial Hierarchical Triangular Mesh (HTM) Indexing: spherical triangles to integers via quadtree recursion
- aids comparison of different data sets; integer operations are much faster than geometric calculations
- bit pattern keeps co-located data together when “chunked”
STARE Temporal Hierarchical Calendrical Partitioning (HCP): similar but with branching based on calendar partitions
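The quad-tree recursion and the chunking claim can be checked with a few lines of Python. This is a minimal sketch of the illustrative bit layout drawn on the slide (a leading 1 bit, one hemisphere bit, two bits per quad-tree digit, left-justified in 64 bits), not the encoding used by the STARE library. The function names `symbol_to_id` and `chunk_of` are hypothetical, and mapping S to a 0 bit is an assumption, since the slide only shows N.

```python
def symbol_to_id(symbol: str) -> int:
    """Pack a trixel symbol such as 'N3333' into a left-justified 64-bit integer.

    Layout follows the slide's illustration: a leading 1 bit, one hemisphere
    bit (N -> 1 as on the slide; S -> 0 is an assumption), then two bits per
    quad-tree digit, padded with zeros on the right.
    """
    hemisphere, digits = symbol[0], symbol[1:]
    if hemisphere not in ("N", "S") or any(d not in "0123" for d in digits):
        raise ValueError(f"not a trixel symbol: {symbol!r}")
    bits = "1" + ("1" if hemisphere == "N" else "0")
    bits += "".join(format(int(d), "02b") for d in digits)
    return int(bits.ljust(64, "0"), 2)


def chunk_of(trixel_id: int) -> int:
    """Chunk number 1..4, taken from the level-4 child bits (bits 53 and 52)."""
    return ((trixel_id >> 52) & 0b11) + 1


# Reproduce the left-justified values listed on the slide.
for symbol in ("N3333", "N33330", "N33333", "N333300", "N333333"):
    print(symbol, format(symbol_to_id(symbol), "016x"))
# N3333 ffc0000000000000
# N33330 ffc0000000000000
# N33333 fff0000000000000
# N333300 ffc0000000000000
# N333333 fffc000000000000

# Level-5 trixels that share a level-4 parent land in the same chunk.
for symbol in ("N333300", "N333303", "N333313", "N333320", "N333333"):
    print(symbol, "-> chunk", chunk_of(symbol_to_id(symbol)))
# N333300 -> chunk 1
# N333303 -> chunk 1
# N333313 -> chunk 2
# N333320 -> chunk 3
# N333333 -> chunk 4
```

Because children share their parent's leading bits, a simple shift and mask is enough to route neighbouring trixels to the same chunk, which is the property the slide relies on for parallel placement.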
 
 
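STARE vs Floating-Point Encoding
                                   Longitude     Latitude
Human readable                     +123.4        60
Single-precision floating-point    0x42f6cccd    0x42700000
STARE id*                          0x36ee9398f7210f34
* The STARE id also includes resolution information. In this case, it points to quadfurcation level 20, i.e., about 10 m.
The smallest triangle in the figure is at quadfurcation level 6.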
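The values on this slide (+123.4 and 60 as 0x42f6cccd and 0x42700000, and the STARE id 0x36ee9398f7210f34) can be reproduced with a short sketch. The float conversion uses Python's standard `struct` module. Reading the resolution level from the low 5 bits of the id is an assumption that happens to agree with the level 20 quoted on the slide. The edge-length estimate additionally assumes that a root edge spans a quarter of a great circle and that each level halves it. The helper names are hypothetical.

```python
import math
import struct


def float32_hex(value: float) -> str:
    """Big-endian IEEE 754 single-precision bit pattern as a hex string."""
    return "0x" + struct.pack(">f", value).hex()


def stare_level(stare_id: int) -> int:
    """Resolution level, assumed to live in the low 5 bits of the id."""
    return stare_id & 0x1F


def approx_edge_m(level: int, radius_m: float = 6_371_000.0) -> float:
    """Rough trixel edge length: a quarter great circle halved `level` times."""
    return (math.pi / 2) * radius_m / 2**level


print(float32_hex(123.4))                    # 0x42f6cccd
print(float32_hex(60.0))                     # 0x42700000
print(stare_level(0x36EE9398F7210F34))       # 20
print(round(approx_edge_m(20), 1))           # 9.5
```

The two floats carry position only, while the single 64-bit integer carries both position and a resolution hint, which is what lets integer comparisons stand in for geometric ones.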
 
Supporting conventional lon-lat vs. STARE-based integration
STARE indexing adapts to the resolution of the data, which often varies.
Figure labels: MODIS; one “scan” with ten sensors; MODIS pixel (nadir resolution); GOES pixel; lon-lat search area for combining data; NADIR; WING
 
2+1 Dimensions indexed with two integers
 
STARE Volumes
(not to scale)
 
Parallelization for Volume & Variety Scaling

STARE supporting a 16-way partitioning co-locating diverse data
 
GOES (red/brown) and MODIS (blue) granules integrated using STARE (visualized in equirectangular projection)
 
Using STARE to combine GOES and MODIS data
 
Can use key-value store to integrate
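One way to picture the key-value integration is a dictionary keyed by STARE ids coarsened to a common level. The sketch below is illustrative only. It assumes a 64-bit layout of 2 leading bits, 3 root-trixel bits, 2 bits per level, and a 5-bit level field at the bottom, which is inferred from the example ids in the supplemental slides rather than taken from the STARE library. The functions `coarsen` and `integrate` are hypothetical, and the observation values are placeholder strings, not real GOES or MODIS data.

```python
from collections import defaultdict

TOP_BITS = 5     # 2 leading bits + 3 root-trixel bits (assumed layout)
LEVEL_MASK = 0x1F  # low 5 bits hold the resolution level (assumed layout)


def coarsen(stare_id: int, level: int) -> int:
    """Truncate the location to `level` and record that level in the low bits."""
    if (stare_id & LEVEL_MASK) < level:
        raise ValueError("cannot coarsen to a finer level than the id carries")
    keep = TOP_BITS + 2 * level
    mask = ((1 << keep) - 1) << (64 - keep)
    return (stare_id & mask) | level


def integrate(sources: dict, level: int) -> dict:
    """Group observations by coarsened id; keep keys seen by every source."""
    store = defaultdict(lambda: defaultdict(list))
    for name, observations in sources.items():
        for stare_id, value in observations:
            store[coarsen(stare_id, level)][name].append(value)
    return {key: dict(by_source) for key, by_source in store.items()
            if len(by_source) == len(sources)}


# Placeholder observations keyed by ids that appear in the supplemental slides.
sources = {
    "GOES": [(0x1049E6000000000A, "goes_obs_0")],
    "MODIS": [(0x1049E66DAB30632B, "modis_obs_0"),
              (0x104E000000000005, "modis_obs_1")],
}
for key, by_source in integrate(sources, level=5).items():
    print(hex(key), by_source)
# 0x1048000000000005 {'GOES': ['goes_obs_0'], 'MODIS': ['modis_obs_0']}
```

The same pattern maps onto any key-value store: the coarsened id is the key, and co-located observations from different instruments end up under the same key without any geometric search.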
 
 
 
HDF Virtual Object Layer and Virtual Data Sets
 
 
 
 
 
Scalable Homogenized Analysis Ready Data Store (STARE-SHARDS)
Usability
HDF Virtual Data Set API
HDF Virtual Data Set for tailoring views into the data
STARE-SHARDS Storage Layer
Volume & variety scalability
Actual data partitioned into chunks for parallelism with unified search and co-alignment.
Individual instrument fields of view
 
 
Using familiar HDF methods to access STARE-SHARDS
End users and legacy applications interact with STARE-SHARDS transparently.
Use a STARE ‘cover’ to partition a granule.
STARE partitioned swath data looks like familiar HDF files.
Different sources and varieties of data with different coverage, resolutions…
Figure labels: Data Source 1, Data Source 2, Data Source 3, Data Source A, Data Source B; HDF Virtual Granule
 
 
 
The Proposed Architecture
STARE SHARDS to PODS to Integrative Analysis
 
Computing & Storage
 
Index & Organization
 
Query, Marshalling, “Transport”
 
Use & Tooling
 
The Architecture
STARE SHARDS to PODS to Integrative Analysis
 
STARE Location Service (SLS)
A ‘DNS’ for geolocated data
 
Conclusion: STARE-PODS for scalable integrative analysis
 
STARE lays the foundation for scaling both variety and volume
Supports lower-level (L1 & L2) data accessibility, combination, and scalability
Features C++ and Python APIs, including a Pandas-like interface
STARE Sidecar files limit costs of translation into STARE indices
OPeNDAP integration is in progress
Libraries, examples, tests, and cookbooks at https://github.com/SpatioTemporal
STARE-PODS and STARE-SHARDS
Organize diverse data for co-alignment and parallel/distributed storage and processing
HDF Virtual Object Layer and Data Set support transparent legacy access
 
Acknowledgments
STARE-PODS is a proposal to NASA/ACCESS-19 currently under review.
This work is supported by NASA/ACCESS-17. Federal Award ID No. 80NSSC18M0118.
Thanks to NASA/LaRC for interest and support.
 
Supplemental
 
NASA/ACCESS-17-39 STARE
80NSSC18M0118
M. Rilee
mike@rilee.net
Rilee Systems Technologies LLC
2019 October 21
 
Zooming in to the MODIS swath “bow-tie”
 
STARE Indexing adapts to the data
Figure labels: WING; NADIR; two “scans” overlapping
 
0x1048000000000005

0x1049e66dab30632b
 
STARE Spatial IDs
Level 5, green trixels
A  0x1048000000000005
B  0x104a000000000005
C  0x104c000000000005
D  0x104e000000000005
 
ROI+GOES
ROI+MODIS
ROI+GOES+MODIS
 
A: 0x1049e6000000000a
B: 0x1049e6600000000b
C: 0x1049e66dab30632b
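The ids A, B, and C above, together with the level-5 trixel 0x1048000000000005, nest inside one another, and that can be tested with integer operations alone. The sketch below assumes a layout of 2 leading bits, 3 root-trixel bits, 2 bits per level, and a 5-bit level field in the low bits. This layout is inferred from the ids on these slides and is not a statement of the STARE library's API. The helper names are hypothetical.

```python
TOP_BITS = 5  # 2 leading bits + 3 root-trixel bits (assumed layout)


def level_of(stare_id: int) -> int:
    """Resolution level, assumed to be stored in the low 5 bits."""
    return stare_id & 0x1F


def location_prefix(stare_id: int, level: int) -> int:
    """Leading bits that identify the enclosing trixel at `level`."""
    keep = TOP_BITS + 2 * level
    return stare_id >> (64 - keep)


def contains(coarse: int, fine: int) -> bool:
    """True if the trixel `coarse` encloses the location encoded by `fine`."""
    level = level_of(coarse)
    return (level_of(fine) >= level
            and location_prefix(coarse, level) == location_prefix(fine, level))


A = 0x1049E6000000000A   # level 10
B = 0x1049E6600000000B   # level 11
C = 0x1049E66DAB30632B   # full-precision location, level field 11
LEVEL5_A = 0x1048000000000005
LEVEL5_B = 0x104A000000000005

print(level_of(A), level_of(B), level_of(C))   # 10 11 11
print(contains(A, B), contains(B, C))          # True True
print(contains(LEVEL5_A, C))                   # True
print(contains(LEVEL5_B, C))                   # False
```

Under this assumed layout, containment reduces to comparing a shifted prefix, which is the kind of integer operation the deck contrasts with geometric calculations.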
 
Instrument Field of View and Point Spread Function Modeling
Integration at the finest level via IFOV and PSF modeling
Figure labels: indices i, j, k; “brown psf”; “blue psf”; combined signal (target); observation vectors (source); PSF weights
Finer trixels not shown for clarity.

STARE-PODS is a proposal by a team of experts aiming to provide a unifying indexing scheme for combining diverse Earth Science data. Leveraging the SpatioTemporal Adaptive Resolution Encoding (STARE) and Parallel Optimized Data Store (PODS), the system enables efficient processing and analysis of geo-located data. By utilizing HDF's Virtual Object Layer (VOL) and Virtual Data Set (VDS) technologies, STARE-PODS simplifies accessing and integrating complex Earth Science datasets, offering scalability and minimizing duplication.


Presentation Transcript


  1. SpatioTemporal Adaptive Resolution Encoding (STARE). STARE-PODS: A VERSATILE DATA STORE LEVERAGING THE HDF VIRTUAL OBJECT LAYER FOR COMPATIBILITY. Michael L Rilee (1,2), Kwo-Sen Kuo (1,3), James Gallagher (4), James Frew (5), Niklas Griessbaum (5), Edward Hartnett (6), Robert Wolfe (1), Gerd Heber (7), Siri Jodha Khalsa (8). (1) NASA Goddard Space Flight Center, Greenbelt, Maryland, USA; (2) Rilee Systems Technologies LLC, Derwood, Maryland, USA; (3) Bayesics LLC, Bowie, Maryland, USA; (4) OPeNDAP, Inc., Narragansett, Rhode Island, USA; (5) University of California, Santa Barbara, California, USA; (6) Ed Hartnett Consulting, Boulder, CO, USA; (7) The HDF Group, Champaign, IL, USA; (8) Coloradio Associates for Science and Technology LLC, Boulder, CO, USA. 2020 ESIP Summer Meeting, 2020 July 22. Proposal No. 17-ACCESS17-0039. Federal Award ID No. 80NSSC18M0118. ACCESS: Advancing Collaborative Connections for Earth System Science

  2. STARE-PODS for scalable Analysis Ready Data (ARD). Diverse low-level Earth Science data (ESD) requires special treatment to co-align and combine for integrative analysis. The SpatioTemporal Adaptive Resolution Encoding (STARE) provides a unifying indexing scheme to combine geo-located ESD. STARE partitioned ESD enables Parallel Optimized Data Store (PODS). HDF's Virtual Object Layer (VOL) and Virtual Data Set (VDS) technologies can provide familiar front-ends to data in STARE-PODS. STARE-PODS unifies accessing diverse data with minimum duplication. STARE-PODS is a proposal to NASA/ACCESS-19 currently under review.

  3. STARE Basics

  4. STARE Basics. Existing native array & memory indexing impedes integration and processing.

  5. Parallel & distributed indexing based on native array partitioning leads to extra data movement, breaking SCALABILITY. Two swath sections, A and B, overlap with the region of interest (ROI) outlined in black, with data on separate computational nodes (numbered). Figure labels: Higher-res nadir; Lower-res wing; Region of interest.

  6. STARE: Encoding locations in a recursive spatial quad-tree. A tilted root polyhedron (0th level). First refinement level (1st level). STARE Spatial Trixels encoded as 64-bit integers. STARE Temporal indexing is similar but based on calendrical periods.

  7. STARE Spatial Hierarchical Triangular Mesh (HTM) Indexing: spherical triangles to integers via quadtree recursion. It aids comparison of different data sets (integer operations are much faster than geometric calculations), and the bit pattern keeps co-located data together when chunked. STARE Temporal Hierarchical Calendrical Partitioning (HCP): similar but with branching based on calendar partitions. Level 3: N3333, bits 1 1 11 11 11 11 -> 0xffc (right justified), ffc0000000000000 (left justified). Level 4: N33330-N33333, from ffc0000000000000 to fff0000000000000. Level 5: N333300-N333333, from ffc0000000000000 to fffc000000000000. Chunks @level 5: Chunk 1 ffc0-ffcc, Chunk 2 ffd0-ffdc, Chunk 3 ffe0-ffec, Chunk 4 fff0-fffc. Parallel Store, SciDB. Worker Nodes 1-4.

  8. STARE vs Floating-Point Encoding. Human readable: longitude +123.4, latitude 60. Single-precision floating-point: longitude 0x42f6cccd, latitude 0x42700000. STARE id*: 0x36ee9398f7210f34. *STARE id also includes resolution information. In this case, it points to quadfurcation level 20, i.e. 10m. The smallest triangle in the figure is at quadfurcation level 6.

  9. Supporting conventional lon-lat vs. STARE-based integration. STARE indexing adapts to the resolution of the data, which often varies. Figure labels: MODIS; one scan with ten sensors; MODIS pixel (nadir resolution); lon-lat search area for combining data; GOES pixel; WING; NADIR.

  10. 2+1 Dimensions indexed with two integers. STARE SpatioTemporal Search/Index Volumes. Figure labels: Hurricane IRMA; Key West; Cuba; Sensor trajectory; STARE Volumes (not to scale).

  11. Parallelization for Volume & Variety Scaling

  12. STARE supporting a 16-way partitioning co-locating diverse data

  13-14. Using STARE to combine GOES and MODIS data. Can use key-value store to integrate. GOES (red/brown) and MODIS (blue) granules integrated using STARE (visualized in equirectangular projection).

  15. HDF Virtual Object Layer and Virtual Data Sets

  16-17. Scalable Homogenized Analysis Ready Data Store (STARE-SHARDS). Usability; HDF Virtual Data Set API; HDF Virtual Data Set for tailoring views into the data; STARE-SHARDS Storage Layer (slide 17 only); actual data partitioned into chunks for parallelism with unified search and co-alignment; individual instrument fields of view; volume & variety scalability.

  18-19. Using familiar HDF methods to access STARE-SHARDS. End users and legacy applications interact with STARE-SHARDS transparently. STARE partitioned swath data looks like familiar HDF files. Use a STARE cover to partition a granule. Different sources and varieties of data with different coverage, resolutions. Figure labels: HDF Virtual Granule; Data Source 1, Data Source 2, Data Source 3, Data Source A, Data Source B.

  20. The Proposed Architecture: STARE SHARDS to PODS to Integrative Analysis. Use & Tooling; Query, Marshalling, Transport; Index & Organization; Computing & Storage.

  21. The Architecture: STARE SHARDS to PODS to Integrative Analysis. STARE Location Service (SLS): a DNS for geolocated data.

  22. Conclusion: STARE-PODS for scalable integrative analysis. STARE lays the foundation for scaling both variety and volume. Supports lower-level (L1 & L2) data accessibility, combination, and scalability. Features C++ and Python APIs, including a Pandas-like interface. STARE Sidecar files limit costs of translation into STARE indices. OPeNDAP integration is in progress. Libraries, examples, tests, and cookbooks at https://github.com/SpatioTemporal. STARE-PODS and STARE-SHARDS organize diverse data for co-alignment and parallel/distributed storage and processing. HDF Virtual Object Layer and Data Set support transparent legacy access. Acknowledgments: STARE-PODS is a proposal to NASA/ACCESS-19 currently under review. This work is supported by NASA/ACCESS-17. Federal Award ID No. 80NSSC18M0118. NASA/LaRC for interest and support.

  23. Supplemental

  25, 26, 29. Slide footer only: NASA/ACCESS-17-39 STARE, 80NSSC18M0118, M. Rilee, mike@rilee.net, Rilee Systems Technologies LLC, 2019 October 21.

  27. Zooming in to the MODIS swath bow-tie. STARE Indexing adapts to the data. Figure labels: two scans overlapping; WING; NADIR.

  28. ROI+GOES, ROI+MODIS, ROI+GOES+MODIS. STARE Spatial IDs, Level 5, green trixels: A 0x1048000000000005, B 0x104a000000000005, C 0x104c000000000005, D 0x104e000000000005. Also shown: 0x1049e66dab30632b and 0x1048000000000005. Slide footer dated 2019 October 24.

  30. ROI+MODIS, ROI+GOES, ROI+GOES+MODIS. A: 0x1049e6000000000a, B: 0x1049e6600000000b, C: 0x1049e66dab30632b. Slide footer dated 2019 October 24.

  31. Instrument Field of View and Point Spread Function Modeling. Integration at the finest level via IFOV and PSF modeling. Figure labels: indices i, j, k; blue psf; brown psf. The equation's symbols were lost in extraction; its labeled terms are the combined signal (target), the observation vectors (source), and the PSF weights. Finer trixels not shown for clarity.
