PDS4 Core Concepts

 
 
Tech Session
 
28 February 2011
 
Anne Raugh
 
 
 
PDS4 Core Concepts
 
Data formats
Products
Archive design
Labels
XML
Namespaces
XML Schema
 
 
DATA FORMATS
 
Core Concepts – Data Formats
 
Arrays
 
 
$A(i_1, i_2, i_3)$

In an array $A(i_1, i_2, \ldots, i_n)$, the elements are stored such that the $i_n$ are
contiguous, then the $i_{n-1}$, etc., down to the $i_1$.
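
The storage rule above (last index contiguous) can be sketched as an offset calculation. This is a minimal illustration, not PDS4 tooling: the function name `flat_offset` and the use of 0-based indices are assumptions made for the example.

```python
from typing import Sequence


def flat_offset(indices: Sequence[int], shape: Sequence[int]) -> int:
    """Return the 0-based storage position of element A(i1, ..., in).

    The last index is contiguous (varies fastest), as described above.
    Indices here are 0-based, which is an assumption of this sketch.
    """
    if len(indices) != len(shape):
        raise ValueError("need one index per array dimension")
    offset = 0
    for index, extent in zip(indices, shape):
        if not 0 <= index < extent:
            raise IndexError(f"index {index} out of range for extent {extent}")
        # Each earlier axis strides over all of the axes that follow it.
        offset = offset * extent + index
    return offset


# A 2 x 3 x 4 array: stepping the last index moves one element,
# stepping the first index moves 3 * 4 = 12 elements.
print(flat_offset((0, 0, 1), (2, 3, 4)))  # 1
print(flat_offset((1, 0, 0), (2, 3, 4)))  # 12
```

Multiplying the returned offset by the element size in bytes would give the byte position within the stored array.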
 
Core Concepts – Data Formats
 
Tables
 
 
 
In a table, each record is stored contiguously starting with
the first field of the first record, then the second, and so on
to the last; then the first field of the second record, etc.
 
Table with four fields and five records.
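
As a rough sketch of the record-by-record layout described above, the following packs a table of four fields and five records into one contiguous byte string and reads it back. The field types, the big-endian byte order, and the helper names are all assumptions chosen for illustration; they are not taken from the PDS4 standard.

```python
import struct

# Hypothetical record layout: four big-endian fields with no padding.
# (int32 id, int16 x, int16 y, float32 value) = 12 bytes per record.
RECORD = struct.Struct(">ihhf")


def pack_table(rows):
    """Serialize rows record by record, fields in order within each record."""
    return b"".join(RECORD.pack(*row) for row in rows)


def unpack_table(data):
    """Recover the rows from a contiguous run of fixed-length records."""
    return [tuple(fields) for fields in RECORD.iter_unpack(data)]


rows = [(n, n * 10, -n, n * 0.5) for n in range(1, 6)]  # five records
data = pack_table(rows)

# Record k starts at byte k * RECORD.size; nothing else sits in between.
third = RECORD.unpack_from(data, 2 * RECORD.size)
print(len(data), third)
```

Because every record has the same length, any record can be located by arithmetic alone, without scanning the ones before it.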
 
Core Concepts – Data Formats
 
Interleaving
 
Inserting some or all of the bytes of one
data structure into the stream of bytes of
another data structure.
 
Interleaving is prohibited in PDS4
 
 
 
Core Concepts – Data Formats
 
Parsable Byte Streams
Simple rules for parsing bytes into program data
structures.  (Simple = no bit changes)
 
Plain text
XML
CSV files
 
Generally used for documents.  PDS will designate
acceptable parsing standards (internal or external).
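
To illustrate what "simple rules for parsing" can look like in practice, here is a small sketch that turns CSV bytes into program data structures using Python's standard `csv` module. The column names and values are invented for the example.

```python
import csv
import io

# Hypothetical CSV content; the column names and values are made up.
raw = b"id,value\r\nA1,2.5\r\nB2,0.8\r\n"

# Parsing only decodes characters and splits on delimiters; no byte has to
# be decompressed or otherwise transformed before it can be read.
text = raw.decode("utf-8")
records = list(csv.DictReader(io.StringIO(text, newline="")))

for record in records:
    print(record["id"], float(record["value"]))
```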
 
 
 
Core Concepts – Data Formats
 
Encoded Byte Streams
Bytes must be processed according to some,
probably external, standard to generate the desired
information.
PDF
MP3
MPEG
 
Mainly used for complex documents and potentially for
very high-order data products.  PDS will designate
acceptable encoding standards.
 
 
 
Core Concepts – Data Formats
 
Note that in PDS4:
Documents = Data
Geometry = Data
Calibration = Data
That is, all these types of information are identified and
labeled to the same level of detail as observational data.
 
 
 
PRODUCTS
 
Core Concepts - Products
 
Digital Object
For the purposes of this discussion, a digital object is
any sequence of bytes that is not an XML label.
So:
Data = Digital Object
 
 
 
 
Core Concepts - Products
 
Product
A product is a label plus all the digital objects that that
label describes.
Each label is in a file of its own (no other labels or
digital objects).
A single file may contain more than one digital
object.
A single digital object may not be split across files.
Each product has an identifier which is unique across
URI space.
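
One way to picture these rules is as a small data model. The sketch below is only an illustration: the class names, field names, file names, and the identifier format are hypothetical and are not part of any PDS4 library.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DigitalObject:
    """One sequence of bytes, located wholly inside a single file."""
    file_name: str
    offset: int
    length: int


@dataclass
class Product:
    """A label file plus every digital object that label describes."""
    identifier: str   # must be unique; the format used below is made up
    label_file: str   # holds the label and nothing else
    objects: List[DigitalObject] = field(default_factory=list)

    def data_files(self) -> List[str]:
        """Distinct data files referenced by the label, in first-seen order."""
        seen: List[str] = []
        for obj in self.objects:
            if obj.file_name not in seen:
                seen.append(obj.file_name)
        return seen


product = Product(
    identifier="urn:example:demo_product",
    label_file="demo.xml",
    objects=[
        DigitalObject("demo.dat", offset=0, length=512),     # e.g. a header
        DigitalObject("demo.dat", offset=512, length=4096),  # e.g. an array
    ],
)
print(product.data_files())
```

Giving each `DigitalObject` exactly one `file_name` mirrors the rule that an object may not be split across files, while two objects sharing `demo.dat` mirrors the rule that one file may hold several objects.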
 
 
 
Core Concepts - Product
 
 
The PDS4 Registry Service tracks products.
 
Versioning is tracked at the product level.
 
 
PDS4 is product-oriented
 
 
 
ARCHIVE DESIGN
 
Core Concepts – Archive Design
 
Groups of similar (same general type, same
origin) products are gathered into Collections.
Collections are organized into Bundles.
A large mission archive can be broken into
separate bundles, if desired.
 
These are logical organizations that can (and
most likely will) also be used as physical
organizations.
 
 
 
Core Concepts – Archive Design
 
Collections contain closely related products. In
general, all the products of a collection will have:
The same data format (image, table,
document, …)
The same source (instrument, experiment, …)
The same processing history (reduction level,
calibration, …)
The same purpose (calibration, geometry,
documentation, …)
 
 
 
Core Concepts – Archive Design
 
Bundles are used to organize related collections
into manageable groups.  Bundles may group
collections by any reasonable criteria, including:
Mission phases
Review schedules
Observing instrument
Development subcontractor
Hardware limitations on total size
etc.
 
 
 
Core Concepts – Archive Design
 
Physically, a collection is:
A table of product member IDs, plus
A label that describes the table and provides
documentation about the collection.
 
So a collection has the same form as a product,
but restricted content.
 
 
 
Core Concepts – Archive Design
 
Physically, a bundle is:
A table of collection member IDs, plus
A label that describes the table and provides
documentation about the bundle.
 
So a bundle has the same form as a product, but
restricted content.
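
Since a collection and a bundle are each described as a table of member IDs plus a label, the same reader can in principle handle both tables. The sketch below assumes a one-column text table and uses invented identifiers; the actual table layout is not specified here.

```python
import csv
import io

# Hypothetical member tables; the identifiers are invented for illustration.
collection_table = "urn:example:bundle:data:image_001\nurn:example:bundle:data:image_002\n"
bundle_table = "urn:example:bundle:data\nurn:example:bundle:document\n"


def member_ids(table_text):
    """Read a one-column member table into a list of identifiers."""
    return [row[0] for row in csv.reader(io.StringIO(table_text)) if row]


print(member_ids(collection_table))  # product IDs listed by a collection
print(member_ids(bundle_table))      # collection IDs listed by a bundle
```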
 
 
LABELS
 
Core Concepts – Labels

[Diagram built up across slides 22–31. Elements appear in this order:
Wild Object; Digital Object; Structure Definition; Base Interpretation;
Extended Interpretation; Mission Documentation; Node Documentation;
PDS Documentation; Label; Product.]
 
XML
 
Core Concepts - XML
 
XML – eXtensible Markup Language
Defines parsing rules
W3C recommendation
Supported by third-party and open-source
libraries
 
 
 
Core Concepts - XML
 
Here’s some XML:
<movie>
  <title>Bedtime for Bonzo</title>
  <firstRelease>1951</firstRelease>
  <director>Frederick de Cordova</director>
  <screenplayBy>Lou Breslow</screenplayBy>
  <screenplayBy>Val Burton</screenplayBy>
  <storyBy>Ted Berkman</storyBy>
  <storyBy>Raphael Blau</storyBy>
  <starring>Ronald Reagan</starring>
  <starring>Diana Lynn</starring>
</movie>
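
Because XML defines the parsing rules, an off-the-shelf library can read this document without any custom parser. A minimal sketch using Python's standard `xml.etree.ElementTree` module (the variable names are just for illustration):

```python
import xml.etree.ElementTree as ET

MOVIE_XML = """
<movie>
  <title>Bedtime for Bonzo</title>
  <firstRelease>1951</firstRelease>
  <director>Frederick de Cordova</director>
  <screenplayBy>Lou Breslow</screenplayBy>
  <screenplayBy>Val Burton</screenplayBy>
  <storyBy>Ted Berkman</storyBy>
  <storyBy>Raphael Blau</storyBy>
  <starring>Ronald Reagan</starring>
  <starring>Diana Lynn</starring>
</movie>
"""

movie = ET.fromstring(MOVIE_XML.strip())

title = movie.findtext("title")                        # first matching child
year = int(movie.findtext("firstRelease"))             # text is always a string
stars = [el.text for el in movie.findall("starring")]  # repeated elements

print(title, year, stars)
```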
 
 
 
Core Concepts - XML
 
Some XML Terminology
Tag: Anything inside “<>”, like <movie> or <title>
Closing tag: “</…>”, like </movie> or </title>
Content: Everything between the opening and closing tags
(including other tags and their content).
Element: The opening and closing tags plus the content.
 
 
 
Core Concepts - XML
 
Some PDS4 Terminology
Attribute: An XML element that does not contain
other XML elements (like title).
Class: An XML element that does contain other
elements (like movie).  That is, a class is a
collection of attributes (and possibly other
classes).
Attributes and classes are defined in data
dictionaries.
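
The distinction above can be checked mechanically: an element with child elements is a class, one without is an attribute. The helper name `pds4_kind` is made up for this sketch. Note that this PDS4 sense of "attribute" is different from an XML attribute, which is a name="value" pair written inside a start tag.

```python
import xml.etree.ElementTree as ET


def pds4_kind(element):
    """Return "class" if the element has child elements, else "attribute"."""
    return "class" if len(element) > 0 else "attribute"


movie = ET.fromstring(
    "<movie><title>Bedtime for Bonzo</title>"
    "<firstRelease>1951</firstRelease></movie>"
)

# Walk the whole tree, starting with the root element itself.
kinds = {el.tag: pds4_kind(el) for el in movie.iter()}
print(kinds)
```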
 
 
 
NAMESPACES
 
Core Concepts - Namespaces
 
A namespace establishes a context for definition.
Two items with the same name but from
different namespaces generally have different
definitions.
For example, consider “title”.  This word will
have a very different meaning in a movie
namespace than it will in a car namespace.
 
 
 
Core Concepts - Namespaces
 
In PDS4, namespaces are used to delegate authority for
creating attributes used in label documentation sections.
PDS will assign namespaces to data preparers (by
mission, instrument, experiment, …)
Data preparers will have authority to create descriptive
attributes and classes in their assigned namespace.
The contents of a single namespace are defined in the
data dictionary for that namespace.
Node-level attributes and classes will be defined in
node-level namespaces (and thus node-level
dictionaries).
 
 
 
Core Concepts - Namespaces
 
In XML, namespaces are prefixed to the tag name:
<movie:title>…</movie:title>
<car:title>…</car:title>
More accurately, the prefix is nearly always an
abbreviation for the full namespace identifier, which is
defined at the opening of the XML document and takes the
form of a URI.  Typical PDS4 namespaces will look like
this:
http://pds.nasa.gov/schema/pds4/node/sbn
which is why they are normally abbreviated.
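
The relationship between prefix and full namespace identifier can be seen with a short sketch. Both namespace URIs and the wrapping `garage` element are invented for this example; only the `movie:title` and `car:title` tags come from the slide.

```python
import xml.etree.ElementTree as ET

# Both namespace URIs are invented for this example.
DOC = """
<garage xmlns:movie="http://example.org/ns/movie"
        xmlns:car="http://example.org/ns/car">
  <movie:title>Bedtime for Bonzo</movie:title>
  <car:title>Certificate of ownership</car:title>
</garage>
"""

# Prefix-to-URI map used for lookups; prefixes are only local abbreviations.
NS = {"movie": "http://example.org/ns/movie", "car": "http://example.org/ns/car"}

root = ET.fromstring(DOC.strip())
movie_title = root.find("movie:title", NS)
car_title = root.find("car:title", NS)

# ElementTree expands each prefix to its full URI: "{uri}localname".
print(movie_title.tag)  # {http://example.org/ns/movie}title
print(car_title.tag)    # {http://example.org/ns/car}title
```

The two `title` elements share a local name but have different expanded names, so a parser never confuses them.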
 
 
 
XML SCHEMA
 
Core Concepts – XML Schema
 
The XML standard only defines syntax; it does
not define any tags.
A schema can be used to define tags and
constrain their content.
XML Schema is an XML-based schema
language that provides the sort of capabilities
we want for PDS4 labels (and quite a bit
more).
 
 
 
Core Concepts – XML Schema
 
Some useful XML Schema capabilities:
Defining data types
Creating standard value lists
Constraining extrema
Namespace support
Designating required and optional attributes
In addition, there are commercial and open source tools to
support creating XML Schema files and using them to
create and validate XML documents (like PDS4 labels).
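
Several of the capabilities listed above show up in even a small schema. The sketch below embeds a hypothetical XML Schema for the movie example (the type names `ReleaseYear` and `Rating`, the `rating` element, and all limits are invented) and reads it with Python's standard XML parser, which works because a schema is itself XML. It only inspects the schema; it does not validate anything, since Python's standard library has no XML Schema validator.

```python
import xml.etree.ElementTree as ET

XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="movie">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="firstRelease" type="ReleaseYear"/>
        <xs:element name="rating" type="Rating" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:simpleType name="ReleaseYear">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="1888"/>
      <xs:maxInclusive value="2100"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:simpleType name="Rating">
    <xs:restriction base="xs:string">
      <xs:enumeration value="G"/>
      <xs:enumeration value="PG"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
"""

XS = "{http://www.w3.org/2001/XMLSchema}"
schema = ET.fromstring(XSD.strip())

# Required vs optional: minOccurs defaults to 1 when it is not given.
required = {
    el.get("name"): el.get("minOccurs", "1") != "0"
    for el in schema.iter(XS + "element")
    if el.get("type") is not None
}

# Standard value list (enumeration) and extrema (min/max bounds).
ratings = [el.get("value") for el in schema.iter(XS + "enumeration")]
low = int(schema.find(f".//{XS}minInclusive").get("value"))
high = int(schema.find(f".//{XS}maxInclusive").get("value"))

print(required, ratings, (low, high))
```

A schema-aware validator would be needed to check an actual label against these constraints.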
 
 
 
Core Concepts – XML Schema
 
In PDS4, XML Schema documents will be used
for:
Creating one-off labels
Defining/constraining label content across a
collection
Holding data dictionary information needed for
label validation
Defining interface formats between system
elements
 
 
 
Core Concepts – XML Schema
 
Creating one-off labels
Start with template schema from the PDS
library.
Use it with an XML-aware editor to create a
blank XML label.
Fill in the blanks.
Validate the resulting XML against the
template schema.
 
 
 
Core Concepts – XML Schema
 
Defining/Constraining Label Content Across
a Collection
Start with generic schema from PDS library.
Node edits schema to reflect design decisions
already made; inserts node classes.
Mission inserts mission classes.
XML-aware editor is used to generate a
template for pipeline use.
Edited schema is used to validate output
labels.
 
 
 
Core Concepts – XML Schema
 
Holding Data Dictionary Information
Data dictionary information (attributes and
classes) resides in an integrated database.
Individual namespaces will be dumped to
separate schema files.
The namespace schema files are directly
referenced by the XML label files.
XML validators compare the use of the
attributes and classes in the XML file to the
definitions in the schema files.
 
 
 
Core Concepts – XML Schema
 
Defining Interface Formats
An XML schema defines the tags used for
input/output.
The schema can be used in an XML-aware
tool, like an editor, to generate a template
XML file.
XML input can be validated against the
schema prior to attempting processing.
 
 
 
QUESTIONS?
 
Backup
 
 
 
