Abstract Data Organization Model

 
RDA Data Foundation and Terminology (DFT) IG: Introduction
 
Prepared for RDA Plenary, San Diego, March 9, 2015
Gary Berg-Cross, Raphael Ritz, Co-Chairs DFT IG
A PID record that points to a metadata record and to instantiations of identical bit-streams that may store additional attributes
 
Goal: Describe a basic, abstract (but clear) data organization model that
systemizes the already large body of definition work on data management terms,
especially as involved in RDA’s efforts.
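As a rough illustration of the PID record described above, the following Python sketch models a record that points to one metadata record and to several stored copies of the same bit-stream. All class, field, and function names are hypothetical, and keeping a checksum among the additional attributes is an assumption made here to show how copies could be checked for being identical. None of it is taken from an RDA specification.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PIDRecord:
    """Hypothetical PID record; field names are illustrative only."""
    pid: str                                                   # the persistent identifier itself
    metadata_ref: str                                          # pointer to the metadata record
    locations: List[str] = field(default_factory=list)         # where copies of the bit-stream live
    attributes: Dict[str, str] = field(default_factory=dict)   # additional attributes (assumed: a checksum)


def sha256_hex(bits: bytes) -> str:
    """Return the SHA-256 digest of a bit-stream as a hex string."""
    return hashlib.sha256(bits).hexdigest()


def register(pid: str, metadata_ref: str, bits: bytes, locations: List[str]) -> PIDRecord:
    """Create a record for a bit-stream and remember its checksum as an attribute."""
    return PIDRecord(pid, metadata_ref, list(locations), {"sha256": sha256_hex(bits)})


def is_identical_copy(record: PIDRecord, bits: bytes) -> bool:
    """Check whether a retrieved copy matches the bit-stream the record was registered for."""
    return record.attributes.get("sha256") == sha256_hex(bits)


record = register(
    pid="hdl:example/0001",                        # made-up identifier
    metadata_ref="https://example.org/meta/0001",  # made-up metadata location
    bits=b"observation data",
    locations=["repository-a", "repository-b"],
)
print(is_identical_copy(record, b"observation data"))  # a faithful copy
print(is_identical_copy(record, b"observation dat4"))  # a corrupted copy
```

Storing the checksum in the record rather than with each copy means any instantiation can be verified against a single reference value, which is one possible reading of "identical bit-streams".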
 
DFT IG Session Agenda (16:00-17:30-11)
 
16:00-16:10 Overview of the DFT IG, Case Statement & the Breakout Session Goals and Plans (Gary Berg-Cross)
16:10-16:20 Overview of the Ted-T tool (Raphael/Thomas)
16:20-16:45 Liaison relations to other RDA IGs and WGs & solicitation of ideas for additional use cases and candidate vocabulary items
   MIG and related RDA work (Keith Jeffery)
   Practical policy (Reagan Moore)
   Adopter DataFed.net (Aaron Addison & Cynthia Hudson Vitale)
   Science Europe Working Group on Research Data (Peter Doorn)
   Also possible to hear from:
      Legal interoperability
      Marine data harmonization
      Data Fabric
      PIT and data type registries
16:45-17:20 General Discussion (including remote participants)
17:20-17:30 Discussion of follow-on work & plan for follow-up virtual meeting
 
Prior DFT WG Activities & Accomplishments
 
One of the first RDA WGs
Drafted 4 related Model Documents on core work:
1. Data Models 1: Overview (20+ models)
2. Data Models 2: Analysis & Synthesis
3. Data Models 3: Term Snapshot
4. Data Models 4: Use Cases (work with other RDA WGs on use cases to illustrate data concepts)
Presented draft work & held community discussions at the RDA P1-P3 meetings
Participated in cross-WG discussions
Developed the Semantic MediaWiki Term Definition Tool (Ted-T) to capture an initial list of terms and definitions for discussion; demo held at P3 (see http://smw-rda.esc.rzg.mpg.de/index.php/Main_Page)
Participated in Adoption Day; those adopting DFT include DataFed.net, CLARIN (Common Language Resources and Technology Infrastructure), etc.
Figure labels: Candidate List evolved to Refined List; Tool demo at Plenary 3
 
Overview of Term Development
 
Starter areas and items:
Persistent Identifiers (PIDs and types)
Digital Object - Data Object
Collection - Data Set - Aggregation
Repository (Registries and related Policies)
 
Diagram labels: Scope; Terms from Model Papers; Placed in Tool; Defs & Refinement; Analysis and Revision Process; Getting Defs organized for review
 
Example of Work on 10 categories of Terms
 
Digital Object (aka Digital Entity)
A digital object is composed of a structured sequence of bits/bytes. As an object it is named. This bit sequence can be
identified & accessed by a unique and persistent identifier or by use of referencing attributes describing its
properties.
Note: the Digital Entity definition from the X.1255 ITU standard is “machine-independent data structure consisting of one or
more elements in digital form that can be parsed by different information systems; the structure helps to enable
interoperability among diverse information systems in the Internet.”
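The Digital Object definition above names two access paths: a unique persistent identifier, or referencing attributes that describe the object's properties. A minimal Python sketch of both paths follows. The class names, the in-memory store, and the sample attribute keys are all assumptions chosen for illustration, not part of the DFT definitions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DigitalObject:
    """Hypothetical digital object: a named bit sequence with a PID and attributes."""
    name: str
    pid: str
    bits: bytes
    attributes: Dict[str, str] = field(default_factory=dict)


class ObjectStore:
    """Toy in-memory store offering both access paths from the definition."""

    def __init__(self) -> None:
        self._by_pid: Dict[str, DigitalObject] = {}

    def add(self, obj: DigitalObject) -> None:
        # A PID must stay unique, so a second registration is rejected.
        if obj.pid in self._by_pid:
            raise ValueError(f"PID already in use: {obj.pid}")
        self._by_pid[obj.pid] = obj

    def get_by_pid(self, pid: str) -> Optional[DigitalObject]:
        """Access path 1: resolve the unique, persistent identifier."""
        return self._by_pid.get(pid)

    def find_by_attributes(self, **wanted: str) -> List[DigitalObject]:
        """Access path 2: match on referencing attributes; may return several objects."""
        return [
            obj for obj in self._by_pid.values()
            if all(obj.attributes.get(key) == value for key, value in wanted.items())
        ]


store = ObjectStore()
store.add(DigitalObject("sea-temp-2014", "pid:example/1", b"\x00\x01",
                        {"format": "netcdf", "domain": "marine"}))
store.add(DigitalObject("corpus-v2", "pid:example/2", b"\x02\x03",
                        {"format": "xml", "domain": "linguistics"}))

print(store.get_by_pid("pid:example/1"))
print(store.find_by_attributes(domain="linguistics"))
```

The identifier lookup returns at most one object, while the attribute lookup returns a list, since attributes describe properties that several objects can share.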
 
More Terms and Initial definitions are in TeD-T
 
Lessons Learned and Follow Up
 
It has, of course, been difficult to get consensus on the scope of a common
vocabulary with detailed definitions.
The work has been more model and vocabulary identification than
integrated definition.
We have been in frequent discussions with communities about our
results and will intensify this interaction.
Based on this experience, a broader plan for long-term maintenance will
be submitted to the TAB and Council as part of the IG.
As needed, in consultation with these & other appropriate RDA
entities, some updates to the term definitions can be
anticipated as part of maintenance.
The term tool (TED-T): a plan for its maintenance and its use for DFT terms,
and perhaps by other WGs, must be provided.
A special task force may be empowered to do this and other maintenance activities in
line with guidance from RDA governance organizations.
Based on interest, a DFT IG was formed to continue efforts.
 
Coordinated with several other RDA Groups
Considerable discussion of vocabularies has been part
of RDA group activities at Plenaries and as part of
ongoing RDA group discussion.
Cross-group coordination took place with several RDA WGs, as
shown in the Data Fabric figure on data concepts and
relations.
This coordination task needs to be ongoing.
 
Potentially all groups could be engaged in this IG,
and we with them.
Much more work and discussion would be useful, such
as with the PP WG, whose terminology was only
briefly sketched out without full definitions.
PP, along with MIG, has expressed an interest in more
formalized definitions that can be processed by
computer, and the Ted-T tool may be capable of doing
this or at least demonstrating its feasibility.
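To make the idea of definitions "that can be processed by computer" concrete, here is a small Python sketch that holds a term definition as structured data, serializes it to JSON, and resolves a term by name or alias. The record layout and function names are assumptions for illustration only; this is not the format used by Ted-T or Semantic MediaWiki.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class TermDefinition:
    """Hypothetical machine-readable vocabulary entry."""
    term: str
    definition: str
    aliases: List[str] = field(default_factory=list)
    source: str = ""


def to_json(entry: TermDefinition) -> str:
    """Serialize an entry so that other tools can exchange it."""
    return json.dumps(asdict(entry), sort_keys=True)


def from_json(text: str) -> TermDefinition:
    """Rebuild an entry from its JSON form."""
    return TermDefinition(**json.loads(text))


def lookup(entries: List[TermDefinition], name: str) -> Optional[TermDefinition]:
    """Find an entry by its term or any alias, ignoring case."""
    wanted = name.casefold()
    for entry in entries:
        if any(candidate.casefold() == wanted for candidate in [entry.term] + entry.aliases):
            return entry
    return None


digital_object = TermDefinition(
    term="Digital Object",
    definition="A structured sequence of bits/bytes that is named and can be "
               "identified and accessed by a unique and persistent identifier "
               "or by referencing attributes describing its properties.",
    aliases=["Digital Entity"],
    source="DFT IG introduction slides",  # illustrative provenance label
)

serialized = to_json(digital_object)
print(serialized)
print(lookup([from_json(serialized)], "digital entity").term)
```

Keeping aliases inside the entry lets a single definition serve several group vocabularies, which is one way cross-group terms such as "Digital Object" and "Digital Entity" could be reconciled.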
 
Objectives for P5
 
1. Start IG discussion and leverage existing work and approach but improve both
   1. We are expecting considerable discussion of new requirements coming out of groups
      nearing completion, but also support as part of adoption.
   2. We can also leverage the experience of other IGs as to success factors
2. Focus on facilitating community discussion on core concepts
   1. Based on feedback, some curated revisions on definitions and extension of the
      current synthesis model can be expected to finalize and stabilize the effort for
      subsequent use.
3. Facilitate definition development
   1. Potential adopters will be encouraged at P5 to provide feedback on additional use
      case scenarios to illustrate what areas of work they plan on using the models and
      vocabulary for.
   2. This will serve to plan work and virtual meetings between P5 and P6.



Digital Information Object (from the Overview of Term Development slide)
A digital item or group of items referred to as a unit, regardless of type or format, that a computer can address or manipulate as a single object.

