DARIAH Requirements and Roadmap in EGI - Insights from EGI Community Forum 2015

DARIAH requirements and roadmap in EGI
Davor Davidović*, Eva Cetinić, Karolj Skala
Ruđer Bošković Institute
EGI Community Forum 2015, Bari, Italy
Contents
1. Introduction
2. What is DARIAH?
3. Survey
4. Requirements analysis
5. DARIAH’s roadmap in EGI
What is DARIAH?
EGI Community Forum 10-13 November 2015
DARIAH, the Digital Research Infrastructure for the Arts and Humanities, aims to enhance and support digitally-enabled research and teaching across the humanities and arts.
It is a connected network of tools, information, people and methodologies for investigating, exploring and supporting research across the digital arts and humanities for researchers and humanists.
About DARIAH
Established as a European Research Infrastructure Consortium (ERIC)
18 member states:
Austria, Belgium, Croatia, Cyprus, Denmark, France, Germany, Greece, Ireland, Italy, Luxembourg, Malta, Netherlands, Poland, Portugal, Serbia, Slovenia, Switzerland
National DARIAH organizations: DARIAH-DE, DARIAH-IT, …
In-kind contributions + associated projects
DARIAH Organization
15 Working Groups:
Text and Data Analytics
Natural Language Processing
Training and Education
Digital Annotation
Visual Media
Guidelines and Standards
 
Virtual Competency Centres
What are the DARIAH requirements?
No single answer to that question
Very heterogeneous community with numerous research disciplines, applications and tools utilized, types of media objects, etc.
DARIAH has not conducted any comprehensive survey of its members’ technical requirements
Therefore, it is hard to define DARIAH’s needs and provide the right solutions!
The goal of the survey
To collect information on the e-Infrastructure requirements, experience and needs of the A&H research community
We want answers to the following questions:
Who are the targeted research groups?
What is their technical background?
Which applications/services/tools do they use, and how?
What are their current and future needs for e-Infrastructure?
How are the digital objects stored, managed and shared?
Collection process
Online survey complemented by interviews (link)
Tools: LimeSurvey and Skype
Collection period: 15th August – 30th September 2015
Total number of questions: 65
Questions divided into groups:
About participants (3)
Experience with e-Infrastructure (3)
Authentication and authorization (6)
Digital Arts and Humanities assets (4)
Data management – sharing and accessing data (17)
Services and applications for data analysis (9 x 3)
Future planning (3)
Contacts (optional)
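As a quick consistency check on the question counts above, the numbered groups can be tallied. This is only a sketch: it assumes that "9 x 3" means nine questions repeated for up to three services, and the dictionary keys are shortened labels introduced here for illustration.

```python
# Questions per group, copied from the list above.
# Assumption: "9 x 3" is read as 9 questions for each of up to 3 services.
question_groups = {
    "About participants": 3,
    "Experience with e-Infrastructure": 3,
    "Authentication and authorization": 6,
    "Digital Arts and Humanities assets": 4,
    "Data management": 17,
    "Services and applications for data analysis": 9 * 3,
    "Future planning": 3,
}

numbered_total = sum(question_groups.values())
stated_total = 65  # total number of questions stated on the slide

print(f"Numbered groups: {numbered_total}, stated total: {stated_total}")
```

The slide does not say where the remaining questions sit; the optional Contacts group, which carries no count, is one plausible place.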
Collecting process – target population
3 different user roles:
Application/service developers, i.e. computer scientists who design, develop and/or implement applications and services used by other DARIAH members
Application/service providers, i.e. computer specialists who are responsible for providing applications/services to other DARIAH members
Researchers in digital arts and humanities, i.e. consumers of the applications and tools
Survey statistics
Full responses: 15
Incomplete responses: 20
Total responses: 35
2 interviews
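The split between full and incomplete responses follows directly from the counts above. A minimal sketch of the arithmetic (the helper name `share` is introduced here for illustration):

```python
# Response counts as reported on the "Survey statistics" slide.
full_responses = 15
incomplete_responses = 20
total_responses = full_responses + incomplete_responses


def share(count, total):
    """Return count as a percentage of total, rounded to the nearest integer."""
    return round(100 * count / total)


full_share = share(full_responses, total_responses)
incomplete_share = share(incomplete_responses, total_responses)

print(f"Full: {full_share}%, incomplete: {incomplete_share}%")
```

Fewer than half of the 35 responses were complete, which is worth keeping in mind when reading the percentages on the later slides.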
Who are the participants?
A significant number of research scientists (52%)
‘Others’ -> two roles
Better technical background to be expected
‘History’ -> emphasis is on storage, not computation
Experience with e-Infrastructure
Only 5 positive responses to “Do you know what e-Infrastructure is?”
More effort needs to be put into dissemination
The infrastructure services used:
Computational resources (e.g. computer cluster, grid, or cloud): 6 (33%)
Authorization / Authentication services: 4 (22%)
Digital repositories: 4 (22%)
Web-oriented services: 4 (22%)
Resources they have (or wish to have) access to
Mostly repositories and digital archives (55%)
Authentication and Authorization
12 participants (34.29%) are aware of Identity Federations
The institutions of 7 participants (20.00%) are part of national Identity Federations
Main barriers/challenges to joining an identity federation:
Lack of trust (8.57%)
Lack of manpower (11.43%)
DARIAH level
DARIAH IdP based on SAML, member of eduGAIN
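The slide gives head counts for the first two figures but only percentages for the barriers. Assuming all four percentages are taken relative to the 35 total survey responses (an assumption, although it reproduces the stated counts of 12 and 7), the implied head counts can be sketched as follows; the dictionary labels are shortened for illustration.

```python
TOTAL_RESPONSES = 35  # total survey responses (assumed base for all percentages)

# Percentages as printed on the slide.
reported = {
    "aware of Identity Federations": 34.29,
    "institution in a national Identity Federation": 20.00,
    "barrier: lack of trust": 8.57,
    "barrier: lack of manpower": 11.43,
}


def implied_count(percent, total=TOTAL_RESPONSES):
    """Convert a reported percentage back into a whole number of respondents."""
    return round(percent * total / 100)


counts = {label: implied_count(p) for label, p in reported.items()}

for label, n in counts.items():
    print(f"{label}: {n} of {TOTAL_RESPONSES} ({100 * n / TOTAL_RESPONSES:.2f}%)")
```

Under that assumption the two barriers were each named by only a handful of respondents, so they indicate a tendency rather than a firm finding.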
Digital A&H assets and data management
Storing and sharing
37% use a local machine and 37% institutional storage/repositories
Only 11% share their research data
Data types used
Where data are stored
Services, applications and tools
Infrastructure requirements
Data intensive
Two major issues:
Computational power
Main memory
Other needs:
GPU (Topic Modeling) -> GPGPU cloud access
Examples
histoGraph (http://histograph.eu)
Multiple independent jobs
Requirements:
Main memory/storage
CPU power
Deficiency:
Main memory
Topic Modeling
MPI-based application
Requirements:
CPU power
Main memory/storage
Deficiency:
Computational power
Main memory
Requirements: GPGPU
Survey conclusion - requirements
DARIAH roadmap in EGI
Per use-case approach is required!
Sustainable long-tail user support and training is obligatory

DARIAH, the Digital Research Infrastructure for the Arts and Humanities, collaborates with the EGI Community to enhance digitally-enabled research and teaching in the humanities and arts. This collaboration involves analyzing requirements, establishing a roadmap, and engaging with Virtual Competency Centres. The diverse DARIAH community poses challenges in defining specific technical needs, highlighting the importance of tailored solutions for this multidisciplinary field.

  • DARIAH
  • EGI
  • Digital Research Infrastructure
  • Humanities
  • Arts

Uploaded on Oct 07, 2024 | 0 Views



Presentation Transcript


  1. DARIAH requirements and roadmap in EGI. Davor Davidović*, Eva Cetinić, Karolj Skala, Ruđer Bošković Institute. EGI Community Forum 2015, Bari, Italy. www.egi.eu. EGI-Engage is co-funded by the Horizon 2020 Framework Programme of the European Union under grant number 654142.

  2. Contents: 1. Introduction 2. What is DARIAH? 3. Survey 4. Requirements analysis 5. DARIAH’s roadmap in EGI

  3. What is DARIAH? DARIAH, the Digital Research Infrastructure for the Arts and Humanities, aims to enhance and support digitally-enabled research and teaching across the humanities and arts. It is a connected network of tools, information, people and methodologies for investigating, exploring and supporting research across the digital arts and humanities for researchers and humanists.

  4. About DARIAH. Established as a European Research Infrastructure Consortium (ERIC). 18 member states: Austria, Belgium, Croatia, Cyprus, Denmark, France, Germany, Greece, Ireland, Italy, Luxembourg, Malta, Netherlands, Poland, Portugal, Serbia, Slovenia, Switzerland. National DARIAH organizations: DARIAH-DE, DARIAH-IT, … In-kind contributions + associated projects.

  5. DARIAH Organization. Virtual Competency Centres (VCCs): VCC e-Infrastructure, VCC Research and Education, VCC Scholarly Context Management, VCC Advocacy. VCCs cover strategic areas and topics, provide sustainability and incorporate the outcomes of working groups. 15 Working Groups: Text and Data Analytics, Natural Language Processing, Training and Education, Digital Annotation, Visual Media, Guidelines and Standards. Working groups are dynamic and flexible units with specific goals and outcomes, related to one or more VCCs.

  6. What are the DARIAH requirements? No single answer to that question. Very heterogeneous community with numerous research disciplines, applications and tools utilized, types of media objects, etc. DARIAH has not conducted any comprehensive survey of its members’ technical requirements. Therefore, it is hard to define DARIAH’s needs and provide the right solutions!

  7. The goal of the survey. To collect information on the e-Infrastructure requirements, experience and needs of the A&H research community. We want answers to the following questions: Who are the targeted research groups? What is their technical background? Which applications/services/tools do they use, and how? What are their current and future needs for e-Infrastructure? How are the digital objects stored, managed and shared?

  8. Collection process. Online survey complemented by interviews (link). Tools: LimeSurvey and Skype. Collection period: 15th August – 30th September 2015. Total number of questions: 65. Questions divided into groups: About participants (3); Experience with e-Infrastructure (3); Authentication and authorization (6); Digital Arts and Humanities assets (4); Data management – sharing and accessing data (17); Services and applications for data analysis (9 x 3); Future planning (3); Contacts (optional).

  9. Collecting process – target population. 3 different user roles: Application/service developers, i.e. computer scientists who design, develop and/or implement applications and services used by other DARIAH members; Application/service providers, i.e. computer specialists who are responsible for providing applications/services to other DARIAH members; Researchers in digital arts and humanities, i.e. consumers of the applications and tools.


  11. Survey statistics. Full responses: 15 (43%). Incomplete responses: 20 (57%). Total responses: 35. 2 interviews.

  12. Who are the participants? A significant number of research scientists (52%). ‘Others’ -> two roles. Better technical background to be expected. ‘History’ -> emphasis is on storage, not computation. Role of participants: Researchers 37%, Developers 26%, Providers 26%, Others 11%. Participants per country: France 6, Austria 5, Croatia 4, Denmark 3, Germany 3, Greece 3, Ireland 2, Slovenia 2, Belgium 1, Italy 1, Lithuania 1, Luxembourg 1, Poland 1, Spain 1, Switzerland 1. Participants per discipline: Other 7; Archaeology and Prehistory 6; Art, Art History, Architecture, Cultural … 6; Computer science, computer graphics, … 4; Musicology and performing arts 3; Law, Political Science 2; Economics, Finance, Methods and Statistics 2; the bars for History, Sociology, Philosophy, Religions and for Linguistics, Literature, Classical Studies show 12 and 17, but the extracted text does not preserve which value belongs to which.

  13. Experience with e-Infrastructure. Only 5 positive responses to “Do you know what e-Infrastructure is?” More effort needs to be put into dissemination. The infrastructure services used: computational resources (e.g. computer cluster, grid, or cloud) 6 (33%); authorization / authentication services 4 (22%); digital repositories 4 (22%); web-oriented services 4 (22%). Resources they have (or wish to have) access to: mostly repositories and digital archives (55%).

  14. Authentication and Authorization. 12 participants (34.29%) are aware of Identity Federations. The institutions of 7 participants (20.00%) are part of national Identity Federations. Main barriers/challenges to joining an identity federation: lack of trust (8.57%), lack of manpower (11.43%). DARIAH level: DARIAH IdP based on SAML, member of eduGAIN.

  15. Digital A&H assets and data management. Storing and sharing: 37% use a local machine and 37% institutional storage/repositories; only 11% share their research data. Data types used: plain files (text, audio, video, photo) 32%, metadata 29%, collections 20%, annotations 16%, 3D 3%. Where data are stored: local machine (personal computer) 13 responses (36%); institutional storage / repository 13 (36%); commercial storage / repository (e.g. Amazon, Dropbox, Google, …) 6 (17%); public repositories (e.g. Figshare, Zenodo, …) 4 (11%). Data generated per year (in GB), with number of responses: 1 GB (1), 360 GB (1), 500 GB (1), 1000 GB (4).

  16. Services, applications and tools. Services, frameworks and tools named by participants: ArcGis (GIS data analysis and visualisation); Gallica (virtual library of the libraries France); Koha (library management system); Fulir (institutional repository of scientific production); MeshLab (processing 3D sampled data); Python, Topic Modeling (generating word embedding vectors); Google, Google Drive, Dropbox; HAL-SHS (archiving and dissemination of scientific literature); histoGraph (graph-based visualisation to explore the collections); Photoscan (scene 3D reconstruction software).

  17. Infrastructure requirements. Data intensive. Two major issues: computational power and main memory. Other needs: GPU (Topic Modeling) -> GPGPU cloud access. [Pie chart "Technical requirements": operating system; main memory; CPU needs (number of cores); permanent disk storage required in total (GB); temporary disk storage required (GB).] [Pie chart "Deficiencies": lack of computational power (e.g. not enough processors at your disposal); slow bandwidth; insufficient main memory; insufficient amount of storage disk space; authorization / authentication problems; I don't know; none.]

  18. Examples. histoGraph (http://histograph.eu): multiple independent jobs. Requirements: main memory/storage, CPU power. Deficiency: main memory. Topic Modeling: MPI-based application. Requirements: CPU power, main memory/storage. Deficiency: computational power, main memory. Requirements: GPGPU.

  19. Survey conclusion - requirements. Resources: processing power; storage/main memory; other computational resources (clusters/GPU). Data sharing and accessing: local or institutional repositories; repositories and digital archives -> portals/gateways; low storage capacities. AAI: DARIAH IdP; local access (username/password); problem with accessing EGI resources. Support and training: benefits of Cloud services.

  20. DARIAH roadmap in EGI. Per use-case approach is required! Sustainable long-tail user support and training is obligatory. Resources: provide EGI Grid and Cloud resources (virtualization); object storage for archiving; EGI GPU cloud access (?). Data sharing and accessing: EGI long-tail of science platform; science portals/gateways for accessing EGI services (gLibrary, CDSTAR, WS-PGRADE, other). AAI: DARIAH IdP interoperability with EGI VOMS; virtual organization -> vo.dariah.eu (robot certificates); collaboration with EGI AAI. Support and training: promote the usage of the EGI training framework; provide user support (on EGI level); demonstrate success stories at DARIAH-related events.

  21. Thank you for your attention. Questions? www.egi.eu This work by Parties of the EGI-Engage Consortium is licensed under a Creative Commons Attribution 4.0 International License.
