Virtual Research Environments as-a-Service: Advancing Collaborative Research Efforts

 
Virtual Research Environments
as-a-Service
Pasquale Pagano, CNR
pasquale.pagano@cnr.it
EGI Community Forum
10-13 November 2015
Bari, Italy
 
Outline
 
e-Infrastructure
An operational combination of digital technologies (hardware and software), resources (data and services), communications (protocols, access rights and networks), and people and organizational structures needed to support research efforts and collaboration in the large
 
Genealogy
 
 
D4Science operates VREs for more than 2000 scientists in 44 countries, integrating more than 50 heterogeneous data providers, executing more than 20,000 processes per month, and providing access to over a billion quality records in repositories worldwide, with 99.7% service availability.
D4Science hosts more than 40 VREs.
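To put the availability figure quoted above in concrete terms, the short sketch below converts an availability fraction into the hours of unavailability it allows. The slide does not say over which window availability is measured, so the one-year and 30-day periods used here are assumptions, and the function name is illustrative only.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a non-leap year as the measurement window


def downtime_hours(availability, period_hours=HOURS_PER_YEAR):
    """Hours of unavailability implied by an availability fraction over a period."""
    return (1.0 - availability) * period_hours


# 99.7% availability over a year leaves roughly 26 hours of downtime.
yearly = downtime_hours(0.997)
# The same figure over an assumed 30-day month leaves roughly 2 hours.
monthly = downtime_hours(0.997, period_hours=30 * 24)
```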
 
Born to serve user needs
I need to host my applications in a secure and scalable environment
I need to maintain my database
I need to back up my data
I need to securely deliver my data to a set of known people
I want to offer a flexible sharing, storage, reporting, search and retrieval tool
I need to manage and analyze data
I need to manage the full data life-cycle from import to validation, curation, harmonization and publication
I need to offer my team a powerful tool to manage code-lists
I need to reduce the costs of data maintenance of my dept.
I need to access authoritative data
I need to simplify the access to my data
I need to mash up statistical and geospatial data
I need to analyse my big datasets
I need to validate my datasets and provide a standard access to them
 
 
D4SCIENCE
Distinguishing capabilities of the e-infrastructure
 
The D4Science infrastructure
Hybrid Data Infrastructure combining over 500 software components into a coherent and centrally managed system of hardware, software, and data resources
 
D4Science enables e-infrastructure by ...
 
Storage as Service
to host and maintain data
Database
Cloud Storage
Geographical DB
High-availability
Standard
Ready-to-use
Scalable
Reliable
Secure
Policies
Standard
Privacy and Attribution
 
Applications as a Service
to curate and manage data
Metadata Generation
Geospatial Data
Biodiversity Data
Statistical Data
Textual Data
Harmonization
Disambiguate
Validate
Integrate and Consistency Check
Data Exchange
OGC protocols
DarwinCore
SDMX
DublinCore
 
Computing as Service
to process and extract knowledge
Scalable
Easy to Manage
Across Boundaries
Tailored
Elastic
Assignment of Computing
Assignment of Processors
Virtual Research Environment
Heterogeneous
High Throughput 
Map-Reduce
Parallel R
 
Computational Engine
Not another cloud computing platform, but a platform where executions can be repeated, compared, discussed, logged
Not another computational engine, but a platform where interdisciplinary tools and services can be easily contributed by the communities
 
Two exploitation models
 
Virtual Research Environment
to access, share and collaborate
Share
Database Tables
Workflow
Files
Communicate
Post
Favourite
Connection
Organize
Dynamic
Secure 
Policy Driven
 
Virtual Research Environment
a distributed and dynamically created environment where subset of resources (data, services, computational, and storage resources) regulated by tailored policies (e.g. data encryption with VRE specific key, quota on service calls and storage usage, …) are assigned to a subset of users via interfaces for a limited timeframe at little or no cost for the providers of the participatory data e-infrastructures
L. Candela, D. Castelli, P. Pagano (2013) Virtual Research Environments: An Overview and a Research Agenda. Data Science Journal, Vol. 12
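The definition above can be read as a small data model: a subset of resources, a subset of users, a tailored policy, and a limited timeframe. The sketch below is one illustrative way to express that reading. Every class, field, and value in it is hypothetical and chosen for this example; none of it is a gCube or D4Science type.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Policy:
    """Hypothetical per-VRE policy: usage quotas plus an encryption key id."""
    max_service_calls: int
    max_storage_bytes: int
    encryption_key_id: str


@dataclass
class VirtualResearchEnvironment:
    """Illustrative model of the definition above; not a gCube type."""
    name: str
    resources: set[str]   # subset of infrastructure resources
    users: set[str]       # subset of infrastructure users
    policy: Policy
    valid_from: date
    valid_until: date     # a VRE exists for a limited timeframe
    calls_used: int = 0

    def is_active(self, today: date) -> bool:
        return self.valid_from <= today <= self.valid_until

    def authorize(self, user: str, resource: str, today: date) -> bool:
        """Allow a call only for members, on assigned resources, within quota and timeframe."""
        if not self.is_active(today):
            return False
        if user not in self.users or resource not in self.resources:
            return False
        if self.calls_used >= self.policy.max_service_calls:
            return False
        self.calls_used += 1
        return True


# Example values are invented for illustration.
vre = VirtualResearchEnvironment(
    name="demo-vre",
    resources={"species-db", "map-service"},
    users={"alice", "bob"},
    policy=Policy(max_service_calls=2, max_storage_bytes=10**9, encryption_key_id="demo-key"),
    valid_from=date(2015, 11, 10),
    valid_until=date(2015, 11, 13),
)
```

Keeping the policy as a separate object mirrors the idea that policies are tailored per environment and can be changed without redefining the resource or user subsets.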
 
VRE Definition
 
 
 
Configuration
Applications
Metadata
Data
Simple and effective process
to define a new environment
 
Applications vs Services
Physical
View
Hardware
Software, Tools, Services
Configuration
Data
 
Application Bundles
https://www.gcube-system.org/catalogue-of-applications
AppsCube: to develop applications interfacing gCube facilities
BiolCube: to aid modelling and analysing of distribution data, comparing checklists, and producing maps
ConnectCube: to facilitate data publication with appropriate tools including semantic technologies
GeosCube: to properly access, consume and produce geospatial information
StatsCube: to assist tabular data validation, data enrichment and efficient analytical tools
IceCube: to support deployment, operation & mgmt of a gCube-based infrastructure
 
VRE Exploitation
Exploited for Public VREs (used to offer an application environment to a subset of users of a community) and Private VREs (used for experiments, data access and preparation, and data analytics)
Fully operational VRE available in one hour
Software deployment and hardware setup completely hidden
Evolving needs of its users completely supported
 
Entity as Resource
 
Software as Resource: transforms servlet-based applications/services into e-Infrastructure resources
Container as Resource: transforms standard servlet-based containers into e-Infrastructure resources
Federated Sources as Resource: transforms external DBs and repositories into e-Infrastructure resources
Algorithm as Resource: for any new algorithm, model, procedure, workflow, … it is possible to manage policies and assign dedicated hardware and storage resources
Dataset and single product as Resource: for any dataset, map, timeseries, code list, … it is possible to manage policies and monitor their exploitation
 
SmartGears
a set of Java libraries that turn Servlet-compliant containers and applications into infrastructure resources, transparently. (gCube Wiki)
Turn software and containers into resources: what does it mean?
 
SmartGears [cont.]
Software-as-Resource, Container-as-Resource
software and nodes we can
discover: use without hardcoded knowledge
monitor and control: take actions when not operational
dedicate to user groups: change policies, assign roles
Actual Solution
Zero constraints
human solutions: not practical, often impossible
automated solutions: local enabling software, remotely controlled
management tasks: compile and publish descriptions, track and change status, enforce policies
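The management tasks listed above (publish descriptions, track status, enforce policies) and the goal of discovery without hardcoded knowledge can be illustrated with a toy registry. The sketch below is only an in-memory stand-in written for this purpose; the class names, fields, status values, and example endpoints are all hypothetical and do not reflect the actual SmartGears or gCube interfaces.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceProfile:
    """Hypothetical description that a node publishes about itself."""
    resource_id: str
    kind: str                      # e.g. "container" or "software"
    endpoint: str
    status: str = "ready"
    allowed_groups: set[str] = field(default_factory=set)


class Registry:
    """Toy in-memory registry; a stand-in, not the gCube information system."""

    def __init__(self):
        self._profiles = {}

    def publish(self, profile):
        """Compile-and-publish step: store the description under its id."""
        self._profiles[profile.resource_id] = profile

    def set_status(self, resource_id, status):
        """Track-and-change-status step."""
        self._profiles[resource_id].status = status

    def discover(self, kind, group):
        """Return endpoints of operational resources that the group may use."""
        return sorted(
            p.endpoint
            for p in self._profiles.values()
            if p.kind == kind and p.status == "ready" and group in p.allowed_groups
        )


reg = Registry()
reg.publish(ResourceProfile("c1", "container", "http://node1.example.org",
                            allowed_groups={"biodiversity"}))
reg.publish(ResourceProfile("c2", "container", "http://node2.example.org",
                            allowed_groups={"biodiversity", "fisheries"}))
```

A client asks the registry for a kind of resource on behalf of a user group and receives whatever endpoints are currently operational, so no endpoint has to be written into the client.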
 
gCube: One stable open-source platform
Statistics from openhub.net/p/gCube
gCube enables the D4Science HDI
 
Multi-tenant Delivery Model
 
References / Links
D4Science: 
http://www.d4science.org
Policies
https://wiki.d4science.org/D4Science_Deployment_and_Operation:_Policies
Procedures
https://wiki.d4science.org/D4Science_Deployment_and_Operation
gCube: 
http://www.gcube-system.org
Catalogue of Applications
https://www.gcube-system.org/catalogue-of-applications
Software Key Features
https://wiki.gcube-system.org/GCube_Features
Developer Guide
https://wiki.gcube-system.org/Developer%27s_Guide
FeatherWeightStack
https://wiki.gcube-system.org/Featherweight_Stack 
SmartGears
https://wiki.gcube-system.org/SmartGears 
gCube APIs
https://wiki.gcube-system.org/GCube_Application_Programming_Interface 
Administration Guide
https://wiki.gcube-system.org/Administrator%27s_Guide
 
QUESTIONS?
Thank you for your attention

Presentation Transcript


  1. Virtual Research Environments as-a-Service Pasquale Pagano, CNR pasquale.pagano@cnr.it EGI Community Forum 10-13 November 2015 Bari, Italy www.d4science.org 1 EGI CF 2015 - Virtual Research Environments as-a-Service P. Pagano

  2. Outline E-Infrastructure History Context as a Service Capabilities Virtual Research Environment D4Science Features Numbers gCube

  3. e-Infrastructure An operational combination of digital technologies (hardware and software), resources (data and services), communications (protocols, access rights and networks), and people and organizational structures needed to support research efforts and collaboration in the large

  4. Genealogy. DILIGENT (2004-2007): Testbed, Virtual Research Environment. D4Science (2008-2010): Operational, several use cases (fisheries), gCube became an open source project. D4Science-II (2010-2012): Operational Ecosystem, use cases (marine biodiversity), D4Science born to go beyond project lifetime. iMarine (2012-2014): Operational HDI, exploit D4Science, iMarine CoP, >1500 active users.

  5. D4Science operates VREs for more than 2000 scientists in 44 countries, integrating more than 50 heterogeneous data providers, executing more than 20,000 processes per month, and providing access to over a billion quality records in repositories worldwide, with 99.7% service availability. D4Science hosts more than 40 VREs.

  6. Born to serve user needs (Applications, Capacities, Data). I need to host my applications in a secure and scalable environment. I need to maintain my database. I need to back up my data. I need to securely deliver my data to a set of known people. I want to offer a flexible sharing, storage, reporting, search and retrieval tool. I need to manage and analyze data. I need to manage the full data life-cycle from import to validation, curation, harmonization and publication. I need to offer my team a powerful tool to manage code-lists. I need to reduce the costs of data maintenance of my dept. I need to access authoritative data. I need to simplify the access to my data. I need to mash up statistical and geospatial data. I need to analyse my big datasets. I need to validate my datasets and provide a standard access to them.

  7. Distinguishing capabilities of the e-infrastructure D4SCIENCE

  8. The D4Science infrastructure Hybrid Data Infrastructure combining over 500 software components into a coherent and centrally managed system of hardware, software, and data resources

  9. D4Science enables e-infrastructure by ... Overcoming administrative boundaries: exploiting private and commercial providers; integrating geographically distributed computing infrastructure. Operation: built on SLAs; support monitoring, auditing, reporting, and notification; providing service allocations, deployment, monitoring, and operation. Trust: privacy, governance, and attribution; ensuring uniform resource and data access; security, trusted network.

  10. Storage as Service to host and maintain data Database Cloud Storage Geographical DB High-availability Standard Ready-to-use Scalable Reliable Secure Policies Standard Privacy and Attribution

  11. Applications as a Service to curate and manage data Metadata Generation Geospatial Data Biodiversity Data Statistical Data Textual Data Harmonization Disambiguate Validate Integrate and Consistency Check Data Exchange OGC protocols DarwinCore SDMX DublinCore

  12. Computing as Service to process and extract knowledge Scalable Easy to Manage Across Boundaries Tailored Elastic Heterogeneous High Throughput Map-Reduce Parallel R Assignment of Computing Assignment of Processors Virtual Research Environment

  13. Computational Engine. Not another cloud computing platform, but a platform where executions can be repeated, compared, discussed, logged. Not another computational engine, but a platform where interdisciplinary tools and services can be easily contributed by the communities.

  14. Two exploitation models. Dispatcher: tools (R, Java, …) must be uploaded to the storage; the executable is deployed on the worker nodes assigned to the VRE; data are made accessible to the worker nodes according to the specification provided; monitoring, accounting, failure management, partial re-execution, sharing, and repeatability are granted. Application Framework: predefined data splitting models are provided; a large array of models and algorithms can be exploited to define custom workflows; a large array of algorithms to compare results is provided.
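The combination of data splitting, execution on assigned worker nodes, and partial re-execution described on this slide can be sketched in a few lines. In the example below a local thread pool stands in for the worker nodes, and the function names `split` and `run_job` are invented for illustration; this is an assumption-laden sketch of the general idea, not the D4Science dispatcher.

```python
from concurrent.futures import ThreadPoolExecutor


def split(records, n_chunks):
    """Split records into at most n_chunks contiguous, near-equal chunks."""
    size, extra = divmod(len(records), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        if end > start:
            chunks.append(records[start:end])
        start = end
    return chunks


def run_job(tool, records, workers=4, previous=None):
    """Run tool on each chunk, reusing results of chunks that already succeeded.

    Returns (results_by_chunk_index, failed_chunk_indexes).
    """
    chunks = split(records, workers)
    results = dict(previous or {})
    pending = [i for i in range(len(chunks)) if i not in results]
    # Leaving the with-block waits for every submitted chunk to finish.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {i: pool.submit(tool, chunks[i]) for i in pending}
    failed = []
    for i, fut in futures.items():
        if fut.exception() is None:
            results[i] = fut.result()
        else:
            failed.append(i)
    return results, failed
```

Passing the results of an earlier run as `previous` re-executes only the chunks that failed, which is the sense in which re-execution is partial here. It assumes the same records and worker count on both runs, so that chunk indexes line up.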

  15. Virtual Research Environment to access, share and collaborate Share Communicate Post Favourite Connection Organize Dynamic Secure Policy Driven Database Tables Workflow Files

  16. Virtual Research Environment a distributed and dynamically created environment where subset of resources (data, services, computational, and storage resources) regulated by tailored policies (e.g. data encryption with VRE specific key, quota on service calls and storage usage, …) are assigned to a subset of users via interfaces for a limited timeframe at little or no cost for the providers of the participatory data e-infrastructures L. Candela, D. Castelli, P. Pagano (2013) Virtual Research Environments: An Overview and a Research Agenda. Data Science Journal, Vol. 12

  17. VRE Definition Metadata Applications Simple and effective process to define a new environment Data Configuration

  18. Applications vs Services Logical View Applications Data Hardware Configuration Registry Physical View Data Software, Tools, Services

  19. Application Bundles https://www.gcube-system.org/catalogue-of-applications AppsCube: to develop applications interfacing gCube facilities. BiolCube: to aid modelling and analysing of distribution data, comparing checklists, and producing maps. ConnectCube: to facilitate data publication with appropriate tools including semantic technologies. GeosCube: to properly access, consume and produce geospatial information. StatsCube: to assist tabular data validation, data enrichment and efficient analytical tools. IceCube: to support deployment, operation & mgmt of a gCube-based infrastructure.

  20. VRE Exploitation Exploited for Public VREs (used to offer an application environment to a subset of users of a community) and Private VREs (used for experiments, data access and preparation, and data analytics) Fully operational VRE available in one hour Software deployment and hardware setup completely hidden Evolving needs of its users completely supported

  21. Entity as Resource. Entity / As a resource / As a service: Server, Storage, Container, Software, Data; Publication/Discovery, Lifecycle management, Failure management, Authorization-accounting; Access, Orchestrate, Reference. Software as Resource: transforms servlet-based applications/services into e-Infrastructure resources. Container as Resource: transforms standard servlet-based containers into e-Infrastructure resources. Federated Sources as Resource: transforms external DBs and repositories into e-Infrastructure resources. Algorithm as Resource: for any new algorithm, model, procedure, workflow, … it is possible to manage policies and assign dedicated hardware and storage resources. Dataset and single product as Resource: for any dataset, map, timeseries, code list, … it is possible to manage policies and monitor their exploitation.

  22. SmartGears a set of Java libraries that turn Servlet-compliant containers and applications into infrastructure resources, transparently. (gCube Wiki) Turn software and containers into resources: what does it mean?

  23. SmartGears [cont.] Software-as-Resource, Container-as-Resource: software and nodes we can discover (use without hardcoded knowledge), monitor and control (take actions when not operational), and dedicate to user groups (change policies, assign roles). Actual Solution: zero constraints; human solutions are not practical, often impossible; automated solutions rely on local enabling software, remotely controlled; management tasks are to compile and publish descriptions, track and change status, and enforce policies.

  24. gCube: One stable open-source platform gCube enables the D4Science HDI Statistics from openhub.net/p/gCube

  25. Multi-tenant Delivery Model. Infrastructure as a Service: Dynamic deployment, Hosting, Resource Lifecycle, Monitoring, Accounting, Security. Software as a Service: VRE, BiolCube, ConnectCube, GeosCube, StatsCube. Platform as a Service: FeatherWeightStack, SmartGears, ApplicationSupportLayer, SOA3.

  26. References / Links D4Science: http://www.d4science.org Policies https://wiki.d4science.org/D4Science_Deployment_and_Operation:_Policies Procedures https://wiki.d4science.org/D4Science_Deployment_and_Operation gCube: http://www.gcube-system.org Catalogue of Applications https://www.gcube-system.org/catalogue-of-applications Software Key Features https://wiki.gcube-system.org/GCube_Features Developer Guide https://wiki.gcube-system.org/Developer%27s_Guide FeatherWeightStack https://wiki.gcube-system.org/Featherweight_Stack SmartGears https://wiki.gcube-system.org/SmartGears gCube APIs https://wiki.gcube-system.org/GCube_Application_Programming_Interface Administration Guide https://wiki.gcube-system.org/Administrator%27s_Guide

  27. Thank you for your attention QUESTIONS?
