Smart Data Analytics Research Group Overview

 
03/10/2015 | RDA Fifth Plenary Meeting | San Diego, USA | Paradise Point Resort
Markus Götz
Jülich Supercomputing Center (JSC) // University of Iceland
Morris Riedel
Jülich Supercomputing Center (JSC) // University of Iceland
Big Data Interest Group
Smart Data Analytics
Outline
Introduction
Research Group, Research Area
Smart Data Analytics Use Cases and Techniques
Classification, Land Cover Type, piSVM
Clustering, „Drunken Flies“, HPDBSCAN
Deep-Learning, Cortex Layers, pylearn CNN
Conclusion
Results and RDA Context
Markus Götz | Smart Data Analytics | Forschungszentrum Jülich
 
Introduction
Research Group
Jülich Supercomputing Center (HPC/HTC)
High Productivity Data Processing Group
Research Area
Smart Data Analytics Methods
Evaluation and Development of Scalable Tools
Processing Platform Requirements
Application in Scientific Use Case
 
Classification // Land Cover Type
Land Cover Type Problem
Collaboration with University of Iceland
Determine Land Cover Type in Satellite Images
Different Types - Road, Building, Vegetation, …
Classification
Supervised Learning Technique
Known Set of Groups or Classes
Determine Membership of New Items
 
Classification // Land Cover Type
Approach
Support Vector Machines (SVM)
Existing Solution: piSVM (MPI)
In-house Optimization of Parallel Code
[Figure labels: Area, Standard deviation, Inertia]
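The SVM approach named on this slide can be illustrated with a minimal sketch. The code below trains a linear SVM by batch subgradient descent on the hinge loss. It is a serial toy and not piSVM or the group's optimized MPI code. The feature names (area, standard deviation, inertia) come from the slide's figure labels and the two classes from the land cover types listed earlier, but the sample values, hyperparameters, and all function names are assumptions made up for illustration.

```python
# Toy linear SVM trained by batch subgradient descent on the hinge loss.
# Serial illustration only; piSVM itself is an MPI-parallel solver.

def decision_value(w, b, x):
    """Signed score w.x + b of a feature vector x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def train_linear_svm(samples, labels, lam=0.01, lr=0.05, epochs=3000):
    """Minimize lam/2 * |w|^2 + mean(max(0, 1 - y * (w.x + b)))."""
    dim = len(samples[0])
    n = len(samples)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]   # gradient of the regularizer
        grad_b = 0.0
        for x, y in zip(samples, labels):
            if y * decision_value(w, b, x) < 1.0:   # margin violated
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if decision_value(w, b, x) >= 0 else -1

# Hypothetical, normalized features: (area, standard deviation, inertia).
# Label +1 stands for "Building", -1 for "Vegetation" (values are made up).
samples = [
    [0.9, 0.2, 0.8], [0.8, 0.3, 0.9], [1.0, 0.1, 0.7],
    [0.2, 0.8, 0.1], [0.1, 0.9, 0.2], [0.3, 0.7, 0.3],
]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(samples, labels)
predictions = [predict(w, b, x) for x in samples]
new_item = predict(w, b, [0.85, 0.25, 0.75])   # membership of a new item
print(predictions, new_item)
```

The last lines mirror the two halves of the classification task: the model is fitted on a known set of classes and then asked for the membership of a new item.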
 
Clustering // Drunken Flies
„Drunken Flies“
Collaboration with University of Cologne
Investigate Influence of Genetics on Alcohol Consumption
Literally Make Flies Drunk
Clustering
Unsupervised Learning Technique
Subdivide Database into Similar Groups
Similarity Metrics
 
Clustering // Drunken Flies
Approach
Image Processing Pipeline
HPDBSCAN
In-house Development (MPI+OpenMP)
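HPDBSCAN is a parallel variant of the density-based clustering algorithm DBSCAN. As a rough illustration of the underlying idea, the following sketch implements plain, serial DBSCAN with a brute-force neighbor search. It is not the group's MPI+OpenMP code, and the point coordinates, `eps`, and `min_pts` values are assumptions chosen only to make the example small.

```python
import math

NOISE = -1  # label for points that belong to no cluster

def region_query(points, idx, eps):
    """Indices of all points within distance eps of points[idx] (itself included)."""
    return [j for j, q in enumerate(points) if math.dist(points[idx], q) <= eps]

def dbscan(points, eps, min_pts):
    """Serial DBSCAN; returns one cluster label per point, NOISE for outliers."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue                      # already visited
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE             # may later become a border point
            continue
        labels[i] = cluster               # i is a core point, start a cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster       # border point: joins, is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors) # j is a core point, keep expanding
        cluster += 1
    return labels

# Hypothetical 2-D positions: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
cluster_labels = dbscan(points, eps=1.5, min_pts=3)
print(cluster_labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The brute-force `region_query` makes this sketch quadratic in the number of points, which is the kind of cost that motivates a parallel implementation for large data sets.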
 
Deep Learning // Cortex Layers
Cortex Layer Problem
Institute for Neuro-Medicine (INM) at FZJ
Segment the Seven Layers of the Cortex
Images of Actual Brain Slices
Gigabytes per Image (60k square resolution)
Deep Learning
Supervised Learning Technique (Classification)
More Advanced Mathematical Models
Various Flavors of Neural Networks
 
Deep Learning // Cortex Layers
Approach
Convolutional Neural Networks
Existing Serial Toolkit
Pylearn 2/Theano
Scaling Issues
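The basic building block of a convolutional neural network can be sketched in a few lines. The example below applies one small filter to a toy image (a "valid" convolution, implemented as cross-correlation as is common in CNN toolkits) and passes the result through a ReLU activation. It does not use the Pylearn 2 or Theano APIs, and the image, the kernel, and the function names are assumptions for illustration only.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding (cross-correlation)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def relu(feature_map):
    """Element-wise rectified linear unit: negative responses become zero."""
    return [[max(0, v) for v in row] for row in feature_map]

# Hypothetical 4x4 grayscale patch with a vertical edge in the middle.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-picked 2x2 filter that responds to dark-to-bright transitions;
# in a trained network these weights would be learned from labeled data.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = relu(conv2d_valid(image, kernel))
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The output size shrinks from 4x4 to 3x3 because no padding is used. For an image with 60k pixels per side, every filter produces a feature map of nearly the same size, which hints at why a serial toolkit runs into scaling issues.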
 
Conclusion
Results
Big Data Challenge is Real!
Gap between Analytics Requirements and Actual Implementations
Interest for RDA
Code is Open-source @ GitHub and Bitbucket
Data is Open and Freely Published @ B2SHARE
Choice of Data Formats
Question of Future Processing Platforms
 
Thank you for your attention
Fifth Plenary Meeting
08 – 12 March 2015
San Diego, USA | Paradise Point Resort
Contact: m.goetz@fz-juelich.de
Slides: Big Data IG > Wiki > 5th Plenary

This presentation gives an overview of the Smart Data Analytics research group at the Jülich Supercomputing Center and the University of Iceland. It walks through three use cases, each paired with a technique and a tool: classification of land cover types in satellite images (SVM, piSVM), clustering in the „Drunken Flies“ study (HPDBSCAN), and deep learning for segmenting cortex layers (convolutional neural networks, Pylearn 2/Theano). The group evaluates and develops scalable analytics tools for scientific use cases, in collaboration with the University of Iceland, the University of Cologne, and the Institute for Neuro-Medicine (INM) at FZJ.

  • Data Analytics
  • Research Group
  • Machine Learning
  • Jülich Supercomputing

Uploaded on Dec 16, 2024



Presentation Transcript


  1. Big Data Interest Group: Smart Data Analytics. Markus Götz, Jülich Supercomputing Center (JSC) // University of Iceland. Morris Riedel, Jülich Supercomputing Center (JSC) // University of Iceland. Member of the Helmholtz Association. 03/10/2015 | RDA Fifth Plenary Meeting | San Diego, USA | Paradise Point Resort

  2. Outline. Introduction: Research Group, Research Area. Smart Data Analytics Use Cases and Techniques: Classification, Land Cover Type, piSVM; Clustering, „Drunken Flies“, HPDBSCAN; Deep-Learning, Cortex Layers, pylearn CNN. Conclusion: Results and RDA Context.

  3. Introduction. Research Group: Jülich Supercomputing Center (HPC/HTC); High Productivity Data Processing Group. Research Area: Smart Data Analytics Methods; Evaluation and Development of Scalable Tools; Processing Platform Requirements; Application in Scientific Use Case. [Figure labels: Parallel Data Analytics Data Mining Methods Machine Learning Algorithms Smart Data Analytics; Scientific Community Application Data Analysis Tools Generic Data Methods]

  4. Classification // Land Cover Type. Land Cover Type Problem: Collaboration with University of Iceland; Determine Land Cover Type in Satellite Images; Different Types - Road, Building, Vegetation, … Classification: Supervised Learning Technique; Known Set of Groups or Classes; Determine Membership of New Items.

  5. Classification // Land Cover Type. Approach: Support Vector Machines (SVM); Existing Solution: piSVM (MPI); In-house Optimization of Parallel Code. [Figure labels: Inertia, Standard deviation, Area]

  6. Clustering // Drunken Flies. „Drunken Flies“: Collaboration with University of Cologne; Investigate Influence of Genetics on Alcohol Consumption; Literally Make Flies Drunk. Clustering: Unsupervised Learning Technique; Subdivide Database into Similar Groups; Similarity Metrics.

  7. Clustering // Drunken Flies. Approach: Image Processing Pipeline; HPDBSCAN; In-house Development (MPI+OpenMP).

  8. Deep Learning // Cortex Layers. Cortex Layer Problem: Institute for Neuro-Medicine (INM) at FZJ; Segment the Seven Layers of the Cortex; Images of Actual Brain Slices; Gigabytes per Image (60k square resolution). Deep Learning: Supervised Learning Technique (Classification); More Advanced Mathematical Models; Various Flavors of Neural Networks.

  9. Deep Learning // Cortex Layers. Approach: Convolutional Neural Networks; Existing Serial Toolkit; Pylearn 2/Theano; Scaling Issues.

  10. Conclusion. Results: Big Data Challenge is Real!; Gap between Analytics Requirements and Actual Implementations. Interest for RDA: Code is Open-source @ GitHub and Bitbucket; Data is Open and Freely Published @ B2SHARE; Choice of Data Formats; Question of Future Processing Platforms.

  11. Thank you for your attention. Fifth Plenary Meeting, 08 – 12 March 2015, San Diego, USA | Paradise Point Resort. Contact: m.goetz@fz-juelich.de. Slides: Big Data IG > Wiki > 5th Plenary
