Temporal Graph Analysis for Real-world Graph Evolution

C
h
r
o
n
o
s
:
 
A
 
G
r
a
p
h
 
E
n
g
i
n
e
 
f
o
r
T
e
m
p
o
r
a
l
 
G
r
a
p
h
 
A
n
a
l
y
s
i
s
Wentao Han
1,3
, 
Youshan Miao
2,3
, Kaiwei Li
1,3
,
Ming Wu
3
, Fan Yang
3
, Lidong Zhou
3
,
 
Vijayan Prabhakaran
3
, Wenguang Chen
1
, Enhong Chen
2
Tsinghua University
1
University of Science and Technology of China
2
Microsoft Research
3
1
Real-world graphs 
evolve
temporal graphs
Temporal graph properties bring more insights
A Social Graph
T
e
m
p
o
r
a
l
 
G
r
a
p
h
s
2
YEAR
A Social Graph
T
e
m
p
o
r
a
l
 
G
r
a
p
h
s
Real-world graphs 
evolve
temporal graphs
Temporal graph properties bring more insights
3
YEAR
YEAR
T
e
m
p
o
r
a
l
 
G
r
a
p
h
 
A
n
a
l
y
s
i
s
Computing properties on 
a series of
 graph snapshots
4
Graph
snapshot
t
0
t
2
Static
Graph
Analysis
Graph
Properties
t
1
T
e
m
p
o
r
a
l
 
G
r
a
p
h
 
A
n
a
l
y
s
i
s
Existing graph engines: targeting static graph analysis
A possible solution: computing 
snapshot by snapshot
5
YEAR
Task 1
Task 2
Task 3
P
e
r
f
o
r
m
a
n
c
e
 
I
s
s
u
e
s
6
Propagation based graph computation model
Vertex Data Array
Edge Array
scan
R
e
v
i
s
i
t
:
 
S
t
a
t
i
c
 
G
r
a
p
h
 
A
n
a
l
y
s
i
s
7
Local computation
Data Propagation
Propagation based graph computation model
Vertex Data Array
Edge Array
scan
R
e
v
i
s
i
t
:
 
S
t
a
t
i
c
 
G
r
a
p
h
 
A
n
a
l
y
s
i
s
8
Local computation
Data Propagation
Cache
Miss
In parallel: 
Partition
 graph & computations among CPU cores
R
e
v
i
s
i
t
:
 
S
t
a
t
i
c
 
G
r
a
p
h
 
A
n
a
l
y
s
i
s
9
scan
C
o
r
e
 
0
C
o
r
e
 
1
Cross-partition edge
Vertex Data Array
Edge Array
Inter-core
Communication
T
e
m
p
o
r
a
l
 
G
r
a
p
h
 
A
n
a
l
y
s
i
s
:
 
S
n
a
p
s
h
o
t
 
b
y
 
S
n
a
p
s
h
o
t
Computation on 
multiple
 graph snapshot – 
multiple
 cost
10
Snapshot
2
Snapshot
3
Vertex Data Arrays
Snapshot
1
Real-world graph often evolve gradually (Similar snapshots)
O
b
s
e
r
v
a
t
i
o
n
s
11
v
1
v
3
v
2
v
5
v
4
v
1
v
3
v
2
v
5
v
4
v
1
v
3
v
2
v
5
v
4
Snapshot 2
Snapshot 1
Snapshot 3
'
'
'
'
'
"
"
"
"
"
Similar propagations across snapshots
O
b
s
e
r
v
a
t
i
o
n
s
12
v
1
v
3
v
2
v
5
v
4
v
1
v
3
v
2
v
5
v
4
v
1
v
3
v
2
v
5
v
4
'
'
'
'
'
"
"
"
"
"
Snapshot 2
Snapshot 1
Snapshot 3
Group propagations 
by source & target
, not 
by snapshot
I
d
e
a
13
v
1
v
3
v
2
v
5
v
4
v
1
v
3
v
2
v
5
v
4
v
1
v
3
v
2
v
5
v
4
'
'
'
'
'
"
"
"
"
"
Step 1
Step 2
Step 3
Step 4
Step 1
Step 2
Step 3
1
 4
1
 3
1
 5
1
 2
Propagations:
Snapshot 2
Snapshot 1
Snapshot 3
C
h
r
o
n
o
s
:
 
D
a
t
a
 
L
a
y
o
u
t
Place together data
for the same vertex
across multiple
snapshots
14
fit in a cache line
Snapshot
1
Snapshot
2
Snapshot
3
Vertex Data Arrays (snapshot-by-snapshot)
(with time-locality)
Snapshot
1, 2, 3
Vertex Data Array (Chronos)
C
h
r
o
n
o
s
:
 
P
r
o
p
a
g
a
t
i
o
n
 
S
c
h
e
d
u
l
i
n
g
L
ocality 
A
ware 
B
atch 
S
cheduling (
LABS
)
:
Batching propagating across snapshots
15
vertex 
1
 -> vertex 
2
across snapshots
Vertex
Data Array
Edge Array
fit in a cache line
scan
vertex 
1
 -> vertex 
3
across snapshots
C
h
r
o
n
o
s
:
 
P
r
o
p
a
g
a
t
i
o
n
 
S
c
h
e
d
u
l
i
n
g
L
ocality 
A
ware 
B
atch 
S
cheduling (
LABS
)
:
Batching propagating across snapshots
16
Vertex
Data Array
Edge Array
fit in a cache line
Cache
Hit
scan
C
h
r
o
n
o
s
:
 
P
r
o
p
a
g
a
t
i
o
n
 
S
c
h
e
d
u
l
i
n
g
L
ocality 
A
ware 
B
atch 
S
cheduling (
LABS
)
:
Batching propagating across snapshots
17
Vertex
Data Array
Edge Array
access in a batch
Inter-core
Communication
scan
L
A
B
S
:
 
T
h
e
 
K
e
y
 
o
f
 
C
h
r
o
n
o
s
A graph layout
Place together vertex/edge data across snapshots
A scheduling mechanism
Batch propagations 
across snapshots
Efficient
Reduced cache miss / inter-core comm.
18
E
x
p
e
r
i
m
e
n
t
a
l
 
E
v
a
l
u
a
t
i
o
n
19
Large temporal graphs
Various graph algorithms
PageRank
Weakly-connected components (WCC)
Single-source shortest path (SSSP)
Maximal independent set (MIS)
Sparse matrix-vector multiplication (SpMV)
Settings
C
h
r
o
n
o
s
:
 
S
i
n
g
l
e
-
T
h
r
e
a
d
 
E
f
f
e
c
t
i
v
e
n
e
s
s
 
20
5~9x
 speedup
Baseline: Snapshot by snapshot
1
 
C
h
r
o
n
o
s
:
 
S
i
n
g
l
e
-
T
h
r
e
a
d
 
E
f
f
e
c
t
i
v
e
n
e
s
s
21
Reduced cache misses
92%
95%
70%
C
h
r
o
n
o
s
:
 
M
u
l
t
i
-
C
o
r
e
 
P
e
r
f
o
r
m
a
n
c
e
22
More than to 10x faster
1
10x
C
h
r
o
n
o
s
:
 
M
u
l
t
i
-
C
o
r
e
 
P
e
r
f
o
r
m
a
n
c
e
 
23
Reduced inter-core comm.
98%
98%
98%
M
o
r
e
 
i
n
 
P
a
p
e
r
:
Graph computation modes
All benefit from LABS
24
Push Mode
Pull Mode
Stream Mode
M
o
r
e
 
i
n
 
P
a
p
e
r
:
Incremental graph computation
Leveraging the previous snapshot’s result
Computing only the changed part
Can be enhanced with LABS
25
C
o
n
c
l
u
s
i
o
n
Temporal graph analysis
an emerging class of applications
Chronos
supports analysis of temporal graphs efficiently
Joint design of data layout and scheduling
Leveraging the temporal similarity of graphs
Exploit data locality esp. in time dimension
26
T
h
a
n
k
 
Y
o
u
!
27
Q
u
e
s
t
i
o
n
s
?
Tsinghua
University
University of Science
and Technology of
China
Microsoft
Research
B
A
C
K
U
P
Experiment Environment Details
Real Graphs Similarities over Time
Batch Size Discussion
LABS Locking
LABS with Incremental Computation
LABS on Cluster
Related Work
28
E
x
p
e
r
i
m
e
n
t
 
S
e
t
u
p
29
1. SSD model: TOSHIBA MK4001GRZB
T
e
m
p
o
r
a
l
 
D
i
s
t
r
i
b
u
t
i
o
n
s
 
o
f
 
G
r
a
p
h
s
Edges increase gradually
30
O
n
-
d
i
s
k
 
T
e
m
p
o
r
a
l
 
G
r
a
p
h
31
C
i
: checkpoint of 
v
i
: Edges without time information
a
ij
: 
j
-th activity of 
v
i
: Edge changes, e.g., 
<
addE, (
v
0
,
 v
3
, 
w
), 
t
2 
>
Snapshot Groups
 
A Snapshot Group
L
A
B
S
:
 
I
n
-
m
e
m
o
r
y
 
D
e
s
i
g
n
 
32
 
Edge Array
Vertex Data Array
indicate which snapshots
the edge exists in
 
Logically
Equals to:
T
e
m
p
o
r
a
l
 
G
r
a
p
h
 
R
e
-
c
o
n
s
t
r
u
c
t
i
o
n
User input time points: 0, 10, 20
Scan the graph activity log 
[Type, Endpoints, Time]
:
 
addE, v0->v1,    0
 
addE, v0->v2,  15
 
addE, v0->v3,    6
 
delE,  v0->v3,    8
Temporal edges 
[Endpoints, BitSet]
:
 
v0->v1, 111
 
v0->v2, 001
33
Temporal Properties
C
h
r
o
n
o
s
 
S
y
s
t
e
m
 
O
v
e
r
v
i
e
w
34
On-Disk Temporal Graph
 
Contains
all the graph
evolving activities
 
Contains
only snapshots of
interest
In-Memory Temporal Graph
 
User input
multiple time points
Scan
activities(log)
Reconstruct
graph snapshots
G
r
e
a
t
e
r
 
B
a
t
c
h
 
S
i
z
e
 
o
f
 
L
A
B
S
Pros
Possible to further reduce cache miss / inter-core comm.
Cons
Bit wide limit of the instruction: _BitScanForward64
Less snapshot similarity within a batch
No more cache miss / inter-core comm. to reduce
False sharing with locking
35
C
o
m
p
u
t
e
 
S
n
a
p
s
h
o
t
 
b
y
 
S
n
a
p
s
h
o
t
 
(
a
n
o
t
h
e
r
 
w
a
y
)
36
Vertex Data Array
Snapshot
1
Snapshot
2
Snapshot
3
 
Core 0
 
Core 1
 
Core 2
 
Snapshot-Parallelism
P
a
r
a
l
l
e
l
i
z
a
t
i
o
n
 
-
-
 
S
u
m
m
a
r
y
37
Snapshot by snapshot
LABS
 
Good partitioning: 
Num. of intra-partition edge 
> 
Num. of inter-partition edge
 
?
?
Partition-Parallelism:
 Computing 
partitions
 of the same snapshot in parallel
Snapshot-Parallelism:
 Computing 
snapshots
 in parallel
LABS-Parallel:
 Computing 
LABS-batched partition
 
in parallel
L
A
B
S
 
P
e
r
f
o
r
m
a
n
c
e
 
o
n
 
M
u
l
t
i
-
C
o
r
e
38
LABS-Parallelism out-performs
1
Baseline: Single Core
L
A
B
S
 
P
e
r
f
o
r
m
a
n
c
e
 
o
n
 
C
l
u
s
t
e
r
 
A small cluster with 4 machines
 
 
 
 
 
 
 
Benefit less than in single machine test
The benefit of LABS hided by the high overhead of network
 
39
Up to 10x speed up
R
e
d
u
c
e
d
 
L
o
c
k
 
C
o
n
t
e
n
t
i
o
n
s
LABS amortizes the lock cost across snapshots
PageRank on the Wiki graph
40
Reduced the time of locking
by more than 95%
95%
96%
96%
96%
L
A
B
S
 
w
i
t
h
 
I
n
c
r
e
m
e
n
t
a
l
 
C
o
m
p
u
t
a
t
i
o
n
 
Traditional incremental computing
 
 
 
 
Incremental computing with LABS
41
 
Apply LABS
(BatchSize = 3)
 
Incremental
Computing
G
a
i
n
 
o
f
 
I
n
c
r
e
m
e
n
t
a
l
 
L
A
B
S
42
 
Baseline:
Traditional
Incremental
R
e
l
a
t
e
d
 
w
o
r
k
Existing Graph Engines – static graph engines
Pregel (SIGMOD’10)
Powergraph (OSDI’12)
GraphLab (VLDB’12)
Grace (ATC’12)
X-stream (SOSP’13)
Active studies on changes and new concepts in evolving
graph
Densification law, “Shrinking diameters” diameter (KDD’05)
PageRank (CIKM’07),  Facebook user activities (EuroSys’09), centrality in
evolving graph (MLG’10), retweet after N friends’ retweets (WWW’11),
Rumors detection (SOMA’10)…
43
Slide Note

Good morning everyone

I am Youshan.

Glad to be here introducing our work Chronos.

It is a graph engine for temporal graph analysis.

**************************

OTHER POINTS

LABS fully used the cache line (Grace may lose some as neighbor cannot fulfil the cache line)

“LABS is a scheduling method for all algorithms, not one-single improved algorithm”

Reason LP > SP, and the cross partition edge ratio number.

Q

Partition#?

Load balancing?

Seems that snapshot parallelism not scale well, why? (L3 cache sharing)

Define inter-core communication

How today’s integrated memory controller works?

Relationship graph vs. activity graph?

What if there’s not enough memory to hold all the graph?

Q past

Is incremental worth sharing in detail? Not really, it is hard to express clearly in the talk

Embed
Share

Chronos: A Graph Engine for Temporal Graph Analysis in a social graph environment from years 2012 to 2014. Uncover insights through temporal graph properties and user ranking variations across different years. Discover the evolution of real-world graphs and the computation of properties on graph snapshots. Delve into existing graph engines targeting static graph analysis and performance issues. Revisit static graph analysis through propagation-based graph computation models.

  • Temporal Graph Analysis
  • Real-world Graphs
  • Graph Engines
  • Social Graph
  • Performance Issues

Uploaded on Feb 26, 2025 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. Chronos: A Graph Engine for Chronos: A Graph Engine for Temporal Graph Analysis Temporal Graph Analysis Wentao Han1,3, Youshan Miao2,3, Kaiwei Li1,3, Ming Wu3, Fan Yang3, Lidong Zhou3, Vijayan Prabhakaran3, Wenguang Chen1, Enhong Chen2 Tsinghua University1 University of Science and Technology of China2 Microsoft Research3 1

  2. Temporal Temporal Graphs Graphs Real-world graphs evolve temporal graphs A Social Graph 2012 2014 2013 YEAR Temporal graph properties bring more insights User Ranking 2012 2013 YEAR 2014 2

  3. Temporal Temporal Graphs Graphs Real-world graphs evolve temporal graphs A Social Graph 2012 2014 2013 YEAR Temporal graph properties bring more insights User Ranking Temporal ranks can tell their differences 2012 2013 YEAR 2014 3

  4. Temporal Graph Temporal Graph Analysis Analysis Computing properties on a series of graph snapshots t0 t1 t2 Graph snapshot 2012 2014 2013 YEAR Static Graph Analysis User Ranking Graph Properties 2012 2013 YEAR 2014 4

  5. Temporal Graph Analysis Temporal Graph Analysis Existing graph engines: targeting static graph analysis A possible solution: computing snapshot by snapshot 2012 2014 2013 YEAR Task 1 Task 2 Task 3 User Ranking 2012 2013 YEAR 2014 5

  6. Performance Issues Performance Issues ?????????= ???????? ??????? 6

  7. Revisit: Static Graph Revisit: Static Graph Analysis Analysis Propagation based graph computation model Vertex Data Array Data Propagation v v2 2 ... ... ... ... v2 v1 v3 v v1 1 Local computation Edge Array v3 v5 scan v1 v2 v1 v3 v3 v5 ... ... ... ... 7

  8. Revisit: Static Graph Revisit: Static Graph Analysis Analysis Propagation based graph computation model Vertex Data Array Cache Miss Data Propagation v v2 2 ... ... ... ... v2 v1 v3 v v1 1 Local computation Edge Array v3 v5 scan v1 v2 v1 v3 v3 v5 ... ... ... ... 8

  9. Revisit: Static Graph Analysis Revisit: Static Graph Analysis In parallel: Partition graph & computations among CPU cores Vertex Data Array Inter-core Communication Cross-partition edge Core 1 Core 0 v v2 2 ... ... ... ... v2 v1 v3 v v1 1 Core 0 Core 0 Core Core 1 1 Edge Array v3 v5 Core 1 Core 0 scan v1 v2 v1 v3 v3 v5 ... ... ... ... 9

  10. Temporal Graph Analysis: Snapshot by Snapshot Temporal Graph Analysis: Snapshot by Snapshot Computation on multiple graph snapshot multiple cost Vertex Data Arrays Snapshot 1 ... ... ... ... v2 v1 v3 N snapshots N cache misses N inter-core comm. Snapshot 2 v2' ... v3' ... ... v1' ... Snapshot 3 v2 ... v3 ... ... v1 ... 10

  11. Observations Observations Real-world graph often evolve gradually (Similar snapshots) " " ' v v1 1 v v2 2 v v1 1 v v2 2 v v1 1 v v2 2 ' ' " v4 v4 v4 ' ' v3 v5 v3 v5 v3 v5 " " Snapshot 1 Snapshot 2 Snapshot 3 11

  12. Observations Observations Similar propagations across snapshots " " ' v v1 1 ' v v2 2 v v1 1 v v2 2 v v1 1 v v2 2 " v4 v4 v4 ' ' ' " " v3 v5 v3 v5 v3 v5 Snapshot 1 Snapshot 2 Snapshot 3 12

  13. Idea Idea Group propagations by source & target, not by snapshot " " ' v v1 1 ' v v2 2 v v1 1 v v2 2 v v1 1 v v2 2 " v4 v4 v4 ' ' ' " " v3 v5 v3 v5 v3 v5 Snapshot 1 Snapshot 2 Snapshot 3 Step 2 Step 1 Step 3 Step 1 Step 2 Step 4 Step 3 Propagations: 1 3 1 4 1 5 1 2 13

  14. Chronos: Data Layout Chronos: Data Layout Vertex Data Arrays (snapshot-by-snapshot) Snapshot 1 ... ... ... ... v2 v3 v1 Place together data for the same vertex across multiple snapshots Snapshot 2 v2' ... v3' ... ... ... v1' Snapshot 3 v2 ... v3 ... ... v1 ... Vertex Data Array (Chronos) (with time-locality) Snapshot 1, 2, 3 v2 v2' v2 ... ... ... ... v1 v1' v1 v3 v3' v3 ... fit in a cache line 14

  15. Chronos: Propagation Scheduling Chronos: Propagation Scheduling Locality Aware Batch Scheduling (LABS): Batching propagating across snapshots vertex 1 -> vertex 3 across snapshots vertex 1 -> vertex 2 across snapshots Edge Array v1 v2 v1' v2' v1 v2 v1 v3 v1' v3' v1 v3 ... ... scan Vertex Data Array v2 v2' v2 ... ... ... ... v1 v1' v1 v3 v3' v3 ... fit in a cache line 15

  16. Chronos: Propagation Scheduling Chronos: Propagation Scheduling Locality Aware Batch Scheduling (LABS): Batching propagating across snapshots N propagations 1 cache misses Edge Array v1 v2 v1 v2 v1 v2 v1' v2' v1 v2 v1' v2' v1 v2 v1' v2' v1 v2 v1 v3 v1 v3 v1 v3 v1' v3' v1 v3 ... v1' v3' v1 v3 ... v1' v3' v1 v3 ... ... ... ... scan Vertex Data Array v2 v2' v2 ... ... ... ... v1 v1' v1 v3 v3' v3 ... Cache Hit fit in a cache line 16

  17. Chronos: Propagation Scheduling Chronos: Propagation Scheduling Locality Aware Batch Scheduling (LABS): Batching propagating across snapshots N propagations 1 inter-core comm. Edge Array v1 v2 v1 v2 v1 v2 v1 v2 v1' v2' v1 v2 v1' v2' v1 v2 v1' v2' v1 v2 v1' v2' v1 v2 v1 v3 v1 v3 v1 v3 v1 v3 v1' v3' v1 v3 ... v1' v3' v1 v3 ... v1' v3' v1 v3 ... v1' v3' v1 v3 ... ... ... ... ... scan Vertex Data Array Core 1 Core 0 v2 v2' v2 ... ... ... ... v1 v1' v1 v3 v3' v3 ... Inter-core Communication access in a batch 17

  18. LABS: The Key of Chronos LABS: The Key of Chronos A graph layout Place together vertex/edge data across snapshots A scheduling mechanism Batch propagations across snapshots Efficient Reduced cache miss / inter-core comm. 18

  19. Experimental Evaluation Experimental Evaluation Large temporal graphs Graph # of Vertices # of Edge Events Time Span Source Wiki 1.9 M 40.0 M 6 years Wikipedia graph from KONECT Twitter 7.5 M 61.6 M 3 months Provided by Twitter Weibo 27.7 M 4.9 B 3 years Crawled from Sina Weibo Web 133.6 M 7.2 B 12 months Web graph from DELIS Various graph algorithms PageRank Weakly-connected components (WCC) Single-source shortest path (SSSP) Maximal independent set (MIS) Sparse matrix-vector multiplication (SpMV) Settings CPU 2.4GHz 16-Core RAM 128GB DISK 1TB SSD 19

  20. Chronos: Single Chronos: Single- -Thread Thread Effectiveness Effectiveness 5~9x speedup Temporal Graph Analysis on Wiki 10 WCC 9 Pagerank 8 SSSP 7 Speedup 6 5 4 3 2 1 0 8 16 24 32 1 BatchSize Baseline: Snapshot by snapshot 20

  21. Chronos: Chronos: Single Single- -Thread Effectiveness Thread Effectiveness Cache Miss Reduction 10,000 8,759 Cache miss # (in millions) 8,000 92% 6,000 3,865 3,462 4,000 70% 95% 160 2,000 1,107 1,003 687 649 584 287 265 196 0 L1d Cache Miss LLC Cache Miss dTLB Miss BatchSize=1 BatchSize=4 BatchSize=16 BatchSize=32 Reduced cache misses 21

  22. Chronos: Chronos: Multi Multi- -Core Performance Core Performance PageRank on Wiki 90 80 Snapshot-by-snapshot 70 Chronos 60 Speedup 10x 50 40 30 20 10 0 0 4 8 # of Cores 12 16 1 More than to 10x faster 22

  23. Chronos: Multi Chronos: Multi- -Core Performance Core Performance Reduced Inter-Core Communications 10000 4244.2 Communication Num. 2471.6 977.64 1000 (in Millions) 98% 98% No LABS 98% 105.2 58.56 100 LABS 23.08 10 1 2 4 8 Number of Cores Reduced inter-core comm. 23

  24. More More in in Paper: Paper: Graph computation modes v1 v3 v2 v1 v6 v2 v3 v6 v1 v2 v1 v5 v4 v7 v8 v2 v5 v8 Push Mode Pull Mode Stream Mode All benefit from LABS 24

  25. More More in in Paper: Paper: Incremental graph computation Leveraging the previous snapshot s result Computing only the changed part Can be enhanced with LABS 25

  26. Conclusion Conclusion Temporal graph analysis an emerging class of applications Chronos supports analysis of temporal graphs efficiently Joint design of data layout and scheduling Leveraging the temporal similarity of graphs Exploit data locality esp. in time dimension 26

  27. Thank You! Thank You! Questions? Questions? Tsinghua University University of Science and Technology of China Microsoft Research 27

  28. BACKUP BACKUP Experiment Environment Details Real Graphs Similarities over Time Batch Size Discussion LABS Locking LABS with Incremental Computation LABS on Cluster Related Work 28

  29. Experiment Setup Experiment Setup CPU 2.4GHz Intel Xeon E5-2665 16-core RAM 128GB 1TB SSD (RAID 0 with 372GB1 *3) DISK Network InfiniBand (DDR, 40Gb/s) ClusterSize 4 1. SSD model: TOSHIBA MK4001GRZB 29

  30. Temporal Distributions of Graphs Temporal Distributions of Graphs Edges increase gradually wiki weibo 100% 100% 90% 90% 80% 80% Number of Edges 70% Number of Edges 70% 60% 60% 50% 50% 40% 40% 30% 30% 20% 20% 10% 10% 0% 0% 13% 19% 25% 31% 38% 44% 50% 56% 63% 69% 75% 81% 88% 94% 100% 6% Ratio of time range Ratio of time range 30

  31. On On- -disk disk Temporal Graph Temporal Graph Snapshot Groups ... ... Snapshot Group 0 Snapshot Group 1 Time index A Snapshot Group ... a0,1 ... ... ... C0 a0,t C1 a1,1 a1,t Vertex index Edge activities of v0 Edge activities of v1 Edge data for v0 Edge data for v1 Ci: checkpoint of vi: Edges without time information aij: j-th activity of vi: Edge changes, e.g., <addE, (v0, v3, w), t2 > 31

  32. LABS: In LABS: In- -memory Design memory Design Vertex Data Array ... v2 v2' v2 ... v1 v1' v1 Vertex index Data of v1 Data of v2 Logically Equals to: v1 v2 v1' v2' v1 v2 indicate which snapshots the edge exists in Edge Array (v1) v2 110 (v1) v3 ... ... 111 ... ... Vertex index Temporal Edge Edges of v1 32

  33. Temporal Graph Re Temporal Graph Re- -construction construction User input time points: 0, 10, 20 Scan the graph activity log [Type, Endpoints, Time]: addE, v0->v1, 0 addE, v0->v2, 15 addE, v0->v3, 6 delE, v0->v3, 8 Temporal edges [Endpoints, BitSet]: v0->v1, 111 v0->v2, 001 33

  34. Chronos System Overview Chronos System Overview On-Disk Temporal Graph Contains all the graph evolving activities User input multiple time points Scan activities(log) Reconstruct graph snapshots In-Memory Temporal Graph Contains ... v2 v2' v2 v1 v1' v1 ... only snapshots of interest ... ... (v1) v2 111 (v1) v3 111 ... ... Temporal Properties 34

  35. Greater Batch Size of LABS Greater Batch Size of LABS Pros Possible to further reduce cache miss / inter-core comm. Cons Bit wide limit of the instruction: _BitScanForward64 Less snapshot similarity within a batch No more cache miss / inter-core comm. to reduce False sharing with locking 35

  36. Compute Snapshot by Snapshot Compute Snapshot by Snapshot (another way) (another way) Snapshot-Parallelism Vertex Data Array Snapshot 1 Core 1 Core 0 Core 0 ... ... ... ... v2 v1 v3 Snapshot 2 Core 1 v2' ... v3' ... ... v1' ... 3 cache misses 3 inter-core comm. Snapshot 3 Core 2 v2 ... v3 ... ... v1 ... Cache Miss Inter-core communication 36

  37. Parallelization Parallelization -- -- Summary Summary Good partitioning: Num. of intra-partition edge > Num. of inter-partition edge Partition- Parallelism Snapshot- Parallelism LABS- Parallelism More More Less Cache Miss ? Inter-core Communications More No Less Snapshot by snapshot LABS Partition-Parallelism: Computing partitions of the same snapshot in parallel Snapshot-Parallelism: Computing snapshots in parallel LABS-Parallel: Computing LABS-batched partition in parallel 37

  38. LABS Performance on Multi LABS Performance on Multi- -Core Core PageRank on Wiki Partition-Parallelism 90 80 LABS-Parallelsm 70 Snapshot-Parallelism 60 Speedup 50 40 30 20 10 0 0 4 8 # of Cores 12 16 1 Baseline: Single Core LABS-Parallelism out-performs 38

  39. LABS Performance on Cluster LABS Performance on Cluster A small cluster with 4 machines Up to 10x speed up 7318 6405 10000 2002 1250 518 1000 Time (s) 48 100 10 1 PageRank WCC SSSP Baseline LABS Benefit less than in single machine test The benefit of LABS hided by the high overhead of network 39

  40. Reduced Reduced Lock Contentions Lock Contentions LABS amortizes the lock cost across snapshots PageRank on the Wiki graph 120 96.73 No LABS LABS 100 Lock time (second) 80 60 47.54 96% 34.25 40 28.85 96% 20 96% 95% 4.02 1.85 1.34 1.32 0 2 4 8 16 Number of Cores Reduced the time of locking by more than 95% 40

  41. LABS with Incremental Computation LABS with Incremental Computation Traditional incremental computing Incremental Computing Snapshot 1 Snapshot 0 Snapshot 2 Snapshot 3 Incremental computing with LABS Apply LABS (BatchSize = 3) Snapshot 1 Snapshot 0 Snapshot 2 Snapshot 3 41

  42. Gain of Incremental LABS Gain of Incremental LABS 70% WCC 60% SSSP 50% Improvement (%) 40% 30% 20% 10% 0% 1 2 4 8 16 32 Batch size Baseline: Traditional Incremental 42

  43. Related work Related work Existing Graph Engines static graph engines Pregel (SIGMOD 10) Powergraph (OSDI 12) GraphLab (VLDB 12) Grace (ATC 12) X-stream (SOSP 13) Active studies on changes and new concepts in evolving graph Densification law, Shrinking diameters diameter (KDD 05) PageRank (CIKM 07), Facebook user activities (EuroSys 09), centrality in evolving graph (MLG 10), retweet after N friends retweets (WWW 11), Rumors detection (SOMA 10) 43

More Related Content

giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#