The Vulnerability of Large Graphs in Network Analysis

On the Vulnerability of Large Graphs
Joint work by
Hanghang Tong, B. Aditya Prakash, Charalampos Tsourakakis
Tina Eliassi-Rad, Christos Faloutsos, Duen Horng (Polo) Chau
ICDM 2010, December 13-17, 2010, Sydney, Australia
Motivation: Immunization
2
34
33
25
26
27
28
29
30
31
32
22
21
20
19
18
17
23
24
12
13
14
15
16
1
9
10
11
3
4
5
6
7
8
2
SARS costs 700+ lives; $40+ Bn; H1N1 costs Mexico $2.3bn
Given
: a graph 
A
, virus prop model and budget 
k
;
Find
: 
k
 ‘best’ nodes for immunization.
Motivation: Immunization
3
34
33
25
26
27
28
29
30
31
32
22
21
20
19
18
17
23
24
12
13
14
15
16
1
9
10
11
3
4
5
6
7
8
2
SARS costs 700+ lives; $40+ Bn; H1N1 costs Mexico $2.3bn
Given
: a graph 
A
, virus prop model and budget 
k
;
Find
: 
k
 ‘best’ nodes for immunization.
Challenges
Given a graph 
A
, and budget 
k
,
Q1 (Measure):
 how to measure the
`shield-value’ for a subset of nodes (
S
)?
Q2 (Algorithm): 
how to find a subset of
k
 nodes with highest ‘shield-value’?
4
Roadmap
Motivation
(Q1) ‘Shield-value’
(Q2) 
Netshield
 Algorithm
Experimental Results
Conclusion
5
Background: SIS Virus Model 
[Chakrabarti+ 2008]
‘Flu’ like: Susceptible-Infectious-
Susceptible
If the virus ‘strength’ s < 
1/ λ
 
, an
epidemic can not happen
Intuition
s 
 
# of sneezes before heal
Largest eigenvalue 
λ
 
# of edges/paths
6
λ 
 indicates the vulnerability of the whole graph 
A !
Eigen-Drop:  an Ideal `shield-value’?
 
7
1
9
10
3
4
5
7
8
9
1
11
10
3
4
5
6
7
8
2
9
 Original Graph: 
λ
Without {2, 6}: 
λ
s
Eigen-Drop(
S
)
=λ-λ
s
Why not Use Eigen-Drop as `Shield-Value’ ?
Select 
k
 nodes, whose absence creates the
largest drop in 
λ
We need                     in time
 Example:
1,000 nodes, with 10,000 edges
It takes 0.01 seconds to compute 
λ
It takes
 
2,615 years
 
to find best-
5
 nodes
 !
8
Largest eigenvalue 
w/o subset of nodes 
S
λ-λ
s
Our `Shield-Value’: Approx Eigen-Drop
A
u
= 
λ 
X
  
u(i
)
:
 eigen-score
u
Footnote: 
u(i)
 ~ PageRank
(i)
  ~ in-degree
(i)
Theorem:
(1) 
λ
 
- λ
s
Sv(
S
)
=
i
є
S 
2λu
(
i
)
2
-
i,j
є
S 
A
(
i,j
)
u
(
i
)
u
(
j
)
   Intuition of the proposed `Shield-value’
 
find a set of nodes 
S (e.g. k=4), 
which
 
(C1) each has high eigen-scores
 
(C2) diverse among themselves
Theorem: (Tong+ 2009)
(1) 
λ
 
- λ
s
Sv(
S
)
=
i
є
S 
2λu
(
i
)
2
-
i,j
є
S 
A
(
i,j
)
u
(
i
)
u
(
j
)
Original Graph
The Proposed Alg.: 
Netshield
 
 
 
 
 
 
 
Example:  1,000 nodes, with 10,000 edges
Netshield
  takes
 
< 0.1 seconds
 
to find best-
5
 nodes
 !
 … as opposed to 
2,615 years
11
Theorem:
(1) 
λ
 
- λ
s
Sv(
S
)
=
i
є
S 
2λu
(
i
)
2
-
i,j
є
S 
A
(
i,j
)
u
(
i
)
u
(
j
)
(2) Sv(S) is sub-modular
Footnote: near-optimal means Sv(
S 
Netshield
) >= (1-1/e) Sv(
S 
Opt
)
Corollary:
(3) 
Netshield
 is near-optimal
 
(wrt max Sv(
S
))
(4) 
Netshield
 is O
(nk
2
+m)
12
1
3
10
8
7
4
6
5
2
9
1
3
10
8
7
4
6
5
2
9
Sub-Modular (i.e., Diminishing Returns) 
 
>=
δ
δ
Benefit of deleting {1,2, 3,4}
Benefit of deleting {1,2}
Marginal benefit of deleting {5,6}
Marginal benefit of deleting {5,6}
Why 
Netshield 
is Near-Optimal?
Why 
Netshield 
is Near-Optimal?
13
1
3
10
8
7
4
6
5
2
9
1
3
10
8
7
4
6
5
2
9
Sub-Modular (i.e., Diminishing Returns) 
 
>=
Theorem
: 
k
-step greedy alg. to maximize a sub-modular
function guarantees (1-1/e) optimal 
[Nemhauster+ 78]
δ
δ
T4: Why Sv(
S
) is sub-modular?
14
1
3
10
8
7
4
6
5
2
9
Already deleted
Newly deleted
    Sv(
S
green 
U
 
S
blue
) - Sv(
S
green
)
i
є
S
blue
 
2λu
(
i
)
2
-
i,j
є
S
blue
 
A
(
i,j
)
u
(
i
)
u
(
j
)
  (∑
i
є
S
blue
, j
є
S
green
 
A
(
i,j
)
u
(
i
)
u
(
j
)+∑
i
є
S
green
, j
є
S
blue
 
A
(
i,j
)
u
(
i
)
u
(
j
))
T4: Why Sv(
S
) is sub-modular?
15
1
3
10
8
7
4
6
5
2
9
Marginal Benefit of deleting
 
{5,6}
 
 
Only 
purple
 term depends on 
{1, 2}
!
=
-
Already deleted
Newly deleted
16
1
3
10
8
7
4
6
5
2
9
Marginal Benefit  
=  
Blue
Purple
More Green 
Footnote: greens are nodes already deleted; blue {5,6} nodes are
nodes to be deleted
1
3
10
8
7
4
6
5
2
9
More Purple 
Less Red 
Marginal Benefit of Left >= Marginal Benefit  of Right
T4: Why 
Sv
(
S
) is sub-modular?
Experiment:  Quality of 
Netshield
17
Eig-Drop
# of vaccines
Netshield
Optimal
(1-1/e) x Optimal
(better)
Experiment: Comparison of Immunization
18
(better)
Log(fraction of infected nodes)
Time Epoch
Netshield
Degree
Abnormality
PageRank
Eigs (=HITS)
Acquaintance
Between (short)
Between (RW)
Experiment: Speed of 
Netshield
19
> 10 days
0.1 seconds
Netshield
(better)
100,000
>=1,000,000
10,000
10
1
0.1
0.01
Com-Eigs
Com-Eval
NIPS co-authorship Network: 3K nodes, 15K edges
Time
1,000
100
10,000,000x
# of vaccines
Experiment: Scalability of 
Netshield
Time
# of edges
(better)
X 10
8
20
Conclusion
Prob. Dfn (Immunization)
Given
: a graph 
A
, virus prop model and budget 
k
;
Find
: 
k
 ‘best’ nodes for immunization.
Our Contributions
Measure (‘shield-value’) :  Approx Eigen-Drop
Algorithm:  Scalable & Near-optimal
21
Slide Note
Embed
Share

This joint work delves into the vulnerability of large graphs in network analysis, focusing on immunization strategies to combat the spread of viruses. The study explores the optimization of immunization efforts by identifying the most crucial nodes for immunization, offering insights into safeguarding network structures in the face of potential threats.

  • Graph Analysis
  • Network Vulnerability
  • Immunization Strategies
  • Virus Spread

Uploaded on Sep 16, 2024 | 1 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

E N D

Presentation Transcript


  1. On the Vulnerability of Large Graphs Joint work by Hanghang Tong, B. Aditya Prakash, Charalampos Tsourakakis Tina Eliassi-Rad, Christos Faloutsos, Duen Horng (Polo) Chau ICDM 2010, December 13-17, 2010, Sydney, Australia

  2. Motivation: Immunization Given: a graph A, virus prop model and budget k; Find: k best nodes for immunization. http://t2.gstatic.com/images?q=tbn:DvSaKUuCMp_swM:http://newscoma.com/wp-content/uploads/2009/04/whooping-cough.jpg http://t2.gstatic.com/images?q=tbn:DvSaKUuCMp_swM:http://newscoma.com/wp-content/uploads/2009/04/whooping-cough.jpg 16 23 24 http://t2.gstatic.com/images?q=tbn:DvSaKUuCMp_swM:http://newscoma.com/wp-content/uploads/2009/04/whooping-cough.jpg http://t2.gstatic.com/images?q=tbn:DvSaKUuCMp_swM:http://newscoma.com/wp-content/uploads/2009/04/whooping-cough.jpg 15 12 http://t2.gstatic.com/images?q=tbn:DvSaKUuCMp_swM:http://newscoma.com/wp-content/uploads/2009/04/whooping-cough.jpg 14 25 13 22 http://t2.gstatic.com/images?q=tbn:DvSaKUuCMp_swM:http://newscoma.com/wp-content/uploads/2009/04/whooping-cough.jpg 26 11 http://t2.gstatic.com/images?q=tbn:DvSaKUuCMp_swM:http://newscoma.com/wp-content/uploads/2009/04/whooping-cough.jpg http://t2.gstatic.com/images?q=tbn:DvSaKUuCMp_swM:http://newscoma.com/wp-content/uploads/2009/04/whooping-cough.jpg 21 9 27 34 20 10 1 4 28 8 33 19 2 http://t2.gstatic.com/images?q=tbn:DvSaKUuCMp_swM:http://newscoma.com/wp-content/uploads/2009/04/whooping-cough.jpg http://t2.gstatic.com/images?q=tbn:DvSaKUuCMp_swM:http://newscoma.com/wp-content/uploads/2009/04/whooping-cough.jpg 29 7 3 http://t2.gstatic.com/images?q=tbn:DvSaKUuCMp_swM:http://newscoma.com/wp-content/uploads/2009/04/whooping-cough.jpg 18 5 http://t2.gstatic.com/images?q=tbn:DvSaKUuCMp_swM:http://newscoma.com/wp-content/uploads/2009/04/whooping-cough.jpg 30 6 17 32 31 2 SARS costs 700+ lives; $40+ Bn; H1N1 costs Mexico $2.3bn

  3. Motivation: Immunization Given: a graph A, virus prop model and budget k; Find: k best nodes for immunization. http://t2.gstatic.com/images?q=tbn:DvSaKUuCMp_swM:http://newscoma.com/wp-content/uploads/2009/04/whooping-cough.jpg http://t2.gstatic.com/images?q=tbn:DvSaKUuCMp_swM:http://newscoma.com/wp-content/uploads/2009/04/whooping-cough.jpg 16 23 24 http://t2.gstatic.com/images?q=tbn:DvSaKUuCMp_swM:http://newscoma.com/wp-content/uploads/2009/04/whooping-cough.jpg http://t2.gstatic.com/images?q=tbn:DvSaKUuCMp_swM:http://newscoma.com/wp-content/uploads/2009/04/whooping-cough.jpg 15 12 http://t2.gstatic.com/images?q=tbn:DvSaKUuCMp_swM:http://newscoma.com/wp-content/uploads/2009/04/whooping-cough.jpg 14 25 13 22 http://t2.gstatic.com/images?q=tbn:DvSaKUuCMp_swM:http://newscoma.com/wp-content/uploads/2009/04/whooping-cough.jpg 26 11 http://t2.gstatic.com/images?q=tbn:DvSaKUuCMp_swM:http://newscoma.com/wp-content/uploads/2009/04/whooping-cough.jpg http://t2.gstatic.com/images?q=tbn:DvSaKUuCMp_swM:http://newscoma.com/wp-content/uploads/2009/04/whooping-cough.jpg 21 See full size image 9 27 See full size image 34 20 10 See full size image See full size image 1 4 See full size image 28 8 33 19 2 http://t2.gstatic.com/images?q=tbn:DvSaKUuCMp_swM:http://newscoma.com/wp-content/uploads/2009/04/whooping-cough.jpg See full size image http://t2.gstatic.com/images?q=tbn:DvSaKUuCMp_swM:http://newscoma.com/wp-content/uploads/2009/04/whooping-cough.jpg 29 7 3 http://t2.gstatic.com/images?q=tbn:DvSaKUuCMp_swM:http://newscoma.com/wp-content/uploads/2009/04/whooping-cough.jpg 18 5 http://t2.gstatic.com/images?q=tbn:DvSaKUuCMp_swM:http://newscoma.com/wp-content/uploads/2009/04/whooping-cough.jpg 30 6 17 32 31 3 SARS costs 700+ lives; $40+ Bn; H1N1 costs Mexico $2.3bn

  4. Challenges Given a graph A, and budget k, Q1 (Measure): how to measure the `shield-value for a subset of nodes (S)? Q2 (Algorithm): how to find a subset of k nodes with highest shield-value ? 4

  5. Roadmap Motivation (Q1) Shield-value (Q2) Netshield Algorithm Experimental Results Conclusion 5

  6. Background: SIS Virus Model [Chakrabarti+ 2008] Flu like: Susceptible-Infectious- Susceptible If the virus strength s < 1/ , an epidemic can not happen Intuition s # of sneezes before heal Largest eigenvalue # of edges/paths indicates the vulnerability of the whole graph A ! 6

  7. Eigen-Drop: an Ideal `shield-value? Eigen-Drop(S)= - s 9 9 9 11 10 10 1 1 4 4 8 8 2 2 7 3 7 3 5 5 6 6 Original Graph: Without {2, 6}: s 7

  8. Why not Use Eigen-Drop as `Shield-Value ? Select k nodes, whose absence creates the largest drop in - s We need in time Example: 1,000 nodes, with 10,000 edges It takes 0.01 seconds to compute It takes 2,615 years to find best-5 nodes ! Largest eigenvalue w/o subset of nodes S 8

  9. Our `Shield-Value: Approx Eigen-Drop Theorem: (1) - s Sv(S)= i S 2 u(i)2- i,j S A(i,j)u(i)u(j) u = X A u u(i): eigen-score Footnote: u(i) ~ PageRank(i) ~ in-degree(i)

  10. Intuition of the proposed `Shield-value Theorem: (Tong+ 2009) (1) - s Sv(S)= i S 2 u(i)2- i,j S A(i,j)u(i)u(j) find a set of nodes S (e.g. k=4), which (C1) each has high eigen-scores (C2) diverse among themselves Original Graph Select by C1 Select by C1+C2

  11. The Proposed Alg.: Netshield Theorem: (1) - s Sv(S)= i S 2 u(i)2- i,j S A(i,j)u(i)u(j) (2) Sv(S) is sub-modular Corollary: (3) Netshield is near-optimal (wrt max Sv(S)) (4) Netshield is O(nk2+m) Example: 1,000 nodes, with 10,000 edges Netshield takes < 0.1 seconds to find best-5 nodes ! as opposed to 2,615 years Footnote: near-optimal means Sv(S Netshield) >= (1-1/e) Sv(S Opt) 11

  12. Why Netshield is Near-Optimal? Marginal benefit of deleting {5,6} Marginal benefit of deleting {5,6} 3 9 10 3 9 1 10 5 1 5 2 6 2 8 6 7 8 7 4 4 Benefit of deleting {1,2} Benefit of deleting {1,2, 3,4} Sub-Modular (i.e., Diminishing Returns) >= 12

  13. Why Netshield is Near-Optimal? 3 9 10 3 9 1 10 5 1 5 2 6 2 8 6 7 8 7 4 ' 4 >= Sub-Modular (i.e., Diminishing Returns) Theorem: k-step greedy alg. to maximize a sub-modular function guarantees (1-1/e) optimal [Nemhauster+ 78] 13

  14. T4: Why Sv(S) is sub-modular? Newly deleted 3 9 10 1 5 2 6 8 7 4 Already deleted 14

  15. T4: Why Sv(S) is sub-modular? Newly deleted Marginal Benefit of deleting {5,6} Sv(SgreenU Sblue) - Sv(Sgreen) = 3 9 10 - 1 i Sblue2 u(i)2- i,j SblueA(i,j)u(i)u(j) 5 2 6 ( i Sblue, j SgreenA(i,j)u(i)u(j)+ i Sgreen, j SblueA(i,j)u(i)u(j)) 8 7 4 Already deleted Pure benefit from {5,6} Interaction between {5,6} and {1,2} Only purple term depends on {1, 2}! 15

  16. T4: Why Sv(S) is sub-modular? 3 3 9 9 10 10 1 1 5 5 2 2 6 6 8 8 7 7 4 4 Marginal Benefit = Blue Purple More Green More Purple Less Red Marginal Benefit of Left >= Marginal Benefit of Right Footnote: greens are nodes already deleted; blue {5,6} nodes are nodes to be deleted 16

  17. Experiment: Quality of Netshield Optimal (better) Eig-Drop Netshield (1-1/e) x Optimal # of vaccines 17

  18. Experiment: Comparison of Immunization Log(fraction of infected nodes) (better) Abnormality PageRank Between (short) Degree Between (RW) Acquaintance Eigs (=HITS) Netshield Time Epoch 18

  19. Experiment: Speed of Netshield > 10 days >=1,000,000 Com-Eigs 100,000 10,000 Com-Eval Time 10,000,000x 1,000 (better) 100 10 Netshield 1 0.1 seconds 0.1 # of vaccines 0.01 NIPS co-authorship Network: 3K nodes, 15K edges 19

  20. Experiment: Scalability of Netshield Time (better) # of edges X 108 20

  21. Conclusion Prob. Dfn (Immunization) Given: a graph A, virus prop model and budget k; Find: k best nodes for immunization. Our Contributions Measure ( shield-value ) : Approx Eigen-Drop Algorithm: Scalable & Near-optimal 21

More Related Content

giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#