Ligra: A Lightweight Graph Processing Framework for Shared Memory
Ligra is a lightweight graph processing framework developed by Julian Shun during his time at the Miller Institute, UC Berkeley. This framework, created in collaboration with Laxman Dhulipala and Guy Blelloch, is designed for shared memory systems to efficiently analyze large graphs. Key features include computational efficiency, space efficiency, and programming efficiency. Ligra enables operations such as Breadth-first Search (BFS) and abstract graph traversal algorithms, offering an efficient solution for processing graphs in shared memory environments.
Download Presentation
Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
E N D
Presentation Transcript
Ligra: A Lightweight Graph Processing Framework for Shared Memory Julian Shun Miller Institute, UC Berkeley (work done while at Carnegie Mellon University) Joint work with Laxman Dhulipala and Guy Blelloch (Principles and Practice of Parallel Programming, 2013 and Data Compression Conference, 2015)
2 HPGM 2015 Large Graphs Need to efficiently analyze these graphs Computational efficiency Space efficiency Programming efficiency
3 HPGM 2015 Breadth-first Search (BFS) Compute a BFS tree rooted at source r containing all vertices reachable from r Frontier r r Can process each frontier in parallel Race conditions, load balancing
4 HPGM 2015 BFS Abstractly All graph traversal algorithms do this! Operate on a subset of vertices Map computation over subset of edges in parallel Return new subset of vertices (Map computation over subset of vertices in parallel) Bellman-Ford shortest paths Graph eccentricity estimation PageRank Breadth-first search Betweenness centrality Connected components Can we build an abstraction for these types of algorithms?
5 HPGM 2015 Graph Processing Systems Existing: Pregel/Giraph, GraphLab, Pegasus, Knowledge Discovery Toolbox, GraphChi, Parallel BGL, and many others Our system: Ligra - Lightweight graph processing system for shared memory Efficient for frontier-based algorithms
6 HPGM 2015 Ligra Graph Operate on a subset of vertices Map computation over subset of edges in parallel Return new subset of vertices (Map computation over subset of vertices in parallel) VertexSubset } EdgeMap VertexMap Shared memory Cilk Plus/OpenMP
7 HPGM 2015 Ligra Framework 0 0 VertexSubset 2 6 8 0 4 5 4 4 bool f(v){ data[v] = data[v] + 1; return (data[v] == 1); } 8 8 VertexMap 7 1 6 6 3 VertexSubset 8 6
8 HPGM 2015 Ligra Framework T 0 0 T VertexSubset 2 2 6 8 0 4 5 5 T 4 4 4 8 8 EdgeMap bool update(u,v){ } bool cond(v){ } T 7 7 T 1 1 6 6 3 F VertexSubset 2 1 4 5 7
9 HPGM 2015 Breadth-first Search in Ligra parents = {-1, , -1}; //-1 indicates unvisited procedure UPDATE(s, d): return compare_and_swap(parents[d], -1, s); procedure COND(i): return parents[i] == -1; //checks if unvisited procedure BFS(G, r): parents[r] = r; frontier = {r}; //VertexSubset while (size(frontier) > 0): frontier = EDGEMAP(G, frontier, UPDATE, COND); frontier
10 HPGM 2015 Actual BFS code in Ligra
11 HPGM 2015 Sparse or Dense EdgeMap? Frontier Dense method better when frontier is large and many vertices have been visited 1 2 9 9 Sparse (traditional) method better for small frontiers 3 1 3 1 0 0 1 4 1 4 Switch between the two methods based on frontier size [Beamer et al. SC 12] 11 11 5 1 5 1 1 2 2 6 Limited to BFS? 7 8
12 HPGM 2015 EdgeMap procedure EDGEMAP(G, frontier, Update, Cond): if (|frontier| + sum of out-degrees > threshold) then: return EDGEMAP_DENSE(G, frontier, Update, Cond); else: return EDGEMAP_SPARSE(G, frontier, Update, Cond); Loop through incoming edges of unexplored vertices (in parallel), breaking early if possible Loop through outgoing edges of frontier vertices in parallel Not limited to BFS! Generalization to other problems: Bellman-Ford shortest paths, betweenness centrality, graph eccentricity estimation, connected components, PageRank Users need not worry about this
13 HPGM 2015 PageRank
14 HPGM 2015 PageRank in Ligra p_next = {0, , 0}; diff = {}; p_curr = {1/|V|, , 1/|V|}; procedure UPDATE(s, d): return atomic_increment(p_next[d], p_curr[s] / degree(s)); procedure COMPUTE(i): p_next[i] = p_next[i] + (1- ) (1/|V|); diff[i] = abs(p_next[i] p_curr[i]); p_curr[i] = 0; return 1; procedure PageRank(G, , ): frontier = {0, , |V|-1}; error = while (error > ): frontier = EDGEMAP(G, frontier, UPDATE, CONDtrue); frontier = VERTEXMAP(frontier, COMPUTE); error = sum of diff entries; swap(p_curr, p_next) return p_curr;
15 HPGM 2015 PageRank Sparse version? PageRank-Delta: Only update vertices whose PageRank value has changed by more than some -fraction (discussed in GraphLab and McSherry WWW 05)
16 HPGM 2015 PageRank-Delta in Ligra PR[i] = {1/|V|, , 1/|V|}; nghSum = {0, , 0}; Change = {}; //store changes in PageRank values procedure UPDATE(s, d): return atomic_increment(nghSum[d], Change[s] / degree(s)); //passed to EdgeMap procedure COMPUTE(i): Change[i] = nghSum[i]; PR[i] = PR[i] + Change[i]; return (abs(Change[i]) > ); //passed to VertexMap //check if absolute value of change is big enough
17 HPGM 2015 Ligra Performance Twitter graph (41M vertices, 1.5B edges) 6 244 seconds (64 x 32-cores) Running time (seconds) (16 x 8-cores) 5 (64 x 8-cores) GraphLab 4 3 Ligra (40-core machine) 2 1 Hand-written Cilk/OpenMP (40-core machine) 0 Page Rank (1 iteration) BFS Connected Components Ligra performance close to hand-written code Faster than distributed-memory on per-core basis Several shared-memory graph processing systems subsequently developed: Galois [SOSP 13], X-stream [SOSP 13], PRISM [SPAA 14], Polymer [PPoPP 15], Ringo [SIGMOD 15]
18 HPGM 2015 Large Graphs Available RAM 6.6B edges Running Time ~20B edges (latest version) ~150B edges Memory Required All fit in a Terabyte of memory; can fit on commodity shared memory machine What if you don t have that much RAM?
19 HPGM 2015 Ligra+: Adding Graph Compression to Ligra Difference encoding (using variable-length codes) for sorted edges per vertex Modify EdgeMap: parallel edge decoding on-the-fly All hidden from the user!
20 HPGM 2015 Ligra+: Adding Graph Compression to Ligra Space relative to Ligra 40-core time relative to Ligra 1.4 1 0.9 1.3 0.8 1.2 0.7 Ligra 1.1 0.6 1 0.5 0.4 0.9 0.3 0.8 Ligra+ 0.2 0.7 0.1 0.6 0 Cost of decoding on-the-fly? Memory bottleneck a bigger issue as graph algorithms are memory-bound
21 HPGM 2015 Conclusion Ligra: lightweight graph processing framework for shared-memory frontier-based algorithms Switches computation based on frontier size Ligra+: extension which incorporates graph compression Reduces space usage and improves parallel performance Code: http://github.com/jshun/ligra An Evaluation of Parallel Eccentricity Estimation Algorithms on Undirected Real-World Graphs [KDD 2015] RT06 Tue 13:00-15:00 Social and Graphs 2 Thank you!