Making Sense of Spark Performance at UC Berkeley


A PhD student at UC Berkeley presents an overview of Spark performance, covering measurement techniques, performance bottlenecks, and in-depth analysis of several workloads using a performance analysis tool. Concepts such as caching, scheduling, stragglers, and network performance are explored in the context of large-scale distributed systems.


Uploaded on Oct 09, 2024



Presentation Transcript


  1. Making Sense of Spark Performance eecs.berkeley.edu/~keo/traces Kay Ousterhout UC Berkeley In collaboration with Ryan Rasti, Sylvia Ratnasamy, Scott Shenker, and Byung-Gon Chun

  2. About Me: PhD student in Computer Science at UC Berkeley; thesis work centers on the performance of large-scale distributed systems; Spark PMC member.

  3. About This Talk: an overview of how Spark works; how we measured performance bottlenecks; in-depth performance analysis for a few workloads; a demo of the performance analysis tool.

  4. Count the # of words in the document. Each machine in the cluster runs a Spark (or Hadoop/Dryad/etc.) task over one partition of the input: "I am Sam I am Sam" → 6 words; "Sam I am Do you like" → 6; "Green eggs and ham?" → 4; "Thank you, Sam I am" → 5. The Spark driver sums the per-task counts: 6 + 6 + 4 + 5 = 21.

  5. Count the # of occurrences of each word. MAP: each task emits a per-partition count ("I am Sam I am Sam" → {I: 2, am: 2, …}; "Sam I am Do you like" → {Sam: 1, I: 1, …}; "Green eggs and ham?" → {Green: 1, eggs: 1, …}; "Thank you, Sam I am" → {Thank: 1, you: 1, …}). REDUCE: the counts are aggregated by key ({I: 4, you: 2, …}, {am: 4, Green: 1, …}, {Sam: 4, …}, {Thank: 1, eggs: 1, …}).
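The two slides above sketch a distributed word count. The same map/reduce structure can be simulated in plain Python (no Spark needed, a sketch only) using `collections.Counter`; the partition strings follow the slides, with punctuation dropped for simplicity.

```python
from collections import Counter

# The four input partitions from the slides (one per machine).
partitions = [
    "I am Sam I am Sam",
    "Sam I am Do you like",
    "Green eggs and ham",
    "Thank you Sam I am",
]

def map_phase(partition):
    """MAP: each task emits a per-partition word count."""
    return Counter(partition.split())

def reduce_phase(partial_counts):
    """REDUCE: merge the per-partition counts by key."""
    total = Counter()
    for counts in partial_counts:
        total += counts
    return total

word_counts = reduce_phase(map_phase(p) for p in partitions)
print(word_counts["Sam"])  # 4, matching {Sam: 4, ...} on the slide
```

Summing all values also reproduces slide 4's total of 21 words.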

  6. Performance considerations: (1) caching input data; (2) scheduling: assigning tasks to machines; (3) straggler tasks; (4) network performance (e.g., during shuffle).

  7. Caching: PACMan [NSDI 12], Spark [NSDI 12], Tachyon [SoCC 14]. Scheduling: Sparrow [SOSP 13], Apollo [OSDI 14], Mesos [NSDI 11], DRF [NSDI 11], Tetris [SIGCOMM 14], Omega [Eurosys 13], YARN [SoCC 13], Quincy [SOSP 09], KMN [OSDI 14]. Stragglers: Scarlett [EuroSys 11], SkewTune [SIGMOD 12], LATE [OSDI 08], Mantri [OSDI 10], Dolly [NSDI 13], GRASS [NSDI 14], Wrangler [SoCC 14]. Network: VL2 [SIGCOMM 09], Hedera [NSDI 10], Sinbad [SIGCOMM 13], Orchestra [SIGCOMM 11], Baraat [SIGCOMM 14], Varys [SIGCOMM 14], PeriSCOPE [OSDI 12], SUDO [NSDI 12], Camdoop [NSDI 12], Oktopus [SIGCOMM 11], EyeQ [NSDI 12], FairCloud [SIGCOMM 12]. Generalized programming model: Dryad [Eurosys 07], Spark [NSDI 12].

  8. The same related work, with two implicit assumptions highlighted: the straggler literature (Scarlett, SkewTune, LATE, Mantri, Dolly, GRASS, Wrangler) assumes stragglers are a major issue with unknown causes, and the network literature (VL2, Hedera, Sinbad, Orchestra, Baraat, Varys, PeriSCOPE, SUDO, Camdoop, Oktopus, EyeQ, FairCloud) assumes network and disk I/O are bottlenecks.

  9. This Work (1) Methodology for quantifying performance bottlenecks (2) Bottleneck measurement for 3 SQL workloads (TPC-DS and 2 others)

  10. Network optimizations can reduce job completion time by at most 2%; CPU (not I/O) is often the bottleneck; most straggler causes can be identified and fixed.

  11. Example Spark task timeline: network read, compute, disk write, repeated per record (the marked interval is the time to handle one record). Fine-grained instrumentation is needed to understand performance.

  12. How much faster would a job run if the network were infinitely fast? What's an upper bound on the improvement from network optimizations?

  13. How much faster could a task run if the network were infinitely fast? Take the original task runtime (network read, compute, disk write) and remove the time blocked on network; the compute time and the time blocked on disk remain. What's left is the task runtime with an infinitely fast network.
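The per-task arithmetic on this slide can be written down directly. The numbers below are made up for illustration; the method is simply subtracting the measured blocked-on-network time from the task's runtime.

```python
# Hypothetical timeline for one task, in seconds (illustrative numbers).
task_runtime = 10.0
time_blocked_on_network = 1.5

# With an infinitely fast network, the blocked time disappears; the
# compute time and time blocked on disk are unchanged.
runtime_without_network = task_runtime - time_blocked_on_network

# Upper bound on the fractional improvement from network optimizations.
max_improvement = time_blocked_on_network / task_runtime
print(f"{max_improvement:.0%}")  # 15%
```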

  14. How much faster would a job run if the network were infinitely fast? Replay the job's tasks (Task 0, Task 1, Task 2) on the same number of slots (here, 2), with each task's time blocked on network removed: to is the original job completion time, tn the job completion time with an infinitely fast network.
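One rough way to simulate that replay on a fixed number of slots is sketched below (a simplification, not the paper's exact methodology; the task runtimes and blocked times are hypothetical).

```python
import heapq

def completion_time(task_runtimes, num_slots):
    """Replay tasks in order on num_slots slots, each task starting as
    soon as a slot frees up; return when the last task finishes."""
    slots = [0.0] * num_slots  # time at which each slot becomes free
    heapq.heapify(slots)
    finish = 0.0
    for runtime in task_runtimes:
        start = heapq.heappop(slots)
        end = start + runtime
        finish = max(finish, end)
        heapq.heappush(slots, end)
    return finish

# Hypothetical tasks: (total runtime, time blocked on network), seconds.
tasks = [(6.0, 1.0), (4.0, 0.5), (5.0, 2.0)]

t_o = completion_time([r for r, _ in tasks], num_slots=2)
t_n = completion_time([r - b for r, b in tasks], num_slots=2)
print(t_o, t_n)  # 9.0 6.5
```

The gap between t_o and t_n bounds the improvement from network optimizations for the whole job, just as the per-task subtraction does for a single task.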

  15. SQL Workloads: TPC-DS (20 machines, 850 GB; 60 machines, 2.5 TB), www.tpc.org/tpcds; Big Data Benchmark (5 machines, 60 GB), amplab.cs.berkeley.edu/benchmark; Databricks (9 machines, tens of GB), databricks.com. 2 versions of each: in-memory and on-disk.

  16. How much faster could jobs get from optimizing network performance? (Percentiles shown: 5, 25, 50, 75, 95.) Median improvement: at most 2%.

  17. How can we sanity check these numbers?

  18. How much data is transferred per CPU second? Microsoft '09-'10: 1.9-6.35 Mb / task second; Google '04-'07: 1.34-1.61 Mb / machine second.
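As a back-of-the-envelope check (assuming the slide's Mb means megabits and a 1 Gbps NIC, both assumptions of this note, not the talk's), even the higher of the Google per-machine rates uses a tiny fraction of a single link:

```python
# Illustrative arithmetic only: compare the per-machine transfer rate
# from the Google trace (1.61 Mb/machine-second, from the slide) with
# an assumed 1 Gbps NIC (1000 Mbps).
mb_per_machine_second = 1.61
link_capacity_mbps = 1000.0

utilization = mb_per_machine_second / link_capacity_mbps
print(f"{utilization:.2%}")  # 0.16% of the link
```

At well under 1% of a 1 Gbps link, this is consistent with the later claim that 10 Gbps networking hardware is likely unnecessary.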

  19. How can this be true? Shuffle Data < Input Data

  20. What kind of hardware should I buy? 10Gbps networking hardware likely not necessary!

  21. How much faster would jobs complete if the disk were infinitely fast?

  22. How much faster could jobs get from optimizing disk performance? Median improvement at most 19%

  23. Disk configuration: our instances have 2 disks and 8 cores; Cloudera recommends at least 1 disk for every 3 cores and as many as 2 disks per core. Our instances are under-provisioned, so these results are an upper bound.

  24. How much data is transferred per CPU second? Google: 0.8-1.5 MB / machine second; Microsoft: 7-11 MB / task second.

  25. What does this mean about Spark versus Hadoop? This work's 19% figure compares serialized + compressed in-memory data against serialized + compressed on-disk data.

  26. This work says nothing about Spark vs. Hadoop! This work's 19% compares serialized + compressed in-memory data with serialized + compressed on-disk data; the reported speedups of 6x or more (amplab.cs.berkeley.edu/benchmark/) and up to 10x (spark.apache.org) compare deserialized in-memory data with on-disk data.

  27. What causes stragglers? Takeaway: causes depend on the workload, but disk and garbage collection are common. Fixing straggler causes can speed up other tasks too.

  28. Live demo

  29. eecs.berkeley.edu/~keo/traces

  30. I want your workloads! Set spark.eventLog.enabled to true and send them to keo@cs.berkeley.edu.
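For anyone contributing traces, event logging is enabled through Spark's configuration; a minimal spark-defaults.conf sketch (the log directory below is an example, adjust for your cluster):

```
# Enable Spark's event log, which records the per-task metrics this
# kind of analysis relies on.
spark.eventLog.enabled  true
# Where to write the logs (example path).
spark.eventLog.dir      file:///tmp/spark-events
```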

  31. Network optimizations can reduce job completion time by at most 2%; CPU (not I/O) is often the bottleneck; optimizing disk yields a 19% reduction in completion time; many straggler causes can be identified and fixed. Project webpage (with links to paper and tool): eecs.berkeley.edu/~keo/traces. Contact: keo@cs.berkeley.edu, @kayousterhout.

  32. Backup Slides

  33. How do results change with scale?

  34. How does the utilization compare?
