HTS: A Multithreaded Direct Sparse Triangular Solver

HTS: A Multithreaded Direct Sparse Triangular Solver
Slide Note
Embed
Share

This article discusses a multithreaded direct sparse triangular solver that combines level scheduling and recursive blocking techniques. The problem statement, solution approach, algorithms, and implementation details are covered, focusing on efficiency in handling sequential RHS with the same nonzero pattern. Various algorithms such as level scheduling, pruned point-to-point, recursive blocking, and hybrid approaches are explored. The current method entails smoothed level size with a threshold using OpenMP and C++98.

  • Multithreaded
  • Sparse Triangular
  • Solver
  • Level Scheduling
  • Recursive Blocking

Uploaded on Feb 23, 2025 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. HTS: A Multithreaded Direct Sparse Triangular Solver Combining Level Scheduling and Recursive Blocking Andrew M. Bradley Thanks: E. Boman, J. Booth, C. Dohrmann, S. Hammond, W. Held, M. Heroux, R. Hoekstra, K. Kim, S. Olivier, A. Prokopenko, S. Rajamanickam Trilinos User Group Meeting 2015 SAND2015-9187 C Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy s National Nuclear Security Administration under contract DE-AC04-94AL85000. SAND NO. 2011-XXXXP

  2. Problem Statement Solve R P T Q x = b Upper or lower sparse triangular matrix T Row scaling R Permutations P, Q Solution and RHS x, b (Everything that is needed for LDL, LU, incomplete factorizations, etc.) Efficient to absorb user data For many sequential RHS with Same T or Same nonzero pattern pat(T) On a multi/many-core node 2

  3. Solution Approach Symbolic phase Find parallelism in pat(T), the graph of T Numeric phase Load data structures with numbers Solve phase 3

  4. Algorithms: Level Scheduling Reorder 4

  5. Algorithms: Level Scheduling Reorder 5

  6. Algorithms: Pruned Point-to-Point Park, J., M. Smelyanskiy, N. Sundaram, and P. Dubey., "Sparsifying synchronization for high-performance shared-memory sparse triangular solver." In Supercomputing, pp. 124-140. Springer International Publishing, 2014. 6

  7. Algorithms: Recursive Blocking 7

  8. Algorithms: Hybrid 8

  9. Algorithms: Hybrid Reorder 9

  10. HTS Level schedule solve: Park et al. s method Recursive blocking Also with point-to-point Switching method Currently: smoothed level size with threshold Currently: OpenMP and C++98 10

  11. Algorithms: Hybrid Reorder 11

  12. 12

  13. 13

  14. UMFPACK LU on IB and KNC 14

  15. Ordering 15

  16. IC-0 (NodeND) on IB and KNC 16

  17. 17

  18. Future Work Point-to-point level scheduling Group rows into tasks to minimize #dependencies Size tasks to reflect level of synchronization Hybrid Switching method(s) Does not have to be 3 blocks; alternate HTS Kokkos interface Support a variety of input formats Direct sparse methods on GPU? 18

More Related Content