Efficient Dynamic Memory Management for Embedded Multicore Systems

 
A Transaction-Friendly Dynamic
Memory Manager for
Embedded Multicore Systems
 
Maurice Herlihy
Joint work with
Thomas Carle, Dimitra Papagiannopoulou,
Iris Bahar, Tali Moreshet
 
Modern Embedded Systems
 
now many-core, with distributed memory
Parallel data structures
Locks don’t scale
Transactions do (more or less)
 
2
 
 
 
Dynamic Memory Management
 
3
 
Software’s «oldest profession»
Usually provided by OS/Libraries
For parallel data structures
on embedded platforms
with HTM
 
High-End Embedded Systems
 
4
 
simplicity
small memory footprint
resource needs roughly known in advance, but not always exactly
 
 
Dynamic Memory Management
 
Heap is a linked list or binary tree
Applications explicitly malloc() and free() memory
 
 
5
[Figure: free list with two blocks: 16 bytes at 0x0800000 and 64 bytes at 0x0800100]
Principles of dynamic memory management
Allocate a 32-byte chunk
6
[Figure: the 16-byte block at 0x0800000 is too small; the 64-byte block at 0x0800100 is large enough]
 
Dynamic Memory Management
 
Allocate a 32-byte chunk
 
7
[Figure: free list before the split: 16 bytes at 0x0800000, 64 bytes at 0x0800100]
 
Dynamic Memory Management
 
Allocate a 32-byte chunk
 
8
[Figure: the 64-byte block at 0x0800100 is split into two 32-byte chunks, at 0x0800100 and 0x0800120]
Principles of dynamic memory management
Allocate a 32-byte chunk
9
[Figure: state after the split: 16 bytes at 0x0800000, plus the two 32-byte chunks at 0x0800100 and 0x0800120]
Principles of dynamic memory management
Allocate a 32-byte chunk
10
[Figure: resulting blocks: 16 bytes at 0x0800000, 32 bytes at 0x0800100, 32 bytes at 0x0800120; one 32-byte chunk satisfies the request]
Freeing is similar …
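The walkthrough above is a standard first-fit search that splits the first sufficiently large free block. A minimal single-threaded sketch in C (struct layout, names, and the split threshold are illustrative assumptions, not the authors' implementation):

    #include <stddef.h>
    #include <stdint.h>

    /* Each free block starts with a small header linking it into the free list. */
    typedef struct block {
        size_t        size;   /* usable bytes after the header */
        struct block *next;   /* next free block, or NULL      */
    } block_t;

    static block_t *free_list;        /* head of the shared free list */

    /* First-fit: take the first block that is large enough; split off the
     * remainder as a new free block.  Alignment and coalescing are omitted. */
    static void *ff_malloc(size_t want)
    {
        block_t **prev = &free_list;
        for (block_t *b = free_list; b != NULL; prev = &b->next, b = b->next) {
            if (b->size < want)
                continue;                               /* too small, keep looking */
            if (b->size >= want + sizeof(block_t) + sizeof(void *)) {
                /* Split: carve the tail into a new free block (as on the slides). */
                block_t *rest = (block_t *)((uint8_t *)(b + 1) + want);
                rest->size = b->size - want - sizeof(block_t);
                rest->next = b->next;
                *prev      = rest;
                b->size    = want;
            } else {
                *prev = b->next;                        /* use the whole block */
            }
            return (void *)(b + 1);                     /* payload follows the header */
        }
        return NULL;                                    /* no block is large enough */
    }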
 
Dynamic Memory Management
 
is a concurrent data structure problem
steps must be atomic
 
11
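Without atomicity, two threads scanning the list can claim or split the same free block. The simplest remedy is to serialize each operation with a lock, conceptually like the lock-based configurations compared later in the evaluation; a sketch reusing the hypothetical ff_malloc() above:

    #include <pthread.h>

    static pthread_mutex_t heap_lock = PTHREAD_MUTEX_INITIALIZER;

    /* The traversal, unlink, and split all happen under one lock, so the whole
     * allocation appears atomic to other threads. */
    void *locked_malloc(size_t want)
    {
        pthread_mutex_lock(&heap_lock);
        void *p = ff_malloc(want);
        pthread_mutex_unlock(&heap_lock);
        return p;
    }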
 
Fast Path
 
Thread-local pools (see the sketch below)
successful malloc() and free() need no synchronization
calls happen inside the transaction
speculative allocation can speed up execution
 
 
12
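One way to realize such a pool is a bump-pointer region owned by a single thread, so allocation and freeing inside it touch only thread-private data. A sketch (the layout, the bump-pointer policy, and the names are assumptions; the actual pool organization may differ):

    #include <stddef.h>
    #include <stdint.h>

    #define POOL_SIZE 2048        /* one of the pool sizes used in the evaluation */

    typedef struct {
        uint8_t *base;            /* set when the pool is (re)filled from the shared heap */
        size_t   used;            /* bytes handed out so far */
        size_t   freed;           /* bytes returned, recycled in bulk on refill */
    } pool_t;

    static __thread pool_t my_pool;   /* thread-local: private data, no conflicts */

    /* Fast path: succeeds with no synchronization.  Returns NULL when the pool
     * is exhausted, which triggers the slow path of the next slide. */
    static void *pool_malloc(size_t want)
    {
        if (my_pool.used + want > POOL_SIZE)
            return NULL;
        void *p = my_pool.base + my_pool.used;
        my_pool.used += want;
        return p;
    }

    static void pool_free(void *p, size_t size)
    {
        (void)p;                  /* bump-pointer pools reclaim space in bulk */
        my_pool.freed += size;
    }

Because my_pool is thread-private, a transaction that allocates only from it adds no shared locations to its read/write set, which is why these calls need no synchronization.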
 
Slow Path
 
If the local pool is exhausted, we must allocate from the shared heap
Allocating within the transaction means more conflicts
Instead (see the sketch below):
1. abort the current transaction
2. allocate a fresh local pool
3. restart the transaction
 
13
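The three steps can be sketched with generic HTM-style primitives, reusing the pool sketch above. tm_begin(), tm_end(), tm_abort(), and heap_refill_pool() are hypothetical placeholders for the embedded platform's hardware TM interface and shared-heap allocator, not a real API:

    /* Hypothetical HTM primitives (placeholders for the platform's interface,
     * not a real library):
     *   tm_begin()   returns TM_STARTED on entry, or an abort code after an abort
     *   tm_abort(c)  aborts the running transaction; tm_begin() then returns c
     *   tm_end()     commits the running transaction                            */
    #define TM_STARTED      0
    #define ABORT_NEED_POOL 1

    extern int  tm_begin(void);
    extern void tm_end(void);
    extern void tm_abort(int code);
    extern void heap_refill_pool(pool_t *pool, size_t size);   /* hypothetical shared-heap call */

    /* malloc() as seen from inside a transaction */
    void *tm_malloc(size_t want)
    {
        void *p = pool_malloc(want);      /* fast path: thread-local, no conflicts */
        if (p == NULL)
            tm_abort(ABORT_NEED_POOL);    /* step 1: abort the current transaction */
        return p;
    }

    void run_transaction(void (*body)(void))
    {
        for (;;) {
            int status = tm_begin();
            if (status == TM_STARTED) {
                body();                   /* may call tm_malloc() / pool_free() */
                tm_end();
                return;                   /* committed */
            }
            if (status == ABORT_NEED_POOL) {
                /* step 2: refill the local pool in its own short transaction */
                if (tm_begin() == TM_STARTED) {
                    heap_refill_pool(&my_pool, POOL_SIZE);   /* free remainder, get fresh pool */
                    tm_end();
                }
            }
            /* step 3: loop around and restart the original transaction */
        }
    }

Because the refill runs in its own short transaction, the shared free list is never touched from inside the application's larger transaction, which is what shrinks footprints and makes commits more likely (next slide).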
 
Benefits
 
Transactions have smaller footprints
Disentangles the application from shared-heap management
Transactions are more likely to commit
 
14
 
Transaction-friendly memory management
 
At application startup (sketched below):
1. One thread initializes the heap
2. Each thread allocates its own local pool
 
15
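A possible startup sequence for these two steps, assuming POSIX threads plus the hypothetical heap_init() and heap_refill_pool() helpers from the earlier sketches (names are illustrative):

    #include <pthread.h>

    #define NTHREADS 8

    extern void heap_init(void);              /* hypothetical: builds the shared free list */
    static pthread_barrier_t start_barrier;

    static void *worker(void *arg)
    {
        if ((long)arg == 0)
            heap_init();                      /* step 1: one thread initializes the heap */
        pthread_barrier_wait(&start_barrier); /* wait until the heap exists */
        heap_refill_pool(&my_pool, POOL_SIZE);/* step 2: each thread takes a local pool
                                                 (synchronized in a real system)        */
        /* ... run the application's transactions ... */
        return NULL;
    }

    int main(void)
    {
        pthread_t t[NTHREADS];
        pthread_barrier_init(&start_barrier, NULL, NTHREADS);
        for (long i = 0; i < NTHREADS; i++)
            pthread_create(&t[i], NULL, worker, (void *)i);
        for (int i = 0; i < NTHREADS; i++)
            pthread_join(t[i], NULL);
        return 0;
    }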
Transaction-friendly dynamic memory
management
16
[Flowchart: malloc() inside a transaction]
Enter transaction; when the next instruction is malloc(), check whether the local pool has enough memory.
If yes: allocate from the local pool and return the allocated chunk.
If no: abort the transaction, enter a separate allocation transaction, free the remaining pool and allocate a fresh pool, commit the allocation transaction, then re-enter the original transaction.
Transaction-friendly dynamic memory
management
17
[Flowchart: free() inside a transaction]
Enter transaction; when the next instruction is free(), return the chunk to the local pool.
 
 
Evaluation
 
STAMP Vacation benchmark
1, 2, 4, 8, and 16 cores
Local pools large enough so no refill needed
Tested:
transactional with local pools
lock-based with local pools
lock-based without local pools
 
18
 
Evaluation
 
Vacation benchmark (Normal run)
 
19
 
[Chart: Vacation benchmark results, normal run; axis annotated from worse to better]
 
Evaluation (2)
 
Vacation benchmark (Refill mechanism on one core)
8 cores
local pool sizes: 64, 128, 256, 512, 1024, or 2048 bytes
Transactional synchronization
 
20
 
Evaluation (2)
 
Vacation benchmark (Refill mechanism on one core)
 
21
 
Evaluation (3)
 
Vacation benchmark (Refill mechanism on all but one core)
8 cores
local pool sizes: 64, 128, 256, 512, 1024, or 2048 bytes
Transactional synchronization
 
22
 
Evaluation (3)
 
Vacation benchmark (Refill mechanism on all but one core)
 
23
 
worst case: 20% increase in execution time
«Wall» between 512 and 1024 bytes: refills may induce additional conflicts
 
Conclusions
 
Dynamic memory management for embedded TM
Results:
better than locking and than in-transaction malloc()
more flexible than static allocation
Future work:
more benchmarks
explore new fallback reprovisioning strategies
memory stealing between threads
 
24