Enhancing Replay Interface Efficiency in System Debugging

Microsoft Research Asia
Ming Wu
, Haoxiang Lin, Xuezheng Liu, Zhenyu Guo,
Huayang Guo, Lidong Zhou, Zheng Zhang
MIT
Fan Long, Xi Wang, Zhilei Xu
Outline
Motivation
Observation
Challenges
Modeling Replay Interface
Generating Replay Interface
Record and Replay
Evaluation
Conclusion
Motivation
 
Replay is important due to non-determinism
Caused by time, user input, network I/O, thread
interleaving
Makes postmortem debug hard
Existing replay tools
Incurs significant overhead: interposition & logging
Hard to be adopted, especially for deployed system
How to mitigate recording overhead?
Using 
efficient
 way to find the 
necessary
 information
to log
Observation
Replay interface
 between program and environment
Only part of the program needs to be replayed
neon’s
routine
status->code
req->respbuf
recv
*respbuf
(
1MB
)
atoi
return value
 
struct ne_request {
    ne_status status;
    char respbuf[];
};
 
int read_status_line(
        ne_request *req,
        ne_status *status, ...)
{
    ne_sock_readline(…, req->respbuf, …);
    if (...) 
status->code = atoi(buffer + 4);
    else if (ne_parse_statusline(buffer, status))
    {...}
}
 
4B
 
1MB
Challenges
 
 
 
 
 
 
Finding a complete replay interface
Finding a replay interface incurring low recording
overhead
 
Outline
Motivation
Observation
Challenges
Modeling Replay Interface
Generating Replay Interface
Record and Replay
Evaluation
Conclusion
Execution Flow Graph(EFG)
 
f() {
  cnt = 0;
  g(&cnt);
  printf("%d\n", cnt);
  g(&cnt);
  printf("%d\n", cnt);
}
g(int *p) {
  a = random(); *p += a;
}
 
// execution
1 cnt
1
 
<- 
0
2 a
1
 
<- 
random()
3 cnt
2
 
<- 
cnt
1
 + a
1
4 print cnt
2
5 a
2
 
<- 
random()
6 cnt
3
 
<- 
cnt
2
 + a
2
7 print cnt
3
Inst1
Inst2
Inst3
Inst5
cnt
1
a
1
cnt
2
a
2
Inst6
cnt
3
Inst7
Inst4
 
Cut 1
 
Cut 2
 
replay target
 
non-deterministic
 
deterministic
Outline
Motivation
Observation
Challenges
Modeling Replay Interface
Generating Replay Interface
Record and Replay
Evaluation
Conclusion
Static Flow Graph
 
Replay interface on EFG only optimal for specific run
Sound approximation of execution flow graph
 
 
 
 
Scan whole program
Operation node for function, value node for variable
Interpret instruction as read/write
y = x + 1 
 read x and write y
Alias analysis
f
cnt
a
g
Functions without Source Code
 
Conservatively consider them as non-deterministic by
default
recv(fd, buf, len, flags)
Annotate functions to provide write edge
recv([in]fd, [out, bsize(return)] buf, [in]len, [in]flags)
Annotate functions as deterministic
Math functions: 
abs()
, 
sqrt()
Memory and string: 
memcpy()
, 
strcat()
Outline
Motivation
Observation
Challenges
Modeling Replay Interface
Generating Replay Interface
Record and Replay
Evaluation
Conclusion
Replay Runtime
 
Calls
Record call from function in non-replay space to
function which 
may
 be in replay space
Replay callee if it does belong to replay space
Writes
Where to issue writes
When to issue writes
Other Subtle Non-determinisms
 
Memory Management
Address of variables in replay space should not change
Separated deterministic memory pool for replay space
Separate stacks for replay and non-replay functions
Thread Interleaving
Synchronization log
Deterministic multi-threading
Evaluation
Implemented iTarget for C program on Windows
Using Phoenix compiler framework for
instrumentation
Benchmarks:
Apache HTTP Server, Berkeley DB, neon HTTP client,
wget, SPEC CINT2000
Modular and monolithic programs
Compared to R2 (OSDI 2008)
Apache HTTP server
 
 
less than 1%
slowdown
Conclusion
A model
Reduce the problem of finding an optimal replay
interface to that of finding the minimum cut in a data
flow graph
A system: iTarget
employ programming language techniques to achieve
both correctness and low recording overhead
Thanks! Q&A
Profiling
Results tend not to be sensitive to the profiling
workload scale
Related Work
Library-based replay tools:
RecPlay (TOCS 1999)
Flashback (USENIX ATC 2004)
Liblog (USENIX ATC 2006)
R2 (OSDI 2008)
Instruction level replay
iDNA (VEE 2006)
Other language runtime
Java, ML, MPI
Memory Management
 
Address of variables in replay space should not change
Variables allocated in heap
Non-replay space function may allocate memory
Separate deterministic memory pool for replay space
Variables allocated on stack
Record ESP at a call from
   non-replay to replay space
Reset the ESP during replay
 
ESP
 
Recorded ESP
 
lower address
 
higher address
Memory Management
Address of variables in replay space should not change
Variables allocated in heap
Non-replay space function may allocate memory
Separate deterministic memory pool for replay space
Variables allocated on stack
Record ESP at a call from
   non-replay to replay space
Reset the ESP during replay
ESP
Recorded ESP
Current ESP
lower address
higher address
Monolithic Program
Neon and Wget
Functions without Source Code
 
Conservatively consider them as non-deterministic by
default
recv(fd, buf, len, flags)
System global variable: 
errno
Annotate functions to provide write edge
recv([in]fd, [out, bsize(return)] buf, [in]len, [in]flags)
Annotate functions as deterministic
Math functions: 
abs()
, 
sqrt()
Memory and string: 
memcpy()
, 
strcat()
Memory Management
 
Address of variables in replay space should not change
Variables allocated in heap
Non-replay space function may allocate memory
Separate deterministic memory pool for replay space
Variables allocated on stack
Record ESP at a call from
   non-replay to replay space
Reset the ESP during replay
Thread Management
Thread interleaving introduce another source of non-
determinism
Guarantee same write order during replay as recording
run
Synchronization log
Track causal dependency
Utilize deterministic multi-threading model
Execution Flow Graph(EFG)
f() {
  cnt = 0;
  g(&cnt);
  printf("%d\n", cnt);
  g(&cnt);
  printf("%d\n", cnt);
}
g(int *p) {
  a = random(); *p += a;
}
// execution
1 cnt
1
 
<- 
0
2 a
1
 
<- 
random()
3 cnt
2
 
<- 
cnt
1
 + a
1
4 print cnt
2
5 a
2
 
<- 
random()
6 cnt
3
 
<- 
cnt
2
 + a
2
7 print cnt
3
f
cnt
1
a
1
cnt
2
a
2
cnt
3
g
1
g
2
 
Cut 2
Slide Note
Embed
Share

Efforts by researchers at Microsoft Research Asia and MIT focus on enhancing replay interface efficiency for system debugging. The motivation stems from the non-determinism challenges caused by time, user input, network I/O, and thread interleaving. The study observes that only certain parts of a program need to be replayed, posing unique challenges in finding a complete replay interface with low recording overhead. The execution flow graph illustrates deterministic and non-deterministic behaviours during replay. By addressing these challenges, the research aims to improve postmortem debugging and system reliability.

  • System debugging
  • Microsoft Research Asia
  • MIT
  • Replay interface
  • Non-determinism

Uploaded on Sep 21, 2024 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

E N D

Presentation Transcript


  1. Microsoft Research Asia Ming Wu, Haoxiang Lin, Xuezheng Liu, Zhenyu Guo, Huayang Guo, Lidong Zhou, Zheng Zhang MIT Fan Long, Xi Wang, Zhilei Xu

  2. Outline Motivation Observation Challenges Modeling Replay Interface Generating Replay Interface Record and Replay Evaluation Conclusion

  3. Motivation Replay is important due to non-determinism Caused by time, user input, network I/O, thread interleaving Makes postmortem debug hard Existing replay tools Incurs significant overhead: interposition & logging Hard to be adopted, especially for deployed system How to mitigate recording overhead? Using efficient way to find the necessary information to log

  4. Observation Replay interface between program and environment Only part of the program needs to be replayed struct ne_request { ne_status status; char respbuf[]; }; status->code neon s routine req->respbuf int read_status_line( ne_request *req, ne_status *status, ...) { ne_sock_readline( , req->respbuf, ); if (...) status->code = atoi(buffer + 4); else if (ne_parse_statusline(buffer, status)) {...} } return value recv 4B 1MB *respbuf (1MB) atoi

  5. Challenges Finding a complete replay interface Finding a replay interface incurring low recording overhead

  6. Outline Motivation Observation Challenges Modeling Replay Interface Generating Replay Interface Record and Replay Evaluation Conclusion

  7. Execution Flow Graph(EFG) f() { cnt = 0; g(&cnt); printf("%d\n", cnt); g(&cnt); printf("%d\n", cnt); } g(int *p) { a = random(); *p += a; } replay target cnt1 Inst1 a1 Inst2 deterministic Inst3 Cut 1 cnt2 Inst4 // execution 1 cnt1 <- 0 2 a1 <- random() 3 cnt2 <- cnt1 + a1 4 print cnt2 5 a2 <- random() 6 cnt3 <- cnt2 + a2 7 print cnt3 Cut 2 non-deterministic a2 Inst5 Inst6 cnt3 Inst7

  8. Outline Motivation Observation Challenges Modeling Replay Interface Generating Replay Interface Record and Replay Evaluation Conclusion

  9. Static Flow Graph Replay interface on EFG only optimal for specific run Sound approximation of execution flow graph cnt g f a Scan whole program Operation node for function, value node for variable Interpret instruction as read/write y = x + 1 read x and write y Alias analysis

  10. Functions without Source Code Conservatively consider them as non-deterministic by default recv(fd, buf, len, flags) Annotate functions to provide write edge recv([in]fd, [out, bsize(return)] buf, [in]len, [in]flags) Annotate functions as deterministic Math functions: abs(), sqrt() Memory and string: memcpy(), strcat()

  11. Outline Motivation Observation Challenges Modeling Replay Interface Generating Replay Interface Record and Replay Evaluation Conclusion

  12. Replay Runtime Calls Record call from function in non-replay space to function which may be in replay space Replay callee if it does belong to replay space Writes Where to issue writes Replayed When to issue writes f i x Downcall Upcall g h Non-replayed

  13. Other Subtle Non-determinisms Memory Management Address of variables in replay space should not change Separated deterministic memory pool for replay space Separate stacks for replay and non-replay functions Thread Interleaving Synchronization log Deterministic multi-threading

  14. Evaluation Implemented iTarget for C program on Windows Using Phoenix compiler framework for instrumentation Benchmarks: Apache HTTP Server, Berkeley DB, neon HTTP client, wget, SPEC CINT2000 Modular and monolithic programs Compared to R2 (OSDI 2008)

  15. Apache HTTP server less than 1% slowdown

  16. Conclusion A model Reduce the problem of finding an optimal replay interface to that of finding the minimum cut in a data flow graph A system: iTarget employ programming language techniques to achieve both correctness and low recording overhead

  17. Thanks! Q&A

  18. Profiling Results tend not to be sensitive to the profiling workload scale

  19. Related Work Library-based replay tools: RecPlay (TOCS 1999) Flashback (USENIX ATC 2004) Liblog (USENIX ATC 2006) R2 (OSDI 2008) Instruction level replay iDNA (VEE 2006) Other language runtime Java, ML, MPI

  20. Memory Management Address of variables in replay space should not change Variables allocated in heap Non-replay space function may allocate memory Separate deterministic memory pool for replay space Variables allocated on stack Record ESP at a call from non-replay to replay space Reset the ESP during replay lower address i i ESP Recorded ESP h Run- time g f f higher address Recording stack Replay stack

  21. Memory Management Address of variables in replay space should not change Variables allocated in heap Non-replay space function may allocate memory Separate deterministic memory pool for replay space Variables allocated on stack Current ESP Record ESP at a call from non-replay to replay space Reset the ESP during replay lower address i i ESP Recorded ESP h Run- time g f f higher address Recording stack Replay stack

  22. Monolithic Program Neon and Wget

  23. Functions without Source Code Conservatively consider them as non-deterministic by default recv(fd, buf, len, flags) System global variable: errno Annotate functions to provide write edge recv([in]fd, [out, bsize(return)] buf, [in]len, [in]flags) Annotate functions as deterministic Math functions: abs(), sqrt() Memory and string: memcpy(), strcat()

  24. Memory Management Address of variables in replay space should not change Variables allocated in heap Non-replay space function may allocate memory Separate deterministic memory pool for replay space Variables allocated on stack Record ESP at a call from non-replay to replay space Reset the ESP during replay

  25. Thread Management Thread interleaving introduce another source of non- determinism Guarantee same write order during replay as recording run Synchronization log Track causal dependency Utilize deterministic multi-threading model

  26. Execution Flow Graph(EFG) f() { cnt = 0; g(&cnt); printf("%d\n", cnt); g(&cnt); printf("%d\n", cnt); } g(int *p) { a = random(); *p += a; } cnt1 g1 a1 Cut 2 f cnt2 // execution 1 cnt1 <- 0 2 a1 <- random() 3 cnt2 <- cnt1 + a1 4 print cnt2 5 a2 <- random() 6 cnt3 <- cnt2 + a2 7 print cnt3 g2 a2 cnt3

More Related Content

giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#