Understanding Buffer Overflow in Computer Programming
This content delves into the intricacies of buffer overflow vulnerabilities in computer programming, showcasing real-world examples such as the Internet Worm and IM War incidents. It covers topics like stack buffer overflow exploits, Unix function implementations prone to buffer overflow, and the repercussions of vulnerable buffer code execution. Practical demonstrations highlight the dangers of unchecked data input and the potential for memory corruption issues leading to system crashes.
Uploaded on Sep 19, 2024 | 1 Views
Download Presentation
Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
E N D
Presentation Transcript
CS 105 Tour of Black Holes of Computing Machine-Level Programming V: Miscellaneous Topics Topics Buffer Overflow Linux Memory Layout Understanding Pointers Floating-Point Code
Internet Worm and IM War November, 1988 Internet Worm attacks thousands of Internet hosts. How did it happen? July, 1999 Microsoft launches MSN Messenger (instant messaging system). Messenger clients can access popular AOL Instant Messaging Service (AIM) servers AIM client MSN server MSN client AIM server AIM client 105 2
Internet Worm and IM War (cont.) August 1999 Mysteriously, Messenger clients can no longer access AIM servers. Microsoft and AOL begin the IM war: AOL changes server to disallow Messenger clients Microsoft makes changes to clients to defeat AOL changes. At least 13 such skirmishes. How did it happen? The Internet Worm and AOL/Microsoft War were both based on stack buffer overflow exploits! Many Unix functions do not check argument sizes. Allows target buffers to overflow. 105 3
String Library Code Implementation of Unix function gets No way to specify limit on number of characters to read /* Get string from stdin */ char *gets(char *dest) { int c = getc(); char *p = dest; while (c != EOF && c != '\n') { *p++ = c; c = getc(); } *p = '\0'; return dest; } Similar problems with other Unix functions strcpy: Copies string of arbitrary length scanf, fscanf, sscanf, when given %s conversion specification 105 4
Vulnerable Buffer Code /* Echo Line */ void echo() { char buf[4]; /* Way too small! */ gets(buf); puts(buf); } int main() { printf("Type a string: "); echo(); return 0; } 105 5
Buffer Overflow Executions unix>./bufdemo Type a string:123 123 unix>./bufdemo Type a string:12345 Segmentation Fault unix>./bufdemo Type a string:12345678 Segmentation Fault 105 6
Buffer Overflow Stack Stack Frame for main /* Echo Line */ void echo() { char buf[4]; /* Way too small! */ gets(buf); puts(buf); } Return Address Saved %ebp [3][2][1][0] buf %ebp Stack Frame for echo echo: pushl %ebp movl %esp,%ebp subl $20,%esp pushl %ebx addl $-12,%esp leal -4(%ebp),%ebx # Compute buf as %ebp-4 pushl %ebx # Push buf on stack call gets # Call gets . . . # Save %ebp on stack # Allocate space on stack # Save %ebx # Allocate space on stack 105 7
unix> gdb bufdemo (gdb) break echo Breakpoint 1 at 0x8048583 (gdb) run Breakpoint 1, 0x8048583 in echo () (gdb) print /x *(unsigned *)$ebp $1 = 0xfffff8f8 (gdb) print /x *((unsigned *)$ebp + 1) $3 = 0x804864d Buffer Overflow Stack Example Stack Frame for main Stack Frame for main Before call to gets Return Address Saved %ebp [3][2][1][0] buf Return Address Saved %ebp [3][2][1][0] buf xx xx xx xx 08 04 86 4d ff ff f8 f8 %ebp 0xfffff8d8 Stack Frame for echo Stack Frame for echo 8048648: call 804857c <echo> 804864d: mov 0xffffffe8(%ebp),%ebx # Return Point 105 8
Buffer Overflow Example #1 Before Call to gets Input = 123 Stack Frame for main Stack Frame for main Return Address Saved %ebp [3][2][1][0] buf Return Address Saved %ebp [3][2][1][0] buf 00 33 32 31 08 04 86 4d ff ff f8 f8 %ebp 0xfffff8d8 Stack Frame for echo Stack Frame for echo No Problem 105 9
Buffer Overflow Stack Example #2 Input = 12345 Stack Frame for main Stack Frame for main Return Address Saved %ebp [3][2][1][0] buf Return Address Saved %ebp [3][2][1][0] buf 34 33 32 31 08 04 86 4d %ebp ff ff 00 35 0xfffff8d8 Saved value of %ebp set to 0xffff0035 Stack Frame for echo Stack Frame for echo Bad news when later attempt to restore %ebp echo code: 8048592: push %ebx 8048593: call 80483e4 <_init+0x50> # gets 8048598: mov 0xffffffe8(%ebp),%ebx 804859b: mov %ebp,%esp 804859d: pop %ebp 804859e: ret # %ebp gets set to invalid value 105 10
Buffer Overflow Stack Example #3 Stack Frame for main Stack Frame for main Input = 12345678 Return Address Saved %ebp [3][2][1][0] buf Return Address Saved %ebp [3][2][1][0] buf 34 33 32 31 08 04 86 00 %ebp 38 37 36 35 0xfffff8d8 Stack Frame for echo Stack Frame for echo %ebp and return address corrupted Invalid address No longer pointing to desired return point 8048648: call 804857c <echo> 804864d: mov 0xffffffe8(%ebp),%ebx # Return Point 105 11
Malicious Use of Buffer Overflow Stack after call to gets() void foo(){ bar(); ... } foo stack frame return address A B data written by gets() void bar() { char buf[64]; gets(buf); ... } pad bar stack frame exploit code B Input string contains byte representation of executable code Overwrite return address with address of buffer When bar() executes ret, will jump to exploit code 105 12
Exploits Based on Buffer Overflows Buffer overflow bugs allow remote machines to execute arbitrary code on victim machines. Internet worm Early versions of the finger server (fingerd) used gets() to read the argument sent by the client: finger geoff@cs.hmc.edu Worm attacked fingerd server by sending phony argument: finger exploit-code padding new-return-address exploit code: executed a root shell on the victim machine with a direct TCP connection to the attacker. 105 13
Exploits Based on Buffer Overflows Buffer overflow bugs allow remote machines to execute arbitrary code on victim machines. IM War AOL exploited existing buffer overflow bug in AIM clients Exploit code: returned 4-byte signature (the bytes at some location in the AIM client) to server. When Microsoft changed code to match signature, AOL changed signature location. 105 14
Avoiding Overflow Vulnerability /* Echo Line */ void echo() { char buf[4]; /* Way too small! */ fgets(buf, sizeof buf, stdin); fputs(buf, stdout); } Use library routines that limit string lengths fgets instead of gets strncpy instead of strcpy Or strlcpy if available (see man strcpy and use lbsd) Don t use scanf with %s conversion specification Use fgets to read the string 105 15
Linux Memory Layout FF Stack Stack Runtime stack C0 Heap BF (Stack) Dynamically allocated storage When call malloc, calloc,realloc, new Upper 2 hex digits of address DLLs 80 Dynamically Linked (Shared) Libraries Library routines (e.g., printf, malloc) Linked into object code when first executed 7F Heap Data Statically allocated data E.g., arrays & strings declared in code DLLs 40 3F Heap Text Data Executable machine instructions Read-only Text 08 00 105 16
Linux Memory Allocation Some Heap Stack More Heap Initially Stack Linked FF Stack FF FF FF Stack 80 80 80 80 7F 7F 7F 7F Heap Heap DLLs DLLs 40 DLLs 40 40 40 3F 3F 3F 3F Heap Data Data Data Data Text Text Text Text 08 08 08 08 00 00 00 00 105 17
Text & Stack Example Initially Stack FF (gdb) break main (gdb) run Breakpoint 1, 0x804856f in main () (gdb) print $esp $3 = (void *) 0xfffffc78 80 7F Main Address 0x804856f should be read 0x0804856f 40 3F Stack Data Address 0xfffffc78 Text 08 00 105 18
Dynamic Linking Example Linked (gdb) print malloc $1 = {<text variable, no debug info>} 0x8048454 <malloc> (gdb) run Program exited normally. (gdb) print malloc $2 = {void *(unsigned int)} 0x40006240 <malloc> Stack FF 80 7F Initially Code in text segment that invokes dynamic linker Address 0x8048454 should be read 0x08048454 DLLs 40 3F Data Text 08 Final 00 Code in DLL region 105 19
Memory Allocation Example char big_array[1<<24]; /* 16 MB */ char huge_array[1<<28]; /* 256 MB */ int beyond; char *p1, *p2, *p3, *p4; int useless() { return 0; } int main() { p1 = malloc(1 << 28); /* 256 MB */ p2 = malloc(1 << 8); /* 256 B */ p3 = malloc(1 << 28); /* 256 MB */ p4 = malloc(1 << 8); /* 256 B */ /* Some print statements ... */ } 105 20
Example Addresses FF Stack $esp p3 p1 Final malloc p4 p2 beyond big_array huge_array main() useless() Initial malloc 0xfffffc78 0x500b5008 0x400b4008 0x40006240 0x1904a640 0x1904a538 0x1904a524 0x1804a520 0x0804a510 0x0804856f 0x08048560 0x08048454 80 7F Heap DLLs 40 3F Heap Data Text 08 00 105 21
C Operators Operators () [] -> . ! ~ ++ -- + - * & (type) sizeof right to left * / % + - << >> < <= > >= == != & ^ | && || ?: = += -= *= /= %= &= ^= != <<= >>= , Associativity left to right left to right left to right left to right left to right left to right left to right left to right left to right left to right left to right right to left right to left left to right Note: Unary +, -, and * have higher precedence than binary forms 105 22
C Pointer Declarations p is a pointer to int int *p p is an array[13] of pointer to int int *p[13] p is an array[13] of pointer to int int *(p[13]) p is a pointer to a pointer to an int int **p p is a pointer to an array[13] of int int (*p)[13] f is a function returning a pointer to int int *f() f is a pointer to a function returning int int (*f)() f is a function returning ptr to an array[13] of pointers to functions returning int int (*(*f())[13])() x is an array[3] of pointers to functions returning pointers to array[5] of ints int (*(*x[3])())[5] 105 23
Data Representations: IA32 + x86-64 Sizes of C Objects (in Bytes) C Data Type Typical 32-bit Intel IA32 unsigned int int long int char short long long float double long double char * Or any other pointer x86-64 4 4 4 1 2 8 4 8 8 4 4 4 4 1 2 8 4 8 4 4 8 1 2 8 4 8 10/12 16 8 4 105 24
x86-64 Integer Registers %ax ah Oh. My. al %r8w %r8w r8b %rax %eax %r8 %r8d %bx bh bl %r9w %r9w r9b %rbx %ebx %r9 %r9d %cx ch cl %r10w %r10w %rcx %ecx %r10 %r10d r10b %dx %dx dh dl %r11w %r11w %rdx %edx %r11 %r11d r11b %si %si sil %r12w %r12w %rsi %esi %r12 %r12d r12b %di %di dil %r13w %rdi %edi God. %r13 %r13d r13b %sp %sp spl %r14w %rsp %esp %r14 %r14d r14b %bp %bp bpl %r15w %rbp %ebp %r15 %r15d r15b Extend existing registers. Add 8 new ones. Make %ebp/%rbp general purpose 105 25
x86-64 Integer Registers %rax %eax %r8 %r8d %rbx %ebx %r9 %r9d %rcx %ecx %r10 %r10d %rdx %edx %r11 %r11d %rsi %esi %r12 %r12d %rdi %edi %r13 %r13d %rsp %esp %r14 %r14d %rbp %ebp %r15 %r15d Extend existing registers. Add 8 new ones. Make %ebp/%rbp general purpose 105 26
Instructions Long word l (4 Bytes) Quad word q (8 Bytes) New instructions: movl movq addl addq sall salq movzbq, movslq etc. 32-bit instructions that generate 32-bit results: Set higher order bits of destination register to 0 Example: addl, movl (thus no movzlq) 105 27
Swap in 32-bit Mode swap: void swap(int *xp, int *yp) { int t0 = *xp; int t1 = *yp; *xp = t1; *yp = t0; } pushl %ebp movl %esp,%ebp pushl %ebx Setup movl 12(%ebp),%ecx movl 8(%ebp),%edx movl (%ecx),%eax movl (%edx),%ebx movl %eax,(%edx) movl %ebx,(%ecx) Body movl -4(%ebp),%ebx movl %ebp,%esp popl %ebp ret Finish 105 28
Swap in 64-bit Mode void swap(int *xp, int *yp) { int t0 = *xp; int t1 = *yp; *xp = t1; *yp = t0; } swap: movl movl movl movl retq (%rdi), %edx (%rsi), %eax %eax, (%rdi) %edx, (%rsi) Operands passed in registers (why useful?) First (xp) in %rdi, second (yp) in %rsi 64-bit pointers No stack operations required 32-bit data Data held in registers %eax and %edx movl operation 105 29
Swap Long Ints in 64-bit Mode void swap_l (long int *xp, long int *yp) { long int t0 = *xp; long int t1 = *yp; *xp = t1; *yp = t0; } swap_l: movq movq movq movq retq (%rdi), %rdx (%rsi), %rax %rax, (%rdi) %rdx, (%rsi) 64-bit data Data held in registers %rax and %rdx movq operation Otherwise same 105 30
New Calling Conventions Most procedures no longer need stack frame First six arguments passed in registers Register %rbp available for general use Stack frame accessed via %rsp 128 bytes below %rsp usable by function ( red zone ) 105 31
IA32 Floating Point History 8086: first computer to implement IEEE FP separate 8087 FPU (floating point unit) Instruction decoder and sequencer 486: merged FPU and Integer Unit onto one chip Summary Hardware to add, multiply, and divide Floating point data registers Various control & status registers Integer Unit FPU Floating Point Formats Single precision (C float): 32 bits Double precision (C double): 64 bits Extended precision (C long double): 80 bits Memory 105 32
FPU Data Register Stack FPU register format (extended precision) 0 79 78 64 63 s exp frac FPU registers 8 registers Logically forms shallow stack Top called %st(0) When push too many, bottom values disappear %st(3) %st(2) %st(1) %st(0) Top stack grows down 105 33
FPU instructions Large number of floating point instructions and formats ~50 basic instruction types load, store, add, multiply sin, cos, tan, arctan, and log! Sample instructions: Instruction fldz flds Addr fmuls Addr faddp Effect push 0.0 push M[Addr] %st(0) <- %st(0)*M[Addr] %st(1) <- %st(0)+%st(1); pop Description Load zero Load s.p. real Multiply Add and pop 105 34
Final Observations Buffer Overflow Big C design mistake; most common security problem Memory Layout OS/machine dependent (including kernel version) Stack/data/text/heap/DLL found in most machines Type Declarations in C Notation obscure, but systematic IA64 New registers, completely different calling conventions IA32 Floating Point Strange shallow stack architecture 105 35