Linking in C Programming

Linking
Alan L. Cox
alc@rice.edu
Some slides adapted from CMU 15.213 slides
Objectives
 
Be able to answer the textbook problems 
 
Understand how C storage-class specifiers (e.g. static,
extern) control linkage and memory allocation for
variables
 
Be able to recognize some of the pitfalls when
developing modular programs
 
Appreciate how linking can help with
efficiency, modularity, and “evolvability”
 
Example Program (2 .c files)
/* main.c */
void swap(void);
int buf[2] = {1, 2};
int main(void)
{
  swap();
  return (0);
}
/* swap.c */
extern int buf[];
int *bufp0 = &buf[0];
int *bufp1;
void swap(void)
{
  int temp;
  bufp1  = &buf[1];
  temp   = *bufp0;
  *bufp0 = *bufp1;
  *bufp1 = temp;
}
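To experiment with the example without a multi-file build, the two modules can be merged into one translation unit. This is only a sketch under that assumption: in the real two-file program the extern declaration is satisfied by the linker, not by a definition in the same file.

```c
/* Single-file sketch of main.c + swap.c (assumption: merged for easy testing). */
int buf[2] = {1, 2};      /* defined in main.c in the original */

extern int buf[];         /* the declaration swap.c relies on; harmless here */
int *bufp0 = &buf[0];     /* initialized global pointer */
int *bufp1;               /* uninitialized global pointer */

void swap(void)
{
  int temp;               /* local variable: invisible to the linker */

  bufp1  = &buf[1];
  temp   = *bufp0;
  *bufp0 = *bufp1;
  *bufp1 = temp;
}
```

Calling swap() once exchanges buf[0] and buf[1].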
An Analogy for Linking
Linking
 
Linking: collecting and combining various
pieces of code and data into a single file that
can be loaded into memory and executed
 
Why learn about linking?
It won’t make you a better jigsaw puzzle solver!
It will help you build large programs
It will help you avoid dangerous program errors
It will help you understand how language scoping
rules for variables are implemented
It will help you understand other important
concepts (that are covered later in the class)
It will enable you to exploit shared libraries
Compilation
UNIX% cc -v -O -g -o p main.c swap.c
cc1 -quiet -v main.c -quiet -dumpbase main.c -mtune=generic
  -auxbase main -g -O -version -o /tmp/cchnheja.s
as -V -Qy -o /tmp/ccmNFRZd.o /tmp/cchnheja.s
cc1 -quiet -v swap.c -quiet -dumpbase swap.c -mtune=generic
  -auxbase swap -g -O -version -o /tmp/cchnheja.s
as -V -Qy -o /tmp/ccx8FECg.o /tmp/cchnheja.s
collect2 --eh-frame-hdr -m elf_x86_64 --hash-style=gnu
  -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o p crt1.o crti.o
  crtbegin.o -L<..snip..> /tmp/ccmNFRZd.o /tmp/ccx8FECg.o -lgcc
  --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed
  -lgcc_s --no-as-needed crtend.o crtn.o
Compiler: .c C source code to .s assembly code
Assembler: .s assembly code to .o relocatable object code
Linker: .o to executable
Compilation
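UNIX% cc -O -g -o p main.c swap.c
[Figure: C source code (main.c, swap.c) -> cc1 -> assembly code (main.s, swap.s) -> as -> relocatable object code (main.o, swap.o) -> linking step, ld (collect2) -> executable p. The object files and the executable are ELF format files.]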
ELF (Executable and Linkable Format)
Order & existence of sections is arbitrary,
except ELF header must be present and first
File layout, starting at offset 0:
ELF header
Program header table
.text
.rodata
.data
.bss
.symtab
.rel.text
.rel.data
.debug
Section header table
ELF Header
Basic description of file contents:
File format identifier
Architecture
Endianness
Alignment requirement for other sections
Location of other sections
Code's starting address (entry point)
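The first few header fields can be checked by hand. A minimal sketch, assuming the caller has already read the first bytes of a file into a buffer; the function names are hypothetical, while the byte offsets and values (magic number 0x7f 'E' 'L' 'F', class byte at offset 4, data-encoding byte at offset 5) come from the ELF identification array:

```c
#include <stddef.h>

/* Offsets within the identification bytes at the start of every ELF file. */
enum { SKETCH_EI_CLASS = 4, SKETCH_EI_DATA = 5 };

/* Returns 1 if buf starts with the ELF magic number, else 0. */
int is_elf(const unsigned char *buf, size_t len)
{
  return len >= 6 && buf[0] == 0x7f && buf[1] == 'E' &&
         buf[2] == 'L' && buf[3] == 'F';
}

/* Returns 32 or 64 for the file class, or 0 if unknown. */
int elf_word_size(const unsigned char *buf)
{
  if (buf[SKETCH_EI_CLASS] == 1) return 32;   /* ELFCLASS32 */
  if (buf[SKETCH_EI_CLASS] == 2) return 64;   /* ELFCLASS64 */
  return 0;
}

/* Returns 1 for little-endian data encoding (ELFDATA2LSB), 0 otherwise. */
int elf_is_little_endian(const unsigned char *buf)
{
  return buf[SKETCH_EI_DATA] == 1;
}
```

The readelf -h command prints the same fields, plus the entry point, for a real file.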
Program and Section Headers
Program header table:
Info about other sections necessary for loading into memory for execution
Required for executables & libraries
Section header table:
Info about other sections necessary for linking
Required for relocatables
Text Section
Machine code (instructions)
read-only
Data Sections
Static data
.rodata: initialized, read-only
.data: initialized, read/write
.bss: uninitialized, read/write
(BSS = "Block Started by Symbol", a pseudo-op for the IBM 704)
Initialized
Initial values in ELF file
Uninitialized
Only total size in ELF file
Writable distinction enforced at run-time
Why? Protection; sharing
How? Virtual memory
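A short sketch of how C definitions usually map onto these sections. The placement comments describe typical GCC/Linux behavior, not a guarantee, and all variable names are made up for illustration:

```c
#include <stddef.h>

const int primes[4] = {2, 3, 5, 7};  /* initialized, read-only: typically .rodata */
int counter = 42;                    /* initialized, read/write: typically .data */
int scratch[1024];                   /* uninitialized: typically .bss, zero at start-up */

/* Sums scratch; returns 0 as long as nothing has written to it yet. */
long sum_scratch(void)
{
  long total = 0;

  for (size_t i = 0; i < sizeof scratch / sizeof scratch[0]; i++)
    total += scratch[i];
  return total;
}
```

Compiling with cc -c and running readelf -S on the object file shows that .bss records a size but occupies no space in the file.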
Symbol Table
Describes where global
variables and functions
are defined
Present in all relocatable
ELF files
/* main.c */
void swap(void);
int buf[2] = {1, 2};
int main(void)
{
  swap();
  return (0);
}
Relocation Information
Describes where and how symbols are used
.rel.text: a list of locations in the .text section that will need to be modified when the linker combines this object file with others
.rel.data: relocation information for any global variables that are referenced or defined by the module
Allows object files to be easily relocated
Debug Section
Relates source code to the
object code within the ELF
file
Other Sections
Other kinds of sections
also supported, including:
Other debugging info
Version control info
Dynamic linking info
C++ initializing &
finalizing code
Linker Symbol Classification
Global symbols
Symbols defined by module m that can be referenced by other modules
C: non-static functions & global variables
External symbols
Symbols referenced by module m but defined by some other module
C: extern functions & variables
Local symbols
Symbols that are defined and referenced exclusively by module m
C: static functions & variables
Local linker symbols are not local function variables!
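The classification can be read off a single source file. A minimal sketch with made-up names; the comments state which kind of linker symbol, if any, each definition produces:

```c
int shared_total;                 /* global symbol */
static int private_total;         /* local symbol */

static int bump(int amount)       /* local symbol */
{
  private_total += amount;
  return private_total;
}

int record(int amount)            /* global symbol */
{
  static int calls;               /* local symbol (persists across calls) */
  int doubled = 2 * amount;       /* automatic variable: no symbol at all */

  calls++;
  shared_total = bump(doubled);
  return calls;
}
```

Running readelf -s on the resulting object file should list shared_total and record with GLOBAL binding and the static names with LOCAL binding, while doubled does not appear.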
Linker Symbols
/* main.c */
void swap(void);
int buf[2] = {1, 2};
int main(void)
{
  swap();
  return (0);
}
/* swap.c */
extern int buf[];
int *bufp0 = &buf[0];
int *bufp1;
void swap(void)
{
  int temp;
  bufp1  = &buf[1];
  temp   = *bufp0;
  *bufp0 = *bufp1;
  *bufp1 = temp;
}
main.c: definition of global symbols buf and main
main.c: reference to external symbol swap
swap.c: definition of global symbol swap
swap.c: definition of global symbols bufp0 and bufp1 (even though not used outside file)
swap.c: reference to external symbol buf
Linker knows nothing about local variables (e.g. temp)
Linker Symbols
What's missing?
swap – where is it?
/* main.c */
void swap(void);
int buf[2] = {1, 2};
int main(void)
{
  swap();
  return (0);
}
UNIX% cc -O -c main.c
UNIX% readelf -s main.o
 
Symbol table '.symtab' contains 11 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     8: 0000000000000000    19 FUNC    GLOBAL DEFAULT    1 main
     9: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND swap
    10: 0000000000000000     8 OBJECT  GLOBAL DEFAULT    3 buf
main is a 19-byte function located at offset 0 of section 1 (.text)
swap is referenced in this file, but is undefined (UND)
buf is an 8-byte object located at offset 0 of section 3 (.data)
use readelf -S to see sections
Linker Symbols
What's missing?
buf – where is it?
/* swap.c */
extern int buf[];
int *bufp0 = &buf[0];
int *bufp1;
void swap(void)
{
  int temp;
  bufp1  = &buf[1];
  temp   = *bufp0;
  *bufp0 = *bufp1;
  *bufp1 = temp;
}
Symbol table '.symtab' contains 12 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     8: 0000000000000000    38 FUNC    GLOBAL DEFAULT    1 swap
     9: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND buf
    10: 0000000000000008     8 OBJECT  GLOBAL DEFAULT  COM bufp1
    11: 0000000000000000     8 OBJECT  GLOBAL DEFAULT    3 bufp0
swap is a 38-byte function located at offset 0 of section 1 (.text)
buf is referenced in this file, but is undefined (UND)
bufp1 is an 8-byte uninitialized (COMMON) object with an 8-byte alignment requirement
bufp0 is an 8-byte object located at offset 0 of section 3 (.data)
Linking Steps
Symbol Resolution
Determine where symbols are defined and what
size data/code they refer to
Relocation
Combine modules, relocate code/data, and fix
symbol references based on new locations
[Figure: relocatable object code (main.o, swap.o) -> ld (collect2) -> executable p]
Problem: Undefined Symbols
Missing symbols are not compiler errors
May be defined in another file
Compiler just inserts an undefined entry in the
symbol table
During linking, any undefined symbols that
cannot be resolved cause an error
UNIX% cc -O -o p main.c      (forgot to type swap.c)
/tmp/cccpTy0d.o: In function `main':
main.c:(.text+0x5): undefined reference to `swap'
collect2: ld returned 1 exit status
UNIX%
Problem: Multiply Defined Symbols
Different files could define the same symbol
Is this an error?
If not, which one should be used?  One or many?
Linking: Example
/* first file */
int  x = 3;
int  y = 4;
int  z;
int foo(int a) {…}
int bar(int b) {…}
/* second file */
extern int  x;
static int  y = 6;
int         z;
int foo(int a);
static int bar(int b) {…}
Linked result: ?
Note: Linking uses object files
Examples use source-level for convenience
Linking: Example
(same two files as above)
Linked result so far:
int  x = 3;
int foo(int a) {…}
Defined in one file
Declared in other files
Only one copy exists
Linking: Example
(same two files as above)
Linked result so far:
int  x = 3;
int  y = 4;
int  y' = 6;
int foo(int a) {…}
int bar(int b) {…}
int bar'(int b) {…}
Private (static) names are local symbols, not visible to other files.
Can't conflict with other files' names
Renaming (y', bar') is a convenient source-level way to understand this
Linking: Example
(same two files as above)
Linked result:
int  x = 3;
int  y = 4;
int  y' = 6;
int  z;
int foo(int a) {…}
int bar(int b) {…}
int bar'(int b) {…}
C allows you to omit extern in some cases. Don't!
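One common way to follow this advice is to declare each shared variable with extern in a header and define it in exactly one .c file. The sketch below shows both parts in a single file so it compiles on its own; the header name shared.h is hypothetical:

```c
/* --- would live in a header, e.g. a hypothetical shared.h --- */
extern int z;          /* declaration only: no storage allocated */
int foo(int a);        /* function declarations are extern by default */

/* --- would live in exactly one .c file --- */
int z = 0;             /* the single definition, with an initializer */

int foo(int a)
{
  return a + z;
}
```

With this layout, every other file that includes the header refers to the one z defined here, and an accidental second definition is reported as a duplicate instead of being silently merged.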
Strong & Weak Symbols
Program symbol definitions are either strong
or weak
strong: procedures & initialized globals
weak: uninitialized globals
/* p1.c */
int foo=5;    /* strong */
p1() {}       /* strong */
/* p2.c */
int foo;      /* weak */
p2() {}       /* strong */
Strong & Weak Symbols
A strong symbol definition can only appear
once
A weak symbol definition can be overridden by
a strong symbol definition of the same
name
References to the weak symbol resolve to the
strong symbol
If there are multiple weak symbol
definitions, the linker can pick an arbitrary
one!
Linker Puzzles: What Happens?
[Code fragments for each puzzle not recovered; only the answers survive]
Link time error: two strong symbols p1
References to x will refer to the same uninitialized int. Is this what you really want?
Writes to x in p2 might overwrite y! Evil!
Writes to x in p2 will overwrite y! Nasty!
Nightmare scenario: replace r.h.s. int with a struct type, each file then compiled with different alignment rules
References to x will refer to the same initialized variable
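The "writes to x overwrite y" cases can be imitated inside one program. This is only a sketch of the effect, assuming 4-byte int, 8-byte double, and that x and y end up adjacent in memory; it uses a struct and memcpy to stand in for what two mismatched object files would do. Recent GCC releases (10 and later) default to -fno-common, which turns many of these cases into link-time errors instead.

```c
#include <string.h>

/* How the file that defines "int x; int y;" might see memory
   (assumption: x and y are adjacent 4-byte ints). */
struct as_seen_by_p1 {
  int x;
  int y;
};

_Static_assert(sizeof(struct as_seen_by_p1) == sizeof(double),
               "sketch assumes two 4-byte ints fill one 8-byte double");

/* Returns 1 if an 8-byte store to x changed y, else 0. */
int double_store_clobbers_y(void)
{
  struct as_seen_by_p1 mem = {7, 5};
  double store = 1.0;    /* what the other file writes, believing x is a double */

  memcpy(&mem, &store, sizeof store);
  return mem.y != 5;
}
```

The 8-byte store covers both fields, so y no longer holds 5 afterwards even though no code ever assigned to y.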
Advanced Note: Name Mangling
 
Other languages (e.g. Java and C++) allow overloaded methods
Functions then have the same name but take different numbers/types of arguments
How does the linker disambiguate these symbols?
Generate unique names through mangling
Mangled names are compiler dependent
Example: class Foo, method bar(int, long):
bar__3Fooil
_ZN3Foo3barEil
Similar schemes are used for global variables, etc.
Linking Steps
Symbol Resolution
Determine where symbols are defined and what
size data/code they refer to
Relocation
Combine modules, relocate code/data, and fix
symbol references based on new locations
[Figure: relocatable object code (main.o, swap.o) -> ld (collect2) -> executable p]
.symtab & Pseudo-Instructions in main.s
        .file   "main.c"
        .text
.globl main
        .type   main, @function
main:
.LFB2:
        subq    $8, %rsp
.LCFI0:
        call    swap
        movl    $0, %eax
        addq    $8, %rsp
        ret
.LFE2:
        .size   main, .-main
.globl buf
        .data
        .align 4
        .type   buf, @object
        .size   buf, 8
buf:
        .long   1
        .long   2
                 ....
UNIX% cc -O -c main.c
UNIX% readelf -s main.o

Symbol table '.symtab' contains 11 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     8: 0000000000000000    19 FUNC    GLOBAL DEFAULT    1 main
     9: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND swap
    10: 0000000000000000     8 OBJECT  GLOBAL DEFAULT    3 buf
.symtab & Pseudo-Instructions in swap.s
        .file   "swap.c"
        .text
.globl swap
        .type   swap, @function
swap:
.LFB2:
        movq    $buf+4, bufp1(%rip)
        movq    bufp0(%rip), %rdx
        movl    (%rdx), %ecx
        movl    buf+4(%rip), %eax
        movl    %eax, (%rdx)
        movq    bufp1(%rip), %rax
        movl    %ecx, (%rax)
        ret
.LFE2:
        .size   swap, .-swap
.globl bufp0
        .data
        .align 8
        .type   bufp0, @object
        .size   bufp0, 8
bufp0:
        .quad   buf
        .comm   bufp1,8,8
               ....
Symbol table '.symtab' contains 12 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     8: 0000000000000000    38 FUNC    GLOBAL DEFAULT    1 swap
     9: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND buf
    10: 0000000000000008     8 OBJECT  GLOBAL DEFAULT  COM bufp1
    11: 0000000000000000     8 OBJECT  GLOBAL DEFAULT    3 bufp0
Symbol Resolution
 
Undefined symbols in every relocatable object file must be resolved
Where are they located?
What size are they?
Linker looks in the symbol tables of all relocatable object files
Assuming every unknown symbol is defined once and only once, this works well
[Figure: main.o and swap.o, each with .text, .symtab, and .data sections; main.o asks "where's swap?", swap.o asks "where's buf?"]
Relocation
Once all symbols are resolved, must combine
the input files
Total code size is known
Total data size is known
All symbols must be assigned run-time addresses
Sections must be merged
Only one text, data, etc. section in final executable
Final run-time addresses of all symbols are defined
Symbol references must be corrected
All symbol references must now refer to their
actual locations
Relocation: Merging Files
[Figure: the .text, .symtab, and .data sections of main.o and swap.o are merged into the single .text, .symtab, and .data sections of executable p]
Linking: Relocation
/* main.c */
void swap(void);
int buf[2] = {1, 2};
int main(void)
{
  swap();
  return (0);
}
UNIX% objdump -r -d main.o
main.o:     file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <main>:
   0:   48 83 ec 08       sub    $0x8,%rsp
   4:   e8 00 00 00 00    callq  9 <main+0x9>
                        5: R_X86_64_PC32 swap+0xfffffffffffffffc
   9:   b8 00 00 00 00    mov    $0x0,%eax
   e:   48 83 c4 08       add    $0x8,%rsp
  12:   c3                retq
can also use readelf -r to see relocation information
In the relocation entry "5: R_X86_64_PC32 swap+0xfffffffffffffffc":
5 is the offset into the text section (relocation information is stored in a different section of the file)
R_X86_64_PC32 is the type of relocation (PC relative 32-bit signed)
swap is the symbol name
Linking: Relocation
/* swap.c */
extern int buf[];
int *bufp0 = &buf[0];
int *bufp1;
void swap(void)
{
  int temp;
  bufp1  = &buf[1];
  temp   = *bufp0;
  *bufp0 = *bufp1;
  *bufp1 = temp;
}
UNIX% objdump -r -D swap.o
swap.o:     file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <swap>:
   0:   48 c7 05 00 00 00 00 movq $0x0,0(%rip)
   7:   00 00 00 00
                    3: R_X86_64_PC32 bufp1+0xfffffffffffffff8
                    7: R_X86_64_32S buf+0x4
<..snip..>
Disassembly of section .data:
0000000000000000 <bufp0>:
        ...
                        0: R_X86_64_64 buf
R_X86_64_PC32 bufp1: need relocated address of bufp1
R_X86_64_32S buf+0x4: need relocated address of buf[]
R_X86_64_64 buf: need to initialize bufp0 with &buf[0] (== buf)
After Relocation
Before relocation (main.o):
0000000000000000 <main>:
   0:   48 83 ec 08       sub    $0x8,%rsp
   4:   e8 00 00 00 00    callq  9 <main+0x9>
                        5: R_X86_64_PC32 swap+0xfffffffffffffffc
   9:   b8 00 00 00 00    mov    $0x0,%eax
   e:   48 83 c4 08       add    $0x8,%rsp
  12:   c3                retq
After relocation (executable p):
0000000000400448 <main>:
  400448:   48 83 ec 08       sub    $0x8,%rsp
  40044c:   e8 0b 00 00 00    callq  40045c <swap>
  400451:   b8 00 00 00 00    mov    $0x0,%eax
  400456:   48 83 c4 08       add    $0x8,%rsp
  40045a:   c3                retq
  40045b:   90                nop
000000000040045c <swap>:
  40045c:   48 c7 05 01 04 20 00  movq   $0x600848,2098177(%rip)
After Relocation
Before relocation (swap.o):
0000000000000000 <swap>:
   0:   48 c7 05 00 00 00 00 movq $0x0,0(%rip)
   7:   00 00 00 00
                    3: R_X86_64_PC32 bufp1+0xfffffffffffffff8
                    7: R_X86_64_32S  buf+0x4
<..snip..>
0000000000000000 <bufp0>:
        ...
                    0: R_X86_64_64 buf
After relocation (executable p):
000000000040045c <swap>:
  40045c:   48 c7 05 01 04 20 00 movq $0x600848,2098177(%rip)
  400463:   48 08 60 00                             # 600868 <bufp1>
   
<..snip..>
0000000000600850 <bufp0>:
  600850:   44 08 60 00 00 00 00 00
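The patched bytes in the listings can be recomputed by hand. A minimal sketch, assuming the usual x86-64 rules (PC-relative value = S + A - P, absolute value = S + A, where S is the symbol's final address, A the addend, and P the address of the field being patched); the function names are hypothetical, and the address of buf (0x600844) is read from the relocated contents of bufp0:

```c
#include <stdint.h>

/* PC-relative 32-bit relocation: value = S + A - P. */
int32_t reloc_pc32(uint64_t s, int64_t a, uint64_t p)
{
  return (int32_t)(s + (uint64_t)a - p);
}

/* Absolute relocation (R_X86_64_32S / R_X86_64_64 style): value = S + A. */
uint64_t reloc_abs(uint64_t s, int64_t a)
{
  return s + (uint64_t)a;
}
```

For the call in main, S is 0x40045c (swap), A is -4, and P is 0x400448 + 5, giving 0x0b, which matches the bytes e8 0b 00 00 00. For bufp1 in swap, S is 0x600868, A is -8, and P is 0x40045c + 3, giving 0x200401 (2098177 in decimal).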
Libraries
 
How should functions commonly used by programmers
be provided?
Math, I/O, memory management, string manipulation,
etc.
Option 1: Put all functions in a single source file
Programmers link big object file into their programs
Space and time inefficient
Option 2: Put each function in a separate source file
Programmers explicitly link appropriate object files into
their programs
More efficient, but burdensome on the programmer
Solution: static libraries (.a archive files)
Multiple relocatable files + index -> single archive file
Only links the subset of relocatable files from the library that are used in the program
Example: cc -o fpmath main.c float.c -lm
Two Common Libraries
libc.a (the C standard library)
4 MB archive of 1395 object files
I/O, memory allocation, signal handling, string handling, date and time, random numbers, integer math
Usually automatically linked
libm.a (the C math library)
1.3 MB archive of 401 object files
floating point math (sin, cos, tan, log, exp, sqrt, …)
Use -lm to link with your program
UNIX% ar t /usr/lib64/libc.a
fprintf.o
feof.o
fputc.o
strlen.o
UNIX% ar t /usr/lib64/libm.a
e_sinh.o
e_sqrt.o
e_gamma_r.o
k_cos.o
k_rem_pio2.o
k_sin.o
k_tan.o
Creating a Library
/* addvec.c */
#include "vector.h"
void addvec(int *x, int *y,
  int *z, int n)
{
  int i;
  for (i = 0; i < n; i++)
    z[i] = x[i] + y[i];
}
/* multvec.c */
#include "vector.h"
void multvec(int *x, int *y,
  int *z, int n)
{
  int i;
  for (i = 0; i < n; i++)
    z[i] = x[i] * y[i];
}
UNIX% cc -c addvec.c multvec.c
UNIX% ar rcs libvector.a addvec.o multvec.o
/* vector.h */
void addvec(int *x, int *y, int *z, int n);
void multvec(int *x, int *y, int *z, int n);
Using a library
/* main.c */
#include <stdio.h>
#include "vector.h"
int x[2] = {1, 2};
int y[2] = {3, 4};
int z[2];
int main(void)
{
  addvec(x, y, z, 2);
  printf("z = [%d %d]\n", z[0], z[1]);
  return (0);
}
UNIX% cc -O -c main.c
UNIX% cc -static -o program main.o ./libvector.a
[Figure: ld combines main.o with addvec.o (extracted from libvector.a) and printf.o (extracted from libc.a) to produce program]
How to Link: Basic Algorithm
Keep a list of the current unresolved references.
For each object file (.o and .a) in command-line order
Try to resolve each unresolved reference in list to
objects defined in current file
Try to resolve each unresolved reference in current file
to objects defined in previous files
Concatenate like sections (.text with .text, etc.)
If list empty, output executable file, else error
Why UNIX% cc libvector.a main.o Doesn't Work
Linker keeps list of currently unresolved symbols and searches an encountered library for them
If symbol(s) found, a .o file for the found symbol(s) is obtained and used by linker like any other .o file
By putting libvector.a first, there is not yet any unresolved symbol, so linker doesn't obtain any .o file from libvector.a!
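The ordering problem can be reproduced with a toy model of the scan described above. This is a simplified sketch, not how a real linker is written: the struct, the function names, and the fixed-size tables are all assumptions made for illustration, and each archive member is considered only once, when its position on the command line is reached.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 16

/* Toy model of one input file (hypothetical structure, not a real linker API). */
struct input {
  int from_archive;        /* 1: member of a .a, used only if it is needed */
  const char *defs[4];     /* symbols defined, NULL-terminated */
  const char *refs[4];     /* symbols referenced, NULL-terminated */
};

static int find(const char **set, int n, const char *name)
{
  for (int i = 0; i < n; i++)
    if (strcmp(set[i], name) == 0)
      return i;
  return -1;
}

/* Scans files in command-line order; returns the number of unresolved symbols. */
int toy_link(const struct input *files, int nfiles)
{
  const char *defined[MAX_SYMS];
  const char *unresolved[MAX_SYMS];
  int ndef = 0, nunres = 0;

  for (int f = 0; f < nfiles; f++) {
    const struct input *in = &files[f];

    if (in->from_archive) {
      int needed = 0;
      for (int i = 0; i < 4 && in->defs[i]; i++)
        if (find(unresolved, nunres, in->defs[i]) >= 0)
          needed = 1;
      if (!needed)
        continue;            /* member is skipped for good */
    }
    for (int i = 0; i < 4 && in->defs[i]; i++) {
      int u = find(unresolved, nunres, in->defs[i]);
      if (u >= 0)
        unresolved[u] = unresolved[--nunres];   /* resolved: drop from list */
      if (ndef < MAX_SYMS)
        defined[ndef++] = in->defs[i];
    }
    for (int i = 0; i < 4 && in->refs[i]; i++)
      if (find(defined, ndef, in->refs[i]) < 0 &&
          find(unresolved, nunres, in->refs[i]) < 0 && nunres < MAX_SYMS)
        unresolved[nunres++] = in->refs[i];
  }
  return nunres;
}

/* cc main.o libvector.a */
const struct input main_first[] = {
  { 0, {"main"},   {"addvec"} },   /* main.o */
  { 1, {"addvec"}, {NULL} },       /* addvec.o inside libvector.a */
};

/* cc libvector.a main.o */
const struct input library_first[] = {
  { 1, {"addvec"}, {NULL} },       /* addvec.o inside libvector.a */
  { 0, {"main"},   {"addvec"} },   /* main.o */
};
```

In this model toy_link(main_first, 2) leaves nothing unresolved, while toy_link(library_first, 2) leaves addvec unresolved because the archive member was skipped before main.o asked for it.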
Dynamic Libraries
Static
Linked at compile-time (when the executable is built)
UNIX: foo.a
Archive of relocatable ELF files
Dynamic
Linked at run-time
UNIX: foo.so
Shared ELF File
What are the differences?
Static & Dynamic Libraries
Static
Library code added to executable file
Larger executables
Must recompile to use newer libraries
Executable is self-contained
Some time to load libraries at compile-time
Library code shared only among copies of same program
Dynamic
Library code not added to executable file
Smaller executables
Uses newest (or smallest, fastest, …) library without recompiling
Depends on libraries at run-time
Some time to load libraries at run-time
Library code shared among all uses of library
Static & Dynamic Libraries
Static
Creation: ar rcs libfoo.a bar.o baz.o
          ranlib libfoo.a
Use: cc -o zap zap.o -lfoo
Adds library's code, data, symbol table, relocation info, …
Dynamic
Creation: cc -shared -Wl,-soname,libfoo.so -o libfoo.so bar.o baz.o
Use: cc -o zap zap.o -lfoo
Adds library's symbol table, relocation info
Loading
Linking yields an executable that can actually
be run
Running a program
unix% ./program
Shell does not recognize program as a shell command, so assumes it is an executable
Invokes the loader to load the executable into memory
(any unix program can invoke the loader with the execve function; more later)
Creating the Memory Image (sort of…)
Create code and data
segments
Copy code and data
from executable into
these segments
Create initial heap
segment
Grows up from
read/write data
Create stack
Starts near the top and
grows downward
Call dynamic linker to
load shared libraries and
relocate references
[Figure: process memory image, from high to low addresses]
0x7FFFFFFFFFFF
User Stack (%rsp points to the top of the stack)
Shared Libraries
Heap
Read/Write Data
Read-only Code and Data (starting at 0x000000400000)
Unused
0x000000000000
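The regions of the memory image can be observed from inside a running program. A minimal sketch with made-up names; it only records one address per region, and the actual values and their ordering depend on the system (address-space randomization changes them from run to run):

```c
#include <stdint.h>
#include <stdlib.h>

int initialized_global = 1;     /* read/write data */
int zeroed_global;              /* read/write data (.bss) */

struct layout_sample {
  uintptr_t data, bss, heap, stack;
};

/* Records one address from each region; returns 0 on success, -1 if malloc fails. */
int sample_layout(struct layout_sample *out)
{
  int on_stack = 0;
  void *on_heap = malloc(16);

  if (on_heap == NULL)
    return -1;
  out->data  = (uintptr_t)&initialized_global;
  out->bss   = (uintptr_t)&zeroed_global;
  out->heap  = (uintptr_t)on_heap;
  out->stack = (uintptr_t)&on_stack;
  free(on_heap);
  return 0;
}
```

Printing the four fields in hexadecimal on a typical Linux x86-64 system usually shows the stack address far above the others, in line with the figure.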
Starting the Program
Jump to program's entry point (stored in ELF header)
For C programs, this is the _start symbol
Execute _start code (from crt1.o, same for all C programs)
call __libc_init_first
call _init
call atexit
call main
call _exit
Position Independent Code
Static libraries compile with unresolved global & local addresses
Library code & data concatenated & addresses resolved when linking
Position Independent Code
By default (in C), dynamic libraries compile with resolved global & local addresses
E.g., libfoo.so starts at 0x400000 in every application using it
Advantage: Simplifies sharing
Disadvantage: Inflexible; must decide ahead of time where each library goes, otherwise libraries can conflict
Position Independent Code
Can compile dynamic libraries with unresolved global & local addresses
cc -shared -fPIC
Advantage: More flexible; no conflicts
Disadvantage: Code less efficient; referencing these addresses involves indirection
Library Interpositioning
Linking with non-standard libraries that use standard library symbols
Intercept calls to library functions
Some applications:
Security
Confinement (sandboxing)
Behind the scenes encryption
Automatically encrypt otherwise unencrypted network
connections
Monitoring & Profiling
Count number of calls to functions
Characterize call sites and arguments to functions
malloc tracing
Detecting memory leaks
Generating malloc traces
Dynamic Linking at Run-Time
Application access to dynamic linker via API:
#include <dlfcn.h>
void
dlink(void)
{
  void *handle = dlopen("mylib.so", RTLD_LAZY);  /* RTLD_LAZY: symbols resolved at first use, not now */
  /* type */ myfunc = dlsym(handle, "myfunc");
  myfunc(…);
  dlclose(handle);
}
Error-checking omitted for clarity
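For comparison, here is a sketch of the same calls with error checking added. The helper name call_unary and the choice of library and symbol in the usage note are assumptions made for illustration; dlopen, dlsym, dlerror, and dlclose are the standard <dlfcn.h> functions. On some systems the program must be linked with -ldl.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Loads lib, looks up a function named sym with signature double(double),
   and stores its result for x in *out. Returns 0 on success, -1 on failure. */
int call_unary(const char *lib, const char *sym, double x, double *out)
{
  double (*fn)(double);
  void *handle = dlopen(lib, RTLD_LAZY);

  if (handle == NULL)
    return -1;                     /* dlerror() describes the failure */
  dlerror();                       /* clear any stale error state */
  *(void **)(&fn) = dlsym(handle, sym);
  if (dlerror() != NULL || fn == NULL) {
    dlclose(handle);
    return -1;
  }
  *out = fn(x);
  dlclose(handle);
  return 0;
}
```

On a Linux system with glibc, call_unary("libm.so.6", "cos", 0.0, &r) would be expected to return 0 and store cos(0.0) in r; the library file name differs on other systems.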
Next Time
Lab: Hash tables and linked Lists
Exceptions
Presentation Transcript


  1. Linking Alan L. Cox alc@rice.edu Some slides adapted from CMU 15.213 slides

  2. Objectives Be able to answer the textbook problems Understand how C type attributes (e.g. static, extern) control memory allocation for variables Be able to recognize some of the pitfalls when developing modular programs Appreciate how linking can help with efficiency, modularity, and evolvability Cox Linking 2

  3. Example Program (2 .c files) /* main.c */ void swap(void); int buf[2] = {1, 2}; /* swap.c */ extern int buf[]; int *bufp0 = &buf[0]; int *bufp1; int main(void) { swap(); return (0); } void swap(void) { int temp; bufp1 = &buf[1]; temp = *bufp0; *bufp0 = *bufp1; *bufp1 = temp; } Cox Linking 3

  4. An Analogy for Linking Cox Linking 4

  5. Linking Linking: collecting and combining various pieces of code and data into a single file that can be loaded into memory and executed Why learn about linking? It won t make you a better jigsaw puzzle solver! It will help you build large programs It will help you avoid dangerous program errors It will help you understand how language scoping rules for variables are implemented It will help you understand other important concepts (that are covered later in the class) It will enable you to exploit shared libraries Cox Linking 5

  6. Compilation Compiler: .c C source code to .s assembly code Assembler: .s assembly code to .o relocatable object code UNIX% cc -v -O -g -o p main.c swap.c cc1 -quiet -v main.c -quiet -dumpbase main.c -mtune=generic -auxbase main -g -O -version -o /tmp/cchnheja.s as -V -Qy -o /tmp/ccmNFRZd.o /tmp/cchnheja.s cc1 -quiet -v swap.c -quiet -dumpbase swap.c -mtune=generic -auxbase swap -g -O -version -o /tmp/cchnheja.s as -V -Qy -o /tmp/ccx8FECg.o /tmp/ccheheja.s collect2 --eh-frame-hdr m elf_x86_64 --hash-style=gnu -dynamic- linker /lib64/ld-linux-x86-64.so.2 -o p crt1.o crti.o crtbegin.o L<..snip..> /tmp/ccmNFRZd.o /tmp/ccx8FECg.o lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed crtend.o crtn.o Linker: .o to executable Cox Linking 6

  7. Compilation UNIX% cc -O g -o p main.c swap.c C source code main.c swap.c cc1 cc1 main.s swap.s Assembly code as as Relocatable object code main.o swap.o Linking step ld (collect2) ELF Format Files p Executable Cox Linking 7

  8. ELF (Executable Linkable Format) 0 Order & existence of segments is arbitrary, except ELF header must be present and first ELF header Program header table .text .rodata .data .bss .symtab .rel.text .rel.data .debug Section header table Cox Linking 8

  9. ELF Header 0 Basic description of file contents: File format identifier Architecture Endianness Alignment requirement for other sections Location of other sections Code s starting address ELF header Program header table .text .rodata .data .bss .symtab .rel.text .rel.data .debug Section header table Cox Linking 9

  10. Program and Section Headers 0 Info about other sections necessary for loading into memory for execution Required for executables & libraries ELF header Program header table .text .rodata .data .bss .symtab .rel.text .rel.data Info about other sections necessary for linking Required for relocatables .debug Section header table Cox Linking 10

  11. Text Section 0 Machine code (instructions) read-only ELF header Program header table .text .rodata .data .bss .symtab .rel.text .rel.data .debug Section header table Cox Linking 11

  12. Data Sections 0 Static data initialized, read-only initialized, read/write uninitialized, read/write (BSS = Block Started by Symbol pseudo-op for IBM 704) Initialized Initial values in ELF file Uninitialized Only total size in ELF file Writable distinction enforced at run-time Why? Protection; sharing How? Virtual memory ELF header Program header table .text .rodata .data .bss .symtab .rel.text .rel.data .debug Section header table Cox Linking 12

  13. Symbol Table 0 Describes where global variables and functions are defined Present in all relocatable ELF files ELF header Program header table .text .rodata .data .bss /* main.c */ void swap(void); int buf[2] = {1, 2}; .symtab .rel.text .rel.data int main(void) { swap(); return (0); } .debug Section header table Cox Linking 13

  14. Relocation Information 0 Describes where and how symbols are used A list of locations in the .text section that will need to be modified when the linker combines this object file with others Relocation information for any global variables that are referenced or defined by the module Allows object files to be easily relocated ELF header Program header table .text .rodata .data .bss .symtab .rel.text .rel.data .debug Section header table Cox Linking 14

  15. Debug Section 0 Relates source code to the object code within the ELF file ELF header Program header table .text .rodata .data .bss .symtab .rel.text .rel.data .debug Section header table Cox Linking 15

  16. Other Sections 0 Other kinds of sections also supported, including: Other debugging info Version control info Dynamic linking info C++ initializing & finalizing code ELF header Program header table .text .rodata .data .bss .symtab .rel.text .rel.data .debug Section header table Cox Linking 16

  17. Linker Symbol Classification Global symbols Symbols defined by module m that can be referenced by other modules C: non-static functions & global variables External symbols Symbols referenced by module m but defined by some other module C: extern functions & variables Local symbols Symbols that are defined and referenced exclusively by module m C: static functions & variables Local linker symbols local function variables! Cox Linking 17

  18. Linker Symbols Definition of global symbols bufp0 and bufp1 (even though not used outside file) /* main.c */ void swap(void); int buf[2] = {1, 2}; /* swap.c */ extern int buf[]; int *bufp0 = &buf[0]; int *bufp1; int main(void) { swap(); return (0); } void swap(void) { int temp; Definition of global symbols buf and main Definition of global symbol swap bufp1 = &buf[1]; temp = *bufp0; *bufp0 = *bufp1; *bufp1 = temp; } Reference to external symbol swap Reference to external symbol buf Linker knows nothing about local variables Cox Linking 18

  19. Linker Symbols /* main.c */ void swap(void); int buf[2] = {1, 2}; What s missing? swap where is it? int main(void) { swap(); return (0); } main is a 19-byte function located at offset 0 of section 1 (.text) undefined (UND) of section 3 (.data) swap is referenced in this file, but is buf is an 8-byte object located at offset 0 use readelf S to see sections UNIX% cc -O -c main.c UNIX% readelf -s main.o Symbol table '.symtab' contains 11 entries: Num: Value Size Type Bind Vis Ndx Name 8: 0000000000000000 19 FUNC GLOBAL DEFAULT 1 main 9: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND swap 10: 0000000000000000 8 OBJECT GLOBAL DEFAULT 3 buf Cox Linking 19

  20. Linker Symbols
      /* swap.c */
      extern int buf[];
      int *bufp0 = &buf[0];
      int *bufp1;
      void swap(void)
      {
        int temp;
        bufp1  = &buf[1];
        temp   = *bufp0;
        *bufp0 = *bufp1;
        *bufp1 = temp;
      }

      Symbol table '.symtab' contains 12 entries:
         Num:    Value          Size Type    Bind   Vis      Ndx Name
           8: 0000000000000000    38 FUNC    GLOBAL DEFAULT    1 swap
           9: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND buf
          10: 0000000000000008     8 OBJECT  GLOBAL DEFAULT  COM bufp1
          11: 0000000000000000     8 OBJECT  GLOBAL DEFAULT    3 bufp0

      swap is a 38-byte function located at offset 0 of section 1 (.text)
      buf is referenced in this file, but is undefined (UND)
      bufp1 is an 8-byte uninitialized (COMMON) object with an 8-byte alignment requirement
      bufp0 is an 8-byte object located at offset 0 of section 3 (.data)
      What's missing? buf: where is it?

  21. Linking Steps
      Symbol Resolution
        Determine where symbols are defined and what size data/code they refer to
      Relocation
        Combine modules, relocate code/data, and fix symbol references based on new locations
      (Diagram: relocatable object code main.o and swap.o are fed to ld (collect2), which produces the executable p)

  22. Problem: Undefined Symbols
      UNIX% cc -O -o p main.c                    (forgot to type swap.c)
      /tmp/cccpTy0d.o: In function `main':
      main.c:(.text+0x5): undefined reference to `swap'
      collect2: ld returned 1 exit status
      UNIX%

      Missing symbols are not compiler errors
        May be defined in another file
        Compiler just inserts an undefined entry in the symbol table
      During linking, any undefined symbols that cannot be resolved cause an error

  23. Problem: Multiply Defined Symbols
      Different files could define the same symbol
        Is this an error?
        If not, which one should be used? One or many?

  24. Linking: Example
      First file:
        int x = 3;
        int y = 4;
        int z;
        int foo(int a) { }
        int bar(int b) { }
      Second file:
        extern int x;
        static int y = 6;
        int z;
        int foo(int a);
        static int bar(int b) { }
      Linked result: ?
      Note: Linking uses object files. The examples use source level for convenience.

  25. Linking: Example
      First file:
        int x = 3;
        int y = 4;
        int z;
        int foo(int a) { }
        int bar(int b) { }
      Second file:
        extern int x;
        static int y = 6;
        int z;
        int foo(int a);
        static int bar(int b) { }
      Linked result so far:
        int x = 3;
        int foo(int a) { }
      x and foo are defined in one file and declared in other files. Only one copy exists.

  26. Linking: Example
      First file:
        int x = 3;
        int y = 4;
        int z;
        int foo(int a) { }
        int bar(int b) { }
      Second file:
        extern int x;
        static int y = 6;
        int z;
        int foo(int a);
        static int bar(int b) { }
      Linked result so far:
        int x = 3;
        int y = 4;
        int y' = 6;
        int foo(int a) { }
        int bar(int b) { }
        int bar'(int b) { }
      Private (static) names are local symbols, so they can't conflict with other files' names.
      Renaming (shown here as y' and bar') is a convenient source-level way to understand this.

  27. Linking: Example
      First file:
        int x = 3;
        int y = 4;
        int z;
        int foo(int a) { }
        int bar(int b) { }
      Second file:
        extern int x;
        static int y = 6;
        int z;
        int foo(int a);
        static int bar(int b) { }
      Linked result:
        int x = 3;
        int y = 4;
        int y' = 6;
        int z;
        int foo(int a) { }
        int bar(int b) { }
        int bar'(int b) { }
      C allows you to omit extern in some cases (here both files declare int z, and one copy results). Don't!

  28. Strong & Weak Symbols
      Program symbol definitions are either strong or weak
        strong: procedures & initialized globals
        weak: uninitialized globals
      p1.c:
        int foo=5;    (strong)
        p1() {}       (strong)
      p2.c:
        int foo;      (weak)
        p2() {}       (strong)

  29. Strong & Weak Symbols
      A strong symbol definition can only appear once
      A weak symbol definition can be overridden by a strong symbol definition of the same name
        References to the weak symbol resolve to the strong symbol
      If there are multiple weak symbol definitions, the linker can pick an arbitrary one!
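The three rules above can be sketched as a small decision function. This is only a model of the rules as stated on the slide, not how any particular linker is implemented; the names struct definition and choose_definition are hypothetical, and picking the first weak definition is an arbitrary choice made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

enum strength { WEAK, STRONG };

struct definition {
    const char *module;        /* hypothetical module name, e.g. "p1.o" */
    enum strength strength;
};

/*
 * Returns the index of the definition that would be kept, or -1 when the
 * input is rejected (no definition at all, or more than one strong one).
 * Assumption: with only weak definitions this sketch keeps the first;
 * a real linker is free to pick any of them.
 */
int choose_definition(const struct definition *defs, size_t n)
{
    int chosen = -1;
    size_t i, strong_count = 0;

    assert(defs != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (defs[i].strength == STRONG) {
            strong_count++;
            chosen = (int)i;       /* a strong definition overrides any weak one */
        } else if (chosen == -1) {
            chosen = (int)i;       /* remember the first weak definition */
        }
    }
    if (strong_count > 1)
        return (-1);               /* rule 1: a strong symbol may appear only once */
    return (chosen);
}
```

For the foo example from the previous slide, passing {p1.o strong, p2.o weak} selects the p1.o definition. Note that recent GCC releases default to -fno-common, under which duplicate uninitialized globals are reported as multiple definitions instead of being merged, so the weak-symbol behaviour described here may require -fcommon.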

  30. Linker Puzzles: What Happens?
      File 1: int x; p1() {}                File 2: p1() {}
        Link time error: two strong symbols p1
      File 1: int x; p1() {}                File 2: int x; p2() {}
        References to x will refer to the same uninitialized int. Is this what you really want?
      File 1: int x; int y; p1() {}         File 2: double x; p2() {}
        Writes to x in p2 might overwrite y! Evil!
      File 1: int x=7; int y=5; p1() {}     File 2: double x; p2() {}
        Writes to x in p2 will overwrite y! Nasty!
      File 1: int x=7; p1() {}              File 2: int x; p2() {}
        References to x will refer to the same initialized variable
      Nightmare scenario: replace the right-hand-side int with a struct type, with each file then compiled with different alignment rules
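The "writes to x in p2 will overwrite y" puzzle can be imitated inside a single file. The sketch below is a simulation, not a real two-file link: it assumes a 4-byte int and an 8-byte double, and it assumes the linker placed y directly after x, which the struct models. The names p1_globals and y_after_double_store are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Models the layout p1.c expects: two adjacent 4-byte ints. */
struct p1_globals {
    int x;    /* p1.c: int x = 7; */
    int y;    /* p1.c: int y = 5; */
};

/*
 * Simulates p2.c, which believes x is a double, storing 8 bytes at the
 * address of x. Returns the value p1.c would then read from y.
 */
int y_after_double_store(double value)
{
    struct p1_globals g = { 7, 5 };

    /* Assumption of this sketch: the double covers exactly x and y. */
    assert(sizeof(value) == sizeof(g));
    memcpy(&g, &value, sizeof(value));
    return (g.y);
}
```

Calling y_after_double_store(1.0) returns something other than 5: half of the double's bit pattern has landed in y, even though p1.c never assigned to it.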

  31. Advanced Note: Name Mangling
      Other languages (e.g. Java and C++) allow overloaded methods
        Functions then have the same name but take different numbers/types of arguments
        How does the linker disambiguate these symbols?
      Generate unique names through "mangling"
        Mangled names are compiler dependent
        Example: class "Foo", method "bar(int, long)":
          bar__3Fooil
          _ZN3Foo3barEil
        Similar schemes are used for global variables, etc.

  32. Linking Steps
      Symbol Resolution
        Determine where symbols are defined and what size data/code they refer to
      Relocation
        Combine modules, relocate code/data, and fix symbol references based on new locations
      (Diagram: relocatable object code main.o and swap.o are fed to ld (collect2), which produces the executable p)

  33. .symtab & Pseudo-Instructions in main.s
      UNIX% cc -O -c main.c
      UNIX% readelf -s main.o
      Symbol table '.symtab' contains 11 entries:
         Num:    Value          Size Type    Bind   Vis      Ndx Name
           8: 0000000000000000    19 FUNC    GLOBAL DEFAULT    1 main
           9: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND swap
          10: 0000000000000000     8 OBJECT  GLOBAL DEFAULT    3 buf

              .file   "main.c"
              .text
      .globl main
              .type   main, @function
      main:
      .LFB2:
              subq    $8, %rsp
      .LCFI0:
              call    swap
              movl    $0, %eax
              addq    $8, %rsp
              ret
      .LFE2:
              .size   main, .-main
      .globl buf
              .data
              .align 4
              .type   buf, @object
              .size   buf, 8
      buf:
              .long   1
              .long   2
              ....

  34. .symtab & Pseudo-Instructions in swap.s
      Symbol table '.symtab' contains 12 entries:
         Num:    Value          Size Type    Bind   Vis      Ndx Name
           8: 0000000000000000    38 FUNC    GLOBAL DEFAULT    1 swap
           9: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND buf
          10: 0000000000000008     8 OBJECT  GLOBAL DEFAULT  COM bufp1
          11: 0000000000000000     8 OBJECT  GLOBAL DEFAULT    3 bufp0

              .file   "swap.c"
              .text
      .globl swap
              .type   swap, @function
      swap:
      .LFB2:
              movq    $buf+4, bufp1(%rip)
              movq    bufp0(%rip), %rdx
              movl    (%rdx), %ecx
              movl    buf+4(%rip), %eax
              movl    %eax, (%rdx)
              movq    bufp1(%rip), %rax
              movl    %ecx, (%rax)
              ret
      .LFE2:
              .size   swap, .-swap
      .globl bufp0
              .data
              .align 8
              .type   bufp0, @object
              .size   bufp0, 8
      bufp0:
              .quad   buf
              .comm   bufp1,8,8
              ....

  35. Symbol Resolution
      Undefined symbols in every relocatable object file must be resolved
        Where are they located?
        What size are they?
      Linker looks in the symbol tables of all relocatable object files
        Assuming every unknown symbol is defined once and only once, this works well
      (Diagram: main.o with .text, .data, .symtab asks "where's swap?"; swap.o with .text, .data, .symtab asks "where's buf?")

  36. Relocation
      Once all symbols are resolved, must combine the input files
        Total code size is known
        Total data size is known
        All symbols must be assigned run-time addresses
      Sections must be merged
        Only one text, data, etc. section in final executable
        Final run-time addresses of all symbols are defined
      Symbol references must be corrected
        All symbol references must now refer to their actual locations

  37. Relocation: Merging Files
      (Diagram: the .text, .data, and .symtab sections of main.o and swap.o are merged into the single .text, .data, and .symtab sections of the executable p)

  38. Linking: Relocation
      /* main.c */
      void swap(void);
      int buf[2] = {1, 2};
      int main(void)
      {
        swap();
        return (0);
      }

      UNIX% objdump -r -d main.o
      main.o: file format elf64-x86-64
      Disassembly of section .text:
      0000000000000000 <main>:
         0: 48 83 ec 08       sub   $0x8,%rsp
         4: e8 00 00 00 00    callq 9 <main+0x9>
                  5: R_X86_64_PC32 swap+0xfffffffffffffffc
         9: b8 00 00 00 00    mov   $0x0,%eax
         e: 48 83 c4 08       add   $0x8,%rsp
        12: c3                retq

      Reading the relocation entry "5: R_X86_64_PC32 swap+0xfffffffffffffffc":
        5: offset into the text section (relocation information is stored in a different section of the file)
        R_X86_64_PC32: type of relocation (PC-relative, 32-bit signed)
        swap: symbol name
      Can also use readelf -r to see relocation information

  39. Linking: Relocation
      /* swap.c */
      extern int buf[];
      int *bufp0 = &buf[0];
      int *bufp1;
      void swap()
      {
        int temp;
        bufp1  = &buf[1];
        temp   = *bufp0;
        *bufp0 = *bufp1;
        *bufp1 = temp;
      }

      UNIX% objdump -r -D swap.o
      swap.o: file format elf64-x86-64
      Disassembly of section .text:
      0000000000000000 <swap>:
         0: 48 c7 05 00 00 00 00   movq $0x0,0(%rip)
         7: 00 00 00 00
                  3: R_X86_64_PC32 bufp1+0xfffffffffffffff8    (need relocated address of bufp1)
                  7: R_X86_64_32S  buf+0x4                     (need relocated address of buf[])
      <..snip..>
      Disassembly of section .data:
      0000000000000000 <bufp0>:
        ...
                  0: R_X86_64_64 buf                           (need to initialize bufp0 with &buf[0] (== buf))

  40. After Relocation
      Before (main.o):
      0000000000000000 <main>:
         0: 48 83 ec 08       sub   $0x8,%rsp
         4: e8 00 00 00 00    callq 9 <main+0x9>
                  5: R_X86_64_PC32 swap+0xfffffffffffffffc
         9: b8 00 00 00 00    mov   $0x0,%eax
         e: 48 83 c4 08       add   $0x8,%rsp
        12: c3                retq

      After (p):
      0000000000400448 <main>:
        400448: 48 83 ec 08            sub   $0x8,%rsp
        40044c: e8 0b 00 00 00         callq 40045c <swap>
        400451: b8 00 00 00 00         mov   $0x0,%eax
        400456: 48 83 c4 08            add   $0x8,%rsp
        40045a: c3                     retq
        40045b: 90                     nop
      000000000040045c <swap>:
        40045c: 48 c7 05 01 04 20 00   movq  $0x600848,2098177(%rip)

  41. After Relocation
      Before (swap.o):
      0000000000000000 <swap>:
         0: 48 c7 05 00 00 00 00   movq $0x0,0(%rip)
         7: 00 00 00 00
                  3: R_X86_64_PC32 bufp1+0xfffffffffffffff8
                  7: R_X86_64_32S  buf+0x4
      <..snip..>
      0000000000000000 <bufp0>:
        ...
                  0: R_X86_64_64 buf

      After (p):
      000000000040045c <swap>:
        40045c: 48 c7 05 01 04 20 00   movq $0x600848,2098177(%rip)
        400463: 48 08 60 00            # 600868 <bufp1>
      <..snip..>
      0000000000600850 <bufp0>:
        600850: 44 08 60 00 00 00 00 00
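The patched bytes in the two "After Relocation" listings can be reproduced with the standard x86-64 ELF relocation formulas: S + A - P for R_X86_64_PC32 and S + A for R_X86_64_32S, where S is the symbol's run-time address, A the addend, and P the run-time address of the field being patched. The sketch below applies those formulas to the addresses shown on the slides. The function names are hypothetical, and the address of buf (0x600844) is inferred from the bytes stored in bufp0 rather than stated on a slide.

```c
#include <assert.h>
#include <stdint.h>

/*
 * R_X86_64_PC32: S + A - P, stored as a signed 32-bit field.
 *   sym_addr   (S): run-time address of the referenced symbol
 *   addend     (A): constant from the relocation entry (e.g. -4)
 *   patch_addr (P): run-time address of the 4 bytes being patched
 */
uint32_t reloc_pc32(uint64_t sym_addr, int64_t addend, uint64_t patch_addr)
{
    int64_t value = (int64_t)sym_addr + addend - (int64_t)patch_addr;

    assert(value >= INT32_MIN && value <= INT32_MAX);   /* must fit in 32 bits */
    return ((uint32_t)(int32_t)value);
}

/* R_X86_64_32S: S + A, stored as a sign-extended 32-bit field. */
uint32_t reloc_32s(uint64_t sym_addr, int64_t addend)
{
    int64_t value = (int64_t)sym_addr + addend;

    assert(value >= INT32_MIN && value <= INT32_MAX);   /* must fit in 32 bits */
    return ((uint32_t)(int32_t)value);
}
```

With main at 0x400448 and swap at 0x40045c, reloc_pc32(0x40045c, -4, 0x400448 + 5) yields 0xb, matching the bytes 0b 00 00 00 in the relocated callq. Likewise reloc_pc32(0x600868, -8, 0x40045c + 3) yields 0x200401 (decimal 2098177, the displacement shown for bufp1), and reloc_32s(0x600844, 4) yields the immediate 0x600848.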

  42. Libraries
      How should functions commonly used by programmers be provided?
        Math, I/O, memory management, string manipulation, etc.
      Option 1: Put all functions in a single source file
        Programmers link big object file into their programs
        Space and time inefficient
      Option 2: Put each function in a separate source file
        Programmers explicitly link appropriate object files into their programs
        More efficient, but burdensome on the programmer
      Solution: static libraries (.a archive files)
        Multiple relocatable files + index, stored in a single archive file
        Only links the subset of relocatable files from the library that are used in the program
        Example: cc -o fpmath main.c float.c -lm

  43. Two Common Libraries
      libc.a (the C standard library)
        4 MB archive of 1395 object files
        I/O, memory allocation, signal handling, string handling, date and time, random numbers, integer math
        Usually automatically linked
      libm.a (the C math library)
        1.3 MB archive of 401 object files
        Floating point math (sin, cos, tan, log, exp, sqrt, ...)
        Use -lm to link with your program

      UNIX% ar t /usr/lib64/libc.a
      fprintf.o
      feof.o
      fputc.o
      strlen.o

      UNIX% ar t /usr/lib64/libm.a
      e_sinh.o
      e_sqrt.o
      e_gamma_r.o
      k_cos.o
      k_rem_pio2.o
      k_sin.o
      k_tan.o

  44. Creating a Library
      /* vector.h */
      void addvec(int *x, int *y, int *z, int n);
      void multvec(int *x, int *y, int *z, int n);

      /* addvec.c */
      #include "vector.h"
      void addvec(int *x, int *y, int *z, int n)
      {
        int i;
        for (i = 0; i < n; i++)
          z[i] = x[i] + y[i];
      }

      /* multvec.c */
      #include "vector.h"
      void multvec(int *x, int *y, int *z, int n)
      {
        int i;
        for (i = 0; i < n; i++)
          z[i] = x[i] * y[i];
      }

      UNIX% cc -c addvec.c multvec.c
      UNIX% ar rcs libvector.a addvec.o multvec.o

  45. Using a library
      /* main.c */
      #include <stdio.h>
      #include "vector.h"
      int x[2] = {1, 2};
      int y[2] = {3, 4};
      int z[2];
      int main(void)
      {
        addvec(x, y, z, 2);
        printf("z = [%d %d]\n", z[0], z[1]);
        return (0);
      }

      UNIX% cc -O -c main.c
      UNIX% cc -static -o program main.o ./libvector.a

      (Diagram: ld combines main.o, addvec.o from libvector.a, and printf.o from libc.a into program)

  46. How to Link: Basic Algorithm
      Keep a list of the current unresolved references
      For each object file (.o and .a) in command-line order
        Try to resolve each unresolved reference in list to objects defined in current file
        Try to resolve each unresolved reference in current file to objects defined in previous files
      Concatenate like sections (.text with .text, etc.)
      If list empty, output executable file, else error
      Problem: Command line order matters! Link libraries last:
        UNIX% cc main.o libvector.a
        UNIX% cc libvector.a main.o
        main.o: In function `main':
        main.o(.text+0x4): undefined reference to `addvec'
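The left-to-right scan described above can be modelled in a few dozen lines. The sketch below is a toy simulation that only tracks symbol names: it does not read ELF files or merge sections, and every type and function name in it (struct object, struct input, link_inputs, and so on) is hypothetical. An archive member is pulled in only if it defines a symbol that is unresolved at the moment the archive is scanned.

```c
#include <assert.h>
#include <string.h>

#define MAX_SYMS 16
#define MAX_PER_OBJ 4

struct object {
    const char *defs[MAX_PER_OBJ];   /* symbols defined, NULL-terminated */
    const char *refs[MAX_PER_OBJ];   /* symbols referenced, NULL-terminated */
};

struct input {
    int is_archive;                  /* 0: plain .o file, 1: .a archive */
    const struct object *members;    /* a .o file has exactly one member */
    int nmembers;
};

struct symset {
    const char *names[MAX_SYMS];
    int count;
};

static int set_find(const struct symset *s, const char *name)
{
    int i;

    for (i = 0; i < s->count; i++)
        if (strcmp(s->names[i], name) == 0)
            return (i);
    return (-1);
}

static void set_add(struct symset *s, const char *name)
{
    if (set_find(s, name) == -1) {
        assert(s->count < MAX_SYMS);
        s->names[s->count++] = name;
    }
}

static void set_remove(struct symset *s, const char *name)
{
    int i = set_find(s, name);

    if (i != -1)
        s->names[i] = s->names[--s->count];
}

/* Does this object define any symbol that is currently unresolved? */
static int defines_unresolved(const struct object *o, const struct symset *undef)
{
    int i;

    for (i = 0; i < MAX_PER_OBJ && o->defs[i] != NULL; i++)
        if (set_find(undef, o->defs[i]) != -1)
            return (1);
    return (0);
}

/* Pull an object into the link: record its definitions, queue its references. */
static void include_object(const struct object *o, struct symset *def,
    struct symset *undef)
{
    int i;

    for (i = 0; i < MAX_PER_OBJ && o->defs[i] != NULL; i++) {
        set_add(def, o->defs[i]);
        set_remove(undef, o->defs[i]);
    }
    for (i = 0; i < MAX_PER_OBJ && o->refs[i] != NULL; i++)
        if (set_find(def, o->refs[i]) == -1)
            set_add(undef, o->refs[i]);
}

/* Scan inputs in command-line order; return how many symbols stay unresolved. */
int link_inputs(const struct input *inputs, int ninputs)
{
    struct symset def = { {0}, 0 }, undef = { {0}, 0 };
    int i, m, changed;

    for (i = 0; i < ninputs; i++) {
        if (!inputs[i].is_archive) {
            include_object(&inputs[i].members[0], &def, &undef);
            continue;
        }
        do {    /* rescan the archive until no member satisfies a pending reference */
            changed = 0;
            for (m = 0; m < inputs[i].nmembers; m++) {
                if (defines_unresolved(&inputs[i].members[m], &undef)) {
                    include_object(&inputs[i].members[m], &def, &undef);
                    changed = 1;
                }
            }
        } while (changed);
    }
    return (undef.count);
}
```

Modelling main.o as defining main and referencing addvec, and libvector.a as holding addvec.o and multvec.o, the order main.o libvector.a leaves nothing unresolved, while libvector.a main.o leaves addvec unresolved because the archive was scanned before any reference to addvec existed.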

  47. Why "UNIX% cc libvector.a main.o" Doesn't Work
      Linker keeps list of currently unresolved symbols and searches an encountered library for them
      If symbol(s) found, a .o file for the found symbol(s) is obtained and used by linker like any other .o file
      By putting libvector.a first, there is not yet any unresolved symbol, so linker doesn't obtain any .o file from libvector.a!

  48. Dynamic Libraries
      Static
        Linked at compile-time
        UNIX: foo.a
        Relocatable ELF file
      Dynamic
        Linked at run-time
        UNIX: foo.so
        Shared ELF file
      What are the differences?

  49. Static & Dynamic Libraries
      Static
        Library code added to executable file
        Larger executables
        Must recompile to use newer libraries
        Executable is self-contained
        Some time to load libraries at compile-time
        Library code shared only among copies of same program
      Dynamic
        Library code not added to executable file
        Smaller executables
        Uses newest (or smallest, fastest, ...) library without recompiling
        Depends on libraries at run-time
        Some time to load libraries at run-time
        Library code shared among all uses of library

  50. Static & Dynamic Libraries
      Static
        Creation:
          ar rcs libfoo.a bar.o baz.o
          ranlib libfoo.a
        Use:
          cc -o zap zap.o -lfoo
        Adds library's code, data, symbol table, relocation info, ...
      Dynamic
        Creation:
          cc -shared -Wl,-soname,libfoo.so -o libfoo.so bar.o baz.o
        Use:
          cc -o zap zap.o -lfoo
        Adds library's symbol table, relocation info
