Introduction to Intel Assembly Language for x86 Processors

 
Introduction to Assembly
 
Here we have a brief introduction to Intel
Assembly Language
This is the assembly language developed for the
Intel 8086 processor, which continues to be the
basis for all x86 processors (Pentium, Core, and later)
CISC instruction set
Special purpose register set
8 and 16 bit operations initially, expanded to 32 bit
operations with the 80386 and to 64 bit operations with x86-64
Memory-register and register-register operations
available, several addressing modes including many
implied addresses
 
Instruction Format
 
[name/label]  [mnemonic] [operands] [;comments]
Operands are either literals, variables/constants,  or registers
Number of operands depends on type of instruction,
range from 0 to 2
Examples:
mov eax, ebx – 2 operands, source and destination
mov eax, 5 – one operand is a literal
mov y, eax – register to memory movement
add eax, 5 – 2 operands for add
mul value – 1 operand for mul, other operand is eax
nop – no operands for the no-op instruction
je location – 1 operand (the branch target), the condition tested is implied to be a flag
 
Literals and Variables
 
Numeric literals specify their
base by following the
value with one of the following:
D, d for decimal (the default, so the suffix is optional)
H, h for hexadecimal (the literal must start with a digit, hence 0Ah)
Q, q for octal
B, b for binary
Strings are placed in ' ' or " " marks
Examples:
10101011b
0Ah
35
'hello'
"goodbye"
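As a rough illustration of the suffix rules above, the following C sketch converts a numeric literal written in this notation to its value. The function name asm_literal_value is hypothetical (it is not part of any assembler), and the sketch simply treats malformed input as 0.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: value of a literal such as "10101011b", "0Ah",
   "17q" or "35". The last character selects the base. */
unsigned long asm_literal_value(const char *text)
{
    size_t len = strlen(text);
    int base = 10;                  /* decimal is the default */
    char digits[64];

    if (len == 0 || len >= sizeof digits)
        return 0;                   /* sketch only: bad input becomes 0 */

    switch (text[len - 1]) {
    case 'h': case 'H': base = 16; len--; break;
    case 'q': case 'Q': base = 8;  len--; break;
    case 'b': case 'B': base = 2;  len--; break;
    case 'd': case 'D': base = 10; len--; break;
    default: break;                 /* no suffix: plain decimal */
    }

    memcpy(digits, text, len);      /* copy the digits without the suffix */
    digits[len] = '\0';
    return strtoul(digits, NULL, base);
}
```

With this sketch, the examples above come out as 171 for 10101011b, 10 for 0Ah and 35 for 35.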
 
We will define all
assembly code within
C/C++ programs so we
will declare all variables
in C/C++ code
int is 32 bit
short is 16 bit
char is 8 bit
We must ensure that we
place the datum into the
right sized register (see
next slide)
 
Registers
 
14 registers, all special purpose
4 data registers
EAX – accumulator
EAX is an implied register in the Mul and Div instructions
EBX – base register
used for addressing, particularly when dealing with arrays and strings
EBX can be used as a data register when not used for addressing
ECX – counter
implicitly used in loop instructions
in non-looping instructions, can be used as a data register
EDX – data register
used for In and Out instructions (input, output), also used to store partial
results of Mul and Div operations
in other cases, can be used as a data register
Register sizes:
_X – 16 bits (e.g., AX, BX)
_H and _L – 8 bits (high and low halves of the _X registers)
E_X – 32 bits (extends the _X registers to 32 bits)
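The way the 8, 16 and 32 bit names overlap can be modelled in plain C with masks and shifts. This is only a sketch of the layout: the type reg32 and the helper names are made up for illustration, and no real register is touched.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative model of one E_X register (for example EAX). */
typedef struct { uint32_t e; } reg32;

/* _X is the low 16 bits of E_X. */
uint16_t get_x(reg32 r) { return (uint16_t)(r.e & 0xFFFFu); }

/* _H is bits 8..15 and _L is bits 0..7 of E_X. */
uint8_t get_h(reg32 r) { return (uint8_t)((r.e >> 8) & 0xFFu); }
uint8_t get_l(reg32 r) { return (uint8_t)(r.e & 0xFFu); }

/* Writing _L or _H changes only that byte; the other 24 bits are kept. */
reg32 set_l(reg32 r, uint8_t v)
{
    r.e = (r.e & 0xFFFFFF00u) | v;
    return r;
}

reg32 set_h(reg32 r, uint8_t v)
{
    r.e = (r.e & 0xFFFF00FFu) | ((uint32_t)v << 8);
    return r;
}
```

For example, if the modelled EAX holds 12345678h, then AX is 5678h, AH is 56h and AL is 78h.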
 
Other Registers
 
Other registers cannot be used for general data but have
specific uses
Segment registers point to different segments in memory
SS – stack
CS – code
DS – data
ES – extra (used as a base pointer for variables)
Indexing registers as offsets into local function, stack, or
string
BP – base pointer used with SS to address subroutine local
variables on the stack
SP – stack pointer used with SS for top of stack
SI and DI – source and destination for string transfers
IP – instruction pointer (the program counter)
Status flags
 
Operations:  Data Movement
 
mov and xchg instructions
mov allows for register-register, memory-register, register-
memory, register-immediate and memory-immediate
first item is destination, second is source
memory-memory moves must be done with 2 instructions using a
register as temporary storage
memory references can use direct, direct+offset, or register-indirect
modes
if datum is 8-bit, register only uses high or low side, 16-bit uses entire
register, 32-bit uses extended register (e.g., EAX, EDX) and 64-bit
combines two registers
xchg instruction allows only register-register, memory-
register and register-memory and exchanges two values rather
than moves one value as with mov
 
Operations:  Arithmetic/Conditional
 
inc/dec dest
add/sub dest, source
dest is register or memory reference,
source for add/sub is register,
memory reference, or literal, sizes
must match
mul/div source
one datum is source, the other is
implied to be eax (or ax or al)
destination is implied as eax/edx
combined (or ax/dx, al/ah
depending on size)
source can be a register or memory
reference but not a literal (cannot do
mul 2)
div: quotient in ax, al or eax,
remainder in dx, ah or edx
mul:  result is twice the size, so goes
into edx/eax or dx/ax or ah/al
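To make the implied operands concrete, here is a C model of the 32-bit unsigned forms of mul and div. The struct regs and the function names are illustrative assumptions. The real div instruction raises a fault on division by zero or when the quotient does not fit in eax, which this sketch does not model.

```c
#include <assert.h>
#include <stdint.h>

/* Only the two registers involved are modelled. */
typedef struct { uint32_t eax, edx; } regs;

/* mul source: the 64-bit product of eax and source goes to edx:eax. */
regs mul32(regs r, uint32_t source)
{
    uint64_t product = (uint64_t)r.eax * source;
    r.eax = (uint32_t)product;          /* low half */
    r.edx = (uint32_t)(product >> 32);  /* high half */
    return r;
}

/* div source: edx:eax is the dividend; quotient to eax, remainder to edx.
   The caller must pass a nonzero source and a quotient that fits in 32 bits. */
regs div32(regs r, uint32_t source)
{
    uint64_t dividend = ((uint64_t)r.edx << 32) | r.eax;
    r.eax = (uint32_t)(dividend / source);
    r.edx = (uint32_t)(dividend % source);
    return r;
}
```

Note that edx takes part in a 32-bit div, so it is normally cleared (for unsigned values) before dividing a value that sits only in eax.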
 
shl, shr, sal, sar, shld, shrd,
rol, ror, rcl, rcr
shift, shift arithmetic, shift
double, rotate, rotate w/ carry
two operands: item being
shifted/rotated, number of bits
shifted/rotated (shld/shrd take a
third operand: the register that
supplies the bits shifted in)
Logic operations: AND, OR,
XOR, NOT
form is OP dest, source
NEG dest
convert two’s complement value
to its opposite
CMP first, second
compare first and second (computes first - second
without storing it) and set the
status flags (ZF, SF, CF, OF, PF)
the results of cmp operations are
then used by branch instructions
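The difference between a logical shift, an arithmetic shift and a rotate is easier to see in code. The C functions below are a sketch of the 32-bit forms. The names are made up for illustration, and the count is masked to 5 bits, as x86 does for 32-bit operands.

```c
#include <assert.h>
#include <stdint.h>

/* shr: logical shift right, zeros enter from the left. */
uint32_t shr32(uint32_t v, unsigned n) { return v >> (n & 31); }

/* sar: arithmetic shift right, copies of the sign bit enter from the left. */
int32_t sar32(int32_t v, unsigned n)
{
    uint32_t u;
    n &= 31;
    u = (uint32_t)v >> n;
    if (v < 0 && n != 0)
        u |= ~(0xFFFFFFFFu >> n);   /* fill the vacated bits with 1s */
    return (int32_t)u;
}

/* rol / ror: bits shifted out at one end re-enter at the other. */
uint32_t rol32(uint32_t v, unsigned n)
{
    n &= 31;
    return n ? (v << n) | (v >> (32 - n)) : v;
}

uint32_t ror32(uint32_t v, unsigned n)
{
    n &= 31;
    return n ? (v >> n) | (v << (32 - n)) : v;
}
```

Shifting -8 right by one bit gives -4 with sar but a large positive number with shr, which is why signed values use the arithmetic form.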
 
Operations:  Branches
 
Conditional branches:
instruction preceded by an
instruction which sets at least
one status flag, usually a cmp
instruction
flag tested based on type of
branch
je/jne location – branch if zero
flag set/clear
jg/jge/jl/jle location – branch
if first is greater/greater or equal/
less/less or equal (signed compare)
jc/jnc/jz/jnz/jp/jnp location –
branch on carry/no carry,
zero/not zero, even parity/odd
parity
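The link between cmp and the conditional branches can be sketched in C as follows. cmp32 models how a 32-bit cmp sets the zero, sign, carry and overflow flags, and the takes_ functions give the standard flag conditions for je, jl and jg. The type and function names are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* zf = zero, sf = sign, cf = carry (borrow), of = signed overflow */
typedef struct { bool zf, sf, cf, of; } flags;

/* cmp first, second: computes first - second and keeps only the flags. */
flags cmp32(uint32_t first, uint32_t second)
{
    uint32_t diff = first - second;
    flags f;
    f.zf = (diff == 0);
    f.sf = (diff >> 31) != 0;
    f.cf = first < second;          /* unsigned borrow */
    /* overflow: operand signs differ and the result sign differs from first */
    f.of = (((first ^ second) & (first ^ diff)) >> 31) != 0;
    return f;
}

bool takes_je(flags f) { return f.zf; }                  /* equal */
bool takes_jl(flags f) { return f.sf != f.of; }          /* signed less */
bool takes_jg(flags f) { return !f.zf && f.sf == f.of; } /* signed greater */
```

jl tests SF against OF rather than SF alone so that the branch is still correct when the subtraction overflows, for example when comparing the most negative int with 1.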
 
Unconditional branches:
branch automatically to location
jmp location
jmp instructions are used to
implement goto statements and
loops (procedure calls use call/ret)
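loop location
used for downward counting for
loops (strictly a conditional branch:
it falls through once ecx reaches 0)
initialize ecx (or cx) to starting
value
loop location behaves like
dec ecx
cmp ecx, 0
jne location
(except that loop leaves the flags unchanged)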
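The count-down shape of the loop instruction maps onto a C do/while loop: the body runs first, then ecx is decremented and control goes back to the top while ecx is still nonzero. The function below is only a model with a made-up name. It sums the values ecx takes on the way down.

```c
#include <assert.h>
#include <stdint.h>

/* Model of:      mov ecx, n
                  mov eax, 0
           top:   add eax, ecx
                  loop top        */
uint32_t sum_countdown(uint32_t n)
{
    uint32_t ecx = n;
    uint32_t eax = 0;

    if (ecx == 0)
        return 0;           /* guard: a real loop entered with ecx = 0 wraps around */

    do {
        eax += ecx;         /* loop body */
        ecx--;              /* loop decrements ecx ... */
    } while (ecx != 0);     /* ... and branches back while it is nonzero */

    return eax;
}
```

Starting with ecx = 5, the body runs five times and the sketch returns 15.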
 
Addressing Modes
 
Immediate – place datum in instruction as a literal
add eax, 10
use this mode when datum is known at program implementation time
Direct – place variable in instruction
mov eax, x        ;  moves x into register eax
add y, eax        ;  sets y = y + eax
use this mode to access a variable in memory
Direct + Offset
mov eax, x+4      ;  eax ← value at x + 4 bytes – this is not the same as x[4]
mov eax, x[ebx]   ;  eax ← x[ebx] – ebx stores the byte offset
Note:  mov eax, x[y] is illegal because it has 2 memory references
use this mode when dealing with strings, arrays and structs
Register Indirect – use base and/or index registers
mov eax, [bx + si]       ;  base-indexed
mov eax, [si - 4]        ;  base with displacement
mov eax, [bx + si - 6]   ;  base-indexed with displacement
we will not use these modes
 
Addressing Examples
 
Imagine that we have declared  in C:
int a[ ] = {0, 11, 15, 21, 99};
Then, the following accesses give us the values of a as
shown:
mov eax, a          eax ← 0
mov eax, a+4        eax ← 11
mov eax, a+8        eax ← 15
mov eax, a[ebx]     eax ← 99 if ebx = 16
If ebx and ecx both = 0 and size is the number of items in
the array, then we can iterate through the array as follows:
 
 top:     mov eax, a[ebx]
            … do something with the array value …
            add ebx, 4
            add ecx, 1
            cmp ecx, size
            jl     top        // use jl since we stop once ecx == size
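The byte-offset accesses above can be mirrored in plain C. In this sketch, load_dword stands in for mov eax, a[ebx] and sum_array follows the same loop shape, with ebx as the byte offset and ecx as the item count. Both function names are hypothetical, and int32_t is used to make the 4-byte element size explicit.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Like mov eax, base[byte_offset]: read 4 bytes at base + byte_offset. */
int32_t load_dword(const void *base, uint32_t byte_offset)
{
    int32_t value;
    memcpy(&value, (const unsigned char *)base + byte_offset, sizeof value);
    return value;
}

/* Walks the array the way the assembly loop does and adds up the items. */
int32_t sum_array(const int32_t *a, uint32_t size)
{
    uint32_t ebx = 0;               /* byte offset into the array */
    uint32_t ecx = 0;               /* number of items visited */
    int32_t total = 0;

    if (size == 0)
        return 0;                   /* the assembly loop tests at the bottom, so it assumes size > 0 */

    do {
        total += load_dword(a, ebx);    /* mov eax, a[ebx] */
        ebx += 4;                       /* add ebx, 4 */
        ecx += 1;                       /* add ecx, 1 */
    } while (ecx < size);               /* cmp ecx, size / jl top */

    return total;
}
```

For the array {0, 11, 15, 21, 99}, an offset of 4 reads 11, an offset of 16 reads 99, and the loop visits all five items.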
 
Writing Assembly in a C Program
 
For simplicity, we will write our code inside of C (or
C++) programs
This allows us to
declare variables in C/C++ thus avoiding the .data section
do I/O in C/C++ thus avoiding difficulties dealing with
assembly input and assembly output
compile our programs rather than dealing with assembling
them using MASM or TASM
To include assembly code in your C/C++ program,
add the following compiler directive
   __asm {
   }
And place all of your assembly code between the { }
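As a minimal sketch of how this looks in practice, the function below adds two C variables inside an __asm block. It assumes Microsoft Visual C++ building for 32-bit x86, where this block syntax is available. The preprocessor guard and the plain C fallback are additions for illustration, so that the file still compiles with other compilers. The function name add_ints is made up.

```c
#include <assert.h>

/* Adds x and y. Uses inline assembly only on 32-bit MSVC builds. */
int add_ints(int x, int y)
{
    int result;
#if defined(_MSC_VER) && defined(_M_IX86)
    __asm {
        mov eax, x          ; load the C variable x into eax
        add eax, y          ; eax = x + y
        mov result, eax     ; store eax into the C variable result
    }
#else
    result = x + y;         /* fallback for other compilers, not assembly */
#endif
    return result;
}
```

All three variables are of type int, so they match the 32-bit register eax. Input and output stay in C, for example by printing the value returned by add_ints(2, 3) with printf.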
 
Data Types
 
One problem that might arise in using C/C++ to run our
assembly code is that we might mix up data types
if you declare a variable to be of type int, then this is a 4-byte
variable
moving it into a register means that you must move it into a 4-byte
register (such as eax) and not a 2-byte or 1-byte register!
if you try to move a variable into the wrong sized register, or a
register value into the wrong sized variable, you will get an
“operand size conflict” syntax error message when compiling
your program
to use ax, bx, cx, dx, declare variables to be of type short
to use eax, ebx, ecx, edx, declare variables to be of type int
also notice that a char is 1 byte, so it should use either the upper
or lower half of a register (al, ah, dl, dh)

Intel Assembly Language is a low-level programming language designed for Intel 8086 processors and their successors. It features a CISC instruction set, special purpose registers, memory-register operations, and various addressing modes. The language employs mnemonics to represent instructions, with operands ranging from literals to registers. Additionally, it utilizes registers like EAX, EBX, ECX, and EDX for data manipulation. Understanding literals, variables, and registers is crucial for effective programming in Intel Assembly Language.

  • Intel Assembly Language
  • x86 Processors
  • CISC Instruction Set
  • Registers
  • Operand Types

Uploaded on Jul 16, 2024 | 2 Views


