Chapter 7 - Assembly & Hello World

 

7.1  Assembly Language

When a CPU runs a program, it executes instructions written in its own native machine language.  A file of such instructions is called a program, executable, or binary.  Machine language instructions are strings of 1's and 0's and are therefore cumbersome to work with.  Assembly language lets the programmer write recognizable words (mnemonics) instead of binary.  For example, you can write the MOV instruction instead of its binary code.  You can also assign labels to memory addresses, which then function as variables.  An assembler is required to translate the source code into machine language.

Assembly is used for direct hardware manipulation, access to specialized processor instructions, or to address performance issues.  Programs can be hybrid, with most of the code written in a high-level language and some critical parts in assembly language.  The first compilers were written in machine or assembly language.  Today, many compilers are written in C or C++.  The Windows OS is written in C and C++ with some critical parts in assembly.  macOS is written in C, C++, Objective-C, and Swift.

x86 Program Memory Layout

Each application is sandboxed in its own memory space.  The diagram on the right shows how the memory for an application is divided.

1.  Text (or Code) Segment - This is used to store the executable code.

2.  Data Segment - This stores data or variables that are initialized.

3.  BSS Segment - This stores uninitialized data or variables.  They are initialized to zero at runtime.  Unlike the data segment, no values have to be stored in the executable.

4.  Stack - This stores the hardware-supported LIFO (last in, first out) stack data.  The stack pointer register holds the address of the top of the stack, which is adjusted with each push or pop.

5.  Heap - This memory stores dynamic data created during runtime.  Examples are C++ and Java objects.

Here is additional information on memory layout.


7.2  Hello World using Flat Assembler (FASM)

Here is an example Hello World console program using FASM.  The program imports the printf function from the Microsoft C runtime library msvcrt.dll and calls it to print each line.  The getch function is called to keep the command prompt open after the program is finished.

hello_world.asm

format PE console
include 'win32ax.inc'

;=======================================
section '.code' code readable executable
;=======================================
start:
    cinvoke printf, "Hello%c", 10
    cinvoke printf, "World%c", 10
    invoke getch

;====================================
section '.idata' import data readable
;====================================
library msvcrt,'msvcrt.dll',kernel32,'kernel32.dll'
import msvcrt,printf,'printf',getch,'_getch'

Output

Hello
World


7.3  Hello World with Console Input


Here is an example program that performs console input.

scanf.asm

format PE console
include 'win32ax.inc'

;=======================================
section '.code' code readable executable
;=======================================
start:
        cinvoke printf, "Enter a number: "
        cinvoke scanf, "%d", Number1
        cinvoke printf, "Enter a number: "
        cinvoke scanf, "%d", Number2

        mov ecx, [Number1]
        mov ebx, [Number2]
        add ebx, ecx
        mov [Sum], ebx
        cinvoke printf, "%d + %d = %d%c",[Number1],[Number2],[Sum],10
        cinvoke printf, "Press any key to close console..."
        invoke getch

;======================================
section '.data' data readable writeable
;======================================
Number1 dd 0
Number2 dd 0
Sum     dd 0

;====================================
section '.idata' import data readable
;====================================
library msvcrt,'msvcrt.dll',kernel32,'kernel32.dll'
import msvcrt,printf,'printf',scanf,'scanf',getch,'_getch' 

Output

Enter a number: 15
Enter a number: 7
15 + 7 = 22

Below is an explanation of the parts of the scanf.asm program.  The C functions printf and scanf are covered separately in the next section.

scanf.asm Program Parts

format PE console
    Selects the Portable Executable (PE) format for the Windows executable program.  It can be either a console or a GUI application.

include 'win32ax.inc'
    This header is needed to create a 32-bit application.  It is the extended version of win32a.inc.

;=======================================
    Anything after a semicolon is a comment.

section '.code' code readable executable
    Designates the executable text or code section.

start:
    A user-created label.  Labels name addresses in the program, such as jump targets; this one marks the first instruction.

cinvoke printf, "Enter a number: "
    Use cinvoke to call imported C functions.  It follows the C calling convention, in which the caller cleans up the stack after the call.

mov ecx, [Number1]
mov ebx, [Number2]
    Move the value at memory address Number1 into register ecx, and the value at Number2 into register ebx.

add ebx, ecx
    Add register ecx to ebx.  The result is left in ebx.

mov [Sum], ebx
    Move register ebx into memory at address Sum.

cinvoke printf, "%d + %d = %d%c",[Number1],[Number2],[Sum],10
    Call the printf function and send it the values of Number1, Number2, and Sum, plus a newline (10).

invoke getch
    Calls the get-character function to keep the console from closing after the program finishes.  It is essentially a "press any key to continue".

section '.data' data readable writeable
    Designates the section for declaring initialized variables.  Variables must be readable and writeable.

Number1 dd 0
Number2 dd 0
Sum     dd 0
    Declare 4 bytes labeled Number1 and store 0.  Number2 and Sum are declared the same way.

section '.idata' import data readable
    Designates the section for importing functions.

library msvcrt,'msvcrt.dll',kernel32,'kernel32.dll'
    Specifies the C runtime library (msvcrt.dll) and the Windows kernel library (kernel32.dll).

import msvcrt,printf,'printf',scanf,'scanf',getch,'_getch'
    Imports the printf, scanf, and getch functions from the C library.