Read From stdin in Linux Assembler

Reading from stdin means letting the user type text and consuming that text in the application once the user finishes the input by pressing Enter. On Linux, pressing Enter appends a line feed character to the input:

\n = 10 = 0x0A = line feed

The user input first goes into a buffer maintained by Linux. You can call a Linux function to retrieve a number of bytes from that buffer. Once you have retrieved bytes, they are removed from the Linux buffer, so it contains only the input that has not been consumed yet. You should always consume the Linux buffer completely so that it is empty. The reason is that the buffer persists between calls. When you ask the user for new input on a later occasion, the same input buffer is used. If it was not drained, the new input is queued behind the leftover data. You expect new data, but you read the old data first! So always drain the input buffer when asking the user for input, even if you are only interested in the first n characters.

The input is read into an array variable in your application (an array of consecutive bytes in the data section). In assembler, the array variable has to be defined with a fixed length, e.g. you define a byte array of 100 bytes.
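As a minimal sketch, a fixed-length byte array can be declared in NASM in two ways. The names buffer_data and buffer_bss and the size of 100 are only chosen for illustration.

```nasm
BUFFER_SIZE equ 100                     ; illustrative size, not required by Linux

section .data
    buffer_data: times BUFFER_SIZE db 0 ; 100 bytes, initialized to 0, stored in the executable

section .bss
    buffer_bss: resb BUFFER_SIZE        ; 100 bytes, only reserved, zero-filled by Linux at load time
```

Either form gives you BUFFER_SIZE consecutive bytes whose address can be handed to the read function. The .bss form keeps the executable smaller because no initial data has to be stored in the file.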

Two things can happen when the user types and sends the input via enter:

  1. The user input from the Linux buffer, including the newline, fits into the variable in its entirety.
  2. The user input from the Linux buffer, including the newline, is too large to fit into the variable.

If the input fits into the variable, you only have to call the Linux function once, which drains the entire input buffer. If the input is too large for your variable, you have to call the Linux function several times until the Linux input buffer is empty.
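The second case can be sketched as a loop that calls the read function until the chunk it just read ends with a line feed. This sketch echoes every chunk instead of discarding it, so it handles a line of any length with a 5 byte variable. It assumes the input is typed at a terminal (where one read never returns more than one line) and a 32-bit build; the names chunk, .next_chunk and .done are made up for this example.

```nasm
; Assumed build: nasm -f elf32 echo.asm -o echo.o && ld -m elf_i386 echo.o -o echo

BUFFER_SIZE equ 5
LINE_FEED   equ 10

global _start

section .bss
    chunk resb BUFFER_SIZE      ; fixed-size array variable

section .text
_start:
.next_chunk:
    mov eax, 3                  ; sys_read
    mov ebx, 0                  ; file descriptor 0 = stdin
    mov ecx, chunk              ; destination
    mov edx, BUFFER_SIZE        ; read at most BUFFER_SIZE bytes
    int 80h
    cmp eax, 0
    jle .done                   ; 0 = EOF, negative = error: stop reading
    mov esi, eax                ; remember the number of bytes read

    mov eax, 4                  ; sys_write
    mov ebx, 1                  ; file descriptor 1 = stdout
    mov ecx, chunk              ; source
    mov edx, esi                ; write exactly the bytes just read
    int 80h

    cmp byte [chunk+esi-1], LINE_FEED
    jne .next_chunk             ; no line feed yet: more input is pending

.done:
    mov eax, 1                  ; sys_exit
    mov ebx, 0                  ; return code 0
    int 80h
```

When the loop ends because of the line feed, the Linux input buffer is empty again, so no separate drain step is needed.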

Implementation-wise, reading from stdin can be done via int 80h, the software interrupt through which a 32-bit assembler application calls into Linux. int 80h supports several functions (system calls), see https://www.tutorialspoint.com/assembly_programming/assembly_system_calls.htm. You select the function by putting its id into the eax register.

Reading (sys_read) has the id 3. ebx contains the file descriptor to read from, which is 0 for stdin. ecx contains the address of the array variable to put the bytes into. edx contains the maximum number of bytes to read, which is set to the length of the array variable.

To tell you how many bytes were actually read, function 3 returns that number in eax. A value of 0 means the end of the input (EOF) was reached, and a negative value is an error code.
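To make the register usage visible, here is a small sketch that performs a single read and then passes the value returned in eax to the exit function as the return code. The name buffer and the size of 100 are assumptions for this example, and the build commands assume a 32-bit target.

```nasm
; Assumed build: nasm -f elf32 count.asm -o count.o && ld -m elf_i386 count.o -o count

BUFFER_SIZE equ 100

global _start

section .bss
    buffer resb BUFFER_SIZE

section .text
_start:
    mov eax, 3              ; function id 3 = sys_read
    mov ebx, 0              ; file descriptor 0 = stdin
    mov ecx, buffer         ; address of the destination array
    mov edx, BUFFER_SIZE    ; maximum number of bytes to read
    int 80h                 ; after the call, eax = number of bytes read

    mov ebx, eax            ; use that count as the return code
    mov eax, 1              ; function id 1 = sys_exit
    int 80h
```

If you run the program, type hello and press Enter, then `echo $?` in the shell should print 6: five letters plus the line feed. The shell only shows the low 8 bits of the return code, so this trick is only meaningful for small counts.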

The implementation here is taken from https://stackoverflow.com/questions/23468176/read-and-print-user-input-with-x86-assembly-gnu-linux

It reads up to 5 bytes into an array variable. If the line is longer than that, it drains the Linux input buffer one byte at a time by reading into a dummy variable until it sees the newline character. The dummy character is not processed further, so the rest of the input is simply ignored. In other words, this code is only interested in the first 5 bytes and discards everything else. The program then outputs the bytes it kept, followed by a line feed, before it terminates.

BUFFER_SIZE equ 5
LINE_FEED equ 10

global _start           ; export _start so the linker (ld) can find the entry point

section .data
    str: times BUFFER_SIZE db 0 ; Allocate buffer of BUFFER_SIZE bytes
    lf:  db LINE_FEED   ; LF line feed, must directly follow str (see length++ below)

section .bss
    e1_len resd 1
    dummy resd 1

section .text

_start:                 ; program entry point (ld looks for _start by default)
    ; https://stackoverflow.com/questions/23468176/read-and-print-user-input-with-x86-assembly-gnu-linux

; read using function 3 (sys_read)
    mov eax, 3          ; Read user input into str
    mov ebx, 0          ; |
    mov ecx, str        ; | <- destination
    mov edx, BUFFER_SIZE        ; | <- length
    int 80h             ; \

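    mov [e1_len], eax   ; Store number of inputted bytes
    cmp eax, edx        ; fewer bytes read than the buffer holds?
    jb .2               ; yes: the whole line fit, nothing to drain
    mov bl, [ecx+eax-1] ; BL = last byte in buffer
    cmp bl, LINE_FEED   ; LF in buffer?
    je .2               ; yes: the line filled the buffer exactly, nothing to drain
    inc DWORD [e1_len]  ; no: length++ (include 'lf', which directly follows str in memory)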

; drain the linux input buffer
    .1:                 ; Loop
    mov eax, 3           ; SYS_READ
    mov ebx, 0          ; EBX=0: STDIN
    mov ecx, dummy      ; pointer to a temporary buffer
    mov edx, 1          ; read one byte
    int 0x80            ; syscall
    test eax, eax       ; EOF? eax contains the amount of bytes read
    jz .2               ; yes: ok
    mov al, [dummy]     ; AL = character
    cmp al, LINE_FEED   ; character = LF
    jne .1              ; no -> next character
    .2:                 ; end of loop

; output the array variable using function 4 from int 80h (sys_write)
    mov eax, 4          ; Print e1_len bytes starting from str
    mov ebx, 1          ; |
    mov ecx, str        ; | <- source
    mov edx, [e1_len]   ; | <- length
    int 80h             ; \

; return using function 1 from int 80h (sys_exit)
    mov eax, 1          ; Return
    mov ebx, 0          ; | <- return code
    int 80h             ; \

The program is built with the following Makefile:

TARGET_DIR := target
MKDIR_P = mkdir -p

all: directories main

# int 80h is the 32-bit system call interface, so build a 32-bit executable
main: main.o
	ld -m elf_i386 target/main.o -o target/main

main.o: main.asm
	nasm -f elf32 main.asm -o target/main.o

clean:
	rm target/main.o target/main

# https://www.gnu.org/software/make/manual/html_node/Phony-Targets.html
.PHONY: all clean directories
directories: ${TARGET_DIR}

${TARGET_DIR}:
	${MKDIR_P} ${TARGET_DIR}
