Operating System Design & Implementation Tutorial - JOSH

Shell implementation

The implementation of the shell will be done in small steps that do meaningful work. As an aside, I prefer to develop small applications to test routines so that development can be done fast without the need to reboot the system often (I assume that most users will have access to one PC to do all the development/testing). I develop small .com binaries that I test immediately and once I am happy with the working of the routine, I move it to the JOSH source to finally test it.

First we will write small routines that will do some screen oriented chores like displaying space, moving to the beginning of next line etc. Let us start with a small function to display a space and advance the cursor. This can be done everytime we need in the required routine itself but it is time consuming. It is easy to write small functions and call them when needed. The function looks like below:

_display_space:
    mov ah, 0x0E    ; BIOS teletype
    mov al, 0x20    ; space character
    mov bh, 0x00    ; display page 0
    mov bl, 0x07    ; text attribute
    int 0x10        ; invoke BIOS
    ret
			

The code is rather self explanatory and you should have no difficulty understanding it. These functions should go to the end of the code section after the interrupt handler (it is only a matter of taste, you can place it anywhere in the code section). The next function will move the cursor to the beginning of the next line. The code is:

_display_endl:
    mov ah, 0x0E        ; BIOS teletype acts on newline!
    mov al, 0x0D
    mov bh, 0x00
    mov bl, 0x07
    int 0x10
    mov ah, 0x0E        ; BIOS teletype acts on linefeed!
    mov al, 0x0A
    mov bh, 0x00
    mov bl, 0x07
    int 0x10
    ret
			

The code is similar to the previous function but displays two characters in succession, the 'newline' and the 'linefeed' characters. The newline character will move the cursor to the beginning of the same line and the linefeed will move the cursor to the next line. The BIOS has many interrupt sub-services to display characters, this one will display the character and also act on the character by moving the cursor accordingly (for eg. if you try to display the backspace character, the cursor actually will move one space back - but only upto the beginning of the line and also will not erase existing displayed characters as it moves!).

The next function we develop will display the shell prompt! To do this we have to decide how our shell prompt is going to look. The JOSH shell prompt is goint to look like 'JOSH>>' and if you don't like it you can always change it to your liking. Before displaying the prompt, we have to create a string variable to hold the prompt, and changing the contents of this variable will change the prompt. Let us do it like this:

[SEGMENT .data]
    strWelcomeMsg   db  "Welcome to JOSH!", 0x00
    strPrompt   db  "JOSH>>", 0x00
    ...
			

This adds a variable called 'strPrompt' made of a string of bytes having 'JOSH>>' in it as well as a terminal 0x00. Now we will add a function to display the string.

_display_prompt:
    mov si, strPrompt   ;load message
    mov al, 0x01        ;request sub-service 0x01
    int 0x21
    ret
			

The code again is self explanatory and you should have no difficulty understanding it. Now we have to create a buffer to hold the command that the user will type in. As we have already decided, a user command can be 255 chars long, so our buffer should be 256 chars long to accommodate a terminal 0x00. We can declare un-initiated data in NASM in the .bss section at the end of the code. We also need the ability to change the length of the commands later, so we will not hard-code the length in the concerned routine but rather will declare a variable called 'cmdMaxLen' and initiate it to the maximun length desired. We also need a counter to keep track of the number of characters entered in the command. This will be called 'cmdChrCnt'. The declaration section looks like this:

[SEGMENT .data]
    strWelcomeMsg   db  "Welcome to JOSH!", 0x00
    strPrompt   db  "JOSH>>", 0x00
    cmdMaxLen   db  255         ;maximum length of commands

[SEGMENT .bss]
    strUserCmd  resb    256     ;buffer for user commands
    cmdChrCnt   resb    1       ;count of characters
			

Now that we have the necessary helper functions and the data variables in place, we can delve into the actual function that will accept a user input, which is called '_get_command'. This function will display the cursor and wait for the keyboard input. When a key is struck, it analyses the key pressed. If the key is an extended key (F1, HOME etc.) it will do nothing and will continue to wait. If the key pressed is ENTER, it will terminate the buffer with 0x00 and return. If the key pressed is a backspace, it will move the buffer pointer one character behind if not already at the beginning and will also move one space back on the screen. If the key pressed is a character key, it will be added to the buffer if the buffer is not 255 characters long. It is a pretty long listing and try to understand it fully. The code can be re-written using the more efficient MOVSB assembler directive, which I leave it to you. The listing is as follows:

    ...
    _int0x21_end:
    iret

_get_command:
    ;initiate count
    mov BYTE [cmdChrCnt], 0x00
    mov di, strUserCmd

    _get_cmd_start:
    mov ah, 0x10        ;get character
    int 0x16

    cmp al, 0x00        ;check if extended key
    je _extended_key
    cmp al, 0xE0        ;check if new extended key
    je _extended_key

    cmp al, 0x08        ;check if backspace pressed
    je _backspace_key

    cmp al, 0x0D        ;check if Enter pressed
    je _enter_key

    mov bh, [cmdMaxLen]     ;check if maxlen reached
    mov bl, [cmdChrCnt]
    cmp bh, bl
    je  _get_cmd_start

    ;add char to buffer, display it and start again
    mov [di], al            ;add char to buffer
    inc di                  ;increment buffer pointer
    inc BYTE [cmdChrCnt]    ;inc count

    mov ah, 0x0E            ;display character
    mov bl, 0x07
    int 0x10
    jmp _get_cmd_start

    _extended_key:          ;extended key - do nothing now
    jmp _get_cmd_start

    _backspace_key:
    mov bh, 0x00            ;check if count = 0
    mov bl, [cmdChrCnt]
    cmp bh, bl
    je  _get_cmd_start      ;yes, do nothing
    
    dec BYTE [cmdChrCnt]    ;dec count
    dec di

    ;check if beginning of line
    mov ah, 0x03        ;read cursor position
    mov bh, 0x00
    int 0x10

    cmp dl, 0x00
    jne _move_back
    dec dh
    mov dl, 79
    mov ah, 0x02
    int 0x10

    mov ah, 0x09        ; display without moving cursor
    mov al, ' '
    mov bh, 0x00
    mov bl, 0x07
    mov cx, 1           ; times to display
    int 0x10
    jmp _get_cmd_start

    _move_back:
    mov ah, 0x0E        ; BIOS teletype acts on backspace!
    mov bh, 0x00
    mov bl, 0x07
    int 0x10
    mov ah, 0x09        ; display without moving cursor
    mov al, ' '
    mov bh, 0x00
    mov bl, 0x07
    mov cx, 1           ; times to display
    int 0x10
    jmp _get_cmd_start

    _enter_key:
    mov BYTE [di], 0x00
    ret

_display_space:
    ...
			

The routine above will get user input into a string and terminate it with 0x00 but we cannot use this string directly as a command. As we have already discussed, a command can have many parts with directives/operands added after the command. Each of them will be separated by one or many spaces (we will not be so rigid and will let users key in leading/trailing spaces as well as multiple spaces between directives). We will have to split the user input into the various components before using it. We will declare five new buffers, one for each component of the command. The complete .bss declaration section is as follows:

[SEGMENT .bss]
    strUserCmd  resb    256     ;buffer for user commands
    cmdChrCnt   resb    1       ;count of characters
    strCmd0     resb    256     ;buffers for the command components
    strCmd1     resb    256
    strCmd2     resb    256
    strCmd3     resb    256
    strCmd4     resb    256
			

Now we will look at the function that will split the string into five sub-strings (more parts will be ignored!). To get a string, first we will analyse the string from the beginning and move over spaces, then we will copy characters to the sub-string until we meet a space or the terminator char 0x00. We will do this cycle five times to get all the command components. If only one or two components are present, the other components will be terminated at the first character with 0x00. This function can be added after the previous function and the listing for this is:

_split_cmd:
    ;adjust si/di
    mov si, strUserCmd
    ;mov di, strCmd0

    ;move blanks
    _split_mb0_start:
    cmp BYTE [si], 0x20
    je _split_mb0_nb
    jmp _split_mb0_end

    _split_mb0_nb:
    inc si
    jmp _split_mb0_start

    _split_mb0_end:
    mov di, strCmd0

    _split_1_start:         ;get first string
    cmp BYTE [si], 0x20
    je _split_1_end
    cmp BYTE [si], 0x00
    je _split_1_end
    mov al, [si]
    mov [di], al
    inc si
    inc di
    jmp _split_1_start

    _split_1_end:
    mov BYTE [di], 0x00

    ;move blanks
    _split_mb1_start:
    cmp BYTE [si], 0x20
    je _split_mb1_nb
    jmp _split_mb1_end

    _split_mb1_nb:
    inc si
    jmp _split_mb1_start

    _split_mb1_end:
    mov di, strCmd1

    _split_2_start:         ;get second string
    cmp BYTE [si], 0x20
    je _split_2_end
    cmp BYTE [si], 0x00
    je _split_2_end
    mov al, [si]
    mov [di], al
    inc si
    inc di
    jmp _split_2_start

    _split_2_end:
    mov BYTE [di], 0x00

    ;move blanks
    _split_mb2_start:
    cmp BYTE [si], 0x20
    je _split_mb2_nb
    jmp _split_mb2_end

    _split_mb2_nb:
    inc si
    jmp _split_mb2_start

    _split_mb2_end:
    mov di, strCmd2

    _split_3_start:         ;get third string
    cmp BYTE [si], 0x20
    je _split_3_end
    cmp BYTE [si], 0x00
    je _split_3_end
    mov al, [si]
    mov [di], al
    inc si
    inc di
    jmp _split_3_start

    _split_3_end:
    mov BYTE [di], 0x00

    ;move blanks
    _split_mb3_start:
    cmp BYTE [si], 0x20
    je _split_mb3_nb
    jmp _split_mb3_end

    _split_mb3_nb:
    inc si
    jmp _split_mb3_start

    _split_mb3_end:
    mov di, strCmd3

    _split_4_start:         ;get fourth string
    cmp BYTE [si], 0x20
    je _split_4_end
    cmp BYTE [si], 0x00
    je _split_4_end
    mov al, [si]
    mov [di], al
    inc si
    inc di
    jmp _split_4_start

    _split_4_end:
    mov BYTE [di], 0x00

    ;move blanks
    _split_mb4_start:
    cmp BYTE [si], 0x20
    je _split_mb4_nb
    jmp _split_mb4_end

    _split_mb4_nb:
    inc si
    jmp _split_mb4_start

    _split_mb4_end:
    mov di, strCmd4

    _split_5_start:         ;get last string
    cmp BYTE [si], 0x20
    je _split_5_end
    cmp BYTE [si], 0x00
    je _split_5_end
    mov al, [si]
    mov [di], al
    inc si
    inc di
    jmp _split_5_start

    _split_5_end:
    mov BYTE [di], 0x00

    ret
			

Now we have all the components of a shell ready. A component to get user inputs, and the one to split it into the needed sub-strings. We have to write a wrapper to display the prompt, get a command, split it, analyse the first component and act on it. If the command is an internal command, the result is displayed and the cycle continues till the user exits from the shell - using the 'exit' command. Another thing I forgot to add previously is that we will follow the UNIX convention of case-sensitive commands and filenames (it makes life a lot simpler to code! but maybe a bit tough on the user).

We also have to define the internal commands available and set up a couple of strings to hold the OS name, version numbers etc. The complete listing of the declarations are:

[SEGMENT .data]
    strWelcomeMsg   db  "Welcome to JOSH Ver 0.03", 0x00
    strPrompt       db  "JOSH>>", 0x00
    cmdMaxLen       db  255         ;maximum length of commands

    strOsName       db  "JOSH", 0x00    ;OS details
    strMajorVer     db  "0", 0x00
    strMinorVer     db  ".03", 0x00

    cmdVer          db  "ver", 0x00     ; internal commands
    cmdExit         db  "exit", 0x00

    txtVersion      db  "version", 0x00 ;messages and other strings
    msgUnknownCmd   db  "Unknown command or bad file name!", 0x00
			

The shell routine listing is:

_shell:
    _shell_begin:
    ;move to next line
    call _display_endl

    ;display prompt
    call _display_prompt

    ;get user command
    call _get_command
    
    ;split command into components
    call _split_cmd

    ;check command & perform action

    ; empty command
    _cmd_none:      
    mov si, strCmd0
    cmp BYTE [si], 0x00
    jne _cmd_ver        ;next command
    jmp _cmd_done
    
    ; display version
    _cmd_ver:       
    mov si, strCmd0
    mov di, cmdVer
    mov cx, 4
    repe    cmpsb
    jne _cmd_exit       ;next command
    
    call _display_endl
    mov si, strOsName       ;display version
    mov al, 0x01
    int 0x21
    call _display_space
    mov si, txtVersion      ;display version
    mov al, 0x01
    int 0x21
    call _display_space

    mov si, strMajorVer     
    mov al, 0x01
    int 0x21
    mov si, strMinorVer
    mov al, 0x01
    int 0x21
    jmp _cmd_done

    ; exit shell
    _cmd_exit:      
    mov si, strCmd0
    mov di, cmdExit
    mov cx, 5
    repe    cmpsb
    jne _cmd_unknown        ;next command

    je _shell_end           ;exit from shell

    _cmd_unknown:
    call _display_endl
    mov si, msgUnknownCmd       ;unknown command
    mov al, 0x01
    int 0x21

    _cmd_done:

    ;call _display_endl
    jmp _shell_begin
    
    _shell_end:
    ret
			

The shell currently understands two commands. The 'ver' command will display the OS name and the version, and the 'exit' command will exit from the shell.

Finally we have to call the shell from the main routine after displaying the welcome message. We can also safely remove the two lines of code that wait for a key press to reboot. This will finish the shell integration for now. The call looks like:

    ...
    mov si, strWelcomeMsg   ; load message
    mov al, 0x01            ; request sub-service 0x01
    int 0x21

    call _shell             ; call the shell
    
    int 0x19                ; reboot
    ...
			

That comes to the end of a lengthy session. The complete kernel can be downloaded here. Go ahead and compile the new kernel. Copy it to your JOSH boot-disk and boot JOSH. Play around with the shell. Try inputting all sorts of keys and see how the shell responds. As of now it should display the version in response to the 'ver' command, and reboot for the 'exit' command. Check if it responds to leading/trailing spaces.

That comes to the end of another part of our journey. Now we have understood how an OS boots. We have successfully installed an interrupt service, and have integrated a rudimentary shell to play with. The next step would be to add as many interrupt sub-services as needed to display things, and to add most of the internal shell commands that do not need access to the filesystem. These chores will be done in the next chapter.