Disasm Visual Basic Applications
INDEX [INDX]
The information provided in this tutorial must not be used for Reverse
Engineering any application.
THE TEXT HAS BEEN WRITTEN IN SUCH A WAY THAT THE READER CAN LEARN, AND NOT JUST
GAIN INFORMATION WITHOUT KNOWING HOW STUFF WORKS.
If the Reader still chooses to break Protection Mechanisms after reading this
tutorial, he/she alone shall be responsible for the damage caused, not the
Author.
If you wish to post certain Sections of the Tutorial on a Website, you are free
to do so provided you inform the author and publish the Selected Text from the
Tutorial as it is without modification.
The Author has not copied text or any other information directly from a Source.
However, some information from some sources has been used to write this tutorial.
These Sources have been mentioned in the References Section.
You are permitted to continue reading the tutorial only if you agree to the text
given above.
II. READING THIS TUTORIAL [RTUT]
Each Section in this Tutorial has a specific Topic Code enclosed in square
brackets. This arrangement has been made so that you can jump to a specific
topic simply by searching for the topic code from your Browser.
At many places in the tutorial, I've explained a few things which are almost
unnecessary to know when dealing with Visual BASIC programs, but I've written
them for those who are interested in Hacking. (And I don't mean the
'hack-an-email-address' sort of kid. The original meaning of Hacking has
been ruined by pathetic people like them.) Hacking, in literal terms, stands for
'curiosity'.
The original meaning of a Hacker is:
"A person who enjoys exploring the details of programmable systems, as opposed
to most users, who prefer to learn only the minimum necessary."
The extra details I've given in this tutorial are for those who want to be such
ethical hackers.
The Topic Codes [XTRA] and [/XTRA] mark "extra-information" sections, and you
are free to skip such sections.
You can search for the Extra Information using the Topic Code.
However the same is not true for Applications written in Visual BASIC. VB
Programs are said to be very slow and hence deliver poor performance. There is a
reason for this. Visual BASIC programs, unlike those written in other languages,
don't use the Windows API directly. Instead they call local functions in the VB
Runtime Files, which in turn call functions from the Windows API. Most of the
Visual BASIC functions are present in MSVBVM60.DLL (if you've got Runtime Files
ver. 6.0). So to study VB programs, we must disassemble and analyze the
MSVBVM60.DLL file as well.
Since VB programs use such a complex API Function call procedure, programs tend
to run slower. (There are other reasons why VB programs run slower, but I
won't be covering them as they're off-topic.)
My secondary aim is to help you realise why Visual BASIC is not suitable for
writing small, fast and efficient programs.
Almost all authors of Visual BASIC books mention that Visual BASIC does not
give you applications with good performance.
This tutorial tells you why.
The Tutorial will talk about executable files compiled in Visual BASIC in
Native Code ONLY and not p-code.
You are required to have a basic understanding of Visual BASIC, C, the Windows
API and 80x86 Microprocessor Assembly Language.
It would be advisable to have a copy of Intel's 80x86 Instruction Set Manual.
Intel provides this manual free of charge. If you need this manual, contact me.
You can provide these Order Numbers to get a copy of these manuals. For this
tutorial only Volume 2 is required.
You will need the following tools to proceed with the Tutorial.
VBDE is not required but it's always better to have it as it gives addresses of
entry-points of most VB procedures.
VBReformer is used to see the Property values of all objects in a Visual Basic
Form. It even allows you to change the Property values of objects such as Forms,
Command Buttons etc. Import Libraries can also be seen. This Application is
not required for this Tutorial but it's better to have it.
NuMeGa SmartCheck is again not required but it is useful when we have no idea
what a particular procedure of VB does. You can run a program from it like a
Debugger and view its log files and find out which procedure is called and what
operations are carried out etc.
You can have API Documentation from MSDN or you can use the API Text Viewer Tool
supplied with Visual Studio or browse MSDN Online (msdn.microsoft.com). Certain
Applications like APIViewer will also do.
I have given the names of the Tools that I have used. You are free to use
any disassembler and debugger you are comfortable with, but I advise you to use
the tools that I have listed above. SoftIce is better than OllyDebug, but the
latter is good enough for VB Programs, so it doesn't matter which one you use.
Once you have the necessary knowledge and tools, you can proceed further.
Let's begin.
When you open a VB Program from IDA, you'll end up with the following code.
start:
push offset dword_4012B4
call ThunRTMain
; ---------------------------------------------------------------------------
dd 0, 300000h, 400000h, 0, 0E9960000h, 82E6FCDFh, 939C4C23h
dd 0EB969B2Fh, 73D5h, 0, 1, 34303230h, 72503033h, 63656A6Fh
; etc etc etc
This doesn't make any sense, does it? If you keep scrolling further you will see
sections of code and data. Each Section has a meaning in VB Programs, and the
general Section Map of a Visual BASIC program is shown below.
00401000:
... IAT (First Thunk of APIs)
Next Section(NS):
... some data
NS:
... transfer area (Jumps to imported functions)
NS:
... lots of data
NS:
... local transfer area (for internal event handlers)
NS:
... other data
NS:
... code
NS:
... lots of data
NS:
... .data Section
Let us now start analysis from the entry point of the program.
A function ThunRTMain is called which accepts one parameter. We'll soon find out
that the parameter is a structure.
Simply stepping over the CALL statement results in the execution of the
Application.
Weird, isn't it?
For Pascal, C and C++ Programs there is always a start() function that takes all
CommandLine Parameters, gets Process Threads, Module Handles etc. We didn't see
anything of the sort in a Visual BASIC Program.
But actually, VB does have a start function. The start function code is placed
in the ThunRTMain Function. Let's verify that by disassembling MSVBVM60.DLL
and viewing the ThunRTMain Function. I've shown only a part of the
ThunRTMain Function Code.
or [ebp+var_4], 0FFFFFFFFh
push 0 ; uExitCode
call ds:ExitProcess
jmp loc_734619B3
As you can see, it does call all the Functions that the start() function does in
C and PASCAL programs. But what about the GetCommandLine() Function from
KERNEL32.DLL? MSVBVM60.DLL does call that function as well, but the call is
buried in deeply nested function calls. You can open the Imports Window to see
the Imported Function and follow the cross-reference to a procedure in
MSVBVM60.DLL.
The sub_Free_Memory procedure calls various API Functions but if you keep
reading the procedure, you'll soon come across the HeapFree() Function which is
imported from kernel32.dll.
I guess you now know the purpose of the ThunRTMain Function.
Let us now see what structure is passed to it.
Explaining the Structure will take up a lot of time and since I want to focus on
the Code Constructs of Visual BASIC, I won't explain the Structure Passed to
ThunRTMain.
All I can tell you is that the structure contains the PE (Portable Executable)
Header Details. It is this header that is read by Resource Editors.
Create a Form with a CommandButton. Click the CommandButton and add a simple
Msgbox Code as shown below:
fake_a_call_instr:
retn
; ---------------------------------------------------------------------------
continue_after_jump:
mov eax, [ebp+arg_0]
push eax
mov ecx, [eax]
call dword ptr [ecx+8] ; Calls MSVBVM60.Zombie_Release
mov eax, [ebp+var_4]
mov ecx, [ebp+var_14]
pop edi
pop esi
mov large fs:0, ecx
pop ebx
mov esp, ebp
pop ebp ; Closes Stack Frame
retn 4
Command1_Click endp
Simply by looking at the entire procedure you can't exactly figure out what the
hell happens when the whole subroutine is executed. If you know Assembly well
and have had the patience to read through the code, you should notice a few neat
things in the code.
[XTRA]
Before I begin explaining the procedure, I want to teach you how to recognise a
procedure in Visual BASIC. They can be called Procedure Signatures.
1) A Procedure has the open and close Stack Frame instructions.
2) The First Procedure in a VB Program is always preceded by
12 0xCC Bytes (which corresponds to the INT 3 Instruction) followed by
4 0xE9 bytes (shown as 'Θ' in the OEM character set) followed by 12 0xCC bytes.
3) Procedures other than the first are preceded by 10 NOP (0x90) Instructions.
1) STACK FRAME:
The Open/Close Stack Frame Instructions are even found in C/C++ and Pascal
programs and hence can be termed as a universal method of determining procedures.
However that is not always the case.
--> Many compilers use JMP instructions to fake a CALL Instruction, so a Jump
is at times really a CALL to a procedure. IDA Pro does not detect such
CALL 'emulating' instructions but OllyDebug does recognise such
code patterns.
--> Visual C++ allows the programmer to write naked functions. Naked functions
mean that the compiler does not allocate space for its arguments nor does it
include the stack open and close frame instructions.
But since we are dealing with Visual BASIC, we can ignore the second case. You
will see an example of the first case shortly.
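To make the stack frame signature concrete, here is a minimal sketch of a scanner that looks for the usual "open stack frame" bytes (55 8B EC, i.e. push ebp / mov ebp, esp). The helper name and the sample bytes are made up for illustration, and a plain byte match like this will also hit data that merely looks like a prologue, so treat the results as candidates only.

```python
# Hypothetical helper: scan raw code bytes for the classic x86 prologue.
PROLOGUE = bytes([0x55, 0x8B, 0xEC])  # push ebp ; mov ebp, esp


def find_prologues(code):
    """Return every offset at which the 3-byte prologue pattern starts."""
    offsets = []
    pos = code.find(PROLOGUE)
    while pos != -1:
        offsets.append(pos)
        pos = code.find(PROLOGUE, pos + 1)
    return offsets


# Made-up sample: INT 3 padding, a tiny procedure, NOP padding, another one.
sample = (
    bytes([0xCC] * 4)
    + bytes([0x55, 0x8B, 0xEC, 0x8B, 0xE5, 0x5D, 0xC3])  # ... mov esp, ebp ; pop ebp ; retn
    + bytes([0x90] * 3)
    + bytes([0x55, 0x8B, 0xEC, 0x5D, 0xC2, 0x04, 0x00])  # ... pop ebp ; retn 4
)

print(find_prologues(sample))  # [4, 14]
```

Some compilers emit 89 E5 instead of 8B EC for mov ebp, esp, so a real scanner would need to accept both encodings.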
The 0xCC Byte is the opcode of the INT 3 Instruction, which Intel's manual
lists under "Call to Interrupt Procedure". It is used by Debuggers such as
OllyDebug and SoftIce to set software Breakpoints. A Debugger overwrites the
first byte of the instruction on which it wants to set a breakpoint with 0xCC.
As soon as the INT 3 Instruction is executed, Control is passed on to the
Debugger's Exception Handler.
Here is the description taken directly from Intel's Software Developers Manual
Volume 2 : Instruction Set Reference.
"The INT 3 instruction generates a special one byte opcode (CC) that is intended
for calling the debug exception handler. (This one byte form is valuable because
it can be used to replace the first byte of any instruction with a breakpoint,
including other one byte instructions, without over-writing other code). To
further support its function as a debug breakpoint, the interrupt generated with
the CC opcode also differs from the regular software interrupts as follows:
Interrupt redirection does not happen when in VME mode; the interrupt is
handled by a protected-mode handler.
The virtual-8086 mode IOPL checks do not occur. The interrupt is taken without
faulting at any IOPL level."
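As a rough illustration of the byte replacement the manual describes, here is a toy model of a software breakpoint table. It works on a plain byte buffer rather than a real process, and the class and method names are my own inventions, not the API of any debugger.

```python
INT3 = 0xCC  # one-byte opcode of INT 3


class SoftBreakpoints:
    """Toy model: `code` stands in for the debuggee's code bytes."""

    def __init__(self, code):
        self.code = bytearray(code)
        self.saved = {}  # offset -> original first byte of the instruction

    def set_bp(self, offset):
        # Remember the original byte, then overwrite it with INT 3.
        if offset not in self.saved:
            self.saved[offset] = self.code[offset]
            self.code[offset] = INT3

    def clear_bp(self, offset):
        # Put the original byte back so the instruction can run normally.
        self.code[offset] = self.saved.pop(offset)


code = bytes([0x55, 0x8B, 0xEC, 0x5D, 0xC3])  # push ebp ; mov ebp, esp ; pop ebp ; retn
bp = SoftBreakpoints(code)
bp.set_bp(1)                # breakpoint on "mov ebp, esp"
print(bp.code.hex())        # 55ccec5dc3
bp.clear_bp(1)
print(bp.code.hex())        # 558bec5dc3
```

Only one byte is touched, which is why the one-byte form of INT 3 can be placed on any instruction without damaging the one after it.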
That's how debuggers work. That's also the concept behind certain anti-debugging
techniques. Since a Debugger patches the 0xCC byte into the code, the CRC
(Cyclic Redundancy Check) Value of the code also changes. Some Anti-debugging
techniques encrypt the program with a key which is the CRC value of the
program. When a program is being debugged, its CRC value changes and, as a
result, the program doesn't get decrypted.
Such methods are effective in stopping amateur wannabe hackers from
understanding the code, but they are not foolproof and an expert hacker can get
past them with ease.
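The CRC idea can be sketched in a few lines. This is only an illustration: the XOR routine is a stand-in for whatever cipher a real protector would use, the byte buffer stands in for the program's code, and all names are hypothetical.

```python
import zlib

code = bytes([0x55, 0x8B, 0xEC, 0x5D, 0xC3])  # stand-in for the protected code

# What the code looks like once a debugger has planted a breakpoint in it.
patched = bytearray(code)
patched[0] = 0xCC

clean_crc = zlib.crc32(code)
debug_crc = zlib.crc32(bytes(patched))
print(clean_crc != debug_crc)  # True: one changed byte changes the CRC


def xor_crypt(data, key):
    """Toy cipher: XOR the data with the 4 bytes of the key, repeated."""
    key_bytes = key.to_bytes(4, "little")
    return bytes(b ^ key_bytes[i % 4] for i, b in enumerate(data))


secret = xor_crypt(b"payload", clean_crc)        # encrypted with the clean CRC
print(xor_crypt(secret, clean_crc))              # b'payload'
print(xor_crypt(secret, debug_crc) == b"payload")  # False
```

With the breakpoint in place the key is wrong, so the decrypted bytes come out as garbage.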
So much for what '0xCC' is. But why is it placed before the First Procedure in
VB Programs?
I've found no answer to that so far. This wastes a lot of space in a program.
If you try to disassemble a Console Program written in Visual C++, you'll find
many instructions which set parts of the stack to the '0xCC' value. You will
also find 0xCC bytes scattered across the disassembled listing.
If only Visual Studio were Open Source, we could have seen the code generation
code, come up with an answer and improved the code generation code too.
I hope you also realise why Open Source is slowly gaining momentum.
Here is the description taken directly from Intel's Software Developers Manual
Volume 2 : Instruction Set Reference.
But again, I see no reason why the 0x90 Byte is present in Visual BASIC.
Removing such entries will reduce the executable size drastically.
[/XTRA]
What does the Zombie_AddRef Function do? It takes a reference to the Object.
The parent object (in this case the Form) is passed as a parameter, and the
function uses AddRef to increment the reference count of the object.
Since COM objects are responsible for their own lifetime, the resources they use
stay allocated until the reference count is 0. When it reaches 0, the objects
enter the zombie state and can be deallocated to free resources.
Refer to the COM object management documentation for more detailed information.
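Here is a minimal sketch of the reference counting idea. It is not how the VB runtime implements it; the class is a made-up stand-in that only shows the AddRef on entry and Release on exit that bracket an event handler. (In a 32-bit COM vtable the slots are QueryInterface at +0, AddRef at +4 and Release at +8, which matches the call [edx+4] and call [ecx+8] instructions in the listings.)

```python
class ComObjectSketch:
    """Hypothetical stand-in for a COM object such as the Form."""

    def __init__(self):
        self.ref_count = 1      # the creator holds the first reference
        self.released = False   # becomes True once resources may be freed

    def add_ref(self):
        self.ref_count += 1
        return self.ref_count

    def release(self):
        self.ref_count -= 1
        if self.ref_count == 0:
            self.released = True  # nobody uses the object any more
        return self.ref_count


form = ComObjectSketch()
print(form.add_ref())   # 2: the handler takes its own reference on entry
print(form.release())   # 1: and gives it back on exit
print(form.released)    # False: the creator still holds a reference
```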
Right after the call to the Zombie_AddRef Function there are MOV instructions
which assign values to many variables. They are followed by a reference to the
"Ssup" string and then a call to the rtcMsgbox procedure.
Why does it seem so weird? Shouldn't it simply call the rtcMsgbox Function?
To find out, start OllyDebug and load the Executable file by pressing F3.
After the program is loaded, press Alt+E to open the Executable Modules Window.
Double click USER32.DLL to open the disassembled listing of the User32.dll file.
From there press Ctrl+N to open the Imports/Exports Window. Then Scroll over
till you see the MessageBoxA and MessageBoxW Functions. Click them one at a time
and press F2 to set a breakpoint.
Now press F9 to run the program. The Application should open. Click the
CommandButton. Now instead of the Debugger halting at a breakpoint of MessageBox,
the MessageBox comes up without any halt to the Debugger.
Why does this happen? Does this mean that rtcMsgBox has a separate copy of the
MessageBox code within itself? Though it seems like a possible reason, it is
unlikely, as Microsoft Developers built the Windows API so that its functions
could be reused. So that means that some other API Function is called
which displays the MessageBox.
So let us try another experiment. In the same Imports/Exports Section of
User32.dll we see 2 more MessageBox functions, MessageBoxIndirectA and
MessageBoxIndirectW. Let's try setting a breakpoint on both these functions.
After the breakpoints are set, press F9, and click the Command Button.
This time, the Debugger halts at the MessageBoxIndirectA function.
Interesting, isn't it? All Visual BASIC Applications which use the Msgbox()
Function actually call MessageBoxIndirectA and not MessageBox as thought.
Only One Parameter? So then how is the Message Body and Title passed to the
Function? For that we'll need to see the declaration of the MSGBOXPARAMS
Structure.
This suggests that the required parameters are assigned to the members of the
structure and a reference to that structure is passed to the function.
That in turn suggests that the many MOV instructions found before the rtcMsgbox
call are used to initialise the MSGBOXPARAMS Structure.
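To picture what "initialising the structure" means, here is a sketch of the ANSI variant of MSGBOXPARAMS written with ctypes. The field names and their order follow the Windows SDK declaration; the ctypes types are my approximations of the Windows typedefs so that the sketch runs on any platform, which means the sizes need not match the real 32-bit layout. The caption "Project1" is a made-up value, and nothing here actually calls MessageBoxIndirectA.

```python
import ctypes


class MSGBOXPARAMSA(ctypes.Structure):
    _fields_ = [
        ("cbSize", ctypes.c_uint),                # UINT, size of this structure
        ("hwndOwner", ctypes.c_void_p),           # HWND
        ("hInstance", ctypes.c_void_p),           # HINSTANCE
        ("lpszText", ctypes.c_char_p),            # LPCSTR, the message body
        ("lpszCaption", ctypes.c_char_p),         # LPCSTR, the title
        ("dwStyle", ctypes.c_uint32),             # DWORD, MB_* flags
        ("lpszIcon", ctypes.c_char_p),            # LPCSTR
        ("dwContextHelpId", ctypes.c_size_t),     # DWORD_PTR
        ("lpfnMsgBoxCallback", ctypes.c_void_p),  # MSGBOXCALLBACK
        ("dwLanguageId", ctypes.c_uint32),        # DWORD
    ]


# Each assignment below plays the role of one of the MOV instructions.
params = MSGBOXPARAMSA()
params.cbSize = ctypes.sizeof(MSGBOXPARAMSA)
params.lpszText = b"Ssup"
params.lpszCaption = b"Project1"  # hypothetical title
params.dwStyle = 0                # MB_OK

print(params.lpszText, params.lpszCaption, params.dwStyle)
```

A single pointer to this block is all that MessageBoxIndirectA receives, which is why only one parameter shows up at the call.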
To confirm our suspicion, let's compare the MOV instructions with the code found
before the MessageBoxIndirect function is called.
loc_734A6133:
Next comes the __vbaFreeVarList Function. From its name we can see that it
deallocates a certain number of variables. This function actually does no work
except call the __vbaFreeVar Function a number of times.
The code is pretty easy to understand. This function frees the temporary
variables whose addresses are passed as arguments to it. Interestingly, each
memory location is 16 bytes wide.
This is an interesting function as it can accept a variable number of arguments.
Its equivalent function call in C would be:
__vbaFreeVarList(4,&var_24,&var_34,&var_44,&var_54);
public __vbaFreeVarList
__vbaFreeVarList proc near
freed_all_vars:
pop esi ; Value of esi is restored.
retn ; Return to the calling function
__vbaFreeVarList endp
As you can see, __vbaFreeVarList uses a while loop to free each variable one by
one using the __vbaFreeVar Function.
Notice that the address of the variable to be freed is stored in ECX always.
You can disassemble the __vbaFreeVar Function to confirm that.
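The loop described above can be modelled roughly like this. The Python functions are stand-ins I made up to mirror the shape of the runtime routines (a count followed by that many addresses); the dictionary plays the role of the stack variables, keyed by their offsets.

```python
def vba_free_var(variables, address):
    """Stand-in for __vbaFreeVar: frees the single variable at `address`."""
    variables[address] = None


def vba_free_var_list(variables, count, *addresses):
    """Stand-in for __vbaFreeVarList: frees `count` variables one by one."""
    index = 0
    while index < count:
        vba_free_var(variables, addresses[index])
        index += 1


# Temporaries 16 bytes apart, as in the listing (var_24, var_34, ...).
temps = {0x24: "a", 0x34: "b", 0x44: "c", 0x54: "d", 0x64: "kept"}
vba_free_var_list(temps, 4, 0x24, 0x34, 0x44, 0x54)
print(temps)
```

Only the four addresses that were passed get cleared; the entry at 0x64 is left alone.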
Now let us see what happens after the MessageBox is shown.
This is the most interesting part.
fake_a_call_instr:
retn
; ---------------------------------------------------------------------------
continue_after_jump:
; code
Hmm...this contains more offsets? By simply double-clicking the offsets you land
up at the destructor code again. That's why the CALL simulation code is used: so
that the destructor code looks like it's an inline function.
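If the push / jmp / retn trick is hard to picture, here is a tiny simulation of it. The mini "instruction set" and the runner are invented for this sketch; only the label names come from the listing. A PUSH of an address followed by a JMP to a RETN ends up at the pushed address, just as a RETN after a real CALL would.

```python
# Each entry: label -> (operation, argument, label of the next instruction).
program = {
    "body": ("push", "continue_after_jump", "body_jmp"),
    "body_jmp": ("jmp", "fake_a_call_instr", None),
    "fake_a_call_instr": ("retn", None, None),
    "continue_after_jump": ("halt", None, None),
}


def run(program, start):
    """Follow control flow and return the labels visited, in order."""
    stack, trace, pc = [], [], start
    while True:
        op, arg, nxt = program[pc]
        trace.append(pc)
        if op == "push":
            stack.append(arg)   # plant the "return address"
            pc = nxt
        elif op == "jmp":
            pc = arg
        elif op == "retn":
            pc = stack.pop()    # "returns" to whatever was pushed
        else:                   # halt
            return trace


print(run(program, "body"))
```

The trace ends at continue_after_jump even though no CALL instruction was ever executed.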
If you're more curious, you can also double-click the 'exception_handler' text
to see where that leads to.
Well, after a long journey into the Command1_Click() Procedure, we're finally
done analyzing it.
From this point onwards, I shall explain only the important section of code
rather than explain such intricate details once again.
Sub Main()
MsgBox "Ssup"
End Sub
What you will realise is that the Procedure code is an exact copy of the code we
dealt with earlier. This means that Form Procedures and Module Procedures are
treated alike. This also means that the Command Button procedure made no
use of any information from the Form Object.
Create a form without any controls. The code in the form module is as follows:
push ebp
mov ebp, esp
sub esp, 0Ch
push (offset vba_exception_handler+1)
mov eax, large fs:0
push eax
mov large fs:0, esp
sub esp, 0F0h
push ebx
push esi
push edi
mov [ebp+var_C], esp
mov [ebp+var_8], offset destructor
mov eax, [ebp+arg_0]
mov ecx, eax
and ecx, 1
mov [ebp+var_4], ecx
and al, 0FEh
push eax
mov [ebp+arg_0], eax
mov edx, [eax]
call dword ptr [edx+4] ; Zombie_AddRef()
xor eax, eax
mov ebx, 80020004h
mov edi, 0Ah
; code...
call ds:rtcInputBox
mov edx, eax
lea ecx, [ebp+var_18]
call ds:__vbaStrMove
push eax
call ds:__vbaStrCmp
mov esi, eax
lea ecx, [ebp+var_18]
neg esi
sbb esi, esi
neg esi
neg esi
call ds:__vbaFreeStr
lea ecx, [ebp+var_88]
lea edx, [ebp+var_78]
push ecx
lea eax, [ebp+var_68]
push edx
lea ecx, [ebp+var_58]
push eax
lea edx, [ebp+var_48]
push ecx
push edx
lea eax, [ebp+var_38]
lea ecx, [ebp+var_28]
push eax
push ecx
push 7
call ds:__vbaFreeVarList
add esp, 20h
mov [ebp+var_50], ebx
test si, si
mov [ebp+var_58], edi
mov [ebp+var_40], ebx
mov [ebp+var_48], edi
mov [ebp+var_30], ebx
mov [ebp+var_38], edi
jz short jump_if_right
lea edx, [ebp+var_98]
lea ecx, [ebp+var_28]
mov [ebp+var_90], offset aWrong ; "wrong"
mov [ebp+var_98], 8
call ds:__vbaVarDup
lea edx, [ebp+var_58]
lea eax, [ebp+var_48]
push edx
lea ecx, [ebp+var_38]
push eax
push ecx
lea edx, [ebp+var_28]
push 0
push edx
call ds:rtcMsgBox
lea eax, [ebp+var_58]
lea ecx, [ebp+var_48]
push eax
lea edx, [ebp+var_38]
push ecx
lea eax, [ebp+var_28]
push edx
push eax
jmp short free_up_resources
; ---------------------------------------------------------------------------
jump_if_right:
free_up_resources:
push 4
call ds:__vbaFreeVarList
add esp, 14h
mov [ebp+var_4], 0
push offset call_emulate
jmp short jump_as_a_call
; ---------------------------------------------------------------------------
lea ecx, [ebp+var_18]
call ds:__vbaFreeStr
lea eax, [ebp+var_88]
lea ecx, [ebp+var_78]
push eax
lea edx, [ebp+var_68]
push ecx
lea eax, [ebp+var_58]
push edx
lea ecx, [ebp+var_48]
push eax
lea edx, [ebp+var_38]
push ecx
lea eax, [ebp+var_28]
push edx
push eax
push 7
call ds:__vbaFreeVarList
add esp, 20h
retn
; ---------------------------------------------------------------------------
jump_as_a_call:
retn
; ---------------------------------------------------------------------------
call_emulate:
Most of the code is irrelevant to us but the important chunk of code of the
function (which took me quite some time to find) is given below:
In the first portion of the code, the push instructions push all the seven
parameters to the function that creates the InputBox Dialogbox. I know that with
the example I'm disassembling, it's hard to believe that all the push
instructions stand for what I've mentioned. So what you can do is disassemble
the code given below and set a breakpoint on the PUSH instructions in the
MSVBVM60.DLL File using OllyDebug or SoftIce.
With this you can actually verify the contents of the push instructions to
confirm what I've written.
Now, after the call instruction you can see a PUSH instruction pushing the
contents of a variable. This is a parameter for the SysFreeString Function.
Next is a MOV instruction transferring the contents of the EAX register into EDI.
EAX at this point of time contains the address of the String that we filled in
the Text Box. This is done to save the value of EAX.
Then the SysFreeString Function is called. This function takes 1 argument which
is the string that needs to be deallocated. This function does not return any
value after execution.
This is deplorable code generated by the Visual C++ 6.0 Compiler (which was used
to build the MSVBVM60.DLL File), and since Visual BASIC programs use this
routine, it results in slow, sluggish programs.
And this is just one function...imagine what would happen if we analyzed all of
them?
Anyway, the original contents of the registers are restored, the stack frame is
closed (with LEAVE) and the function returns after adding 0x1C bytes to ESP.
call ds:rtcInputBox
mov edx, eax
lea ecx, [ebp+var_18]
call ds:__vbaStrMove
push eax
call ds:__vbaStrCmp
mov esi, eax
lea ecx, [ebp+var_18]
neg esi
sbb esi, esi
neg esi
neg esi
call ds:__vbaFreeStr
After the call of rtcInputBox, EAX contains the entered string in the Text Box
of the InputBox dialog. The address of this string is moved to EDX.
Is this done to save the contents of the String for the compare function?
As obvious as it may seem, it is really not like that. Let us see.
The __vbaStrMove function moves a String from one place in memory to another
place. On Analysis of the function code it is found that this function accepts
two arguments and returns one value as shown below:
PARAMETERS:
RETURNS : Source String. Sets EAX to EDI (which holds value of EDX)
Now we see that by copying the value of EAX into EDX, we are setting the entered
string as the SOURCE string to be copied into another location. This is neat.
Now the Source String is pushed again and the __vbaStrCmp Function is called.
This at first glance looks weird again. It seems that the StrCmp Function
accepts only one argument. Then what does it compare it with?
If you scroll above you will find a push instruction that pushes the address of
the string "Sanchit" ( push offset aSanchit ; "Sanchit" )
Such cases remind us that, unlike code generated by Pascal and C/C++ compilers,
Visual BASIC functions can have their arguments pushed anywhere and not just
right before the function call.
Keep this thing in mind when you set out to disassemble your own Visual BASIC
programs.
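A small stack model shows why an argument pushed early is still there when the call finally happens. Everything here is a made-up stand-in: the helper names, the "guess" string and the return convention of the compare (0 for equal, non-zero otherwise) are assumptions for the sketch.

```python
stack = []


def push(value):
    stack.append(value)


def call_with_one_arg_sketch():
    """Stand-in for any call in between: it removes only its own argument."""
    stack.pop()


def vba_str_cmp_sketch():
    """Stand-in for __vbaStrCmp: pops two strings, returns 0 when equal."""
    left = stack.pop()
    right = stack.pop()
    return 0 if left == right else 1


push("Sanchit")             # pushed early, long before the compare
push("some other arg")      # argument for an unrelated call in between
call_with_one_arg_sketch()  # that call cleans up only what it was given
push("guess")               # the entered string, pushed right before the call
result = vba_str_cmp_sketch()
print(result, stack)
```

Because every call in between removes exactly what was pushed for it, "Sanchit" is still sitting on the stack as the second argument.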
Now after the Comparing function returns its value via the EAX register, it is
copied to ESI.
Then the LEA instruction is used to load the address of a variable into the ECX
register. Now since we are aware of VB's tricks we know that this is a parameter
to the __vbaFreeStr function. You should notice that the same variable which
held the value of the entered string is now being passed to this function to
deallocate it, as it's not required after the comparison has been done.
neg esi
sbb esi, esi
neg esi
neg esi
call ds:__vbaFreeStr
The NEG statement's actual use is to change the sign of a number, for example
from 3 to -3.
But this one has an indirect use. This Instruction affects many flags, but the
one it is meant for is the Carry Flag (CF), which is used in the next SBB
Instruction. If ESI is equal to 0 then CF is reset to 0; otherwise it is set to 1.
Now the SBB Instruction stands for Subtraction with Borrow. In this case it can
be translated to this:
ESI = ESI - (ESI + CF);
The next two NEG instructions cancel each other out and are of no use. You can
take this as another example to show why VB code is slow.
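You can check the effect of the four instructions with a small 32-bit simulation. The helper names are mine; only the arithmetic (NEG setting CF when the operand is non-zero, SBB subtracting the source and CF) follows the instruction definitions.

```python
MASK = 0xFFFFFFFF  # keep every result within 32 bits


def neg(value):
    """NEG: two's complement negate; CF is 1 unless the operand was 0."""
    return (-value) & MASK, (1 if value != 0 else 0)


def sbb(dest, src, carry):
    """SBB: dest - (src + CF), truncated to 32 bits."""
    return (dest - src - carry) & MASK


def neg_sbb_neg_neg(esi):
    esi, cf = neg(esi)       # neg esi
    esi = sbb(esi, esi, cf)  # sbb esi, esi  -> 0 or 0xFFFFFFFF
    esi, _ = neg(esi)        # neg esi       -> 0 or 1
    esi, _ = neg(esi)        # neg esi       -> 0 or 0xFFFFFFFF again
    return esi


for value in (0, 1, 0xFFFFFFFF):
    print(hex(value), "->", hex(neg_sbb_neg_neg(value)))
```

A zero compare result stays 0 and anything else becomes 0xFFFFFFFF (-1). The value after the SBB is already the final one, which is why the last two NEGs add nothing.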
Following this is the __vbaFreeStr Function which deallocates space for a string
variable.
Now you might wonder: if a comparison has been performed, then why isn't there
any JUMP instruction?
If you keep reading the code you will find the TEST and JUMP Instructions after
the seven variables used have been cleared. These instructions are found
together in C/C++ and Pascal programs but in Visual BASIC this isn't always the
case.
That's why Visual BASIC programs take up more time for analysis compared to C
and Pascal programs.
I'm not discussing the rest of the code as it has been covered in the previous
section.
Remember the two useless NEG instructions before the __vbaFreeStr Function call?
Since these instructions are of no practical use, we can replace them
with XOR ESI, ESI which consumes only 2 bytes. That way the TEST instruction
would always result in ZERO, making control always jump to the Msgbox("Right")
code.
A lot of Hackers use such 'useless' code to implement such hacking techniques.
IX. CONCLUSION [END1]
This isn't exactly the end of the tutorial. I'd rather say that it is the end
for now. This topic is vast and I'm trying to include explanation and analysis
of all Visual BASIC functions. If I plan to release this tutorial with all
functions inclusive, it's going to take a lot of time.
So I've decided to post this tutorial first and keep updating it every 15 to 20
days. You can check the LAST UPDATED Section in the beginning of the Tutorial to
see how recently this tutorial has been updated.
Keep checking for updated versions every month.
I hope you've enjoyed my tutorial as I've put in a lot of hard work and time on
writing this.
Since I haven't come across any Books or Articles on this subject, I don't quite
know what exactly is expected from my tutorial. I would appreciate it if you
could email me suggestions and comments on this tutorial.
I may not be able to reply to every email, but I do read each one of them.
Thanks.
********************************************************************************
[EOF]