Intro:
A couple of weeks ago I spotted a suspicious binary on VirusTotal (VT). At the time it had only 11 hits, and it still has a valid digital signature. In this blog post I’m going to demonstrate how to statically deobfuscate its strings using IDAPython.
Let’s play with it inside IDA.
Reverse it:
After opening IDA, one of the first things I do is look for information that is rapidly understandable so that I get a quick grasp of the file in front of me.
Judging from the export table, this is a DLL with some interesting exported functions.
If you read my previous post, you’ll find this screenshot familiar: it’s the same DLL, and I’ll continue reversing it.
The EP of the DLL is inside the “.itext” section, and the arguments to functions are passed in EAX, EDX and ECX, which makes me think this PE was compiled with Delphi. I thought Delphi was slowly being forgotten, but it looks like it’s here to stay.
Here are a few embedded strings inside the DLL:
00000027827C 000000678E7C 0 VBoxService.exe
0000002758C0 0000006764C0 0 CONNECT
0000002758DC 0000006764DC 0 HTTP/1.0 200 OK
0000002B9538 0000006BA138 0 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
0000002B2618 0000006B3218 0 <|Folder|>
0000002B264C 0000006B324C 0 <|Files|>
0000002B2680 0000006B3280 0 <|DownloadFile|>
0000002B26B0 0000006B32B0 0 <|UploadFile|>
0000002B2DD8 0000006B39D8 0 SYSTEMSTART
0000002BB3E0 0000006BBFE0 0 W5EUYQ33UJ
00000024E8AA 00000064F4AA 0 X509_PUBKEY
00000024E78A 00000064F38A 0 X509_ALGOR
00000021C8BC 00000061D4BC 0 .ppt=application/mspowerpoint
00000021B094 00000061BC94 0 .exe=application/x-msdos-program
00000021B388 00000061BF88 0 .hpf=application/x-icq-hpf
00000021B3CC 00000061BFCC 0 .hqx=application/mac-binhex40
00000021B414 00000061C014 0 .hta=application/hta
00000021AE2C 00000061BA2C 0 .dll=application/x-msdos-program
00000021AE7C 00000061BA7C 0 .dmg=application/x-apple-diskimage
00000021AED0 00000061BAD0 0 .doc=application/msword
00000021AF0C 00000061BB0C 0 .dot=application/msword
00000021AF48 00000061BB48 0 .dvi=application/x-dvi
00000021E9CC 00000061F5CC 0 .c++=text/x-c++src
00000021EA00 00000061F600 0 .cpp=text/x-c++src
0000002B3440 0000006B4040 0 AA25CA6D9C4287AA5B80B12CF520C0
0000002B346C 0000006B406C 0 13BD7CB755EC6485BF4CF46E8BAB5D82A9CA
0000002B3564 0000006B4164 0 A629C377A3B22AC17F85
0000002B3588 0000006B4188 0 C40725CE79BA1307C571A7C3
0000002B35B0 0000006B41B0 0 F56CB7739A5CB8B9A8AC
0000002B35D4 0000006B41D4 0 51F30A34ED3E9A5087BE438E53838D
0000002B3600 0000006B4200 0 E51BDB10C64887AA56D35F9F47EB0F3CF15D
0000002B4114 0000006B4D14 0 DC1FC277F66A889C47E56D96BF6FF0011EB670F870ED6CA4BB3CC778E70706588183FD6A8F33369F38D45FC80161C90A57F061FB638B955EE66BEA
0000002B7570 0000006B8170 0 CA182FDC0E2BBB729AB35AFE36ED21D61145E77EAE113991538ABC4DE66681
Judging from the above strings the PE might have the following capabilities:
- look for virtual environments and adjust its behaviour accordingly
- communicate with a C&C server and get some commands
- might use some asymmetric encryption for communication
- some file manipulation and MIME determination
- has some encoded strings (on which I’ll focus my attention in this post)
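The encoded strings stand out from the rest of the dump because they are long runs of uppercase hex digits. A minimal sketch of how such candidates could be picked out of a strings dump follows; the regular expression and the minimum length of four bytes are my own assumptions, not something taken from the sample.

```python
import re

# Assumption: an encoded blob is an even-length run of uppercase hex digits.
# The minimum of 4 byte pairs is an arbitrary threshold chosen for this sketch.
HEX_BLOB = re.compile(r"^(?:[0-9A-F]{2}){4,}$")

def looks_encoded(s):
    """Return True if s looks like one of the hex-encoded blobs."""
    return bool(HEX_BLOB.match(s))

# A few strings taken from the dump above.
samples = [
    "VBoxService.exe",
    "W5EUYQ33UJ",
    "A629C377A3B22AC17F85",
    "AA25CA6D9C4287AA5B80B12CF520C0",
]

candidates = [s for s in samples if looks_encoded(s)]
print(candidates)
```

A filter like this will also match unrelated hex data such as hashes or certificate fingerprints, so it only narrows the search; the cross-references to the decryption routine remain the reliable source.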
Following the strings and searching for references in the code, I stumble upon an interesting routine that looks to be responsible for decryption. Searching for references to this function, I see 357 CALLs to it.
After a few minutes of fiddling with the subroutine, I see that the second CALL instruction points to a subroutine that contains a XOR with a value from the stack, “[ebp+var_20]”, inside a while loop. This might be it:
Decompiled code:
Now we have to open a debugger, place some breakpoints, understand the logic of the encryption routine and get the decryption key. Don’t forget to rebase the segments in IDA as shown in the previous post. After that, we can reimplement the algorithm to decrypt all of the strings and add comments to the IDA database. My debugging instance loaded the DLL at 0x01EA0000.
A few breakpoints later I stumble upon a move from memory to EAX that loads the password. At this point I don’t really care where the password comes from; I just want to confirm that it’s the same for every string. I keep running to check whether the same key is used with different encoded strings, and it is.
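If rebasing the IDA database is not convenient, addresses can also be translated by hand between the debugger and IDA. The sketch below assumes the IDA database uses an image base of 0x00400000, which is a hypothetical value not stated here; the debugger base 0x01EA0000 and the address 0x678E7C come from this post.

```python
def rebase(addr, from_base, to_base):
    """Translate an address from one image base to another."""
    return addr - from_base + to_base

DEBUGGER_BASE = 0x01EA0000  # where my debugging instance loaded the DLL
IDA_BASE = 0x00400000       # assumption: image base used by the IDA database

# Address of the "VBoxService.exe" string as listed in the strings dump.
ida_addr = 0x678E7C
dbg_addr = rebase(ida_addr, IDA_BASE, DEBUGGER_BASE)

print(hex(dbg_addr))
print(hex(rebase(dbg_addr, DEBUGGER_BASE, IDA_BASE)))
```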
I now have the password, but I still need to reverse the decryption algorithm. Looking at the XOR instruction, I see that the first character (0x35 or ‘5’) of the password is being xored against the second byte of the encoded string (0x44 or ‘D’). What happens with the first byte of the encoded string?
A few instructions later, a comparison is made against the previous byte from the encoded string:
- if the XOR result is smaller than the previous byte, add 0xFF and then subtract the previous byte.
- e.g.: 0x44 ^ 0x35 = 0x71; since 0x71 < 0xF9: 0x71 + 0xFF - 0xF9 = 0x77, ASCII ‘w’
- if the XOR result is bigger than the previous byte, subtract the previous byte.
- e.g.: 0xFB ^ 0x56 = 0xAD; since 0xAD > 0x44: 0xAD - 0x44 = 0x69, ASCII ‘i’
Looking at the decompiled code inside IDA, we can observe that the algorithm is correct. The other functions surrounding the main logic are there to fetch and convert the encoded string and password to hex representations.
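The two worked examples can be checked with a few lines of Python. This is a minimal sketch of a single decoding step as I understand it; the function name `decode_step` is mine, and the byte values are the ones from the examples above.

```python
def decode_step(enc_byte, key_byte, prev_byte):
    """Decode one byte: XOR with the key byte, then undo the offset
    that was derived from the previous encoded byte."""
    x = enc_byte ^ key_byte
    if x > prev_byte:
        return x - prev_byte
    return x + 0xFF - prev_byte

# Encoded bytes F9 44 FB; the first byte (0xF9) is only used as the
# initial "previous byte". Key bytes are 0x35 ('5') and 0x56.
first = decode_step(0x44, 0x35, 0xF9)
second = decode_step(0xFB, 0x56, 0x44)

print(hex(first), chr(first))    # 0x77 w
print(hex(second), chr(second))  # 0x69 i
```

This also answers the earlier question about the first byte of the encoded string: it is never decoded itself, it only seeds the comparison for the second byte.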
Python reimplementation:
These are the steps to follow in the Python code:
- search for all references to the decryption function
- iterate through each address and collect the encoded strings
- hardcode the password somewhere in the code
- implement the decryption algorithm and apply it
- patch the IDA Database
    import idaapi, idc, idautils, re, json

    def GetString(instr):
        chars = []
        count = 0
        while(Byte(instr+count)):
            chars.append(chr(Byte(instr+count)))
            count+=1
        return "".join(chars)

    def IsAddress(instr):
        pattern = re.compile("0[a-fA-F0-9]{7}")
        if not pattern.match(instr):
            return False
        return True

    def DecodeString(encodedString):
        key = "" # paste the password here
        keyLen = len(key)
        encStrLen = len(encodedString)
        prevEncodedByte = 0
        encStrList = []
        keyList = []
        encodedString = [encodedString[i:i+2] for i in range(0, encStrLen, 2)]
        for byte in encodedString:
            encStrList.append(int(byte, 16))
        key = [key[i] for i in range(0, keyLen, 1)]
        for byte in key:
            keyList.append(ord(byte))
        prevEncodedByte = encStrList[0]
        encStrLen = len(encStrList)
        keyLen = len(keyList)
        keyOffset = 0
        decodedString = []
        for i in range(1,encStrLen):
            if(keyOffset>=keyLen):
                keyOffset = 0
            decodedByte = encStrList[i] ^ keyList[keyOffset]
            if(decodedByte>prevEncodedByte):
                decodedByte -= prevEncodedByte
            else:
                decodedByte = decodedByte + 0xFF - prevEncodedByte
            prevEncodedByte = encStrList[i]
            decodedString.append(chr(decodedByte))
            keyOffset += 1
        return("".join(decodedString))

    # ===========================================================================
    # ===========================================================================
    funcName = LocByName('fDecryptionRoutine')
    funcXrefs = CodeRefsTo(funcName,1)
    strings = []
    for xref in funcXrefs:
        intructs = 0
        tempInstr = xref
        while(intructs==0):
            tempInstr = PrevHead(tempInstr,SegStart(tempInstr))
            mnem = GetMnem(tempInstr)
            if(mnem == 'mov'):
                intructs = tempInstr
        encodedStrAddr = GetOperandValue(intructs,1)
        if(IsAddress("{:08X}".format(encodedStrAddr))):
            encodedStr = GetString(encodedStrAddr)
            decodedStr = DecodeString(encodedStr)
            print('CALL @ {:08X} with \"{}\" -> \"{}\"'.format(xref,encodedStr,decodedStr))
            MakeComm(xref,''.join(decodedStr))
    print("FIN")

I stripped out the decryption password; I’ll leave it to you to paste it into the code.
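Without the real password, the decoding logic can still be sanity-checked outside IDA. The sketch below is a standalone Python 3 version of the decoder together with an `encode` function that I derived by inverting the decoding steps; the encoder, the seed value and the key `NotTheRealKey` are my own assumptions for testing, not something recovered from the sample. Only the two key bytes `5` and `V` (0x35, 0x56) come from the worked example earlier in this post.

```python
from itertools import cycle

def decode(hex_string, key):
    """Decode a hex blob: byte 0 is the seed, the rest is XORed with the
    repeating key and shifted by the previous encoded byte."""
    if not key:
        raise ValueError("key must not be empty")
    data = bytes.fromhex(hex_string)
    out = []
    prev = data[0]
    for enc, k in zip(data[1:], cycle(key.encode("latin-1"))):
        x = enc ^ k
        out.append(x - prev if x > prev else x + 0xFF - prev)
        prev = enc
    return bytes(out).decode("latin-1")

def encode(text, key, seed=0x42):
    """Hypothetical inverse of decode(), for testing only.
    NUL characters cannot be represented by this scheme."""
    if not key:
        raise ValueError("key must not be empty")
    out = [seed]
    prev = seed
    for ch, k in zip(text.encode("latin-1"), cycle(key.encode("latin-1"))):
        x = ch + prev
        if x > 0xFF:
            x -= 0xFF  # wrap the same way the decoder un-wraps
        prev = x ^ k
        out.append(prev)
    return bytes(out).hex().upper()

# First three bytes of an encoded string and the first two key bytes.
print(decode("F944FB", "5V"))  # wi

# Round trip with a made-up key.
blob = encode("ipconfig /flushdns", "NotTheRealKey")
print(decode(blob, "NotTheRealKey"))
```

A round trip only shows that the two functions are consistent with each other; whether the decoder matches the sample still has to be confirmed against strings observed in the debugger.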
After running the code, comments have been added on each CALL and they are also displayed in the xref window.
And here are a few decrypted strings:
"DB11D6022D0F1B33E51139A941F5280ED178CB1DC11533A34650E4063B99E96A9547F85CE9748CCA096085ADF83AF612B562F66F84AC529682A147F415062ACE7AA55881A0F22172A82BCC062FCC73EF3AAC593DF45327B24DF15288BA63E21BCB71AA" -> "Hola, Enviamos un codigo como simulacion de transaccion para validar y sincronizar su dispositivo."
"1F51E94BFE24AC4125DF0C4AF92DE01664F13698B411CD78CD053CF31FA04690B8598C955FF40716B36A8FBF132BD47CEA1F26A197BA6ED8788EB16A9F3AC86F92F4084E3D9942944580A73F082AC318D1779E88" -> "En este momento estamos efectuando una modificación de seguridad en nuestra pagina."
"2B42E71A748EDB0336E4469C5D81B25E83DC12B36AB960F630669855F053953FE202329858BD8CCD0D36173AAE72BB64B1508B35CC7AA05044E11335E879B05380B24A40" -> "Los datos ingresados son incorrectos, por favor intente nuevamente."
"0E549FFE3CF35394BD6684975D87B8DCAE192DA45EF41547FD5494E61FB376E047E917BC4A8FA91546F153F86FD812C81BC61DB5497CA484" -> "Su sistema será reiniciado para finalizar la operación."
"2D48E1157BB72DF024DB4B582E36E540F16B8DCEA33BED37FD45FD4CF0539681A341F1589F86A83F90B7964486AE94429E4DF754EF1FDD15BCA5" -> "Por favor, En caso de que los datos sean los siguientes:"
"EF00381930F55E90F42FDD788BBC7099EB6282904C88A13EFE589CEE18BF75D47FA057FA345928A929D2B35D8B559E5B4EFC5EE71BC8738E9A5BF8210245F026CB77F7" -> "En este momento no podemos atenderle, por favor intente mas tarde."
"90E41FCD79B12FF122D978EF093E1FC579A145944A9CB01D" -> "Administrador de tareas"
"0962F50D1C3542D561FE7F" -> "NAVEGADOR="
"0F5EE416C56FD76E9747E0" -> "\Trusteer\"
"1D45DC68E371F141C06F9C22DD083711056FED096488BC1210748481B92CCE073DE21545CC7FA329BE65" -> "SELECT Caption FROM Win32_OperatingSystem"
"73F36FFA0E076FDC432E37BA5CEC4CCB68E41D41E769933EDB78B047F46B8F" -> "SELECT * FROM AntiVirusProduct"
"1D6B9795" -> "AV="
"214DE00E3CFD41E74CED0C4CE147D5718DD66DE01E" -> "Windows Task Manager"
"245E9B4AFF3AAB989F45F45792B2" -> "Google Chrome"
"2F49EE28DF1AB2472324CE7281A2538D" -> "Mozilla Firefox"
"9CD0699C4CE8639B54393D86A048F62FC601" -> "Internet Explorer"
"79C579A65A9731F62E25C22CCE7B9F47E172B214C511CE0723A1AA5C8DC91075" -> "winmgmts:\\localhost\root\cimv2"
"82DE193DE2075991BC9183DA11C37698AE25CB" -> "ipconfig /flushdns"
"AEC2678F4CE76F87E30433609540E50922A99ACC7FAC5DF531A15C9C4290A33AEF1A0C" -> "Ingrese el codigo de confirmacion."
"83E27284819AD77787EA59AC9A8C94A795C07ADE0343E52DC7768C" -> "TASKKILL /F /IM chrome.exe"
"B030273BC842BF5FEF5DCF25130711186AE80141EB619D2F085B84BB" -> "TASKKILL /F /IM firefox.exe"
"FA7AED0005067B94A489FB71C7B3BD4D3F90A62DD00E309B54B44D97B4" -> "TASKKILL /F /IM iexplore.exe"
"E40C1B2F342946C14A271568DE4BD565D779BE12CA1FC2162ABB4AEC1EA74742E11EC7" -> "TASKKILL /F /IM itauaplicativo.exe"
"B01027DB0922BF7FAF82F26F8A" -> "taskkill /im"
"9AC341F116DC609241EB0B6B82A05287A820D57899C360F7094E9254F80FD470A44EF854FE5D9E23BC6592A9DB0329DF7AA7389FADAF6B89CC43F629DA748F41E7" -> "\Software\Microsoft\Windows\CurrentVersion\Explorer\AutoComplete"
"BE2ED766E2618C9EB4B452AC40E618C167EE0648E972963CF5539D432B4ACE1C19CA79D26CE3096EE31FD37ADC1C22" -> "\SOFTWARE\Microsoft\Windows NT\CurrentVersion\"
"D903000E1A38BBAF46C6609E5E84B76186CD67E90B51F75F9630F036DA2DF25282A356F31EBA7CDE78AE5CFF678BA3" -> "\SOFTWARE\Microsoft\Windows\CurrentVersion\Run"
"F944FB2BD668E1053C3BD40238E2072EC8092BAC6CFB23AB4EF4060B38A84EF727DF1D59EA6085DB7EF9" -> "winmgmts:\\localhost\root\SecurityCenter2"
Here is the hash of the DLL:
- 033489ac01edb282a139a19058fb746db01f62c5c70bf49cc34e5cc35130cd4b
Conclusion:
Some tasks may look intimidating, but decrypting strings statically is worth doing, because malware often decrypts only what it needs at a given moment and might even encrypt it back afterwards. Debugging the PE therefore does not guarantee that you will see every encoded string, and it might also be more time consuming than reversing the algorithm.
I hope this is useful for some of you, C yeah!