Deobfuscating strings with IDAPython

Intro:

A couple of weeks ago I spotted a suspicious binary on VirusTotal (VT). At the time it had only 11 detections, and it still carries a valid digital signature. In this blog post I’m going to demonstrate how to statically deobfuscate its strings using IDAPython.

Let’s play with it inside IDA.

Reverse it:

After opening the file in IDA, one of the first things I do is look for information that is quick to digest, so that I get a rough idea of what is in front of me.

Judging from the export table, this is a DLL with some interesting exported functions.

[Screenshot: export table]

If you read my previous post, you’ll find this screenshot familiar: it’s the same DLL, and I’ll continue reversing it here.

The entry point (EP) of the DLL is inside the “.itext” section, and function arguments are passed in EAX, EDX and ECX, which makes me think this PE was compiled with Delphi. I thought Delphi was slowly being forgotten, but it looks like it’s here to stay.

[Screenshot: Delphi]

Here are a few embedded strings inside the DLL:

00000027827C   000000678E7C      0   VBoxService.exe
0000002758C0   0000006764C0      0   CONNECT
0000002758DC   0000006764DC      0   HTTP/1.0 200 OK
0000002B9538   0000006BA138      0   Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
0000002B2618   0000006B3218      0   <|Folder|>
0000002B264C   0000006B324C      0   <|Files|>
0000002B2680   0000006B3280      0   <|DownloadFile|>
0000002B26B0   0000006B32B0      0   <|UploadFile|>
0000002B2DD8   0000006B39D8      0   SYSTEMSTART
0000002BB3E0   0000006BBFE0      0   W5EUYQ33UJ
00000024E8AA   00000064F4AA      0   X509_PUBKEY
00000024E78A   00000064F38A      0   X509_ALGOR
00000021C8BC   00000061D4BC      0   .ppt=application/mspowerpoint
00000021B094   00000061BC94      0   .exe=application/x-msdos-program
00000021B388   00000061BF88      0   .hpf=application/x-icq-hpf
00000021B3CC   00000061BFCC      0   .hqx=application/mac-binhex40
00000021B414   00000061C014      0   .hta=application/hta
00000021AE2C   00000061BA2C      0   .dll=application/x-msdos-program
00000021AE7C   00000061BA7C      0   .dmg=application/x-apple-diskimage
00000021AED0   00000061BAD0      0   .doc=application/msword
00000021AF0C   00000061BB0C      0   .dot=application/msword
00000021AF48   00000061BB48      0   .dvi=application/x-dvi
00000021E9CC   00000061F5CC      0   .c++=text/x-c++src
00000021EA00   00000061F600      0   .cpp=text/x-c++src
0000002B3440   0000006B4040      0   AA25CA6D9C4287AA5B80B12CF520C0
0000002B346C   0000006B406C      0   13BD7CB755EC6485BF4CF46E8BAB5D82A9CA
0000002B3564   0000006B4164      0   A629C377A3B22AC17F85
0000002B3588   0000006B4188      0   C40725CE79BA1307C571A7C3
0000002B35B0   0000006B41B0      0   F56CB7739A5CB8B9A8AC
0000002B35D4   0000006B41D4      0   51F30A34ED3E9A5087BE438E53838D
0000002B3600   0000006B4200      0   E51BDB10C64887AA56D35F9F47EB0F3CF15D
0000002B4114   0000006B4D14      0   DC1FC277F66A889C47E56D96BF6FF0011EB670F870ED6CA4BB3CC778E70706588183FD6A8F33369F38D45FC80161C90A57F061FB638B955EE66BEA
0000002B7570   0000006B8170      0   CA182FDC0E2BBB729AB35AFE36ED21D61145E77EAE113991538ABC4DE66681

Judging from the above strings, the PE might have the following capabilities:

  • look for virtual environments and adjust its behaviour accordingly
  • communicate with a C&C server and get some commands
  • might use some asymmetric encryption for communication
  • some file manipulation and MIME determination
  • has some encoded strings (on which I’ll focus my attention in this post)
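One quick way to separate the encoded candidates from the ordinary strings is to keep only entries that consist entirely of uppercase hex byte pairs. This is a minimal sketch, assuming the strings were exported to a plain Python list; the helper name `looks_hex_encoded` and the minimum length of 8 characters are my own choices for illustration, not something taken from the binary.

```python
import re

# Whole string must be uppercase hex byte pairs, at least 4 pairs long.
# The minimum length is an arbitrary threshold chosen for this sketch.
HEX_BLOB = re.compile(r"(?:[0-9A-F]{2}){4,}")

def looks_hex_encoded(s):
    """Return True if s consists only of uppercase hex byte pairs."""
    return HEX_BLOB.fullmatch(s) is not None

# A handful of strings copied from the listing above
samples = [
    "VBoxService.exe",
    "W5EUYQ33UJ",
    "X509_PUBKEY",
    "A629C377A3B22AC17F85",
    "AA25CA6D9C4287AA5B80B12CF520C0",
]
candidates = [s for s in samples if looks_hex_encoded(s)]
print(candidates)
```

A filter like this only narrows things down. Whether a candidate is really an argument to the decryption routine still has to be confirmed through the cross-references.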

Following the strings and searching for references in the code, I stumble upon an interesting routine that looks to be responsible for decryption. Searching for references to this function, I see 357 CALLs to it.

[Screenshot: xrefs to the decode function]

[Screenshot: 357 calls]

After a few minutes of fiddling with the subroutine, I see that the second CALL instruction points to a subroutine that contains a XOR with a value from the stack, “[ebp+var_20]”, inside a while loop. This might be it:

[Screenshot: decryption loop]

Decompiled code:

[Screenshot: XOR loop]

Now we have to open a debugger, place some breakpoints, understand the logic of the encryption routine and also get the decryption key. Don’t forget to rebase the segments in IDA as shown in the previous post. After that, we can continue by reimplementing the algorithm to decrypt all of the strings and add comments to the IDA database. My debugging instance loaded the DLL at 0x01EA0000.
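When switching between IDA and the debugger, it helps to translate addresses between the two image bases. Here is a minimal sketch of that arithmetic. The debugger base 0x01EA0000 is the one mentioned above, while the static base of 0x00400000 is an assumption (check the actual image base in your own database), and `rebase` is a hypothetical helper name.

```python
def rebase(addr, old_base, new_base):
    """Translate addr from an image loaded at old_base to one at new_base."""
    return addr - old_base + new_base

IDA_BASE = 0x00400000  # assumed static image base, verify it in your database
DBG_BASE = 0x01EA0000  # base address seen in the debugging session

# Virtual address of one of the encoded strings from the strings listing
print("{:08X}".format(rebase(0x006B4040, IDA_BASE, DBG_BASE)))
```

Rebasing the whole database in IDA makes this unnecessary, but the helper is handy when comparing notes taken before and after the rebase.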

A few breakpoints later I stumble upon a move from memory to EAX. At this point I don’t really care where the password is taken from; I just want to confirm that it’s the same for every string. So I continue running to check whether the same key is used with different encoded strings, and my expectation turns out to be correct.

[Screenshot: password]

I now have the password, but I still need to reverse the decryption algorithm. Looking at the XOR instruction, I see that the first character (0x35 or ‘5’) of the password is being xored against the second byte of the encoded string (0x44 or ‘D’). What happens with the first byte of the encoded string?

[Screenshot: XOR byte]

A few instructions later, a comparison is made against the previous byte from the encoded string:

  • if the xor result is smaller than the prev. byte, continue by adding 0xFF and then subtracting the prev. byte value.
    • eg: 0x44 ^ 0x35 = 0x71; if(0x71 < 0xF9): 0x71+0xFF-0xF9 = 0x77 ; ASCII ‘w’
  • if the xor result is bigger than the prev. byte, continue by subtracting the prev. byte.
    • eg: 0xFB ^ 0x56 = 0xAD; if(0xAD > 0x44): 0xAD-0x44 = 0x69 ‘i’
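The two cases can be checked with a few lines of Python. This is only a sketch of a single decoding step as described above; `decode_step` is a name I made up, and I am assuming that the 0x56 in the second example is the next key byte.

```python
def decode_step(enc_byte, key_byte, prev_enc_byte):
    """Decode one byte: XOR with the key, then undo the previous-byte offset."""
    x = enc_byte ^ key_byte
    if x > prev_enc_byte:
        return x - prev_enc_byte
    # XOR result is not bigger than the previous byte: add 0xFF first
    return x + 0xFF - prev_enc_byte

# The two examples from the text: encoded bytes F9 44 FB, key bytes 0x35 and 0x56
first = decode_step(0x44, 0x35, 0xF9)
second = decode_step(0xFB, 0x56, 0x44)
print(chr(first) + chr(second))
```

Note that the very first encoded byte (0xF9 here) is never decoded on its own. It only serves as the "previous byte" for the first real character.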

[Screenshot: comparison with the previous byte]

Looking at the decompiled code in IDA, we can confirm that this reading of the algorithm is correct. The other functions surrounding the main logic are there to fetch the encoded string and the password and convert them to hex representations.

[Screenshot: decompiled XOR]

Python reimplementation:

These are the steps to follow in the Python code:

  1. search for all references to the decryption function
  2. iterate through each address and collect the encoded strings
  3. hardcode the password somewhere in the code
  4. implement the decryption algorithm and apply it
  5. patch the IDA Database

import idaapi, idc, idautils, re

def GetString(instr):
	chars = []
	count = 0
	while(Byte(instr+count)):
		chars.append(chr(Byte(instr+count)))
		count+=1
	return "".join(chars)

def IsAddress(instr):
	pattern = re.compile("0[a-fA-F0-9]{7}")
	if not pattern.match(instr):
		return False
	return True

def DecodeString(encodedString):
	key = "" # paste the password here
	keyLen = len(key)
	encStrLen = len(encodedString)
	prevEncodedByte = 0
	encStrList = []
	keyList = []

	encodedString = [encodedString[i:i+2] for i in range(0, encStrLen, 2)]
	for byte in encodedString:
		encStrList.append(int(byte, 16))

	key = [key[i] for i in range(0, keyLen, 1)]
	for byte in key:
		keyList.append(ord(byte))

	prevEncodedByte = encStrList[0]

	encStrLen = len(encStrList)
	keyLen = len(keyList)
	keyOffset = 0

	decodedString = []
	for i in range(1,encStrLen):
		if(keyOffset>=keyLen):
			keyOffset = 0
		decodedByte = encStrList[i] ^ keyList[keyOffset]
		if(decodedByte>prevEncodedByte):
			decodedByte -= prevEncodedByte
		else:
			decodedByte = decodedByte + 0xFF - prevEncodedByte
		prevEncodedByte = encStrList[i]
		decodedString.append(chr(decodedByte))

		keyOffset += 1

	return("".join(decodedString))
# ===========================================================================
# Main: comment every CALL to the decryption routine with the decoded string
# ===========================================================================
funcAddr = LocByName('fDecryptionRoutine') # name given to the routine in the IDB
funcXrefs = CodeRefsTo(funcAddr,1)

for xref in funcXrefs:
	# walk backwards from the CALL to the closest preceding 'mov'
	movAddr = 0
	tempInstr = xref
	segStart = SegStart(xref)
	while(movAddr==0):
		tempInstr = PrevHead(tempInstr,segStart)
		if(tempInstr == BADADDR):
			break # reached the start of the segment without finding a 'mov'
		if(GetMnem(tempInstr) == 'mov'):
			movAddr = tempInstr
	if(movAddr==0):
		continue

	# the second operand of the 'mov' is expected to be the encoded string address
	encodedStrAddr = GetOperandValue(movAddr,1)

	if(IsAddress("{:08X}".format(encodedStrAddr))):
		encodedStr = GetString(encodedStrAddr)
		decodedStr = DecodeString(encodedStr)
		print('CALL @ {:08X} with "{}" -> "{}"'.format(xref,encodedStr,decodedStr))
		MakeComm(xref,decodedStr)

print("FIN")

I stripped out the decryption password; I’ll leave that for you to paste into the code.
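If you want to try the algorithm outside IDA first, a standalone version is handy. The following is a minimal Python 3 sketch of the same logic. `decode_string` is a hypothetical name, the key is passed in as bytes, and the demo call uses only the two key bytes that show up in the worked examples earlier (0x35 and 0x56), so it can decode just a 3-byte prefix.

```python
def decode_string(encoded_hex, key):
    """Decode a hex-encoded string with a repeating key given as bytes."""
    if not key:
        raise ValueError("key must not be empty")
    data = bytes.fromhex(encoded_hex)
    if len(data) < 2:
        return ""  # nothing to decode after the seed byte
    out = bytearray()
    prev = data[0]  # the first byte only seeds the chain, it is not decoded
    for i, enc in enumerate(data[1:]):
        x = enc ^ key[i % len(key)]
        if x > prev:
            x -= prev
        else:
            x = x + 0xFF - prev
        out.append(x)
        prev = enc
    # latin-1 maps every byte value to a character, so accented text survives
    return out.decode("latin-1")

# Only the first two key bytes are known here, enough for a 3-byte prefix
print(decode_string("F944FB", b"\x35\x56"))
```

With the full password in place of the two-byte key, the same function should work on complete encoded strings, provided the password really is the same for every call as observed in the debugger.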

After running the code, comments have been added on each CALL and they are also displayed in the xref window.

[Screenshot: patched comment]

And here are a few decrypted strings:

"DB11D6022D0F1B33E51139A941F5280ED178CB1DC11533A34650E4063B99E96A9547F85CE9748CCA096085ADF83AF612B562F66F84AC529682A147F415062ACE7AA55881A0F22172A82BCC062FCC73EF3AAC593DF45327B24DF15288BA63E21BCB71AA" -> "Hola, Enviamos un codigo como simulacion de transaccion para validar y sincronizar su dispositivo."
"1F51E94BFE24AC4125DF0C4AF92DE01664F13698B411CD78CD053CF31FA04690B8598C955FF40716B36A8FBF132BD47CEA1F26A197BA6ED8788EB16A9F3AC86F92F4084E3D9942944580A73F082AC318D1779E88" -> "En este momento estamos efectuando una modificación de seguridad en nuestra pagina."
"2B42E71A748EDB0336E4469C5D81B25E83DC12B36AB960F630669855F053953FE202329858BD8CCD0D36173AAE72BB64B1508B35CC7AA05044E11335E879B05380B24A40" -> "Los datos ingresados son incorrectos, por favor intente nuevamente."
"0E549FFE3CF35394BD6684975D87B8DCAE192DA45EF41547FD5494E61FB376E047E917BC4A8FA91546F153F86FD812C81BC61DB5497CA484" -> "Su sistema será reiniciado para finalizar la operación."
"2D48E1157BB72DF024DB4B582E36E540F16B8DCEA33BED37FD45FD4CF0539681A341F1589F86A83F90B7964486AE94429E4DF754EF1FDD15BCA5" -> "Por favor,  En caso de que los datos sean los siguientes:"
"EF00381930F55E90F42FDD788BBC7099EB6282904C88A13EFE589CEE18BF75D47FA057FA345928A929D2B35D8B559E5B4EFC5EE71BC8738E9A5BF8210245F026CB77F7" -> "En este momento no podemos atenderle, por favor intente mas tarde."
"90E41FCD79B12FF122D978EF093E1FC579A145944A9CB01D" -> "Administrador de tareas"

"0962F50D1C3542D561FE7F" -> "NAVEGADOR="
"0F5EE416C56FD76E9747E0" -> "\Trusteer\"

"1D45DC68E371F141C06F9C22DD083711056FED096488BC1210748481B92CCE073DE21545CC7FA329BE65" -> "SELECT Caption FROM Win32_OperatingSystem"
"73F36FFA0E076FDC432E37BA5CEC4CCB68E41D41E769933EDB78B047F46B8F" -> "SELECT * FROM AntiVirusProduct"
"1D6B9795" -> "AV="
"214DE00E3CFD41E74CED0C4CE147D5718DD66DE01E" -> "Windows Task Manager"
"245E9B4AFF3AAB989F45F45792B2" -> "Google Chrome"
"2F49EE28DF1AB2472324CE7281A2538D" -> "Mozilla Firefox"
"9CD0699C4CE8639B54393D86A048F62FC601" -> "Internet Explorer"
"79C579A65A9731F62E25C22CCE7B9F47E172B214C511CE0723A1AA5C8DC91075" -> "winmgmts:\\localhost\root\cimv2"

"82DE193DE2075991BC9183DA11C37698AE25CB" -> "ipconfig /flushdns"
"AEC2678F4CE76F87E30433609540E50922A99ACC7FAC5DF531A15C9C4290A33AEF1A0C" -> "Ingrese el codigo de confirmacion."
"83E27284819AD77787EA59AC9A8C94A795C07ADE0343E52DC7768C" -> "TASKKILL /F /IM chrome.exe"
"B030273BC842BF5FEF5DCF25130711186AE80141EB619D2F085B84BB" -> "TASKKILL /F /IM firefox.exe"
"FA7AED0005067B94A489FB71C7B3BD4D3F90A62DD00E309B54B44D97B4" -> "TASKKILL /F /IM iexplore.exe"
"E40C1B2F342946C14A271568DE4BD565D779BE12CA1FC2162ABB4AEC1EA74742E11EC7" -> "TASKKILL /F /IM itauaplicativo.exe"
"B01027DB0922BF7FAF82F26F8A" -> "taskkill /im"

"9AC341F116DC609241EB0B6B82A05287A820D57899C360F7094E9254F80FD470A44EF854FE5D9E23BC6592A9DB0329DF7AA7389FADAF6B89CC43F629DA748F41E7" -> "\Software\Microsoft\Windows\CurrentVersion\Explorer\AutoComplete"
"BE2ED766E2618C9EB4B452AC40E618C167EE0648E972963CF5539D432B4ACE1C19CA79D26CE3096EE31FD37ADC1C22" -> "\SOFTWARE\Microsoft\Windows NT\CurrentVersion\"
"D903000E1A38BBAF46C6609E5E84B76186CD67E90B51F75F9630F036DA2DF25282A356F31EBA7CDE78AE5CFF678BA3" -> "\SOFTWARE\Microsoft\Windows\CurrentVersion\Run"

"F944FB2BD668E1053C3BD40238E2072EC8092BAC6CFB23AB4EF4060B38A84EF727DF1D59EA6085DB7EF9" -> "winmgmts:\\localhost\root\SecurityCenter2"

Here is the SHA-256 hash of the DLL:

    • 033489ac01edb282a139a19058fb746db01f62c5c70bf49cc34e5cc35130cd4b

Conclusion:

Some tasks may look intimidating, but decrypting strings statically is a good thing to do because malware often decrypts only what it needs and might even encrypt it back afterwards. So debugging the PE will not guarantee iteration through every possible encoded string, and it might also be more time-consuming than the reversing process.

I hope this is useful for some of you, C yeah!
