deobfuscate push ret to jmp #2539
-
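A lot of malware obfuscates its control flow by replacing a direct jmp with a push of the target address followed by a ret.
To deobfuscate it, I need to change each push/ret pair back into a single jmp to the pushed address.
The only way I know to do that in Binary Ninja is to manually nop the ret and then patch the push into a jmp.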
Replies: 3 comments
-
I think you can write a script to look for that particular pattern and automate the patch. Also, once the analysis pass feature arrives, you could write a pass to simplify the IL and deobfuscate these sequences automatically.
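To make the pattern concrete outside of any particular tool, here is a minimal byte-level sketch of that idea. It assumes 32-bit x86 and only the simplest encoding: push imm32 (opcode 0x68) immediately followed by a near ret (0xC3), rewritten as jmp rel32 (0xE9) plus one nop. The function name patch_push_ret and its parameters are made up for illustration. It is a naive linear scan, so it can also match bytes that merely look like the pattern inside data or in the middle of another instruction; a disassembler-driven script avoids that.

```python
import struct

PUSH_IMM32 = 0x68  # push imm32
RET_NEAR = 0xC3    # ret (near, no stack adjustment)
JMP_REL32 = 0xE9   # jmp rel32
NOP = 0x90


def patch_push_ret(code, base):
    """Rewrite each `push imm32; ret` in `code` (loaded at address `base`)
    as `jmp target; nop` and return the patched bytes."""
    out = bytearray(code)
    i = 0
    while i + 6 <= len(out):
        if out[i] == PUSH_IMM32 and out[i + 5] == RET_NEAR:
            (target,) = struct.unpack_from('<I', out, i + 1)
            # rel32 is measured from the end of the 5-byte jmp
            rel = (target - (base + i + 5)) & 0xFFFFFFFF
            struct.pack_into('<BI', out, i, JMP_REL32, rel)
            out[i + 5] = NOP  # the old ret byte becomes padding
            i += 6
        else:
            i += 1
    return bytes(out)


if __name__ == '__main__':
    # nop; push 0x401000; ret  (loaded at 0x401000)
    sample = bytes([0x90, 0x68, 0x00, 0x10, 0x40, 0x00, 0xC3])
    print(patch_push_ret(sample, 0x401000).hex())
```

The pair is 6 bytes and the jmp is 5, which is why a single nop is enough here. Shorter push encodings would not leave room for a 5-byte jmp, so a real script needs to check the available space before writing.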
-
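The best approach to this is to modify the architecture via an architecture hook. For example:
https://github.com/Vector35/binaryninja-api/blob/dev/examples/x86_extension/src/x86_extension.cpp
https://github.com/Vector35/binaryninja-api/blob/dev/python/examples/arch_hook.py
What you'd want to do is modify that second example to detect instances of this pattern and change the lifting to consume both instructions and return a jump to the target.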
-
Both approaches work. The architecture hook is more complicated, but it has the advantage of intervening early enough to prevent the return instructions from creating a bunch of spurious function boundaries. In the attached .zip, see

Approach 1: script

def deobfuscate_function(bv, func):
    for block in func.basic_blocks:
        # iterating a basic block yields one (tokens, length) pair per instruction
        pairs = list(block)
        if not pairs:
            continue
        (instrs, lengths) = zip(*pairs)
        addrs = [block.start + sum(lengths[0:i]) for i in range(len(lengths))]
        for i in range(len(instrs)-1):
            if instrs[i][0].text == 'push' and (instrs[i+1][0].text in ['ret', 'retn']):
                source = 'jmp '+instrs[i][-1].text
                data = bv.arch.assemble(source, addrs[i])
                avail = lengths[i]+lengths[i+1]
                # skip if the jmp would overwrite bytes past the push+ret pair
                if len(data) > avail:
                    continue
                # pad the leftover bytes with nops
                data += b'\x90' * (avail - len(data))
                print('%08X: push+ret found, assembled %s to %s, patching...' % (addrs[i], source, data))
                bv.write(addrs[i], data)

def deobfuscate_all(bv):
    for func in bv.functions:
        deobfuscate_function(bv, func)

Approach 2: plugin

from binaryninja.architecture import Architecture, ArchitectureHook
from binaryninja.function import InstructionTextToken, InstructionInfo
from binaryninja.enums import InstructionTextTokenType, BranchType

class X86DeobfuscateHook(ArchitectureHook):
    # test whether the data qualifies for the push/ret deobfuscation
    # return (destination, length) if it qualifies
    # return None otherwise
    def qualifies(self, data, addr):
        arch = super(X86DeobfuscateHook, self)
        toks_a, len_a = arch.get_instruction_text(data, addr)
        if not toks_a:
            return None
        if toks_a[0].text == 'push':
            toks_b, len_b = arch.get_instruction_text(data[len_a:], addr+len_a)
            # the second instruction may fail to decode (e.g. not enough data)
            if not toks_b:
                return None
            if toks_b[0].text in ['ret', 'retn']:
                tok_dest = toks_a[-1]
                # only a pushed immediate has a known target; leave e.g. push eax alone
                if not tok_dest.text.startswith('0x'):
                    return None
                return (int(tok_dest.text, 16), len_a + len_b)
        return None

    def get_instruction_text(self, data, addr):
        tmp = self.qualifies(data, addr)
        if not tmp:
            return super(X86DeobfuscateHook, self).get_instruction_text(data, addr)
        (push_addr, length) = tmp
        print('%08X: push+ret found, disassembling as jmp to 0x%X' % (addr, push_addr))
        tok_jmp = InstructionTextToken(InstructionTextTokenType.InstructionToken, "jmp")
        tok_space = InstructionTextToken(InstructionTextTokenType.TextToken, ' ')
        tok_dest = InstructionTextToken(InstructionTextTokenType.PossibleAddressToken, hex(push_addr), push_addr)
        return [tok_jmp, tok_space, tok_dest], length

    def get_instruction_info(self, data, addr):
        tmp = self.qualifies(data, addr)
        if not tmp:
            return super(X86DeobfuscateHook, self).get_instruction_info(data, addr)
        (push_addr, length) = tmp
        print('%08X: push+ret found, informing binja of unconditional branch to 0x%X' % (addr, push_addr))
        result = InstructionInfo()
        result.length = length
        result.add_branch(BranchType.UnconditionalBranch, push_addr)
        return result

X86DeobfuscateHook(Architecture['x86']).register()
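
One way to keep the matching rule easy to check is to pull the decision out into a plain function that works on token text, so it can be exercised without Binary Ninja loaded. The sketch below is only an illustration: push_ret_target is a hypothetical helper, and the token lists are assumed to be the .text values of the disassembly tokens (mnemonic first, operand last). It also declines a ret that carries an operand, since a ret that adjusts the stack is not equivalent to a plain jmp.

```python
def push_ret_target(first_tokens, second_tokens):
    """Return the jump target if the two token-text lists form
    `push <immediate>; ret`, otherwise None."""
    if not first_tokens or not second_tokens:
        return None
    if first_tokens[0].strip() != 'push':
        return None
    if second_tokens[0].strip() not in ('ret', 'retn'):
        return None
    # a ret with an operand (e.g. `retn 0x8`) also pops extra stack bytes
    if len([t for t in second_tokens if t.strip()]) != 1:
        return None
    operand = first_tokens[-1].strip()
    # only an immediate gives a statically known target
    if not operand.startswith('0x'):
        return None
    try:
        return int(operand, 16)
    except ValueError:
        return None


if __name__ == '__main__':
    print(push_ret_target(['push', ' ', '0x401000'], ['retn']))
    print(push_ret_target(['push', ' ', 'eax'], ['retn']))
```

Inside the hook, qualifies could pass [t.text for t in toks_a] and [t.text for t in toks_b] to a helper like this and keep only the length bookkeeping for itself.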