BitDam Blog

How to Automate Investigation in IDA Python Scripting
Alex Livshiz
4 minutes & 16 seconds read · April 29, 2019

As a researcher in the cybersecurity field, I use IDA almost daily. It lets me reverse engineer executables and understand in depth what happens under the hood.

If you’re like me, the first time you opened IDA blew your mind. I’m not just talking about its GUI (which I think is great), but about the sheer amount of data IDA is able to extract from a Portable Executable (PE) file:

  • Strings inside the PE
  • Imports and exports
  • Functions, with their parameters and flows
  • And so much more

In this post I’m going to discuss IDA Python scripting, why I needed it, and why you should use it too.

Why did I use it?

IDA Python is great for scripting, especially when you can’t just search manually for what you’re looking for. I find it very convenient when investigating suspicious behavior in a certain DLL or extracting specific data.

After working with and analyzing various malicious EXEs and DLLs, I noticed that my methodology doesn’t change much. It always starts with:

  • Searching for interesting strings
  • Searching for WinAPI calls that may indicate an attempt to achieve persistence on the machine
  • Detecting obfuscated content
  • Etc.

IDA Python provides scripting capabilities that let me extract this data and save a lot of manual hassle. Moreover, if there’s other interesting information I want to extract (such as the size of the code section or debug section info), I can add it to the script for future use.
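The WinAPI step of that checklist can be sketched as a small triage helper. This is a minimal sketch, not my full script: the watch list and the names `PERSISTENCE_APIS`, `flag_persistence_imports` and `collect_import_names` are made up for illustration, and the IDA calls (`get_import_module_qty`, `enum_import_names`) are assumed to be available in your IDA Python build. The filtering logic is plain Python, so it can be tried outside IDA too.

```python
# Hypothetical watch list: a few WinAPI names often associated with persistence.
PERSISTENCE_APIS = ("RegSetValueEx", "RegCreateKeyEx", "CreateService")


def flag_persistence_imports(import_names, watch_list=PERSISTENCE_APIS):
    """Return the imports whose name starts with a watched API name.

    Prefix matching catches the ANSI/Unicode variants (e.g. RegSetValueExA/W).
    """
    return [name for name in import_names
            if any(name.startswith(api) for api in watch_list)]


def collect_import_names():
    """Gather imported function names from the open database (IDA only)."""
    import idaapi  # only importable inside IDA

    names = []

    def on_import(ea, name, ordinal):
        if name:  # imports by ordinal have no name
            names.append(name)
        return True  # keep enumerating

    for index in range(idaapi.get_import_module_qty()):
        idaapi.enum_import_names(index, on_import)
    return names


if __name__ == "__main__":
    try:
        imports = collect_import_names()
    except ImportError:
        # Outside IDA: fall back to a hand-written sample list.
        imports = ["RegSetValueExW", "CreateFileA", "CreateServiceA"]
    for name in flag_persistence_imports(imports):
        print(name)
```

A hit in this list is only a starting point for manual review; plenty of legitimate software touches the registry or installs services.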

Of course, there’s a lot more you can do with IDA. Everything that IDA displays, and much more, can be accessed using scripting.

Since IDA Python is thinly documented, here are a few code samples.

IDA Python Tutorial

To run a Python script in IDA, you need to make sure that you have IDA Python installed. I’m using IDA 6.5 and Python 2.7.

There are two ways to run your script:

1. Run your script directly from IDA, in the lower output window.

2. Inject your Python code into IDA.

To do so, create a .py file, write your code, and run IDA in the following way:

    idaq64.exe -c -A -T"Portable executable" -S"<<Your script path>>"
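If you batch-process many samples, the same command line can be assembled from Python. The helper below is only a sketch: `build_ida_command` and all the paths are hypothetical, and it assumes the file to analyze is passed as the last argument. It only builds the argument list; the launch itself is left commented out.

```python
def build_ida_command(ida_path, script_path, sample_path,
                      file_type="Portable executable"):
    """Build the batch-mode command line shown above as an argument list."""
    return [
        ida_path,
        "-c",                # start from a fresh database
        "-A",                # autonomous mode: no dialog boxes
        "-T" + file_type,    # file type to load the input as
        "-S" + script_path,  # script to run once the database is opened
        sample_path,         # assumption: the input file goes last
    ]


# Hypothetical paths, for illustration only.
command = build_ida_command(r"C:\IDA\idaq64.exe",
                            r"C:\scripts\dump_functions.py",
                            r"C:\samples\suspicious.dll")
print(command)
# To launch it on a machine with IDA installed:
# import subprocess; subprocess.call(command)
```

Passing a list avoids hand-quoting the whole command line, though a script path containing spaces may still need extra quoting inside the `-S` value.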

I’ll try to summarize the most useful and least documented APIs by providing a few examples.

Example 1 – Print All Functions

Let’s say I want to print all existing functions in the DLL. Here’s all the code you need:

from idaapi import *
from idautils import *
from idc import *

# Wait for IDA to finish its auto-analysis
autoWait()

# Get the entry point of the PE file
start_address = BeginEA()

# If there's no start address, there's probably no .text section in the PE file
if start_address == BADADDR:
    qexit(1)

# Go over all the functions in the segment that contains the entry point
for funcea in Functions(SegStart(start_address), SegEnd(start_address)):
    function_name = GetFunctionName(funcea)
    function_start = funcea
    function_end = FindFuncEnd(funcea)

    print("function name - {0}, start address - {1}, end address - {2}".format(
        function_name, str(function_start), str(function_end)))
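The script above prints addresses in decimal, while IDA’s listing shows them in hexadecimal. A small variation, sketched here with a hypothetical helper named `format_function_line`, formats each line in hex so the output can be matched against the disassembly view. The sample values stand in for what the IDA calls would return.

```python
def format_function_line(name, start, end):
    """Format one function record with hexadecimal addresses."""
    template = "function name - {0}, start address - 0x{1:X}, end address - 0x{2:X}"
    return template.format(name, start, end)


# Sample values stand in for what GetFunctionName/FindFuncEnd return inside IDA.
print(format_function_line("sub_401000", 0x401000, 0x401020))
```

Inside the loop, the call would take `function_name`, `function_start` and `function_end` in place of the sample values.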

Example 2 – Opcodes And Operands

IDA Python also provides an API for walking through opcodes and their operands. In this example, we iterate over all instructions in the “.text” section and print every address that is referenced from another address. In practice, this prints all function and location addresses.

from idaapi import *
from idautils import *
from idc import *

CODE_REFERENCE = "Code"
DATA_REFERENCE = "Data"

# Wait for IDA to finish its auto-analysis
autoWait()

# Get the entry point of the PE file
start_address = BeginEA()

# If there's no start address, there's probably no .text section in the PE file
if start_address == BADADDR:
    qexit(1)

# Go over all the instructions
for address in Heads(SegStart(start_address), SegEnd(start_address)):
    if isCode(GetFlags(address)):
        # Check if there are references to the address
        has_ref = False
        for ref in XrefsTo(address):
            ref_type = XrefTypeName(ref.type)
            if ref_type.startswith(CODE_REFERENCE) or ref_type.startswith(DATA_REFERENCE):
                has_ref = True
                break

        if has_ref:
            print(address)
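The loop above only reports addresses. To look at the opcodes and operands themselves, a sketch along these lines may help. It assumes the `idc` helpers `GetMnem` and `GetOpnd` from the IDA 6.x API, and `describe_instruction` is a hypothetical name. The accessors are passed in as parameters so the formatting logic can be exercised outside IDA with stand-in functions.

```python
def describe_instruction(address, get_mnemonic, get_operand, max_operands=3):
    """Return 'address: mnemonic op1, op2' for one instruction.

    get_mnemonic(address) and get_operand(address, index) are expected to
    return strings; an empty operand string marks the end of the operands.
    """
    operands = []
    for index in range(max_operands):
        text = get_operand(address, index)
        if not text:
            break
        operands.append(text)
    line = "0x{0:X}: {1}".format(address, get_mnemonic(address))
    if operands:
        line += " " + ", ".join(operands)
    return line


if __name__ == "__main__":
    try:
        import idc  # only importable inside IDA
        mnemonic, operand = idc.GetMnem, idc.GetOpnd
        address = idc.BeginEA()
    except ImportError:
        # Outside IDA: stand-ins describing a single fake 'mov eax, 1'.
        mnemonic = lambda ea: "mov"
        operand = lambda ea, n: ["eax", "1"][n] if n < 2 else ""
        address = 0x401000
    print(describe_instruction(address, mnemonic, operand))
```

Inside the `Heads` loop, calling `describe_instruction(address, GetMnem, GetOpnd)` instead of printing the bare address would show what each referenced instruction actually does.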

Summary

IDA Python is a great tool for extracting data from PE files. It enables basic scripting and exposes many cool APIs. In this post I explained the rationale behind using it and provided two easy-to-use code samples. Enjoy.