Research and Development

The term dynamic instrumentation refers to the act of monitoring the execution of a program in order to extract debug information, to measure code performance or to detect errors. Dynamic instrumentation can be used to measure function properties such as execution time, call counts and register status, or to generate call graphs.
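
As a rough illustration of these measurements, the following C sketch instruments a function by hand, recording its call count and accumulated CPU time. All names (probe_t, probe_enter, probe_leave, busy_work, probe_avg_seconds) are hypothetical and invented for this example. Real dynamic instrumentation tools insert equivalent probes automatically, without changes to the source code.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical per-function probe: call count and accumulated CPU time */
typedef struct {
    unsigned long calls;
    clock_t       total;   /* accumulated clock() ticks */
    clock_t       started; /* tick value at the last entry */
} probe_t;

/* Record entry into the measured function */
static void probe_enter(probe_t *p)
{
    assert(p != NULL);
    p->calls++;
    p->started = clock();
}

/* Record exit and accumulate the elapsed ticks */
static void probe_leave(probe_t *p)
{
    assert(p != NULL);
    p->total += clock() - p->started;
}

static probe_t work_probe = {0, 0, 0};

/* Function under measurement, instrumented by hand */
static unsigned long busy_work(unsigned long n)
{
    unsigned long i, sum = 0;

    probe_enter(&work_probe);
    for (i = 1; i <= n; i++)
        sum += i;
    probe_leave(&work_probe);
    return sum;
}

/* Average CPU seconds per call, or 0.0 if the function was never called */
static double probe_avg_seconds(const probe_t *p)
{
    if (p->calls == 0)
        return 0.0;
    return ((double)p->total / CLOCKS_PER_SEC) / (double)p->calls;
}
```

After a few calls to busy_work(), work_probe.calls holds the call count and probe_avg_seconds(&work_probe) gives the average time per call. The probe is not re-entrant, so it would need extra care for recursive or multi-threaded code.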

Tools

Several software solutions exist to help developers and researchers inspect the runtime behaviour of a program. Some of the most prominent are described below.

Two of the most widely used dynamic binary instrumentation tools are PIN and DynamoRIO. PIN is developed by Intel and provided by the University of Virginia, whereas DynamoRIO is a collaboration between Hewlett-Packard and MIT. Both are free to use, but only DynamoRIO is open source. PIN and DynamoRIO are equally useful, and the choice between them is usually a matter of personal taste.

mtrace

mtrace is the memory debugger included in the GNU C Library. The usage of mtrace(3) can be summarised as follows:

  1. Export the variable MALLOC_TRACE to point to the result logfile
  2. Include the header mcheck.h in the code
  3. Call mtrace() before allocating memory
  4. Call muntrace() at the end of the code (usually the main() function)
  5. Compile the program with debugging symbols (gcc prog.c -g)
  6. Read the output using: mtrace <exec_file> <mtrace_output_file>

The following example, extracted from the mtrace(3) man page, clarifies the usage:

#include <mcheck.h>
#include <stdlib.h>
#include <stdio.h>

int main()
{
   int j;

   mtrace();

   for (j = 0; j < 2; j++)
       malloc(100);            /* Never freed--a memory leak */

   calloc(16, 16);             /* Never freed--a memory leak */
   exit(EXIT_SUCCESS);
}

A small script can be created to automate the compilation and the analysis:

#!/bin/sh
export MALLOC_TRACE=mtrace.log
bin=prog
src=mtrace.c
gcc -g $src -o $bin
./$bin
mtrace $bin $MALLOC_TRACE

Running the script produces the following output:

$ ./mtrace.sh 

Memory not freed:
-----------------
           Address     Size     Caller
0x0000000002054460     0x64  at /home/user/mtrace.c:11 (discriminator 2)
0x00000000020544d0     0x64  at /home/user/mtrace.c:11 (discriminator 2)
0x0000000002054540    0x100  at /home/user/mtrace.c:15

As can be seen, mtrace(3) has detected the missing free() calls, which result in memory leaks.
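
For comparison, a leak-free variant of the example might look like the sketch below: every allocation is released, and muntrace() is called once tracing is no longer needed. The function name run_traced_allocations is hypothetical. The sketch assumes the GNU C Library, since mcheck.h is specific to it.

```c
#include <assert.h>
#include <mcheck.h>
#include <stdlib.h>

/* Allocate and release the same blocks as the leaking example.
 * Returns the number of blocks that were freed. */
static int run_traced_allocations(void)
{
    void *blocks[3];
    int j, freed = 0;

    mtrace();                       /* start logging to $MALLOC_TRACE */

    for (j = 0; j < 2; j++)
        blocks[j] = malloc(100);
    blocks[2] = calloc(16, 16);

    for (j = 0; j < 3; j++) {
        assert(blocks[j] != NULL);  /* allocation failure is not handled here */
        free(blocks[j]);            /* matching free() for every allocation */
        freed++;
    }

    muntrace();                     /* stop logging */
    return freed;
}
```

Calling this function from main() with MALLOC_TRACE exported should leave the mtrace report without any "Memory not freed" entries for these three blocks.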

Intel PIN tool

PIN is a framework for creating dynamic binary analysis tools for the i386 and x86-64 architectures. It can be used to analyse user space applications on Linux and Windows at run time, working directly on the compiled binary files. PIN provides an API that abstracts the instruction set and the binary schema, and allows the inspection of register contents, program code, and symbol and debug information.

PIN performs instrumentation by taking control of the program just after it is loaded into memory and recompiling sections of binary code just in time, immediately before they are run. Regarding performance, PIN’s overhead is about 30 percent. PIN was originally created as a tool for computer architecture analysis, but its flexible API and an active community (called "Pinheads") have produced a diverse set of tools for security, emulation and parallel program analysis.

A PIN tool comprises three types of routines:

  1. Instrumentation routines, which enable the insertion of analysis routines
  2. Analysis routines, which are called when the code they are associated with is run
  3. Callback routines, which are called when specific conditions are met or when a certain event has occurred, such as library loads, system calls, signals/exceptions and thread creation events
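
The relationship between the three kinds of routines can be sketched with a toy model in plain C. This is not the PIN API: the types and functions below (toy_ins_t, toy_instrument, count_branch, on_exit_event, toy_run) are hypothetical and only mimic the control flow. The instrumentation routine runs once per instruction to decide where analysis code is attached, the analysis routine runs every time that instruction executes, and a callback fires on an event such as program exit.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*analysis_fn)(void);

/* Hypothetical stand-in for one instruction of the application */
typedef struct {
    int         is_branch;  /* property inspected at instrumentation time */
    analysis_fn analysis;   /* analysis routine attached to it, if any */
} toy_ins_t;

static unsigned long branch_count = 0;  /* updated by the analysis routine */
static unsigned long final_report = 0;  /* filled in by the exit callback */

/* Analysis routine: runs each time an instrumented instruction executes */
static void count_branch(void) { branch_count++; }

/* Instrumentation routine: runs once per instruction and decides
 * whether an analysis routine should be inserted */
static void toy_instrument(toy_ins_t *ins)
{
    ins->analysis = ins->is_branch ? count_branch : NULL;
}

/* Callback routine: invoked when the "program exits" event occurs */
static void on_exit_event(void) { final_report = branch_count; }

/* Instrument the code once, execute it `iterations` times, then fire
 * the exit callback */
static void toy_run(toy_ins_t *code, size_t len, unsigned iterations)
{
    size_t i;
    unsigned it;

    assert(code != NULL);
    for (i = 0; i < len; i++)
        toy_instrument(&code[i]);
    for (it = 0; it < iterations; it++)
        for (i = 0; i < len; i++)
            if (code[i].analysis != NULL)
                code[i].analysis();
    on_exit_event();
}
```

The point of the split is cost: the decision logic in the instrumentation routine is paid once per instruction, while only the (ideally tiny) analysis routine is paid on every execution.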

PIN ships several examples in the directories source/tools/ManualExamples and source/tools/SimpleExamples. The execution of the instruction counter and opcode mix profiler PIN tools is shown below:

$ cd pin-2.12-58423-gcc.4.4.7-linux
$ ./pin -appdebug -t source/tools/ManualExamples/obj-ia32/inscount0.so -- `which evolution`
$ ./pin -t source/tools/SimpleExamples/obj-intel64/opcodemix.so -- /bin/ls

Another interesting example can be found in the imageload.cpp file. It demonstrates the use of the API functions IMG_AddInstrumentFunction and IMG_AddUnloadFunction, which register callbacks that are invoked when binary images are loaded and unloaded, respectively:

$ cd source/tools/ManualExamples
$ make -f makefile obj-intel64/imageload.so
$ ../../../pin -t obj-intel64/imageload.so -- /bin/ls
$ cat imageload.out
	Loading /bin/ls, Image id = 1
	Loading /lib64/ld-linux-x86-64.so.2, Image id = 2
	Loading /lib/x86_64-linux-gnu/libselinux.so.1, Image id = 3
	Loading /lib/x86_64-linux-gnu/librt.so.1, Image id = 4
	Loading /lib/x86_64-linux-gnu/libacl.so.1, Image id = 5
	Loading /lib/x86_64-linux-gnu/libc.so.6, Image id = 6
	Loading /lib/x86_64-linux-gnu/libdl.so.2, Image id = 7
	Loading /lib/x86_64-linux-gnu/libpthread.so.0, Image id = 8
	Loading /lib/x86_64-linux-gnu/libattr.so.1, Image id = 9
	Unloading /bin/ls
	Unloading /lib64/ld-linux-x86-64.so.2
	Unloading /lib/x86_64-linux-gnu/libselinux.so.1
	Unloading /lib/x86_64-linux-gnu/librt.so.1
	Unloading /lib/x86_64-linux-gnu/libacl.so.1
	Unloading /lib/x86_64-linux-gnu/libc.so.6
	Unloading /lib/x86_64-linux-gnu/libdl.so.2
	Unloading /lib/x86_64-linux-gnu/libpthread.so.0
	Unloading /lib/x86_64-linux-gnu/libattr.so.1	

Some of the shipped PIN tool examples can be classified as follows:

Analysing instructions

  • inscount0.cpp: counts the number of instructions executed by the application program
  • itrace.cpp: lists the addresses of the instructions executed by the application program
  • pinatrace.cpp: lists the addresses of accessed memory and the type of operation (read, write)
  • opcodemix.cpp: lists the opcodes of executed instructions with the number of executions and a category summary
  • regmix.cpp: analyses the usage of program registers

Analysing basic blocks

  • inscount1.cpp: counts all instructions of a basic block at once
  • edgcnt.cpp: lists addresses of jump instructions, type of jump (direct, indirect), and number of times performed

Analysing routines

  • proccount.cpp: counts number of times each routine is invoked and number of instructions in it

Analysing libraries

  • imageload.cpp: lists loading and unloading of dynamic libraries invoked by program

Developing the first PIN tool

PIN includes in source/tools/MyPinTool/ a sample tool that can be used as a code base to start creating basic PIN tools. To create, compile and execute this basic example, the following commands can be used:

$ cp -r pin-2.12-58423-gcc.4.4.7-linux/source/tools/MyPinTool/ pin1/
$ cd pin1
$ make -f makefile PIN_ROOT=../pin-2.12-58423-gcc.4.4.7-linux
$ ls obj-intel64/
MyPinTool.o  MyPinTool.so
$ ../pin-2.12-58423-gcc.4.4.7-linux/pin -t obj-intel64/MyPinTool.so -- /bin/ls
===============================================
This application is instrumented by MyPinTool
===============================================
makefile  makefile.rules  MyPinTool.cpp  obj-intel64

To make things simple, a development directory “mytools” could be created outside the PIN directory hierarchy:

$ ls -l
total 35636
drwxr-xr-x 4 user users     4096 mar  9 22:39 mytools
lrwxrwxrwx 1 user users       30 ago 24  2013 pin -> pin-2.12-58423-gcc.4.4.7-linux
drwxr-xr-x 7 user users     4096 mar 10 01:12 pin-2.12-58423-gcc.4.4.7-linux

The development directory will contain the source code of the tools and the compiled objects:

 
$ ls -l mytools
total 120
-rw-r--r-- 1 user users  676 feb  3 2013 Makefile
-rwxr-xr-x 1 user users  366 feb  3 2013 build.sh
-rw-r--r-- 1 user users 6033 feb  9 05:08 check_pc_sections.cpp
-rw-r--r-- 1 user users 7062 feb  9 03:27 check_secciones.cpp
-rw-r--r-- 1 user users 7796 feb  9 04:44 detect_pc_noexec.cpp
-rw-r--r-- 1 user users 5278 feb  9 02:30 load_unload_imag.cpp
-rw-r--r-- 1 user users 5435 feb  9 02:44 load_unload_sec.cpp
-rw-r--r-- 1 user users 2822 feb  4  2013 makefile.rules
drwxr-xr-x 2 user users 4096 feb  3 18:46 obj-intel64

A script (build.sh) can be created to simplify the compilation of the PIN tools that reside in the development directory. The script could look as follows:

#!/bin/bash
# Build every PIN tool (*.cpp) found in the current directory
PIN_ROOT=../pin
dstdir=obj-intel64
# Print the given message in bold
negrita() { echo -e "\033[1m${1}\033[0m"; }

test -d "$dstdir" || mkdir "$dstdir"

for f in *.cpp; do
    negrita "[*] Building $f"
    # Replace only the trailing .cpp extension with .so
    obj=$(echo "$f" | sed -e 's/\.cpp$/.so/')
    make "$dstdir/$obj" PIN_ROOT="$PIN_ROOT"
    negrita "[*] Exec $PIN_ROOT/pin -t $dstdir/$obj -- <program>"
done

This script compiles all the tools contained in the “mytools” development directory at once. In addition, some tricks can help when using this dynamic instrumentation framework:

  • The option -appdebug tells PIN to start a GDB server to debug the application
  • The -appdebug option can also be used to debug any application running under PIN from IDA Pro, using its remote GDB debugger
  • The port on which PIN listens cannot be specified, as it is randomly selected every time PIN is executed
