Though I've played with QEMU for a while and have run some successful experiments on its translation process, it is still important to get a better and more accurate understanding of how its dynamic translation works. What follows are pieces of information from QEMU's official document, QEMU Internals, together with my own understanding.
The basic idea is to split every x86 instruction into fewer simpler instructions called micro operations.
I think this idea is similar to Intel's micro-instructions: every x86 instruction is first translated into a few simple instructions. For example, the call instruction (opcode 0xff) used in my PoC example is translated into:
movtl_T1_im(nextip) //save next ip in T1
push_T1 //push next ip onto stack
jmp_T0 //jump to the target ip saved in T0
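To make the decomposition concrete, here is a minimal sketch of the three micro operations acting on a toy CPU state. The ToyCpu struct, the toy stack, and the toy_call helper are assumptions made up for illustration and are not QEMU's actual data structures; only the operation names come from the listing above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified CPU state; T0/T1 mirror the temporaries above. */
typedef struct {
    uint32_t T0, T1;     /* temporaries used by micro operations */
    uint32_t eip;        /* guest instruction pointer */
    uint32_t esp;        /* guest stack pointer, here an index into stack[] */
    uint32_t stack[16];  /* toy guest stack, grows downward */
} ToyCpu;

/* T1 = immediate (the address of the next instruction) */
static void op_movtl_T1_im(ToyCpu *c, uint32_t imm) { c->T1 = imm; }

/* push T1 onto the guest stack */
static void op_push_T1(ToyCpu *c)
{
    assert(c->esp > 0);  /* toy overflow check */
    c->stack[--c->esp] = c->T1;
}

/* jump to the address held in T0 */
static void op_jmp_T0(ToyCpu *c) { c->eip = c->T0; }

/* An indirect call: the target is already in T0, next_ip is the return address. */
static void toy_call(ToyCpu *c, uint32_t next_ip)
{
    op_movtl_T1_im(c, next_ip);
    op_push_T1(c);
    op_jmp_T0(c);
}
```

After toy_call runs, the return address sits on top of the toy stack and eip holds the call target, which is what the single x86 call instruction does in one step.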
Each simple instruction is implemented by a piece of C code.
Each micro operation, such as push_T1, is a simple C function (in op.c). This code is compiled into binary code for the host platform (the machine QEMU runs on), and the resulting code snippets are then used to generate the dynamic translator.
QEMU is no more difficult to port than a dynamic linker.
The generated dynamic translator works much like a linker: it links the precompiled binary code snippets together to form a basic block and then executes the block. One interesting detail is that the parameters a snippet needs are passed to it at link time through code patching. The following example is the one given in the QEMU paper, QEMU, a Fast and Portable Dynamic Translator.
The PowerPC instruction addi r1,r1,-16 is translated into the following micro operations:
movl_T0_r1
addl_T0_im(-16)
movl_r1_T0
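The sequence loads r1 into T0, adds the immediate, and writes T0 back to r1. As a rough sketch of how an operation stream and a separate parameter stream fit together, the toy code below interprets such a sequence. Note the assumptions: the enum values, the ToyPpc struct, and run_ops are hypothetical names, and QEMU does not interpret the stream like this; it copies native code for each operation instead.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical opcode indices for the three micro operations. */
enum { OP_movl_T0_r1, OP_addl_T0_im, OP_movl_r1_T0, OP_end };

typedef struct { int32_t T0; int32_t r1; } ToyPpc;

/* Walk the opcode stream; immediates live in a separate parameter stream. */
static void run_ops(ToyPpc *c, const uint16_t *opc, const int32_t *opparam)
{
    for (; *opc != OP_end; opc++) {
        switch (*opc) {
        case OP_movl_T0_r1: c->T0 = c->r1;       break;  /* T0 = r1 */
        case OP_addl_T0_im: c->T0 += *opparam++; break;  /* T0 = T0 + imm */
        case OP_movl_r1_T0: c->r1 = c->T0;       break;  /* r1 = T0 */
        default: assert(0 && "unknown micro operation");
        }
    }
}

/* addi r1,r1,-16 expressed as three micro operations plus one parameter. */
static int32_t toy_addi_r1(int32_t r1_before)
{
    static const uint16_t opc[] = {
        OP_movl_T0_r1, OP_addl_T0_im, OP_movl_r1_T0, OP_end
    };
    static const int32_t opparam[] = { -16 };
    ToyPpc c = { 0, r1_before };
    run_ops(&c, opc, opparam);
    return c.r1;
}
```

Keeping the immediates out of the opcode stream is what lets the translator later read them one by one while it emits code for each operation.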
The C code of addl_T0_im is:
extern int __op_param1;
void op_addl_T0_im(void)
{
    T0 = T0 + ((long)(&__op_param1));
}
The generated dynamic translator passes the immediate parameter -16 to the code snippet as follows:
case INDEX_op_addl_T0_im:
{
    long param1;
    extern void op_addl_T0_im();
    memcpy(gen_code_ptr, (char *)&op_addl_T0_im+0, 6); // dynamic link: copy the 6-byte snippet
    param1 = *opparam_ptr++;                           // read parameter
    *(uint32_t *)(gen_code_ptr + 2) = param1;          // patch __op_param1 with the runtime value
    gen_code_ptr += 6;
    break;
}
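The copy-then-patch step can be tried out without executing any generated code. The sketch below assumes the snippet is the x86 encoding of "add $imm32, %ebx" (bytes 81 C3 followed by a 32-bit immediate), which happens to match the 6-byte length and the patch offset of 2 in the example; that encoding choice, the template array, and emit_addl_T0_im are my assumptions, not something taken from the paper.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed snippet: x86 "add $imm32, %ebx" = 81 C3 <imm32>, immediate zeroed. */
static const uint8_t tmpl_addl_T0_im[6] = { 0x81, 0xC3, 0, 0, 0, 0 };

/* Copy the snippet into the output buffer and patch its immediate field.
 * Returns the advanced output pointer, like gen_code_ptr += 6. */
static uint8_t *emit_addl_T0_im(uint8_t *gen_code_ptr, int32_t param1)
{
    assert(gen_code_ptr != NULL);
    /* "dynamic link": copy the precompiled bytes */
    memcpy(gen_code_ptr, tmpl_addl_T0_im, sizeof tmpl_addl_T0_im);
    /* patch the parameter at offset 2; memcpy avoids an unaligned store */
    memcpy(gen_code_ptr + 2, &param1, sizeof param1);
    return gen_code_ptr + sizeof tmpl_addl_T0_im;
}
```

Calling emit_addl_T0_im(buf, -16) leaves a 6-byte instruction in buf whose immediate field holds -16. The buffer is only inspected here; actually running it would additionally require executable memory on an x86 host.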
From this example we can see that, because operation parameters are passed by runtime patching, we no longer need to worry about the runtime values of parameters when writing the functions for micro operations. This also makes it possible to instrument the C code snippets directly to add features such as dynamic taint tracking and target address checking.
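As a rough illustration of what such instrumentation could look like, the sketch below adds a shadow taint bit to each temporary and checks it when jumping through T0. Everything here (the TaintCpu struct, the taint fields, the alert flag) is hypothetical and only meant to show the idea, not how any real taint-tracking patch for QEMU is written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CPU state with one shadow taint bit per temporary. */
typedef struct {
    uint32_t T0, T1;
    bool taint_T0, taint_T1;
    uint32_t eip;
    bool alert;  /* set when a jump target was derived from tainted data */
} TaintCpu;

/* T0 = imm; a constant is never tainted. */
static void op_movl_T0_im(TaintCpu *c, uint32_t imm)
{
    c->T0 = imm;
    c->taint_T0 = false;
}

/* T0 = T0 + T1; taint propagates from either operand. */
static void op_addl_T0_T1(TaintCpu *c)
{
    c->T0 += c->T1;
    c->taint_T0 = c->taint_T0 || c->taint_T1;
}

/* Jump to T0, with an added target address check. */
static void op_jmp_T0(TaintCpu *c)
{
    assert(c != NULL);
    if (c->taint_T0)
        c->alert = true;
    c->eip = c->T0;
}
```

Because each micro operation is plain C, the extra bookkeeping is just a few added lines per function, and the compiler takes care of turning it into host code.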
P.S. Since version 0.10.0, QEMU has used TCG (Tiny Code Generator), which brings some differences. I think describing these changes will require another blog post.