Monday, January 23, 2006

Functional Files


Hacked by chrootstrap September 2003

You've probably used function pointers in your C or C++ programs. They
are pointers to executable regions of memory, and they are tremendously
useful for a huge number of programming tasks. Shared libraries are
usually memory-mapped files filled with functions. In this article,
we'll take a look at how you can keep functions in ordinary files and
find some creative uses for this.

The technique of treating functions as ordinary data is sometimes used
in cracking servers, and the stored functions are known as shellcode.
Many shellcode examples involve writing the function in C, compiling
it, disassembling it, reassembling it, and snarfing the machine code
into a C buffer. Well, grabbing some raw instructions from a C function
needn't be so difficult. The GNU tool objcopy makes it as easy as pie.

To do this we put one function in a file, like this:

void f (void)
{
    asm("mov $1, %eax");
    asm("mov $25, %ebx");
    asm("int $0x80");
}

Now we compile it with "gcc -c f.c" (add -m32 on an x86-64 host, since
the function uses the 32-bit int $0x80 interface). This leaves us with
a file, f.o, which in my case is 680 bytes. Doesn't that seem a little
steep for a function that just exits with a code of 25? If we take a
look at it with "objdump -d f.o" we see that the actual code inside is
as simple as we expected:

   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   b8 01 00 00 00          mov    $0x1,%eax
   8:   bb 19 00 00 00          mov    $0x19,%ebx
   d:   cd 80                   int    $0x80
   f:   5d                      pop    %ebp
  10:   c3                      ret

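That's the kind of thing that would be manually cut and pasted into a
.s file, but we've got a simpler solution: "objcopy f.o -O binary".
With no output file named, objcopy rewrites f.o in place. It is now
just 17 bytes in size, matching the disassembled code perfectly. (Newer
toolchains may emit extra loadable sections; adding "-j .text" should
restrict the copy to the code itself.)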
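
If you want to convince yourself that the file really holds nothing but
those instructions, a small check along these lines may help. It is
only a sketch: the byte array is typed in from the listing above, and
file_matches_listing is a made-up helper name, not part of any tool.

```c
#include <stdio.h>
#include <string.h>

/* The 17 bytes shown in the objdump listing above. */
static const unsigned char expected_code[] = {
    0x55,                         /* push %ebp        */
    0x89, 0xe5,                   /* mov  %esp,%ebp   */
    0xb8, 0x01, 0x00, 0x00, 0x00, /* mov  $0x1,%eax   */
    0xbb, 0x19, 0x00, 0x00, 0x00, /* mov  $0x19,%ebx  */
    0xcd, 0x80,                   /* int  $0x80       */
    0x5d,                         /* pop  %ebp        */
    0xc3                          /* ret              */
};

/* Hypothetical helper: returns 1 if the file at path holds exactly the
 * bytes in expected_code, 0 if it differs, -1 if it cannot be opened. */
int file_matches_listing(const char *path)
{
    unsigned char buf[sizeof expected_code + 1]; /* +1 detects extra bytes */
    size_t n;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;
    n = fread(buf, 1, sizeof buf, fp);
    fclose(fp);
    return n == sizeof expected_code
        && memcmp(buf, expected_code, n) == 0;
}
```

Calling file_matches_listing("f.o") after the objcopy step should give
1 if your compiler produced the same code as mine; a different compiler
version or different flags may well produce other bytes.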

Now if you wanted it to be shell code you'd copy it into a program and
have the function do something like dup2 to hookup stdin/stdout to a
socket and then exec your way into something interesting. But, that's
not what we're doing here.

Instead, let's keep it in the file and load it for our very own
dynamically loaded function. Let's change the function to make it do
something more interesting, like print out a string:

int f (void *ptr)
{
    int i=0;

    while (((char *)ptr)[i] != 0)
        i++;
    asm("mov $4, %eax");
    asm("mov $1, %ebx");
    asm("mov %0, %%ecx"::"g"(ptr));
    asm("mov %0, %%edx"::"g"(i));
    asm("int $0x80");
    asm("mov %%eax, %0":"=g"(i));
    return i;
}
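
One caveat with the listing above: nothing tells the compiler that the
separate asm statements use eax, ebx, ecx and edx, so it is only likely
to behave when built without optimization. A possible variant, sketched
here under the assumption of GCC-style extended asm on 32-bit x86
Linux, puts everything in one statement and lets the compiler load the
registers. The name print_str and the non-x86 fallback are my own
additions so the sketch builds elsewhere; only the asm path is suitable
for a raw function file.

```c
#if !(defined(__i386__) && defined(__linux__))
#include <stddef.h>
#include <unistd.h>   /* only needed by the fallback path below */
#endif

/* Hypothetical name; same job as f above: write a NUL-terminated string
 * to stdout and return what the kernel reports (bytes written). */
int print_str(const void *ptr)
{
    const char *s = ptr;
    int len = 0;
    int ret;

    while (s[len] != 0)
        len++;

#if defined(__i386__) && defined(__linux__)
    /* One asm statement: the constraints place 4 (write) in eax, 1
     * (stdout) in ebx, the string in ecx and its length in edx, and
     * tell the compiler that eax comes back holding the result. */
    __asm__ volatile ("int $0x80"
                      : "=a"(ret)
                      : "a"(4), "b"(1), "c"(s), "d"(len)
                      : "memory");
#else
    /* Assumption: on other targets fall back to libc so the sketch
     * still compiles. This path calls outside the function, so it is
     * NOT something you could strip down with objcopy. */
    ret = (int)write(1, s, (size_t)len);
#endif
    return ret;
}
```

The size quoted below is for the original f, not for this variant.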

The function f prints the string passed to it and passes back the
string's length. It turns into a minuscule 57 bytes when we run it
through objcopy. Now we make a little program to load it from a file
and execute it:

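#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main (int argc, char **argv)
{
    int fd;
    ssize_t n;
    size_t got = 0;
    int (*f) (void *);
    void *mem;
    struct stat buf;

    if (argc < 2) {
        printf("Usage: %s FILENAME\n", argv[0]);
        exit(1);
    }
    if ((fd = open(argv[1], O_RDONLY)) < 0) {
        printf("Could not open %s.\n", argv[1]);
        exit(1);
    }
    if (fstat(fd, &buf) || buf.st_size == 0) {
        printf("Could not stat %s.\n", argv[1]);
        exit(1);
    }
    /* malloc'd memory is not executable on systems with NX protection,
       so take pages from mmap and mark them executable after loading */
    mem = mmap(NULL, buf.st_size, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) {
        printf("Could not allocate memory.\n");
        exit(1);
    }
    /* read may return short counts; advance the destination each time */
    while (got < (size_t)buf.st_size) {
        n = read(fd, (char *)mem + got, buf.st_size - got);
        if (n <= 0) {
            printf("Could not read %s.\n", argv[1]);
            exit(1);
        }
        got += n;
    }
    close(fd);
    if (mprotect(mem, buf.st_size, PROT_READ | PROT_EXEC)) {
        printf("Could not make memory executable.\n");
        exit(1);
    }
    f = (int (*) (void *))mem;
    return f("Bink!\n");
}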

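To use this, you just pass it the function file's name, e.g. f.o (the
raw file left behind by objcopy). The loader has to be built for the
same architecture as the function, so on an x86-64 host compile it with
-m32 as well. Note that you can use any function signature that you
want! You can pass a pointer to a complex struct with function pointers
of its own and, thus, provide callback functions, etc.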
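
To make the callback idea concrete, here is a minimal sketch. Every
name in it (host_api, put, put_twice, sink, sink_put) is invented for
illustration. The point is that the loadable function reaches the
outside world only through the struct it is handed, and it avoids
string literals and globals, which would need relocations that a raw
function file cannot carry.

```c
/* Hypothetical interface: everything the loaded code may ask of its host. */
struct host_api {
    int (*put)(void *ctx, const char *data, int len); /* output callback */
    void *ctx;                                        /* opaque host state */
};

/* Candidate for a function file: it touches nothing but its arguments. */
int put_twice(const struct host_api *api, const char *msg)
{
    int len = 0;
    int total = 0;
    int i;

    while (msg[len] != 0)
        len++;
    for (i = 0; i < 2; i++)
        total += api->put(api->ctx, msg, len);
    return total;
}

/* Host side: a callback that appends to a fixed buffer. */
struct sink {
    char buf[64];
    int used;
};

int sink_put(void *ctx, const char *data, int len)
{
    struct sink *s = ctx;
    int room = (int)sizeof s->buf - 1 - s->used; /* keep space for NUL */
    int i;

    if (len > room)
        len = room;
    for (i = 0; i < len; i++)
        s->buf[s->used++] = data[i];
    s->buf[s->used] = 0;
    return len;
}
```

The host would fill in a struct host_api with sink_put and a struct
sink, then call the loaded code through a pointer of type
int (*)(const struct host_api *, const char *).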

You can do all kinds of things with this technique! You can make your
own dynamic library system, send and run functions directly over the
network, store your program as a collection of extremely small files,
modify your program code in place by mixing and matching the contents
of these files, or build your own persistent object system, a faster
alternative to init, a network bootstrapper, an in-place patch
framework, etc.
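
For code that arrives as a buffer rather than a file (from a socket,
say), the loading step might look something like the sketch below. It
assumes a POSIX system with mmap and mprotect; code_load and
code_unload are names I made up. The pages are filled while writable
and only then switched to read+execute, which is the order that
stricter systems tend to require.

```c
#define _DEFAULT_SOURCE 1  /* for MAP_ANONYMOUS under strict -std modes */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: copy len bytes of machine code into fresh pages
 * and make them executable. Returns NULL on failure. */
void *code_load(const void *code, size_t len)
{
    void *mem;

    if (len == 0)
        return NULL;
    /* Writable first, so the pages can be filled... */
    mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return NULL;
    memcpy(mem, code, len);
    /* ...then read+execute only. */
    if (mprotect(mem, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, len);
        return NULL;
    }
    /* GCC/Clang builtin; effectively a no-op on x86, but other CPUs
     * need the instruction cache synchronized after writing code. */
    __builtin___clear_cache((char *)mem, (char *)mem + len);
    return mem;
}

/* Release a region obtained from code_load; returns 0 on success. */
int code_unload(void *mem, size_t len)
{
    return munmap(mem, len);
}
```

The pointer returned by code_load can be cast to whatever function
pointer type the sender and receiver have agreed on. Needless to say,
running bytes that came off the network means trusting whoever sent
them completely.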

If the systems share a compatible calling convention, then by
abstracting system-specific functionality behind a passed structure's
function pointers, you could make portions of a program system
independent. In particular, it may be very possible to make your
library's code (network backdoor code, etc) work simultaneously on
Windows and x86 Unix boxen. Depending on the program, you might be able
to write 90% or more of the code this way.

Your function files are also highly portable among similar systems, as
they don't contain any linker information. Also, typical dynamic
library loaders are slow! Step through the full process of getting to
your function of choice with gdb and you might be surprised. Depending
on the program, you may be able to save loads of clock cycles and a
good deal of memory by making your own dynamic function loader. Cool
beans!
