[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Directly comparing two addresses from different memory regions (one is f
From: |
TU Haoxin |
Subject: |
Directly comparing two addresses from different memory regions (one is from the stack and another is from the heap) might be problematic |
Date: |
Tue, 15 Aug 2023 15:14:47 +0000 |
Dear developers,
I hope this is the right place to raise questions.
We have developed a new tool built on top of KLEE (http://klee.github.io/) to
automatically test GNU make-4.2 and found there might be a possible issue
caused by improper comparison of two pointers (one point to the stack and
another point to the heap). Before submitting this issue to the bug repository,
we would like to seek your suggestions on whether this is a valid issue in the
make package.
The potential issue may be caused by the code ` if (key == *slot) return
slot;` in the function hash_find_slot in `hash.c`, and I believe this is an
important function that is used in many other functions or programs. The
pointer key typically points to a stack variable `var_key` in function
`lookup_variable` in `variable.c`, and `*slot` points to a heap object `v`
(e.g., allocated by `xmalloc` in function `define_variable_in_set` in
`variable.c`). To simplify the problem, we have written a simple client program
(Lines 1061-1107 in `main.c` in the attached file) by removing irrelevant code
in the main function and writing a new function named `test_client` (Other
functions are kept the same except for inserting a few printing functions for
debugging purposes). The core code is as follows:
```
struct variable var_key;
void *stack_var = &var_key;
void test_client(){
var_key.name = "test_name";
var_key.length = strlen(var_key.name);
struct variable *v;
struct variable_set *set = &global_variable_set;
//insert the v to the table
struct variable ** var_slot;
var_slot = (struct variable **) hash_find_slot (&set->table, &var_key);
//starting of heap manipulation
intptr_t *p1 = malloc(256);
int real_size = malloc_usable_size(p1);
intptr_t *ptr_top = (intptr_t *) ((char *)p1 + real_size - sizeof(long));
*(intptr_t *)((char *)ptr_top + sizeof(long)) = -1;
unsigned long evil_size = (unsigned long long)&stack_var - sizeof(long)*4 -
(unsigned long)ptr_top;
void *new_ptr = malloc(evil_size);
fprintf(stderr, "new_ptr: %p\n", new_ptr - sizeof(long)*2);
//ending of heap manipulation
v = malloc(sizeof(struct variable)); // address of v may equal to the address
of stack_var
v->name = var_key.name;
v->length = strlen(var_key.name);
int size = 137; // buggy size; can be else
v->value = malloc(size);
printf("address of v->value : %p\n", v->value);
memset(v->value, 'A', size);
fprintf(stderr, "addr of set->table => %p!\n", &set->table);
assert(v == &stack_var);
void ** ret_insert = hash_insert_at(&set->table, v, var_slot);
printf("***1 current_variable_set_list = %p\n", current_variable_set_list);
}
int
main (int argc, char **argv){
initialize_global_hash_tables ();
printf("Test our own client program!!!\n");
test_client(); // our new function
char* name = "test_name";
struct variable * var = lookup_variable(name, strlen(name));
printf("found v in hash table : %p\n", var);
}
```
Without the code in Lines 1074-1082, other lines of code could be commonly used
in the make program I think. Also, it works as expected without the code in
Lines 1074-1082 (please check the logs in the file
`./build/without-heap-manipulation-log.txt`). However, the potential problem is
that if the if-branch ` if (key == *slot) return slot;` becomes true when the
value of `key` (from the stack) is equal to the value of `*slot` (from the
heap), the returned address of `*slot` can be problematic. The problem is that
there will be other allocation operations (e.g., `v->value = malloc(size);` in
the client program) after the malloc for `v` and if the `v` is allocated at an
address where a stack object located at, the following memory copy will
completely destroy the subsequent memory content in the stack. Therefore, this
will lead to a crash of the `make` program.
To make the address from the heap return as the same one from the stack, we
leverage some known heap exploitation techniques (introduced in
https://github.com/shellphish/how2heap, particularly the house of force in
https://github.com/shellphish/how2heap/blob/master/glibc_2.27/house_of_force.c)
to make the `assert(v == &stack_var` true. So, as reported in the file
`./build/without-heap-manipulation-log.txt`, there will be a Segmentation fault
due to the fact that the global variable (i.e., current_variable_set_list) in
the stack has already been overwritten to an invalid address (i.e.,
0x4141414141414141, where the value comes from the function `memset(v->value,
'A', size);`).
From the above client program example, we can see there might be a potential
issue if the address returned by malloc equals the one in the stack. So, we
highly recommend that developers shall try to avoid such a scenario by re-write
the code of `if (key == *slot) return slot;` because newer programmers might
write code like the one we provided. If the crash happened, programmers may not
know what happened inside the `make` program and feel hopeless in this case.
For the building setting, we are simply running `../configure --pre=xxx` in the
build folder and running `make` executable to check the outputs. Also, we are
using Ubuntu 18.04 (uses glibc-2.27 as default) OS in the experiment. We used
the make-4.2 version but we found the latest versions share the same code as
the version under test. So, we only reported the make-4.2 version here.
Could you please take a look and check whether the one reported can be a
potential issue or not? Thank you very much for your time! Any suggestions are
welcome!
Best regards,
Haoxin
make-4.2-new-client-prog.zip
Description: make-4.2-new-client-prog.zip
- Directly comparing two addresses from different memory regions (one is from the stack and another is from the heap) might be problematic,
TU Haoxin <=