The entry point address of an ELF executable usually sits just above one of a few fixed base addresses, depending on the architecture. On x86-64 a common base is 0x400000 (4 MiB), which I'll loosely call "400k" below.
For this program,
int main() {
    return 0;
}
compiling with gcc test.c and running readelf -h a.out gives:
Entry point address: 0x1020
But if I compile with gcc test.c -no-pie, the entry point address is:
Entry point address: 0x401020
That is an offset of 0x400000 (4 MiB): the program image is mapped starting at 0x400000 + 0, and the first instruction to execute is at 0x400000 + 0x1020.
Why 400k?
The former (a position-independent executable, PIE) is clearly the better default, and most modern toolchains build programs that way. The ELF header says so:
Type: DYN (Position-Independent Executable file)
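In a PIE the addresses recorded in the file are offsets from a load base that is chosen at run time, so the segments start near zero and there is no fixed 0x400000 gap in front of them. Relevant Stack Overflow discussion.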
The choice of 0x400000 is ancient by now, though another Stack Overflow answer gives some insight into the historical reasons:
- It is higher than mmap_min_addr (Debian page link), which is 64 KiB. User-space processes are not allowed to map memory below this value. The point is that nothing can be placed at or near address zero, so a NULL pointer dereference faults instead of silently running code planted there. This is a security feature: the buggy dereference lands at or near zero by definition, so an exploit does not get to pick a higher address.
- 0x400000 is 4 MiB. The default page size on x86-64 is 4 KiB, but huge pages are 2 MiB (4 MiB on classic 32-bit x86), and 4 MiB is a multiple of all three. That places the start of the program on a page boundary, even for huge pages, rather than straddling the edge of one. (This is perhaps related to the TLB and the way the MMU walks the page table, but my knowledge is limited here at the moment.)
To read later:
- an in-depth Chinese blog post on the topic: link