Linux mmap weirdness
The Linux mmap() call is full of surprises. Take for example the MAP_PRIVATE option. According to the man page, mmap() with this option creates a copy-on-write mapping, meaning that any changes the process makes to the mmaped region will not be propagated to the underlying file.
This of course works as advertised, but there is a catch. What if some other process modifies the file? Take for example this simple program:
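#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

int main(void)
{
    /* "file" must already exist and contain at least one byte */
    int fd = open("file", O_RDWR);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    char *p = mmap(NULL, 1,
                   PROT_READ | PROT_WRITE,
                   MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    /* print the first byte of the mapping every two seconds, forever */
    while (1) {
        printf("%c\n", *p);
        sleep(2);
    }
}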
If the file contains the character "A", this program will print a stream of As to the console. Now, while the program is running, you change the contents of the file. Does the output of the program change?
It turns out it does! However, only if the program hasn't written anything to the mmaped region. If you modify the program above so that it writes something there, the stream of characters on the console will not change when you modify the contents of the file. The first write gives the process its own private copy of the page, and that copy is no longer tied to the file.
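Both cases can be seen side by side in a single process. The following is only a minimal sketch, not the program from above: the function names compare_private_mappings() and map_private() and the temporary file path are made up for illustration, and it assumes Linux behaviour (POSIX leaves it unspecified whether later changes to the file show up in a MAP_PRIVATE mapping). It maps the same one-byte file twice, writes to only one of the mappings, then overwrites the file.

```c
#define _XOPEN_SOURCE 700   /* for mkstemp() and pwrite() under strict -std modes */
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map the first byte of fd privately, NULL on failure. */
static char *map_private(int fd)
{
    void *p = mmap(NULL, 1, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    return p == MAP_FAILED ? NULL : (char *)p;
}

/*
 * Hypothetical demo: create a temporary file containing 'A', map it privately
 * twice, write to only one mapping, then overwrite the file with 'B'.
 * Stores what each mapping shows afterwards. Returns 0 on success, -1 on error.
 */
int compare_private_mappings(char *untouched_out, char *touched_out)
{
    char path[] = "/tmp/mmap-demo-XXXXXX";   /* assumed scratch location */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                            /* file lives on through fd */

    if (pwrite(fd, "A", 1, 0) != 1) {
        close(fd);
        return -1;
    }

    char *untouched = map_private(fd);
    char *touched = map_private(fd);
    if (!untouched || !touched) {
        close(fd);
        return -1;
    }

    touched[0] = 'X';                        /* first write copies the page */

    /* Plays the role of "some other process" modifying the file in place. */
    if (pwrite(fd, "B", 1, 0) != 1) {
        close(fd);
        return -1;
    }

    *untouched_out = untouched[0];           /* still backed by the file */
    *touched_out = touched[0];               /* private copy */

    munmap(untouched, 1);
    munmap(touched, 1);
    close(fd);
    return 0;
}
```

Calling compare_private_mappings(&u, &t) from a main() and printing the two characters should, on Linux, show the new file contents for the untouched mapping and the process's own write for the touched one.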
Why is this significant (and why did I spend a couple of hours exploring it)? It turns out that the dlopen() call in Linux loads shared objects by simply mmaping them (look at the strace output if you don't believe me). So if you change a .so file on the disk while some application is using it, you can get a nice segmentation fault.
Now, the /proc/*/maps file reveals that the executable itself is also mmaped. However, a program doesn't crash when you modify the executable file, so either something gets changed in the program's image after it gets mmaped or I still don't understand everything that's going on here.
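To see these mappings without reaching for cat, a process can read its own maps file. This is a rough sketch that assumes the usual Linux line layout (address range, permissions, offset, device, inode, pathname); the function name print_private_file_mappings() is made up for illustration. It prints every private mapping that is backed by a file, which normally includes the executable itself and any shared libraries.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: print the file-backed private mappings of the calling
 * process and return how many were found, or -1 if the maps file can't be read.
 */
int print_private_file_mappings(void)
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (!f)
        return -1;

    char line[4096];
    int count = 0;
    while (fgets(line, sizeof line, f)) {
        char perms[5] = "";
        /* second field is the permission string, e.g. "r-xp" */
        if (sscanf(line, "%*s %4s", perms) != 1)
            continue;
        /* a '/' only appears in the pathname field, so this skips
           anonymous mappings and entries like [heap] or [stack] */
        const char *path = strchr(line, '/');
        /* fourth permission character: 'p' = private, 's' = shared */
        if (path && perms[3] == 'p') {
            fputs(line, stdout);
            count++;
        }
    }
    fclose(f);
    return count;
}
```

The lines it prints carry a "p" in the permissions column, the same kind of private mapping as the MAP_PRIVATE one in the test program.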


