Linux mmap weirdness

14.03.2008 19:10

The Linux mmap() call is full of surprises. Take the MAP_PRIVATE option, for example. According to the man page, mmap() with this option creates copy-on-write pages, meaning that any changes the process makes to the mmapped region are not propagated to the underlying file.

This of course works as advertised, but there is a catch. What if some other process modifies the file? Take this simple program, for example:

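#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>	/* sleep(), close() */
#include <sys/mman.h>

int main(void)
{
	/* "file" must already exist and hold at least one byte. */
	int fd = open("file", O_RDWR);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	/* Private (copy-on-write) mapping of the first byte of the file. */
	char *p = mmap(NULL, 1,
			PROT_READ | PROT_WRITE,
			MAP_PRIVATE, fd, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* Print the mapped byte every two seconds; stop with Ctrl-C. */
	while (1) {
		printf("%c\n", *p);
		sleep(2);
	}

	close(fd);	/* never reached */
}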

If the file contains the character "A", this program prints a stream of As to the console, one every two seconds. Now, while the program is running, you change the contents of the file. Does the output of the program change?

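It turns out it does! However, only as long as the program has not written anything to the mmapped region. If you modify the program above so that it writes something to the mapping, the stream of characters on the console will no longer change when you modify the contents of the file. The first write gives the process its own private copy of the page, and that copy is no longer backed by the file.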
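Here is a minimal sketch of that modified experiment, squeezed into a single process so it can be run without a second terminal. It assumes Linux and a writable /tmp. The helper names set_file_byte() and byte_after_private_write() are made up for this illustration, and the file is changed through pwrite() on the same descriptor rather than by another process.

```c
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

/* Hypothetical helper: overwrite the first byte of the file behind fd. */
static int set_file_byte(int fd, char c)
{
	return pwrite(fd, &c, 1, 0) == 1 ? 0 : -1;
}

/*
 * Maps a one-byte temporary file with MAP_PRIVATE, writes 'X' through the
 * mapping, then changes the file to 'C'. Returns the byte that is visible
 * through the mapping afterwards, or -1 on error.
 */
int byte_after_private_write(void)
{
	char path[] = "/tmp/mmap-demo-XXXXXX";
	int result = -1;

	int fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);	/* the file disappears once fd is closed */

	if (set_file_byte(fd, 'A') < 0)
		goto out_close;

	volatile char *p = mmap(NULL, 1, PROT_READ | PROT_WRITE,
			MAP_PRIVATE, fd, 0);
	if (p == MAP_FAILED)
		goto out_close;

	printf("initial mapping:         %c\n", p[0]);

	/* File changes before the first write: visibility is unspecified. */
	if (set_file_byte(fd, 'B') < 0)
		goto out_unmap;
	printf("after file changed to B: %c\n", p[0]);

	/* Writing to the mapping triggers copy-on-write for this page. */
	p[0] = 'X';

	/* Later file changes should no longer show up in the mapping. */
	if (set_file_byte(fd, 'C') < 0)
		goto out_unmap;
	printf("after file changed to C: %c\n", p[0]);

	result = p[0];
out_unmap:
	munmap((void *)p, 1);
out_close:
	close(fd);
	return result;
}
```

Calling byte_after_private_write() from a small main() prints the three stages. I deliberately don't claim what the second line shows, since that is exactly the part the man page leaves unspecified; the third line is the one that should stay at X.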

Why is this significant (and why did I spend a couple of hours exploring it)? It turns out that the dlopen() call on Linux loads shared objects by simply mmapping them (look at the strace output if you don't believe me). So if you change a .so file on disk while some application is using it, you can get a nice segmentation fault.

Now, the /proc/*/maps file reveals that the executable itself is also mmapped. However, a program doesn't crash when you modify the executable file, so either something in the program's image gets changed after it is mmapped, or I still don't understand everything that's going on here.
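To see those mappings without digging through another process, a program can read its own /proc/self/maps. The following is a rough sketch, assuming Linux with /proc mounted; count_exe_mappings() is a name invented for this example.

```c
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <string.h>
#include <limits.h>
#include <unistd.h>
#include <sys/types.h>

/*
 * Prints every line of /proc/self/maps that refers to the running
 * executable and returns how many such lines were found, or -1 on error.
 */
int count_exe_mappings(void)
{
	char exe[PATH_MAX];
	char line[PATH_MAX + 128];
	int count = 0;

	/* /proc/self/exe is a symlink to the executable's path. */
	ssize_t n = readlink("/proc/self/exe", exe, sizeof(exe) - 1);
	if (n < 0)
		return -1;
	exe[n] = '\0';

	FILE *f = fopen("/proc/self/maps", "r");
	if (f == NULL)
		return -1;

	/* Each line describes one mapping; file-backed ones end in a path. */
	while (fgets(line, sizeof(line), f) != NULL) {
		if (strstr(line, exe) != NULL) {
			fputs(line, stdout);
			count++;
		}
	}

	fclose(f);
	return count;
}
```

Each printed line carries the address range, the permissions (the trailing p marks a private mapping), the file offset and the path, so the output shows how the executable's segments are mapped.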

Posted by Tomaž | Categories: Code
Comments

I have one possible explanation for the behavior that you mentioned. The binary may get relocated due to address space randomization (for security). In this case, the binary will be modified in memory (addresses patched) unless the code is PIC (position independent). That's my guess ...

Posted on 19 October 2009 by Nadav

That sounds plausible.

Posted on 20 October 2009 by Tomaž

Well this behavior is semi-specified in the mmap man page (note the last sentence):

MAP_PRIVATE Create a private copy-on-write mapping. Updates to the mapping are not visible to other processes mapping the same file, and are not carried through to the underlying file. ***It is unspecified whether changes made to the file after the mmap() call are visible in the mapped region.***

So as to *why* it does that, well, I imagine it probably wasn't protected since the standard didn't mandate it. But that's just my 2 cents. :)

Posted on 10 November 2009 by Dan Fego