Reverse DNS lookups and Apache

02.04.2011 13:13

While debugging some unrelated problem (another case of the #349) I noticed that logs from an Apache web server I take care of sometimes contain complete hostnames of clients instead of just IP addresses. That is usually considered bad practice: making a reverse DNS request for every HTTP connection the server receives slows the web site down and loads DNS servers for no good reason. I decided to look into it.

A tcpdump session quickly confirmed that Apache was doing double reverse lookups even though the HostnameLookups directive was clearly turned off. What was puzzling was that only some of the logged requests contained hostnames (that is, LogFormat "%h" expanded to a hostname instead of an IP). Most, but not all, of those were for automatically generated index pages (mod_autoindex) and I could reliably reproduce the behavior for only some requests.

In the end, the culprits turned out to be .htaccess files that contained "allow from" directives with hostnames instead of IP ranges. Naturally, Apache can only verify those by doing double reverse lookups. The surprising thing that made this problem harder to find is that with index pages Apache will check the .htaccess file for the current directory and all subdirectories. So if any subdirectory has a restriction based on a hostname, the lookups will also be made when a client requests its parent directory.
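To hunt down such rules, something along these lines might help. It is only a rough sketch in Python, not something I used at the time: the function names are made up for illustration, and the hostname test is a heuristic that assumes the Apache 2.2 "Allow from" / "Deny from" syntax (anything that is not "all", not "env=...", not an IPv6 address and contains a letter is treated as a hostname).

```python
import os
import re

# Matches "Allow from ..." / "Deny from ..." lines (Apache 2.2 style).
# Comment lines start with "#" and therefore never match.
_DIRECTIVE = re.compile(r"^\s*(allow|deny)\s+from\s+(.*)$", re.IGNORECASE)


def looks_like_hostname(token):
    """Heuristic: True if an allow/deny argument is a (partial) hostname."""
    t = token.lower()
    if t == "all" or t.startswith("env="):
        return False
    if ":" in t:
        return False  # IPv6 address or network
    # IPv4 addresses, partial addresses and netmasks contain no letters.
    return any(c.isalpha() for c in t)


def hostname_rules(text):
    """Return (line number, directive, argument) for hostname-based rules."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), 1):
        m = _DIRECTIVE.match(line)
        if not m:
            continue
        for token in m.group(2).split():
            if looks_like_hostname(token):
                hits.append((lineno, m.group(1).lower(), token))
    return hits


def scan_tree(root):
    """Walk a document root and report hostname rules in every .htaccess."""
    for dirpath, _dirs, files in os.walk(root):
        if ".htaccess" in files:
            path = os.path.join(dirpath, ".htaccess")
            with open(path, errors="replace") as f:
                for lineno, directive, arg in hostname_rules(f.read()):
                    print(f"{path}:{lineno}: {directive} from {arg}")
```

Calling scan_tree() on the document root walks all subdirectories, which matters here because a rule in a subdirectory is enough to trigger lookups for index pages of its parents. The same kind of rules can of course also sit in the main server configuration, which this sketch does not look at.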

I also found out that when persistent connections are used, the hostname will be logged for all requests made over that connection. So once a URL requiring a lookup is retrieved by a client, all subsequent requests will have a hostname instead of an IP logged (even though only one DNS lookup was made).
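A quick way to see how many log entries are affected is to list the lines whose first field is not an IP address. The following Python sketch assumes a log format where "%h" is the first whitespace-separated field (as in the common and combined formats); the function names are hypothetical.

```python
import ipaddress


def is_ip(field):
    """True if the field is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(field)
        return True
    except ValueError:
        return False


def hostname_entries(lines):
    """Yield (line number, host) for log lines that start with a hostname."""
    for lineno, line in enumerate(lines, 1):
        parts = line.split(None, 1)
        # Skip empty lines; report anything whose first field is not an IP.
        if parts and not is_ip(parts[0]):
            yield lineno, parts[0]
```

For example, passing an open access log file to hostname_entries() yields one entry per request that was logged with a hostname. Because of the persistent connection effect described above, the count will usually be higher than the number of requests that actually triggered a lookup.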

By the way, Google turns up this blog post on the topic. It might be a bit outdated, but while investigating I did also look into the Apache 2.2.9 source and I can add that:

  • The only place in the code where double reverse lookups are made is the mod_authz_host module. So if you see double lookups in tcpdump, a hostname-based allow / deny rule is the only possible cause.
  • Using "%h" in LogFormat most certainly does not cause a DNS lookup on its own. Replacing it with "%a" will, however, hide the fact that the server is doing lookups, because that one always expands to the remote IP address, whether the hostname is known to Apache or not.
Posted by Tomaž | Categories: Code

Comments

After some time searching the web, your article helped a lot. I finally removed a "deny from something.com" from a .htaccess file and the hostnames are gone from the Apache logs. Thanks for that!

Posted by Stephan
