On "ap_pass_brigade failed"

25.05.2016 20:37

Related to my recent rant regarding the broken Apache 2.4 in Debian Jessie, another curious thing was the appearance of the following in /var/log/apache2/error.log after the upgrade:

[fcgid:warn] [pid ...:tid ...] (32)Broken pipe: [client ...] mod_fcgid: ap_pass_brigade failed in handle_request_ipc function, referer: ...

Each such error also corresponds to a 500 Internal Server Error HTTP response logged in the access log.
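To get a feel for how often this happens, the warnings can be counted in the error log. The following is only a minimal sketch: the function name is made up for illustration, and the pattern assumes the message text is exactly as quoted above (the sample line is that same quote, with the elided parts left as "...").

```python
import re

# Only the fixed parts of the message are matched, since the pid, tid and
# client address differ from request to request.
PATTERN = re.compile(r"\(32\)Broken pipe: .*mod_fcgid: ap_pass_brigade failed")

def count_broken_pipe_warnings(lines):
    """Count lines that look like the mod_fcgid ap_pass_brigade warning."""
    return sum(1 for line in lines if PATTERN.search(line))

# Sample input: the warning quoted above plus one unrelated line.
sample = [
    "[fcgid:warn] [pid ...:tid ...] (32)Broken pipe: [client ...] "
    "mod_fcgid: ap_pass_brigade failed in handle_request_ipc function, "
    "referer: ...",
    "[mpm_event:notice] [pid ...:tid ...] some unrelated message",
]
matches = count_broken_pipe_warnings(sample)
print(matches)
```

To run it against a real log, pass an open file object (for example the error.log mentioned above) instead of the sample list.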

There's a lot of misinformation about this floating around the web. Contrary to popular opinion, this is not caused by wrong values of the various Fcgid... options or the PHP_FCGI_MAX_REQUESTS variable. Actually, I don't know much about PHP (which seems to be the primary use case for FCGI), but I do know how to read the mod_fcgid source code, and this error seems to have a very simple cause: clients that close the connection without waiting for the server to respond.

The error is generated on line 407 of fcgid_bridge.c (mod_fcgid 2.3.9):

/* Now pass any remaining response body data to output filters */
if ((rv = ap_pass_brigade(r->output_filters,
                          brigade_stdout)) != APR_SUCCESS) {
    if (!APR_STATUS_IS_ECONNABORTED(rv)) {
        ap_log_rerror(APLOG_MARK, APLOG_WARNING, rv, r,
                      "mod_fcgid: ap_pass_brigade failed in "
                      "handle_request_ipc function");
    }

    return HTTP_INTERNAL_SERVER_ERROR;
}

The comment at the top already suggests the cause of the error message: failure to send the response generated by the FCGI script. The condition is easy to reproduce with a short Python script that sends a request and immediately closes the socket:

import socket, ssl

HOST="..."
# path to some document generated by an FCGI script
PATH="..."

ctx = ssl.create_default_context()
conn = ctx.wrap_socket(socket.socket(socket.AF_INET), server_hostname=HOST)
conn.connect((HOST, 443))
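# sendall() needs bytes on Python 3, so encode the request explicitly
request = "GET " + PATH + " HTTP/1.0\r\nHost: " + HOST + "\r\n\r\n"
conn.sendall(request.encode("ascii"))
# close immediately, without reading the response
conn.close()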

Actually, you can do the same with a browser by mashing the refresh and stop buttons. Success depends somewhat on how long the script takes to generate the response: for very fast scripts it's hard to tear down the connection fast enough.

Probably at some point ap_pass_brigade() returned ECONNABORTED when the client broke the connection, hence the if statement in the code above. It appears that now EPIPE is returned and mod_fcgid was not properly updated. I was testing this on apache2 2.4.10-10+deb8u4.
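The "(32)" in the log line matches the errno value of EPIPE on Linux, which is what a write returns once the other end of a socket has gone away. Here is a minimal sketch of the same condition outside Apache. It uses a local socket pair instead of a real HTTP connection, so it only illustrates where the errno comes from, not the mod_fcgid code path. The function name is made up for illustration.

```python
import errno
import socket

def write_after_peer_close():
    """Return the errno raised when writing to a socket whose peer has closed."""
    server, client = socket.socketpair()
    try:
        # The "client" goes away before the "server" has sent anything.
        client.close()
        try:
            server.sendall(b"HTTP/1.1 200 OK\r\n\r\n")
        except OSError as e:
            return e.errno
        return None
    finally:
        server.close()

err = write_after_peer_close()
print(err, errno.errorcode.get(err))
```

On Linux I would expect this to report EPIPE rather than ECONNABORTED, which is consistent with the warning slipping past the APR_STATUS_IS_ECONNABORTED() check quoted above.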

In any case, this error message is benign. Fiddling with FcgidOutputBufferSize might cause the response to be sent out earlier and reduce the chance that the error will be triggered by buggy crawlers and such, but in the end there is nothing you can do about it on the server side. The 500 response in the access log is also clearly an artifact in this case, since it's the client that caused the error, not the server, and no error page was actually delivered.

Posted by Tomaž | Categories: Code

Comments

Thank you! It sounds like a problem with server or at least that was the impression before landing on your page.

Posted by MRG

Write the hexadecimal number 0x2A in decimal???

Anyway, just wanted to say thanks for the explanation

Posted by ft

writing to the error log is not entirely benign... the error log file must be locked, so a DDoS attack of broken connections could lock out legitimate requests if writing to the error log is common in the application. Regardless of that, cluttering the error log with things that aren't actual application errors seems very wrong.

is there no way to turn these errors off??

Posted by mike

Mike, I don't know if you can turn off specific error logs. Check Apache documentation for the version you're using. As a last resort you can comment out the function call in the source and recompile mod_fcgid yourself.

For me this isn't anything more than log spam. In fact, I've just grepped through the last month of logs and I can't find a single instance of this error. I'm not sure if it's just a coincidence that no client closed a connection prematurely in this time or if something changed in the later Apache versions and this error isn't happening anymore.

Unfortunately mod_fcgid is pretty buggy and unmaintained at this point. For the past few years I've been running a patched version myself to work around some other bugs in it. It seems there's no one there to apply the patches I've submitted to the bug tracker.

Posted by Tomaž
