From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.5) Gecko/20041110 Firefox/1.0

Description of problem:
mod_perl will occasionally start dumping huge quantities of "failed to get bucket brigade" error messages into /var/log/httpd/error_log. I hit this problem at least once a week. Judging by the messages I see on the net, it is a common problem. The patch to fix it is pretty small and is listed in the message linked under Additional info. I have a modified patch that seems to work with 1.99_12-2.1.

Version-Release number of selected component (if applicable):
1.99_12-2.1

How reproducible:
Sometimes

Steps to Reproduce:
I let my server run for a week at www.onlineauction.com.

Actual Results:
The "failed to get bucket brigade" error starts filling the logs. It sometimes stops before filling the logs, but not always. (20+ GB at a time.)

Expected Results:
The error should not occur.

Additional info:
http://www.gossamer-threads.com/lists/modperl/modperl/62201
Created attachment 106882: Patch for bucket brigade failure error
This is the patch mentioned in the URL, corrected so that it applies correctly, but without the comments. (I was in a hurry.)
Another comment on this error... Updating to the current version of mod_perl from FC3 does not work unless you want to upgrade Perl to 5.8.5. (Carp::Heavy is required.) It also does not include the module Apache::Server, which causes some older scripts to fail. (1.99_12-2.1 has the module. 1.99_16-3 does not.)
New data... With the patch applied, the server still generated the "failed to get bucket brigade" message, but it only generated one message, not 20 gigs' worth. So far, the patch appears to fix the problem.
Thanks for reporting and tracking this down, Alan. Did you have an idea which script and how this was being triggered (and hence, what the repro case is)?
I don't have a 100% reproducible test case. It appears to happen under certain conditions, with slow links and a client disconnecting in the middle of a request. (I have a user on the site who is able to trigger it.) The patch I pointed you to records the error once, instead of going into an infinite loop. I have been running with the patch since November 17th and the looping error message has yet to recur.
I've now looked further into this: the Apache::RequestIO::read function is not really misbehaving. It looks like the only repro case for this is if the script ignores the $r->read() return value and carries on calling it even after an error. That's just a script error. Can you paste the section of code which reads the POST body to confirm or deny this?
(better yet, attach or send me privately the entire edit_photo.pl script)
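For illustration, here is a minimal sketch of the failure mode described above: a read loop that ignores the return value keeps calling read after an error, and every failed call logs another message. This is a language-neutral model in Python, not mod_perl code. FakeRequest, read_body_checked, read_body_unchecked and the max_calls cap are all hypothetical names invented for this sketch, and returning None on error is an assumption that may not match what $r->read() does in 1.99_12.

```python
class FakeRequest:
    """Hypothetical stand-in for a request whose client disconnects mid-POST."""

    def __init__(self, chunks):
        self._chunks = list(chunks)
        self.errors_logged = 0  # models lines written to error_log

    def read(self, size):
        # Hand out the queued chunks; once they run out, model a dropped
        # connection: log one error per call and return None.
        # (size is ignored in this simplified model.)
        if self._chunks:
            return self._chunks.pop(0)
        self.errors_logged += 1
        return None


def read_body_checked(req, expected_len, chunk_size=8192):
    """Stop reading as soon as read() reports an error or end of data."""
    body = b""
    while len(body) < expected_len:
        chunk = req.read(chunk_size)
        if not chunk:
            break  # error or EOF: give up instead of retrying
        body += chunk
    return body


def read_body_unchecked(req, expected_len, chunk_size=8192, max_calls=1000):
    """Buggy pattern: ignores the result of read() and keeps calling it.

    max_calls exists only so this sketch terminates; without it the loop
    would spin forever once the connection is gone.
    """
    body = b""
    calls = 0
    while len(body) < expected_len and calls < max_calls:
        chunk = req.read(chunk_size)
        calls += 1
        body += chunk or b""  # failure is silently treated as "no data yet"
    return body
```

In this model, the checked loop logs a single error after a truncated POST, while the unchecked loop logs one error per iteration until something external stops it. That matches the difference between one message and a log that fills up.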
Marking NOTABUG per comment above.