Created attachment 1504445 [details] cups-browsed log of the situation If I turn on another laptop which is exporting a few printers, then this causes very high CPU loads with cups-browsed seemingly failing to add a printer and retrying in a loop. The printer in question is a network printer, which is not in my local network. So e.g. contacting the endpoint for that printer would obviously fail; that said, maybe there is also something wrong with the exporting machine. Attaching the full log, I believe the relevant lines are: Sun Nov 11 20:09:10 2018 HTTP timeout! (consider increasing HttpLocalTimeout/HttpRemoteTimeout value) Sun Nov 11 20:09:10 2018 Unable to create/modify CUPS queue (Success)! I have no idea what the HTTP request in question is.
Hi Ben, thank you for reporting the issue! Do I understand you correctly when you are saying: (In reply to Benjamin Berg from comment #0) > If I turn on another laptop which is exporting a few printers, then this > causes very high CPU loads with cups-browsed you have 2 machines - the one which is exporting the print queues and the other one which experience the issue - both in the? > > The printer in question is a network printer, which is not in my local > network. So e.g. contacting the endpoint for that printer would obviously > fail; that said, maybe there is also something wrong with the exporting > machine. If the setup is as I wrote above, then cups-browsed on 2nd machine is not trying to reach your printer, but your cups daemon on 1st machine, which exports the print queues. > > Attaching the full log, I believe the relevant lines are: > > Sun Nov 11 20:09:10 2018 HTTP timeout! (consider increasing > HttpLocalTimeout/HttpRemoteTimeout value) > Sun Nov 11 20:09:10 2018 Unable to create/modify CUPS queue (Success)! > > I have no idea what the HTTP request in question is. cups-browsed created http connection to local CUPS server when starting to create print queues, because cups-browsed creates print queues for local CUPS. Then cups-browsed sets timeout - 5s by default - for that connection. And the connection timeouts during getting ppd file. Ben, would you mind trying to set HttpLocalTimeout to higher value (f.e. 10) in /etc/cups/cups-browsed.conf, restart cups-browsed service and see if it helps?
Created attachment 1504844 [details] Log with HttpLocalTimeout 10 (In reply to Zdenek Dohnal from comment #1) > (In reply to Benjamin Berg from comment #0) > > If I turn on another laptop which is exporting a few printers, then this > > causes very high CPU loads with cups-browsed > > you have 2 machines - the one which is exporting the print queues and the > other one which experience the issue - both in the? Yep, ben-t450 is exporting the printers, ben-x1 is the Fedora machine with cups-browsed. Both in the same network. > cups-browsed created http connection to local CUPS server when starting to > create print queues, because cups-browsed creates print queues for local > CUPS. Then cups-browsed sets timeout - 5s by default - for that connection. > And the connection timeouts during getting ppd file. > > Ben, would you mind trying to set HttpLocalTimeout to higher value (f.e. 10) > in /etc/cups/cups-browsed.conf, restart cups-browsed service and see if it > helps? Yup, HttpLocalTimeout set to 10s works around the issue. It seems that cups simply needs the time to process the PPD(s)?
(In reply to Benjamin Berg from comment #2) > Yup, HttpLocalTimeout set to 10s works around the issue. It seems that cups > simply needs the time to process the PPD(s)? It is not actually a workaround, it is correct way to solve the issue - some PPDs needs more time to process, but it isn't reason to keep timeout length bigger for PPDs, which are smaller and quicker to process. I'm closing the issue as NOTABUG and I'll add the mention of the issue on https://fedoraproject.org/wiki/How_to_debug_printing_problems .
I have eyes and don't see - it is actually there already https://fedoraproject.org/wiki/How_to_debug_printing_problems#cups-browsed_consumes_large_amount_of_CPU
Sorry, I do not agree with the decision that this is NOTABUG. I think it is fair to consider the failure of the operation an expected behaviour. In the end, one probably needs to draw a line somewhere as to expected cupsd response times. However, cups-browsed is handling this error condition poorly. It is retrying the operation in a tight loop and using up system resources for an operation that has very little hope of succeeding. This failure case should be handled more gracefully be cups-browsed. A simple back-off algorithm to only try to re-add the printer periodically would already do the trick.
Hmm, actually it is strange to have only 3s or 4s time frame for updating the queues, when TIMEOUT_RETRY interval should be 10s (which could give time to CUPS to recover from previous query). It is caused by getting current time at the start of function, but it is used lower in the code for setting the timeout, so current time is not so current as I would expect and causes shorter timeouts. Would you mind trying the scratch build https://koji.fedoraproject.org/koji/taskinfo?taskID=30881492 ?
Just clarifying things - I don't consider reaching http timeout as bug in the case (for most PPDs timeout is not reached, user can set bigger timeout in the config). Retries after only 3s/4s instead of 10s is the bug. Reported upstream with patch https://github.com/OpenPrinting/cups-filters/pull/71
With that patch the CPU isn't maxed out all the time anymore and I have some seconds before cups-browsed tries to add the printer again. So, a slight improvement, but it still doesn't give up trying to add the printer.
With the newest patch cups-browsed abandons his try to add a specific printer after 5 tries and skip the printer after that (until other cups-browsed restart). cups-browsed cannot abandon whole updating process, because there can be still functional print queues or newly discovered printers. Would you mind trying this build? https://koji.fedoraproject.org/koji/taskinfo?taskID=30899328
Nice, that sounds good! I'll try it tonight (or tomorrow morning) when I am back home.
Hm, weird, the new build does not seem to change anything. It is still retrying indefinitely.
Sorry, my bad - overlooked 'break' lines, I realized it this morning at home... this one should be ok: https://koji.fedoraproject.org/koji/taskinfo?taskID=30917994
Ah, happens :) Unfortunately, that still doesn't seem to make a difference.
Just reminding - it will should try to add each printer 5 times and then skipping it - so in your case, for your 3 printers, it will be do retries about for 3 minutes until cups-browsed starts to skip all of them (if they get timeouts) - 50s (5 retries, each in 10s timeframe) for the first, 50s the second e.g.
Actually, 'break' wasn't for cycle, but for inner 'switch', so https://koji.fedoraproject.org/koji/taskinfo?taskID=30899328 (from yesterday) should work (after 3 minutes cups-browsed skips the printers).
If (I hope I did not) I did not let max retries assigned to 10 (older patch had it)... for just to be sure https://koji.fedoraproject.org/koji/taskinfo?taskID=30919474
Thanks! I'll try again tonight. Yeah, I probably had not waited long enough then. Though I believe only one of the printers was running into timeouts in my case.
Hm, something is still wrong. It is still looping (for like 6 minutes now). # grep 'Fri Nov 16 20:' /var/log/cups/cups-browsed_log | grep -c 'Non-raw queue Molly with PPD' 43 So, 43 timeouts already on that printer.
Darn... would you mind sharing the logs in the file and attach it?
Created attachment 1507463 [details] Log with build 30919475 Sorry, didn't get around to it yesterday. So, did a reinstall of https://koji.fedoraproject.org/koji/taskinfo?taskID=30919475 just to make sure I am on the right version. The
No problem, thank you for cooperating! I actually still see problem with short timeouts (the build below has attempt to fix it), plus not only Molly printer got timeout, but PDF too - but cups-browsed gracefully ignores that and thinks it is okay and sets state to CONFIRMED... This is more 'debug' build https://koji.fedoraproject.org/koji/taskinfo?taskID=31034955 , I would like to see contents of p->timeouts variable when cups-browsed is running. Would you mind testing it and attaching debug log, please?
Created attachment 1507831 [details] Log with debug patch applied Hrm, so, after not seeing *any* change, I figured I would grab the source RPM. It seems like the patch was simply not applied :-/ A local rebuild in mock later, and it works correctly: Wed Nov 21 22:27:56 2018 How many timeouts already the queue has? 5 And, there it stops. Attached the full log.
I don't know what to say than 'I'm sorry'... I tried to use different way to create the patches than I was used to, and made such mistake... at least skipping the printer works... I'll show the patch to upstream and see on what we can agree.
Upstream merged the patch, I'll backport it to F29.
Nice. Thanks a lot for solving this!
Bodhi is extremely cooperative today, so I'll change the bug manually... https://bodhi.fedoraproject.org/updates/FEDORA-2018-317a8771d0 requested to testing.
cups-filters-1.20.3-11.fc29 has been pushed to the Fedora 29 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-317a8771d0
cups-filters-1.20.3-11.fc29 has been pushed to the Fedora 29 stable repository. If problems still persist, please make note of it in this bug report.