Description of problem: My computer is connected to the printed of my work via samba. Once in a while cups just dies without warnings or even notice. I see it by seing all printers being absent. dmesg returns at the bottom: [1185471.785553] SELinux: 2048 avtab hash slots, 210653 rules. [1185471.841767] SELinux: 2048 avtab hash slots, 210653 rules. [1185471.982002] SELinux: 9 users, 14 roles, 3473 types, 181 bools, 1 sens, 1024 cats [1185471.982006] SELinux: 77 classes, 210653 rules [1211561.415362] cupsd[1386]: segfault at 8 ip 00007fc3ba58dd81 sp 00007fffa1791b18 error 4 in libc-2.12.90.so[7fc3ba50b000+19a000] [1273199.184587] cupsd[10509]: segfault at 8 ip 00007f5d8b46ad81 sp 00007fff58142a58 error 4 in libc-2.12.90.so[7f5d8b3e8000+19a000] Version-Release number of selected component (if applicable): # rpm -q cups cups-1.4.4-11.fc14.x86_64 How reproducible: often Steps to Reproduce: 1. I don't know how to reproduce it since I do not know when it happens 2. 3. Actual results: # service cups status cupsd dead but pid file exists Expected results: Having cups running Additional info:
Please run this command as root: debuginfo-install cups and then attach gdb to cupsd and leave it running until it crashes. Do this by finding the PID of cupsd (ps ax | grep [c]upsd) and running gdb like this: gdb /usr/sbin/cupsd 1711 (gdb) continue When it crashes again, please run these gdb commands: bt info locals up info locals up info locals up info locals and paste the results here. Thanks!
Created attachment 465248 [details] gdb output asked
Btw, I attached the asked gdb output
Thank you, but the output unfortunately doesn't contain the info we need. I think (maybe I'm wrong) that you pressed Enter when you saw that cupsd had crashed which evoked the previous command, i.e 'continue' which caused that cupsd was already terminated in the time when you tried to obtain info with 'bt' and 'info locals'. Can you try it again, please ? Thanks.
Attached in the output from gdb, I ran the given command after seeing that gdb announced the segfault but I still have: # service cups status cupsd (pid 32204) is running... (on the contrary to previous time)
Created attachment 467928 [details] gdb output asked (version 2)
Does it contains useful information this time ?
It tells us how it crashed, yes, but I haven't had time to fully investigate it yet I'm afraid. It was trying to send a D-Bus message at the time, but it isn't clear from the stack trace what's wrong.
Well, thanks for looking at this. At the moment I have no problem reproducing it ([1]) so let me know if I can help. [1] # service cups status cupsd dead but pid file exists
OK, if you can repeat it that makes it a lot easier to debug. Can you get to the same point in gdb again, and then run this?: select-frame 6 info locals p *dest Thanks.
Should I run bt info locals up info locals up info locals up info locals before select-frame 6 info locals p *dest ?
Just: bt select-frame 6 info locals p *dest In the "bt" output, I'm hoping that it says "#6 ... cupsd_send_dbus ..." -- if not, take a look to see which frame number cupsd_send_dbus is in and use 'select-frame' for that number instead. Thanks.
And there is the output (told you I can reproduce it :-)).
Created attachment 468646 [details] gdb output of the select-frame 6 & info locals
Thanks. How are you reproducing the problem? At the moment I'm thinking that it's some sort of issue with a job not getting stopped when the printer it is queued for reaches its browse timeout. We expect the printer to disappear from the CUPS list of printers, and for any jobs currently processing to be stopped, but it looks like that might not be happening for some reason. It would be a great help if you can still reproduce the problem after doing this: cupsctl LogLevel=debug2 service cups restartlog If so, please attach /var/log/cups/error_log (you might have to compress it with e.g. xz first as it might be quite large).
To reproduce the problem I just let cups running. I don't even try to print or anything. The surprising bits is that: # service cups status cupsd (pid 20806) is running... While I have not touched it since I posted the log yesterday (not even restarted it). I will run the command and keep you informed of the output.
Created attachment 468886 [details] error_log from cups running debug2
Ha! It times out the printer, stops the job, and then starts the job again. No wonder... Still analysing the cause.
Could you please attach your /etc/cups/cupsd.conf file? Thanks.
Created attachment 468923 [details] cups.conf
OK, I've managed to reproduce the problem locally. Here's what I did: 1. Stop CUPS. 2. Edit /var/cache/cups/remote.cache and add these lines to the end: <Printer Nemo> BrowseTime 1292500224 Info Not there Location Nowhere MakeModel Mysteriously absent DeviceURI ipp://192.168.20.10:631/printers/Nemo State Idle Type 16814094 Accepting Yes JobSheets none none Option job-sheets none,none Option lease-duration 300 </Printer> 3. Start CUPS 4. Turn on debugging: cupsctl LogLevel=debug2 5. Submit a job to Nemo e.g. "echo . | lp -d Nemo" 6. Now restart CUPS again and clear the log: service cups restartlog 7. Wait for 5 minutes Without performing step 6 I couldn't reproduce the problem, which leads me to wonder if it's a timing issue of some sort. The backend is retrying every 30 seconds, and the browse timeout is 300 seconds, so the backend is logging messages at exactly the moment we time out its printer. Here's the log at that point: d [16/Dec/2010:12:54:32 +0000] select_timeout(0): 8 seconds to browse timeout a printer d [16/Dec/2010:12:54:40 +0000] cupsdAddEvent(event=printer-deleted, dest=0x7f6d86a6f4e0(Nemo), job=(nil)(0), text="%s '%s' deleted by directory services (timeout).", ...) d [16/Dec/2010:12:54:40 +0000] cupsd_send_notification(sub=0x7f6d86aa7590(281), event=0x7f6d86ac6aa0(printer-deleted)) d [16/Dec/2010:12:54:40 +0000] cupsd_send_notification(sub=0x7f6d86aa76e0(282), event=0x7f6d86ac46f0(printer-deleted)) D [16/Dec/2010:12:54:40 +0000] cupsdMarkDirty(-----S) D [16/Dec/2010:12:54:40 +0000] cupsdSetBusyState: Printing jobs and dirty files D [16/Dec/2010:12:54:40 +0000] Remote destination "Nemo" has timed out; deleting it... d [16/Dec/2010:12:54:40 +0000] cupsdDeletePrinter(p=0x7f6d86a6f4e0(Nemo), update=1) d [16/Dec/2010:12:54:40 +0000] cupsdSetJobState(job=0x7f6d86a749b0(47), state=5, newstate=3, action=1, message="Job stopped due to printer being deleted.") d [16/Dec/2010:12:54:40 +0000] stop_job(job=0x7f6d86a749b0(47), action=1) d [16/Dec/2010:12:54:40 +0000] cupsdEndProcess(pid=1058, force=1) d [16/Dec/2010:12:54:40 +0000] cupsdAddEvent(event=job-state-changed, dest=0x7f6d86a6f4e0(Nemo), job=0x7f6d86a749b0(47), text="%s", ...) d [16/Dec/2010:12:54:40 +0000] cupsd_send_notification(sub=0x7f6d86aa7590(281), event=0x7f6d86ac4cb0(job-state-changed)) D [16/Dec/2010:12:54:40 +0000] cupsdMarkDirty(-----S) I [16/Dec/2010:12:54:40 +0000] [Job 47] Job stopped due to printer being deleted. D [16/Dec/2010:12:54:40 +0000] cupsdMarkDirty(----J-) d [16/Dec/2010:12:54:40 +0000] finalize_job(job=0x7f6d86a749b0(47)) d [16/Dec/2010:12:54:40 +0000] cupsdSetPrinterReasons(p=0x7f6d86a6f4e0(Nemo),s="-connecting-to-device" d [16/Dec/2010:12:54:40 +0000] cupsdSetPrinterReasons(p=0x7f6d86a6f4e0(Nemo),s="-com.apple.print.recoverable-warning" D [16/Dec/2010:12:54:40 +0000] cupsdMarkDirty(P-----) d [16/Dec/2010:12:54:40 +0000] cupsdSetPrinterReasons(p=0x7f6d86a6f4e0(Nemo),s="-offline-report" d [16/Dec/2010:12:54:40 +0000] cupsdDeleteProfile(profile="(null)") d [16/Dec/2010:12:54:40 +0000] cupsdRemoveSelect(fd=13) d [16/Dec/2010:12:54:40 +0000] cupsdCheckJobs: 1 active jobs, sleeping=0, reload=0 d [16/Dec/2010:12:54:40 +0000] start_job(job=0x7f6d86a749b0(47), printer=0x7f6d86a6f4e0(Nemo)) ...and it starts the job again.
Reported upstream. The reason is that cupsdDeletePrinter() assumes that this is sufficient to prevent jobs being started in the queue: 787 cupsdSetPrinterState(p, IPP_PRINTER_STOPPED, update); In fact, it isn't: 2952 /* 2953 * Can't set status of remote printers... 2954 */ 2955 2956 if (p->type & CUPS_PRINTER_DISCOVERED) 2957 return; When cupsdCheckJobs() is called later (from finalize_job()), it re-starts the job.
I am glad you could reproduce it. Let me know if there is anything else I can do.
This build contains the fix: http://koji.fedoraproject.org/koji/buildinfo?buildID=211214
*** Bug 667920 has been marked as a duplicate of this bug. ***
cups-1.4.6-1.fc14 has been submitted as an update for Fedora 14. https://admin.fedoraproject.org/updates/cups-1.4.6-1.fc14
cups-1.4.6-1.fc13 has been submitted as an update for Fedora 13. https://admin.fedoraproject.org/updates/cups-1.4.6-1.fc13
cups-1.4.6-1.fc13 has been pushed to the Fedora 13 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with su -c 'yum --enablerepo=updates-testing update cups'. You can provide feedback for this update here: https://admin.fedoraproject.org/updates/cups-1.4.6-1.fc13
cups-1.4.6-1.fc14 has been pushed to the Fedora 14 stable repository. If problems still persist, please make note of it in this bug report.
cups-1.4.6-1.fc13 has been pushed to the Fedora 13 stable repository. If problems still persist, please make note of it in this bug report.