Bug 218552

Summary: Cups dies after update to 1.2.7
Product: [Fedora] Fedora Reporter: Stephen John Smoogen <smooge>
Component: cupsAssignee: Tim Waugh <twaugh>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: medium Docs Contact:
Priority: medium    
Version: 6   
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: F-8 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2008-01-30 09:55:43 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 235705    
Attachments:
Description Flags
Default mime-types no change from rpm -Vf /etc/cups/mime.types
none
Requested error log none

Description Stephen John Smoogen 2006-12-05 23:12:43 UTC
Description of problem:

Cups server crashes without error within 10 minutes of startup after updating to  
cups-1.2.7-1.3.fc6. 

Version-Release number of selected component (if applicable):
cups-1.2.7-1.3.fc6

How reproducible:
100% Multiple fedora core 6 boxes here are unable to print.

We have 2 centralized ESP Print Pro servers that all of our printers spool from.
Our past configuration of our cups clients was to add two lines: 

BrowsePoll print2.unm.edu
BrowsePoll print.unm.edu

and the rest of the lines are supposed to be default. On upgrading to cups-1.2.7
we are seeing that cups will run for about 10 minutes and then crash out without
any error messages. Have looked in selinux errors and not seen anything there
showing a violation. 

What is left behind are multiple cups-polld which do not seem to die unless
given a specific kill [service cups stop does not kill them]

With debugging turned on I get the following:

D [05/Dec/2006:16:09:38 -0700] [cups-polld print.unm.edu:631] Found 82 printers.
D [05/Dec/2006:16:09:39 -0700] [cups-polld print2.unm.edu:631] Found 79 printers.
D [05/Dec/2006:16:10:08 -0700] [cups-polld print.unm.edu:631] Found 82 printers.
D [05/Dec/2006:16:10:09 -0700] [cups-polld print2.unm.edu:631] Found 79 printers.
D [05/Dec/2006:16:10:38 -0700] cupsdNetIFUpdate: "lo" = localhost...
D [05/Dec/2006:16:10:38 -0700] cupsdNetIFUpdate: "eth0" = glasya.unm.edu...
D [05/Dec/2006:16:10:38 -0700] cupsdNetIFUpdate: "lo" = localhost...
D [05/Dec/2006:16:10:38 -0700] cupsdNetIFUpdate: "eth0" =
fe80::207:e9ff:fecc:b559%eth0...
D [05/Dec/2006:16:10:38 -0700] [cups-polld print.unm.edu:631] Found 82 printers.
D [05/Dec/2006:16:10:39 -0700] Discarding unused printer-deleted event...
D [05/Dec/2006:16:10:39 -0700] Remote destination "puppis@print" has timed out;
deleting it...
D [05/Dec/2006:16:10:39 -0700] Discarding unused printer-deleted event...
D [05/Dec/2006:16:10:39 -0700] Remote destination "puppis@print2-new" has timed
out; deleting it...
D [05/Dec/2006:16:10:39 -0700] Discarding unused printer-state-changed event...

Looking at this after a restart.. it ALWAYS dies after that.

Comment 1 Tim Waugh 2006-12-06 09:45:19 UTC
Please install the corresponding cups-debuginfo package and then, after
restarting cups (so that it is expected to exhibit this bug), find out the
process ID of cupsd and attach gdb to it like this:

gdb /usr/sbin/cupsd 4295

(if 4295 is the process ID of cupsd).

Then, at the (gdb) prompt, enter:

  continue

The next (gdb) prompt should be from the crash -- at that point, enter 'bt' and
copy the backtrace you get into this bug report.  If possible, don't exit the
session and I will try to get back to you in order to ask for any further
information I need.  Thanks.


Comment 2 Stephen John Smoogen 2006-12-06 18:39:37 UTC
I am looking at it. I installed the cups-debuginfo and attached gdb to it, and
the problem stopped happening, but I am not seeing the "Remote Destination" line
appearing anymore either. I am trying to see if I can casue the "Remote
destination" and see if it triggers cups from dieing. Will let you know.

Comment 3 Stephen John Smoogen 2007-03-12 02:40:57 UTC
Sorry for the long delay. It would happen when I didnt have gdb running.. and
when I had gdb running it wouldnt happen for a week or more. I finally got it
today with gdb

Long time tracking it down. The only settings away from the default cups conf
are adding two lines:

BrowsePoll print.unm.edu:631
BrowsePoll print2.unm.edu:631

cups-1.2.7-1.8.fc6
cups-debuginfo-1.2.7-1.8.fc6
cups-libs-1.2.7-1.8.fc6
cups-pdf-2.4.4-1.fc6


Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1208961328 (LWP 19605)]
delete_rules (rules=0x81006e6f) at mime.c:452
452         if (rules->child != NULL)
(gdb) where
#0  delete_rules (rules=0x81006e6f) at mime.c:452
#1  0x80048624 in delete_rules (rules=0x81bac768) at mime.c:453
#2  0x800493a6 in mimeDeleteType (mime=0x81b327c0, mt=0x81bebe48) at mime.c:169
#3  0x80042cdf in cupsdDeletePrinter (p=0x81bb3ba8, update=1) at printers.c:743
#4  0x8001dbac in cupsdSendBrowseList () at dirsvc.c:839
#5  0x80020931 in main (argc=1, argv=0xbfae2a34) at main.c:935
(gdb) t
[Current thread is 1 (Thread -1208961328 (LWP 19605))]

Let me know what other commands you want me to try.

Comment 4 Stephen John Smoogen 2007-03-12 02:45:08 UTC
(gdb) frame
#0  delete_rules (rules=0x81006e6f) at mime.c:452
452         if (rules->child != NULL)
(gdb) up
#1  0x80048624 in delete_rules (rules=0x81bac768) at mime.c:453
453           delete_rules(rules->child);
(gdb) up 
#2  0x800493a6 in mimeDeleteType (mime=0x81b327c0, mt=0x81bebe48) at mime.c:169
169       delete_rules(mt->rules);
(gdb) up
#3  0x80042cdf in cupsdDeletePrinter (p=0x81bb3ba8, update=1) at printers.c:743
743       mimeDeleteType(MimeDatabase, p->filetype);
(gdb) up
#4  0x8001dbac in cupsdSendBrowseList () at dirsvc.c:839
839             cupsdDeletePrinter(p, 1);
(gdb) up
#5  0x80020931 in main (argc=1, argv=0xbfae2a34) at main.c:935
935           cupsdSendBrowseList();
(gdb) up
Initial frame selected; you cannot go up.


Comment 5 Stephen John Smoogen 2007-03-12 02:50:20 UTC
There is nothing in the logs that I can tell occuring on the client. I am
looking at the upstream cups printers to see if they are triggering something.
Any ideas would be welcome. I will keep the gdb open for the time being in case
there are stack items you want me to give you.

Comment 6 Tim Waugh 2007-03-12 10:40:15 UTC
Try this:

select-frame 3
print p->name
print p->uri


Comment 7 Stephen John Smoogen 2007-03-12 14:23:55 UTC
(gdb) select-frame 3
(gdb) print p->name
$1 = 0x81bebdd0 "vaad5"
(gdb) print p->uri
$2 = 0x81b50e90 "ipp://print2.unm.edu:631/printers/vaad5"


I will look on the second print server to see what was going on with this one
last night (if anything.)

Comment 8 Stephen John Smoogen 2007-03-12 14:29:49 UTC
Ok vaad5 is the last printer on the list. We have 78 printers defined on print
and print2.. I do not see any errors listed for vaad5 on the print server which
is currently running ESP-Pro's version.

Comment 9 Tim Waugh 2007-03-13 12:13:55 UTC
vaad5 was being removed from the printer list because too much time had elapsed
since it was last advertised by the server (this is the BrowseTimeout).  Somehow
while we were cleaning up the MIME database to remove printer/vaad5, we followed
a stale pointer.

Not sure at this point what the root cause is.

1. Could you attach your /etc/cups/mime.types file please?

2. It would also be helpful if I could see the debugging output when the printer
is first seen.  Here's how to get that (as root):

Stop CUPS:

/sbin/service cups stop

Clear out the error_log:

>/var/log/cups/error_log

Edit /etc/cups/cupsd.conf and change the 'LogLevel' line to read:

LogLevel debug2

Remove the remote printers cache:

rm -f /var/cache/cups/remote.cache

Start CUPS again:

/sbin/service cups start

Now wait for a few minutes and see if 'vaad5' is available to print to.  Once it
is, please attach the /var/log/cups/error_log file to this bug report.

Thanks!

Comment 10 Stephen John Smoogen 2007-03-13 20:10:07 UTC
Created attachment 149977 [details]
Default mime-types no change from rpm -Vf /etc/cups/mime.types

This is the requested mime.types. Looks to be default file.

Comment 11 Stephen John Smoogen 2007-03-13 20:25:25 UTC
Created attachment 149980 [details]
Requested error log

This is the error log from debug2. One interesting thing is that when I put the
system into debug2.. I think the issue may be with one of the polled servers
timing out. We have two polls (print.unm.edu and print2.unm.edu). Should I look
at the cups-polld next time also?

Comment 12 Tim Waugh 2007-07-02 12:14:26 UTC
(Fixing bug status)

Comment 13 Stephen John Smoogen 2007-07-02 19:14:54 UTC
Did you need more info? By the way... the poll patch seems to have shown up in
the RHEL-3.9 cups and is causing problems on a lot of boxes with /var/log/cups
filling up.

Comment 14 Tim Waugh 2007-07-02 22:42:32 UTC
Only today spotted this had the information added; haven't checked through it yet.

What poll patch?  Does that bug have its own bugzilla report?  If not, would you
mind filing one?  Thanks.

Comment 15 Stephen John Smoogen 2007-07-02 23:20:30 UTC
It has a RH Trouble ticket.. but I am guessing we are one of the few places
using browsepolling :).

Bugzilla Bug 246545: Cups fills up logfiles if queue is turned on

Comment 16 Stephen John Smoogen 2008-01-30 02:22:41 UTC
As far as I can tell.. this is fixed except for the RHEL-3 bug. Should you or I
combine the two now

Comment 17 Tim Waugh 2008-01-30 09:55:43 UTC
I'll close this one and try to get the RHEL-3 one addressed.