From Bugzilla Helper: User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20020823 Netscape/7.0 Description of problem: Hi Tim, Sometimes I have a print job jam a print queue on cups. This also happened on the pre-ipp 100% patch version I was running. Anyway, here's a log (from log level debug) of a job which stopped a printer. Let me know what else I can do to get you more info. This is all that appears in the log file for this job. (and the print queue just james and cancelling this job frees it up again) I [12/Sep/2003:19:51:05 +0100] Job 5573 queued on 'filtered_lime' by 'nt'. D [12/Sep/2003:19:51:05 +0100] Job 5573 hold_until = 0 D [12/Sep/2003:19:51:05 +0100] StartJob(5573, 0x809f1f0) D [12/Sep/2003:19:51:05 +0100] StartJob() id = 5573, file = 0/1 D [12/Sep/2003:19:51:05 +0100] job-sheets=none,none D [12/Sep/2003:19:51:05 +0100] banner_page = 0 D [12/Sep/2003:19:51:05 +0100] StartJob: argv = "filtered_lime","5573","nt","KDE Print System","1","orientation-requested=3","/var/spool/cups/d05573-001" D [12/Sep/2003:19:51:05 +0100] StartJob: envp = "PATH=/usr/lib/cups/filter:/bin:/usr/bin","SOFTWARE=CUPS/1.1","USER=root","CHARSET=iso-8859-1","LANG=en_US","","PPD=/etc/cups/ppd/filtered_lime.ppd","CUPS_SERVERROOT=/etc/cups","RIP_MAX_CACHE=8m","TMPDIR=/var/spool/cups/tmp","CONTENT_TYPE=application/postscript","DEVICE_URI=ipp://prn1106-2.ee.ic.ac.uk:631/ipp/","PRINTER=filtered_lime","CUPS_DATADIR=/usr/share/cups","CUPS_FONTPATH=/usr/share/cups/fonts","","","" D [12/Sep/2003:19:51:05 +0100] StartJob: statusfds = 15, 16 D [12/Sep/2003:19:51:05 +0100] StartJob: filterfds[1] = 17, -1 D [12/Sep/2003:19:51:05 +0100] StartJob: filter = "/usr/lib/cups/filter/pstops" D [12/Sep/2003:19:51:05 +0100] StartJob: filterfds[0] = 18, 19 D [12/Sep/2003:19:51:05 +0100] start_process("/usr/lib/cups/filter/pstops", 0xbffefa10, 0xbffeeec0, 17, 19, 16) I [12/Sep/2003:19:51:05 +0100] Started filter /usr/lib/cups/filter/pstops (PID 6748) for job 5573. D [12/Sep/2003:19:51:05 +0100] StartJob: backend = "/usr/lib/cups/backend/ipp" D [12/Sep/2003:19:51:05 +0100] StartJob: filterfds[1] = -1, 17 D [12/Sep/2003:19:51:05 +0100] start_process("/usr/lib/cups/backend/ipp", 0xbffefa10, 0xbffeeec0, 18, 17, 16) I [12/Sep/2003:19:51:05 +0100] Started backend /usr/lib/cups/backend/ipp (PID 6749) for job 5573. D [12/Sep/2003:19:51:05 +0100] ProcessIPPRequest: 14 status_code=0 D [12/Sep/2003:19:51:05 +0100] CloseClient() 14 D [12/Sep/2003:19:51:05 +0100] [Job 5573] Page = 595x842; 12,12 to 583,830 D [12/Sep/2003:19:51:05 +0100] [Job 5573] 0 %%Pages: (atend) D [12/Sep/2003:19:51:05 +0100] [Job 5573] 0 %%BoundingBox: 0 0 810 1085 D [12/Sep/2003:19:51:05 +0100] [Job 5573] 0 %%HiResBoundingBox: 0.000000 0.000000 809.625000 1084.125000 D [12/Sep/2003:19:51:05 +0100] [Job 5573] 0 %%Creator: ESP Ghostscript 705 (pswrite) D [12/Sep/2003:19:51:05 +0100] [Job 5573] 0 %%CreationDate: 2003/09/12 19:51:02 D [12/Sep/2003:19:51:05 +0100] [Job 5573] 0 %%DocumentData: Clean7Bit D [12/Sep/2003:19:51:05 +0100] [Job 5573] 0 %%LanguageLevel: 2 D [12/Sep/2003:19:51:05 +0100] [Job 5573] 0 %%EndComments D [12/Sep/2003:19:51:05 +0100] [Job 5573] 0 %%BeginProlog D [12/Sep/2003:19:51:05 +0100] [Job 5573] 0 %%BeginResource: procset GS_pswrite_2_0_1001 D [12/Sep/2003:19:51:05 +0100] [Job 5573] 0 %%EndResource D [12/Sep/2003:19:51:05 +0100] [Job 5573] 0 %%EndProlog D [12/Sep/2003:19:51:05 +0100] [Job 5573] 0 %%Page: 1 1 D [12/Sep/2003:19:51:05 +0100] [Job 5573] 0 %%Page: 1 1 D [12/Sep/2003:19:51:05 +0100] [Job 5573] pw = 571.1, pl = 817.6 D [12/Sep/2003:19:51:05 +0100] [Job 5573] PageLeft = 12.0, PageRight = 583.1 D [12/Sep/2003:19:51:05 +0100] [Job 5573] PageTop = 829.7, PageBottom = 12.1 D [12/Sep/2003:19:51:05 +0100] [Job 5573] PageWidth = 595.0, PageLength = 842.0 D [12/Sep/2003:19:51:05 +0100] [Job 5573] 0 %%BeginPageSetup D [12/Sep/2003:19:51:05 +0100] [Job 5573] 0 %%EndPageSetup D [12/Sep/2003:19:51:05 +0100] [Job 5573] 0 %%PageTrailer D [12/Sep/2003:19:51:05 +0100] [Job 5573] 0 %%Trailer D [12/Sep/2003:19:51:05 +0100] [Job 5573] Saw Trailer! D [12/Sep/2003:19:51:05 +0100] [Job 5573] Saw EOF! D [12/Sep/2003:19:51:05 +0100] [Job 5573] Getting supported attributes... I have not been able to figure out any corrolation between files which occasionally jam the printers. Happens sonmetimes from jobs submitted from other cups clients and sometimes from windows. Version-Release number of selected component (if applicable): cups-1.1.17-13.3.0.1 How reproducible: Sometimes Steps to Reproduce: 1.print lots 2.occasionally qil jam a queue - other queues are not effected. 3. Additional info: Cheers, Tim. Paul
just thought, let me know if you want me to send you the d of the c files. Paul
At the point that happens, is the ipp process still around? I'd love to know what it's up to. First of all it would be useful to see the system calls that it is using, with: strace -p {pid} 2>strace.log It could also be useful to see which function it is in, and how it got there, using: gdb /usr/lib/cups/backend/ipp {pid} (gdb) bt Also: how was this print queue configured, and what does the DeviceURI line associated with that queue look like (in /etc/cups/printers.conf)? Thanks.
Hi Tim - yes, correct, the ipp process is still hanging around...... here we go again: [root@capprint root]# strace -p 23061 recv(3, That's all for that.... (but just remembered the printer'e ethernet is unplugged at the moment so perhaps this is why it is pausing here - will plug it back in the morning. Here's a back trace: (gdb) bt #0 0xffffe002 in ?? () #1 0x4001e65d in httpUpdate (http=0x8055540) at http.c:1546 #2 0x400291ab in cupsDoFileRequest (http=0x8055540, request=0x8058318, resource=0xbfffe0a0 "/ipp/", filename=0x0) at util.c:322 #3 0x0804944c in main (argc=7, argv=0xbfffeb04) at ipp.c:318 #4 0x420156a4 in __libc_start_main () from /lib/tls/libc.so.6 (gdb) I will leave the job jamming the queue until you let me know if you need more info. Cheers and thanks for working on this, Paul
almost forgot - this queue is configured a raw queue: <Printer raw_redbull> Info HP Laserjet 4200DTN Location 1109 DeviceURI ipp://prn1109-2.ee.ic.ac.uk/ipp/ State Idle Accepting Yes JobSheets none none QuotaPeriod 0 PageLimit 0 KLimit 0 </Printer>
Just thought - I also get this on lpd backends too - I have a couple of netgear print servers which run lpd and they stick in the same fasion. bet that makes it harder... :-( Paul
Tim, I have plugged up the printer again and the stack trace give no more info than before It's just hanging on: recv(3,: Paul
Tim, Getting the recv hang on the strace again. strace -p 20438 recv(3, How can I help debug this? This is now with cups version cups-1.1.17-13.3.0.3 Paul
Please install this package: ftp://people.redhat.com/twaugh/tmp/cups-debuginfo-1.1.17-13.3.0.3.i386.rpm and attach to the process with gdb as before. Does the stack trace look the same as last time? Also is this the 'ipp' process still?
[root@capprint root]# gdb /usr/lib/cups/backend/ipp 998 GNU gdb Red Hat Linux (5.3post-0.20021129.18rh) Copyright 2003 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-redhat-linux-gnu"...(no debugging symbols found)... Attaching to program: /usr/lib/cups/backend/ipp, process 998 Reading symbols from /usr/lib/libcups.so.2...loading separate debug info from '/usr/lib/debug//usr/lib//libcups.so.2.debug' Reading symbols from /usr/lib/debug//usr/lib//libcups.so.2.debug...done. done. Loaded symbols for /usr/lib/libcups.so.2 Reading symbols from /lib/libnsl.so.1...done. Loaded symbols for /lib/libnsl.so.1 Reading symbols from /lib/libcrypt.so.1...done. Loaded symbols for /lib/libcrypt.so.1 Reading symbols from /lib/tls/libc.so.6...done. Loaded symbols for /lib/tls/libc.so.6 Reading symbols from /lib/libssl.so.4...done. Loaded symbols for /lib/libssl.so.4 Reading symbols from /lib/libcrypto.so.4...done. Loaded symbols for /lib/libcrypto.so.4 Reading symbols from /lib/ld-linux.so.2...done. Loaded symbols for /lib/ld-linux.so.2 Reading symbols from /usr/kerberos/lib/libgssapi_krb5.so.2...done. Loaded symbols for /usr/kerberos/lib/libgssapi_krb5.so.2 Reading symbols from /usr/kerberos/lib/libkrb5.so.3...done. Loaded symbols for /usr/kerberos/lib/libkrb5.so.3 Reading symbols from /usr/kerberos/lib/libcom_err.so.3...done. Loaded symbols for /usr/kerberos/lib/libcom_err.so.3 Reading symbols from /usr/kerberos/lib/libk5crypto.so.3...done. Loaded symbols for /usr/kerberos/lib/libk5crypto.so.3 Reading symbols from /lib/libresolv.so.2...done. Loaded symbols for /lib/libresolv.so.2 Reading symbols from /lib/libdl.so.2...done. Loaded symbols for /lib/libdl.so.2 Reading symbols from /usr/lib/libz.so.1...done. Loaded symbols for /usr/lib/libz.so.1 Reading symbols from /lib/libnss_files.so.2...done. Loaded symbols for /lib/libnss_files.so.2 0xffffe002 in ?? () (gdb) bt #0 0xffffe002 in ?? () #1 0x4001df58 in httpRead (http=0x8057058, buffer=0xbfff0f40 "", length=134574168) at http.c:921 #2 0x4001dbf4 in httpFlush (http=0x8057058) at http.c:888 #3 0x400290b1 in cupsDoFileRequest (http=0x8055418, request=0x8058208, resource=0xbfffe390 "/ipp/", filename=0xbfffdf90 "/var/spool/cups/d10545-001") at util.c:482 #4 0x08049854 in main (argc=7, argv=0xbfffedf4) at ipp.c:583 #5 0x420156a4 in __libc_start_main () from /lib/tls/libc.so.6 (gdb) what next?
I'd like to see what 'print *http' says.
ok, how do i do this? Paul
Just type it at the (gdb) prompt.
(gdb) print *http No symbol "http" in current context.
Oops, misread the stack. Try this: (gdb) up (gdb) print *http
(gdb) up #1 0x4001df58 in httpRead (http=0x8057058, buffer=0xbfff0f40 "", length=134574168) at http.c:921 921 if (httpGets(len, sizeof(len), http) == NULL) (gdb) print *http $1 = {fd = 1751321101, blocking = 1866673509, error = 1869771886, activity = 1830828652, state = 1630369889, status = 859661671, version = 741355574, keep_alive = 1651863584, hostaddr = {sin_family = 26988, sin_port = 3427, sin_addr = {s_addr = 1846152458}, sin_zero = "trol: ma"}, hostname = "x-age=3600, public\r\n\r\nol: max-age=3600, public\r\n\r\nrol: max-age=3600, public\r\n\r\n=3600, public\r\n\r\none\003\r\n\r\n\nssion-supported\000\004none\003\r\n\r\n\r\n\nion-supported\000\004none\003\r\n\r\n\r\n\nion-supported\000\004none\003\r\n\r\n\r\n\nmpression-su"..., fields = { "\r\n\r\n\nssion-supported\000\004none\003\r\n\r\nsion-supported\000\004none\003\r\n\r\n\nne\003\r\n\r\n¶\205\234D\000\025compression-supported\000\004none\003\r\n\r\n\nne\003\r\n\r\npression-supported\000\004none\003\r\n\r\n\nne\003\r\n\r\n\r\n\n\004none\003\r\n\r\nupported\000\004none\003\r\n\r\n\n\004none\003\r\n\r\nsupported\000"..., "e\003\r\n\r\nn-supported\000\004none\003 ===================== and lots more crap before this at the end: ==================== @;¸\002@;¸\002@;¸\002@;¸\002@;¸\002@;¸\002@;¸\002@;¸\002@;¸\002@;¸\002@;¸\002@;¸\002@;¸\002@;¸\002@;¸\002@;¸\002@;¸\002@;¸\002@;¸\002@;¸\002@;¸\002@;¸\002@;¸\002@;¸\002@;¸\002@"..., "Cancel\000\000\000\000\000\000\021\000\000\000Help\000\000\000\000\000\000\000\000\021\000\000\000Quit\000\000\000\000\000\000\000\000\021\000\000\000Close\000\000\000\000\000\000\000\021\000\000\000Yes\000\000\000\000\000\000\000\000\000\021\000\000\000No\000\000\000\000\000\000\000\000\000\000\021\000\000\000On\000\000\000\000\000\000\000\000\000\000\021\000\000\000Off\000\000\000\000\000\000\000\000\000\021\000\000\000Save\000\000\000\000\000\000\000\000\021\000\000\000Discard\000\000\000\000\000\021\000\000\000Default\000\000\000\000\000\021\000\000\000Options\000\000\000\000\000\021\000\000\000More Inf"...}, data = 0x6567614d <Address 0x6567614d out of bounds>, data_encoding = 6386798, data_remaining = 0, used = 17, buffer = "Yellow\000\000\000\000\000\000I\000\000\000Copyright 1993-2002 by Easy Software Products, All Rights Reserved.\000\021\000\000\000General\000\000\000\000\000\021\000\000\000Printer\000\000\000\000\000\021\000\000\000Image\000\000\000\000\000\000\000\021\000\000\000HP-GL/2\000\000\000\000\000\021\000\000\000Extra\000\000\000\000\000\000\000\021\000\000\000Document\000\000\000\000\021\000\000\000Other\000\000\000\000\000\000\000\031\000\000\000"..., auth_type = 2036428832, md5_state = {count = {544175136, 1701012321}, abcd = {1948283763, 544434536, 1987208563, 3043941}, buf = "I\000\000\000You don't have permission to access the resource on this ser"}, nonce = "ver.\000\000\000\000A\000\000\000The requested resource was not found on this server.\000\000\000\000\000\000\000\000A\000\000\000The requested method is not allowed with the resource.\000\000\000\000\000\000Y\000\000\000An appropriate representation for the resource was not found"..., nonce_count = 1702065440, tls = 0x69687420, encryption = 1702043763}
Superb. I've had an idea that will let me analyze this 'off-line': at the (gdb) prompt type 'detach' and then quit. Then try using the 'gcore' program to create a core file from the running process, like: gcore -o ipp-core 998 If that works, send me the ipp-core file by mail (twaugh) and I'll be able to do the gdbing myself. Thanks.
What sort of device is 'ipp://prn1109-2.ee.ic.ac.uk/ipp/'? A CUPS server, or an actual printer?
This is an HP 4200DTN printer. Firmware version is 20030530 04.016.1 in case you need it. I'm using the postscript drivers for windows printing and the PPD file for CUPS clients. (Now running 1.1.19 and all seems well except that very very rarely more than 1 copy of a dosument is printed from windows when only one is requested. But I guss thisn mucst be the windows drivers?) There have been no james yet. Have you/Mike had any ideas yet? Do you think it is the printer? Also, have the other IPP changes been incorporated into RHEL 3? Thanks, Paul
I don't see any changes between our patched 1.1.17 and the 1.1.19 code that look like they might have any effect on this. And at the point we're at, the IPP request has been sent and we're waiting for a response. (Yes, the other IPP issue is fixed in RHEL 3.)
tcpdump |grep prn1109-2 > /root/ipp.txt will this do? Paul
Unfortunately not, you'll need to use tcpdump's -w option to capture the full packets. The best idea is to limit the captured packets to those whose source or destination is that printer (if it's always that one that jams?) and TCP port 'ipp'. Something like this: tcpdump -w captured host prn1109-2.ee.ic.ac.uk and port ipp Once you have a 'captured' file that contains the traffic involved in a jam (heh), I can read the packets out using tcpdump -r. To save disk space it might be a good idea to restart it each day (when no-one is using the printer).
Tim, OK: Here's what I remember: All the packages exhibit the infrequenct and random jamming behaviour. cups-1.1.17 (original redhat 9 version) no multiple copies (as far as I remember) - but did have ipp loop. cups-1.1.17 (released version with loop patch from as implemented in 1.1.19) exhibited the multiple copies printing but did not show the ipp loop. cups-1.1.19-13.0.0.0.1test.i386.rpm - multiple copies, no ippp loop bug 1.1.20 - under test now. Will tell you tomorrow if the multiple copies bug is fixed. I have one document which reliably caused the multiple copies. Cheers, Paul
Tim, As it is now, cups-1.1.20-1 from rawhide is working fine and has done for about a week. We have had no jams and no multiple copy printing. Where do we go from here? Can you just make 1.1.20 packages for redhat 9? (I guess 7.3 and below are almost done so no need to release for those of you hang on another 2 weeks!) Have you figured out the offending code? Is it somewhere in the changed implementation of ipp when the ipp code was changed for to stop the looping? Regards, Paul
Hmm, I had a report from Jeff Johnson that the multiple copies problem (which he also saw) was *not* fixed with 1.1.20-1. Jeff, can you clarify whether we are looking at the same problem here?
Is 104360 the appropriate bugzilla number for the multiple copy bug? I'm getting that with cups-1.1.17-13.3.0.3. Should I file another bug or follow up here?
Technically it's a separate bug, but since most of the information about it is already in this report, let's stick to this one. Just to clarify the situation, I think the original bug here (IPP loop) is fixed. It's just the multiple copies bug that is outstanding. Here is a package with some of the possible-cause fixes from the 1.1.19-13.0.0.0.3test -> 1.1.20-1 diff applied: ftp://people.redhat.com/twaugh/tmp/cups-1.1.17-13.3.0.4.i386.rpm ftp://people.redhat.com/twaugh/tmp/cups-libs-1.1.17-13.3.0.4.i386.rpm Please let me know how that one behaves.
Dan, Any chance you can test this new version? - the only box I can really do it on is my production print server and I have changed CUPS on that more times over the last few weeks than In have changed my socks. (And with the .20 version all currently is holding together well so I don't want to regress) If you are not able, then I will do it. Please advise. Paul
I'm on a production box too. I rolled back to 1.1.17-13, since the jamming bug (rarely exhibited for me) was preferable to eating a whole ream of paper. I'll try 1.1.17-13.3.0.4 this afternoon and will report back.
1.1.17-13.3.0.4 has been behaving for the last 22 hours or so, w/ no complaints from users about multiple copies. Take that w/ a grain of salt, since my feedback loop is rather lame (announce change, make change, wait for screams). I had little luck initially getting the bug to manifest itself on demand. So far no screams; Works For Me.
Any problems since then?
Hi Tim, still running cups-1.1.20-1 and it has not had a single error. No jams, no multiple print jobs, no ipp busy loop. Perfection. I would suggest to release this for EL3 and RH9 if you dare. (Actually I would be very pleased if you would put it on EL3 because I now have many workstations on EL3 (qute good - have some stuff in bugzilla about it but generally good) and my print server is still on 9, and I am not going to u/g that to EL until cups >= 1.1.20 exists on EL) Happy new year, belatedly, btw. Best wishes, Paul
I'd really like to know whether 1.1.17-13.3.0.4 also fixes the problem, since the risk is much lower. (Yes, happy new year!) Dan: are you still running that?
I just got a report of a Mac OS X 10.2 client getting multiple copies printing thru a raw queue on our server which has cups-1.1.17-13.3.0.4. He asked for two copies, but got four. I'm going to go try and confirm this personally. I'll add additional comments after that.
An errata has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2004-228.html