Bug 505807 - Canceled jobs spontaneously re-appear
Canceled jobs spontaneously re-appear
Status: CLOSED INSUFFICIENT_DATA
Product: Fedora
Classification: Fedora
Component: cups (Show other bugs)
11
x86_64 Linux
low Severity medium
: ---
: ---
Assigned To: Tim Waugh
Fedora Extras Quality Assurance
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2009-06-13 22:53 EDT by Matt Castelein
Modified: 2009-08-10 07:50 EDT (History)
1 user (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2009-08-05 15:39:04 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
packet capture (608.54 KB, application/octet-stream)
2009-06-17 13:11 EDT, Matt Castelein
no flags Details
All CUPS error logs (21.99 KB, application/octet-stream)
2009-06-18 08:32 EDT, Matt Castelein
no flags Details
CUPS error log (3.39 KB, application/octet-stream)
2009-06-23 14:15 EDT, Matt Castelein
no flags Details

  None (edit)
Description Matt Castelein 2009-06-13 22:53:39 EDT
Every large print job sent through the box running CUPS since updating it to Fedora 11 results in CUPS error "Unable to send trailing nul to printer" and at the printer, the job prints to about page 15, and then starts over again.

I have cups-1.4-0.b2.18.fc11.x86_64.
Comment 1 Matt Castelein 2009-06-15 15:40:18 EDT
Boy, I don't know.. CUPS is acting really STUPID since updating to the latest.  Jobs that were canceled are magically re-appearing, queues are getting frozen in "processing" when nothing is being processed..  Jobs printing halfway then starting over.. LPR socket errors, and none of this was happening before the update!
Comment 2 Tim Waugh 2009-06-17 10:06:35 EDT
I'd like to see the TCP traffic between the CUPS server and the printer.  Please run this command, as root:

tcpdump -s0 -U -w lpr.pcap -n host ${HOST}

replacing ${HOST} with the IP address of the printer, i.e. the hostname part of the lpd://${HOST}/... device URI.

Once tcpdump is running, print a large job to the printer so that the 'Unable to send trailing nul to printer' error occurs, then cancel the job and stop tcpdump (press control+C in that terminal window).

Then attach the lpr.pcap file to this bug report using the 'Add an attachment' link.  Thanks.
Comment 3 Matt Castelein 2009-06-17 11:32:08 EDT
The job is stuck at 2%.. how long before CUPS will give up?
Comment 4 Tim Waugh 2009-06-17 12:20:08 EDT
Without knowing the nature of the problem it's hard to say I'm afraid.  Is it behaving differently now, and not giving you the 'trailing nul' error?
Comment 5 Matt Castelein 2009-06-17 13:11:53 EDT
Created attachment 348301 [details]
packet capture

Wasn't always the 'trailing nul' anyway, as I said, since I upgraded.. Jobs get stuck in processing, or LPR error.. Some jobs that were canceled seem to re-appear on their own..  I'm going to stop the tcpdump now and attach it.. Looking at the traffic in wireshark, it looks like cups quit sending.. at least all the traffic I see is coming FROM the printer not going to it.
Comment 6 Matt Castelein 2009-06-17 13:16:55 EDT
This job DOES infact say "unable to send trailing nul to printer" - But only AFTER I canceled it.
Comment 7 Tim Waugh 2009-06-17 13:54:47 EDT
The printer's TCP buffer is full (173.626 seconds in), and the window is never opened up again.  I don't know why it does that rather than failing the job.

So given that this worked fine in Fedora 10 (right?) we'll need to look at how the PostScript we're sending to the printer has changed between Fedora 10 and Fedora 11.

How is your job submitted?  Is it from an application running on Fedora 11, or from some other machine on the network?
Comment 8 Matt Castelein 2009-06-17 13:58:06 EDT
Right, I don't recall ever having problems printing large jobs in Fedora 10..  This job is being sent from a windows client via samba.
Comment 9 Matt Castelein 2009-06-17 21:11:27 EDT
The test job resulted in no pages output at all.  At 16 Pages/Min I would think I would have seen page 1 after ~174 seconds..
Comment 10 Matt Castelein 2009-06-17 21:13:41 EDT
This test job, despite having been CANCELED by me, and confirmed thru the web interface as no longer listed..  Is back.  With a status of "pending" and an error "/usr/lib/cups/backend/lpd failed"  This is what I'm talking about when I say canceled jobs are coming back to life.
Comment 11 Tim Waugh 2009-06-18 04:57:07 EDT
1. Please attach /var/log/cups/error_log* in case there are any clues about why the job restarted.

2. I have prepared a new set of CUPS packages (1.4-0.rc1.3.fc11) and would like to know whether these problems are still present.

Here it is:
http://koji.fedoraproject.org/koji/buildinfo?buildID=106775

Please download all of the x86_64 RPM files, then update cups like this (as root):

rpm -Fvh cups-*1.4-0.rc1.3.fc11*.rpm

You shouldn't need to reboot or restart CUPS (updating the packages will restart it).

Let me know whether (a) LPD printing works, and (b) cancelled jobs are restarted.
Comment 12 Matt Castelein 2009-06-18 08:32:21 EDT
Created attachment 348431 [details]
All CUPS error logs
Comment 13 Matt Castelein 2009-06-18 08:38:29 EDT
rpm -Fvh cups-*1.4-0.rc1.3.fc11*.rpm installs only cups and cups-libs; Is that expected?
Comment 14 Matt Castelein 2009-06-18 08:40:55 EDT
The first thing I notice is, in the web interface there are now jobs showing on the "active jobs" page with a status of canceled.  These jobs are as old as last Saturday.  Shouldn't they be in the "completed jobs" page?
Comment 15 Matt Castelein 2009-06-18 09:03:27 EDT
There may be multiple issues here that are leading me off track.. This test job I've been using (a 276-page PDF) also fails to print directly from windows.  Could be a problem with the latest adobe reader- I've seen that happen before.  However that still does not explain why these canceled jobs are re-appearing.  I'm going to change the summary for this report, and just focus on that, OK?
Comment 16 Matt Castelein 2009-06-18 10:36:52 EDT
All canceled jobs are shown on the "completed jobs" page.  Some of those are also shown on the "active jobs" page.
Comment 17 Tim Waugh 2009-06-19 06:31:05 EDT
(In reply to comment #15)
> I'm going to change the summary for this report, and just focus on that, OK?  

Good idea.

What does this say when you paste it into a terminal window?:

python <<EOF
import cups, pprint
pprint.pprint (cups.Connection().getJobs())
EOF

Also, what does this say when you do the same, changing '1' to the job ID of a cancelled job that is showing in the active jobs list?:

JOBID=1
python <<EOF
import cups, pprint
pprint.pprint (cups.Connection().getJobAttributes($JOBID))
EOF
Comment 18 Matt Castelein 2009-06-19 09:59:28 EDT
(In reply to comment #17)

> What does this say when you paste it into a terminal window?:
> 
> python <<EOF
> import cups, pprint
> pprint.pprint (cups.Connection().getJobs())
> EOF

{161: {'job-k-octets': 80574,
       'job-media-sheets-completed': 0,
       'job-name': u'smbprn.00000001 181761.pdf',
       'job-originating-user-name': u'192.168.1.12\\blueh2o',
       'job-preserved': True,
       'job-printer-uri': u'ipp://arturo:631/printers/OptraR',
       'job-priority': 50,
       'job-state': 7,
       'time-at-completed': 1244946374,
       'time-at-creation': 1244723428,
       'time-at-processing': 1244848177},
 167: {'job-k-octets': 29,
       'job-media-sheets-completed': 1,
       'job-name': u'smbprn.00000001 Nero Cover Designer',
       'job-originating-user-name': u'192.168.1.12\\blueh2o',
       'job-preserved': True,
       'job-printer-uri': u'ipp://arturo:631/printers/OptraR',
       'job-priority': 50,
       'job-state': 7,
       'time-at-completed': 1245241987,
       'time-at-creation': 1245093965,
       'time-at-processing': 1245093991}}
 
> Also, what does this say when you do the same, changing '1' to the job ID of a
> cancelled job that is showing in the active jobs list?:
> 
> JOBID=1
> python <<EOF
> import cups, pprint
> pprint.pprint (cups.Connection().getJobAttributes($JOBID))
> EOF  
{'attributes-charset': u'utf-8',
 'attributes-natural-language': u'en-us',
 'document-count': 1,
 'document-format': u'application/vnd.cups-raw',
 'job-hold-until': u'no-hold',
 'job-id': 167,
 'job-k-octets': 29,
 'job-media-progress': 0,
 'job-media-sheets-completed': 1,
 'job-more-info': u'ipp://localhost:631/jobs/167',
 'job-name': u'smbprn.00000001 Nero Cover Designer',
 'job-originating-host-name': u'192.168.1.12',
 'job-originating-user-name': u'192.168.1.12\\blueh2o',
 'job-preserved': True,
 'job-printer-state-message': u'',
 'job-printer-state-reasons': [u'none'],
 'job-printer-up-time': 1245419832,
 'job-printer-uri': u'ipp://arturo:631/printers/OptraR',
 'job-priority': 50,
 'job-sheets': [u'none', u'none'],
 'job-state': 7,
 'job-state-reasons': u'job-canceled-by-user',
 'job-uri': u'ipp://localhost:631/jobs/167',
 'job-uuid': u'urn:uuid:414e2b81-9fa8-3a27-5732-133cb85bd384',
 'printer-uri': u'ipp://localhost/printers/OptraR',
 'time-at-completed': 1245241987,
 'time-at-creation': 1245093965,
 'time-at-processing': 1245093991}
Comment 19 Tim Waugh 2009-06-23 10:49:48 EDT
Interestingly, both jobs 161 and 167 have had this happen:

error_log:67:E [15/Jun/2009:15:26:31 -0400] [Job 167] Unable to convert file 1 to printable format!
error_log-20090614:316:E [12/Jun/2009:19:09:37 -0400] [Job 161] Unable to convert file 1 to printable format!

...which seems odd as they are raw jobs.
Comment 20 Tim Waugh 2009-06-23 11:27:46 EDT
This is something in common with bug #507442.  Both jobs 161 and 167 had been restarted, and in bug #507442 the job had been stopped due to lack of authentication, then restarted using CUPS-Authenticate-Job.

Perhaps something screwy is going on with CUPS's mechanism for restarting jobs.
Comment 21 Tim Waugh 2009-06-23 13:18:53 EDT
I've just built a CUPS package that will log some useful information (at log level 'error') when it is unable to convert a file.  Please install it, and watch for the problem to reappear.  When it does, please attach the error_log file here.

I'm hoping to see something like:

E [15/Jun/2009:15:26:31 -0400] [Job 167] Unable to convert file 1 to printable format!
E [15/Jun/2009:15:26:31 -0400] [Job 167] Required: application/vnd.cups-raw -> printer/OptraR
E [15/Jun/2009:15:26:31 -0400] [Job 167] Available: 
application/octet-stream -> application/vnd.cups-raw (-)
E [15/Jun/2009:15:26:31 -0400] [Job 167] Available: application/pdf -> application/vnd.cups-postscript (pdftops)
...

listing all of the available filters and the MIME type transformations they perform.
Comment 22 Tim Waugh 2009-06-23 13:19:40 EDT
Oh, I meant to hold off posting that until the package had actually built... I'll add a link here when it's finished.
Comment 23 Tim Waugh 2009-06-23 13:23:37 EDT
Here it is:
http://koji.fedoraproject.org/koji/buildinfo?buildID=111507
Comment 24 Matt Castelein 2009-06-23 14:15:56 EDT
Created attachment 349125 [details]
CUPS error log

Well the canceled jobs disappeared from the active page.. I couldn't get the same error (yet) but I did get errors.  I'm not sure what to make of them.
Comment 25 Matt Castelein 2009-06-23 14:31:27 EDT
Well, that last job seems to have given the printer a hissy-fit.  I can't reset it, and it's not even responding to ping anymore.  I'll have to continue this testing later as I am not anywhere near the printer to physically reset it.
Comment 26 Tim Waugh 2009-07-01 09:17:55 EDT
Those errors are from the printer not accepting the job correctly.

Have you seen the "Unable to convert file" error message again since using the package from comment #23?
Comment 27 Matt Castelein 2009-07-01 09:25:22 EDT
No, I have not.
Comment 28 Tim Waugh 2009-07-01 10:32:00 EDT
OK.  Well, nothing in particular has changed that would prevent the problem, so presumably it's still there.  I'll leave this in NEEDINFO until you spot it again.  Thanks for helping to track this down!
Comment 29 Tim Waugh 2009-08-10 07:50:05 EDT
I actually saw this myself today but was unable to reproduce it.

Note You need to log in before you can comment on or make changes to this bug.