Bug 236736 - No paper-out notification for USB devices
Summary: No paper-out notification for USB devices
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: All
OS: Linux
medium
medium
Target Milestone: ---
Assignee: Pete Zaitcev
QA Contact: Brian Brock
URL:
Whiteboard: bzcl34nup
Depends On:
Blocks: F8Target
TreeView+ depends on / blocked
 
Reported: 2007-04-17 12:53 UTC by Tim Waugh
Modified: 2008-08-02 23:40 UTC (History)
6 users (show)

Fixed In Version: 2.6.24.4-64.fc8
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-04-04 09:52:40 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)
linux-2.6-usb-paperout.patch (482 bytes, patch)
2007-04-19 16:11 UTC, Tim Waugh
no flags Details | Diff
Patch "X3" (1.99 KB, patch)
2007-05-30 05:44 UTC, Pete Zaitcev
no flags Details | Diff
usb-paperout-fix.patch (480 bytes, patch)
2007-06-13 12:21 UTC, Tim Waugh
no flags Details | Diff
Patch "X4" - may be ready now (4.29 KB, patch)
2007-06-23 00:22 UTC, Pete Zaitcev
no flags Details | Diff
cups-usb-paperout.patch (1.15 KB, patch)
2007-06-26 14:44 UTC, Tim Waugh
no flags Details | Diff

Description Tim Waugh 2007-04-17 12:53:32 UTC
Description of problem:
The CUPS usb backend does not detect a paper-out condition.

Version-Release number of selected component (if applicable):
cups-1.2.10-6.fc7

Expected results:
Should add media-tray-empty-error to printer-state-reasons.

Comment 1 Tim Waugh 2007-04-19 10:20:49 UTC
Okay, the problem is actually in the kernel usblp driver.  It should give an
ENOSPC error when the printer reports paper-out.

I have a patch that does this, but it doesn't work correctly yet.  I will take
another look when I get some time.

Comment 2 Tim Waugh 2007-04-19 16:11:27 UTC
Created attachment 153022 [details]
linux-2.6-usb-paperout.patch

For reference, here is my broken patch.  I've sent an email to the usblp
maintainer to ask their opinion.

Comment 3 Pete Zaitcev 2007-04-21 03:37:31 UTC
You should've sent it to linux-usb-devel.net.
Vojtech is far too busy running things at SuSE, he's the director
of SuSE Labs now.

Anyway, what doesn't work? The patch looks sane to me.

Comment 4 Tim Waugh 2007-04-23 14:02:18 UTC
I get corrupted output.  I can't tell whether it is due to missing data or
repeated data.

In my testing, the file description is *not* in O_NONBLOCK mode, and writes in
8192-byte chunks using select() first to check for writability.  The write calls
all return 8192 except the one that returns -ENOSPC, which is retried (by which
time I have replaced the paper) and which returns 8192.

I tried to copy the code from the signal handling bits.. which makes me wonder
how safe *that* code is.

Comment 5 Pete Zaitcev 2007-04-23 17:55:37 UTC
What does happen on an unpatched kernel? If you replace the paper,
does the corruption happen?

To the best of what I can discern, your patch is correct, but the
the driver is not processing urb->actual_length on writes. If the
stream gets corrupted on an out-of-paper condition without your
patch, we should fix that, then apply your patch and it should work.

The alternative is that the printer consumes the whole block and
keeps it across the paper change. Then we have to ensure that cups'
backend does not retry this block and sends the next one.

Could you capture a usbmon trace (running strace simultaneously is
optional)? There's a howto here:
/usr/share/doc/kernel-doc-2.6.20/Documentation/usb/usbmon.txt
I think that would help to see what printer really says.


Comment 6 Tim Waugh 2007-04-24 11:55:41 UTC
d681cd24 1259541414 S Bi:016:01 -115 8192 <
d681ca68 1259541726 S Bo:016:02 -115 8192 = ef7f8468 4e11867c 5d1584fc 0084795b 
158479db 9584ad01 846c5193 84785a95
d681cd24 1259542055 C Bi:016:01 0 0
d681ca68 1259704947 C Bo:016:02 0 8192 >
  [successful transfer]
d681cd24 1259705361 S Bi:016:01 -115 8192 <
d681ca68 1259705683 S Bo:016:02 -115 8192 = 30383559 1b2a6235 569fffff 39001b2a 
6230571b 2a703130 3836591b 2a623556
d681cd24 1259706013 C Bi:016:01 0 0
daf9a694 1264705526 S Ci:016:00 s a1 01 0000 0000 0001 1 <
daf9a694 1264711593 C Ci:016:00 0 1 = 18
daf9a694 1269711526 S Ci:016:00 s a1 01 0000 0000 0001 1 <
daf9a694 1269717272 C Ci:016:00 0 1 = 18
daf9a694 1274716524 S Ci:016:00 s a1 01 0000 0000 0001 1 <
daf9a694 1274717953 C Ci:016:00 0 1 = 38
  [transfer in progress, paper out]
daf9a694 1279719521 S Ci:016:00 s a1 01 0000 0000 0001 1 <
daf9a694 1279723629 C Ci:016:00 0 1 = 38
da7de34c 1284723510 S Ci:016:00 s a1 01 0000 0000 0001 1 <
da7de34c 1284729311 C Ci:016:00 0 1 = 38
da7de34c 1289728515 S Ci:016:00 s a1 01 0000 0000 0001 1 <
da7de34c 1289740987 C Ci:016:00 0 1 = 38
da7de34c 1294741513 S Ci:016:00 s a1 01 0000 0000 0001 1 <
da7de34c 1294746670 C Ci:016:00 0 1 = 38
daf9a1a8 1299747404 S Ci:016:00 s a1 01 0000 0000 0001 1 <
daf9a1a8 1299789341 C Ci:016:00 0 1 = 38
daf9a1a8 1304788520 S Ci:016:00 s a1 01 0000 0000 0001 1 <
daf9a1a8 1304793018 C Ci:016:00 0 1 = 18
  [paper ready]
d681ca68 1308574095 C Bo:016:02 0 8192 >
  [transfer completes]

I'll test out this patch next:

--- linux-2.6.20.noarch/drivers/usb/class/usblp.c.usb-paperout  2007-04-18
18:26:42.000000000 +0100
+++ linux-2.6.20.noarch/drivers/usb/class/usblp.c       2007-04-24
12:26:55.000000000 +0100
@@ -698,6 +698,11 @@
                                err = usblp_check_status(usblp, err);
                        mutex_unlock (&usblp->mut);
 
+                       if (!usblp->wcomplete && err == 1) { /* Paper out */
+                               usb_kill_urb(usblp->writeurb);
+                               return writecount ? writecount : -ENOSPC;
+                       }
+
                        /* if the fault was due to disconnect, let khubd's
                         * call to usblp_disconnect() grab usblp->mut ...
                         */

because at that point:
  writecount is zero
  transfer_length is 8192
  the URB is in transit

Does that look sufficient to you, or does it need some other error handling?

Comment 7 Tim Waugh 2007-04-24 21:48:24 UTC
Actually after some testing this patch seems to be behaving well:

--- linux-2.6.20.noarch/drivers/usb/class/usblp.c.usb-paperout  2007-04-24
12:39:19.000000000 +0100
+++ linux-2.6.20.noarch/drivers/usb/class/usblp.c       2007-04-24
22:09:37.000000000 +0100
@@ -301,7 +301,7 @@
                return;
        if (unlikely(!usblp->present))
                goto unplug;
-       if (unlikely(urb->status))
+       if (unlikely(urb->status && urb->status != -ENOENT))
                warn("usblp%d: nonzero read/write bulk status received: %d",
                        usblp->minor, urb->status);
        usblp->wcomplete = 1;
@@ -698,6 +698,13 @@
                                err = usblp_check_status(usblp, err);
                        mutex_unlock (&usblp->mut);
 
+                       if (!usblp->wcomplete && err == 1) { /* Paper out */
+                               usb_kill_urb(usblp->writeurb);
+                               writecount += usblp->writeurb->actual_length;
+                               usblp->writeurb->status = 0;
+                               return writecount ? writecount : -ENOSPC;
+                       }
+
                        /* if the fault was due to disconnect, let khubd's
                         * call to usblp_disconnect() grab usblp->mut ...
                         */

I will continue testing it tomorrow.

Comment 8 Pete Zaitcev 2007-04-24 23:13:09 UTC
Monkeying with URB's status is unacceptable. In fact killing the URB is
a very dubious technique as well. For one thing, you never know how much
was actually sent to the device if you do that.

I'm quite concerned that you seem to be encoding pecularities of your
printer carelessly into this, just a bit too busy to give you proper
critique right now.


Comment 9 Tim Waugh 2007-04-25 13:23:35 UTC
The reason I have to monkey with the status at that point is because usblp.c
checks writeurb->status to see if the request is complete for some reason.  I
don't know why it doesn't check usblp->wcomplete -- if it did that, I wouldn't
need this line:    usblp->writeurb->status = 0;

After usb_kill_urb() returns, writeurb->status will be -ENOENT and
usblp->wcomplete will be 1.

Why does killing the URB mean we don't know how much was transferred?  Isn't
that what writeurb->actual_length tells us?  That's what the usb_unlink_urb()
documentation seems to be saying anyway, and isn't usb_kill_urb() the same but
with a wait-until-completion?

What peculiarities of my printer are being encoded in?  Perhaps I explained my
reasoning badly in comment #6 -- but I assure you I am doing my best to make
this work on all USB printers with minimal changes.

An approach that avoids the need to cancel URBs:

+                       if (!usblp->wcomplete && err == 1) { /* Paper out */
+                               writecount += transfer_length;
+                               return writecount ? writecount : -ENOSPC;
+                       }
+

is no good because a proper select() loop won't ever see the file descriptor as
writable (the URB is in flight), and so user-space will never see the ENOSPC error.

So I think we really do need to cancel the URB; the question is how to do that
robustly.

Should I continue this on linux-usb-devel?

Comment 10 Pete Zaitcev 2007-04-25 16:58:51 UTC
Controller is not obligated to update the actual length in case of an unlink.
It might not even know that something is amiss. This probably works for you
only because UHCI updates it every frame.

Taking it to linux-usb-devel may be a good idea. Although I suspect it's just
going to evoke curt responses from Oliver. If you can wait a week or so, I'll
get to it.


Comment 11 Tim Waugh 2007-05-09 12:57:24 UTC
I suppose if usblp could set POLLERR for poll() when the printer is out of
paper, that might be sufficient.

Wish I knew how this was *supposed* to work..

Comment 12 Pete Zaitcev 2007-05-30 05:34:04 UTC
Setting POLLERR would not be sufficient unless write(2) finishes. If the
same context continues to loop there waiting for the URB to continue, it
will never reach select(2). Also, once there, the routine which parses
the result of select() may get confused if unusual bits are set and write
bit is not set. So we have to continue to set POLLOUT if we want the
application to get back to write() and receive its ENOSPC.

Comment 13 Pete Zaitcev 2007-05-30 05:44:39 UTC
Created attachment 155659 [details]
Patch "X3"

Comment 14 Pete Zaitcev 2007-05-30 05:54:12 UTC
Tim, how about the X3? It takes your suggestion in comment #9 and makes
the select to report writeable. Please review and test it for me.
For some odd reason, equipment that I have only reports 0x18 and not
0x38 like it should, so I was unable to test this at all.

The patch is very crude:
 - Next write will wait 20s before it reaches usblp_check_status.
 - This behaviour must not be default and enabled with ioctl LPABORT
   instead. Oddly, my back-end does not issue any LP ioctls at all.
 - Ditt LPABORTOPEN
So, this is for testing only.


Comment 15 Tim Waugh 2007-06-13 12:21:48 UTC
Created attachment 156865 [details]
usb-paperout-fix.patch

I had a chance to test this today.  The X3 patch did not work as expected, but
once I added this change (usb-paperout-fix.patch) on top, it worked fine.

Comment 16 Pete Zaitcev 2007-06-13 18:40:39 UTC
Ah yes, of course. Sorry about that. I somehow imagined that the bits were
set when the condition evaluated to true, but in reality '0' follows '?'.

Comment 17 Pete Zaitcev 2007-06-23 00:22:55 UTC
Created attachment 157670 [details]
Patch "X4" - may be ready now

Comment 18 Pete Zaitcev 2007-06-23 00:30:54 UTC
I implemented a patch which adds ioctl LPABORT and thus is safe to push out
both to RHEL 5 and upstream.

Tim, I need you to look at this, and tell me if you can coax CUPS to call
this ioctl. For some odd reason, my back-end does NOT attempt to call it.
Maybe it only does this for non-USB lp.c. But this is silly. IMHO, it should
try issuing LPABORT, then proceed as before if it ends with ENOTTY
(unfortunately EINVAL, while not a sensible error, could be returned
by some silly drivers, so that may be worth ignoring as well).

The LP_ABORT condition is made non-persistent for now, because it seems
very, very odd (and unsafe) how it was done in lp.c. Tim, please let me
know if CUPS prefers it the old way. I speculated that it can't care
because it has to issue an ioctl anyway just to know if it's supported.


Comment 19 Tim Waugh 2007-06-26 14:44:31 UTC
Created attachment 157906 [details]
cups-usb-paperout.patch

Yes, this works fine.  Here is the CUPS patch I used to call the ioctl() in the
right place (just for reference).

Comment 20 Pete Zaitcev 2007-07-11 03:58:29 UTC
Tim, thanks for the confirmation.

Unfortunately, something lost the notification, so I only found this today,
sorry.

Do you want this in RHEL 5 as well, or would RHEL 6 do?

Comment 21 Tim Waugh 2007-07-11 10:47:23 UTC
Just RHEL-6 is fine for me.  Thanks.

Comment 22 Hans de Goede 2007-07-29 13:43:13 UTC
Pete,

I believe I saw that this has made it into the upstream kernel, can this be
closed now then? (or atleast be removed from F7Target)


Comment 23 Pete Zaitcev 2007-07-30 00:45:05 UTC
Probably better removed from F7 because a) we missed that and b) CUPS needs
a patchlet to use it anyway (although perhaps URI type lp:// could be possible
as a workaround...). We'll close when everything tests out and not just being
theoretically available in some upstream tree.

Comment 24 Hans de Goede 2007-07-30 06:50:55 UTC
AFAIK most of the cups changes are in rawhide now, Tim?

Anyways, moving to F8Target


Comment 25 Tim Waugh 2007-07-30 09:14:27 UTC
cups-1.3-0.b1.5 is the one where the proper patch went in.

Comment 26 Tim Waugh 2007-11-20 15:46:52 UTC
Any word on when we might see this in our kernel package?

Comment 27 Chuck Ebbert 2007-11-20 18:08:01 UTC
Patch went in kernel 2.6.24-rc1 and is in rawhide/F9 now.


Comment 28 Bug Zapper 2008-04-04 00:07:27 UTC
Based on the date this bug was created, it appears to have been reported
against rawhide during the development of a Fedora release that is no
longer maintained. In order to refocus our efforts as a project we are
flagging all of the open bugs for releases which are no longer
maintained. If this bug remains in NEEDINFO thirty (30) days from now,
we will automatically close it.

If you can reproduce this bug in a maintained Fedora version (7, 8, or
rawhide), please change this bug to the respective version and change
the status to ASSIGNED. (If you're unable to change the bug's version
or status, add a comment to the bug and someone will change it for you.)

Thanks for your help, and we apologize again that we haven't handled
these issues to this point.

The process we're following is outlined here:
http://fedoraproject.org/wiki/BugZappers/F9CleanUp

We will be following the process here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this
doesn't happen again.

Comment 29 Pete Zaitcev 2008-04-04 00:49:31 UTC
I cannot belive 11 months has passed since Tim filed this. No wonder
auto-closer is getting restive. :-)

Tim, does it work for you on F8? It ships with 2.6.24.4-64.fc8
in updates, so it should work after a run of yum. Please close
if it does.

Comment 30 Tim Waugh 2008-04-04 09:52:40 UTC
Yes, 2.6.24.4-64.fc8 works fine.  Yay!


Note You need to log in before you can comment on or make changes to this bug.