Description of problem: The CUPS usb backend does not detect a paper-out condition. Version-Release number of selected component (if applicable): cups-1.2.10-6.fc7 Expected results: Should add media-tray-empty-error to printer-state-reasons.
Okay, the problem is actually in the kernel usblp driver. It should give an ENOSPC error when the printer reports paper-out. I have a patch that does this, but it doesn't work correctly yet. I will take another look when I get some time.
Created attachment 153022 [details] linux-2.6-usb-paperout.patch For reference, here is my broken patch. I've sent an email to the usblp maintainer to ask their opinion.
You should've sent it to linux-usb-devel.net. Vojtech is far too busy running things at SuSE, he's the director of SuSE Labs now. Anyway, what doesn't work? The patch looks sane to me.
I get corrupted output. I can't tell whether it is due to missing data or repeated data. In my testing, the file description is *not* in O_NONBLOCK mode, and writes in 8192-byte chunks using select() first to check for writability. The write calls all return 8192 except the one that returns -ENOSPC, which is retried (by which time I have replaced the paper) and which returns 8192. I tried to copy the code from the signal handling bits.. which makes me wonder how safe *that* code is.
What does happen on an unpatched kernel? If you replace the paper, does the corruption happen? To the best of what I can discern, your patch is correct, but the the driver is not processing urb->actual_length on writes. If the stream gets corrupted on an out-of-paper condition without your patch, we should fix that, then apply your patch and it should work. The alternative is that the printer consumes the whole block and keeps it across the paper change. Then we have to ensure that cups' backend does not retry this block and sends the next one. Could you capture a usbmon trace (running strace simultaneously is optional)? There's a howto here: /usr/share/doc/kernel-doc-2.6.20/Documentation/usb/usbmon.txt I think that would help to see what printer really says.
d681cd24 1259541414 S Bi:016:01 -115 8192 < d681ca68 1259541726 S Bo:016:02 -115 8192 = ef7f8468 4e11867c 5d1584fc 0084795b 158479db 9584ad01 846c5193 84785a95 d681cd24 1259542055 C Bi:016:01 0 0 d681ca68 1259704947 C Bo:016:02 0 8192 > [successful transfer] d681cd24 1259705361 S Bi:016:01 -115 8192 < d681ca68 1259705683 S Bo:016:02 -115 8192 = 30383559 1b2a6235 569fffff 39001b2a 6230571b 2a703130 3836591b 2a623556 d681cd24 1259706013 C Bi:016:01 0 0 daf9a694 1264705526 S Ci:016:00 s a1 01 0000 0000 0001 1 < daf9a694 1264711593 C Ci:016:00 0 1 = 18 daf9a694 1269711526 S Ci:016:00 s a1 01 0000 0000 0001 1 < daf9a694 1269717272 C Ci:016:00 0 1 = 18 daf9a694 1274716524 S Ci:016:00 s a1 01 0000 0000 0001 1 < daf9a694 1274717953 C Ci:016:00 0 1 = 38 [transfer in progress, paper out] daf9a694 1279719521 S Ci:016:00 s a1 01 0000 0000 0001 1 < daf9a694 1279723629 C Ci:016:00 0 1 = 38 da7de34c 1284723510 S Ci:016:00 s a1 01 0000 0000 0001 1 < da7de34c 1284729311 C Ci:016:00 0 1 = 38 da7de34c 1289728515 S Ci:016:00 s a1 01 0000 0000 0001 1 < da7de34c 1289740987 C Ci:016:00 0 1 = 38 da7de34c 1294741513 S Ci:016:00 s a1 01 0000 0000 0001 1 < da7de34c 1294746670 C Ci:016:00 0 1 = 38 daf9a1a8 1299747404 S Ci:016:00 s a1 01 0000 0000 0001 1 < daf9a1a8 1299789341 C Ci:016:00 0 1 = 38 daf9a1a8 1304788520 S Ci:016:00 s a1 01 0000 0000 0001 1 < daf9a1a8 1304793018 C Ci:016:00 0 1 = 18 [paper ready] d681ca68 1308574095 C Bo:016:02 0 8192 > [transfer completes] I'll test out this patch next: --- linux-2.6.20.noarch/drivers/usb/class/usblp.c.usb-paperout 2007-04-18 18:26:42.000000000 +0100 +++ linux-2.6.20.noarch/drivers/usb/class/usblp.c 2007-04-24 12:26:55.000000000 +0100 @@ -698,6 +698,11 @@ err = usblp_check_status(usblp, err); mutex_unlock (&usblp->mut); + if (!usblp->wcomplete && err == 1) { /* Paper out */ + usb_kill_urb(usblp->writeurb); + return writecount ? writecount : -ENOSPC; + } + /* if the fault was due to disconnect, let khubd's * call to usblp_disconnect() grab usblp->mut ... */ because at that point: writecount is zero transfer_length is 8192 the URB is in transit Does that look sufficient to you, or does it need some other error handling?
Actually after some testing this patch seems to be behaving well: --- linux-2.6.20.noarch/drivers/usb/class/usblp.c.usb-paperout 2007-04-24 12:39:19.000000000 +0100 +++ linux-2.6.20.noarch/drivers/usb/class/usblp.c 2007-04-24 22:09:37.000000000 +0100 @@ -301,7 +301,7 @@ return; if (unlikely(!usblp->present)) goto unplug; - if (unlikely(urb->status)) + if (unlikely(urb->status && urb->status != -ENOENT)) warn("usblp%d: nonzero read/write bulk status received: %d", usblp->minor, urb->status); usblp->wcomplete = 1; @@ -698,6 +698,13 @@ err = usblp_check_status(usblp, err); mutex_unlock (&usblp->mut); + if (!usblp->wcomplete && err == 1) { /* Paper out */ + usb_kill_urb(usblp->writeurb); + writecount += usblp->writeurb->actual_length; + usblp->writeurb->status = 0; + return writecount ? writecount : -ENOSPC; + } + /* if the fault was due to disconnect, let khubd's * call to usblp_disconnect() grab usblp->mut ... */ I will continue testing it tomorrow.
Monkeying with URB's status is unacceptable. In fact killing the URB is a very dubious technique as well. For one thing, you never know how much was actually sent to the device if you do that. I'm quite concerned that you seem to be encoding pecularities of your printer carelessly into this, just a bit too busy to give you proper critique right now.
The reason I have to monkey with the status at that point is because usblp.c checks writeurb->status to see if the request is complete for some reason. I don't know why it doesn't check usblp->wcomplete -- if it did that, I wouldn't need this line: usblp->writeurb->status = 0; After usb_kill_urb() returns, writeurb->status will be -ENOENT and usblp->wcomplete will be 1. Why does killing the URB mean we don't know how much was transferred? Isn't that what writeurb->actual_length tells us? That's what the usb_unlink_urb() documentation seems to be saying anyway, and isn't usb_kill_urb() the same but with a wait-until-completion? What peculiarities of my printer are being encoded in? Perhaps I explained my reasoning badly in comment #6 -- but I assure you I am doing my best to make this work on all USB printers with minimal changes. An approach that avoids the need to cancel URBs: + if (!usblp->wcomplete && err == 1) { /* Paper out */ + writecount += transfer_length; + return writecount ? writecount : -ENOSPC; + } + is no good because a proper select() loop won't ever see the file descriptor as writable (the URB is in flight), and so user-space will never see the ENOSPC error. So I think we really do need to cancel the URB; the question is how to do that robustly. Should I continue this on linux-usb-devel?
Controller is not obligated to update the actual length in case of an unlink. It might not even know that something is amiss. This probably works for you only because UHCI updates it every frame. Taking it to linux-usb-devel may be a good idea. Although I suspect it's just going to evoke curt responses from Oliver. If you can wait a week or so, I'll get to it.
I suppose if usblp could set POLLERR for poll() when the printer is out of paper, that might be sufficient. Wish I knew how this was *supposed* to work..
Setting POLLERR would not be sufficient unless write(2) finishes. If the same context continues to loop there waiting for the URB to continue, it will never reach select(2). Also, once there, the routine which parses the result of select() may get confused if unusual bits are set and write bit is not set. So we have to continue to set POLLOUT if we want the application to get back to write() and receive its ENOSPC.
Created attachment 155659 [details] Patch "X3"
Tim, how about the X3? It takes your suggestion in comment #9 and makes the select to report writeable. Please review and test it for me. For some odd reason, equipment that I have only reports 0x18 and not 0x38 like it should, so I was unable to test this at all. The patch is very crude: - Next write will wait 20s before it reaches usblp_check_status. - This behaviour must not be default and enabled with ioctl LPABORT instead. Oddly, my back-end does not issue any LP ioctls at all. - Ditt LPABORTOPEN So, this is for testing only.
Created attachment 156865 [details] usb-paperout-fix.patch I had a chance to test this today. The X3 patch did not work as expected, but once I added this change (usb-paperout-fix.patch) on top, it worked fine.
Ah yes, of course. Sorry about that. I somehow imagined that the bits were set when the condition evaluated to true, but in reality '0' follows '?'.
Created attachment 157670 [details] Patch "X4" - may be ready now
I implemented a patch which adds ioctl LPABORT and thus is safe to push out both to RHEL 5 and upstream. Tim, I need you to look at this, and tell me if you can coax CUPS to call this ioctl. For some odd reason, my back-end does NOT attempt to call it. Maybe it only does this for non-USB lp.c. But this is silly. IMHO, it should try issuing LPABORT, then proceed as before if it ends with ENOTTY (unfortunately EINVAL, while not a sensible error, could be returned by some silly drivers, so that may be worth ignoring as well). The LP_ABORT condition is made non-persistent for now, because it seems very, very odd (and unsafe) how it was done in lp.c. Tim, please let me know if CUPS prefers it the old way. I speculated that it can't care because it has to issue an ioctl anyway just to know if it's supported.
Created attachment 157906 [details] cups-usb-paperout.patch Yes, this works fine. Here is the CUPS patch I used to call the ioctl() in the right place (just for reference).
Tim, thanks for the confirmation. Unfortunately, something lost the notification, so I only found this today, sorry. Do you want this in RHEL 5 as well, or would RHEL 6 do?
Just RHEL-6 is fine for me. Thanks.
Pete, I believe I saw that this has made it into the upstream kernel, can this be closed now then? (or atleast be removed from F7Target)
Probably better removed from F7 because a) we missed that and b) CUPS needs a patchlet to use it anyway (although perhaps URI type lp:// could be possible as a workaround...). We'll close when everything tests out and not just being theoretically available in some upstream tree.
AFAIK most of the cups changes are in rawhide now, Tim? Anyways, moving to F8Target
cups-1.3-0.b1.5 is the one where the proper patch went in.
Any word on when we might see this in our kernel package?
Patch went in kernel 2.6.24-rc1 and is in rawhide/F9 now.
Based on the date this bug was created, it appears to have been reported against rawhide during the development of a Fedora release that is no longer maintained. In order to refocus our efforts as a project we are flagging all of the open bugs for releases which are no longer maintained. If this bug remains in NEEDINFO thirty (30) days from now, we will automatically close it. If you can reproduce this bug in a maintained Fedora version (7, 8, or rawhide), please change this bug to the respective version and change the status to ASSIGNED. (If you're unable to change the bug's version or status, add a comment to the bug and someone will change it for you.) Thanks for your help, and we apologize again that we haven't handled these issues to this point. The process we're following is outlined here: http://fedoraproject.org/wiki/BugZappers/F9CleanUp We will be following the process here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this doesn't happen again.
I cannot belive 11 months has passed since Tim filed this. No wonder auto-closer is getting restive. :-) Tim, does it work for you on F8? It ships with 2.6.24.4-64.fc8 in updates, so it should work after a run of yum. Please close if it does.
Yes, 2.6.24.4-64.fc8 works fine. Yay!