Bug 486092 - httpd Sendfile troubles reading from a CIFS share
Summary: httpd Sendfile troubles reading from a CIFS share
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.2
Hardware: All
OS: Linux
low
medium
Target Milestone: rc
: ---
Assignee: Jeff Layton
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2009-02-18 11:14 UTC by Paolo Penzo
Modified: 2014-06-18 07:38 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-03-30 07:43:07 UTC
Target Upstream Version:
Embargoed:


Attachments
attempted reproducer (1.28 KB, text/plain)
2009-06-22 14:54 UTC, Jeff Layton
patch -- fix sb->s_maxbytes so that it casts properly to a signed value (2.11 KB, patch)
2009-07-21 23:40 UTC, Jeff Layton
updated reproducer (1.45 KB, text/plain)
2009-10-20 13:36 UTC, Jeff Layton
patch -- cifs/libfs: fix sb->s_maxbytes so that it casts properly to a signed value (2.59 KB, patch)
2009-10-30 12:18 UTC, Jeff Layton


Links
System: Red Hat Product Errata
ID: RHSA-2010:0178
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Red Hat Enterprise Linux 5.5 kernel security and bug fix update
Last Updated: 2010-03-29 12:18:21 UTC

Description Paolo Penzo 2009-02-18 11:14:56 UTC
The same issue as rhbz#403531 arises on RHEL 5.2 with:
kernel-2.6.18-92.el5
samba-3.0.28-1.el5_2.1
httpd-2.2.3-11.el5_1.3


Setting EnableSendfile Off still acts as a workaround.

Comment 1 Joe Orton 2009-03-06 10:44:05 UTC
Have you tested this with the RHEL 5.3 kernel?

If sendfile() is failing it's almost certainly a kernel bug.

Comment 2 Jeff Layton 2009-06-18 14:49:15 UTC
I'm pretty sure this is a known bug. I think it's also broken upstream but it would be nice to confirm whether it's a problem on recent kernels.

Comment 3 Paolo Penzo 2009-06-18 16:13:10 UTC
It's still there with kernel 2.6.18-128.1.6.el5

Comment 4 Jeff Layton 2009-06-22 14:54:39 UTC
Created attachment 348923
attempted reproducer

How exactly are you testing this? I need some idea of how to reproduce it so I can track down the problem.

This is a reproducer I wrote for the Fedora bug report to try to reproduce this. It works fine on CIFS AFAICT, but it uses local unix sockets so maybe that's a factor. You may want to download and build it and see if it works on your setup.

Basically run it with a file sitting on a CIFS share as an argument:

$ sendfiletest /file/on/cifs.txt

...it should output the size of the file as seen by the receiving end (for bonus points, make it do a checksum of the received file).

If that also works for you, then we'll need to come up with a reproducer that demonstrates the problem. I suppose we can change the reproducer so that it does a TCP connection and use netcat to send it to a new file and compare contents.
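The attachment itself is not reproduced in this report. As a rough illustration of the test described above, here is a minimal Python sketch that pushes a file through sendfile() into one end of a local Unix socket pair and reports how many bytes arrive at the other end. The function name `received_size` is hypothetical and the code is a stand-in under those assumptions, not the attached reproducer.

```python
import os
import socket
import threading


def received_size(path):
    """sendfile() `path` into a Unix socket pair; return the bytes seen by the reader."""
    tx, rx = socket.socketpair()
    total = 0

    def drain():
        # Count everything that arrives until the sending side is closed.
        nonlocal total
        while True:
            chunk = rx.recv(65536)
            if not chunk:
                break
            total += len(chunk)

    reader = threading.Thread(target=drain)
    reader.start()
    try:
        with open(path, "rb") as src:
            size = os.fstat(src.fileno()).st_size
            offset = 0
            while offset < size:
                sent = os.sendfile(tx.fileno(), src.fileno(), offset, size - offset)
                if sent == 0:
                    break
                offset += sent
    finally:
        tx.close()  # signals EOF to the reader thread
        reader.join()
        rx.close()
    return total
```

Calling `received_size("/file/on/cifs.txt")` and comparing the result with the size reported by `ls -l` gives the same kind of check; an OSError raised by sendfile() propagates to the caller.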

If the reproducer also fails for you, then please test the kernels on my people.redhat.com page and see if they make any difference:

http://people.redhat.com/jlayton

...they have updated CIFS code that's slated for 5.4 (plus some other patches I'm looking at for 5.5). I suspect that they'll still have the problem too, but it would be good to confirm.

Comment 5 Paolo Penzo 2009-06-23 14:40:51 UTC
The reproducer works fine:
the reported size equals the size of the CIFS file.
The kernel version is still 2.6.18-128.1.6.el5.

Comment 6 Jeff Layton 2009-06-23 14:44:52 UTC
By works fine, I assume you mean that the reported size is correct. If so, that means that the reproducer is just no good.

I'll need to get some info on how you're reproducing this with apache. Can you provide details?

Comment 7 Paolo Penzo 2009-06-23 14:57:53 UTC
Yes, the reproducer reported the same size as the ls command.

Apache is serving part of a web site by reading files (autocad drawings) from a remote linux server (with security=ADS).
The remote CIFS share is connected by autofs using these options:
-fstype=cifs,username=XXXXXXX,workgroup=YYYYYYYY,password=ZZZZZZZZ,uid=apache,gid=tomcat,file_mode=0664,dir_mode=0775,port=139

Comment 8 Jeff Layton 2009-06-23 15:07:12 UTC
The mount options are nice, but I'm more concerned about how you tell that the problem has been reproduced. What goes wrong when you try to download these files?

Comment 9 Paolo Penzo 2009-06-25 15:39:09 UTC
When using sendfile, the size of the received file is smaller than the original file.

Comment 10 Jeff Layton 2009-07-21 20:28:12 UTC
Wrote a new reproducer that connects to an IPv4 socket (which I connected to netcat) and then sends the file there. The received file seems to be the right size when I send a file on CIFS, but the sendfile call returns an error:

sendfile(3, 4, NULL, 15886)             = -1 EOVERFLOW (Value too large for defined data type)

While I haven't looked at how apache uses sendfile, if it was calling it in a loop of chunks smaller than the file size then it may give up when it gets an error. Either way, sendfile shouldn't return that error, so something is wrong here.
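To make that hypothesis concrete, here is a small Python sketch of a caller that sends a file in chunks and gives up at the first sendfile() error. It is an assumption about the general shape of such a loop, not httpd's actual code; `send_in_chunks` and its injectable `sendfile` parameter are hypothetical names, the latter added only so the failure path can be exercised without a CIFS mount.

```python
import os


def send_in_chunks(out_fd, in_fd, size, chunk=65536, sendfile=os.sendfile):
    """Send `size` bytes in `chunk`-sized sendfile() calls; stop at the first error.

    Returns the offset reached, i.e. how many bytes the caller believes it sent.
    """
    offset = 0
    while offset < size:
        try:
            sent = sendfile(out_fd, in_fd, offset, min(chunk, size - offset))
        except OSError:
            # A caller that treats any error as fatal abandons the transfer here,
            # so the peer ends up with fewer bytes than the file contains.
            break
        if sent == 0:
            break
        offset += sent
    return offset
```

With a sendfile() that fails with EOVERFLOW on every call, a loop like this never advances past the first chunk, which would be consistent with the truncated downloads reported earlier.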

Comment 11 Jeff Layton 2009-07-21 23:40:29 UTC
Created attachment 354613
patch -- fix sb->s_maxbytes so that it casts properly to a signed value

I think this patch will fix the problem. It fixes the reproducer I have so that sendfile no longer returns an EOVERFLOW error.
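The patch title points at a signedness problem: a limit stored as an unsigned 64-bit value can turn negative when it is cast to a signed 64-bit type. The Python sketch below only illustrates that effect. The constants `TOO_BIG` and `FITS` and the comparison in `offset_exceeds_limit` are assumptions chosen for the example; they are not taken from the patch or from the kernel source.

```python
import struct


def as_signed_64(value):
    """Reinterpret an unsigned 64-bit value as a signed 64-bit value."""
    return struct.unpack("<q", struct.pack("<Q", value))[0]


# Hypothetical limits, for illustration only.
TOO_BIG = 1 << 63        # top bit set, so the signed view is negative
FITS = (1 << 63) - 1     # largest value that survives the cast unchanged


def offset_exceeds_limit(pos, max_bytes):
    """Model a signed comparison of a file position against a size limit."""
    return pos > as_signed_64(max_bytes)
```

Under these assumptions, `offset_exceeds_limit(15886, TOO_BIG)` is true even though 15886 is a tiny offset, while `offset_exceeds_limit(15886, FITS)` is false. A check that wrongly trips like this could plausibly surface as a spurious EOVERFLOW.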

Comment 12 Jeff Layton 2009-07-21 23:44:37 UTC
I'll see about building some test kernels soon with this patch, but if you're able to test it in the meantime and report back it would be helpful. I've also gone ahead and pushed this upstream too.

Comment 13 Joe Orton 2009-07-22 07:33:56 UTC
> While I haven't looked at how apache uses sendfile, if it was calling it in a
> loop of chunks smaller than the file size then it may give up when it gets an
> error.

Yes, httpd uses a non-blocking socket and will end up popping in and out of sendfile() each time the TCP socket blocks.

Thanks a lot for tracking this down, Jeff.  This has plagued people upstream for a long time, too.

Comment 14 Jeff Layton 2009-07-22 19:30:06 UTC
Kernels with this patch (and another one for a similar fix for get_sb_pseudo) are on my people.redhat.com page:

http://people.redhat.com/jlayton/

Please test these and report back as to whether they fix the problem for you.

Comment 15 RHEL Program Management 2009-10-09 20:19:34 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 17 Evan McNabb 2009-10-16 15:36:47 UTC
Paolo,

Have you been able to try the test kernel in comment #14? Thanks!

Comment 19 Paolo Penzo 2009-10-20 06:39:49 UTC
With kernel 2.6.18-169.el5.jtltest.90 it works fine.
I apologize for the delay of my answer.

Comment 20 Jeff Layton 2009-10-20 13:36:27 UTC
Created attachment 365350
updated reproducer

I think this program should serve as a reproducer for this. It should exit with success if the sendfile call succeeds, and with failure if it fails.
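The updated attachment is not shown in this report either. The following Python sketch approximates the behaviour just described, using a throwaway local TCP listener in place of netcat: it returns 0 when every sendfile() call succeeds and 1 when one fails. The name `send_over_tcp` and the in-process listener are assumptions made for the example, not details of the attached program.

```python
import os
import socket
import sys
import threading


def send_over_tcp(path):
    """Send `path` to a local TCP listener via sendfile(); return 0 on success, 1 on error."""
    listener = socket.create_server(("127.0.0.1", 0))  # port 0: kernel picks a free port
    listener.settimeout(5)
    port = listener.getsockname()[1]

    def drain():
        # Accept one connection and discard whatever is sent.
        try:
            conn, _ = listener.accept()
        except OSError:
            return
        with conn:
            while conn.recv(65536):
                pass

    reader = threading.Thread(target=drain)
    reader.start()
    status = 0
    try:
        with open(path, "rb") as src, \
                socket.create_connection(("127.0.0.1", port)) as out:
            size = os.fstat(src.fileno()).st_size
            offset = 0
            while offset < size:
                sent = os.sendfile(out.fileno(), src.fileno(), offset, size - offset)
                if sent == 0:
                    break
                offset += sent
    except OSError as err:
        print(f"sendfile failed: {err}", file=sys.stderr)
        status = 1
    finally:
        reader.join()
        listener.close()
    return status
```

Passing the return value to `sys.exit()` would give a shell-visible result, for example when the script is run against a file on a CIFS mount.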

I haven't tested this on unpatched CIFS yet, so if it doesn't seem to work let me know and I'll have a closer look.

Comment 22 Jeff Layton 2009-10-30 12:18:06 UTC
Created attachment 366791
patch -- cifs/libfs: fix sb->s_maxbytes so that it casts properly to a signed value

This is the patch that was proposed. I added the patch to get_sb_pseudo as well.

Comment 23 Don Zickus 2009-11-03 22:52:21 UTC
in kernel-2.6.18-172.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Please do NOT transition this bugzilla state to VERIFIED until our QE team
has sent specific instructions indicating when to do so.  However feel free
to provide a comment indicating that this fix has been verified.

Comment 26 errata-xmlrpc 2010-03-30 07:43:07 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0178.html

