Bug 417961
Description
Jeff Layton
2007-12-10 12:15:26 UTC
Created attachment 283091 [details]
tarball with backport of cifs-1.50 for older kernels
Tarball containing Steve French's backport of upstream cifs-1.50 code to older
kernels.
Created attachment 283101 [details]
patch 1 -- update RHEL5's cifs code to 1.50c
Created attachment 283111 [details]
patch 2 -- remove duplicate inc_nlink definition
Created attachment 283121 [details]
patch 3 -- don't include config.h
Created attachment 283131 [details]
patch 4 -- fix bad handling of EAGAIN error on kernel_recvmsg in cifs_demultiplex_thread
Created attachment 283141 [details]
patch 5 -- Fix spurious reconnect on 2nd peek from read of SMB length
Created attachment 283151 [details]
patch 6 -- fix oops on second mount to same server when null auth is used
Created attachment 283161 [details]
patch 7 -- log better errors on failed mounts
Created attachment 283171 [details]
patch 8 -- Fix buffer overflow if server sends corrupt response to small request
Created attachment 283181 [details]
patch 9 -- Fix memory leak in statfs to very old servers
Created attachment 283191 [details]
patch 10 -- Reduce chance of list corruption in find_writable_file
Created attachment 283201 [details]
patch 11 -- Fix cifsd so shuts down when signing fails during mount
Created attachment 283211 [details]
patch 12 -- fix error message about packet signing
Created attachment 283221 [details]
patch 13 -- when mount helper missing fix slash wrong direction in share
Created attachment 283231 [details]
patch 14 -- Fix potential data corruption when writing out cached dirty pages
Created attachment 283241 [details]
patch 15 -- Fix endian conversion problem in posix mkdir
Created attachment 283251 [details]
patch 16 -- update CHANGES file and version string
I'm building a set of test kernels with this patchset now and plan to spend significant time over the next few days running regression tests on it. Cursory hello-world style testing so far looks good...

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

*** Bug 373741 has been marked as a duplicate of this bug. ***
*** Bug 414151 has been marked as a duplicate of this bug. ***
*** Bug 287401 has been marked as a duplicate of this bug. ***
*** Bug 373001 has been marked as a duplicate of this bug. ***
*** Bug 329431 has been marked as a duplicate of this bug. ***

in 2.6.18-62.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Have run into other problems using directio in the cifs mount options, so this isn't going to help them work around the behavior they are seeing. There would be performance differences anyway, since with directio we would NOT be using the local pagecache to enhance client performance.

In any case, as Jeff mentioned, what they are seeing is really a different issue than what the case was originally opened on (and which a fix has been committed for), so Jeff will need a separate BZ to be filed for this different problem. This will also mean we need a separate Issue Tracker.

Thanks,
Vince

Internal Status set to 'Waiting on Support'
This event sent from IssueTracker by vincew issue 134794

Navid,

Actually we need to focus a little closer on what the firefox strace log is really telling us.
I'm going to quote some lines from the firefox strace file and cite line numbers for purpose of reference below:

lines 76700 - 76703:
    6527 open("/mnt/cifs/foo.dat", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0600) = 38
    6527 open("/tmp/ft3un2ku.bin", O_RDONLY|O_LARGEFILE) = 39
    6527 read(39, "\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0"..., 8192) = 8192
    6527 write(38, "\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0"..., 8192) = 8192

After doing some initial "setup", firefox open()'s /mnt/cifs/foo.dat (the DESTINATION file on the CIFS mount) for WRITING, and gets file descriptor 38 returned from the open() call. Firefox then open()'s /tmp/ft3un2ku.bin (the SOURCE file) for READING, and gets file descriptor 39 back for this open() call. With both files open, it begins reading from fd 39 (the source file) in 8K buffer sizes, and writing the buffer out to fd38 (the destination on CIFS), until it hits EOF on the source file:

lines 79517 - 79523:
    6527 read(39, "\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0"..., 8192) = 8192
    6527 write(38, "\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0\\0"..., 8192) = 8192
    6527 read(39, "", 8192) = 0
    6527 close(38 <unfinished ...>
    6531 <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
    6531 gettimeofday({1201511323, 406093}, NULL) = 0
    6531 futex(0x8b9f340, FUTEX_WAKE, 1) = 0

Above, we have finished reading the source file (EOF returned from the read() call on fd39), so there is nothing more to read and therefore nothing more to write to the destination file on fd38. So we call close() on fd 38 - the destination file on the CIFS mount. We then begin a series of futex/gettimeofday waits until our close() call on the destination file descriptor returns.
However, we do not get a normal return from the close of our destination file descriptor - we get a -1 ENOSPC error returned TO FIREFOX instead:

lines 79545 - 79549:
    6527 <... close resumed> ) = -1 ENOSPC (No space left on device)
    6527 close(39) = 0
    6527 lstat64("/tmp/ft3un2ku.bin", {st_mode=S_IFREG|0600, st_size=11534336, ...}) = 0
    6527 unlink("/tmp/ft3un2ku.bin") = 0
    6527 chmod("/mnt/cifs/foo.dat", 0644) = 0

Firefox fails to see (check for) an error return on the close and proceeds to close the source file's file descriptor (fd39). Note that a 0 return code from the close() on the SOURCE file is completely normal and expected behavior since all we needed to do was READ from this file. However, we DID get an error returned to Firefox when we tried to close() the destination file (the file we wanted to write to the CIFS mount) - but Firefox ignored the error and went on about its business as if the file was written successfully: it unlinked the source file and chmod'ed the destination.

So this is a Firefox problem, and a separate case should be opened against Firefox for failing to properly check for (and/or handle) errors returned from close() calls. Jeff suggests that if they are going to fix Firefox to check the return value on close(), they should also make it do an fsync() beforehand and check that return value as well.

Hopefully it is understood now that the problem here is Firefox, *NOT* CIFS. As such, there is not much more to be done here.

Thank you,
Vince

Internal Status set to 'Waiting on Support'
This event sent from IssueTracker by vincew issue 134794

Please refer them to the write(2) man page, NOTES section at the bottom, to answer their question. The behavior is by design and a direct result of using the system page cache: a successful write() only means the data was accepted into the cache, so an error such as ENOSPC may not be reported until the data is flushed. The man page also discusses using fsync() calls as a way to ensure data is actually being written. This will be something to perhaps mention in the ticket opened to request firefox be fixed to actually check for errors when close() is called.
--vince

This event sent from IssueTracker by vincew issue 134794

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0314.html

*** Bug 705693 has been marked as a duplicate of this bug. ***
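
For reference, the recommendation made earlier in this thread (check the return value of close() on the destination file, and call fsync() first and check that too) could look roughly like the sketch below. This is not Firefox's code and not part of the cifs patchset; the helper names write_all and copy_file_checked are hypothetical and exist only for illustration. It copies in 8K chunks like the strace shows, and treats a failure from write(), fsync() or close() as a failed copy.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: write all of buf, retrying short writes and EINTR. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Hypothetical helper: copy src to dst.
 * Returns 0 on success, or -1 with errno set to the first error seen.
 */
int copy_file_checked(const char *src, const char *dst)
{
    char buf[8192];
    int saved = 0;

    int in = open(src, O_RDONLY);
    if (in < 0)
        return -1;

    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (out < 0) {
        saved = errno;
        close(in);
        errno = saved;
        return -1;
    }

    for (;;) {
        ssize_t n = read(in, buf, sizeof buf);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            saved = errno;
            break;
        }
        if (n == 0)     /* EOF on the source file */
            break;
        if (write_all(out, buf, (size_t)n) < 0) {
            saved = errno;
            break;
        }
    }

    /* Flush cached dirty pages; a deferred error such as ENOSPC can surface here. */
    if (saved == 0 && fsync(out) < 0)
        saved = errno;

    /* close() can also report a deferred write error, so check it (once, no retry). */
    if (close(out) < 0 && saved == 0)
        saved = errno;

    close(in);  /* source was only read; its close() result is not significant here */

    if (saved != 0) {
        errno = saved;
        return -1;
    }
    return 0;
}
```

A caller would only unlink the source file (as Firefox does with its temporary file) after copy_file_checked() returns 0; on -1 the destination should be considered incomplete.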