Bug 187295
| Summary: | CIFS under heavy load crashes system | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 4 | Reporter: | Jason Bradley Nance <jbnance> |
| Component: | kernel | Assignee: | Steve Dickson <steved> |
| Status: | CLOSED ERRATA | QA Contact: | Brian Brock <bbrock> |
| Severity: | medium | Priority: | medium |
| Version: | 4.0 | CC: | jbaron, nhruby |
| Hardware: | All | OS: | Linux |
| Fixed In Version: | RHBA-2007-0304 | Doc Type: | Bug Fix |
| Last Closed: | 2007-05-08 01:03:18 UTC | | |
| Bug Depends On: | 199167 | Bug Blocks: | 176344 |

Description
Jason Bradley Nance
2006-03-29 20:23:26 UTC
Is this a regression from U2 cifs behavior?

I experienced the same behavior in U2 and was waiting to test in U3 due to the cifs update. Looks like we need another update :(

Then I won't report the "processes become deadlocked when cifs server goes away" bug that I'm trying to figure out as well... :) Is there an estimate on the next kernel rollout?

Do you know offhand if the e1000 issues have been resolved in U3? This machine uses the e1000 driver with an onboard card as well as a dual-port PCI-X card. I'd hate to blame the wrong module... =)

Guess I need to do some more testing...

Okay, I ran stress tests on the e1000 driver and was unable to make it crash, so I believe this is actually a CIFS issue.

METOO. I'm seeing the same behavior on an HP DL380 G3 (tg3-based NICs). Luckily (?) I seem to be able to reproduce this at will with rsync, as it generally locks up when rsync'ing a specific user's data from one system to the CIFS filesystem, even though the load is not heavy at all (on both the system as a whole and the filesystem). Please LMK if there are any specific debugging steps you'd like me to take.

Created attachment 130880: OOPS from kernel
OOPS from serial console when running an rsync against user data that triggers this issue.
Hurm.. Looking at the OOPS, the top of the call trace is cifs_rename. Oddly, the directory with the bad data contains two files, "Buddy.jpg" and "buddy.jpg". The first one syncs just duckily; the latter one is the last file rsync spits out before the crash. Could this be a case sensitivity issue tickling a deadlock? I cannot reproduce this with vi, so rsync must do something funky when copying?

OK, here is a pointer to a changeset which might resolve this issue: http://marc.theaimsgroup.com/?l=git-commits-head&m=111767133714737&w=2

I've examined the backtrace in comment #9, and it is indeed falling over on exactly the same code that is fixed in the above patch. Therefore, I think it is likely that this patch will solve your issue. I've created a test kernel with this patch (it doesn't quite apply cleanly to rhel4), based on the latest beta kernel: http://people.redhat.com/~jbaron/bz187295/

I could build it on top of 34.0.1 if you want... Also, I've contacted the upstream CIFS maintainer, and we are in discussion as to what further CIFS improvements are appropriate for rhel4. Thanks.

Anybody had a chance to test this? Thanks.

That kernel wouldn't boot for me. Hangs on PCI probing. *shrug*

Hmmm... sounds like a different issue. If possible, could you post the boot log up to the point where it fails? I'm also curious whether the latest U4 beta kernels work for you, located at: http://people.redhat.com/~jbaron/rhel4/ Thanks.

2.6.9-40 has the same cifs module as 2.6.9-34(.0.1); is there any reason to test that? As far as the kernel you built goes... it's an i686 and I'm running an x86_64 install... *shrug*

OK, I've placed test kernels for x86 and x86_64 at: http://people.redhat.com/~jbaron/bz187295/ Please let us know if these resolve the issue. Thanks.

This issue appears to be resolved in 2.6.9-40.1.EL.cifs.1smp. Thank you.

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

Committed in stream U5 build 42.10. A test kernel with this patch is available from http://people.redhat.com/~jbaron/rhel4/

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2007-0304.html
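
For anyone wanting to probe the rsync observation above without the original user data, here is a rough sketch of a reproducer. It rests on an assumption not confirmed in this report: that the trigger is rsync's default behavior of writing each file to a temporary name in the destination directory and then renaming it into place, which would explain why cifs_rename sits at the top of the trace and why an in-place write from an editor may not hit it. The function names and the file contents are hypothetical; only the two file names come from the report.

```python
import os
import tempfile


def copy_via_rename(dest_dir, name, data):
    """Write data to a temp file in dest_dir, then rename it to name.

    This imitates a write-then-rename copy; it is not rsync itself.
    """
    fd, tmp_path = tempfile.mkstemp(prefix="." + name + ".", dir=dest_dir)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        final_path = os.path.join(dest_dir, name)
        # The rename is the step that reaches the filesystem's rename handler.
        os.rename(tmp_path, final_path)
    except BaseException:
        # Do not leave the temp file behind if anything failed.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path


def exercise_case_pair(dest_dir, names=("Buddy.jpg", "buddy.jpg")):
    """Rename each name into place in turn; return the sorted directory listing."""
    for name in names:
        # Placeholder contents: the file name itself, as bytes.
        copy_via_rename(dest_dir, name, name.encode("ascii"))
    return sorted(os.listdir(dest_dir))
```

To try it, call `exercise_case_pair()` with the path of an empty scratch directory on the CIFS mount, on a test machine with a serial console attached, since an affected kernel may oops. On a case-sensitive local filesystem the listing shows both names; against a case-insensitive share the second rename lands on the first file.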