Bug 178469
Summary: | SPECsfs NFS work load receives RPC errors from single node GFS; works fine with EXT3 | ||
---|---|---|---|
Product: | [Retired] Red Hat Cluster Suite | Reporter: | Barry Marson <bmarson> |
Component: | gfs | Assignee: | Wendy Cheng <nobody+wcheng> |
Status: | CLOSED ERRATA | QA Contact: | GFS Bugs <gfs-bugs> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 4 | CC: | dshaks, rkenna |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | RHBA-2006-0234 | Doc Type: | Bug Fix |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2006-03-09 19:46:40 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 164915 |
Description
Barry Marson
2006-01-20 20:52:23 UTC
This was done on RHEL4 U3 Beta.

I was adding some trap code within the SPECsfs code itself (on the NFS client machine) over the weekend and noticed that the NFS/GFS server had frozen. I didn't expect this, so there was no crash server set up. I went to check the machines in the lab today (Monday); part of the panic log had scrolled off the screen, so I couldn't catch the exact place. This shouldn't happen, since SPECsfs is a stand-alone application that has no kernel piece, nor does it use any of the kernel NFS code. It interacts directly with the NFS/GFS server via network packets.

Part of the crash trace:

nfsd_getattr write_inode wh_kupdate __sync_single_inode write_inode_now_err kthread nfsd_setattr nfsd_create_v3 nfsd_proc_create nfsd_dispatch svc_process nfsd

Assertion "FALSE" failed
function = check_seg_usage ... gfs-kernel-2.6.9-47/smp/src/log.c
line 590
time=1137891017

I ran through one round of SPECsfs today (first time ever) with "NFS Op Error Count" all zero, using the following combination:

1. Comment out (disable) the base kernel NFSD's readache code.
2. Byte swap the 6th word of the GFS file handle during decoding (gfs_decode_fh).
3. (NFS) Export the filesystem using the "fsid" option (bypassing the GFS diaper device).
4. Re-make the GFS filesystem (gfs_mkfs).
5. Fix a debug buffer overflow issue.

Need to see whether it is repeatable and further isolate the culprits.

Third round works without error. So let's ship it!

Thanks Wendy. Good news: the SPECsfs workloads in x86 mode run to completion with no transaction errors reported. Please log in to bigbaddell2.lab, which is updated to 2.6.9-32.ELsmp. Barry will baseline the hugemem against 2 CPUs, 4 CPUs and 8 CPUs on the SPECsfs testbed.

The fix has been built by Chris Feist and should be in the RHEL4 U3 GFS release.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2006-0234.html
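
For readers unfamiliar with item 2 of the combination listed in the comments above ("Byte swap the 6th word of GFS file handle during decoding"), the following is a minimal sketch of what such a swap could look like. It is not the actual gfs_decode_fh change, which is not shown in this report. It assumes the file handle is viewed as an array of 32-bit words and that "6th word" means index 5; the names swap32, fh_swap_sixth_word and FH_SWAP_WORD_INDEX are hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: "6th word" means index 5 when the handle is an array of
 * 32-bit words. The report does not state the word size. */
#define FH_SWAP_WORD_INDEX 5

/* Reverse the byte order of a single 32-bit word. */
static uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}

/* Hypothetical helper: byte swap the 6th word of a file handle in place.
 * Returns 0 on success, -1 if the handle is missing or shorter than
 * six words. All other words are left untouched. */
static int fh_swap_sixth_word(uint32_t *fh, size_t nwords)
{
    if (fh == NULL || nwords <= FH_SWAP_WORD_INDEX)
        return -1;

    fh[FH_SWAP_WORD_INDEX] = swap32(fh[FH_SWAP_WORD_INDEX]);
    return 0;
}
```

Applying the helper twice restores the original handle, so a sketch like this can be checked in user space before any comparable change is tried in a decode path.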