Bug 762551 (GLUSTER-819)

Summary: NFS server hangs on dd run
Product: [Community] GlusterFS
Reporter: Shehjar Tikoo <shehjart>
Component: nfs
Assignee: Shehjar Tikoo <shehjart>
Status: CLOSED WORKSFORME
Severity: medium
Priority: low
Version: nfs-alpha
CC: gluster-bugs
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Regression: RTNR
Mount Type: nfs
Attachments:
  Description              Flags
  GlusterFS server log     none
  nfs server log part 1    none
  nfs server log part 2    none

Description Shehjar Tikoo 2010-04-12 07:31:27 UTC
Reported on gluster-users:
chadr wrote:
> I ran the same dd tests from KnowYourNFSAlpha-1.pdf and performance is
> inconsistent and causes the server to become unresponsive.
> 
> My server freezes every time when I run the following command:
> dd if=/dev/zero of=garb bs=256k count=64000
> 
> I would also like to mount a path like: /volume/some/random/dir 
> 
> # mount host:/gluster/tmp /mnt/test
> mount: host:/gluster/tmp failed, reason given by server: No such file or
> directory
> 
> I can mount it up host:/volume_name and /mnt/test/tmp exists
> 
> dd if=/dev/zero of=garb bs=64K count=100
> 100+0 records in
> 100+0 records out
> 6553600 bytes (6.6 MB) copied, 0.068906 seconds, 95.1 MB/s
> 
> dd of=garb if=/dev/zero bs=64K count=100
> 100+0 records in
> 100+0 records out
> 6553600 bytes (6.6 MB) copied, 0.057207 seconds, 115 MB/s
> 
> dd if=/dev/zero of=garb bs=64K count=1000
> 1000+0 records in
> 1000+0 records out
> 65536000 bytes (66 MB) copied, 0.523117 seconds, 125 MB/s
> 
> dd of=garb if=/dev/zero bs=64K count=1000
> 1000+0 records in
> 1000+0 records out
> 65536000 bytes (66 MB) copied, 1.04666 seconds, 62.6 MB/s
> 
> dd if=/dev/zero of=garb bs=64K count=10000
> 10000+0 records in
> 10000+0 records out
> 655360000 bytes (655 MB) copied, 10.9809 seconds, 59.7 MB/s
> 
> dd of=garb if=/dev/zero bs=64K count=10000
> 10000+0 records in
> 10000+0 records out
> 655360000 bytes (655 MB) copied, 11.3515 seconds, 57.7 MB/s
> 
> dd if=/dev/zero of=garb bs=128K count=100
> 100+0 records in
> 100+0 records out
> 13107200 bytes (13 MB) copied, 0.105364 seconds, 124 MB/s
> 
> dd of=garb if=/dev/zero bs=128K count=100
> 100+0 records in
> 100+0 records out
> 13107200 bytes (13 MB) copied, 0.254225 seconds, 51.6 MB/s
> 
> dd if=/dev/zero of=garb bs=128K count=1000
> 1000+0 records in
> 1000+0 records out
> 131072000 bytes (131 MB) copied, 60.1008 seconds, 2.2 MB/s
> 
> dd of=garb if=/dev/zero bs=128K count=1000
> 1000+0 records in
> 1000+0 records out
> 131072000 bytes (131 MB) copied, 1.51868 seconds, 86.3 MB/s
> 
> dd if=/dev/zero of=garb bs=128K count=10000
> 10000+0 records in
> 10000+0 records out
> 1310720000 bytes (1.3 GB) copied, 18.7755 seconds, 69.8 MB/s
> 
> dd of=garb if=/dev/zero bs=128K count=10000
> 10000+0 records in
> 10000+0 records out
> 1310720000 bytes (1.3 GB) copied, 18.9837 seconds, 69.0 MB/s
> 
> dd if=/dev/zero of=garb bs=256k count=64000
> My server freezes.
> 
> 
> Here is the recent nfs log when the server froze:
> 
> [2010-04-09 23:37:33] D [nfs3-helpers.c:2114:nfs3_log_rw_call] nfs-nfsv3:
> XID: 6f68c85f, WRITE: args: FH: hashcount 2, xlid 0, gen
> 5458285267163021319, ino 11856898, offset: 1129578496,  count: 65536,
> UNSTABLE
> [2010-04-09 23:37:33] D [rpcsvc.c:1790:rpcsvc_request_create] rpc-service:
> RPC XID: 7068c85f, Ver: 2, Program: 100003, ProgVers: 3, Proc: 7
> [2010-04-09 23:37:33] D [rpcsvc.c:1266:rpcsvc_program_actor] rpc-service:
> Actor found: NFS3 - WRITE
> [2010-04-09 23:37:33] D [rpcsvc.c:1266:rpcsvc_program_actor] rpc-service:
> Actor found: NFS3 - WRITE
> [2010-04-09 23:37:33] D [rpcsvc.c:1266:rpcsvc_program_actor] rpc-service:
> Actor found: NFS3 - WRITE
> [2010-04-09 23:37:33] D [rpcsvc.c:1266:rpcsvc_program_actor] rpc-service:
> Actor found: NFS3 - WRITE
> [2010-04-09 23:37:33] D [rpcsvc.c:1266:rpcsvc_program_actor] rpc-service:
> Actor found: NFS3 - WRITE
> [2010-04-09 23:37:33] D [nfs3-helpers.c:2114:nfs3_log_rw_call] nfs-nfsv3:
> XID: 7068c85f, WRITE: args: FH: hashcount 2, xlid 0, gen
> 5458285267163021319, ino 11856898, offset: 1129644032,  count: 65536,
> UNSTABLE
> [2010-04-09 23:37:33] D [rpcsvc.c:1790:rpcsvc_request_create] rpc-service:
> RPC XID: 7168c85f, Ver: 2, Program: 100003, ProgVers: 3, Proc: 7
> [2010-04-09 23:37:33] D [rpcsvc.c:1266:rpcsvc_program_actor] rpc-service:
> Actor found: NFS3 - WRITE
> [2010-04-09 23:37:33] D [rpcsvc.c:1266:rpcsvc_program_actor] rpc-service:
> Actor found: NFS3 - WRITE
> [2010-04-09 23:37:33] D [rpcsvc.c:1266:rpcsvc_program_actor] rpc-service:
> Actor found: NFS3 - WRITE
> [2010-04-09 23:37:33] D [rpcsvc.c:1266:rpcsvc_program_actor] rpc-service:
> Actor found: NFS3 - WRITE
> [2010-04-09 23:37:33] D [rpcsvc.c:1266:rpcsvc_program_actor] rpc-service:
> Actor found: NFS3 - WRITE
> [2010-04-09 23:37:33] D [nfs3-helpers.c:2114:nfs3_log_rw_call] nfs-nfsv3:
> XID: 7168c85f, WRITE: args: FH: hashcount 2, xlid 0, gen
> 5458285267163021319, ino 11856898, offset: 1129709568,  count: 65536,
> UNSTABLE
> [2010-04-09 23:38:33] D [rpcsvc.c:1790:rpcsvc_request_create] rpc-service:
> RPC XID: 6268c85f, Ver: 2, Program: 100003, ProgVers: 3, Proc: 7
> 
> Thanks,
> Chad Richards
> _________________________
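
To make the inconsistency in the dd numbers above easier to see, the throughput can be recomputed from the byte count and elapsed time in each dd summary line. The following is a minimal sketch, not part of the original report: the helper names throughput_mb_s and slow_runs, the RUNS_128K list, and the 0.25 cutoff are all assumptions made for illustration. It assumes the GNU dd summary format shown in the report.

```python
import re

# Matches the GNU dd summary line format seen in the report, e.g.
# "6553600 bytes (6.6 MB) copied, 0.068906 seconds, 95.1 MB/s"
DD_SUMMARY = re.compile(r"(\d+) bytes \(.*?\) copied, ([\d.]+) seconds?")


def throughput_mb_s(summary_line):
    """Recompute throughput in MB/s (1 MB = 10**6 bytes) from bytes and seconds."""
    m = DD_SUMMARY.search(summary_line)
    if m is None:
        raise ValueError("not a dd summary line: %r" % summary_line)
    nbytes, seconds = int(m.group(1)), float(m.group(2))
    return nbytes / seconds / 1e6


def slow_runs(summary_lines, factor=0.25):
    """Return (line, rate) pairs whose rate is below factor * the median rate.

    The 0.25 default is an arbitrary illustrative cutoff, not a tuned value.
    For an even number of runs the upper of the two middle rates is used.
    """
    rates = [(line, throughput_mb_s(line)) for line in summary_lines]
    ordered = sorted(rate for _, rate in rates)
    median = ordered[len(ordered) // 2]
    return [(line, rate) for line, rate in rates if rate < factor * median]


# Summary lines copied from the bs=128K runs in the report.
RUNS_128K = [
    "13107200 bytes (13 MB) copied, 0.105364 seconds, 124 MB/s",
    "13107200 bytes (13 MB) copied, 0.254225 seconds, 51.6 MB/s",
    "131072000 bytes (131 MB) copied, 60.1008 seconds, 2.2 MB/s",
    "131072000 bytes (131 MB) copied, 1.51868 seconds, 86.3 MB/s",
    "1310720000 bytes (1.3 GB) copied, 18.7755 seconds, 69.8 MB/s",
    "1310720000 bytes (1.3 GB) copied, 18.9837 seconds, 69.0 MB/s",
]

for line, rate in slow_runs(RUNS_128K):
    print("slow run: %.1f MB/s <- %s" % (rate, line))
```

With the bs=128K runs, only the 60-second run (131 MB at about 2.2 MB/s) falls below the cutoff. That points at an intermittent stall rather than uniformly low throughput.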

Comment 1 Chad Richards 2010-04-12 14:50:18 UTC
Created attachment 176 [details]
RH6.x doesnt work. same code

Comment 2 Chad Richards 2010-04-12 14:58:40 UTC
Created attachment 177 [details]
Fixed EMU10k1 patch. There's only 1 modification SUB_DIRS -> MOD_SUB_DIRS

Comment 3 Chad Richards 2010-04-12 14:59:02 UTC
Created attachment 178 [details]
trivial #ifdef STAT64 => #ifdef HAVE_STAT64

Comment 4 Shehjar Tikoo 2010-04-21 03:55:18 UTC
Chad, I've looked at the NFS server log, and the server is receiving and serving WRITE requests without error right up to the end of the log provided. Since the server also freezes under a unfs+fuse load, I'd suggest running the same dd load against the NFS translator on a different machine, to confirm that the problem is unrelated to the kernel freeze.

Thanks
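
One way to repeat this kind of log check is to extract the WRITE calls logged by nfs3_log_rw_call and confirm that each offset follows on from the previous offset plus count, as expected for a sequential dd write. The sketch below is illustrative only and is not the analysis actually performed for this bug: the names write_calls and gaps and the LOG_EXCERPT sample are assumptions, and the regular expression is fitted only to the log lines quoted in the description. It checks logged calls, not replies, and a gap is not necessarily an error because NFS clients may send writes out of order.

```python
import re

# Fitted to the nfs3_log_rw_call lines quoted in the description. DOTALL lets
# the match span the line wrapping and "> " quote prefixes of the email.
WRITE_CALL = re.compile(
    r"XID: ([0-9a-f]+), WRITE:.*?offset: (\d+),\s+count: (\d+)", re.DOTALL)


def write_calls(log_text):
    """Return (xid, offset, count) for each logged NFSv3 WRITE call, in log order."""
    return [(xid, int(off), int(cnt))
            for xid, off, cnt in WRITE_CALL.findall(log_text)]


def gaps(calls):
    """Return (expected_offset, actual_offset) wherever consecutive writes do not abut."""
    out = []
    for (_, off, cnt), (_, next_off, _) in zip(calls, calls[1:]):
        if off + cnt != next_off:
            out.append((off + cnt, next_off))
    return out


# Excerpt copied from the log quoted in the description (quote prefixes kept).
LOG_EXCERPT = """\
> [2010-04-09 23:37:33] D [nfs3-helpers.c:2114:nfs3_log_rw_call] nfs-nfsv3:
> XID: 6f68c85f, WRITE: args: FH: hashcount 2, xlid 0, gen
> 5458285267163021319, ino 11856898, offset: 1129578496,  count: 65536,
> UNSTABLE
> [2010-04-09 23:37:33] D [rpcsvc.c:1790:rpcsvc_request_create] rpc-service:
> RPC XID: 7068c85f, Ver: 2, Program: 100003, ProgVers: 3, Proc: 7
> [2010-04-09 23:37:33] D [nfs3-helpers.c:2114:nfs3_log_rw_call] nfs-nfsv3:
> XID: 7068c85f, WRITE: args: FH: hashcount 2, xlid 0, gen
> 5458285267163021319, ino 11856898, offset: 1129644032,  count: 65536,
> UNSTABLE
> [2010-04-09 23:37:33] D [nfs3-helpers.c:2114:nfs3_log_rw_call] nfs-nfsv3:
> XID: 7168c85f, WRITE: args: FH: hashcount 2, xlid 0, gen
> 5458285267163021319, ino 11856898, offset: 1129709568,  count: 65536,
> UNSTABLE
"""

calls = write_calls(LOG_EXCERPT)
print("WRITE calls:", len(calls), "gaps:", gaps(calls))
```

For the excerpt quoted in the description, the three logged WRITE calls are contiguous 64 KiB writes, which is consistent with the server still accepting sequential writes at that point in the log.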

Comment 5 Shehjar Tikoo 2010-05-04 07:13:02 UTC
Closing bug on account of no response from user.