Bug 987847
Summary: nfs: EIO while untarring linux kernel and creating dirs with 200 depth simultaneously
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: M S Vishwanath Bhat <vbhat>
Component: gluster-nfs
Assignee: Niels de Vos <ndevos>
Status: CLOSED EOL
QA Contact: Saurabh <saujain>
Severity: medium
Priority: low
Version: 2.1
CC: mzywusko, poelstra, rhs-bugs, rjoseph, saujain, surs, vagarwal, vbellur, vbhat
Keywords: ZStream
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2015-12-03 17:24:58 UTC
Type: Bug
Created attachment 777695: nfs log from mustang.blr.redhat.com
Considering the directory depth is ~200 here, I am moving the priority to 'low'. If it were ~20 or less, this would be high priority.

From the logs it seems that one of the bricks in a replicate set went down during or before the I/O operation, because of which NFS is failing all the I/O operations on that brick with "I/O Error". Please check whether all the brick processes are running properly. Also check whether any brick process crashed during the operation. If so, please attach the core file for details.

As far as I remember, I didn't bring down any node and all the bricks were online. There were no crashes. I don't have that setup now, but there is nothing geo-rep specific to it, so ideally it should be readily reproducible, even though I haven't tried it again.

I tried to reproduce the issue with a single-brick NFS server but without success; I don't see any issue. It would be great if QE could try with the latest RHS build.

Thank you for submitting this issue for consideration in Red Hat Gluster Storage. The release for which you requested a review is now End of Life. Please see https://access.redhat.com/support/policy/updates/rhs/

If you can reproduce this bug against a currently maintained version of Red Hat Gluster Storage, please feel free to file a new report against the current release.

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days.
Created attachment 777694: nfs log from harrier.blr.redhat.com

Description of problem:
I was trying to untar the linux kernel on the NFS mount point. It was taking a *long* time (more than an hour). I then tried to create dirs with 200 depth (well within the path max), and now I see EIO while untarring the kernel.

Note: There was a geo-rep session going on between master (where I hit this issue) and another slave node of 2*2 distribute-distribute

Version-Release number of selected component (if applicable):
glusterfs-3.4.0.12rhs.beta6-1.el6rhs.x86_64

How reproducible:
Hit once. Not sure if 100% reproducible.

Steps to Reproduce:
1. Create and start a 2*2 distributed-replicated volume.
2. Mount via nfs and start untarring the linux kernel.
3. While untarring is still in progress, create dirs with 200 depth from another client.
mkdir -p `perl -e "print 'foo/' x 200"`

Actual results:
Errors seen during kernel untar:
linux-3.10.1/arch/ia64/include/asm/timex.h
linux-3.10.1/arch/ia64/include/asm/tlb.h
linux-3.10.1/arch/ia64/include/asm/tlbflush.h
linux-3.10.1/arch/ia64/include/asm/topology.h
linux-3.10.1/arch/ia64/include/asm/types.h
tar: linux-3.10.1/arch/ia64/include/asm/types.h: Cannot close: Input/output error
linux-3.10.1/arch/ia64/include/asm/uaccess.h
linux-3.10.1/arch/ia64/include/asm/unaligned.h
tar: linux-3.10.1/arch/ia64/include/asm/unaligned.h: Cannot close: Input/output error

Expected results:
Linux kernel untar should not error out and should not take a long time.
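The remark that a 200-level directory tree is well within the path limit can be checked with a little arithmetic. The sketch below is only illustrative: the limits of 4096 bytes per path and 255 bytes per component are the values commonly used on Linux and are assumed here rather than taken from this report, and the helper name fits_path_limits is made up for the example.

```python
# Assumed limits (typical Linux values; not stated anywhere in this report).
ASSUMED_PATH_MAX = 4096   # bytes, including the terminating NUL
ASSUMED_NAME_MAX = 255    # bytes per path component

DEPTH = 200
# Same string that `perl -e "print 'foo/' x 200"` prints in step 3.
relative_path = "foo/" * DEPTH


def fits_path_limits(path, path_max=ASSUMED_PATH_MAX, name_max=ASSUMED_NAME_MAX):
    """Return True if the path and each of its components fit the limits."""
    encoded = path.encode()
    # +1 accounts for the NUL terminator that the C API needs.
    if len(encoded) + 1 > path_max:
        return False
    components = [part for part in encoded.split(b"/") if part]
    return all(len(part) <= name_max for part in components)


if __name__ == "__main__":
    print("path length in bytes:", len(relative_path))
    print("fits assumed limits:", fits_path_limits(relative_path))
```

Under these assumptions the relative path is 800 bytes long, so even with a mount point prefix and a file name appended it stays far below the assumed limit.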
Additional info:

Messages from the log files

[2013-07-24 07:57:28.714625] I [afr-common.c:2118:afr_set_root_inode_on_first_lookup] 0-hosa-master-replicate-1: added root inode
[2013-07-24 07:57:28.715705] I [afr-common.c:2181:afr_discovery_cbk] 0-hosa-master-replicate-0: selecting local read_child hosa-master-client-1
[2013-07-24 09:17:30.264006] E [rpc-clnt.c:207:call_bail] 0-hosa-master-client-0: bailing out frame type(GlusterFS 3.3) op(FXATTROP(34)) xid = 0x123674x sent = 2013-07-24 08:47:24.430155. timeout = 1800
[2013-07-24 09:17:30.264055] W [client-rpc-fops.c:1811:client3_3_fxattrop_cbk] 0-hosa-master-client-0: remote operation failed: Transport endpoint is not connected
[2013-07-24 09:17:30.264076] E [rpc-clnt.c:207:call_bail] 0-hosa-master-client-1: bailing out frame type(GlusterFS 3.3) op(FXATTROP(34)) xid = 0x128217x sent = 2013-07-24 08:47:24.430210. timeout = 1800
[2013-07-24 09:17:30.264084] W [client-rpc-fops.c:1811:client3_3_fxattrop_cbk] 0-hosa-master-client-1: remote operation failed: Transport endpoint is not connected
[2013-07-24 09:17:31.341269] W [nfs3.c:2069:nfs3svc_write_cbk] 0-nfs: e95cef95: /linux-3.10.1/arch/ia64/include/asm/types.h => -1 (Transport endpoint is not connected)
[2013-07-24 09:17:31.341314] W [nfs3-helpers.c:3443:nfs3_log_write_res] 0-nfs-nfsv3: XID: e95cef95, WRITE: NFS: 5(I/O error), POSIX: 107(Transport endpoint is not connected), count: 0, STABLE,wverf: 1374652648

[root@mustang ~]# tailf /var/log/glusterfs/nfs.log
[2013-07-24 09:33:44.158046] W [nfs3-helpers.c:3443:nfs3_log_write_res] 0-nfs-nfsv3: XID: e95cef95, WRITE: NFS: 5(I/O error), POSIX: 107(Transport endpoint is not connected), count: 0, STABLE,wverf: 1374652648
[2013-07-24 09:33:44.158177] W [client-rpc-fops.c:1579:client3_3_finodelk_cbk] 0-hosa-master-client-1: remote operation failed: Invalid argument
[2013-07-24 09:33:44.158191] I [afr-lk-common.c:669:afr_unlock_inodelk_cbk] 0-hosa-master-replicate-0: (null): unlock failed on 1 unlock by 2434990000000000
[2013-07-24 09:33:44.158213] W [nfs3.c:2069:nfs3svc_write_cbk] 0-nfs: a5def95: /linux-3.10.1/arch/ia64/include/asm/unaligned.h => -1 (Transport endpoint is not connected) [2013-07-24 09:33:44.158221] W [nfs3-helpers.c:3443:nfs3_log_write_res] 0-nfs-nfsv3: XID: a5def95, WRITE: NFS: 5(I/O error), POSIX: 107(Transport endpoint is not connected), count: 0, STABLE,wverf: 1374652648 [2013-07-24 09:33:53.224235] I [rpc-clnt.c:1675:rpc_clnt_reconfig] 0-hosa-master-client-0: changing port to 49152 (from 0) [2013-07-24 09:42:34.383365] W [nfs3.c:2069:nfs3svc_write_cbk] 0-nfs: 6745391f: /foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/.linux-3.10.1.tar.gz.oToFw3 => -1 (Transport endpoint is not connected) [2013-07-24 09:42:34.383383] W [nfs3-helpers.c:3443:nfs3_log_write_res] 0-nfs-nfsv3: XID: 6745391f, WRITE: NFS: 5(I/O error), POSIX: 107(Transport endpoint is not connected), count: 0, UNSTABLE,wverf: 1374652647 [2013-07-24 09:42:34.383571] W [nfs3.c:2069:nfs3svc_write_cbk] 0-nfs: 6945391f: 
/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/.linux-3.10.1.tar.gz.oToFw3 => -1 (Transport endpoint is not connected) [2013-07-24 09:42:34.383603] W [nfs3-helpers.c:3443:nfs3_log_write_res] 0-nfs-nfsv3: XID: 6945391f, WRITE: NFS: 5(I/O error), POSIX: 107(Transport endpoint is not connected), count: 0, UNSTABLE,wverf: 1374652647 [2013-07-24 09:42:34.383771] W [nfs3.c:2069:nfs3svc_write_cbk] 0-nfs: 6b45391f: 
/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/.linux-3.10.1.tar.gz.oToFw3 => -1 (Transport endpoint is not connected) [2013-07-24 09:42:34.383789] W [nfs3-helpers.c:3443:nfs3_log_write_res] 0-nfs-nfsv3: XID: 6b45391f, WRITE: NFS: 5(I/O error), POSIX: 107(Transport endpoint is not connected), count: 0, UNSTABLE,wverf: 1374652647 [2013-07-24 09:42:34.383968] W [nfs3.c:2069:nfs3svc_write_cbk] 0-nfs: 6d45391f: 
/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/.linux-3.10.1.tar.gz.oToFw3 => -1 (Transport endpoint is not connected) [2013-07-24 09:42:34.384006] W [nfs3-helpers.c:3443:nfs3_log_write_res] 0-nfs-nfsv3: XID: 6d45391f, WRITE: NFS: 5(I/O error), POSIX: 107(Transport endpoint is not connected), count: 0, UNSTABLE,wverf: 1374652647 I have attached the nfs logs from both the nfs servers.
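When triaging logs like the ones above, it can help to pull the key numbers out of the call_bail and nfs3_log_write_res entries. The following is a rough sketch, not a tool that ships with GlusterFS: the regular expressions and the helper names parse_bail and parse_write_result are assumptions written against the two sample lines copied from this report, and other log variants may not match them. It also assumes the timeout value is expressed in seconds, which is consistent with the roughly 30-minute gap between the sent and logged timestamps.

```python
import re
from datetime import datetime, timedelta

# Two entries copied verbatim from the nfs.log excerpt in this report.
BAIL_LINE = (
    "[2013-07-24 09:17:30.264006] E [rpc-clnt.c:207:call_bail] "
    "0-hosa-master-client-0: bailing out frame type(GlusterFS 3.3) "
    "op(FXATTROP(34)) xid = 0x123674x sent = 2013-07-24 08:47:24.430155. "
    "timeout = 1800"
)
WRITE_LINE = (
    "[2013-07-24 09:17:31.341314] W [nfs3-helpers.c:3443:nfs3_log_write_res] "
    "0-nfs-nfsv3: XID: e95cef95, WRITE: NFS: 5(I/O error), "
    "POSIX: 107(Transport endpoint is not connected), count: 0, "
    "STABLE,wverf: 1374652648"
)

TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"

# Hypothetical patterns, written only against the sample lines above.
BAIL_RE = re.compile(
    r"^\[(?P<logged>[^\]]+)\] E \[[^\]]*call_bail\] (?P<xlator>\S+): .*"
    r"op\((?P<op>\w+)\(\d+\)\).*sent = (?P<sent>[\d-]+ [\d:.]+?)\. "
    r"timeout = (?P<timeout>\d+)"
)
WRITE_RE = re.compile(
    r"XID: (?P<xid>[0-9a-f]+), WRITE: NFS: (?P<nfs_status>\d+)"
    r"\((?P<nfs_text>[^)]*)\), POSIX: (?P<posix_errno>\d+)"
    r"\((?P<posix_text>[^)]*)\)"
)


def parse_bail(line):
    """Extract timing details from a call_bail entry, or None if no match."""
    m = BAIL_RE.match(line)
    if m is None:
        return None
    logged = datetime.strptime(m.group("logged"), TS_FORMAT)
    sent = datetime.strptime(m.group("sent"), TS_FORMAT)
    return {
        "xlator": m.group("xlator"),
        "op": m.group("op"),
        "timeout": timedelta(seconds=int(m.group("timeout"))),  # assumed seconds
        "waited": logged - sent,  # how long the frame was outstanding
    }


def parse_write_result(line):
    """Extract the NFS status and POSIX errno from a WRITE result entry."""
    m = WRITE_RE.search(line)
    if m is None:
        return None
    return {
        "xid": m.group("xid"),
        "nfs_status": int(m.group("nfs_status")),
        "nfs_text": m.group("nfs_text"),
        "posix_errno": int(m.group("posix_errno")),
        "posix_text": m.group("posix_text"),
    }


if __name__ == "__main__":
    bail = parse_bail(BAIL_LINE)
    print(bail["xlator"], bail["op"], "waited", bail["waited"])
    print(parse_write_result(WRITE_LINE))
```

For the sample call_bail entry, the frame was sent at 08:47:24 and bailed out at 09:17:30, slightly more than 1800 seconds later. The WRITE entry shows how the POSIX error 107 (Transport endpoint is not connected) is reported to the NFS client as status 5 (I/O error), which is the EIO that tar prints.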