Bug 763577 (GLUSTER-1845) - Create fop hangs beyond distribute
Summary: Create fop hangs beyond distribute
Keywords:
Status: CLOSED WORKSFORME
Alias: GLUSTER-1845
Product: GlusterFS
Classification: Community
Component: nfs
Version: nfs-alpha
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Shehjar Tikoo
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-10-07 08:35 UTC by Shehjar Tikoo
Modified: 2015-12-01 16:45 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Regression: RTP
Mount Type: nfs
Documentation: ---
CRM:
Verified Versions:



Description Shehjar Tikoo 2010-10-07 05:43:13 UTC
Log at: /share/tickets/1845/

Comment 1 Shehjar Tikoo 2010-10-07 08:35:25 UTC
From Allen:

While testing a 4 node setup of dist+mirror with rc15 GNFS, we ran into a lockup issue after the vIP is failed over to the mirrored node with ucarp.

The vIPs are 153.90.178.104/24 and 153.90.178.105/24. 104 failed over to 105, and then the lockup happened.

There are two problems here:

1. The nfs server is receiving changed inode numbers from the server. Its first response should be to revalidate the inode. This was fixed in a patch committed after rc15, so it is not a problem right now.

2. The end of the log shows that the lockup happens because no reply arrives from either replicate or the brick for a create fop, e.g.:

[2010-10-06 17:52:46] D [nfs3-helpers.c:2275:nfs3_log_create_call] nfs-nfsv3: XID: 74d03094, CREATE: args: FH: hashcount 0, xlid 0, gen 0, ino 1, name: 11, mode: UNCHECKED
[2010-10-06 17:52:46] T [nfs3.c:2240:nfs3_create] nfs-nfsv3: FH to Volume: glustervol1
[2010-10-06 17:52:46] T [nfs3-helpers.c:3000:nfs3_fh_resolve_entry_hard] nfs-nfsv3: FH hard resolution: ino: 1, gen: 0, entry: 11, hashidx: 0
[2010-10-06 17:52:46] T [nfs3-helpers.c:3008:nfs3_fh_resolve_entry_hard] nfs-nfsv3: Entry needs lookup: /11
[2010-10-06 17:52:46] T [nfs-fops.c:280:nfs_fop_lookup] nfs: Lookup: /11
[2010-10-06 17:52:46] T [nfs3-helpers.c:2549:nfs3_fh_resolve_entry_lookup_cbk] nfs-nfsv3: Lookup failed: /11: No such file or directory
[2010-10-06 17:52:46] T [nfs.c:600:nfs_user_create] nfs: uid: 0, gid 0, gids: 1
[2010-10-06 17:52:46] T [nfs.c:608:nfs_user_create] nfs: gid: 0
[2010-10-06 17:52:46] T [nfs-fops.c:602:nfs_fop_create] nfs: Create: /11
[2010-10-06 17:52:46] T [nfs-fops.c:130:nfs_create_frame] nfs: uid: 0, gid 0, gids: 1
[2010-10-06 17:52:46] T [nfs-fops.c:132:nfs_create_frame] nfs: gid: 0
[2010-10-06 17:52:46] T [dht-common.c:3050:dht_create] glustervol1: creating /11 on mirror-0
[2010-10-06 17:52:46] D [dht-diskusage.c:71:dht_du_info_cbk] glustervol1: on subvolume 'mirror-1': avail_percent is: 99.00 and avail_space is: 1476552265728
[2010-10-06 17:52:46] D [dht-diskusage.c:71:dht_du_info_cbk] glustervol1: on subvolume 'mirror-0': avail_percent is: 99.00 and avail_space is: 1476552269824


Not sure yet whether this is a dht, afr or server problem.
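
To narrow down where the create stops, something like the following could be run over the log. This is only a rough sketch: the line pattern is inferred from the excerpt quoted above, and the helper names (parse_line, last_create_trace, files_reached) are made up for this note and are not part of GlusterFS. It pulls out everything logged from the last CREATE call onwards and lists the source files that appear, in order, so it is easy to see that the trace reaches dht-common.c and then nothing below it.

```python
import re
import sys

# Line layout inferred from the excerpt above (an assumption, not a spec):
# [timestamp] LEVEL [file:line:function] translator: message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\]\s+"
    r"(?P<xl>[^:]+):\s(?P<msg>.*)$"
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    return m.groupdict() if m else None


def last_create_trace(lines):
    """Return the parsed lines from the last NFSv3 CREATE call to the end."""
    parsed = [p for p in map(parse_line, lines) if p]
    start = None
    for i, p in enumerate(parsed):
        if p["func"] == "nfs3_log_create_call":
            start = i
    return parsed[start:] if start is not None else []


def files_reached(trace):
    """List source files seen in the trace, in order of first appearance."""
    seen = []
    for p in trace:
        if p["file"] not in seen:
            seen.append(p["file"])
    return seen


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python trace_create.py <path-to-nfs-log>
    with open(sys.argv[1]) as fh:
        for name in files_reached(last_create_trace(fh)):
            print(name)
```

If the last file listed is from dht and nothing from replicate or the protocol layers follows, that matches what the excerpt shows. It does not by itself say which layer dropped the reply.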

Comment 2 Shehjar Tikoo 2010-11-09 02:10:52 UTC
Sac has been doing failover tests for the last week or so. We haven't run into such a hang in 3.1, so I am closing this bug.

