Bug 763577 - (GLUSTER-1845) Create fop hangs beyond distribute
Status: CLOSED WORKSFORME
Product: GlusterFS
Classification: Community
Component: nfs
Version: nfs-alpha
Hardware: All Linux
Priority: low Severity: medium
Assigned To: Shehjar Tikoo
Reported: 2010-10-07 04:35 EDT by Shehjar Tikoo
Modified: 2015-12-01 11:45 EST
Doc Type: Bug Fix
Regression: RTP
Mount Type: nfs
Description Shehjar Tikoo 2010-10-07 01:43:13 EDT
Log at: /share/tickets/1845/
Comment 1 Shehjar Tikoo 2010-10-07 04:35:25 EDT
From Allen:

While testing a 4 node setup of dist+mirror with rc15 GNFS, we ran into a lockup issue after the vIP was failed over to the mirrored node with ucarp.

vIPs are 153.90.178.104/24 and 153.90.178.105/24. 104 failed over to 105, then the lockup happened.

There are two problems here:

1. The NFS server is receiving changed inode numbers from the server. The NFS server's first response should be to revalidate the inode. This is fixed in a patch committed after rc15, so it is not a problem right now.

2. The end of the log shows that the lock-up happens because there are no replies from either replicate or the brick for a create fop, e.g.:

[2010-10-06 17:52:46] D [nfs3-helpers.c:2275:nfs3_log_create_call] nfs-nfsv3: XID: 74d03094, CREATE: args: FH: hashcount 0, xlid 0, gen 0, ino 1, name: 11, mode: UNCHECKED
[2010-10-06 17:52:46] T [nfs3.c:2240:nfs3_create] nfs-nfsv3: FH to Volume: glustervol1
[2010-10-06 17:52:46] T [nfs3-helpers.c:3000:nfs3_fh_resolve_entry_hard] nfs-nfsv3: FH hard resolution: ino: 1, gen: 0, entry: 11, hashidx: 0
[2010-10-06 17:52:46] T [nfs3-helpers.c:3008:nfs3_fh_resolve_entry_hard] nfs-nfsv3: Entry needs lookup: /11
[2010-10-06 17:52:46] T [nfs-fops.c:280:nfs_fop_lookup] nfs: Lookup: /11
[2010-10-06 17:52:46] T [nfs3-helpers.c:2549:nfs3_fh_resolve_entry_lookup_cbk] nfs-nfsv3: Lookup failed: /11: No such file or directory
[2010-10-06 17:52:46] T [nfs.c:600:nfs_user_create] nfs: uid: 0, gid 0, gids: 1
[2010-10-06 17:52:46] T [nfs.c:608:nfs_user_create] nfs: gid: 0
[2010-10-06 17:52:46] T [nfs-fops.c:602:nfs_fop_create] nfs: Create: /11
[2010-10-06 17:52:46] T [nfs-fops.c:130:nfs_create_frame] nfs: uid: 0, gid 0, gids: 1
[2010-10-06 17:52:46] T [nfs-fops.c:132:nfs_create_frame] nfs: gid: 0
[2010-10-06 17:52:46] T [dht-common.c:3050:dht_create] glustervol1: creating /11 on mirror-0
[2010-10-06 17:52:46] D [dht-diskusage.c:71:dht_du_info_cbk] glustervol1: on subvolume 'mirror-1': avail_percent is: 99.00 and avail_space is: 1476552265728
[2010-10-06 17:52:46] D [dht-diskusage.c:71:dht_du_info_cbk] glustervol1: on subvolume 'mirror-0': avail_percent is: 99.00 and avail_space is: 1476552269824


Not sure yet whether this is a dht, afr or server problem.
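
To narrow down where the call path stops, a small script along these lines could be run over the log. This is only a rough sketch, not part of GlusterFS: it assumes every line follows the "[timestamp] LEVEL [file:line:function] xlator: message" layout seen in the excerpt above, and the names LOG_LINE, Entry, parse_line and last_trace_entry are made up for illustration. It reports the last trace-level ('T') entry, which for the excerpt is the point where distribute hands the create to mirror-0.

```python
import re
from typing import Iterable, NamedTuple, Optional

# Assumed layout, inferred only from the excerpt in this report:
# [YYYY-MM-DD HH:MM:SS] LEVEL [file:line:function] xlator: message
LOG_LINE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] "
    r"(?P<level>[A-Z]) "
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<xlator>[^:]+): (?P<msg>.*)$"
)


class Entry(NamedTuple):
    ts: str
    level: str
    func: str
    xlator: str
    msg: str


def parse_line(line: str) -> Optional[Entry]:
    """Parse one log line; return None if it does not match the assumed layout."""
    m = LOG_LINE.match(line.strip())
    if m is None:
        return None
    return Entry(m["ts"], m["level"], m["func"], m["xlator"], m["msg"])


def last_trace_entry(lines: Iterable[str]) -> Optional[Entry]:
    """Return the last trace-level ('T') entry, i.e. where the logged call path ends."""
    last = None
    for line in lines:
        entry = parse_line(line)
        if entry is not None and entry.level == "T":
            last = entry
    return last


# Three lines copied from the excerpt above.
SAMPLE = [
    "[2010-10-06 17:52:46] T [nfs-fops.c:602:nfs_fop_create] nfs: Create: /11",
    "[2010-10-06 17:52:46] T [dht-common.c:3050:dht_create] glustervol1: creating /11 on mirror-0",
    "[2010-10-06 17:52:46] D [dht-diskusage.c:71:dht_du_info_cbk] glustervol1: on subvolume 'mirror-1': avail_percent is: 99.00 and avail_space is: 1476552265728",
]

if __name__ == "__main__":
    entry = last_trace_entry(SAMPLE)
    if entry is not None:
        print(entry.func, "|", entry.xlator, "|", entry.msg)
```

Lines that do not fit the layout are skipped rather than treated as errors, since the real log may contain other formats.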
Comment 2 Shehjar Tikoo 2010-11-08 21:10:52 EST
Sac has been doing failover tests for the last week or so. We haven't run into such a hang in 3.1. I am closing this bug.
