Bug 997385 - NUFA: ENOENT is thrown when a file is created on `nufa' mount
Status: CLOSED WONTFIX
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterfs
Version: 2.1
Hardware: x86_64 Linux
Priority: medium, Severity: medium
Assigned To: Bug Updates Notification Mailing List
QA Contact: storage-qa-internal@redhat.com
Reported: 2013-08-15 07:13 EDT by Sachidananda Urs
Modified: 2015-11-27 05:48 EST (History)
Doc Type: Bug Fix
Last Closed: 2015-11-27 05:48:47 EST
Type: Bug
Attachments:
sosreports (7.61 MB, application/x-tar), 2013-08-21 03:31 EDT, Sachidananda Urs

Description Sachidananda Urs 2013-08-15 07:13:26 EDT
Description of problem:

If a file is deleted on a client mount and a file with the same name is then created on the `nufa' mount, ENOENT errors are seen. However, creating a file with a new name works fine.

Volume Name: nufa-1
Type: Distribute
Volume ID: d12e0b94-72c5-4e67-8056-742ceb1c3490
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: 10.70.37.72:/rhs/brick1/nufa-1
Brick2: 10.70.37.97:/rhs/brick1/nufa-1
Brick3: 10.70.37.124:/rhs/brick1/nufa-1
Brick4: 10.70.37.82:/rhs/brick1/nufa-1
Options Reconfigured:
cluster.nufa: on


Version-Release number of selected component (if applicable):

glusterfs 3.4.0.19rhs built on Aug 14 2013 00:11:42

How reproducible:
Always


Steps to Reproduce:
1. Create a 4-node distribute volume and set cluster.nufa to on.
2. Create some files from a client (the files are distributed across the 4 nodes).
3. Create a mount on one of the servers (this will be the NUFA mount).
4. From the client, remove any one file.
5. Try to create a file with the same name on the NUFA mount. The error below is seen:

[root@boggs nufa-1]# dd if=/dev/zero of=fil1 bs=10M count=20
dd: opening `fil1': No such file or directory
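
The reproduction steps above can be sketched as a small shell script. This is only a sketch, not the reporter's actual commands: the mount points /mnt/client and /mnt/nufa are assumptions, a single `touch` stands in for "create some files", and the brick list is copied from the volume info in the description. Because the steps run on different hosts, the script does not execute anything; it prints each command labelled with the host it is meant for.

```shell
#!/bin/sh
# Sketch only: prints the commands for each reproduction step, labelled with
# the host they are meant to run on. Nothing is executed.
# ASSUMPTIONS: the mount points /mnt/client and /mnt/nufa are hypothetical.
VOL="nufa-1"
SERVER="10.70.37.72"
BRICKS="10.70.37.72:/rhs/brick1/nufa-1 10.70.37.97:/rhs/brick1/nufa-1 10.70.37.124:/rhs/brick1/nufa-1 10.70.37.82:/rhs/brick1/nufa-1"
FILE="fil1"

# step HOST COMMAND...: print one labelled command line.
step() {
    host="$1"
    shift
    echo "[$host] $*"
}

repro_commands() {
    # 1. Create and start a 4-brick distribute volume, then enable NUFA.
    step server gluster volume create "$VOL" $BRICKS
    step server gluster volume start "$VOL"
    step server gluster volume set "$VOL" cluster.nufa on
    # 2. Mount the volume on a client and create a file.
    step client mount -t glusterfs "$SERVER:/$VOL" /mnt/client
    step client touch "/mnt/client/$FILE"
    # 3. Mount the volume on one of the servers (the NUFA mount).
    step server mount -t glusterfs "$SERVER:/$VOL" /mnt/nufa
    # 4. Remove the file from the client.
    step client rm "/mnt/client/$FILE"
    # 5. Recreate a file with the same name on the NUFA mount.
    step server dd if=/dev/zero of="/mnt/nufa/$FILE" bs=10M count=20
}

repro_commands
```

Printing rather than executing keeps the per-host split explicit; the lines can be copied to the matching machine one step at a time.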

Actual results:

ENOENT

Expected results:

Should be able to create a file with the same name.

Additional info:

Log snippet:
=========================================================

[2013-08-15 10:55:23.698084] E [dht-helper.c:429:dht_subvol_get_hashed] (-->/usr/lib64/libglusterfs.so.0(default_lookup+0x6d) [0x3cb661be7d] (-->/usr/lib64/glusterfs/3.4.0.19rhs/xlator/cluster/nufa.so(nufa_lookup+0x90) [0x7f640dd07f80])) 1-nufa-1-dht: invalid argument: loc->parent
[2013-08-15 10:55:23.699804] E [fuse-bridge.c:1162:fuse_getattr_resume] 0-glusterfs-fuse: 3390: GETATTR 140067732189852 (2677b206-57ad-46dc-a63e-66876bfb88e6) resolution failed
[2013-08-15 11:03:08.168420] W [client-rpc-fops.c:519:client3_3_stat_cbk] 1-nufa-1-client-0: remote operation failed: No such file or directory
[2013-08-15 11:03:08.169552] E [dht-helper.c:429:dht_subvol_get_hashed] (-->/usr/lib64/glusterfs/3.4.0.19rhs/xlator/cluster/distribute.so(dht_migration_complete_check_task+0x11e) [0x7f6414e6c7fe] (-->/usr/lib64/libglusterfs.so.0(syncop_lookup+0x19a) [0x3cb664b56a] (-->/usr/lib64/glusterfs/3.4.0.19rhs/xlator/cluster/nufa.so(nufa_lookup+0x90) [0x7f640dd07f80]))) 1-nufa-1-dht: invalid argument: loc->parent
[2013-08-15 11:03:08.171443] W [fuse-bridge.c:1133:fuse_attr_cbk] 0-glusterfs-fuse: 3396: STAT() /fil5 => -1 (No such file or directory)
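
To pull only the relevant failures out of a client log, a filter along the following lines may help. It is a minimal sketch: the function name is made up for illustration, and the default log path is an assumption (the client log location depends on the mount point), so pass the real path as the first argument.

```shell
#!/bin/sh
# Sketch: list error-level lines that report the loc->parent lookup failure.
# ASSUMPTION: the default log path below is hypothetical.
LOG="${1:-/var/log/glusterfs/mnt-nufa.log}"

# nufa_lookup_errors FILE: print " E " lines mentioning loc->parent.
nufa_lookup_errors() {
    grep -F 'invalid argument: loc->parent' "$1" | grep -F '] E ['
}

# Only read the log if it exists and is readable.
if [ -r "$LOG" ]; then
    nufa_lookup_errors "$LOG"
fi
```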
Comment 1 shishir gowda 2013-08-21 00:24:28 EDT
Can you please attach the sos-reports from the clients (and servers if possible)?
Comment 2 Sachidananda Urs 2013-08-21 03:31:45 EDT
Created attachment 788737: sosreports
Comment 3 Sachidananda Urs 2013-08-21 03:34:20 EDT
Attaching sosreports.

Volume looks like:

Volume Name: nufa
Type: Distribute
Volume ID: a956aa02-befa-4a7c-bc5a-67a495bff7c6
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: 10.70.37.72:/rhs/brick1/nufa-1
Brick2: 10.70.37.100:/rhs/brick1/nufa-2
Brick3: 10.70.37.124:/rhs/brick1/nufa-3
Brick4: 10.70.37.82:/rhs/brick1/nufa-4

The sosreports are named after their corresponding IP addresses.
Comment 4 shishir gowda 2013-08-21 06:23:21 EDT
A couple of observations:
1. It looks like creation fails when it is done from a client on which rm -rf was not issued.
2. The failure seems to be in fuse_attr. dht_attr does not find the file (as it should).
3. A remount of the client does fix the issue.
4. Though the file is deleted, it looks like the inode is still linked in the inode table:

Breakpoint 15, dht_stat (frame=0x7f88dccfeb98, this=0x1d0ca90, 
    loc=0x7f88cc02ef50, xdata=0x0) at dht-inode-read.c:259
259	{
(gdb) p *loc
$22 = {path = 0x7f88cc007170 "/file-10", name = 0x0, inode = 0x7f88d3b1f6e0, 
  parent = 0x0, gfid = "\254\213\222\027+>LT\211\240\217\355\331\025\307\v", 
  pargfid = '\000' <repeats 15 times>}

(gdb) p *loc->inode 
$24 = {table = 0x1db6f50, 
  gfid = "\254\213\222\027+>LT\211\240\217\355\331\025\307\v", lock = 1, 
  nlookup = 6, fd_count = 0, ref = 3, ia_type = IA_IFREG, fd_list = {
    next = 0x7f88d3b1f718, prev = 0x7f88d3b1f718}, dentry_list = {
    next = 0x7f88d387e320, prev = 0x7f88d387e320}, hash = {
    next = 0x7f88d38440c0, prev = 0x7f88d38440c0}, list = {
    next = 0x7f88d3b1f094, prev = 0x1db6fb0}, _ctx = 0x1dd4340}
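
gdb prints the 16-byte gfid as a C string with octal escapes, which is hard to compare with the UUID form used in the logs. The sketch below converts one into the other. The function name is hypothetical, and the escaped string is passed to printf as its format, so this only works for values that contain no `%` character (true for the gfid above).

```shell
#!/bin/sh
# Sketch: convert a gdb-style escaped gfid string into UUID notation.
# gfid_to_uuid is a hypothetical helper name, not part of glusterfs.
gfid_to_uuid() {
    # printf expands the \ddd octal escapes and \v in its format string,
    # od dumps the resulting raw bytes as hex, tr strips the separators.
    hex=$(printf "$1" | od -An -tx1 | tr -d ' \n')
    # Insert dashes in the usual 8-4-4-4-12 UUID layout.
    echo "$hex" | sed 's/^\(.\{8\}\)\(.\{4\}\)\(.\{4\}\)\(.\{4\}\)\(.\{12\}\)$/\1-\2-\3-\4-\5/'
}

# The gfid from the gdb session above (single quotes keep the backslashes).
gfid_to_uuid '\254\213\222\027+>LT\211\240\217\355\331\025\307\v'
# prints ac8b9217-2b3e-4c54-89a0-8fedd915c70b
```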
Comment 5 shishir gowda 2013-08-21 06:29:08 EDT
Could we check whether mounting the clients with --entry-timeout=0 and --attribute-timeout=0 fixes the issue?
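
For reference, a mount with both timeouts set to zero could look roughly like the following. This is a sketch under assumptions: the helper name, server address, volume name and mount point are placeholders chosen for illustration, and the function only prints the glusterfs client command line rather than running it.

```shell
#!/bin/sh
# Sketch: build a glusterfs client command line with the FUSE entry and
# attribute caches disabled. Server, volume and mount point are placeholders.
nufa_mount_cmd() {
    server="$1"
    volume="$2"
    mountpoint="$3"
    echo "glusterfs --volfile-server=$server --volfile-id=$volume" \
         "--entry-timeout=0 --attribute-timeout=0 $mountpoint"
}

nufa_mount_cmd 10.70.37.72 nufa-1 /mnt/nufa
```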
Comment 6 Sachidananda Urs 2013-08-21 08:56:35 EDT
Same results
Comment 7 Amar Tumballi 2013-08-22 02:18:07 EDT
Removing the 'blocker' flag as per discussion yesterday.


NUFA supportability scope in Big Bend

- Only supported when a client that mounts a NUFA-enabled volume is present within the trusted storage pool, i.e. co-resident with a Red Hat Storage Server.
- Only supported for the FUSE client.
- Only supported with one brick per server.
- When the local brick runs out of space or hits the cluster.min-free-disk limit, files are distributed to other bricks in the same volume, as long as those bricks have space, instead of returning ENOSPC.
