Bug 997385 - NUFA: ENOENT is thrown when a file is created on `nufa' mount
Summary: NUFA: ENOENT is thrown when a file is created on `nufa' mount
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterfs
Version: 2.1
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Bug Updates Notification Mailing List
QA Contact: storage-qa-internal@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-08-15 11:13 UTC by Sachidananda Urs
Modified: 2015-11-27 10:48 UTC
CC: 8 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-11-27 10:48:47 UTC
Embargoed:


Attachments
sosreports (7.61 MB, application/x-tar)
2013-08-21 07:31 UTC, Sachidananda Urs

Description Sachidananda Urs 2013-08-15 11:13:26 UTC
Description of problem:

If a file is deleted on a client mount and a file with the same name is then created on the `nufa' mount, ENOENT errors are seen. Creating a file with a new name, however, works fine.

Volume Name: nufa-1
Type: Distribute
Volume ID: d12e0b94-72c5-4e67-8056-742ceb1c3490
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: 10.70.37.72:/rhs/brick1/nufa-1
Brick2: 10.70.37.97:/rhs/brick1/nufa-1
Brick3: 10.70.37.124:/rhs/brick1/nufa-1
Brick4: 10.70.37.82:/rhs/brick1/nufa-1
Options Reconfigured:
cluster.nufa: on
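
For reference, a minimal sketch of how a volume like this is created and NUFA enabled (the brick list is taken from the volume info above; the create/start commands themselves are reconstructed, not from the report):

# create a 4-brick distribute volume, enable NUFA, and start it
gluster volume create nufa-1 10.70.37.72:/rhs/brick1/nufa-1 \
    10.70.37.97:/rhs/brick1/nufa-1 \
    10.70.37.124:/rhs/brick1/nufa-1 \
    10.70.37.82:/rhs/brick1/nufa-1
gluster volume set nufa-1 cluster.nufa on
gluster volume start nufa-1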


Version-Release number of selected component (if applicable):

glusterfs 3.4.0.19rhs built on Aug 14 2013 00:11:42

How reproducible:
Always


Steps to Reproduce:
1. Create a 4-node distribute volume and turn nufa on.
2. Create some files from a client (the files are distributed across the 4 nodes).
3. Create a mount on one of the servers (this will be the NUFA mount).
4. From the client, remove any one file.
5. Try to create a file with the same name on the NUFA mount; the error below is seen:

[root@boggs nufa-1]# dd if=/dev/zero of=fil1 bs=10M count=20
dd: opening `fil1': No such file or directory
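
Put together, the reproduction looks roughly like this (the mount points /mnt/client and /mnt/nufa are assumed for illustration; the server addresses are from the volume info above):

# on an external client:
mount -t glusterfs 10.70.37.72:/nufa-1 /mnt/client
for i in $(seq 1 10); do dd if=/dev/zero of=/mnt/client/fil$i bs=1M count=1; done

# on one of the servers (this becomes the NUFA mount):
mount -t glusterfs 10.70.37.72:/nufa-1 /mnt/nufa

# back on the external client, delete one of the files:
rm -f /mnt/client/fil5

# on the NUFA mount, recreating the same name fails with ENOENT:
dd if=/dev/zero of=/mnt/nufa/fil5 bs=10M count=20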

Actual results:

ENOENT

Expected results:

Should be able to create a file with the same name.

Additional info:

Log snippet:
=========================================================

[2013-08-15 10:55:23.698084] E [dht-helper.c:429:dht_subvol_get_hashed] (-->/usr/lib64/libglusterfs.so.0(default_lookup+0x6d) [0x3cb661be7d] (-->/usr/lib64/glusterfs/3.4.0.19rhs/xlator/cluster/nufa.so(nufa_lookup+0x90) [0x7f640dd07f80])) 1-nufa-1-dht: invalid argument: loc->parent
[2013-08-15 10:55:23.699804] E [fuse-bridge.c:1162:fuse_getattr_resume] 0-glusterfs-fuse: 3390: GETATTR 140067732189852 (2677b206-57ad-46dc-a63e-66876bfb88e6) resolution failed
[2013-08-15 11:03:08.168420] W [client-rpc-fops.c:519:client3_3_stat_cbk] 1-nufa-1-client-0: remote operation failed: No such file or directory
[2013-08-15 11:03:08.169552] E [dht-helper.c:429:dht_subvol_get_hashed] (-->/usr/lib64/glusterfs/3.4.0.19rhs/xlator/cluster/distribute.so(dht_migration_complete_check_task+0x11e) [0x7f6414e6c7fe] (-->/usr/lib64/libglusterfs.so.0(syncop_lookup+0x19a) [0x3cb664b56a] (-->/usr/lib64/glusterfs/3.4.0.19rhs/xlator/cluster/nufa.so(nufa_lookup+0x90) [0x7f640dd07f80]))) 1-nufa-1-dht: invalid argument: loc->parent
[2013-08-15 11:03:08.171443] W [fuse-bridge.c:1133:fuse_attr_cbk] 0-glusterfs-fuse: 3396: STAT() /fil5 => -1 (No such file or directory)

Comment 1 shishir gowda 2013-08-21 04:24:28 UTC
Can you please attach the sos-reports from the clients (and servers if possible)?

Comment 2 Sachidananda Urs 2013-08-21 07:31:45 UTC
Created attachment 788737 [details]
sosreports

Comment 3 Sachidananda Urs 2013-08-21 07:34:20 UTC
Attaching sosreports.

Volume looks like:

Volume Name: nufa
Type: Distribute
Volume ID: a956aa02-befa-4a7c-bc5a-67a495bff7c6
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: 10.70.37.72:/rhs/brick1/nufa-1
Brick2: 10.70.37.100:/rhs/brick1/nufa-2
Brick3: 10.70.37.124:/rhs/brick1/nufa-3
Brick4: 10.70.37.82:/rhs/brick1/nufa-4

The sosreports are named after their corresponding IP addresses.

Comment 4 shishir gowda 2013-08-21 10:23:21 UTC
A couple of observations:
1. Looks like the creation fails if it is done from a client on which the rm -rf was not issued.
2. The failure seems to be in fuse_attr; dht_attr does not find the file (as it should, since the file was deleted).
3. A remount of the client does fix the issue (see the commands after the trace below).
4. Though the file is deleted, it looks like the inode is still linked in the inode table:

Breakpoint 15, dht_stat (frame=0x7f88dccfeb98, this=0x1d0ca90, 
    loc=0x7f88cc02ef50, xdata=0x0) at dht-inode-read.c:259
259	{
(gdb) p *loc
$22 = {path = 0x7f88cc007170 "/file-10", name = 0x0, inode = 0x7f88d3b1f6e0, 
  parent = 0x0, gfid = "\254\213\222\027+>LT\211\240\217\355\331\025\307\v", 
  pargfid = '\000' <repeats 15 times>}

(gdb) p *loc->inode 
$24 = {table = 0x1db6f50, 
  gfid = "\254\213\222\027+>LT\211\240\217\355\331\025\307\v", lock = 1, 
  nlookup = 6, fd_count = 0, ref = 3, ia_type = IA_IFREG, fd_list = {
    next = 0x7f88d3b1f718, prev = 0x7f88d3b1f718}, dentry_list = {
    next = 0x7f88d387e320, prev = 0x7f88d387e320}, hash = {
    next = 0x7f88d38440c0, prev = 0x7f88d38440c0}, list = {
    next = 0x7f88d3b1f094, prev = 0x1db6fb0}, _ctx = 0x1dd4340}
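
For reference, the remount workaround from point 3 above amounts to something like this (mount point assumed for illustration; server address and volume name from comment 3):

umount /mnt/nufa
mount -t glusterfs 10.70.37.72:/nufa /mnt/nufa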

Comment 5 shishir gowda 2013-08-21 10:29:08 UTC
Could we try checking if mounting the clients with --entry-timeout=0 and --attribute-timeout=0 fixes the issue?
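
With the FUSE client, those options would be passed roughly as follows (the volfile server and mount point are assumed for illustration):

glusterfs --entry-timeout=0 --attribute-timeout=0 --volfile-server=10.70.37.72 --volfile-id=nufa /mnt/nufa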

Comment 6 Sachidananda Urs 2013-08-21 12:56:35 UTC
Same results with those mount options.

Comment 7 Amar Tumballi 2013-08-22 06:18:07 UTC
Removing the 'blocker' flag as per discussion yesterday.


NUFA supportability scope in Big Bend

- Only supported when the client mounting a NUFA-enabled volume is present within the trusted storage pool, i.e. co-resident with a Red Hat Storage server.
- Only supported for the FUSE client.
- Only supported with one brick per server.
- When the local brick runs out of space or hits the cluster min-free-disk limit, files are distributed to the other bricks in the same volume as long as there is space, instead of returning ENOSPC (see the example below).
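
The min-free-disk limit referred to above is the cluster.min-free-disk volume option; an illustrative setting (the value here is an example, not from the report):

gluster volume set nufa-1 cluster.min-free-disk 10%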

