Description of problem:
If a file is deleted on a client mount and a file with the same name is then created on the NUFA mount, ENOENT errors are seen. Creating a file with a new name works fine.

Volume Name: nufa-1
Type: Distribute
Volume ID: d12e0b94-72c5-4e67-8056-742ceb1c3490
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: 10.70.37.72:/rhs/brick1/nufa-1
Brick2: 10.70.37.97:/rhs/brick1/nufa-1
Brick3: 10.70.37.124:/rhs/brick1/nufa-1
Brick4: 10.70.37.82:/rhs/brick1/nufa-1
Options Reconfigured:
cluster.nufa: on

Version-Release number of selected component (if applicable):
glusterfs 3.4.0.19rhs built on Aug 14 2013 00:11:42

How reproducible:
Always

Steps to Reproduce (a condensed shell version is sketched at the end of this report):
1. Create a 4-node distribute volume and turn nufa on.
2. Create some files from a client (the files are distributed across the 4 nodes).
3. Create a mount on one of the servers (this will be the NUFA mount).
4. From the client, remove any one file.
5. Try to create a file with the same name on the NUFA mount; the error below is seen:

[root@boggs nufa-1]# dd if=/dev/zero of=fil1 bs=10M count=20
dd: opening `fil1': No such file or directory

Actual results:
ENOENT

Expected results:
Should be able to create a file with the same name.

Additional info:
Log snippet:
=========================================================
[2013-08-15 10:55:23.698084] E [dht-helper.c:429:dht_subvol_get_hashed] (-->/usr/lib64/libglusterfs.so.0(default_lookup+0x6d) [0x3cb661be7d] (-->/usr/lib64/glusterfs/3.4.0.19rhs/xlator/cluster/nufa.so(nufa_lookup+0x90) [0x7f640dd07f80])) 1-nufa-1-dht: invalid argument: loc->parent
[2013-08-15 10:55:23.699804] E [fuse-bridge.c:1162:fuse_getattr_resume] 0-glusterfs-fuse: 3390: GETATTR 140067732189852 (2677b206-57ad-46dc-a63e-66876bfb88e6) resolution failed
[2013-08-15 11:03:08.168420] W [client-rpc-fops.c:519:client3_3_stat_cbk] 1-nufa-1-client-0: remote operation failed: No such file or directory
[2013-08-15 11:03:08.169552] E [dht-helper.c:429:dht_subvol_get_hashed] (-->/usr/lib64/glusterfs/3.4.0.19rhs/xlator/cluster/distribute.so(dht_migration_complete_check_task+0x11e) [0x7f6414e6c7fe] (-->/usr/lib64/libglusterfs.so.0(syncop_lookup+0x19a) [0x3cb664b56a] (-->/usr/lib64/glusterfs/3.4.0.19rhs/xlator/cluster/nufa.so(nufa_lookup+0x90) [0x7f640dd07f80]))) 1-nufa-1-dht: invalid argument: loc->parent
[2013-08-15 11:03:08.171443] W [fuse-bridge.c:1133:fuse_attr_cbk] 0-glusterfs-fuse: 3396: STAT() /fil5 => -1 (No such file or directory)
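
Condensed shell version of the steps above (a sketch only; the mount points /mnt/nufa-client and /mnt/nufa-local are placeholders, brick paths are the ones from the volume info above):

# On one of the servers: create the 4-brick distribute volume and enable NUFA
gluster volume create nufa-1 10.70.37.72:/rhs/brick1/nufa-1 10.70.37.97:/rhs/brick1/nufa-1 \
    10.70.37.124:/rhs/brick1/nufa-1 10.70.37.82:/rhs/brick1/nufa-1
gluster volume set nufa-1 cluster.nufa on
gluster volume start nufa-1

# On an external client: mount and create some files
mount -t glusterfs 10.70.37.72:/nufa-1 /mnt/nufa-client
for i in $(seq 1 10); do dd if=/dev/zero of=/mnt/nufa-client/fil$i bs=1M count=10; done

# On one of the servers: create the NUFA mount
mount -t glusterfs 10.70.37.72:/nufa-1 /mnt/nufa-local

# Back on the client: remove one file
rm -f /mnt/nufa-client/fil1

# On the NUFA mount: recreating the deleted name fails with ENOENT
dd if=/dev/zero of=/mnt/nufa-local/fil1 bs=10M count=20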
Can you please attach the sos-reports from the clients (and servers if possible)?
Created attachment 788737 [details] sosreports
Attaching sosreports. Volume looks like:

Volume Name: nufa
Type: Distribute
Volume ID: a956aa02-befa-4a7c-bc5a-67a495bff7c6
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: 10.70.37.72:/rhs/brick1/nufa-1
Brick2: 10.70.37.100:/rhs/brick1/nufa-2
Brick3: 10.70.37.124:/rhs/brick1/nufa-3
Brick4: 10.70.37.82:/rhs/brick1/nufa-4

sosreports are named after their corresponding ip-addresses.
A couple of observations:

1. The creation fails when it is done from a client on which the rm -rf was not issued.
2. The failure seems to be in fuse_attr; dht_attr does not find the file (as it should).
3. A remount of the client does fix the issue (sketch below, after the gdb output).
4. Although the file is deleted, the inode still appears to be linked in the inode table (note loc->parent is NULL, matching the "invalid argument: loc->parent" log messages):

Breakpoint 15, dht_stat (frame=0x7f88dccfeb98, this=0x1d0ca90, loc=0x7f88cc02ef50, xdata=0x0) at dht-inode-read.c:259
259     {
(gdb) p *loc
$22 = {path = 0x7f88cc007170 "/file-10", name = 0x0, inode = 0x7f88d3b1f6e0, parent = 0x0,
  gfid = "\254\213\222\027+>LT\211\240\217\355\331\025\307\v", pargfid = '\000' <repeats 15 times>}
(gdb) p *loc->inode
$24 = {table = 0x1db6f50, gfid = "\254\213\222\027+>LT\211\240\217\355\331\025\307\v", lock = 1,
  nlookup = 6, fd_count = 0, ref = 3, ia_type = IA_IFREG,
  fd_list = {next = 0x7f88d3b1f718, prev = 0x7f88d3b1f718},
  dentry_list = {next = 0x7f88d387e320, prev = 0x7f88d387e320},
  hash = {next = 0x7f88d38440c0, prev = 0x7f88d38440c0},
  list = {next = 0x7f88d3b1f094, prev = 0x1db6fb0}, _ctx = 0x1dd4340}
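
Workaround sketch for observation 3 (mount point and volume name are placeholders taken from this setup, not a fix):

# Remount the mount point on which the create was failing
umount /mnt/nufa-local
mount -t glusterfs 10.70.37.72:/nufa-1 /mnt/nufa-local

# The previously failing create now succeeds
dd if=/dev/zero of=/mnt/nufa-local/fil1 bs=10M count=20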
Could we try checking if mounting the clients with --entry-timeout=0 and --attribute-timeout=0 fixes the issue?
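For instance (server address, volume name and mount point are placeholders; either the mount helper options or the glusterfs client options should work):

mount -t glusterfs -o entry-timeout=0,attribute-timeout=0 10.70.37.72:/nufa-1 /mnt/nufa-client

# or, invoking the FUSE client directly:
glusterfs --entry-timeout=0 --attribute-timeout=0 \
    --volfile-server=10.70.37.72 --volfile-id=nufa-1 /mnt/nufa-client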
Same results
Removing the 'blocker' flag as per discussion yesterday.

NUFA supportability scope in Big Bend:

- Only supported when the client mounting a NUFA-enabled volume is present within the trusted storage pool, i.e. co-resident with a Red Hat Storage server.
- Only supported for the FUSE client.
- Only supported with one brick per server.
- When the local brick runs out of space or hits the cluster min-free-disk limit, files will be distributed to other bricks in the same volume as long as there is space, instead of returning ENOSPC (a sketch follows this list).
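
For reference, the threshold mentioned in the last point is the cluster.min-free-disk volume option; a sketch (the 10% value is only an example, not a statement of the product default):

# Minimum free space below which DHT/NUFA stops placing new files on a brick
# (accepts a percentage or an absolute size)
gluster volume set nufa-1 cluster.min-free-disk 10%

# Confirm the reconfigured option
gluster volume info nufa-1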