+++ This bug was initially created as a clone of Bug #804592 +++

Description of problem:
Renames fail with ENOENT while a graph change is in progress. This was a single-export volume.

Version-Release number of selected component (if applicable):
3.3.0qa29

How reproducible:
Consistently

Steps to Reproduce:
1. while true; do echo 'sdsdsd' > dot; mv dot dot2; rm -rf *; done
2. while true; do gluster volume set test2 performance.write-behind off; sleep 1; gluster volume set test2 performance.write-behind on; sleep 1; done

Actual results:
mv: cannot move `dot' to `dot2': No such file or directory

Expected results:
Renames should continue without errors.

Additional info:
Client log:

[2012-03-19 16:31:06.223804] E [fuse-bridge.c:1511:fuse_rename_resume] 0-glusterfs-fuse: RENAME 45223 00000000-0000-0000-0000-000000000000/dot -> 00000000-0000-0000-0000-000000000000/dot2 src resolution failed
[2012-03-19 16:31:06.229473] E [fuse-bridge.c:1329:fuse_unlink_resume] 0-glusterfs-fuse: UNLINK 1 (00000000-0000-0000-0000-000000000000/dot) resolution failed
[2012-03-19 16:31:06.235730] E [fuse-bridge.c:1511:fuse_rename_resume] 0-glusterfs-fuse: RENAME 45240 00000000-0000-0000-0000-000000000000/dot -> 00000000-0000-0000-0000-000000000000/dot2 src resolution failed
[2012-03-19 16:31:06.243696] E [fuse-bridge.c:1329:fuse_unlink_resume] 0-glusterfs-fuse: UNLINK 1 (00000000-0000-0000-0000-000000000000/dot) resolution failed
[2012-03-19 16:31:06.249440] E [fuse-bridge.c:1511:fuse_rename_resume] 0-glusterfs-fuse: RENAME 45257 00000000-0000-0000-0000-000000000000/dot -> 00000000-0000-0000-0000-000000000000/dot2 src resolution failed
[2012-03-19 16:31:06.252887] E [fuse-bridge.c:1329:fuse_unlink_resume] 0-glusterfs-fuse: UNLINK 1 (00000000-0000-0000-0000-000000000000/dot) resolution failed
[2012-03-19 16:31:06.259771] E [fuse-bridge.c:1511:fuse_rename_resume] 0-glusterfs-fuse: RENAME 45274 00000000-0000-0000-0000-000000000000/dot -> 00000000-0000-0000-0000-000000000000/dot2 src resolution failed
[2012-03-19 16:31:06.262661] E [fuse-bridge.c:1329:fuse_unlink_resume] 0-glusterfs-fuse: UNLINK 1 (00000000-0000-0000-0000-000000000000/dot) resolution failed
[2012-03-19 16:31:06.271101] E [fuse-bridge.c:1511:fuse_rename_resume] 0-glusterfs-fuse: RENAME 45291 00000000-0000-0000-0000-000000000000/dot -> 00000000-0000-0000-0000-000000000000/dot2 src resolution failed
[2012-03-19 16:31:06.273614] E [fuse-bridge.c:1329:fuse_unlink_resume] 0-glusterfs-fuse: UNLINK 1 (00000000-0000-0000-0000-000000000000/dot) resolution failed
[2012-03-19 16:31:06.277884] E [fuse-bridge.c:1511:fuse_rename_resume] 0-glusterfs-fuse: RENAME 45308 00000000-0000-0000-0000-000000000000/dot -> 00000000-0000-0000-0000-000000000000/dot2 src resolution failed
[2012-03-19 16:31:06.280381] E [fuse-bridge.c:1329:fuse_unlink_resume] 0-glusterfs-fuse: UNLINK 1 (00000000-0000-0000-0000-000000000000/dot) resolution failed

--- Additional comment from amarts on 2012-03-25 03:14:16 EDT ---

Check if it's already fixed.

--- Additional comment from ashetty on 2012-03-26 01:00:36 EDT ---

This issue still exists on the mainline.

--- Additional comment from rgowdapp on 2012-04-02 23:35:05 EDT ---

Patch has been sent for review at http://review.gluster.com/#change,3007

--- Additional comment from ashetty on 2012-04-23 03:07:12 EDT ---

With http://review.gluster.com/#change,3181 and http://review.gluster.com/#change,3181, this issue is fixed.

--- Additional comment from ashetty on 2012-04-23 03:08:09 EDT ---

With http://review.gluster.com/#change,3007 and http://review.gluster.com/#change,3181 I meant.

--- Additional comment from aavati on 2012-05-08 18:34:57 EDT ---

CHANGE: http://review.gluster.com/3007 (fuse-resolve: consider cases where an entry should be resolved even when parent belongs to active itable.) merged in master by Anand Avati (avati)

--- Additional comment from aavati on 2012-05-15 20:09:15 EDT ---

CHANGE: http://review.gluster.com/3181 (fuse-resolve: Attempt fd-migration in resolver, if migration was never attempted.) merged in master by Anand Avati (avati)

--- Additional comment from vbellur on 2012-05-18 09:12:23 EDT ---

Addressing this post 3.3.0.
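The rename loop from step 1 of the reproduction can be made bounded and pointed at an arbitrary directory. This is a sketch, not the exact command from the report: the loop count is arbitrary, `rm -rf *` is narrowed to the two files for safety, and the idea is to run it against the FUSE mount of the affected volume (the report does not give a mount path, so `DIR` is left as a variable) while the `volume set` toggle from step 2 churns the graph:

```shell
#!/bin/sh
# Bounded variant of reproduction step 1: rename churn in DIR.
# On an affected build, pointing DIR at the glusterfs mount while
# the write-behind on/off loop runs makes `mv` fail with ENOENT.
DIR="${DIR:-$(mktemp -d)}"
cd "$DIR" || exit 1
i=0
while [ "$i" -lt 100 ]; do
    echo 'sdsdsd' > dot || exit 1
    mv dot dot2 || exit 1      # this is the call that failed with ENOENT
    rm -f dot dot2
    i=$((i + 1))
done
echo "completed $i rename cycles in $DIR"
```

On a plain local filesystem (the default `mktemp -d` scratch directory) the loop completes cleanly; the failure only reproduces under a concurrent graph switch.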
With proper fd migration in place, this should now be fixed upstream; we are not seeing the issue.
We still saw renames failing on glusterfs-3.3.0.5rhs-40.el6rhs.x86_64. The client sosreport is here: http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/848326/sosreport-rhsvm02.848326-20130121144432-b8bb.tar.xz
We had set the 'fixed in version' field to 3.4.0qa5, and the bug is also targeted for rhs-2.1.0, so please confirm against the correct binary.
We saw an OOM kill for the same test case with glusterfs-3.4.0.17rhs-1.el6rhs.x86_64:

Out of memory: Kill process 2948 (glusterfs) score 933 or sacrifice child
Killed process 2948, UID 0, (glusterfs) total-vm:7634196kB, anon-rss:7225540kB, file-rss:1360kB
vdsm invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
vdsm cpuset=/ mems_allowed=0
Pid: 1919, comm: vdsm Not tainted 2.6.32-358.14.1.el6.x86_64 #1
Call Trace:
 [<ffffffff810cb561>] ? cpuset_print_task_mems_allowed+0x91/0xb0
 [<ffffffff8111cd80>] ? dump_header+0x90/0x1b0
 [<ffffffff8111d202>] ? oom_kill_process+0x82/0x2a0
 [<ffffffff8111d141>] ? select_bad_process+0xe1/0x120
 [<ffffffff8111d640>] ? out_of_memory+0x220/0x3c0
 [<ffffffff8112c2ec>] ? __alloc_pages_nodemask+0x8ac/0x8d0
 [<ffffffff811609ea>] ? alloc_pages_current+0xaa/0x110
 [<ffffffff8111a167>] ? __page_cache_alloc+0x87/0x90
 [<ffffffff81119b4e>] ? find_get_page+0x1e/0xa0
 [<ffffffff8111b127>] ? filemap_fault+0x1a7/0x500
 [<ffffffff81007ca2>] ? check_events+0x12/0x20
 [<ffffffff810074fd>] ? xen_force_evtchn_callback+0xd/0x10
 [<ffffffff81143124>] ? __do_fault+0x54/0x530
 [<ffffffff81007c8f>] ? xen_restore_fl_direct_end+0x0/0x1
 [<ffffffff815106bc>] ? _spin_unlock_irqrestore+0x1c/0x20
 [<ffffffff811436f7>] ? handle_pte_fault+0xf7/0xb50
 [<ffffffff810074fd>] ? xen_force_evtchn_callback+0xd/0x10
 [<ffffffff81007ca2>] ? check_events+0x12/0x20
 [<ffffffff81007c8f>] ? xen_restore_fl_direct_end+0x0/0x1
 [<ffffffff81004a49>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
 [<ffffffff8114438a>] ? handle_mm_fault+0x23a/0x310
 [<ffffffff810474e9>] ? __do_page_fault+0x139/0x480
 [<ffffffff81007c8f>] ? xen_restore_fl_direct_end+0x0/0x1
 [<ffffffff815106bc>] ? _spin_unlock_irqrestore+0x1c/0x20
 [<ffffffff811c78d6>] ? ep_poll+0x306/0x330
 [<ffffffff81063330>] ? default_wake_function+0x0/0x20
 [<ffffffff815137be>] ? do_page_fault+0x3e/0xa0
 [<ffffffff81510b75>] ? page_fault+0x25/0x30
Mem-Info:
Node 0 DMA per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
CPU 1: hi: 0, btch: 1 usd: 0
Node 0 DMA32 per-cpu:
CPU 0: hi: 186, btch: 31 usd: 32
CPU 1: hi: 186, btch: 31 usd: 51
Node 0 Normal per-cpu:
CPU 0: hi: 186, btch: 31 usd: 31
CPU 1: hi: 186, btch: 31 usd: 164
active_anon:1831753 inactive_anon:1 isolated_anon:0
 active_file:12 inactive_file:699 isolated_file:0
 unevictable:10968 dirty:1 writeback:2 unstable:0
 free:6454 slab_reclaimable:2094 slab_unreclaimable:5661
 mapped:1267 shmem:35 pagetables:5030 bounce:0
Node 0 DMA free:576kB min:0kB low:0kB high:0kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:572kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
lowmem_reserve[]: 0 4024 7559 7559
Node 0 DMA32 free:20068kB min:5920kB low:7400kB high:8880kB active_anon:3815240kB inactive_anon:0kB active_file:32kB inactive_file:44kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:4120800kB mlocked:0kB dirty:4kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:56kB slab_unreclaimable:112kB kernel_stack:8kB pagetables:7492kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:118 all_unreclaimable? yes
lowmem_reserve[]: 0 0 3535 3535
Node 0 Normal free:1081988kB min:5200kB low:6500kB high:7800kB active_anon:2432940kB inactive_anon:4kB active_file:4kB inactive_file:4300kB unevictable:43872kB isolated(anon):0kB isolated(file):0kB present:3619840kB mlocked:43872kB dirty:44kB writeback:0kB mapped:5500kB shmem:140kB slab_reclaimable:8300kB slab_unreclaimable:22532kB kernel_stack:1568kB pagetables:12584kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
Node 0 DMA: 2*4kB 1*8kB 1*16kB 1*32kB 0*64kB 2*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 576kB
Node 0 DMA32: 277*4kB 128*8kB 77*16kB 63*32kB 57*64kB 47*128kB 34*256kB 32*512kB 10*1024kB 3*2048kB 2*4096kB = 64708kB
Node 0 Normal: 3023*4kB 2015*8kB 1523*16kB 1177*32kB 938*64kB 785*128kB 739*256kB 608*512kB 229*1024kB 45*2048kB 1*4096kB = 1081988kB
2220 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap = 0kB
Total swap = 0kB
1966079 pages RAM
87605 pages reserved
11357 pages shared
1560180 pages non-shared
[ pid ] uid tgid total_vm rss cpu oom_adj oom_score_adj name
[  343]   0   343    2726   200 1 -17 -1000 udevd
[  828]   0   828    2279   153 1   0     0 dhclient
[  866]   0   866   63855   341 1   0     0 rsyslogd
[  888]  32   888    4743   193 0   0     0 rpcbind
[  908]  29   908    5836   292 0   0     0 rpc.statd
[  940]   0   940    6290   133 1   0     0 rpc.idmapd
[  963]   0   963   64671  2750 0   0     0 glusterd
[ 1091]  81  1091    5350   154 0   0     0 dbus-daemon
[ 1121]  68  1121    6231   331 1   0     0 hald
[ 1122]   0  1122    4526   185 1   0     0 hald-runner
[ 1187]   0  1187   16029   260 0 -17 -1000 sshd
[ 1197]  38  1197    7540   315 0   0     0 ntpd
[ 1217]   0  1217   21652   510 0   0     0 sendmail
[ 1227]  51  1227   19540   445 1   0     0 sendmail
[ 1242]   0  1242   27042   135 0   0     0 ksmtuned
[ 1374]   0  1374   41995  1046 0 -17 -1000 multipathd
[ 1413]   0  1413    3387   829 1   0     0 wdmd
[ 1435] 179  1435   65809  4379 1   0     0 sanlock
[ 1436]   0  1436    5769    71 0   0     0 sanlock-helper
[ 1469]   0  1469    7236  5699 1   0   -17 iscsiuio
[ 1474]   0  1474    1217   127 0   0     0 iscsid
[ 1475]   0  1475    1342   830 1   0   -17 iscsid
[ 1488]   0  1488  232150  1545 0   0     0 libvirtd
[ 1764]   0  1764    2725   188 0 -17 -1000 udevd
[ 1765]   0  1765    2725   175 0 -17 -1000 udevd
[ 1806]  36  1806    2300   116 0   0     0 respawn
[ 1809]  36  1809  363823  5323 1   0     0 vdsm
[ 1816]   0  1816   29302   292 1   0     0 crond
[ 1833]   0  1833   25971   127 0   0     0 rhsmcertd
[ 1866]   0  1866    1014   141 1   0     0 mingetty
[ 1868]   0  1868    1014   141 1   0     0 mingetty
[ 1871]   0  1871    1014   142 1   0     0 mingetty
[ 1873]   0  1873    1014   142 1   0     0 mingetty
[ 1875]   0  1875    1014   142 1   0     0 mingetty
[ 1877]   0  1877    1014   141 1   0     0 mingetty
[ 1889]   0  1889   19105   373 0   0     0 sudo
[ 1890]   0  1890  151115  4201 0   0     0 python
[ 2089]   0  2089   30134   689 0   0     0 screen
[ 2090]   0  2090   27075   300 0   0     0 bash
[ 2098]   0  2098   27075   299 1   0     0 bash
[ 2117]   0  2117   14950   448 0   0     0 ssh
[ 2118]   0  2118   27075   298 0   0     0 bash
[ 2131]   0  2131   14950   416 0   0     0 ssh
[ 2132]   0  2132   27075   298 1   0     0 bash
[ 2139]   0  2139   14950   416 0   0     0 ssh
[ 2140]   0  2140   27075   299 1   0     0 bash
[ 2153]   0  2153   14950   415 0   0     0 ssh
[ 2749]   0  2749   78077  7601 0   0     0 glusterfs
[ 2824]   0  2824   23947   470 0   0     0 sshd
[ 2828]   0  2828   27075   289 0   0     0 bash
[ 2846]   0  2846   29677   231 1   0     0 screen
[ 2959]   0  2959   27315   553 0   0     0 bash
[17593]   0 17593   25225   134 0   0     0 sleep
[20223]   0 20223   28404   187 0   0     0 mv
Out of memory: Kill process 1435 (sanlock) score 2 or sacrifice child
Killed process 1436, UID 0, (sanlock-helper) total-vm:23076kB, anon-rss:168kB, file-rss:116kB
Dev ack to 3.0 RHS BZs
Is the current issue the OOM kill, or fds not being migrated properly? If it is the former, it is a known issue as of now, since memory consumed by old graphs (and the inodes, fds, etc. in those old graphs) is not freed. However, the application should not see any errors while doing I/O on fds that were opened prior to the graph switch. If it is just the OOM kill, the bug can be closed by documenting it as a known issue.
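The triage question above hinges on whether the client's resident memory keeps growing across graph switches. One way to check from outside the process is to sample the glusterfs client's RSS between `volume set` toggles. This is a minimal sketch, not part of the reported test case; it reads the Linux-specific VmRSS field from /proc, and the sample count and interval are arbitrary choices:

```python
import time

def rss_kb(pid):
    """Return the resident set size (VmRSS) of `pid` in kB, read from /proc."""
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])  # kernel reports the value in kB
    return 0

def sample_rss(pid, samples=5, interval=1.0):
    """Sample RSS `samples` times, `interval` seconds apart.

    A steadily rising series while the write-behind on/off loop runs is
    consistent with old graphs (and their inode/fd tables) never being freed.
    """
    readings = []
    for _ in range(samples):
        readings.append(rss_kb(pid))
        time.sleep(interval)
    return readings
```

Pointing this at the glusterfs client PID during the reproduction loops would distinguish a genuine leak (monotonic growth per toggle) from a plateau.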
The product version of Red Hat Storage on which this issue was reported has reached End Of Life (EOL) [1], hence this bug report is being closed. If the issue is still observed on a current version of Red Hat Storage, please file a new bug report on the current version. [1] https://rhn.redhat.com/errata/RHSA-2014-0821.html
The needinfo request(s) on this closed bug have been removed, as they had been unresolved for 1000 days.