Bug 991409 - dd blocked on nfs mount when unmounted the bricks while dd was in progress
Summary: dd blocked on nfs mount when unmounted the bricks while dd was in progress
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterfs
Version: 2.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Bug Updates Notification Mailing List
QA Contact: storage-qa-internal@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-08-02 11:36 UTC by Rahul Hinduja
Modified: 2015-11-27 09:29 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-11-27 09:29:52 UTC
Embargoed:



Description Rahul Hinduja 2013-08-02 11:36:34 UTC
Description of problem:
======================

dd blocks on the NFS mount when the bricks are unmounted while dd is in progress. The same operation succeeds from the FUSE mount.

nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
INFO: task dd:2181 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
dd            D 0000000000000000     0  2181   1836 0x00000080
 ffff880119b4dc78 0000000000000082 0000000000000000 0007761c43cb47ca
 ffff880119b4dbe8 ffff88011817bd70 000000000012709c ffffffffae04e010
 ffff880118234638 ffff880119b4dfd8 000000000000fb88 ffff880118234638
Call Trace:
 [<ffffffff81119d40>] ? sync_page+0x0/0x50
 [<ffffffff8150e513>] io_schedule+0x73/0xc0
 [<ffffffff81119d7d>] sync_page+0x3d/0x50
 [<ffffffff8150eecf>] __wait_on_bit+0x5f/0x90
 [<ffffffff81119fb3>] wait_on_page_bit+0x73/0x80
 [<ffffffff81096d00>] ? wake_bit_function+0x0/0x50
 [<ffffffff8112efb5>] ? pagevec_lookup_tag+0x25/0x40
 [<ffffffff8111a3db>] wait_on_page_writeback_range+0xfb/0x190
 [<ffffffff8111a5a8>] filemap_write_and_wait_range+0x78/0x90
 [<ffffffff811b1b2e>] vfs_fsync_range+0x7e/0xe0
 [<ffffffff811b1bfd>] vfs_fsync+0x1d/0x20
 [<ffffffffa0250670>] nfs_file_flush+0x70/0xa0 [nfs]
 [<ffffffff8117de3c>] filp_close+0x3c/0x90
 [<ffffffff8117df35>] sys_close+0xa5/0x100
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
nfs: server rhs-client11 not responding, still trying
INFO: task dd:2181 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
dd            D 0000000000000000     0  2181   1836 0x00000080
 ffff880119b4dc78 0000000000000082 0000000000000000 0007761c43cb47ca
 ffff880119b4dbe8 ffff88011817bd70 000000000012709c ffffffffae04e010
 ffff880118234638 ffff880119b4dfd8 000000000000fb88 ffff880118234638
Call Trace:
 [<ffffffff81119d40>] ? sync_page+0x0/0x50
 [<ffffffff8150e513>] io_schedule+0x73/0xc0
 [<ffffffff81119d7d>] sync_page+0x3d/0x50
 [<ffffffff8150eecf>] __wait_on_bit+0x5f/0x90
 [<ffffffff81119fb3>] wait_on_page_bit+0x73/0x80
 [<ffffffff81096d00>] ? wake_bit_function+0x0/0x50
 [<ffffffff8112efb5>] ? pagevec_lookup_tag+0x25/0x40
 [<ffffffff8111a3db>] wait_on_page_writeback_range+0xfb/0x190
 [<ffffffff8111a5a8>] filemap_write_and_wait_range+0x78/0x90
 [<ffffffff811b1b2e>] vfs_fsync_range+0x7e/0xe0
 [<ffffffff811b1bfd>] vfs_fsync+0x1d/0x20
 [<ffffffffa0250670>] nfs_file_flush+0x70/0xa0 [nfs]
 [<ffffffff8117de3c>] filp_close+0x3c/0x90
 [<ffffffff8117df35>] sys_close+0xa5/0x100
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
[root@tia n]# 
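
For anyone triaging a similar hang, the call trace above can be reduced to its reliable frames. Read bottom-up, they show dd sitting in close() while the NFS client flushes dirty pages (sys_close, filp_close, nfs_file_flush, vfs_fsync) and waits for writeback that never completes. The following is a minimal Python sketch and not part of the original report; the function name parse_frames and the regular expression are assumptions made for illustration, tuned to the trace format shown here.

```python
import re

# Matches lines such as:
#  [<ffffffffa0250670>] nfs_file_flush+0x70/0xa0 [nfs]
#  [<ffffffff81119d40>] ? sync_page+0x0/0x50
# A leading "?" marks a frame the kernel's stack walker could not confirm.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([A-Za-z0-9_.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(\w+)\])?"
)

# A few frames copied from the trace in this report.
SAMPLE_TRACE = """\
 [<ffffffff81119d40>] ? sync_page+0x0/0x50
 [<ffffffff8150e513>] io_schedule+0x73/0xc0
 [<ffffffffa0250670>] nfs_file_flush+0x70/0xa0 [nfs]
 [<ffffffff8117df35>] sys_close+0xa5/0x100
"""


def parse_frames(trace_text):
    """Return one dict per stack frame found in trace_text (hypothetical helper)."""
    frames = []
    for line in trace_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append({
                "symbol": match.group(2),
                "reliable": match.group(1) is None,  # no "?" prefix
                "module": match.group(3),            # None for built-in code
            })
    return frames


if __name__ == "__main__":
    for frame in parse_frames(SAMPLE_TRACE):
        if frame["reliable"]:
            print(frame["symbol"], frame["module"] or "(kernel)")
```

Filtering on the "reliable" flag drops the entries prefixed with "?", which are leftover addresses found on the stack rather than confirmed callers.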


Version-Release number of selected component (if applicable):
=============================================================
glusterfs-rdma-3.4.0.14rhs-1.el6_4.x86_64
glusterfs-devel-3.4.0.14rhs-1.el6_4.x86_64
glusterfs-debuginfo-3.4.0.14rhs-1.el6_4.x86_64
glusterfs-3.4.0.14rhs-1.el6_4.x86_64
glusterfs-fuse-3.4.0.14rhs-1.el6_4.x86_64

Steps to Reproduce:
===================
1. Create and start a 6*2 volume across 4 servers (rhs-client11 through rhs-client14).
2. Mount the volume on the client over both FUSE and NFS.
3. Create directories f and n from the FUSE mount.
4. cd to f on the FUSE mount and to n on the NFS mount.
5. Start dd in both directories (f and n) using:

 dd if=/dev/zero of=test_file bs=1M count=10240

6. While dd is in progress, unmount the brick directories. I unmounted /rhs/brick1 (umount -l) on rhs-client11 and rhs-client13, which hosted bricks b1, b3, b5 (rhs-client11) and b7, b9, b11 (rhs-client13).

7. dd succeeded on the FUSE mount but blocked on the NFS mount.

Actual results:
===============

dd blocked on the NFS mount.


Additional info:
================

Status of volume: vol-dr
Gluster process						Port	Online	Pid
------------------------------------------------------------------------------
Brick rhs-client11:/rhs/brick1/r1			N/A	N	N/A
Brick rhs-client12:/rhs/brick1/r2			49152	Y	4052
Brick rhs-client11:/rhs/brick1/r3			N/A	N	N/A
Brick rhs-client12:/rhs/brick1/r4			49153	Y	4056
Brick rhs-client11:/rhs/brick1/r5			N/A	N	N/A
Brick rhs-client12:/rhs/brick1/r6			49154	Y	4060
Brick rhs-client13:/rhs/brick1/r7			N/A	N	N/A
Brick rhs-client14:/rhs/brick1/r8			49155	Y	5448
Brick rhs-client13:/rhs/brick1/r9			N/A	N	N/A
Brick rhs-client14:/rhs/brick1/r10			49156	Y	5454
Brick rhs-client13:/rhs/brick1/r11			N/A	N	N/A
Brick rhs-client14:/rhs/brick1/r12			49157	Y	5459
NFS Server on localhost					2049	Y	7788
Self-heal Daemon on localhost				N/A	Y	7795
NFS Server on rhs-client14				2049	Y	1498
Self-heal Daemon on rhs-client14			N/A	Y	1505
NFS Server on rhs-client13				2049	Y	1018
Self-heal Daemon on rhs-client13			N/A	Y	1025
NFS Server on rhs-client12				2049	Y	11083
Self-heal Daemon on rhs-client12			N/A	Y	11092
 
There are no active volume tasks
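
The status output above can be checked mechanically for bricks that are reported offline. The following is a minimal Python sketch, not part of the original report; it assumes the whitespace-separated five-field layout of the brick lines shown here (label, brick, port, online flag, pid), and the function name offline_bricks is hypothetical.

```python
# A few lines copied from the status output in this report.
SAMPLE_STATUS = """\
Gluster process                          Port   Online  Pid
Brick rhs-client11:/rhs/brick1/r1        N/A    N       N/A
Brick rhs-client12:/rhs/brick1/r2        49152  Y       4052
Brick rhs-client13:/rhs/brick1/r7        N/A    N       N/A
NFS Server on localhost                  2049   Y       7788
"""


def offline_bricks(status_text):
    """Return the bricks whose Online column is "N" (hypothetical helper)."""
    offline = []
    for line in status_text.splitlines():
        fields = line.split()
        # Brick lines look like: Brick <host:path> <port> <online> <pid>
        if len(fields) == 5 and fields[0] == "Brick" and fields[3] == "N":
            offline.append(fields[1])
    return offline


if __name__ == "__main__":
    for brick in offline_bricks(SAMPLE_STATUS):
        print(brick)
```

Applied to the full output in this report, the same filter would select the six bricks on rhs-client11 and rhs-client13, which matches the hosts where /rhs/brick1 was unmounted.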

Comment 3 Niels de Vos 2015-11-27 09:29:52 UTC
glusterfs-3.5 and newer have a health-checker for the bricks (http://review.gluster.org/5176). If a brick is unmounted, the brick process is killed. This should provide a more stable behaviour.

This problem is most likely fixed in current releases.
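
To illustrate the idea behind such a health check, the following is a minimal Python sketch of a periodic probe that reports a brick directory as failed once it can no longer be stat()ed. This is not the glusterfs implementation; the names brick_path_ok and watch_brick, the use of os.stat, and the callback are assumptions made for illustration only.

```python
import os
import stat
import time


def brick_path_ok(brick_path):
    """Return True if brick_path can still be stat()ed and is a directory."""
    try:
        return stat.S_ISDIR(os.stat(brick_path).st_mode)
    except OSError:
        return False


def watch_brick(brick_path, interval_secs, max_checks, on_failure,
                sleep=time.sleep):
    """Probe brick_path up to max_checks times.

    Calls on_failure(brick_path) and returns False on the first failed
    probe; returns True if every probe succeeded. A real daemon would
    loop indefinitely and terminate the brick process on failure.
    """
    for _ in range(max_checks):
        if not brick_path_ok(brick_path):
            on_failure(brick_path)
            return False
        sleep(interval_secs)
    return True


if __name__ == "__main__":
    # Hypothetical brick path taken from the volume layout in this report.
    healthy = watch_brick("/rhs/brick1/r1", interval_secs=1, max_checks=3,
                          on_failure=lambda p: print("brick unhealthy:", p))
    print("healthy" if healthy else "failed")
```

Failing fast in this way turns a silently missing backend into an explicit brick-down event that the rest of the stack can react to, rather than leaving client I/O waiting.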

