Description of problem:
-----------------------
4-node master, 2-node slave. Geo-rep syncing was in progress (unsure if this has anything to do with it). Was running Bonnie, smallfiles, iozone, and kernel untar from 8 FUSE mounts. I see multiple hung-task call traces for glusterfsd in dmesg:

<snip>
[67440.859648] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[67440.867471] kswapd1 D ffff880c163fbec0 0 146 2 0x00000000
[67440.874547] ffff880c15c37908 0000000000000046 ffff880c163fbec0 ffff880c15c37fd8
[67440.882139] ffff880c15c37fd8 ffff880c15c37fd8 ffff880c163fbec0 0000000e000c5100
[67440.889665] ffff88017eecd128 0000000000000000 ffff880c04f01e18 ffff880c163fbec0
[67440.897148] Call Trace:
[67440.899641] [<ffffffff8168b579>] schedule+0x29/0x70
[67440.904652] [<ffffffffa0316d02>] _xfs_log_force_lsn+0x1c2/0x350 [xfs]
[67440.911314] [<ffffffff810c4ec0>] ? wake_up_state+0x20/0x20
[67440.917045] [<ffffffffa0316ebe>] xfs_log_force_lsn+0x2e/0x90 [xfs]
[67440.923403] [<ffffffffa0307d89>] ? xfs_iunpin_wait+0x19/0x20 [xfs]
[67440.929694] [<ffffffffa030432c>] __xfs_iunpin_wait+0x9c/0x150 [xfs]
[67440.936048] [<ffffffff810b16c0>] ? wake_bit_function+0x40/0x40
[67440.941989] [<ffffffffa0307d89>] xfs_iunpin_wait+0x19/0x20 [xfs]
[67440.948105] [<ffffffffa02fc3ec>] xfs_reclaim_inode+0x8c/0x370 [xfs]
[67440.954481] [<ffffffffa02fc937>] xfs_reclaim_inodes_ag+0x267/0x390 [xfs]
[67440.961318] [<ffffffffa02fd472>] ? xfs_inode_set_reclaim_tag+0xa2/0x170 [xfs]
[67440.968573] [<ffffffff81242af7>] ? fsnotify_clear_marks_by_inode+0xa7/0x140
[67440.975699] [<ffffffffa030d45c>] ? xfs_fs_destroy_inode+0xdc/0x120 [xfs]
[67440.982537] [<ffffffff8121a018>] ? destroy_inode+0x38/0x60
[67440.988146] [<ffffffff8121a146>] ? evict+0x106/0x170
[67440.993278] [<ffffffff8121a1ee>] ? dispose_list+0x3e/0x50
[67440.998818] [<ffffffffa02fd5c3>] xfs_reclaim_inodes_nr+0x33/0x40 [xfs]
[67441.005467] [<ffffffffa030cf95>] xfs_fs_free_cached_objects+0x15/0x20 [xfs]
[67441.012506] [<ffffffff81200808>] prune_super+0xe8/0x170
[67441.017812] [<ffffffff811940b3>] shrink_slab+0x163/0x330
[67441.023205] [<ffffffff811f5267>] ? vmpressure+0x87/0x90
[67441.028508] [<ffffffff81197ea1>] balance_pgdat+0x4b1/0x5e0
[67441.034071] [<ffffffff81198143>] kswapd+0x173/0x450
[67441.039031] [<ffffffff810b1600>] ? wake_up_atomic_t+0x30/0x30
[67441.044910] [<ffffffff81197fd0>] ? balance_pgdat+0x5e0/0x5e0
[67441.050715] [<ffffffff810b052f>] kthread+0xcf/0xe0
[67441.055642] [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
[67441.062274] [<ffffffff81696418>] ret_from_fork+0x58/0x90
[67441.067784] [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
[67441.074433] INFO: task glusterfsd:27476 blocked for more than 120 seconds.
[67441.081347] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[67441.089212] glusterfsd D ffff880c142ece70 0 27476 1 0x00000080
[67441.096431] ffff880a7169fe38 0000000000000086 ffff880c142ece70 ffff880a7169ffd8
[67441.104072] ffff880a7169ffd8 ffff880a7169ffd8 ffff880c142ece70 0000000e000c5100
[67441.111692] ffff88017eecd128 ffff880a7169fef4 ffff880c04f01e18 ffff880c142ece70
[67441.119380] Call Trace:
[67441.121878] [<ffffffff8168b579>] schedule+0x29/0x70
[67441.126952] [<ffffffffa0316d02>] _xfs_log_force_lsn+0x1c2/0x350 [xfs]
[67441.133549] [<ffffffff81181e35>] ? __filemap_fdatawrite_range+0x65/0x80
[67441.140318] [<ffffffff810c4ec0>] ? wake_up_state+0x20/0x20
[67441.146016] [<ffffffffa02f78ee>] xfs_file_fsync+0x10e/0x1e0 [xfs]
[67441.152258] [<ffffffff8122fdb5>] do_fsync+0x65/0xa0
[67441.157299] [<ffffffff81230080>] SyS_fsync+0x10/0x20
[67441.162390] [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b
[67681.147260] INFO: task glusterfsd:27468 blocked for more than 120 seconds.
[67681.154152] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[67681.162025] glusterfsd D ffff880c01ebbec0 0 27468 1 0x00000080
[67681.169249] ffff88070f32be38 0000000000000086 ffff880c01ebbec0 ffff88070f32bfd8
[67681.176933] ffff88070f32bfd8 ffff88070f32bfd8 ffff880c01ebbec0 ffff880c01ebbec0
[67681.184483] ffff88017eecd128 ffff88070f32bef4 0000000000000000 ffff880c01ebbec0
[67681.191992] Call Trace:
[67681.194474] [<ffffffff8168b579>] schedule+0x29/0x70
[67681.199538] [<ffffffffa0316e3a>] _xfs_log_force_lsn+0x2fa/0x350 [xfs]
[67681.206067] [<ffffffff810c4ec0>] ? wake_up_state+0x20/0x20
[67681.211741] [<ffffffffa02f78ee>] xfs_file_fsync+0x10e/0x1e0 [xfs]
[67681.217926] [<ffffffff8122fdb5>] do_fsync+0x65/0xa0
[67681.223041] [<ffffffff81230080>] SyS_fsync+0x10/0x20
[67681.228212] [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b
[68041.202808] INFO: task xfsaild/dm-42:6989 blocked for more than 120 seconds.
[68041.209848] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[68041.217749] xfsaild/dm-42 D ffff8805ecb1d140 0 6989 2 0x00000080
[68041.224811] ffff8805ff20bb30 0000000000000046 ffff880036501f60 ffff8805ff20bfd8
[68041.232238] ffff8805ff20bfd8 ffff8805ff20bfd8 ffff880036501f60 ffff8809fe34f558
[68041.239661] ffff8809fe34f560 7fffffffffffffff ffff880036501f60 ffff8805ecb1d140
[68041.247089] Call Trace:
[68041.249538] [<ffffffff8168b579>] schedule+0x29/0x70
[68041.254495] [<ffffffff81688fc9>] schedule_timeout+0x239/0x2d0
[68041.260320] [<ffffffff812f0374>] ? blk_finish_plug+0x14/0x40
[68041.266125] [<ffffffffa02f2014>] ? _xfs_buf_ioapply+0x334/0x460 [xfs]
[68041.272693] [<ffffffff8168b956>] wait_for_completion+0x116/0x170
[68041.278831] [<ffffffff810c4ec0>] ? wake_up_state+0x20/0x20
[68041.284478] [<ffffffffa02f3e53>] ? _xfs_buf_read+0x23/0x40 [xfs]
[68041.290631] [<ffffffffa02f3d46>] xfs_buf_submit_wait+0xe6/0x1d0 [xfs]
[68041.297222] [<ffffffffa0323709>] ? xfs_trans_read_buf_map+0x199/0x400 [xfs]
[68041.304303] [<ffffffffa02f3e53>] _xfs_buf_read+0x23/0x40 [xfs]
[68041.310259] [<ffffffffa02f3f68>] xfs_buf_read_map+0xf8/0x160 [xfs]
[68041.316594] [<ffffffffa0323709>] xfs_trans_read_buf_map+0x199/0x400 [xfs]
[68041.323528] [<ffffffffa02e758e>] xfs_imap_to_bp+0x6e/0xf0 [xfs]
[68041.329603] [<ffffffffa0308ae3>] xfs_iflush+0x113/0x260 [xfs]
[68041.335506] [<ffffffffa031aa9e>] xfs_inode_item_push+0xee/0x150 [xfs]
[68041.342105] [<ffffffffa0322360>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[68041.349387] [<ffffffffa0322671>] xfsaild+0x311/0x660 [xfs]
[68041.355020] [<ffffffffa0322360>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[68041.362282] [<ffffffff810b052f>] kthread+0xcf/0xe0
[68041.367202] [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
[68041.373776] [<ffffffff81696418>] ret_from_fork+0x58/0x90
[68041.379216] [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
[69001.301766] INFO: task kswapd1:146 blocked for more than 120 seconds.
[69001.308200] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[69001.316017] kswapd1 D ffff880c15c37be8 0 146 2 0x00000000
[69001.323080] ffff880c15c378d0 0000000000000046 ffff880c163fbec0 ffff880c15c37fd8
[69001.330505] ffff880c15c37fd8 ffff880c15c37fd8 ffff880c163fbec0 ffff880617c96c40
[69001.337931] 0000000000000000 7fffffffffffffff ffffffffa0307d89 ffff880c15c37be8
[69001.345357] Call Trace:
[69001.347842] [<ffffffffa0307d89>] ? xfs_iunpin_wait+0x19/0x20 [xfs]
[69001.354107] [<ffffffff8168b579>] schedule+0x29/0x70
[69001.359114] [<ffffffff81688fc9>] schedule_timeout+0x239/0x2d0
[69001.365135] [<ffffffffa0316c24>] ? _xfs_log_force_lsn+0xe4/0x350 [xfs]
[69001.371746] [<ffffffff811dc1f1>] ? __slab_free+0x81/0x2f0
[69001.377249] [<ffffffffa0307d89>] ? xfs_iunpin_wait+0x19/0x20 [xfs]
[69001.383557] [<ffffffff8168ab1e>] io_schedule_timeout+0xae/0x130
[69001.389598] [<ffffffff810b1266>] ? prepare_to_wait+0x56/0x90
[69001.395391] [<ffffffff8168abb8>] io_schedule+0x18/0x20
[69001.400687] [<ffffffffa0304369>] __xfs_iunpin_wait+0xd9/0x150 [xfs]
[69001.407085] [<ffffffff810b16c0>] ? wake_bit_function+0x40/0x40
[69001.413078] [<ffffffffa0307d89>] xfs_iunpin_wait+0x19/0x20 [xfs]
[69001.419247] [<ffffffffa02fc3ec>] xfs_reclaim_inode+0x8c/0x370 [xfs]
[69001.425667] [<ffffffffa02fc937>] xfs_reclaim_inodes_ag+0x267/0x390 [xfs]
[69001.432530] [<ffffffffa02fd5c3>] xfs_reclaim_inodes_nr+0x33/0x40 [xfs]
[69001.439212] [<ffffffffa030cf95>] xfs_fs_free_cached_objects+0x15/0x20 [xfs]
[69001.446300] [<ffffffff81200808>] prune_super+0xe8/0x170
[69001.451659] [<ffffffff811940b3>] shrink_slab+0x163/0x330
[69001.457117] [<ffffffff811867af>] ? zone_watermark_ok+0x1f/0x30
[69001.463075] [<ffffffff811a78d3>] ? compaction_suitable+0xa3/0xb0
[69001.469219] [<ffffffff81197ea1>] balance_pgdat+0x4b1/0x5e0
[69001.474833] [<ffffffff81198143>] kswapd+0x173/0x450
[69001.479840] [<ffffffff810b1600>] ? wake_up_atomic_t+0x30/0x30
[69001.485722] [<ffffffff81197fd0>] ? balance_pgdat+0x5e0/0x5e0
[69001.491497] [<ffffffff810b052f>] kthread+0xcf/0xe0
[69001.496407] [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
[69001.502996] [<ffffffff81696418>] ret_from_fork+0x58/0x90
[69001.508425] [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
[69001.515070] INFO: task xfsaild/dm-44:6998 blocked for more than 120 seconds.
[69001.522132] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[69001.529994] xfsaild/dm-44 D ffff880611ff0000 0 6998 2 0x00000080
[69001.537206] ffff8805f5247d58 0000000000000046 ffff880c13d5de20 ffff8805f5247fd8
[69001.544879] ffff8805f5247fd8 ffff8805f5247fd8 ffff880c13d5de20 ffff880c04f01600
[69001.552424] 0000000000000000 ffff880c13d5de20 ffff88017eecd128 ffff880611ff0000
[69001.560039] Call Trace:
[69001.562537] [<ffffffff8168b579>] schedule+0x29/0x70
[69001.567588] [<ffffffffa031697d>] _xfs_log_force+0x1bd/0x2b0 [xfs]
[69001.573810] [<ffffffff810c4ec0>] ? wake_up_state+0x20/0x20
[69001.579429] [<ffffffffa0316a96>] xfs_log_force+0x26/0x80 [xfs]
[69001.585420] [<ffffffffa0322360>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[69001.592689] [<ffffffffa03224ba>] xfsaild+0x15a/0x660 [xfs]
[69001.598389] [<ffffffffa0322360>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[69001.605645] [<ffffffff810b052f>] kthread+0xcf/0xe0
[69001.610593] [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
[69001.617168] [<ffffffff81696418>] ret_from_fork+0x58/0x90
[69001.622576] [<ffffffff810b0460>] ? kthread_create_on_node+0x140/0x140
[69001.629168] INFO: task glusterfsd:25443 blocked for more than 120 seconds.
[69001.636098] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[69001.643968] glusterfsd D ffff8805e4e68fb0 0 25443 1 0x00000080
[69001.651243] ffff8805eea9fe38 0000000000000086 ffff8805e4e68fb0 ffff8805eea9ffd8
[69001.658969] ffff8805eea9ffd8 ffff8805eea9ffd8 ffff8805e4e68fb0 ffff8805e4e68fb0
[69001.666534] ffff88017eecd128 ffff8805eea9fef4 0000000000000000 ffff8805e4e68fb0
[69001.674116] Call Trace:
[69001.676621] [<ffffffff8168b579>] schedule+0x29/0x70
[69001.681679] [<ffffffffa0316e3a>] _xfs_log_force_lsn+0x2fa/0x350 [xfs]
[69001.688265] [<ffffffff810c4ec0>] ? wake_up_state+0x20/0x20
[69001.693926] [<ffffffffa02f78ee>] xfs_file_fsync+0x10e/0x1e0 [xfs]
[69001.700110] [<ffffffff8122fdb5>] do_fsync+0x65/0xa0
[69001.705138] [<ffffffff812300a3>] SyS_fdatasync+0x13/0x20
[69001.710592] [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b
[69001.716662] INFO: task glusterfsd:26890 blocked for more than 120 seconds.
[69001.723581] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[69001.731440] glusterfsd D ffff8805db733e00 0 26890 1 0x00000080
[69001.738660] ffff88071323f7f0 0000000000000086 ffff880c126ece70 ffff88071323ffd8
[69001.746257] ffff88071323ffd8 ffff88071323ffd8 ffff880c126ece70 ffff880610db9d58
[69001.753858] ffff880610db9d60 7fffffffffffffff ffff880c126ece70 ffff8805db733e00
[69001.761410] Call Trace:
[69001.763890] [<ffffffff8168b579>] schedule+0x29/0x70
[69001.768880] [<ffffffff81688fc9>] schedule_timeout+0x239/0x2d0
[69001.774756] [<ffffffff812f0374>] ? blk_finish_plug+0x14/0x40
[69001.780584] [<ffffffffa02f2014>] ? _xfs_buf_ioapply+0x334/0x460 [xfs]
[69001.787158] [<ffffffff8168b956>] wait_for_completion+0x116/0x170
[69001.793313] [<ffffffff810c4ec0>] ? wake_up_state+0x20/0x20
[69001.798960] [<ffffffffa02f3e53>] ? _xfs_buf_read+0x23/0x40 [xfs]
[69001.805117] [<ffffffffa02f3d46>] xfs_buf_submit_wait+0xe6/0x1d0 [xfs]
[69001.811751] [<ffffffffa0323709>] ? xfs_trans_read_buf_map+0x199/0x400 [xfs]
[69001.818876] [<ffffffffa02f3e53>] _xfs_buf_read+0x23/0x40 [xfs]
[69001.824872] [<ffffffffa02f3f68>] xfs_buf_read_map+0xf8/0x160 [xfs]
[69001.831194] [<ffffffffa0323709>] xfs_trans_read_buf_map+0x199/0x400 [xfs]
[69001.838140] [<ffffffffa02e758e>] xfs_imap_to_bp+0x6e/0xf0 [xfs]
[69001.844215] [<ffffffffa02e7a8e>] xfs_iread+0x7e/0x430 [xfs]
[69001.849936] [<ffffffffa02fcd08>] xfs_iget+0x2a8/0x750 [xfs]
[69001.855640] [<ffffffffa0305ae7>] xfs_lookup+0xf7/0x140 [xfs]
[69001.861425] [<ffffffffa030278b>] xfs_vn_lookup+0x7b/0xd0 [xfs]
[69001.867403] [<ffffffff812083dd>] lookup_real+0x1d/0x50
[69001.872685] [<ffffffff81208d52>] __lookup_hash+0x42/0x60
[69001.878139] [<ffffffff8168318b>] lookup_slow+0x42/0xa7
[69001.883418] [<ffffffff8120b96f>] link_path_walk+0x80f/0x8b0
[69001.889153] [<ffffffffa02b9176>] ? xfs_attr_shortform_getvalue+0x116/0x130 [xfs]
[69001.896685] [<ffffffff8120bb6b>] path_lookupat+0x6b/0x7a0
[69001.902233] [<ffffffff811ddf75>] ? kmem_cache_alloc+0x35/0x1e0
[69001.908200] [<ffffffff8120ebbf>] ? getname_flags+0x4f/0x1a0
[69001.913926] [<ffffffff8120c2cb>] filename_lookup+0x2b/0xc0
[69001.919547] [<ffffffff8120fce7>] user_path_at_empty+0x67/0xc0
[69001.925441] [<ffffffff8120eb4d>] ? putname+0x3d/0x60
[69001.930519] [<ffffffff8120fcf2>] ? user_path_at_empty+0x72/0xc0
[69001.936578] [<ffffffff8120fd51>] user_path_at+0x11/0x20
[69001.941937] [<ffffffff812031c3>] vfs_fstatat+0x63/0xc0
[69001.947177] [<ffffffff81203791>] SYSC_newlstat+0x31/0x60
[69001.952623] [<ffffffff8121e8e4>] ? mntput+0x24/0x40
[69001.957645] [<ffffffff8111ede6>] ? __audit_syscall_exit+0x1e6/0x280
[69001.964049] [<ffffffff81203a1e>] SyS_newlstat+0xe/0x10
[69001.969310] [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b
[69001.975367] INFO: task glusterfsd:26894 blocked for more than 120 seconds.
[69001.982291] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[69001.990147] glusterfsd D ffff8805db733e00 0 26894 1 0x00000080
[69001.997389] ffff880a578277f0 0000000000000086 ffff880c126e9f60 ffff880a57827fd8
[69002.004973] ffff880a57827fd8 ffff880a57827fd8 ffff880c126e9f60 ffff8805e36d98d8
[69002.012635] ffff8805e36d98e0 7fffffffffffffff ffff880c126e9f60 ffff8805db733e00
[69002.020248] Call Trace:
[69002.022773] [<ffffffff8168b579>] schedule+0x29/0x70
[69002.027798] [<ffffffff81688fc9>] schedule_timeout+0x239/0x2d0
[69002.033680] [<ffffffff812f0374>] ? blk_finish_plug+0x14/0x40
[69002.039513] [<ffffffffa02f2014>] ? _xfs_buf_ioapply+0x334/0x460 [xfs]
[69002.046088] [<ffffffff8168b956>] wait_for_completion+0x116/0x170
[69002.052242] [<ffffffff810c4ec0>] ? wake_up_state+0x20/0x20
[69002.057892] [<ffffffffa02f3e53>] ? _xfs_buf_read+0x23/0x40 [xfs]
[69002.064076] [<ffffffffa02f3d46>] xfs_buf_submit_wait+0xe6/0x1d0 [xfs]
[69002.070669] [<ffffffffa0323709>] ? xfs_trans_read_buf_map+0x199/0x400 [xfs]
[69002.077814] [<ffffffffa02f3e53>] _xfs_buf_read+0x23/0x40 [xfs]
[69002.083832] [<ffffffffa02f3f68>] xfs_buf_read_map+0xf8/0x160 [xfs]
[69002.090163] [<ffffffffa0323709>] xfs_trans_read_buf_map+0x199/0x400 [xfs]
[69002.097128] [<ffffffffa02e758e>] xfs_imap_to_bp+0x6e/0xf0 [xfs]
[69002.103213] [<ffffffffa02e7a8e>] xfs_iread+0x7e/0x430 [xfs]
[69002.108958] [<ffffffffa02fcd08>] xfs_iget+0x2a8/0x750 [xfs]
[69002.114645] [<ffffffffa0305ae7>] xfs_lookup+0xf7/0x140 [xfs]
[69002.120525] [<ffffffffa030278b>] xfs_vn_lookup+0x7b/0xd0 [xfs]
[69002.126513] [<ffffffff812083dd>] lookup_real+0x1d/0x50
[69002.131783] [<ffffffff81208d52>] __lookup_hash+0x42/0x60
[69002.137255] [<ffffffff8168318b>] lookup_slow+0x42/0xa7
[69002.142531] [<ffffffff8120b96f>] link_path_walk+0x80f/0x8b0
[69002.148275] [<ffffffffa02b9176>] ? xfs_attr_shortform_getvalue+0x116/0x130 [xfs]
[69002.155874] [<ffffffff8120bb6b>] path_lookupat+0x6b/0x7a0
[69002.161428] [<ffffffff811ddf75>] ? kmem_cache_alloc+0x35/0x1e0
[69002.167346] [<ffffffff8120ebbf>] ? getname_flags+0x4f/0x1a0
[69002.173068] [<ffffffff8120c2cb>] filename_lookup+0x2b/0xc0
[69002.178700] [<ffffffff8120fce7>] user_path_at_empty+0x67/0xc0
[69002.184592] [<ffffffff8120eb4d>] ? putname+0x3d/0x60
[69002.189698] [<ffffffff8120fcf2>] ? user_path_at_empty+0x72/0xc0
[69002.195767] [<ffffffff8120fd51>] user_path_at+0x11/0x20
[69002.201076] [<ffffffff812031c3>] vfs_fstatat+0x63/0xc0
[69002.206366] [<ffffffff81203791>] SYSC_newlstat+0x31/0x60
[69002.211811] [<ffffffff8121e8e4>] ? mntput+0x24/0x40
[69002.216836] [<ffffffff8111ede6>] ? __audit_syscall_exit+0x1e6/0x280
[69002.223232] [<ffffffff81203a1e>] SyS_newlstat+0xe/0x10
[69002.228520] [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b
[69002.234573] INFO: task glusterfsd:26904 blocked for more than 120 seconds.
[69002.241504] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[69002.249371] glusterfsd D ffff8801260bc228 0 26904 1 0x00000080
[69002.256542] ffff880902b2bb70 0000000000000086 ffff880bed763ec0 ffff880902b2bfd8
[69002.264122] ffff880902b2bfd8 ffff880902b2bfd8 ffff880bed763ec0 ffff8801260bc220
[69002.271765] ffff8801260bc224 ffff880bed763ec0 00000000ffffffff ffff8801260bc228
[69002.279360] Call Trace:
[69002.281868] [<ffffffff8168c669>] schedule_preempt_disabled+0x29/0x70
[69002.288357] [<ffffffff8168a2c5>] __mutex_lock_slowpath+0xc5/0x1c0
[69002.294599] [<ffffffff81208797>] ? unlazy_walk+0x87/0x140
[69002.300181] [<ffffffff8168972f>] mutex_lock+0x1f/0x2f
[69002.305353] [<ffffffff8168317c>] lookup_slow+0x33/0xa7
[69002.310629] [<ffffffff8120b96f>] link_path_walk+0x80f/0x8b0
[69002.316354] [<ffffffffa02b9176>] ? xfs_attr_shortform_getvalue+0x116/0x130 [xfs]
[69002.323881] [<ffffffff8120bb6b>] path_lookupat+0x6b/0x7a0
[69002.329424] [<ffffffff811ddf75>] ? kmem_cache_alloc+0x35/0x1e0
[69002.335395] [<ffffffff8120ebbf>] ? getname_flags+0x4f/0x1a0
[69002.341093] [<ffffffff8120c2cb>] filename_lookup+0x2b/0xc0
[69002.346721] [<ffffffff8120fce7>] user_path_at_empty+0x67/0xc0
[69002.352609] [<ffffffff8120eb4d>] ? putname+0x3d/0x60
[69002.357721] [<ffffffff8120fcf2>] ? user_path_at_empty+0x72/0xc0
[69002.363785] [<ffffffff8120fd51>] user_path_at+0x11/0x20
[69002.369102] [<ffffffff812031c3>] vfs_fstatat+0x63/0xc0
[69002.374380] [<ffffffff81203791>] SYSC_newlstat+0x31/0x60
[69002.379829] [<ffffffff8121e8e4>] ? mntput+0x24/0x40
[69002.384851] [<ffffffff8111ede6>] ? __audit_syscall_exit+0x1e6/0x280
[69002.391243] [<ffffffff81203a1e>] SyS_newlstat+0xe/0x10
[69002.396521] [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b
</snip>

Version-Release number of selected component (if applicable):
--------------------------------------------------------------
glusterfs-fuse-3.8.4-28.el7rhgs.x86_64

How reproducible:
-----------------
Reporting the first occurrence.

Actual results:
---------------
dmesg flooded with call traces.

Expected results:
-----------------
No traces in logs.
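The warnings above come from the kernel's hung-task watchdog (khungtaskd), which flags tasks stuck in uninterruptible sleep for longer than `kernel.hung_task_timeout_secs` (120 s by default). To see which tasks are affected without reading the full traces, the `INFO: task …` lines can be summarized; a minimal sketch using an inlined sample (not output from the affected machines):

```shell
#!/bin/sh
# Sample hung-task lines; in practice, pipe real `dmesg` output in instead.
cat <<'EOF' > /tmp/dmesg.sample
[67441.074433] INFO: task glusterfsd:27476 blocked for more than 120 seconds.
[68041.202808] INFO: task xfsaild/dm-42:6989 blocked for more than 120 seconds.
[69001.301766] INFO: task kswapd1:146 blocked for more than 120 seconds.
EOF

# Field 4 is "taskname:pid"; strip the pid and count reports per task name.
awk '/INFO: task .* blocked for more than/ {split($4, a, ":"); print a[1]}' \
    /tmp/dmesg.sample | sort | uniq -c
```

On a live system, `sysctl kernel.hung_task_timeout_secs` shows the current threshold; the `echo 0 > /proc/sys/kernel/hung_task_timeout_secs` hint in the log only silences the warning, it does not unblock the tasks.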
Additional info:
----------------

*Master*:

Volume Name: testvol
Type: Distributed-Replicate
Volume ID: 29f18f45-c822-4c0e-84ef-737e128e0368
Status: Started
Snapshot Count: 0
Number of Bricks: 12 x 2 = 24
Transport-type: tcp
Bricks:
Brick1: gqas005.sbu.lab.eng.bos.redhat.com:/bricks1/A1
Brick2: gqas013.sbu.lab.eng.bos.redhat.com:/bricks1/A1
Brick3: gqas006.sbu.lab.eng.bos.redhat.com:/bricks1/A1
Brick4: gqas008.sbu.lab.eng.bos.redhat.com:/bricks1/A1
Brick5: gqas005.sbu.lab.eng.bos.redhat.com:/bricks2/A1
Brick6: gqas013.sbu.lab.eng.bos.redhat.com:/bricks2/A1
Brick7: gqas006.sbu.lab.eng.bos.redhat.com:/bricks2/A1
Brick8: gqas008.sbu.lab.eng.bos.redhat.com:/bricks2/A1
Brick9: gqas005.sbu.lab.eng.bos.redhat.com:/bricks3/A1
Brick10: gqas013.sbu.lab.eng.bos.redhat.com:/bricks3/A1
Brick11: gqas006.sbu.lab.eng.bos.redhat.com:/bricks3/A1
Brick12: gqas008.sbu.lab.eng.bos.redhat.com:/bricks3/A1
Brick13: gqas005.sbu.lab.eng.bos.redhat.com:/bricks4/A1
Brick14: gqas013.sbu.lab.eng.bos.redhat.com:/bricks4/A1
Brick15: gqas006.sbu.lab.eng.bos.redhat.com:/bricks4/A1
Brick16: gqas008.sbu.lab.eng.bos.redhat.com:/bricks4/A1
Brick17: gqas005.sbu.lab.eng.bos.redhat.com:/bricks5/A1
Brick18: gqas013.sbu.lab.eng.bos.redhat.com:/bricks5/A1
Brick19: gqas006.sbu.lab.eng.bos.redhat.com:/bricks5/A1
Brick20: gqas008.sbu.lab.eng.bos.redhat.com:/bricks5/A1
Brick21: gqas005.sbu.lab.eng.bos.redhat.com:/bricks6/A1
Brick22: gqas013.sbu.lab.eng.bos.redhat.com:/bricks6/A1
Brick23: gqas006.sbu.lab.eng.bos.redhat.com:/bricks6/A1
Brick24: gqas008.sbu.lab.eng.bos.redhat.com:/bricks6/A1
Options Reconfigured:
nfs.disable: off
transport.address-family: inet
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.stat-prefetch: on
performance.cache-invalidation: on
performance.md-cache-timeout: 600
network.inode-lru-limit: 50000
server.event-threads: 4
client.event-threads: 4
geo-replication.indexing: on
geo-replication.ignore-pid-check: on
changelog.changelog: on
cluster.enable-shared-storage: enable
[root@gqas013 glusterfs]#

*Slave*:

[root@gqas015 ~]# gluster v info

Volume Name: butcher
Type: Distributed-Disperse
Volume ID: 6de155ee-2200-44bb-a8ed-bdae5acf348f
Status: Started
Snapshot Count: 0
Number of Bricks: 4 x (4 + 2) = 24
Transport-type: tcp
Bricks:
Brick1: gqas014.sbu.lab.eng.bos.redhat.com:/bricks1/A1
Brick2: gqas015.sbu.lab.eng.bos.redhat.com:/bricks1/A1
Brick3: gqas014.sbu.lab.eng.bos.redhat.com:/bricks2/A1
Brick4: gqas015.sbu.lab.eng.bos.redhat.com:/bricks2/A1
Brick5: gqas014.sbu.lab.eng.bos.redhat.com:/bricks3/A1
Brick6: gqas015.sbu.lab.eng.bos.redhat.com:/bricks3/A1
Brick7: gqas014.sbu.lab.eng.bos.redhat.com:/bricks4/A1
Brick8: gqas015.sbu.lab.eng.bos.redhat.com:/bricks4/A1
Brick9: gqas014.sbu.lab.eng.bos.redhat.com:/bricks5/A1
Brick10: gqas015.sbu.lab.eng.bos.redhat.com:/bricks5/A1
Brick11: gqas014.sbu.lab.eng.bos.redhat.com:/bricks6/A1
Brick12: gqas015.sbu.lab.eng.bos.redhat.com:/bricks6/A1
Brick13: gqas014.sbu.lab.eng.bos.redhat.com:/bricks7/A1
Brick14: gqas015.sbu.lab.eng.bos.redhat.com:/bricks7/A1
Brick15: gqas014.sbu.lab.eng.bos.redhat.com:/bricks8/A1
Brick16: gqas015.sbu.lab.eng.bos.redhat.com:/bricks8/A1
Brick17: gqas014.sbu.lab.eng.bos.redhat.com:/bricks9/A1
Brick18: gqas015.sbu.lab.eng.bos.redhat.com:/bricks9/A1
Brick19: gqas014.sbu.lab.eng.bos.redhat.com:/bricks10/A1
Brick20: gqas015.sbu.lab.eng.bos.redhat.com:/bricks10/A1
Brick21: gqas014.sbu.lab.eng.bos.redhat.com:/bricks11/A1
Brick22: gqas015.sbu.lab.eng.bos.redhat.com:/bricks11/A1
Brick23: gqas014.sbu.lab.eng.bos.redhat.com:/bricks12/A1
Brick24: gqas015.sbu.lab.eng.bos.redhat.com:/bricks12/A1
Options Reconfigured:
network.inode-lru-limit: 50000
performance.md-cache-timeout: 600
performance.cache-invalidation: on
performance.stat-prefetch: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
transport.address-family: inet
nfs.disable: off
[root@gqas015 ~]#

[root@gqas013 glusterfs]# gluster volume geo-replication testvol gqas014.sbu.lab.eng.bos.redhat.com::butcher status detail

MASTER NODE | MASTER VOL | MASTER BRICK | SLAVE USER | SLAVE | SLAVE NODE | STATUS | CRAWL STATUS | LAST_SYNCED | ENTRY | DATA | META | FAILURES | CHECKPOINT TIME | CHECKPOINT COMPLETED | CHECKPOINT COMPLETION TIME
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
gqas013.sbu.lab.eng.bos.redhat.com testvol /bricks1/A1 root gqas014.sbu.lab.eng.bos.redhat.com::butcher gqas014.sbu.lab.eng.bos.redhat.com Active Hybrid Crawl N/A 0 7299 0 0 2017-06-16 05:32:22 No N/A
gqas013.sbu.lab.eng.bos.redhat.com testvol /bricks2/A1 root gqas014.sbu.lab.eng.bos.redhat.com::butcher gqas014.sbu.lab.eng.bos.redhat.com Active Hybrid Crawl N/A 0 7321 0 0 2017-06-16 05:32:22 No N/A
gqas013.sbu.lab.eng.bos.redhat.com testvol /bricks3/A1 root gqas014.sbu.lab.eng.bos.redhat.com::butcher gqas014.sbu.lab.eng.bos.redhat.com Active Hybrid Crawl N/A 0 7303 0 0 2017-06-16 05:32:22 No N/A
gqas013.sbu.lab.eng.bos.redhat.com testvol /bricks4/A1 root gqas014.sbu.lab.eng.bos.redhat.com::butcher gqas014.sbu.lab.eng.bos.redhat.com Active Hybrid Crawl N/A 8192 7318 0 0 2017-06-16 05:32:22 No N/A
gqas013.sbu.lab.eng.bos.redhat.com testvol /bricks5/A1 root gqas014.sbu.lab.eng.bos.redhat.com::butcher gqas014.sbu.lab.eng.bos.redhat.com Active Hybrid Crawl N/A 0 7308 0 0 2017-06-16 05:32:22 No N/A
gqas013.sbu.lab.eng.bos.redhat.com testvol /bricks6/A1 root gqas014.sbu.lab.eng.bos.redhat.com::butcher gqas014.sbu.lab.eng.bos.redhat.com Active Hybrid Crawl N/A 0 7293 0 0 2017-06-16 05:32:22 No N/A
gqas005.sbu.lab.eng.bos.redhat.com testvol /bricks1/A1 root gqas014.sbu.lab.eng.bos.redhat.com::butcher gqas015.sbu.lab.eng.bos.redhat.com Passive N/A N/A N/A N/A N/A N/A N/A N/A N/A
gqas005.sbu.lab.eng.bos.redhat.com testvol /bricks2/A1 root gqas014.sbu.lab.eng.bos.redhat.com::butcher gqas015.sbu.lab.eng.bos.redhat.com Passive N/A N/A N/A N/A N/A N/A N/A N/A N/A
gqas005.sbu.lab.eng.bos.redhat.com testvol /bricks3/A1 root gqas014.sbu.lab.eng.bos.redhat.com::butcher gqas015.sbu.lab.eng.bos.redhat.com Passive N/A N/A N/A N/A N/A N/A N/A N/A N/A
gqas005.sbu.lab.eng.bos.redhat.com testvol /bricks4/A1 root gqas014.sbu.lab.eng.bos.redhat.com::butcher gqas015.sbu.lab.eng.bos.redhat.com Passive N/A N/A N/A N/A N/A N/A N/A N/A N/A
gqas005.sbu.lab.eng.bos.redhat.com testvol /bricks5/A1 root gqas014.sbu.lab.eng.bos.redhat.com::butcher gqas015.sbu.lab.eng.bos.redhat.com Passive N/A N/A N/A N/A N/A N/A N/A N/A N/A
gqas005.sbu.lab.eng.bos.redhat.com testvol /bricks6/A1 root gqas014.sbu.lab.eng.bos.redhat.com::butcher gqas015.sbu.lab.eng.bos.redhat.com Passive N/A N/A N/A N/A N/A N/A N/A N/A N/A
gqas008.sbu.lab.eng.bos.redhat.com testvol /bricks1/A1 root gqas014.sbu.lab.eng.bos.redhat.com::butcher gqas014.sbu.lab.eng.bos.redhat.com Passive N/A N/A N/A N/A N/A N/A N/A N/A N/A
gqas008.sbu.lab.eng.bos.redhat.com testvol /bricks2/A1 root gqas014.sbu.lab.eng.bos.redhat.com::butcher gqas014.sbu.lab.eng.bos.redhat.com Passive N/A N/A N/A N/A N/A N/A N/A N/A N/A
gqas008.sbu.lab.eng.bos.redhat.com testvol /bricks3/A1 root gqas014.sbu.lab.eng.bos.redhat.com::butcher gqas014.sbu.lab.eng.bos.redhat.com Active Hybrid Crawl N/A 0 7310 0 0 2017-06-16 05:32:22 No N/A
gqas008.sbu.lab.eng.bos.redhat.com testvol /bricks4/A1 root gqas014.sbu.lab.eng.bos.redhat.com::butcher gqas014.sbu.lab.eng.bos.redhat.com Passive N/A N/A N/A N/A N/A N/A N/A N/A N/A
gqas008.sbu.lab.eng.bos.redhat.com testvol /bricks5/A1 root gqas014.sbu.lab.eng.bos.redhat.com::butcher gqas014.sbu.lab.eng.bos.redhat.com Passive N/A N/A N/A N/A N/A N/A N/A N/A N/A
gqas008.sbu.lab.eng.bos.redhat.com testvol /bricks6/A1 root gqas014.sbu.lab.eng.bos.redhat.com::butcher gqas014.sbu.lab.eng.bos.redhat.com Passive N/A N/A N/A N/A N/A N/A N/A N/A N/A
gqas006.sbu.lab.eng.bos.redhat.com testvol /bricks1/A1 root gqas014.sbu.lab.eng.bos.redhat.com::butcher gqas015.sbu.lab.eng.bos.redhat.com Active Hybrid Crawl N/A 0 7305 0 0 2017-06-16 05:32:22 No N/A
gqas006.sbu.lab.eng.bos.redhat.com testvol /bricks2/A1 root gqas014.sbu.lab.eng.bos.redhat.com::butcher gqas015.sbu.lab.eng.bos.redhat.com Active Hybrid Crawl N/A 8192 7311 0 0 2017-06-16 05:32:22 No N/A
gqas006.sbu.lab.eng.bos.redhat.com testvol /bricks3/A1 root gqas014.sbu.lab.eng.bos.redhat.com::butcher gqas015.sbu.lab.eng.bos.redhat.com Passive N/A N/A N/A N/A N/A N/A N/A N/A N/A
gqas006.sbu.lab.eng.bos.redhat.com testvol /bricks4/A1 root gqas014.sbu.lab.eng.bos.redhat.com::butcher gqas015.sbu.lab.eng.bos.redhat.com Active Hybrid Crawl N/A 0 7311 0 0 2017-06-16 05:32:22 No N/A
gqas006.sbu.lab.eng.bos.redhat.com testvol /bricks5/A1 root gqas014.sbu.lab.eng.bos.redhat.com::butcher gqas015.sbu.lab.eng.bos.redhat.com Active Hybrid Crawl N/A 0 7309 0 0 2017-06-16 05:32:22 No N/A
gqas006.sbu.lab.eng.bos.redhat.com testvol /bricks6/A1 root gqas014.sbu.lab.eng.bos.redhat.com::butcher gqas015.sbu.lab.eng.bos.redhat.com Active Hybrid Crawl N/A 0 7311 0 0 2017-06-16 05:32:22 No N/A
[root@gqas013 glusterfs]#
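For reference, the reconfigured options shown in the two volume dumps are applied per volume with `gluster volume set`. A sketch of how the md-cache/cache-invalidation profile on the master would be applied (commands assume a running cluster with this volume; shown for reproduction only, not taken from the report):

```shell
# Hypothetical re-application of the cache-invalidation tuning shown above
# on the master volume "testvol" (requires a running GlusterFS cluster).
gluster volume set testvol features.cache-invalidation on
gluster volume set testvol features.cache-invalidation-timeout 600
gluster volume set testvol performance.stat-prefetch on
gluster volume set testvol performance.cache-invalidation on
gluster volume set testvol performance.md-cache-timeout 600
gluster volume set testvol network.inode-lru-limit 50000
```

`gluster volume get testvol all` can be used afterwards to confirm the effective values.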
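A quick way to sanity-check a status dump like the one above is to count Active versus Passive workers per session. A minimal sketch over a trimmed, simplified sample of such rows (hostnames shortened; not the actual report output):

```shell
#!/bin/sh
# Summarize worker STATUS from `gluster volume geo-replication ... status detail`
# output. A simplified three-row sample is inlined; in practice, pipe the real
# command output (minus its header lines) in instead.
cat <<'EOF' > /tmp/georep.sample
gqas013 testvol /bricks1/A1 root gqas014::butcher gqas014 Active Hybrid Crawl
gqas005 testvol /bricks1/A1 root gqas014::butcher gqas015 Passive N/A
gqas008 testvol /bricks3/A1 root gqas014::butcher gqas014 Active Hybrid Crawl
EOF

# In this layout, column 7 is the worker STATUS; count each state.
awk '{print $7}' /tmp/georep.sample | sort | uniq -c
```

In the report above this gives 12 Active and 12 Passive workers, i.e. one active worker per replica pair, which is the expected distribution for a 12 x 2 master volume.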
This call trace looks like an issue with the backend filesystem. We hit a similar issue on another stress-testing machine. If this same behavior has not been seen in recent times, especially in 3.4.0 testing, should we close the bug as WORKSFORME/CURRENTRELEASE?
Closing with reference to Comment #7: no change in glusterfs caused this particular kernel stack trace. Please reopen if it is seen again in testing.