Description of problem:
======================
While doing a round of block create/delete testing in a loop, with tcmu-runner restarts in the interim, and a few other things (which I don't remember), I happened to see a call trace in the dmesg output on 2 nodes of my 6-node cluster. The trace is seen multiple times, back to back: 8 times on one node and 10 times on the other.

[   82.389711] Rounding down aligned max_sectors from 4294967295 to 4294967288
[   82.408631] tcmu daemon: command reply support 1.
[   82.419220] target_core_register_fabric() trying autoload for iscsi
[  239.882153] tcmu daemon: command reply support 1.
[  240.045338] INFO: task targetctl:4262 blocked for more than 120 seconds.
[  240.045442] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  240.045547] targetctl       D ffff8800c6ec1468     0  4262      1 0x00000080
[  240.045556]  ffff880117db3c50 0000000000000086 ffff8800cacf8fd0 ffff880117db3fd8
[  240.045561]  ffff880117db3fd8 ffff880117db3fd8 ffff8800cacf8fd0 ffff8800c6ec1448
[  240.045565]  7fffffffffffffff ffff8800c6ec1440 ffff8800cacf8fd0 ffff8800c6ec1468
[  240.045570] Call Trace:
[  240.045587]  [<ffffffff816a94c9>] schedule+0x29/0x70
[  240.045598]  [<ffffffff816a6fd9>] schedule_timeout+0x239/0x2c0
[  240.045608]  [<ffffffff815ba55c>] ? netlink_broadcast_filtered+0x14c/0x3e0
[  240.045613]  [<ffffffff816a987d>] wait_for_completion+0xfd/0x140
[  240.045622]  [<ffffffff810c4810>] ? wake_up_state+0x20/0x20
[  240.045634]  [<ffffffffc0619f5a>] tcmu_netlink_event+0x26a/0x3a0 [target_core_user]
[  240.045642]  [<ffffffff810b1910>] ? wake_up_atomic_t+0x30/0x30
[  240.045649]  [<ffffffffc061a2c6>] tcmu_configure_device+0x236/0x350 [target_core_user]
[  240.045682]  [<ffffffffc05aa5df>] target_configure_device+0x3f/0x3b0 [target_core_mod]
[  240.045695]  [<ffffffffc05a4e7c>] target_core_store_dev_enable+0x2c/0x60 [target_core_mod]
[  240.045707]  [<ffffffffc05a3244>] target_core_dev_store+0x24/0x40 [target_core_mod]
[  240.045715]  [<ffffffff81287f44>] configfs_write_file+0xc4/0x130
[  240.045722]  [<ffffffff81200d2d>] vfs_write+0xbd/0x1e0
[  240.045726]  [<ffffffff81201b3f>] SyS_write+0x7f/0xe0
[  240.045732]  [<ffffffff816b4fc9>] system_call_fastpath+0x16/0x1b

Version-Release number of selected component (if applicable):
============================================================
glusterfs-3.8.4-33 and gluster-block-0.2.1-6

How reproducible:
===================
I don't have the exact steps to reproduce this. I will keep a watch on the further tests that I am doing, in case I hit it again.
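For reference, the workload was roughly of the following shape. This is only a minimal sketch of the create/delete loop described above, not a confirmed reproducer; the volume name, block names, host list, HA count, size, and restart cadence are all placeholder assumptions, not the exact values used.

#!/bin/bash
# Rough sketch of the test workload: create and delete gluster-block
# devices in a loop, restarting tcmu-runner in the interim.
# All names, hosts, and sizes below are hypothetical placeholders.
HOSTS="10.70.47.115,10.70.47.121,10.70.47.113"

for i in $(seq 1 100); do
    gluster-block create testvol/blk-$i ha 3 $HOSTS 1GiB
    gluster-block delete testvol/blk-$i
    # restart tcmu-runner on every 10th iteration (cadence is a guess)
    if (( i % 10 == 0 )); then
        systemctl restart tcmu-runner
    fi
done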
Additional info:
===============
[root@dhcp47-115 ~]# gluster peer status
Number of Peers: 5

Hostname: dhcp47-121.lab.eng.blr.redhat.com
Uuid: 49610061-1788-4cbc-9205-0e59fe91d842
State: Peer in Cluster (Connected)
Other names:
10.70.47.121

Hostname: dhcp47-113.lab.eng.blr.redhat.com
Uuid: a0557927-4e5e-4ff7-8dce-94873f867707
State: Peer in Cluster (Connected)

Hostname: dhcp47-114.lab.eng.blr.redhat.com
Uuid: c0dac197-5a4d-4db7-b709-dbf8b8eb0896
State: Peer in Cluster (Connected)
Other names:
10.70.47.114

Hostname: dhcp47-116.lab.eng.blr.redhat.com
Uuid: a96e0244-b5ce-4518-895c-8eb453c71ded
State: Peer in Cluster (Connected)
Other names:
10.70.47.116

Hostname: dhcp47-117.lab.eng.blr.redhat.com
Uuid: 17eb3cef-17e7-4249-954b-fc19ec608304
State: Peer in Cluster (Connected)
Other names:
10.70.47.117
[root@dhcp47-115 ~]#

[root@dhcp47-115 ~]# rpm -qa | grep gluster
glusterfs-3.8.4-35.el7rhgs.x86_64
glusterfs-api-3.8.4-35.el7rhgs.x86_64
glusterfs-server-3.8.4-35.el7rhgs.x86_64
glusterfs-rdma-3.8.4-35.el7rhgs.x86_64
gluster-block-0.2.1-6.el7rhgs.x86_64
samba-vfs-glusterfs-4.6.3-5.el7rhgs.x86_64
glusterfs-fuse-3.8.4-35.el7rhgs.x86_64
glusterfs-events-3.8.4-35.el7rhgs.x86_64
vdsm-gluster-4.17.33-1.1.el7rhgs.noarch
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
libvirt-daemon-driver-storage-gluster-3.2.0-14.el7.x86_64
glusterfs-libs-3.8.4-35.el7rhgs.x86_64
glusterfs-cli-3.8.4-35.el7rhgs.x86_64
glusterfs-geo-replication-3.8.4-35.el7rhgs.x86_64
gluster-nagios-addons-0.2.9-1.el7rhgs.x86_64
glusterfs-debuginfo-3.8.4-26.el7rhgs.x86_64
glusterfs-client-xlators-3.8.4-35.el7rhgs.x86_64
python-gluster-3.8.4-35.el7rhgs.noarch
[root@dhcp47-115 ~]#

[root@dhcp47-115 ~]# gluster v list
ctdb
disp
gluster_shared_storage
testvol
vol0
vol1
vol10
vol11
vol12
vol13
vol14
vol15
vol16
vol17
vol18
vol19
vol2
vol20
vol21
vol22
vol23
vol24
vol25
vol26
vol27
vol28
vol29
vol3
vol30
vol31
vol32
vol33
vol34
vol35
vol36
vol37
vol38
vol39
vol4
vol40
vol5
vol6
vol7
vol8
vol9
[root@dhcp47-115 ~]#
Not proposing this as a blocker for RHGS 3.3, as I am unsure of the exact steps I was executing. I will keep a watch on this in my further testing.
*** This bug has been marked as a duplicate of bug 1476730 ***