Bug 1476176 - [Gluster-block]: Call trace seen in dmesg while doing block create/delete and tcmu-runner service restart
Status: NEW
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: tcmu-runner
Version: 3.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Assigned To: Prasanna Kumar Kalever
QA Contact: Rahul Hinduja
Depends On:
Blocks:
Reported: 2017-07-28 04:17 EDT by Sweta Anandpara
Modified: 2017-09-28 13:03 EDT

Type: Bug

Attachments: None
Description Sweta Anandpara 2017-07-28 04:17:03 EDT
Description of problem:
======================
While running block create/delete tests in a loop, with tcmu-runner restarts in the interim (and a few other operations I don't recall exactly), I happened to see a call trace in the dmesg output on 2 nodes of my 6-node cluster.
The trace is seen multiple times, back to back: 8 times on one node and 10 times on the other.
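
For reference, the in-loop workload was along these lines. This is a minimal sketch only: the volume/block names, host list, ha count and size are illustrative placeholders, not the exact values from the run:

# Illustrative create/delete loop (names, hosts and sizes are placeholders)
for i in $(seq 1 20); do
    gluster-block create vol$i/block$i ha 3 10.70.47.115,10.70.47.121,10.70.47.113 1GiB
    gluster-block delete vol$i/block$i
done

# ...with tcmu-runner restarted on one of the nodes while the loop is running:
systemctl restart tcmu-runner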

[   82.389711] Rounding down aligned max_sectors from 4294967295 to 4294967288
[   82.408631] tcmu daemon: command reply support 1.
[   82.419220] target_core_register_fabric() trying autoload for iscsi
[  239.882153] tcmu daemon: command reply support 1.
[  240.045338] INFO: task targetctl:4262 blocked for more than 120 seconds.
[  240.045442] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  240.045547] targetctl       D ffff8800c6ec1468     0  4262      1 0x00000080
[  240.045556]  ffff880117db3c50 0000000000000086 ffff8800cacf8fd0 ffff880117db3fd8
[  240.045561]  ffff880117db3fd8 ffff880117db3fd8 ffff8800cacf8fd0 ffff8800c6ec1448
[  240.045565]  7fffffffffffffff ffff8800c6ec1440 ffff8800cacf8fd0 ffff8800c6ec1468
[  240.045570] Call Trace:
[  240.045587]  [<ffffffff816a94c9>] schedule+0x29/0x70
[  240.045598]  [<ffffffff816a6fd9>] schedule_timeout+0x239/0x2c0
[  240.045608]  [<ffffffff815ba55c>] ? netlink_broadcast_filtered+0x14c/0x3e0
[  240.045613]  [<ffffffff816a987d>] wait_for_completion+0xfd/0x140
[  240.045622]  [<ffffffff810c4810>] ? wake_up_state+0x20/0x20
[  240.045634]  [<ffffffffc0619f5a>] tcmu_netlink_event+0x26a/0x3a0 [target_core_user]
[  240.045642]  [<ffffffff810b1910>] ? wake_up_atomic_t+0x30/0x30
[  240.045649]  [<ffffffffc061a2c6>] tcmu_configure_device+0x236/0x350 [target_core_user]
[  240.045682]  [<ffffffffc05aa5df>] target_configure_device+0x3f/0x3b0 [target_core_mod]
[  240.045695]  [<ffffffffc05a4e7c>] target_core_store_dev_enable+0x2c/0x60 [target_core_mod]
[  240.045707]  [<ffffffffc05a3244>] target_core_dev_store+0x24/0x40 [target_core_mod]
[  240.045715]  [<ffffffff81287f44>] configfs_write_file+0xc4/0x130
[  240.045722]  [<ffffffff81200d2d>] vfs_write+0xbd/0x1e0
[  240.045726]  [<ffffffff81201b3f>] SyS_write+0x7f/0xe0
[  240.045732]  [<ffffffff816b4fc9>] system_call_fastpath+0x16/0x1b
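
From the trace, targetctl (PID 4262) is stuck in uninterruptible sleep in tcmu_netlink_event() under wait_for_completion(), i.e. the kernel appears to be waiting on a netlink reply from the tcmu-runner daemon while configuring a device; such a reply would never arrive if the daemon was down or restarting at that moment. If it recurs, the blocked task can be inspected with standard tools (a sketch; the PID is the one from the trace above):

ps -o pid,stat,wchan:30,cmd -p 4262    # STAT 'D' = uninterruptible sleep
cat /proc/4262/stack                   # kernel stack of the blocked task
systemctl status tcmu-runner           # was the daemon down/restarting?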



Version-Release number of selected component (if applicable):
============================================================
glusterfs-3.8.4-33 and gluster-block-0.2.1-6


How reproducible:
===================
I don't have the exact steps to reproduce this. I will keep a watch on the further tests I am doing, in case I hit it again.
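
In the meantime, a simple watcher on the kernel log can flag the trace as soon as it shows up during a test run (an illustrative sketch, not part of the current test setup; the log path is arbitrary):

# Follow the kernel ring buffer and log any hung-task reports with context
dmesg -w | grep -A 20 'blocked for more than' >> /root/hung-task-watch.log &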


Additional info:
===============

[root@dhcp47-115 ~]# gluster peer status
Number of Peers: 5

Hostname: dhcp47-121.lab.eng.blr.redhat.com
Uuid: 49610061-1788-4cbc-9205-0e59fe91d842
State: Peer in Cluster (Connected)
Other names:
10.70.47.121

Hostname: dhcp47-113.lab.eng.blr.redhat.com
Uuid: a0557927-4e5e-4ff7-8dce-94873f867707
State: Peer in Cluster (Connected)

Hostname: dhcp47-114.lab.eng.blr.redhat.com
Uuid: c0dac197-5a4d-4db7-b709-dbf8b8eb0896
State: Peer in Cluster (Connected)
Other names:
10.70.47.114

Hostname: dhcp47-116.lab.eng.blr.redhat.com
Uuid: a96e0244-b5ce-4518-895c-8eb453c71ded
State: Peer in Cluster (Connected)
Other names:
10.70.47.116

Hostname: dhcp47-117.lab.eng.blr.redhat.com
Uuid: 17eb3cef-17e7-4249-954b-fc19ec608304
State: Peer in Cluster (Connected)
Other names:
10.70.47.117
[root@dhcp47-115 ~]# 
[root@dhcp47-115 ~]# rpm -qa | grep gluster
glusterfs-3.8.4-35.el7rhgs.x86_64
glusterfs-api-3.8.4-35.el7rhgs.x86_64
glusterfs-server-3.8.4-35.el7rhgs.x86_64
glusterfs-rdma-3.8.4-35.el7rhgs.x86_64
gluster-block-0.2.1-6.el7rhgs.x86_64
samba-vfs-glusterfs-4.6.3-5.el7rhgs.x86_64
glusterfs-fuse-3.8.4-35.el7rhgs.x86_64
glusterfs-events-3.8.4-35.el7rhgs.x86_64
vdsm-gluster-4.17.33-1.1.el7rhgs.noarch
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
libvirt-daemon-driver-storage-gluster-3.2.0-14.el7.x86_64
glusterfs-libs-3.8.4-35.el7rhgs.x86_64
glusterfs-cli-3.8.4-35.el7rhgs.x86_64
glusterfs-geo-replication-3.8.4-35.el7rhgs.x86_64
gluster-nagios-addons-0.2.9-1.el7rhgs.x86_64
glusterfs-debuginfo-3.8.4-26.el7rhgs.x86_64
glusterfs-client-xlators-3.8.4-35.el7rhgs.x86_64
python-gluster-3.8.4-35.el7rhgs.noarch
[root@dhcp47-115 ~]# 
[root@dhcp47-115 ~]# gluster v list
ctdb
disp
gluster_shared_storage
testvol
vol0
vol1
vol10
vol11
vol12
vol13
vol14
vol15
vol16
vol17
vol18
vol19
vol2
vol20
vol21
vol22
vol23
vol24
vol25
vol26
vol27
vol28
vol29
vol3
vol30
vol31
vol32
vol33
vol34
vol35
vol36
vol37
vol38
vol39
vol4
vol40
vol5
vol6
vol7
vol8
vol9
[root@dhcp47-115 ~]#
Comment 3 Sweta Anandpara 2017-07-28 04:28:26 EDT
Not proposing this as a blocker for RHGS 3.3, as I am unsure of the exact steps I was executing.
 
I will keep a watch on this in my further testing.
