Bug 1476730
Summary: | [Tracker-RHEL-BZ#1608677] tcmu-runner: allow reset netlink support | ||
---|---|---|---|
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Sweta Anandpara <sanandpa> |
Component: | tcmu-runner | Assignee: | Xiubo Li <xiubli> |
Status: | CLOSED ERRATA | QA Contact: | Rahul Hinduja <rhinduja> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | cns-3.9 | CC: | akrishna, amukherj, ansverma, bgoyal, bkunal, bmohanra, ccalhoun, dvercill, dwojslaw, hchiramm, kramdoss, madam, nberry, nchilaka, olim, pkarampu, prasanna.kalever, rgeorge, rhs-bugs, sabose, sanandpa, sankarshan, storage-qa-internal, vbellur, xiubli, ykaul |
Target Milestone: | --- | ||
Target Release: | OCS 3.11 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | tcmu-runner-1.2.0-25.el7rhgs | Doc Type: | Bug Fix |
Doc Text: | Gluster-block operations (create/delete/modify) or a gluster-block-target service restart, performed while tcmu-runner is offline, can trigger a netlink hang that leaves the targetcli process in uninterruptible sleep (D state) indefinitely. To recover from this state, restart the tcmu-runner daemon. |
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2018-10-24 04:52:58 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1608677 | ||
Bug Blocks: | 1559239, 1568868, 1622458, 1629577 |
Description
Sweta Anandpara
2017-07-31 10:00:44 UTC
Unable to get gluster-block daemon UP affects the entire functionality. It is concerning as this is seen in a positive scenario as well, the second time. Hence proposing it as a blocker. Either we should not hit this at all, or there should be a way to reset gluster-blockd if any of its dependent services take a hit.

I understand tcmu-runner prints out the error messages as it is unable to open those devices. Looks like the files were made unavailable without tcmu-runner knowing about it.

Gluster-block logs can be found at this location: http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/<bugnumber>/

The tcmu-runner-glfs.log files are filled with "Client-quorum is not met" and "no subvolumes up". This clearly indicates there's an issue with the gluster volume. Unfortunately, the sosreport provided in Comment 3 does not have the glusterd or brick logs. Will need these to understand

Logs are now attached at: http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1476730/dhcp47_117_glusterfs_logs/

Pasting the snippet below from link. It has logs all the way from April 16th. Isn't this sufficient?!

[qe@rhsqe-repo dhcp47_117_glusterfs_logs]$ hostname
rhsqe-repo.lab.eng.blr.redhat.com
[qe@rhsqe-repo dhcp47_117_glusterfs_logs]$ pwd
/home/repo/sosreports/1476730/dhcp47_117_glusterfs_logs
[qe@rhsqe-repo dhcp47_117_glusterfs_logs]$
[qe@rhsqe-repo dhcp47_117_glusterfs_logs]$ cd glusterfs
[qe@rhsqe-repo glusterfs]$
[qe@rhsqe-repo glusterfs]$
[qe@rhsqe-repo glusterfs]$ ls -lrt glusterd*
-rwxr-xr-x. 1 qe qe 3499487 Aug 9 11:13 glusterd.log-20170430.gz
-rwxr-xr-x. 1 qe qe 159812 Aug 9 11:13 glusterd.log-20170703.gz
-rwxr-xr-x. 1 qe qe 2619 Aug 9 11:13 glusterd.log-20170416.gz
-rwxr-xr-x. 1 qe qe 76632 Aug 9 11:13 glusterd.log-20170424.gz
-rwxr-xr-x. 1 qe qe 5045262 Aug 9 11:13 glusterd.log-20170507.gz
-rwxr-xr-x. 1 qe qe 5675694 Aug 9 11:13 glusterd.log-20170515.gz
-rwxr-xr-x. 1 qe qe 261303 Aug 9 11:13 glusterd.log
-rwxr-xr-x. 1 qe qe 21472 Aug 9 11:13 glusterd.log-20170716.gz
-rwxr-xr-x. 1 qe qe 745438 Aug 9 11:13 glusterd.log-20170521.gz
-rwxr-xr-x. 1 qe qe 105599 Aug 9 11:13 glusterd.log-20170713.gz
-rwxr-xr-x. 1 qe qe 455 Aug 9 11:13 glusterd.log-20170529.gz
-rwxr-xr-x. 1 qe qe 389 Aug 9 11:13 glusterd.log-20170604.gz
-rwxr-xr-x. 1 qe qe 132407 Aug 9 11:13 glusterd.log-20170724.gz
-rwxr-xr-x. 1 qe qe 78701 Aug 9 11:13 glusterd.log-20170611.gz
-rwxr-xr-x. 1 qe qe 127661 Aug 9 11:13 glusterd.log-20170619.gz
-rwxr-xr-x. 1 qe qe 131188 Aug 9 11:13 glusterd.log-20170625.gz
-rwxr-xr-x. 1 qe qe 109866 Aug 9 11:13 glusterd.log-20170730.gz
-rwxr-xr-x. 1 qe qe 89385768 Aug 9 11:14 glusterd.log-20170806
[qe@rhsqe-repo glusterfs]$

*** Bug 1476176 has been marked as a duplicate of this bug. ***

Is this bug related to bug 1560418 ?

(In reply to Yaniv Kaul from comment #28)
> Is this bug related to bug 1560418 ?

Yes

*** Bug 1599158 has been marked as a duplicate of this bug. ***

The upstream PR: https://github.com/open-iscsi/tcmu-runner/pull/402

*** Bug 1610150 has been marked as a duplicate of this bug. ***

*** Bug 1577791 has been marked as a duplicate of this bug. ***

Using the correct tracker.

Updated the Doc text; kindly verify.

Updated the Doc text.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2987
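
The Doc Text above says the hang shows up as a targetcli process stuck in uninterruptible sleep (D state). As a rough aid for spotting that condition, here is a minimal sketch that scans /proc for such processes. It is not part of the fix or of any tool mentioned in this bug. The function names `parse_proc_stat` and `find_d_state` are hypothetical, and it assumes the kernel reports the process name (comm) as `targetcli`, which may not hold on every system.

```python
from pathlib import Path


def parse_proc_stat(stat_line):
    """Return (comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate it via the first '(' and the last ')'.
    start = stat_line.index("(")
    end = stat_line.rindex(")")
    comm = stat_line[start + 1:end]
    # The single-character process state is the first field after comm.
    state = stat_line[end + 1:].split()[0]
    return comm, state


def find_d_state(name="targetcli", proc_root="/proc"):
    """List PIDs whose comm equals `name` and whose state is 'D'.

    Assumption: the hung process is visible under the comm name
    'targetcli'; adjust `name` if it appears differently.
    """
    hits = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # skip non-PID entries such as 'self' or 'meminfo'
        try:
            comm, state = parse_proc_stat((entry / "stat").read_text())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or stat was unreadable
        if comm == name and state == "D":
            hits.append(int(entry.name))
    return sorted(hits)
```

Calling `find_d_state()` on an affected node returns the matching PIDs. A process can pass through D state briefly during normal I/O, so a non-empty result is only a hint unless the same PID keeps showing up across repeated checks. Per the Doc Text, the recovery step is to restart the tcmu-runner daemon.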