Description of problem:
=========================
Hit this while verifying bug 1456227. I had about 35 blocks present on a volume, did a volume stop and delete, and then executed 'targetcli clearconfig confirm=True', which resulted in multiple errors in the tcmu-runner logs as well as in the status of tcmu-runner. 'systemctl status tcmu-runner' showed the service as 'active (running)' but displayed the same error messages, indicating that tcmu-runner on that node was not healthy. Restarting gluster-blockd did not help, and a restart of tcmu-runner hung.

When I tried to get the node (and its services) back to normal after the weekend, I rebooted the node. After the reboot, the status of tcmu-runner shows the service as dead, and any attempt to restart it results in the same behaviour: 'active (running)' but lots of errors (pasted below) in the logs. Restart of gluster-blockd remains hung (I am assuming because it tries to bring tcmu-runner up due to the spec dependency).

Version-Release number of selected component (if applicable):
============================================================
glusterfs-3.8.4-33 and gluster-block-0.2.1-6

How reproducible:
=================
1:1

Steps to Reproduce:
======================
// Same as mentioned in bz 1456227
1. Create some gluster-blocks
2. Gluster volume stop the volume and delete it
3. targetcli clearconfig confirm=True

Actual results:
===============
tcmu-runner goes down, taking gluster-blockd along with it.

Expected results:
================
The services should not be affected. Or, if we are expecting something to go wrong, 'volume delete' should error out at the outset itself, mentioning that there are block devices present in the volume.

Additional info:
=================
I did have a few blocks (maybe 3, but I am not sure) that were in a failed state due to some negative testing I was doing. But I have no idea what the devices uio1, uio2 and uio3 mentioned in the logs are.
[root@dhcp47-115 ~]# systemctl status tcmu-runner
● tcmu-runner.service - LIO Userspace-passthrough daemon
   Loaded: loaded (/usr/lib/systemd/system/tcmu-runner.service; static; vendor preset: disabled)
   Active: active (running) since Wed 2017-07-19 03:21:54 EDT; 1 day 20h ago
 Main PID: 2737 (tcmu-runner)
   CGroup: /system.slice/tcmu-runner.service
           └─2737 /usr/bin/tcmu-runner --tcmu-log-dir=/var/log/gluster-block/

Jul 19 03:21:54 dhcp47-115.lab.eng.blr.redhat.com systemd[1]: Starting LIO Userspace-passthrough daemon...
Jul 19 03:21:54 dhcp47-115.lab.eng.blr.redhat.com systemd[1]: Started LIO Userspace-passthrough daemon.
Jul 20 23:35:10 dhcp47-115.lab.eng.blr.redhat.com tcmu-runner[2737]: tcmu-runner : remove_device:522 : Could not remove device uio3: not found.
Jul 20 23:35:10 dhcp47-115.lab.eng.blr.redhat.com tcmu-runner[2737]: 2017-07-20 23:35:10.254 2737 [ERROR] remove_device:522 : Could not remove device uio3: not found.
Jul 20 23:35:10 dhcp47-115.lab.eng.blr.redhat.com tcmu-runner[2737]: tcmu-runner : remove_device:522 : Could not remove device uio2: not found.
Jul 20 23:35:10 dhcp47-115.lab.eng.blr.redhat.com tcmu-runner[2737]: 2017-07-20 23:35:10.258 2737 [ERROR] remove_device:522 : Could not remove device uio2: not found.
Jul 20 23:35:10 dhcp47-115.lab.eng.blr.redhat.com tcmu-runner[2737]: tcmu-runner : remove_device:522 : Could not remove device uio1: not found.
Jul 20 23:35:10 dhcp47-115.lab.eng.blr.redhat.com tcmu-runner[2737]: 2017-07-20 23:35:10.261 2737 [ERROR] remove_device:522 : Could not remove device uio1: not found.
[root@dhcp47-115 ~]#

/var/log/messages also shows the same errors for tcmu-runner.
------------------
Jul 20 23:35:10 localhost tcmu-runner: 2017-07-20 23:35:10.254 2737 [ERROR] remove_device:522 : Could not remove device uio3: not found.
Jul 20 23:35:10 localhost journal: tcmu-runner#012: remove_device:522 : Could not remove device uio2: not found.
Jul 20 23:35:10 localhost tcmu-runner: 2017-07-20 23:35:10.258 2737 [ERROR] remove_device:522 : Could not remove device uio2: not found.
Jul 20 23:35:10 localhost journal: tcmu-runner#012: remove_device:522 : Could not remove device uio1: not found.
Jul 20 23:35:10 localhost tcmu-runner: 2017-07-20 23:35:10.261 2737 [ERROR] remove_device:522 : Could not remove device uio1: not found.
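For triage, it may help to see which uio devices the 'remove_device' errors refer to and how often each one is reported. The snippet below is only a rough sketch: the regular expression is derived from the log lines pasted above, not from the tcmu-runner source, and the function name missing_devices is made up for illustration.

```python
import re
from collections import Counter

# Pattern taken from the error lines pasted in this report; the exact
# message format in other tcmu-runner versions is an assumption.
REMOVE_ERR = re.compile(
    r"remove_device:\d+ : Could not remove device (\S+): not found"
)


def missing_devices(log_text):
    """Count 'Could not remove device' errors per device name."""
    return Counter(REMOVE_ERR.findall(log_text))


# Three lines copied from the /var/log/messages excerpt above.
sample = """\
Jul 20 23:35:10 localhost tcmu-runner: 2017-07-20 23:35:10.254 2737 [ERROR] remove_device:522 : Could not remove device uio3: not found.
Jul 20 23:35:10 localhost journal: tcmu-runner#012: remove_device:522 : Could not remove device uio2: not found.
Jul 20 23:35:10 localhost tcmu-runner: 2017-07-20 23:35:10.258 2737 [ERROR] remove_device:522 : Could not remove device uio2: not found.
"""

counts = missing_devices(sample)
for device, hits in sorted(counts.items()):
    print(device, hits)
```

Running the same function over the full tcmu-runner log from the sosreport would show whether only uio1, uio2 and uio3 are affected or whether other devices appear as well.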
Gluster-block logs present in this location: http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/<bugnumber>/

[qe@rhsqe-repo 1474273]$ hostname
rhsqe-repo.lab.eng.blr.redhat.com
[qe@rhsqe-repo 1474273]$
[qe@rhsqe-repo 1474273]$ pwd
/home/repo/sosreports/1474273
[qe@rhsqe-repo 1474273]$
[qe@rhsqe-repo 1474273]$ ll
total 12
drwxr-xr-x. 2 qe qe 4096 Jul 24 15:03 gluster-block_dhcp47-115
drwxr-xr-x. 2 qe qe 4096 Jul 24 15:02 gluster-block_dhcp47-116
drwxr-xr-x. 2 qe qe 4096 Jul 24 15:01 gluster-block_dhcp47-117
[qe@rhsqe-repo 1474273]$
[qe@rhsqe-repo 1474273]$
Removing the need-info on this bug as Humble has replied in comment 4.
*** Bug 1467851 has been marked as a duplicate of this bug. ***
We shouldn't use targetcli to clean the volumes. Only use gluster-block.
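To illustrate the ordering this implies (delete the blocks through gluster-block first, and only then stop and delete the volume), here is a minimal dry-run sketch. It only builds and prints the commands and does not execute anything. The volume name 'blockvol', the block names and the helper cleanup_commands are hypothetical, and the 'gluster-block delete <volname/blockname>' and 'gluster --mode=script' forms should be checked against the installed versions before use.

```python
import shlex


def cleanup_commands(volume, blocks):
    """Build the cleanup commands in order: blocks first, then the volume.

    Nothing is executed here; the caller decides whether to run them.
    """
    # Remove every block through gluster-block so that the target
    # configuration is cleaned up by gluster-block, not by targetcli.
    cmds = [["gluster-block", "delete", f"{volume}/{name}"] for name in blocks]
    # Only after all blocks are gone, stop and delete the hosting volume.
    # --mode=script is assumed here to skip the interactive confirmation.
    cmds.append(["gluster", "--mode=script", "volume", "stop", volume])
    cmds.append(["gluster", "--mode=script", "volume", "delete", volume])
    return cmds


# Hypothetical volume and block names, for illustration only.
plan = cleanup_commands("blockvol", ["block1", "block2"])
for cmd in plan:
    print(shlex.join(cmd))
```

Printing the plan rather than running it keeps the sketch safe to try on any machine; the block list itself would come from 'gluster-block list <volname>' on a real setup.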