Bug 1629889

Summary: Device removal leaves garbage on the device while returning "ok", breaking further attempts to "add" the device back
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Valerii Ponomarov <vponomar>
Component: heketi
Assignee: John Mulligan <jmulligan>
Status: CLOSED ERRATA
QA Contact: Valerii Ponomarov <vponomar>
Severity: high
Priority: unspecified
Version: cns-3.10
CC: akrishna, hchiramm, kramdoss, madam, ndevos, pprakash, rgeorge, rhs-bugs, rtalur, sankarshan, storage-qa-internal, vinug, vponomar
Target Release: OCS 3.11   
Hardware: x86_64   
OS: Linux   
Fixed In Version: heketi-7.0.0-12.el7rhgs
Doc Type: Bug Fix
Doc Text:
Previously, heketi ignored pvremove and vgremove errors when a device was removed, so the device could be dropped from heketi's database without being properly cleaned up, and attempting to add the same disk again failed. Heketi no longer ignores pvremove and vgremove errors, ensuring that devices are removed correctly and can be re-added to Heketi after removal. Alternatively, you can use the "--force-forget" flag with the device remove command to ignore any such errors and ensure the same device can be added back to Heketi.
Last Closed: 2018-10-24 04:51:02 UTC
Type: Bug
Bug Blocks: 1629575    
Attachments:
Heketi DB dump (flags: none)
Heketi server logs (flags: none)

Description Valerii Ponomarov 2018-09-17 14:56:40 UTC
Description of problem:
If we disable a device and then remove it, heketi reports that the bricks were moved to other existing devices,
BUT the removed device still shows "used" space equal to the size of the bricks that were on it.
We are then able to delete such a device. And at this step we cannot add it back, even after running the "wipefs" command in the appropriate glusterfs pod.
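
For reference, a quick way to observe the stale "used" space is via heketi-cli (a sketch; the device ID below is a placeholder):

heketi-cli device info <device-id>   # "Used" size should drop to 0 after a successful remove
heketi-cli topology info             # shows per-device used/free space across the cluster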

Version-Release number of selected component (if applicable):
Heketi server: heketi-7.0.0-11.el7rhgs.x86_64
Heketi client: heketi-client-7.0.0-8.el7rhgs.x86_64
Storage release version: Red Hat Gluster Volume Manager 3.4.0 (Container)

Image:
brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhgs3/rhgs-volmanager-rhel7:v3.10

How reproducible:
Within one node, it failed for both of the node's 2 devices.
However, it failed on only one of the 2 nodes.

Steps to Reproduce (the corresponding heketi-cli commands are sketched after the list):
1. Create a heketi topology using a couple of devices
2. Create a couple of volumes
3. Add one more device to heketi
4. Disable one of the devices that contains some bricks
5. Remove the disabled device
6. Delete the removed device
7. Add the deleted device back
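
The steps above map to heketi-cli commands roughly as follows (a sketch assuming standard heketi-cli usage; device and node IDs are placeholders):

heketi-cli device add --name /dev/sdX --node <node-id>    # step 3
heketi-cli device disable <device-id>                     # step 4
heketi-cli device remove <device-id>                      # step 5, migrates bricks off the device
heketi-cli device delete <device-id>                      # step 6
heketi-cli device add --name /dev/sdX --node <node-id>    # step 7, fails as described below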

Actual results:
The "re-add device" attempt fails with the following error:
"""
Error: Can't open /dev/sdd exclusively.  Mounted filesystem?
"""
Before that, the "Used space" value did not change after the "remove device" operation.

Expected results:
All bricks are evacuated after the "remove device" operation and "Used space" is set to 0. Adding the device back then succeeds.

Comment 2 Valerii Ponomarov 2018-09-17 14:57:51 UTC
Created attachment 1484065 [details]
Heketi DB dump

Adding Heketi DB dump

Comment 3 Valerii Ponomarov 2018-09-17 14:58:42 UTC
Created attachment 1484066 [details]
Heketi server logs

Adding Heketi server logs

Comment 4 Valerii Ponomarov 2018-09-17 15:01:05 UTC
Also, after seeing the error, I was advised to run the "wipefs" command and retry. It didn't help; results:

[root@vp-ansible-v310-ga-master-0 ~]# heketi-cli device add --name /dev/sdd --node 8966bb38131b92389d55e09862ca9cce
Error: Can't open /dev/sdd exclusively.  Mounted filesystem?
[root@vp-ansible-v310-ga-master-0 ~]# ssh root@vp-ansible-v310-ga-app-cns-1
Last login: Mon Sep 17 08:06:55 2018 from vp-ansible-v310-ga-master-0
[root@vp-ansible-v310-ga-app-cns-1 ~]# wipefs /dev/sdd
offset               type
----------------------------------------------------------------
0x218                LVM2_member   [raid]
                     UUID:  gIk6eH-yHuU-py77-8PCo-lbuk-98fI-K9OveQ
====================================================
[root@vp-ansible-v310-ga-app-cns-1 ~]# wipefs -a /dev/sdd
wipefs: error: /dev/sdd: probing initialization failed: Device or resource busy
====================================================
[root@vp-ansible-v310-ga-app-cns-1 ~]# pvscan
  PV /dev/sdd    VG vg_68c9ef3f5a2e31d2976565f9f187a6cf   lvm2 [99.87 GiB / <97.85 GiB free]
  PV /dev/sda2   VG rhel_dhcp46-210                       lvm2 [<39.00 GiB / 0    free]
  PV /dev/sdb1   VG docker-vol                            lvm2 [<40.00 GiB / 0    free]
  PV /dev/sdf    VG vg_1517317e9a1fc96c830c9493c49833f9   lvm2 [99.87 GiB / <97.85 GiB free]
  PV /dev/sde    VG vg_ca1fd84ddfbd19783537ea7d61e39f9b   lvm2 [199.87 GiB / <196.84 GiB free]
  Total: 5 [<478.61 GiB] / in use: 5 [<478.61 GiB] / in no VG: 0 [0   ]
====================================================
[root@vp-ansible-v310-ga-app-cns-1 ~]# wipefs --force --all /dev/sdf
/dev/sdf: 8 bytes were erased at offset 0x00000218 (LVM2_member): 4c 56 4d 32 20 30 30 31
[root@vp-ansible-v310-ga-app-cns-1 ~]# wipefs --force --all /dev/sdd
/dev/sdd: 8 bytes were erased at offset 0x00000218 (LVM2_member): 4c 56 4d 32 20 30 30 31
====================================================
[root@vp-ansible-v310-ga-app-cns-1 ~]# pvscan
  PV /dev/sda2   VG rhel_dhcp46-210                       lvm2 [<39.00 GiB / 0    free]
  PV /dev/sdb1   VG docker-vol                            lvm2 [<40.00 GiB / 0    free]
  PV /dev/sde    VG vg_ca1fd84ddfbd19783537ea7d61e39f9b   lvm2 [199.87 GiB / <196.84 GiB free]
  Total: 3 [278.86 GiB] / in use: 3 [278.86 GiB] / in no VG: 0 [0   ]
===================================================
[root@vp-ansible-v310-ga-master-0 ~]# heketi-cli device add --name /dev/sdd --node 8966bb38131b92389d55e09862ca9cce
Error: Can't open /dev/sdd exclusively.  Mounted filesystem?
[root@vp-ansible-v310-ga-master-0 ~]# heketi-cli device add --name /dev/sdf --node 8966bb38131b92389d55e09862ca9cce
Error: Can't open /dev/sdf exclusively.  Mounted filesystem?

Comment 5 John Mulligan 2018-09-17 17:33:52 UTC
Do you have the cluster in this state such that I could log on to the cluster and reproduce the error condition myself?

If not, does the problem go away after you either (a) restart the gluster pod or (b) reboot the node?
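
For reference, option (a) would look roughly like this (a sketch only; the "glusterfs" namespace is an assumption based on a default CNS/OCS install):

oc get pods -n glusterfs -o wide                  # find the gluster pod running on the affected node
oc delete pod <glusterfs-pod-name> -n glusterfs   # the daemonset recreates the pod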

Comment 6 Valerii Ponomarov 2018-09-17 17:37:43 UTC
John,

I rebooted neither the pod nor the node. I still have the cluster used for this. I will send you the cluster credentials by email.

Comment 7 John Mulligan 2018-09-17 18:12:10 UTC
It's not so much that Heketi is failing to clean up the node; rather, the actual resources are still in use by the kernel (device mapper).

You can see that certain VGs are missing from the LVM output but are present in the device-mapper list.

== vgs inside the container ==
sh-4.2# vgs
  VG                                  #PV #LV #SN Attr   VSize   VFree   
  docker-vol                            1   1   0 wz--n- <40.00g       0 
  rhel_dhcp46-210                       1   2   0 wz--n- <39.00g       0 
  vg_ca1fd84ddfbd19783537ea7d61e39f9b   1   4   0 wz--n- 199.87g <196.84g

== lvs inside the container ==
sh-4.2# lvs
  LV                                     VG                                  Attr       LSize   Pool                                Origin Data%  Meta%  Move Log Cpy%Sync Convert
  dockerlv                               docker-vol                          -wi-ao---- <40.00g                                                                                   
  root                                   rhel_dhcp46-210                     -wi-ao---- <35.00g                                                                                   
  swap                                   rhel_dhcp46-210                     -wi-a-----   4.00g                                                                                   
  brick_83a6d5e8b1072640197ae231f52a48af vg_ca1fd84ddfbd19783537ea7d61e39f9b Vwi-aotz--   1.00g tp_5c46b2de672ea8f555d523b679a4c2d6        1.39                                   
  brick_aea6efb6d0301b9b35b66e3b7821606e vg_ca1fd84ddfbd19783537ea7d61e39f9b Vwi-aotz--   2.00g tp_534b36a164af13c58dd8d4f14368e7f0        0.70                                   
  tp_534b36a164af13c58dd8d4f14368e7f0    vg_ca1fd84ddfbd19783537ea7d61e39f9b twi-aotz--   2.00g                                            0.70   0.33                            
  tp_5c46b2de672ea8f555d523b679a4c2d6    vg_ca1fd84ddfbd19783537ea7d61e39f9b twi-aotz--   1.00g                                            1.39   0.49                        

== lvs on the node ==
[root@vp-ansible-v310-ga-app-cns-1 ~]# lvs
  LV                                     VG                                  Attr       LSize   Pool                                Origin Data%  Meta%  Move Log Cpy%Sync Convert
  dockerlv                               docker-vol                          -wi-ao---- <40.00g                                                                                   
  root                                   rhel_dhcp46-210                     -wi-ao---- <35.00g                                                                                   
  swap                                   rhel_dhcp46-210                     -wi-a-----   4.00g                                                                                   
  brick_83a6d5e8b1072640197ae231f52a48af vg_ca1fd84ddfbd19783537ea7d61e39f9b Vwi-aotz--   1.00g tp_5c46b2de672ea8f555d523b679a4c2d6        1.39                                   
  brick_aea6efb6d0301b9b35b66e3b7821606e vg_ca1fd84ddfbd19783537ea7d61e39f9b Vwi-aotz--   2.00g tp_534b36a164af13c58dd8d4f14368e7f0        0.70                                   
  tp_534b36a164af13c58dd8d4f14368e7f0    vg_ca1fd84ddfbd19783537ea7d61e39f9b twi-aotz--   2.00g                                            0.70   0.33                            
  tp_5c46b2de672ea8f555d523b679a4c2d6    vg_ca1fd84ddfbd19783537ea7d61e39f9b twi-aotz--   1.00g                                            1.39   0.49                           


== dmsetup ls on the node ==
[root@vp-ansible-v310-ga-app-cns-1 ~]# dmsetup ls
vg_1517317e9a1fc96c830c9493c49833f9-tp_ff12ee2104892099503b087db5b8aefd-tpool   (253:10)
vg_1517317e9a1fc96c830c9493c49833f9-tp_ff12ee2104892099503b087db5b8aefd_tdata   (253:9)
vg_ca1fd84ddfbd19783537ea7d61e39f9b-tp_5c46b2de672ea8f555d523b679a4c2d6-tpool   (253:15)
vg_ca1fd84ddfbd19783537ea7d61e39f9b-tp_5c46b2de672ea8f555d523b679a4c2d6_tdata   (253:14)
vg_ca1fd84ddfbd19783537ea7d61e39f9b-brick_83a6d5e8b1072640197ae231f52a48af      (253:17)
vg_1517317e9a1fc96c830c9493c49833f9-tp_ff12ee2104892099503b087db5b8aefd_tmeta   (253:8)
vg_ca1fd84ddfbd19783537ea7d61e39f9b-tp_5c46b2de672ea8f555d523b679a4c2d6_tmeta   (253:13)
vg_68c9ef3f5a2e31d2976565f9f187a6cf-tp_e04cc4729e00e4ec21aa1b028651d599-tpool   (253:5)
vg_68c9ef3f5a2e31d2976565f9f187a6cf-tp_e04cc4729e00e4ec21aa1b028651d599_tdata   (253:4)
vg_68c9ef3f5a2e31d2976565f9f187a6cf-tp_e04cc4729e00e4ec21aa1b028651d599_tmeta   (253:3)
docker--vol-dockerlv    (253:2)
vg_68c9ef3f5a2e31d2976565f9f187a6cf-brick_e04cc4729e00e4ec21aa1b028651d599      (253:7)
vg_ca1fd84ddfbd19783537ea7d61e39f9b-tp_534b36a164af13c58dd8d4f14368e7f0-tpool   (253:20)
vg_ca1fd84ddfbd19783537ea7d61e39f9b-tp_534b36a164af13c58dd8d4f14368e7f0_tdata   (253:19)
vg_ca1fd84ddfbd19783537ea7d61e39f9b-tp_534b36a164af13c58dd8d4f14368e7f0_tmeta   (253:18)
vg_1517317e9a1fc96c830c9493c49833f9-tp_ff12ee2104892099503b087db5b8aefd (253:11)
vg_ca1fd84ddfbd19783537ea7d61e39f9b-tp_5c46b2de672ea8f555d523b679a4c2d6 (253:16)
vg_ca1fd84ddfbd19783537ea7d61e39f9b-tp_534b36a164af13c58dd8d4f14368e7f0 (253:21)
rhel_dhcp46--210-swap   (253:1)
rhel_dhcp46--210-root   (253:0)
vg_ca1fd84ddfbd19783537ea7d61e39f9b-brick_aea6efb6d0301b9b35b66e3b7821606e      (253:22)
vg_1517317e9a1fc96c830c9493c49833f9-brick_651d564bd98fb500018dadbe35ee60b0      (253:12)


I think this is not directly related to Heketi but rather to the way we're running LVM in the gluster pods. More to come...
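
As a follow-up, the stale mappings could in principle be torn down by hand with dmsetup (a sketch only, using names from the "dmsetup ls" output above; removal order and safety are not verified here):

dmsetup info vg_68c9ef3f5a2e31d2976565f9f187a6cf-brick_e04cc4729e00e4ec21aa1b028651d599     # check the open count first
dmsetup remove vg_68c9ef3f5a2e31d2976565f9f187a6cf-brick_e04cc4729e00e4ec21aa1b028651d599   # remove the leaf device before its thin pool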

Comment 8 Niels de Vos 2018-09-18 13:38:27 UTC
The problem is that wipefs does not completely remove the LVM metadata from the block devices. When the device is scanned again after wipefs, the VolumeGroup and LogicalVolumes may reappear. I have seen this in my test environments on occasion too.

This is what I do to completely remove everything:

# lvremove -y $(cd /dev/mapper ; ls vg_* | sed s,-,/,)   # remove all LVs in the heketi vg_* groups (dm names become vg/lv paths)
# pvremove --force --force -y /dev/vdb /dev/vdc          # force removal of the PV labels
# wipefs --force --all /dev/vdb /dev/vdc                 # wipe any remaining on-disk signatures
# sync                                                   # flush pending writes to disk
# reboot                                                 # drop any stale device-mapper state

It might be overdoing it a bit, but it works for me :)

Comment 9 John Mulligan 2018-09-18 15:22:03 UTC
I can confirm that this is a bug in heketi: it did not successfully delete the VG but removed the device from the db anyway. It should not have allowed this.
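
For context, the fix (heketi-7.0.0-12.el7rhgs) makes heketi surface pvremove/vgremove failures instead of silently dropping the device from the db. Per the Doc Text, the escape hatch for deliberately ignoring such errors is the "--force-forget" flag (the exact flag placement below is an assumption based on the Doc Text wording):

heketi-cli device remove <device-id> --force-forget   # hypothetical invocation; skips pvremove/vgremove errors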

Comment 16 Anjana KD 2018-10-11 07:26:44 UTC
Updated the Doc Text field. Kindly verify.

Comment 19 errata-xmlrpc 2018-10-24 04:51:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2986