Bug 1602776

Summary: Fail to activate FC SD with Permission denied error on metadata file
Product: Red Hat Enterprise Virtualization Manager
Component: vdsm
Version: 4.2.5
Status: CLOSED INSUFFICIENT_DATA
Severity: medium
Priority: unspecified
Reporter: guy chen <guchen>
Assignee: Amit Bawer <abawer>
QA Contact: Avihai <aefrat>
CC: dagur, guchen, lsurette, nsoffer, srevivo, tnisan, ycui
Hardware: Unspecified
OS: Unspecified
Target Milestone: ovirt-4.4.1
oVirt Team: Storage
Type: Bug
Bug Depends On: 1163890, 1544370
Last Closed: 2019-12-19 13:48:03 UTC

Description guy chen 2018-07-18 13:10:36 UTC
Description of problem:
Activating an FC SD fails with the following error:

2018-07-18 15:59:30,270+0300 INFO  (jsonrpc/5) [storage.StoragePool] Connect host #1 to the storage pool 2a742b2a-0c9c-4a5c-85fc-941a29d7185e with master domain: 28916e93-bd23-4e34-8a3b-ded3312161bd (ver = 24) (sp:688)
2018-07-18 15:59:30,587+0300 INFO  (jsonrpc/5) [vdsm.api] FINISH connectStoragePool error=[Errno 13] Permission denied: '/dev/28916e93-bd23-4e34-8a3b-ded3312161bd/metadata' from=::ffff:10.35.68.1,37698, task_id=9b7366b7-7f92-44d8-91ea-63575f1a8172 (api:50)
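
A quick way to confirm the permission problem on the affected host (a diagnostic sketch; the VG path is taken from the error above, and vdsm:kvm is the expected owner:group for VDSM-managed LVs):

ls -lL /dev/28916e93-bd23-4e34-8a3b-ded3312161bd/metadata      # -L dereferences the LV symlink and shows the owner:group of the backing device
readlink -f /dev/28916e93-bd23-4e34-8a3b-ded3312161bd/metadata # resolve the symlink to the underlying /dev/dm-N node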


Version-Release number of selected component (if applicable):
rhv 4.2.5

How reproducible:
Always

Steps to Reproduce:
1. Removed the host from RHV-M
2. Removed vdsm from the server:
service vdsmd stop
service supervdsm stop
umount -l /rhev/data-center/mnt/*
iscsiadm -m node -u  
service iscsi stop 
service iscsid stop
yum remove -y vdsm* libvirt*
rm -rf /etc/vdsm/
rm -rf /etc/libvirt/
rm -rf /etc/qemu-kvm/ 
rm -rf /etc/pki/vdsm/
3. Added the host back to RHV
4. The host is up, but the SD cannot be activated

Actual results:
The host is up, but the SD cannot be activated.

Expected results:
The host is up and the SD is activated.

Additional info:
logs will be attached

Comment 2 Mor 2018-07-29 13:07:24 UTC
+1 on my environment. Encountered the same issue.

Comment 3 Mor 2018-07-29 13:09:55 UTC
^ with a Fibre Channel data store.

Comment 4 Nir Soffer 2018-07-29 14:17:28 UTC
I had the same issue on one of the scale team hosts; it was fixed by rebooting the host.

The error was bad permissions on the /dev/dm-N device used by the metadata volume.
We have udev rules ensuring correct permissions, but maybe something in the
particular way the host was removed caused this issue.
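
To confirm the bad-permission state without rebooting, a check along these lines should do (a sketch, assuming GNU coreutils stat):

stat -c '%U:%G %a %n' /dev/dm-*    # VDSM-managed LVs should show vdsm:kvm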

Comment 5 Mor 2018-07-29 14:20:50 UTC
The host was re-provisioned while the old host object still existed on the engine. I did a reinstall without success, then removed and re-added the host.

Comment 6 Nir Soffer 2018-07-29 14:22:23 UTC
Can you reproduce this when vdsm is removed like this?

1. Remove the host from RHV-M

2. Uninstall vdsm:

service vdsmd stop
service supervdsm stop
yum remove -y vdsm* libvirt*

3. Add the host back to RHV

It is not clear why you are removing files manually; packages should remove their own files.

Comment 7 Nir Soffer 2018-07-29 14:28:57 UTC
There is a good chance that this issue is caused by bug 1562369.

Let's retest this when that bug is verified.

Comment 8 Nir Soffer 2018-07-29 14:31:45 UTC
Oops, wrong bug - corrected to bug 1331978.

Comment 9 Mor 2018-07-29 14:33:29 UTC
A reboot solved this issue.
Which udev rules did we add?
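
One way to check, assuming the rules are shipped in the vdsm package itself:

rpm -ql vdsm | grep -i udev    # list files installed by the vdsm package, filtered to udev rules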

# udevadm info --query=all --name=/dev/dm-2
P: /devices/virtual/block/dm-2
N: dm-2
L: 10
S: disk/by-id/dm-name-3600a098038304437415d4b6a59676d43
S: disk/by-id/dm-uuid-mpath-3600a098038304437415d4b6a59676d43
S: disk/by-id/lvm-pv-uuid-9dbJgB-a11p-l9OE-cD5d-MJHL-D3Dx-WwdImo
S: mapper/3600a098038304437415d4b6a59676d43
E: DEVLINKS=/dev/disk/by-id/dm-name-3600a098038304437415d4b6a59676d43 /dev/disk/by-id/dm-uuid-mpath-3600a098038304437415d4b6a59676d43 /dev/disk/by-id/lvm-pv-uuid-9dbJgB-a11p-l9OE-cD5d-MJHL-D3Dx-WwdImo /dev/mapper/3600a098038304437415d4b6a59676d43
E: DEVNAME=/dev/dm-2
E: DEVPATH=/devices/virtual/block/dm-2
E: DEVTYPE=disk
E: DM_ACTIVATION=0
E: DM_MULTIPATH_TIMESTAMP=1532874120
E: DM_NAME=3600a098038304437415d4b6a59676d43
E: DM_SUBSYSTEM_UDEV_FLAG0=1
E: DM_SUSPENDED=0
E: DM_UDEV_DISABLE_LIBRARY_FALLBACK_FLAG=1
E: DM_UDEV_PRIMARY_SOURCE_FLAG=1
E: DM_UDEV_RULES_VSN=2
E: DM_UUID=mpath-3600a098038304437415d4b6a59676d43
E: ID_FS_TYPE=LVM2_member
E: ID_FS_USAGE=raid
E: ID_FS_UUID=9dbJgB-a11p-l9OE-cD5d-MJHL-D3Dx-WwdImo
E: ID_FS_UUID_ENC=9dbJgB-a11p-l9OE-cD5d-MJHL-D3Dx-WwdImo
E: ID_FS_VERSION=LVM2 001
E: MAJOR=253
E: MINOR=2
E: MPATH_SBIN_PATH=/sbin
E: SUBSYSTEM=block
E: TAGS=:systemd:
E: USEC_INITIALIZED=48071

Comment 10 Nir Soffer 2018-07-29 14:52:09 UTC
Lowering priority, since there is an easy workaround, and the use case is not clear.

Regarding udev rules, we install this:
/usr/lib/udev/rules.d/12-vdsm-lvm.rules

These rules ensure that devices get the correct owner and group (vdsm:kvm) when
they are added. It is possible that an existing device lost its owner:group
while vdsm was being removed manually, and when the host was added back, the
permissions were not fixed since the device did not change.
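
For illustration only (this is not the verbatim contents of 12-vdsm-lvm.rules), a rule of this general shape sets the ownership when a device event is processed:

SUBSYSTEM=="block", ENV{DM_LV_NAME}=="metadata", OWNER="vdsm", GROUP="kvm", MODE="0660"

Since rules only run when an event is processed, a device that already exists with the wrong ownership keeps it until the next event for that device.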

So generally this sounds like an edge case that may be possible to fix by
triggering the udev rules during installation, but I'm not sure it is worth the time.
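
Triggering the rules manually would look roughly like this (udevadm trigger/settle are standard commands; using them at install time for this purpose is an untested sketch):

udevadm trigger --subsystem-match=block --action=change    # replay "change" events for block devices so the rules re-apply ownership
udevadm settle                                             # wait until the udev event queue is empty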

I suggest moving this to 4.3 for now.

Comment 11 Nir Soffer 2018-07-31 22:11:17 UTC
Fixing the "Depends On" bug again.

Comment 12 Daniel Gur 2018-12-30 09:38:06 UTC
No pending requests from QE.

Comment 13 Sandro Bonazzola 2019-01-28 09:39:49 UTC
This bug has not been marked as blocker for oVirt 4.3.0.
Since we are releasing it tomorrow, January 29th, this bug has been re-targeted to 4.3.1.

Comment 15 Nir Soffer 2019-12-19 13:32:35 UTC
Comment 6 has been unanswered since 2018-07-29. I think we should close
this bug, since we don't have enough data to tell whether this is a real issue
that may affect real use of the system.

Comment 16 Amit Bawer 2019-12-19 13:48:03 UTC
Closed per comment #15