Description of problem:
When activating an FC SD it failed with the following error:

2018-07-18 15:59:30,270+0300 INFO (jsonrpc/5) [storage.StoragePool] Connect host #1 to the storage pool 2a742b2a-0c9c-4a5c-85fc-941a29d7185e with master domain: 28916e93-bd23-4e34-8a3b-ded3312161bd (ver = 24) (sp:688)
2018-07-18 15:59:30,587+0300 INFO (jsonrpc/5) [vdsm.api] FINISH connectStoragePool error=[Errno 13] Permission denied: '/dev/28916e93-bd23-4e34-8a3b-ded3312161bd/metadata' from=::ffff:10.35.68.1,37698, task_id=9b7366b7-7f92-44d8-91ea-63575f1a8172 (api:50)

Version-Release number of selected component (if applicable):
RHV 4.2.5

How reproducible:
Always

Steps to Reproduce:
1. Removed the host from RHV-M.
2. Removed vdsm from the server:
   service vdsmd stop
   service supervdsm stop
   umount -l /rhev/data-center/mnt/*
   iscsiadm -m node -u
   service iscsi stop
   service iscsid stop
   yum remove -y vdsm* libvirt*
   rm -rf /etc/vdsm/
   rm -rf /etc/libvirt/
   rm -rf /etc/qemu-kvm/
   rm -rf /etc/pki/vdsm/
3. Added the host back to RHV.
4. The host is up but the SD cannot be activated.

Actual results:
The host is up but the SD cannot be activated.

Expected results:
The host is up and the SD is activated.

Additional info:
Logs will be attached.
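To make the failure easier to inspect, something like the following shows who owns the device node behind the "Permission denied" error. This is only a sketch; the VG UUID is copied from the log above and should be adjusted per host:

# Check owner/group of the storage domain's metadata LV link (VG UUID from the log above)
ls -l /dev/28916e93-bd23-4e34-8a3b-ded3312161bd/metadata
# Resolve the symlink to the underlying /dev/dm-N node and check its ownership and mode
stat -c '%U:%G %a %n' "$(readlink -f /dev/28916e93-bd23-4e34-8a3b-ded3312161bd/metadata)"
# vdsm expects these device nodes to be owned by vdsm:kvm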
+1 on my environment. Encountered the same issue.
^ with a Fibre Channel data store.
I had the same issue on one of the scale team hosts; it was fixed by rebooting the host. The error was bad permissions on the /dev/dm-N device used by the metadata volume. We have udev rules ensuring correct permissions, but maybe something in the particular way the host was removed caused this.
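For reference, a quick way to map the metadata LV to its device-mapper node and check the ownership the udev rules are supposed to set. A sketch only; the VG name is the one from this report:

# List the metadata LV of the storage domain VG and its dm path (VG UUID from this report)
lvs -o vg_name,lv_name,lv_dm_path 28916e93-bd23-4e34-8a3b-ded3312161bd
# Follow the dm path and check that the underlying node is owned by vdsm:kvm
ls -lL "$(lvs --noheadings -o lv_dm_path 28916e93-bd23-4e34-8a3b-ded3312161bd/metadata | tr -d ' ')"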
The host was re-provisioned, and the old host object still existed on the engine. I did a reinstall without success, then removed and re-added the host.
Can you reproduce this when vdsm is removed like this?

1. Remove the host from RHV-M.
2. Uninstall vdsm:
   service vdsmd stop
   service supervdsm stop
   yum remove -y vdsm* libvirt*
3. Add the host back to RHV.

It is not clear why you are removing files manually. Packages should remove their files.
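If you retest with the steps above, it may also help to confirm that the removal actually left nothing behind before re-adding the host. A small check, not part of the original steps:

# After 'yum remove', verify the packages are really gone
rpm -qa 'vdsm*' 'libvirt*'
# And that no configuration directories were left behind
ls -ld /etc/vdsm /etc/libvirt /etc/pki/vdsm 2>/dev/null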
There is a good chance that this issue is caused by bug 1562369. Let's retest this once that bug is verified.
Oops, wrong bug - fixed to bug 1331978
A reboot helped solve this issue. What udev rules did we add?

# udevadm info --query=all --name=/dev/dm-2
P: /devices/virtual/block/dm-2
N: dm-2
L: 10
S: disk/by-id/dm-name-3600a098038304437415d4b6a59676d43
S: disk/by-id/dm-uuid-mpath-3600a098038304437415d4b6a59676d43
S: disk/by-id/lvm-pv-uuid-9dbJgB-a11p-l9OE-cD5d-MJHL-D3Dx-WwdImo
S: mapper/3600a098038304437415d4b6a59676d43
E: DEVLINKS=/dev/disk/by-id/dm-name-3600a098038304437415d4b6a59676d43 /dev/disk/by-id/dm-uuid-mpath-3600a098038304437415d4b6a59676d43 /dev/disk/by-id/lvm-pv-uuid-9dbJgB-a11p-l9OE-cD5d-MJHL-D3Dx-WwdImo /dev/mapper/3600a098038304437415d4b6a59676d43
E: DEVNAME=/dev/dm-2
E: DEVPATH=/devices/virtual/block/dm-2
E: DEVTYPE=disk
E: DM_ACTIVATION=0
E: DM_MULTIPATH_TIMESTAMP=1532874120
E: DM_NAME=3600a098038304437415d4b6a59676d43
E: DM_SUBSYSTEM_UDEV_FLAG0=1
E: DM_SUSPENDED=0
E: DM_UDEV_DISABLE_LIBRARY_FALLBACK_FLAG=1
E: DM_UDEV_PRIMARY_SOURCE_FLAG=1
E: DM_UDEV_RULES_VSN=2
E: DM_UUID=mpath-3600a098038304437415d4b6a59676d43
E: ID_FS_TYPE=LVM2_member
E: ID_FS_USAGE=raid
E: ID_FS_UUID=9dbJgB-a11p-l9OE-cD5d-MJHL-D3Dx-WwdImo
E: ID_FS_UUID_ENC=9dbJgB-a11p-l9OE-cD5d-MJHL-D3Dx-WwdImo
E: ID_FS_VERSION=LVM2 001
E: MAJOR=253
E: MINOR=2
E: MPATH_SBIN_PATH=/sbin
E: SUBSYSTEM=block
E: TAGS=:systemd:
E: USEC_INITIALIZED=48071
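To answer the "what udev rules" question, the rules shipped by the vdsm package can be listed like this (a sketch; assumes the vdsm package is installed on the host):

# List the udev rules files installed by the vdsm package
rpm -ql vdsm | grep -i udev
# Inspect the LVM-related rule found above, e.g.:
cat /usr/lib/udev/rules.d/12-vdsm-lvm.rules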
Lowering priority, since there is an easy workaround and the use case is not clear.

Regarding udev rules, we install this: /usr/lib/udev/rules.d/12-vdsm-lvm.rules

These rules ensure that devices get the correct owner and group (vdsm:kvm) when they are added. It is possible that an existing device lost its owner:group while vdsm was being removed manually, and when the host was added back, the permissions were not fixed because the device did not change.

So generally this sounds like an edge case that might be fixable by triggering udev rules during installation, but I'm not sure it is worth the time. I suggest moving this to 4.3 for now.
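For anyone hitting this and wanting to avoid a reboot, something along these lines may restore the ownership by re-running the udev rules. This is only a sketch of the "trigger udev rules" idea above, not a verified fix; the dm-2 device name is taken from comment 9:

# Re-run udev rules for block devices so 12-vdsm-lvm.rules can reset owner/group
udevadm trigger --action=change --subsystem-match=block
udevadm settle
# Verify the device is now owned by vdsm:kvm (dm-2 from comment 9)
ls -l /dev/dm-2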
Fixing the "Depends On" bug again.
No pending requests from QE.
This bug has not been marked as blocker for oVirt 4.3.0. Since we are releasing it tomorrow, January 29th, this bug has been re-targeted to 4.3.1.
Comment 6 has been unanswered since 2018-07-29. I think we should close this bug, since we don't have enough data to tell whether this is a real issue that affects real use of the system.
Closed per comment #15