Description of problem:
When trying to upgrade a 4.4.1 oVirt Node to oVirt 4.4.2, the upgrade fails due to dangling symlinks from an iSCSI Storage Domain.

How reproducible:
Upgrade an oVirt Node 4.4.1 with iSCSI Storage Domains to 4.4.2.

Steps to Reproduce:
1. Run yum upgrade on the node

Actual results:
  Running scriptlet: ovirt-node-ng-image-update-4.4.2-1.el8.noarch    1/3
Local storage domains were found on the same filesystem as / ! Please migrate the data to a new LV before upgrading, or you will lose the VMs
See: https://bugzilla.redhat.com/show_bug.cgi?id=1550205#c3
Storage domains were found in:
 /rhev/data-center/mnt/blockSD/6e99da85-8414-4ec5-92c3-b6cf741fc125/dom_md
 /rhev/data-center/mnt/blockSD/37a74cff-19be-44a2-98f9-0720745fa4b5/dom_md
 /rhev/data-center/mnt/blockSD/0040a08b-36ea-4bdb-ba93-f4d55321bb97/dom_md
 /rhev/data-center/mnt/blockSD/2c5eef3e-b40e-4ea5-8c97-07e5114381ac/dom_md
error: %prein(ovirt-node-ng-image-update-4.4.2-1.el8.noarch) scriptlet failed, exit status 1

Expected results:
The symlinks should have been cleaned/ignored, so the upgrade can complete.

Additional info:
This was introduced in https://bugzilla.redhat.com/show_bug.cgi?id=1850378
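
A quick way to see what the failing check is flagging on a host like this (these commands are only an illustration, not taken from the scriptlet itself): list the dom_md directories it reports and the device backing /.

# ls -ld /rhev/data-center/mnt/blockSD/*/dom_md
# findmnt -no SOURCE /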
Nir Soffer, any chance this is a vdsm bug not removing symlinks on block storage domain deactivation?
When checking the dangling symlinks, it seems they all belong to already removed snapshots of VMs. The logs also don't say anything about removing the LV after merging the snapshot. There are also symlinks still pointing to an 'active' /dev/xxx/xxx, but that LV doesn't exist anymore:

# fdisk -l /dev/6e99da85-8414-4ec5-92c3-b6cf741fc125/06a5da70-f29c-41f2-a063-aa22677e7bdc
fdisk: cannot open /dev/6e99da85-8414-4ec5-92c3-b6cf741fc125/06a5da70-f29c-41f2-a063-aa22677e7bdc: Input/output error
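
Two commands that make this kind of check more direct (a suggestion on my side, assuming, as the /dev path above suggests, that the VG name is the storage domain UUID): the first lists broken symlinks under the block SD mount tree, the second asks LVM directly whether the LV behind such a link still exists.

# find /rhev/data-center/mnt/blockSD/ -xtype l
# lvs 6e99da85-8414-4ec5-92c3-b6cf741fc125/06a5da70-f29c-41f2-a063-aa22677e7bdc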
The issue described in this bug seems to be the expected response to the upgrade: after Bug 1850378 was fixed, the host upgrade is blocked whenever local storage is defined (or found) on the root filesystem (/).
Can you please specify how you migrated the snapshot, and what the status of the folders under /rhev/data-center/mnt/blockSD/ is?

The root cause of what you are encountering is that RHVH will not upgrade on a disk that still holds any relevant data from the previous installation, to avoid data loss such as snapshots and VMs that have not been migrated. If you did the migration manually, you should handle all the content of those folders yourself (the folders themselves can be left as empty folders). If you used the admin pages, we need to make sure we clean up all links after migration, but to know that we first need to understand how you got into this situation.
Hi Nir,

Well, the snapshots were not migrated. They are old symlinks from already removed/merged snapshots.

It's quite easy to reproduce:
1. Create a VM on an iSCSI Storage Domain
2. Create a snapshot on it
3. Delete the snapshot
4. You'll see dangling symlinks

Seems like RemoveSnapshotSingleDiskLive does not properly remove the symlinks.
(In reply to Jean-Louis Dupond from comment #0)
> Description of problem:
> When trying to upgrade a 4.4.1 oVirt Node to oVirt 4.4.2 the upgrade fails
> due to dangling symlinks from iSCSI Storage Domain.
>
> How reproducible:
> Upgrade a oVirt Node 4.4.1 with iSCSI Storage domains to 4.4.2
>
> Steps to Reproduce:
> 1. Run yum upgrade on the node
>
> Actual results:
>   Running scriptlet: ovirt-node-ng-image-update-4.4.2-1.el8.noarch
> 1/3
> Local storage domains were found on the same filesystem as / ! Please
> migrate the data to a new LV before upgrading, or you will lose the VMs
> See: https://bugzilla.redhat.com/show_bug.cgi?id=1550205#c3
> Storage domains were found in:
>  /rhev/data-center/mnt/blockSD/6e99da85-8414-4ec5-92c3-b6cf741fc125/dom_md
>  /rhev/data-center/mnt/blockSD/37a74cff-19be-44a2-98f9-0720745fa4b5/dom_md
>  /rhev/data-center/mnt/blockSD/0040a08b-36ea-4bdb-ba93-f4d55321bb97/dom_md
>  /rhev/data-center/mnt/blockSD/2c5eef3e-b40e-4ea5-8c97-07e5114381ac/dom_md

This check in the scriptlet is wrong. These are not local storage domains but block storage domains. The check for local storage domains should exclude /rhev/; local storage domains are never created in this location.

Local storage domains can be created anywhere outside /rhev. For every local FS storage domain we will have a symlink:

    /rhev/data-center/mnt/_path_to_local_dir -> /path/to/local/dir

> error: %prein(ovirt-node-ng-image-update-4.4.2-1.el8.noarch) scriptlet
> failed, exit status 1
>
> Expected results:
> The symlinks should have been cleaned/ignored, so the upgrade can complete.

Removing the symlinks would be nice, but it is not required for normal operation of the system.

> Additional info:
> This was introduced in https://bugzilla.redhat.com/show_bug.cgi?id=1850378

Correct, the fix for that bug is incorrect.
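
To illustrate the distinction, a minimal sketch of what a local-storage-domain check along those lines could look like (this is my own illustration, not the actual scriptlet code; the loop layout and the comparison via device numbers are assumptions):

# Sketch only: find local FS storage domains that live on the root filesystem,
# while skipping blockSD, which only holds symlinks into block (iSCSI/FC) domains.
root_dev=$(stat -c %d /)
for link in /rhev/data-center/mnt/*; do
    # Block storage domains are never local storage domains.
    [ "$(basename "$link")" = "blockSD" ] && continue
    # Local FS domains show up here as symlinks to a directory outside /rhev.
    [ -L "$link" ] || continue
    target=$(readlink -f "$link") || continue
    [ -d "$target" ] || continue
    if [ "$(stat -c %d "$target")" = "$root_dev" ]; then
        echo "Local storage domain on the root filesystem: $target"
    fi
done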
(In reply to Sandro Bonazzola from comment #1)
> Nir Soffer, any chance this is a vdsm bug not removing symlinks on block
> storage domain deactivation?

Vdsm never removes symlinks in /rhev/data-center/mnt/blockSD/*/. It would be nice to remove them, but we can never guarantee that the links are removed, for example if vdsm is killed.

The issue in this bug is the wrong search for local FS storage domains.
The documentation text flag should only be set after 'doc text' field is provided. Please provide the documentation text and set the flag to '?' again.
$ git tag --contains a6ed080d886db7db73d912a68858e2e52558fc04
ovirt-node-ng-image-4.4.3
The bug has been resolved in "ovirt-node-ng-image-update-4.4.3-1.el8".

Test Version:
host: ovirt-node-ng-image-update-4.4.3-1.el8
oVirt: 4.4.1.4-1.el8

Test Steps:
1. Install ovirt-node-ng-installer-4.4.1-2020072310.el8.iso on an iSCSI machine
2. Set up local repos pointing to "ovirt-node-ng-image-update-4.4.3-1.el8.noarch.rpm"
3. Add the host to oVirt
4. Add an iSCSI storage domain and wait for its status to become "Active"
5. Create a VM on the iSCSI Storage Domain
6. Create a snapshot on it
7. Delete the snapshot
8. Move the host to maintenance mode
9. Upgrade the host
   # yum update
   # reboot
10. Activate the host via oVirt
11. Start the VM

Actual results:
Upgrade is successful. The status of the iSCSI storage domain is "Active" and the VM starts up successfully after the upgrade.

Moving the bug status to "VERIFIED".
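
For completeness, a possible extra sanity check after step 9 (a suggestion only, not part of the test plan above) to confirm the new image layer is installed:

# rpm -q ovirt-node-ng-image-update
# nodectl info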