Note: This bug is displayed in read-only format because
the product is no longer active in Red Hat Bugzilla.
RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Description of problem:
Trying to move resource:
-> $ pcs resource move c8kubermaster2 swir
Location constraint to move resource 'c8kubermaster2' has been created
Waiting for the cluster to apply configuration changes...
Location constraint created to move resource 'c8kubermaster2' has been removed
Waiting for the cluster to apply configuration changes...
Error: resource 'c8kubermaster2' is running on node 'whale'
Error: Errors have occurred, therefore pcs is unable to continue
VM store is on mounted a GlusterFS volume via fuse (now when libgfapi is removed/deprecated)
'virtsh' migrates a VM with '--unsafe' just fine, but adding this to the resource:
-> $ pcs resource update c8kubermaster2 attr migrate_options="--unsafe"
makes _no_ difference.
Should be very easy to reproduce.
Seem that moving a VirtualDomain resource between nodes is completely broken.
many thanks, L.
Version-Release number of selected component (if applicable):
resource-agents-4.10.0-4.el9.x86_64
How reproducible:
Steps to Reproduce:
1.
2.
3.
Actual results:
Expected results:
Additional info:
I'm looking at something else very strange, I see:
-> $ pcs constraint config | lesi
...
Resource: c8kubermaster2
Enabled on:
Node: whale (score:INFINITY)
and even though I do: 'clear' & 'cleanup' that constrain remains there, until I deleted the resource & re-create, now I can 'move' the resource again, albeit not! as 'live' migration.
Also 'setenforce' seems to make no difference.(unless some silent denials do)
In new C9 there is a number of vir* services which replace libvirtd.service - looking at virtqemud.service I see:
..
2022-01-05 16:58:16.399+0000: 644190: warning : virSecurityValidateTimestamp:206 : Invalid XATTR timestamp detected on /VMs3/c8kubermaster2.qcow2 secdriver=dac
internal error: unable to execute QEMU command 'cont': Failed to get "write" lock
'locking' problem affect other bits outside of PCS, backups, snapshots of VMs, now with only-via-fuse method. (unless there is some way to fuse-mount glusterFS with does the trick)
thanks, L.
Thank you for reporting this issue. After looking into it in more detail, I'm pretty sure that I know what is causing this. There is a bug in the new implementation of `pcs resource move` introduced in pcs-0.11 (see pcs man page, section changes in pcs-0.11, for more details) which in some cases will not move the resource. However, the old implementation of the move command is still available as `pcs resource move-with-constraint` which can be used a a workaround for now. Another option is to run `pcs resource clear <resource> <node>` just before `pcs resource move`.
Yes, but really the issue I care about reporting this BZ is LIVE migration/move of VirtualDomain - even if it's not really a BUG on PCS part - and possible ways for PCS to fix/improve that.
With Qemu/Libvirt versions still with 'libgfapi' support LIVE migration works smoothly but with new version where 'libgfapi' is removed only way is to fuse-mount GlusterFS volumes, it's broken, LIVE move fails-over to shutdown/start - which is, well, what it is.
from log:
...
internal error: unable to execute QEMU command 'cont': Failed to get "write" lock
...
thanks, L.