Bug 2152186

Summary: pacemaker VirtualDomain live migration fails
Product: Red Hat Enterprise Linux 8
Reporter: michal novacek <mnovacek>
Component: resource-agents
Assignee: Oyvind Albrigtsen <oalbrigt>
Status: ASSIGNED
QA Contact: cluster-qe <cluster-qe>
Severity: high
Priority: unspecified
Version: 8.7
CC: agk, cluster-maint, fdinitto, jdenemar, oalbrigt, peterx
Target Milestone: rc
Flags: dgilbert: needinfo? (jdenemar)
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Clone Of: 2132247
Type: Bug
Bug Depends On: 2132247

Comment 2 Dr. David Alan Gilbert 2023-01-12 11:45:08 UTC
Please state which RHEL8 version this worked on - did it work on 8.2?
Please provide libvirt XML for the domain that fails.

Comment 5 Dr. David Alan Gilbert 2023-01-19 19:55:32 UTC
Jiri:
  Should the attached XML work for this? I see it's got the svirt labels in the XML, but it's marked dynamic, which isn't what you pointed to in bz 2132247?

Comment 7 Dr. David Alan Gilbert 2023-01-25 19:01:01 UTC
OK, having downgraded my qemu to 8.2, I can confirm the live migration works:

kernel: 4.18.0-193.el8.x86_64
qemu-kvm-2.12.0-99.module+el8.2.0+7988+c1d02dbb.4.x86_64
libvirt-8.0.0-10.1.module+el8.7.0+17192+cbc2449b.x86_64

so it looks like the qemu version does make a difference here.

Comment 8 Dr. David Alan Gilbert 2023-01-26 18:05:56 UTC
8.3 fails:
kernel: 4.18.0-193.el8.x86_64
qemu-kvm-4.2.0-34.module+el8.3.0+10437+1ca0c2ba.5.x86_64
libvirt-8.0.0-10.1.module+el8.7.0+17192+cbc2449b.x86_64


Migration: [100 %]error: internal error: unable to execute QEMU command 'cont': Failed to get "write" lock

so not quite the current failure, but still a failure
dest:
Jan 26 11:46:42 dgbz2132247rh82n2 libvirtd[1234610]: hostname: dgbz2132247rh82n2
Jan 26 11:46:42 dgbz2132247rh82n2 libvirtd[1234610]: internal error: unable to execute QEMU command 'cont': Failed to get "write" lock
source:    
Jan 26 11:46:43 dgbz2132247rh82n1 libvirtd[1236335]: internal error: unable to execute QEMU command 'cont': Could not reopen qcow2 layer: Could not read qc>
Jan 26 11:46:43 dgbz2132247rh82n1 libvirtd[1236335]: Failed to resume guest L2rhel9.1 after failure

Jan 26 11:46:47 dgbz2132247rh82n1 setroubleshoot[1237076]: SELinux is preventing live_migration from read access on the file rhel-guest-image-forl2.qcow2. >
Jan 26 11:46:47 dgbz2132247rh82n1 platform-python[1237076]: SELinux is preventing live_migration from read access on the file rhel-guest-image-forl2.qcow2.
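
Since the failure mode differs between builds, it may help to count which error signatures show up in a given journal capture. The following is only a rough sketch: the patterns are copied from the messages quoted in this bug, while the labels and the classify() helper are made-up names for illustration and are not part of libvirt, QEMU or any existing tool.

```python
import re

# Error signatures taken from the log excerpts quoted in this bug.
# The dictionary keys are hypothetical labels chosen for this sketch.
SIGNATURES = {
    "write_lock": re.compile(r'Failed to get "write" lock'),
    "qcow2_reopen": re.compile(r"Could not reopen qcow2 layer"),
    "selinux_denial": re.compile(r"SELinux is preventing \S+ from \w+ access"),
    "socket_read": re.compile(r"Unable to read from socket: Bad file descriptor"),
}


def classify(lines):
    """Count how many lines match each known failure signature."""
    counts = {label: 0 for label in SIGNATURES}
    for line in lines:
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[label] += 1
    return counts


# Sample lines shortened from the 8.3 run above (truncated text left truncated).
sample = [
    "libvirtd[1234610]: internal error: unable to execute QEMU command 'cont': "
    'Failed to get "write" lock',
    "libvirtd[1236335]: internal error: unable to execute QEMU command 'cont': "
    "Could not reopen qcow2 layer: Could not read qc>",
    "setroubleshoot[1237076]: SELinux is preventing live_migration from read "
    "access on the file rhel-guest-image-forl2.qcow2.",
]

if __name__ == "__main__":
    for label, count in classify(sample).items():
        print(f"{label}: {count}")
```

Feeding it the source and destination journals separately would show, for example, whether the SELinux denial appears only on the source side.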

Comment 9 Dr. David Alan Gilbert 2023-01-26 18:34:30 UTC
8.4 fails in a way closer to this report, even though it's only a minor qemu bump:

qemu-kvm-4.2.0-48.module+el8.4.0+11909+3300d70f.3.x86_64


source:
2023-01-26T18:25:16.100012Z qemu-kvm: Unable to read from socket: Bad file descriptor
2023-01-26T18:25:16.100141Z qemu-kvm: Unable to read from socket: Bad file descriptor
2023-01-26T18:25:16.100146Z qemu-kvm: Unable to read from socket: Bad file descriptor

Comment 10 Dr. David Alan Gilbert 2023-01-26 19:01:49 UTC
I'm not seeing any obvious source changes between -34 and -48 to do with block or migration behaviour.
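
For reference, the qemu-kvm builds tried so far can be tabulated from their package strings. This is a minimal sketch, assuming the module-stream naming seen in the comments above; the regex and the parse() helper are illustrative only and are not a general RPM name parser.

```python
import re

# Package strings and outcomes as reported in the comments of this bug.
TESTED = [
    ("qemu-kvm-2.12.0-99.module+el8.2.0+7988+c1d02dbb.4.x86_64", "works"),
    ("qemu-kvm-4.2.0-34.module+el8.3.0+10437+1ca0c2ba.5.x86_64", "fails"),
    ("qemu-kvm-4.2.0-48.module+el8.4.0+11909+3300d70f.3.x86_64", "fails"),
]

# Assumed layout: qemu-kvm-<version>-<build>.module+el<major.minor>...
NVR = re.compile(
    r"^qemu-kvm-(?P<version>[\d.]+)-(?P<build>\d+)\.module\+el(?P<rhel>\d+\.\d+)"
)


def parse(pkg):
    """Split a qemu-kvm package string into (version, build, rhel_minor)."""
    match = NVR.match(pkg)
    if match is None:
        raise ValueError(f"unrecognised package string: {pkg}")
    return match.group("version"), int(match.group("build")), match.group("rhel")


if __name__ == "__main__":
    for pkg, outcome in TESTED:
        version, build, rhel = parse(pkg)
        print(f"RHEL {rhel}  qemu-kvm {version}-{build}  {outcome}")
```

Laid out this way, the working-to-failing boundary sits between 2.12.0-99 and 4.2.0-34, while -34 and -48 both fail but with different symptoms.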