Bug 2132247 - pacemaker VirtualDomain live migration fails
Summary: pacemaker VirtualDomain live migration fails
Keywords:
Status: ASSIGNED
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: resource-agents
Version: 9.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Assignee: Oyvind Albrigtsen
QA Contact: cluster-qe
URL:
Whiteboard:
Depends On:
Blocks: 2152186
 
Reported: 2022-10-05 08:46 UTC by michal novacek
Modified: 2023-08-10 15:39 UTC
CC List: 5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 2152186
Environment:
Last Closed:
Type: Bug
Target Upstream Version:
Embargoed:


Links:
Red Hat Issue Tracker RHELPLAN-135632 (last updated 2022-10-05 10:05:07 UTC)

Description michal novacek 2022-10-05 08:46:48 UTC
Description of problem:
VirtualDomain live migration does not work and always ends up with the VM being rebooted.

It seems to me that the migration command issued by the resource agent fails, which causes this:

Oct 05 10:19:14  VirtualDomain(pool-10-37-167-241)[1104724]:    INFO: pool-10-37-167-241: Starting live migration to picard-02 (using: virsh --connect=qemu:///system --quiet migrate --live  pool-10-37-167-241 qemu+ssh://picard-02/system ).
Oct 05 10:19:21  VirtualDomain(pool-10-37-167-241)[1104724]:    ERROR: pool-10-37-167-241: live migration to picard-02 failed: 1
Oct 05 10:19:21  VirtualDomain(pool-10-37-167-241)[1104977]:    INFO: Issuing graceful shutdown request for domain pool-10-37-167-241.


Version-Release number of selected component (if applicable): 
pacemaker-2.1.4-5.el9.x86_64
resource-agents-4.10.0-23.el9.x86_64
libvirt-8.5.0-7.el9_1.x86_64

How reproducible: always


Steps to Reproduce:
1. pcs resource move pool-10-37-167-241

Actual results: offline migration

Expected results: live migration

Additional info:

running "virsh --connect=qemu:///system --quiet migrate --live  pool-10-37-167-241 qemu+ssh://picard-02/system" manually will just end up with error:

[root@riker-01 ~]# virsh --connect=qemu:///system migrate --live  pool-10-37-167-242 qemu+ssh://10.37.165.56/system --verbose 
Migration: [ 94 %]error: operation failed: job 'migration out' unexpectedly failed

Comment 2 Dr. David Alan Gilbert 2022-11-10 11:26:12 UTC
I tried to reproduce this without libvirt, but done that way it works:

on riker:
/usr/libexec/qemu-kvm -m 2048 -cpu host -smp 2 -machine pc-q35-rhel9.0.0 -accel kvm -nographic -blockdev '{"driver":"file","filename":"/var/lib/libvirt/images/rhel9.1","node-name":"libvirt-3-storage","auto-read-only":true,"discard":"unmap"}' -blockdev '{"node-name":"libvirt-3-format","read-only":true,"driver":"qcow2","file":"libvirt-3-storage","backing":null}' -blockdev '{"driver":"file","filename":"/var/lib/libvirt/images/pool-10-37-167-238-delta","node-name":"libvirt-2-storage","auto-read-only":true,"discard":"unmap"}' -blockdev '{"node-name":"libvirt-2-format","read-only":false,"driver":"qcow2","file":"libvirt-2-storage","backing":"libvirt-3-format"}' -device '{"driver":"virtio-blk-pci","drive":"libvirt-2-format","id":"virtio-disk0","bootindex":1}'

on picard:
/usr/libexec/qemu-kvm -m 2048 -cpu host -smp 2 -machine pc-q35-rhel9.0.0 -accel kvm -nographic -blockdev '{"driver":"file","filename":"/var/lib/libvirt/images/rhel9.1","node-name":"libvirt-3-storage","auto-read-only":true,"discard":"unmap"}' -blockdev '{"node-name":"libvirt-3-format","read-only":true,"driver":"qcow2","file":"libvirt-3-storage","backing":null}' -blockdev '{"driver":"file","filename":"/var/lib/libvirt/images/pool-10-37-167-238-delta","node-name":"libvirt-2-storage","auto-read-only":true,"discard":"unmap"}' -blockdev '{"node-name":"libvirt-2-format","read-only":false,"driver":"qcow2","file":"libvirt-2-storage","backing":"libvirt-3-format"}' -device '{"driver":"virtio-blk-pci","drive":"libvirt-2-format","id":"virtio-disk0","bootindex":1}' -incoming tcp::4444

Let the VM boot on the source, then use ctrl-a c to switch to the monitor:
(qemu)   migrate_set_parameter max-bandwidth 10G
(qemu)   migrate -d tcp:picard-02:4444

After a few seconds the VM is happily running on the destination; I bounced it back in the other direction as well.
So the basics seem OK for that stripped-down setup.
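
For reference, the progress of a migration started with "migrate -d" can be watched from the same monitor with the standard HMP status command:

(qemu)   info migrate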

Comment 3 Dr. David Alan Gilbert 2022-11-10 14:18:31 UTC
Seen on the source:

2022-11-10 14:00:44.189+0000: 127919: debug : qemuMonitorJSONIOProcessLine:195 : Line [{"timestamp": {"seconds": 1668088844, "microseconds": 189656}, "event": "MIGRATION_PASS", "data": {"pass": 3}}]
2022-11-10 14:00:44.189+0000: 127919: info : qemuMonitorJSONIOProcessLine:209 : QEMU_MONITOR_RECV_EVENT: mon=0x7f0988083460 event={"timestamp": {"seconds": 1668088844, "microseconds": 189656}, "event": "MIGRATION_PASS", "data": {"pass": 3}}

2022-11-10 14:00:44.234+0000: 127919: info : qemuMonitorIOWrite:382 : QEMU_MONITOR_IO_WRITE: mon=0x7f0988083460 buf={"execute":"query-migrate","id":"libvirt-427"}^M
2022-11-10 14:00:44.303+0000: 127919: info : qemuMonitorJSONIOProcessLine:209 : QEMU_MONITOR_RECV_EVENT: mon=0x7f0988083460 event={"timestamp": {"seconds": 1668088844, "microseconds": 303047}, "event": "MIGRATION", "data": {"status": "failed"}}

So the source thinks something failed within those few milliseconds.

On the destination we see:
2022-11-10 14:00:39.232+0000: 120615: info : qemuMonitorJSONIOProcessLine:209 : QEMU_MONITOR_RECV_EVENT: mon=0x7f08fc083460 event={"timestamp": {"seconds": 1668088839, "microseconds": 232939}, "event": "MIGRATION", "data": {"status": "active"}}
2022-11-10 14:00:44.338+0000: 120543: debug : virThreadJobSet:93 : Thread 120543 (rpc-virtqemud) is now running job remoteDispatchDomainMigrateFinish3Params
2022-11-10 14:00:44.338+0000: 120543: debug : qemuDomainObjStartJobPhase:765 : Starting phase 'finish3' of 'migration in' job
2022-11-10 14:00:44.338+0000: 120543: debug : qemuDomainObjSetJobPhase:734 : Setting 'migration in' phase to 'finish3'

2022-11-10 14:00:44.339+0000: 120615: debug : qemuMonitorJSONIOProcessLine:195 : Line [{"return": {"status": "inmigrate", "singlestep": false, "running": false}, "id": "libvirt-408"}]
2022-11-10 14:00:44.339+0000: 120615: info : qemuMonitorJSONIOProcessLine:214 : QEMU_MONITOR_RECV_REPLY: mon=0x7f08fc083460 reply={"return": {"status": "inmigrate", "singlestep": false, "running": false}, "id": "libvirt-408"}

So the source seems to have seen a failure, but the destination was apparently happy - I *think* the destination believes the migration finished.

Comment 6 Jiri Denemark 2022-11-10 15:48:23 UTC
Apparently the migration is almost finished as the reported failure happens
very shortly after

    2022-11-10 14:00:44.189+0000: 127919: info : qemuMonitorJSONIOProcessLine:209 : QEMU_MONITOR_RECV_EVENT: mon=0x7f0988083460 event={"timestamp": {"seconds": 1668088844, "microseconds": 189440}, "event": "MIGRATION", "data": {"status": "device"}}

(this event is accompanied by the MIGRATION_PASS event mentioned in comment
#3), that is, QEMU is migrating the device state after transferring all memory.
Interestingly, only 4 seconds after the guest CPUs were resumed (as a recovery
from the failed migration) the domain crashes:

    2022-11-10 14:00:48.756+0000: 127919: debug : qemuMonitorIO:556 : Error on monitor <null> mon=0x7f0988083460 vm=0x55d5f04ca6a0 name=pool-10-37-167-238
    2022-11-10 14:00:48.958+0000: 127981: debug : processMonitorEOFEvent:4113 : Monitor connection to 'pool-10-37-167-238' closed without SHUTDOWN event; assuming the domain crashed

That finish3 phase from comment #3 is started by

    2022-11-10 14:00:44.338+0000: 120543: debug : virDomainMigrateFinish3Params:5425 : dconn=0x7f08fc015190, params=0x7f09380030f0, nparams=3, cookiein=(nil), cookieinlen=0, cookieout=0x7f0942ee78d0, cookieoutlen=0x7f0942ee78c4, flags=0x1, cancelled=1

where cancelled=1 means this API is called to tell the destination that the
migration failed on the source and that the domain on the destination host
needs to be killed.

So it looks like something bad happened when the device state was supposed to
be transferred. But there's no info about what happened, as even "query-migrate"
called just after the failure only says "status": "failed" without any error
message. Perhaps the QEMU log for the domain on the source host could help?
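
For anyone retracing this, the per-domain QEMU log and the raw query-migrate reply can be pulled with standard libvirt tooling (default log path and the domain name used earlier in this bug; adjust as needed):

    # QEMU log for the domain on the source host
    less /var/log/libvirt/qemu/pool-10-37-167-238.log

    # ask QEMU directly what it reports for the migration job
    virsh --connect=qemu:///system qemu-monitor-command pool-10-37-167-238 \
        '{"execute":"query-migrate"}'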

Comment 7 Dr. David Alan Gilbert 2022-11-10 15:58:09 UTC
(In reply to Jiri Denemark from comment #6)
> [...]
> So it looks like something bad happened when the device state was supposed to
> be transferred. But there's no info about what happened, as even
> "query-migrate" called just after the failure only says "status": "failed"
> without any error message. Perhaps the QEMU log for the domain on the source
> host could help?

The source host has:

2022-11-10 14:00:39.232+0000: initiating migration
2022-11-10T14:00:44.302946Z qemu-kvm: qemu_savevm_state_complete_precopy_non_iterable: bdrv_inactivate_all() failed (-1)
2022-11-10T14:00:44.384781Z qemu-kvm: Unable to read from socket: Bad file descriptor
2022-11-10T14:00:44.384815Z qemu-kvm: Unable to read from socket: Bad file descriptor
2022-11-10T14:00:44.384825Z qemu-kvm: Unable to read from socket: Bad file descriptor
qemu-kvm: ../block/io.c:2022: int bdrv_co_write_req_prepare(BdrvChild *, int64_t, int64_t, BdrvTrackedRequest *, int): Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed.
2022-11-10 14:00:48.958+0000: shutting down, reason=crashed

but I'm assuming that's the recovery failing rather than the migration failing

Comment 8 Dr. David Alan Gilbert 2022-11-10 20:09:32 UTC
Hmm, still not sure what's going on here; this line:

2022-11-10T14:00:44.302946Z qemu-kvm: qemu_savevm_state_complete_precopy_non_iterable: bdrv_inactivate_all() failed (-1)

is on the source near the end of the migration - so that's bad and suggests it's a block-layer problem; but it doesn't
always happen, which makes me suspicious. Still, if that is happening, I wonder whether either libvirt or qemu
isn't understanding that this is shared storage?

Comment 9 Jiri Denemark 2022-11-11 09:13:05 UTC
BTW, the migration was started with flags 0x101, that is VIR_MIGRATE_LIVE and
VIR_MIGRATE_CHANGE_PROTECTION. There should be no storage migration of any
kind involved.
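
For reference, that flags value decodes cleanly against libvirt's virDomainMigrateFlags constants (a worked check, not output from this bug):

    VIR_MIGRATE_LIVE              = 1 << 0 = 0x001
    VIR_MIGRATE_CHANGE_PROTECTION = 1 << 8 = 0x100
    0x001 | 0x100                          = 0x101

    # neither VIR_MIGRATE_NON_SHARED_DISK (1 << 6) nor
    # VIR_MIGRATE_NON_SHARED_INC (1 << 7) is set, so no storage
    # migration was requested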

Comment 12 Dr. David Alan Gilbert 2022-12-13 19:53:26 UTC
Good news: I've been able to replicate this in my environment (a nested setup):
  L0 RHEL 8.x-ish
  L1 RHEL 9.1
  L2 RHEL 9.1

L0 provides iSCSI and virt fencing to the L1s.

virsh migrate --live L2rhel9.1 qemu+ssh://dgilbertbz2132247n2/system --verbose

on the source we see:

char device redirected to /dev/pts/0 (label charserial0)
2022-12-13 19:47:07.526+0000: initiating migration
2022-12-13T19:47:07.912528Z qemu-kvm: qemu_savevm_state_complete_precopy_non_iterable: bdrv_inactivate_all() failed (-1)
2022-12-13T19:47:07.998369Z qemu-kvm: Unable to read from socket: Bad file descriptor
2022-12-13T19:47:07.998385Z qemu-kvm: Unable to read from socket: Bad file descriptor
2022-12-13T19:47:07.998390Z qemu-kvm: Unable to read from socket: Bad file descriptor
qemu-kvm: ../block/io.c:2022: int bdrv_co_write_req_prepare(BdrvChild *, int64_t, int64_t, BdrvTrackedRequest *, int): Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed.
2022-12-13 19:47:21.241+0000: shutting down, reason=crashed

Comment 13 Dr. David Alan Gilbert 2022-12-14 10:08:49 UTC
Downgraded the L1 hosts' qemu to 6.2.0-11.el9_0.6.x86_64 and it still fails.

Comment 14 Dr. David Alan Gilbert 2022-12-14 11:30:33 UTC
Back on the 7.0.0 qemu
Rolled kernel back to 5.14.0-70.39.1.el9_0.x86_64 - still fails

Comment 15 Dr. David Alan Gilbert 2022-12-14 12:04:28 UTC
Still fails with libvirt-8.4.0-3

Hmm, so having rolled pretty much everything interesting back (individually),
that's not helping.
I'll go back to attaching gdb and tracing it.

Comment 16 Dr. David Alan Gilbert 2022-12-14 16:42:37 UTC
Still fails on upstream 7.2 qemu

Comment 17 Dr. David Alan Gilbert 2022-12-14 18:12:56 UTC
There's some SELinux issue going on here - I'm not yet sure what - but on the source during the migration I see:

Dec 14 13:10:06 dgilbertbz2132247n1 setroubleshoot[295298]: SELinux is preventing /opt/bin/qemu-system-x86_64 from read access on the file rhel-guest-image-forl2.qcow2. For complete SELinux messages run: sealert -l 0c080ecf-c3c5-4d71-a8b3-875bd2770b49
Dec 14 13:10:06 dgilbertbz2132247n1 setroubleshoot[295298]: SELinux is preventing /opt/bin/qemu-system-x86_64 from read access on the file rhel-guest-image-forl2.qcow2.
[root@dgilbertbz2132247n1 ~]# sealert -l 0c080ecf-c3c5-4d71-a8b3-875bd2770b49
SELinux is preventing /opt/bin/qemu-system-x86_64 from read access on the file rhel-guest-image-forl2.qcow2.

*****  Plugin restorecon_source (62.6 confidence) suggests   *****************

If you want to fix the label. 
/opt/bin/qemu-system-x86_64 default label should be bin_t.
Then you can run restorecon.
Do
# /sbin/restorecon -v /opt/bin/qemu-system-x86_64

*****  Plugin qemu_file_image (37.7 confidence) suggests   *******************

If rhel-guest-image-forl2.qcow2 is a virtualization target
Then you need to change the label on rhel-guest-image-forl2.qcow2'
Do
# semanage fcontext -a -t virt_image_t 'rhel-guest-image-forl2.qcow2'
# restorecon -v 'rhel-guest-image-forl2.qcow2'

*****  Plugin catchall (1.12 confidence) suggests   **************************

If you believe that qemu-system-x86_64 should be allowed read access on the rhel-guest-image-forl2.qcow2 file by default.
Then you should report this as a bug.
You can generate a local policy module to allow this access.
Do
allow this access for now by executing:
# ausearch -c 'live_migration' --raw | audit2allow -M my-livemigration
# semodule -X 300 -i my-livemigration.pp


Additional Information:
Source Context                system_u:system_r:svirt_t:s0:c227,c943
Target Context                system_u:object_r:svirt_image_t:s0:c179,c732
Target Objects                rhel-guest-image-forl2.qcow2 [ file ]
Source                        live_migration
Source Path                   /opt/bin/qemu-system-x86_64
Port                          <Unknown>
Host                          dgilbertbz2132247n1
Source RPM Packages           
Target RPM Packages           
SELinux Policy RPM            selinux-policy-targeted-34.1.43-1.el9.noarch
Local Policy RPM              selinux-policy-targeted-34.1.43-1.el9.noarch
Selinux Enabled               True
Policy Type                   targeted
Enforcing Mode                Enforcing
Host Name                     dgilbertbz2132247n1
Platform                      Linux dgilbertbz2132247n1
                              5.14.0-162.11.1.el9_1.x86_64 #1 SMP
                              PREEMPT_DYNAMIC Thu Dec 8 05:28:54 EST 2022 x86_64
                              x86_64
Alert Count                   1
First Seen                    2022-12-14 13:10:05 EST
Last Seen                     2022-12-14 13:10:05 EST
Local ID                      0c080ecf-c3c5-4d71-a8b3-875bd2770b49

Raw Audit Messages
type=AVC msg=audit(1671041405.277:1638): avc:  denied  { read } for  pid=293735 comm="live_migration" name="rhel-guest-image-forl2.qcow2" dev="dm-0" ino=97994 scontext=system_u:system_r:svirt_t:s0:c227,c943 tcontext=system_u:object_r:svirt_image_t:s0:c179,c732 tclass=file permissive=0


type=SYSCALL msg=audit(1671041405.277:1638): arch=x86_64 syscall=openat success=no exit=EACCES a0=ffffff9c a1=56367c2e06d1 a2=80000 a3=0 items=0 ppid=1 pid=293735 auid=4294967295 uid=107 gid=107 euid=107 suid=107 fsuid=107 egid=107 sgid=107 fsgid=107 tty=(none) ses=4294967295 comm=live_migration exe=/opt/bin/qemu-system-x86_64 subj=system_u:system_r:svirt_t:s0:c227,c943 key=(null)

Hash: live_migration,svirt_t,svirt_image_t,file,read

Comment 18 Dr. David Alan Gilbert 2022-12-14 18:22:33 UTC
With some debug added, I'm seeing:

2022-12-14 18:10:04.902+0000: initiating migration
bdrv_inactivate_all: driver: qcow2 filename: /mnt/shared/rhel-guest-image-forl2.qcow2
bdrv_inactivate_recurse: ENTRY driver: qcow2 filename: /mnt/shared/rhel-guest-image-forl2.qcow2
bdrv_inactivate_recurse: bdrv_inactivate: 0
bdrv_inactivate_recurse: klass->inactivate: 0
bdrv_inactivate_recurse: ENTRY driver: file filename: /mnt/shared/rhel-guest-image-forl2.qcow2
bdrv_inactivate_recurse: klass->inactivate: 0
bdrv_inactivate_recurse: perms cumulative_perms=b
bdrv_inactivate_recurse: bdrv_inactivate_recurse ret=-1
bdrv_inactivate_all: driver: qcow2 filename: /mnt/shared/rhel-guest-image-forl2.qcow2 recurse=-1
2022-12-14T18:10:05.361844Z qemu-system-x86_64: qemu_savevm_state_complete_precopy_non_iterable: bdrv_inactivate_all() failed (-1)

so the failure there is:

    bdrv_inactivate_recurse: perms cumulative_perms=b

which is the cumulative_perms check added in qemu commit a13de40a054:

    a13de40 block: bdrv_inactivate_recurse(): check for permissions and fix crash
    Vladimir Sementsov-Ogievskiy <vsementsov> Sat Sep 11 15:00:27 2021 +0300

'b' corresponds to 0xb = 11 = 8 + 2 + 1 = BLK_PERM_RESIZE + BLK_PERM_WRITE + BLK_PERM_CONSISTENT_READ,
and I think that check is saying we don't expect BLK_PERM_WRITE at that stage - something should have
dropped it by then.

Comment 19 Dr. David Alan Gilbert 2022-12-14 18:59:38 UTC
Seen on the destination side only:

Dec 14 13:45:40 dgilbertbz2132247n2 virtqemud[296716]: 2022-12-14 18:45:40.987+0000: 296716: warning : virSecurityValidateTimestamp:205 : Invalid XATTR timestamp detected on /mnt/shared/rhel-guest-image-forl2.qcow2 secdriver=selinux
Dec 14 13:45:41 dgilbertbz2132247n2 virtqemud[296717]: 2022-12-14 18:45:41.014+0000: 296717: warning : virSecurityValidateTimestamp:205 : Invalid XATTR timestamp detected on /mnt/shared/rhel-guest-image-forl2.qcow2 secdriver=dac

Comment 20 Dr. David Alan Gilbert 2022-12-14 19:58:21 UTC
If I hack the check from a13de40 out, I get further but the destination fails with:

Dec 14 14:43:42 dgilbertbz2132247n2 virtqemud[336909]: internal error: unable to execute QEMU command 'cont': Failed to get "write" lock

I noticed in the reporter's logs that they have:

2022-11-10 14:00:44.541+0000: 120543: debug : virFileIsSharedFSType:3442 : Check if path /var/lib/libvirt/images/pool-10-37-167-238-delta with FS magic 18225520 is shared

I'm not seeing that in my setup

Jiri: Is there some magic that should mean libvirt notices it's shared?
      Do those Xattr timestamp things in comment 19 mean anything to you?
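
An aside on that virFileIsSharedFSType line: FS magic 18225520 is 0x01161970, i.e. the GFS2 superblock magic, so the reporter's delta image sits on GFS2. The filesystem behind an image can be checked on a host with plain coreutils stat, e.g. (path taken from the reporter's logs):

    stat -f -c 'fstype=%T magic=0x%t' /var/lib/libvirt/images/pool-10-37-167-238-delta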

Comment 21 Dr. David Alan Gilbert 2022-12-15 13:41:01 UTC
For me, disabling SELinux makes this work.

Comment 22 Jiri Denemark 2022-12-15 15:22:36 UTC
Hmm, it might be a labeling issue of some kind. It would be nice to see the
domain XML, libvirt logs and the mount table. I see
/var/lib/libvirt/images/pool-10-37-167-238-delta is on GFS2, but no idea what
filesystem /mnt/shared/rhel-guest-image-forl2.qcow2 is on.

Comment 23 Dr. David Alan Gilbert 2023-01-05 11:02:55 UTC
The mount entry in my nested reproducer setup is:

/dev/mapper/shared_vg-shared_lv /mnt/shared gfs2 rw,seclabel,noatime,rgrplvb 0 0

Comment 25 Dr. David Alan Gilbert 2023-01-05 14:41:45 UTC
jdenemar pointed me at:
https://listman.redhat.com/archives/libvir-list/2022-December/236418.html

which was asked in the context of Ceph, but the situation is the same:

''For shared filesytems, which DO have SELinux labelling, then label
assignment needs to be tracked across all hosts which can use that
filesystem for VM storage.

Libvirt is NOT capable of doing this tracking itself. So mgmt apps
need to tell libvirt to use a static label which they've figured
out uniqueness for globally. Libvirt can still do relabelling of
files. IOW, we'd recommend this:

  <seclabel type='static' model='selinux' relabel='yes'>
''
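
A minimal sketch of what that recommendation looks like in the domain XML (the MCS category pair below is purely illustrative - the management layer has to pick one that is unique across every host sharing the storage):

  <seclabel type='static' model='selinux' relabel='yes'>
    <label>system_u:system_r:svirt_t:s0:c392,c662</label>
  </seclabel>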

