Bug 2223411
| Summary: | Unable to create snapshot for VM with mounted second disk (PVC) | | |
|---|---|---|---|
| Product: | Container Native Virtualization (CNV) | Reporter: | Denys Shchedrivyi <dshchedr> |
| Component: | Storage | Assignee: | Adam Litke <alitke> |
| Status: | CLOSED ERRATA | QA Contact: | Jenia Peimer <jpeimer> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 4.14.0 | CC: | akalenyu, alitke, jpeimer, jsaucier, mhenriks, ycui |
| Target Milestone: | --- | | |
| Target Release: | 4.14.1 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-12-07 15:00:42 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |

Description
Denys Shchedrivyi
2023-07-17 17:41:41 UTC
For info: the same behavior is seen with Fedora 38 and RHEL 9.2.

Summarizing offline chats: the underlying issue is a failure in the guest agent freeze:

{"component":"virt-launcher","kind":"","level":"error","msg":"Failed to freeze vmi","name":"vm-fedora-with-pvc","namespace":"test-clone","pos":"server.go:269","reason":"virError(Code=1, Domain=10, Message='internal error: unable to execute QEMU agent command 'guest-fsfreeze-freeze': failed to open /mnt/test: Permission denied')","timestamp":"2023-07-17T18:23:58.258646Z","uid":"04ce94f3-5f77-472a-9c61-21eb0f0fb41f"}

The corresponding bug for this scenario, and its conclusion, is here:
https://bugzilla.redhat.com/show_bug.cgi?id=1747960#c35

Some comments on that bug suggest qemu-ga cannot do anything more than expose this (off by default) SELinux boolean:
https://bugzilla.redhat.com/show_bug.cgi?id=1747960#c20
https://bugzilla.redhat.com/show_bug.cgi?id=1747960#c22

So I am not sure there is anything we can do on the CNV side, but I am curious how this has not bugged other users before.

Thanks for the explanation, Alex. I think single disk VMs are overwhelmingly the norm in the field. Also, I wonder if this would reproduce if the second disk is block and initialized with LVM. In any case, I think we should have a KCS article for this topic. Adding Jean-Francois: What do you think?

(In reply to Adam Litke from comment #3)
> I think single disk VMs are overwhelmingly the norm in the field.

From a quick look at the hotplug tests, this looks like a common pattern (minus taking a snapshot at the end), but yeah, I agree about single-disk VMs being the norm.

Whoops, messed up the needinfo. Michael, I was about to ask if there is anything we can do from our side, like:
- Integrate this SELinux boolean in our golden images
- Change the boolean before calling freeze

Both seem risky to me, as this should be something that is consciously done by the VM owner.

I don't think we should change any VM settings.

Hi Jean-Francois, will you be able to create a KCS for this prior to 4.14.0 GA?

The KCS is created and published: https://access.redhat.com/solutions/7041127

@jpeimer Please see the KCS article for QA. Thanks.

Verified on CNV 4.14.0 and 4.15.0. I could reproduce the issue, tried the proposed solution, and it worked:

[fedora@vm-fedora-with-pvc ~]$ sudo setsebool -P virt_qemu_ga_read_nonsecurity_files on
[ 337.004824] SELinux: Class mctp_socket not defined in policy.
[ 337.007003] SELinux: the above unknown classes and permissions will be allowed
[ 337.013869] SELinux: Converting 309 SID table entries...
[ 337.060824] SELinux: policy capability network_peer_controls=1
[ 337.063419] SELinux: policy capability open_perms=1
[ 337.065175] SELinux: policy capability extended_socket_class=1
[ 337.067260] SELinux: policy capability always_check_network=0
[ 337.069448] SELinux: policy capability cgroup_seclabel=1
[ 337.071488] SELinux: policy capability nnp_nosuid_transition=1
[ 337.073628] SELinux: policy capability genfs_seclabel_symlinks=0

$ oc get vmsnapshot -A
NAMESPACE   NAME            SOURCEKIND       SOURCENAME           PHASE       READYTOUSE   CREATIONTIME   ERROR
default     my-vmsnapshot   VirtualMachine   vm-fedora-with-pvc   Succeeded   true         12s

$ oc get vmsnapshot my-vmsnapshot -o json | jq .status.snapshotVolumes
{
  "excludedVolumes": [
    "containerdisk",
    "cloudinitdisk"
  ],
  "includedVolumes": [
    "disk-0"
  ]
}

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Virtualization 4.14.1 security and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2023:7704
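
For anyone applying the workaround by hand, the guest-side step verified in this report could be wrapped in a small script along the following lines. This is only a sketch, not something shipped with CNV or qemu-ga: the function names `sebool_is_on` and `ensure_ga_can_freeze` are made up for illustration, and it assumes `getsebool` prints its usual `name --> state` line. As noted in the discussion, changing the boolean should remain a conscious decision by the VM owner.

```shell
#!/bin/sh
# Sketch only: run inside the guest, not on the cluster.
# The boolean name is the one used in the verification in this report.
BOOL=virt_qemu_ga_read_nonsecurity_files

# sebool_is_on (hypothetical helper): reads `getsebool` output on stdin,
# assumed to look like
#   virt_qemu_ga_read_nonsecurity_files --> off
# and succeeds only when the current state (third field) is "on".
sebool_is_on() {
    awk '$3 == "on" { found = 1 } END { exit !found }'
}

# ensure_ga_can_freeze (hypothetical helper): enables the boolean
# persistently (-P) only if it is not already on.
ensure_ga_can_freeze() {
    if getsebool "$BOOL" | sebool_is_on; then
        echo "$BOOL is already on"
    else
        sudo setsebool -P "$BOOL" on
    fi
}
```

After sourcing the script in the guest, calling `ensure_ga_can_freeze` and then retrying the snapshot should match the manual steps above; the parsing is kept in its own function so it can be checked without touching SELinux state.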