Bug 1749257

Summary: Failed to start second guest which uses a shared nvdimm device
Product: Red Hat Enterprise Linux 9
Reporter: Luyao Huang <lhuang>
Component: libvirt
Assignee: Michal Privoznik <mprivozn>
libvirt sub component: General
QA Contact: Luyao Huang <lhuang>
Status: CLOSED MIGRATED
Severity: medium
Priority: high
CC: dyuan, hhan, jinqi, jsuchane, lcong, lmen, virt-maint, xuzhang, yafu, yalzhang
Version: 9.0
Keywords: MigratedToJIRA, Reopened, Triaged
Target Milestone: rc
Flags: knoel: mirror+
Target Release: ---
Hardware: x86_64
OS: Linux
Last Closed: 2023-09-22 12:19:51 UTC
Type: Bug

Description Luyao Huang 2019-09-05 08:50:29 UTC
Description of problem:
Failed to start the second guest which uses a shared nvdimm device

Version-Release number of selected component (if applicable):
libvirt-5.6.0-4.module+el8.1.0+4160+b50057dc.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Prepare 2 guests with 1 shared nvdimm device (see the note after step 3 for creating the shared backing file):

# virsh dumpxml vm1

    <memory model='nvdimm' access='shared'>
      <source>
        <path>/tmp/nvdimm</path>
        <alignsize unit='KiB'>2048</alignsize>
      </source>
      <target>
        <size unit='KiB'>524288</size>
        <node>1</node>
        <label>
          <size unit='KiB'>128</size>
        </label>
      </target>
      <alias name='nvdimm0'/>
      <address type='dimm' slot='0'/>
    </memory>

# virsh dumpxml vm2

    <memory model='nvdimm' access='shared'>
      <source>
        <path>/tmp/nvdimm</path>
        <alignsize unit='KiB'>2048</alignsize>
      </source>
      <target>
        <size unit='KiB'>524288</size>
        <node>1</node>
        <label>
          <size unit='KiB'>128</size>
        </label>
        <readonly/>
      </target>
      <address type='dimm' slot='0'/>
    </memory>

2. Start both guests

# virsh start vm1
Domain vm1 started

# virsh start vm2
error: Failed to start domain vm2
error: internal error: child reported (status=125): Requested operation is not valid: Setting different SELinux label on /tmp/nvdimm which is already in use

3. Check the nvdimm file's labels:

# ll -Z /tmp/nvdimm 
-rw-r--r--. 1 qemu qemu system_u:object_r:svirt_image_t:s0:c486,c699 536870912 Sep  5 04:37 /tmp/nvdimm


# getfattr -m trusted.libvirt.security -d /tmp/nvdimm
getfattr: Removing leading '/' from absolute path names
# file: tmp/nvdimm
trusted.libvirt.security.dac="+0:+0"
trusted.libvirt.security.ref_dac="3"
trusted.libvirt.security.ref_selinux="1"
trusted.libvirt.security.selinux="unconfined_u:object_r:user_tmp_t:s0"
trusted.libvirt.security.timestamp_dac="1567565944"
trusted.libvirt.security.timestamp_selinux="1567565944"
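
Note: the shared backing file referenced in step 1 must exist with the configured size; 524288 KiB corresponds to the 536870912-byte file listed above. A minimal sketch for preparing it by hand, assuming the path and size from the XML (QEMU may also create or grow the file itself when it opens the memory backend):

# truncate -s 512M /tmp/nvdimm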


Actual results:

Failed to start the 2nd guest

Expected results:

Both guests start successfully

Additional info:

From the QEMU documentation: when share=on is set, the same nvdimm backend file can be shared with other guests:

   "share=on/off" controls the visibility of guest writes. If
   "share=on", then guest writes will be applied to the backend
   file. If another guest uses the same backend file with option
   "share=on", then above writes will be visible to it as well. If
   "share=off", then guest writes won't be applied to the backend
   file and thus will be invisible to other guests.
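
For reference, libvirt turns the <memory model='nvdimm' access='shared'> device above into QEMU options along these lines (a simplified sketch only; the object/device ids, the "pc,nvdimm=on" machine flag and the size/align/label values are written out from the XML, not copied from the generated command line):

   -machine pc,nvdimm=on \
   -object memory-backend-file,id=memnvdimm0,mem-path=/tmp/nvdimm,size=512M,align=2M,share=on \
   -device nvdimm,id=nvdimm0,memdev=memnvdimm0,node=1,slot=0,label-size=128K

With share=on both domains map the same /tmp/nvdimm file, which is why starting the second guest trips over the "already in use" SELinux check.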

Comment 1 jiyan 2019-09-11 07:55:28 UTC
Also hit this issue when using a pipe serial device.
    <serial type='pipe'>
      <source path='/mnt/pipe'/>
      <target type='pci-serial' port='0'>
        <model name='pci-serial'/>
      </target>
      <alias name='ua-600dc99f-3bb9-43c2-bf4c-1558d1e00512'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x05' function='0x0'/>
    </serial>

# virsh start test 
error: Failed to start domain test
error: internal error: child reported (status=125): Requested operation is not valid: Setting different SELinux label on /mnt/pipe which is already in use
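
(For completeness: /mnt/pipe in this setup is a named pipe created beforehand, e.g.

# mkfifo /mnt/pipe

QEMU's pipe chardev uses either a single FIFO at the given path or a pair at <path>.in/<path>.out; the mkfifo line is only a sketch of the setup, not taken from the report.)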

Comment 2 Michal Privoznik 2019-09-18 13:58:17 UTC
I think my seclabel remembering work only uncovered an issue that had been dormant. Previously, we relabeled the file without asking and thus effectively cut off access for the qemu process that was already running. That didn't cause much trouble because qemu did not reopen the NVDIMM. What we should do is introduce a <seclabel/> element to <memory/> so that a different seclabel can be set (the same way we allow overriding the seclabel for other devices).
And as for the <serial/> - you need to specify a seclabel so that libvirt doesn't invent a new one (which obviously fails); see the sketch below.
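
For illustration, the per-source <seclabel/> override mentioned for <serial/> would look roughly like this (a sketch only; whether to skip relabelling, as with relabel='no' here, or to supply a fixed label instead depends on the setup):

    <serial type='pipe'>
      <source path='/mnt/pipe'>
        <seclabel model='selinux' relabel='no'/>
      </source>
      ...
    </serial>

For <memory/> no such child element exists yet; adding one is the proposed fix.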

Comment 3 Xuesong Zhang 2019-09-20 11:11:53 UTC
This issue also occurred for disk devices in the same scenario. Han, would you please add a comment describing the failure scenarios? Thx.

Comment 4 Xuesong Zhang 2019-09-23 09:20:25 UTC
(In reply to Xuesong Zhang from comment #3)
> This issue also occurred for disk device with the same scenario, Han, would
> you please add a comment for the failure scenarios? Thx.

After analysis, the scenario Han met is similar to the following BZ, so the scenarios were added there. Please ignore the above comment.
https://bugzilla.redhat.com/show_bug.cgi?id=1740024#c9

Comment 9 RHEL Program Management 2022-03-19 07:27:12 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 10 liang cong 2022-04-13 01:27:25 UTC
Hi Michal,
I am reopening this bug since the issue still exists with libvirt-8.0.0-8.el9_0.x86_64 on RHEL 9.0.
Could you confirm whether this bug should be fixed, and if so, whether there is a plan to fix it?


Thanks
Liang Cong

Comment 12 Michal Privoznik 2022-04-13 12:31:12 UTC
Yes, this should be fixed. However, I have no timeframe yet.

Comment 14 RHEL Program Management 2023-09-22 12:18:29 UTC
Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

Comment 15 RHEL Program Management 2023-09-22 12:19:51 UTC
This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.

Due to differences in account names between systems, some fields were not replicated.  Be sure to add yourself to Jira issue's "Watchers" field to continue receiving updates and add others to the "Need Info From" field to continue requesting information.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "RHEL-" followed by an integer.  You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like:

"Bugzilla Bug" = 1234567

In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues. You can also visit https://access.redhat.com/articles/7032570 for general account information.