Bug 1839992

Summary: qemu-pr-helper does not pass scsi reservations due to qemu mount namespace [rhel-7.9.z]
Product: Red Hat Enterprise Linux 7
Reporter: Roman Hodain <rhodain>
Component: libvirt
Assignee: Michal Privoznik <mprivozn>
Status: CLOSED ERRATA
QA Contact: gaojianan <jgao>
Severity: high
Docs Contact:
Priority: high
Version: 7.8
CC: agk, coli, hhan, jdenemar, jgao, jinzhao, jreznik, jsuchane, juzhang, michal.skrivanek, mkalinin, mprivozn, mtessun, pvlasin, qinwang, snagar, virt-maint, yalzhang, yisun
Target Milestone: rc
Keywords: Upstream, ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: libvirt-4.5.0-36.el7_9.1
Doc Type: Bug Fix
Doc Text:
Cause: As part of the fix for bug 1823976, a patch that I wrote was merged. However, the patch did not do what we thought it would. The idea was to use a libdevmapper API to determine whether a given path is managed by devmapper. If it is not, devmapper is not consulted any further. If it is, devmapper is asked to provide the list of dependencies for the device. The API that determines whether a path is managed uses the major number of the device, so libvirt called stat() and then major() to get it. But it did so over the wrong member of the stat structure. Long story short, libvirt was passing garbage to devmapper, hoping it would figure out what we wanted. Well, it didn't.
Consequence: Dependent devices were neither created in the namespace nor allowed in CGroups, and thus SCSI persistent reservations did not work from inside the guest.
Fix: The fix is hilarious: I've discovered that libvirt already has code which asks devmapper the same thing, and it's written correctly. So I've thrown my code out and replaced it with the code that works.
Result: SCSI persistent reservations work again.
Story Points: ---
Clone Of:
: 1849095
Environment:
Last Closed: 2020-09-29 21:18:10 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1849095

Description Roman Hodain 2020-05-26 07:49:52 UTC
Description of problem:
The scsi reservation is not passed to the multipath device when the qemu mount namespace is enabled.

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.12.0-44.el7_8.2.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Enable privileged SCSI passthrough
2. Try to register a key
    ./sg_persist --out --register --param-sark=123abc e:

Tested on Win2016, but the same results should come from regular RHEL as well.

Actual results:

    ./sg_persist --out --register --param-sark=123abc e:
    DellEMC   ME4               G280
    Peripheral device type: disk
    PR out (Register): Aborted command
    sg_persist failed: Aborted command

Expected results:
The key is successfully registered

Additional info:
Disabling the mount namespace in /etc/libvirt/qemu.conf resolves the issue.

Comment 1 qing.wang 2020-05-29 06:22:22 UTC
Tested on:

3.10.0-1127.el7.x86_64  (RHEL7.8)
qemu-kvm-common-rhev-2.12.0-44.el7_8.2.x86_64

Guest Win2019 with virtio-win-prewhql-0.1-185.iso
Guest RHEL7.9

Test steps:

1.create multipath connection on host 

iscsiadm --mode node --targetname iqn.2016-06.qing.server:a  --portal 10.66.8.105:3260 --login;iscsiadm --mode node --targetname iqn.2016-06.qing.server:b  --portal 10.66.8.105:3260 --login

mpatha (3600140520f6acee074149f3a8516f11e) dm-3 LIO-ORG ,mpath-disk0     
size=25G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=0 status=active
| `- 27:0:0:0 sdb 8:16 active undef running
`-+- policy='service-time 0' prio=0 status=enabled
  `- 28:0:0:0 sdc 8:32 active undef running

2.test sg_persist on host with test_sg_persist.sh
 test_sg_persist.sh /dev/sdb
 test_sg_persist.sh /dev/sdb 
 test_sg_persist.sh /dev/mapper/mpatha


cat test_sg_persist.sh

echo "1:register-key"
sg_persist --no-inquiry -v --out --register-ignore --param-sark 123aaa "$@"
echo "2:read-key"
sg_persist --no-inquiry --in -k "$@"
echo "3:reserve"
sg_persist --no-inquiry -v --out --reserve --param-rk 123aaa --prout-type 5 "$@"
echo "4:read-reservation"
sg_persist --no-inquiry --in -r "$@"
echo "5:release"
sg_persist --no-inquiry -v --out --release --param-rk 123aaa --prout-type 5 "$@"
echo "6:read-reservation"
sg_persist --no-inquiry --in -r "$@"
echo "7:cancel-register"
sg_persist --no-inquiry -v --out --register --param-rk 123aaa --prout-type 5 "$@"
echo "8:read-key"
sg_persist --no-inquiry --in -k "$@"

3.passthrough /dev/sdb /dev/sdc /dev/mapper/mpatha to vm 

os=win2019-64-virtio-scsi.qcow2
/usr/libexec/qemu-kvm \
  -name wqvm1 \
  -machine q35 \
  -nodefaults \
  -vga qxl \
  -device pcie-root-port,id=pcie.0-root-port-2,slot=2,bus=pcie.0,multifunction=on \
  -device pcie-root-port,id=pcie.0-root-port-2-1,chassis=3,bus=pcie.0,addr=0x2.0x1 \
  -device pcie-root-port,id=pcie.0-root-port-2-2,chassis=4,bus=pcie.0,addr=0x2.0x2 \
  -device pcie-root-port,id=pcie.0-root-port-3,slot=3,bus=pcie.0 \
  -device pcie-root-port,id=pcie.0-root-port-4,slot=4,bus=pcie.0 \
  -device pcie-root-port,id=pcie.0-root-port-5,slot=5,bus=pcie.0 \
  -device pcie-root-port,id=pcie.0-root-port-6,slot=6,bus=pcie.0 \
  -device pcie-root-port,id=pcie.0-root-port-8,slot=8,bus=pcie.0 \
  -device pcie-root-port,id=pcie.0-root-port-9,slot=9,bus=pcie.0 \
  -drive id=drive_cd1,if=none,snapshot=off,aio=threads,cache=unsafe,media=cdrom,file=/home/kvm_autotest_root/iso/windows/virtio-win-latest-prewhql.iso  \
  -device ide-cd,id=cd1,drive=drive_cd1,bus=ide.0,unit=0 \
  -drive id=drive_cd2,if=none,snapshot=off,aio=threads,cache=none,media=cdrom,file=/home/kvm_autotest_root/iso/windows/winutils.iso \
 -device ide-cd,id=cd2,drive=drive_cd2,bus=ide.1,unit=0 \
  -device qemu-xhci,id=usb1,bus=pcie.0-root-port-2-1,addr=0x0 \
  -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
  -object iothread,id=iothread0 \
  -device virtio-scsi-pci,id=scsi0,bus=pcie.0-root-port-2-2,addr=0x0,iothread=iothread0 \
  \
  -blockdev driver=qcow2,file.driver=file,cache.direct=off,cache.no-flush=on,file.filename=/home/kvm_autotest_root/images/${os},node-name=drive_image1 \
  -device scsi-hd,id=os1,bus=scsi0.0,drive=drive_image1,bootindex=0 \
  \
  -object pr-manager-helper,id=helper0,path=/var/run/qemu-pr-helper.sock \
  -device virtio-scsi-pci,id=scsi1,bus=pcie.0-root-port-8,addr=0x0 \
  -blockdev driver=raw,file.driver=host_device,cache.direct=off,cache.no-flush=on,file.filename=/dev/sdb,node-name=drive2,file.pr-manager=helper0 \
  -device scsi-block,bus=scsi1.0,channel=0,scsi-id=0,lun=0,drive=drive2,id=scsi0-0-0-0,bootindex=2 \
  \
  -vnc :5 \
  -qmp tcp:0:5955,server,nowait \
  -monitor stdio \
  -m 4096 \
  -smp 8 \
  \
  -netdev tap,script=/etc/qemu-ifup,downscript=no,id=hostnet1 \
  -device e1000,netdev=hostnet1,id=net1,mac=00:1a:4a:12:13:55,bus=pcie.0-root-port-6,addr=0x0 \

4.login to the guest, then run
./sg_persist --out --register --param-sark=123abc f:


No issue found on qemu layer

Comment 2 Roman Hodain 2020-05-29 08:34:04 UTC
I believe that this outcome is expected. Have you started the qemu process via libvirt? I think the namespace is created by libvirt.

Comment 3 qing.wang 2020-05-29 09:16:27 UTC
Hi hhan, could you please help check on the libvirt side to reproduce this bug?

Comment 4 Han Han 2020-05-29 09:35:31 UTC
(In reply to qing.wang from comment #3)
> Hi hhan,Could you please help to check on libvirt side to reproduce this bug
> ?

Forward to Yi Sun

Comment 5 yisun 2020-06-01 02:20:09 UTC
The SCSI part is covered by jgao now. Since the libvirt namespace is a suspect area, could you please provide the libvirt version? I guess it's the latest rhel7.8.z version?

Comment 6 Roman Hodain 2020-06-02 08:34:05 UTC
Yap, it is libvirt-4.5.0-33.el7_8.1.x86_64

Comment 7 Ademar Reis 2020-06-02 16:15:48 UTC
Given the namespaces are set and handled by libvirt, I'm changing the component.

Comment 8 gaojianan 2020-06-03 04:05:26 UTC
Reproduced with the following libvirt versions, with the same results:
libvirt-6.3.0-1.module+el8.3.0+6478+69f490bb.x86_64
and
libvirt-4.5.0-33.el7_8.1.x86_64

Step:
Scenario1:
Edit the qemu.conf namespaces setting as follows (this namespace is turned on by default):
#namespaces = [ "mount" ]

1.Prepare scsi device(/dev/sdc) and start qemu-pr-helper:
# lsscsi
[0:0:0:0]    disk    ATA      MB0500GCEHF      HPGC  /dev/sda 
[1:0:0:0]    disk    ATA      MB0500GCEHF      HPGD  /dev/sdb 
[7:0:0:0]    disk    LIO-ORG  device.logical-  4.0   /dev/sdc 

#systemctl start qemu-pr-helper

2.Prepare a guest and attach xml:
# cat attach.xml
<disk type='block' device='lun'>
      <driver name='qemu' type='raw'/>
      <source dev='/dev/sdc'/>
      <reservations managed='yes'>
    </reservations>
<target dev='sdb' bus='scsi'/>
    </disk>

3.Hot-plug the device with 'reservations'
# virsh attach-device avocado-vt-vm1 attach.xml 
Device attached successfully

Check qemu-pr-helper process:
# ps -ef | grep qemu-pr-helper | grep -v grep
root       14342       1  0 23:32 ?        00:00:00 /usr/bin/qemu-pr-helper

4.Login the guest and run script as:
# cat test.sh
#! /bin/sh
sg_persist --no-inquiry -v --out --register-ignore --param-sark 123aaa "$@"
sg_persist --no-inquiry --in -k "$@"
sg_persist --no-inquiry -v --out --reserve --param-rk 123aaa --prout-type 5 "$@"
sg_persist --no-inquiry --in -r "$@"
sg_persist --no-inquiry -v --out --release --param-rk 123aaa --prout-type 5 "$@"
sg_persist --no-inquiry --in -r "$@"
sg_persist --no-inquiry -v --out --register --param-rk 123aaa --prout-type 5 "$@"
sg_persist --no-inquiry --in -k "$@"

# sh test.sh /dev/sda
    Persistent reservation out cdb: 5f 06 00 00 00 00 00 00 18 00 
Persistent reservation out:
Fixed format, current; Sense key: Aborted Command
Additional sense: I/O process terminated
PR out (Register and ignore existing key): Aborted command, type: sense key, other than protection related (asc=0x10)
PR in (Read keys): Aborted command
sg_persist failed: Aborted command
    Persistent reservation out cdb: 5f 01 05 00 00 00 00 00 18 00 
Persistent reservation out:
Fixed format, current; Sense key: Aborted Command
Additional sense: I/O process terminated
PR out (Reserve): Aborted command, type: sense key, other than protection related (asc=0x10)
PR in (Read reservation): Aborted command
sg_persist failed: Aborted command
    Persistent reservation out cdb: 5f 02 05 00 00 00 00 00 18 00 
Persistent reservation out:
Fixed format, current; Sense key: Aborted Command
Additional sense: I/O process terminated
PR out (Release): Aborted command, type: sense key, other than protection related (asc=0x10)
PR in (Read reservation): Aborted command
sg_persist failed: Aborted command
    Persistent reservation out cdb: 5f 00 05 00 00 00 00 00 18 00 
Persistent reservation out:
Fixed format, current; Sense key: Aborted Command
Additional sense: I/O process terminated
PR out (Register): Aborted command, type: sense key, other than protection related (asc=0x10)
PR in (Read keys): Aborted command
sg_persist failed: Aborted command


Failed

Scenario 2:
Edit the qemu.conf namespaces setting as follows (this namespace is turned on by default):
namespaces = [ "mount" ]

All other steps are the same as above; it failed again.

Comment 9 yisun 2020-06-03 04:12:46 UTC
(In reply to gaojianan from comment #8)
> Reproduced in libvirt version and get the same results:
> libvirt-6.3.0-1.module+el8.3.0+6478+69f490bb.x86_64
> and
> libvirt-4.5.0-33.el7_8.1.x86_64
> 
> Step:
> Scenario 2:
> Edit the qemu.conf namespace as 
> This namespace is turned on
>  by default.
> namespaces = [ "mount" ]
> 
> Other steps are all like above,failed again
to turn off the namespaces mount, need to edit the qemu.conf as follow:
namespaces = [ ]

Comment 10 gaojianan 2020-06-03 06:20:29 UTC
(In reply to yisun from comment #9)

Thanks for the reminder. I tried again with namespaces = [ ], but it still fails as in https://bugzilla.redhat.com/show_bug.cgi?id=1839992#c8

Comment 11 Roman Hodain 2020-06-03 09:41:15 UTC
(In reply to gaojianan from comment #10)

Can you also set 

    cgroup_controllers = [ "cpu", "memory", "blkio", "cpuset", "cpuacct" ]

Paolo Bonzini suggested that there can also be an issue with cgroup controller for devices.

Comment 12 Jaroslav Suchanek 2020-06-03 11:52:15 UTC
Michal, can you please have a look? Thanks. Might be related to bug 1814157 ?

Comment 13 Michal Privoznik 2020-06-03 13:07:15 UTC
(In reply to gaojianan from comment #8)
> Reproduced in libvirt version and get the same results:
> libvirt-6.3.0-1.module+el8.3.0+6478+69f490bb.x86_64
> and
> libvirt-4.5.0-33.el7_8.1.x86_64
> 
> Step:
> Scenario1:
> Edit the qemu.conf namespace as 
> This namespace is turned on
>  by default.
> #namespaces = [ "mount" ]
> 
> 1.Prepare scsi device(/dev/sdc) and start qemu-pr-helper:
> # lsscsi
> [0:0:0:0]    disk    ATA      MB0500GCEHF      HPGC  /dev/sda 
> [1:0:0:0]    disk    ATA      MB0500GCEHF      HPGD  /dev/sdb 
> [7:0:0:0]    disk    LIO-ORG  device.logical-  4.0   /dev/sdc 
> 
> #systemctl start qemu-pr-helper

This is not needed, because you set managed='yes'.
And this /dev/sdc - is it a multitarget device? It looks like a regular iSCSI device.

> 
> 2.Prepare a guest and attach xml:
> # cat attach.xml
> <disk type='block' device='lun'>
>       <driver name='qemu' type='raw'/>
>       <source dev='/dev/sdc'/>
>       <reservations managed='yes'>
>     </reservations>
> <target dev='sdb' bus='scsi'/>
>     </disk>

This is incorrect XML. The <reservations/> element must be a child of <source/>. This is the correct one:

    <disk type='block' device='lun'>
      <driver name='qemu' type='raw'/>
      <source dev='/dev/sdc'>
        <reservations managed='yes'/>
      </source>
      <target dev='sdb' bus='scsi'/>
    </disk>

This explains why your test is still failing - you haven't enabled reservations really.
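
A quick way to catch this kind of misplacement before attaching a device is to check where <reservations> sits in the XML. The following is a minimal sketch in Python, not part of libvirt: the helper name reservations_placement is made up for illustration, and it only inspects element nesting, it does not validate against libvirt's schema.

```python
import xml.etree.ElementTree as ET

# The XML from comment 8: <reservations> is a sibling of <source>.
WRONG_XML = """<disk type='block' device='lun'>
  <driver name='qemu' type='raw'/>
  <source dev='/dev/sdc'/>
  <reservations managed='yes'>
  </reservations>
  <target dev='sdb' bus='scsi'/>
</disk>"""

# The corrected XML: <reservations> is a child of <source>.
RIGHT_XML = """<disk type='block' device='lun'>
  <driver name='qemu' type='raw'/>
  <source dev='/dev/sdc'>
    <reservations managed='yes'/>
  </source>
  <target dev='sdb' bus='scsi'/>
</disk>"""


def reservations_placement(disk_xml):
    """Report where <reservations> sits inside a <disk> definition.

    Hypothetical helper for illustration only.
    """
    disk = ET.fromstring(disk_xml)
    source = disk.find("source")  # find() looks at direct children only
    if source is not None and source.find("reservations") is not None:
        return "under-source"
    if disk.find("reservations") is not None:
        return "under-disk"
    return "absent"


print(reservations_placement(WRONG_XML))
print(reservations_placement(RIGHT_XML))
```

A result of "under-disk" corresponds to the XML that attached successfully but did not really enable reservations.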

> 
> 3.Hot-plug the device with 'reservations'
> # virsh attach-device avocado-vt-vm1 attach.xml 
> Device attached successfully
> 
> Check qemu-pr-helper process:
> # ps -ef | grep qemu-pr-helper | grep -v grep
> root       14342       1  0 23:32 ?        00:00:00 /usr/bin/qemu-pr-helper

This is the pr-helper started by systemd earlier. There would be another one, if the passed XML was correct.

Comment 15 gaojianan 2020-06-04 09:57:17 UTC
(In reply to Michal Privoznik from comment #13)

Thanks for your suggestion. I tried again:
1.Login a multipath target iscsi lun:
 # iscsiadm --mode node --login --targetname iqn.2016-03.com.virttest:logical-pool.target1 --portal $ip
Logging in to [iface: default, target: iqn.2016-03.com.virttest:logical-pool.target1, portal: $ip,3260] (multiple)

2.Prepare a guest and attach device
# virsh attach-device avocado-vt-vm1 attach.xml
Device attached successfully

# cat attach.xml 
<disk type='block' device='lun'>
  <driver name='qemu' type='raw'/>
  <source dev='/dev/sde'>
    <reservations managed='yes'>
    </reservations>
  </source>
  <target dev='sdc' bus='scsi'/>
</disk>

3.Login in the guest and check:
# cat sh 
#!/bin/sh
sg_persist --no-inquiry -v --out --register-ignore --param-sark 123aaa "$@"
sg_persist --no-inquiry --in -k "$@"
sg_persist --no-inquiry -v --out --reserve --param-rk 123aaa --prout-type 5 "$@"
sg_persist --no-inquiry --in -r "$@"
sg_persist --no-inquiry -v --out --release --param-rk 123aaa --prout-type 5 "$@"
sg_persist --no-inquiry --in -r "$@"
sg_persist --no-inquiry -v --out --register --param-rk 123aaa --prout-type 5 "$@"
sg_persist --no-inquiry --in -k "$@"

# sh sh /dev/sda 
    Persistent reservation out cdb: 5f 06 00 00 00 00 00 00 18 00 
PR out: command (Register and ignore existing key) successful
  PR generation=0x1, 1 registered reservation key follows:
    0x123aaa
    Persistent reservation out cdb: 5f 01 05 00 00 00 00 00 18 00 
PR out: command (Reserve) successful
  PR generation=0x1, Reservation follows:
    Key=0x123aaa
    scope: LU_SCOPE,  type: Write Exclusive, registrants only
    Persistent reservation out cdb: 5f 02 05 00 00 00 00 00 18 00 
PR out: command (Release) successful
  PR generation=0x1, there is NO reservation held
    Persistent reservation out cdb: 5f 00 05 00 00 00 00 00 18 00 
PR out: command (Register) successful
  PR generation=0x1, there are NO registered reservation keys

No error found with either namespaces = [ "mount" ] or namespaces = [ ].

Comment 16 gaojianan 2020-06-04 10:14:57 UTC
(In reply to Roman Hodain from comment #11)
> Can you also set 
> 
>     cgroup_controllers = [ "cpu", "memory", "blkio", "cpuset", "cpuacct" ]
> 
> Paolo Bonzini suggested that there can also be an issue with cgroup
> controller for devices.


I enabled the cgroup controllers and tried again as in the comment above; it succeeds.

Comment 21 Michal Privoznik 2020-06-05 13:46:55 UTC
I can see the following AVC:

type=AVC msg=audit(1591351931.866:4399): avc:  denied  { read write } for  pid=30220 comm="qemu-kvm" path="/dev/mapper/control" dev="devtmpfs" ino=2574 scontext=system_u:system_r:svirt_t:s0:c138,c185 tcontext=system_u:object_r:lvm_control_t:s0 tclass=chr_file permissive=1

which makes me think of bug 1822522. BUT, as I was debugging, I realized that the dependent device was not allowed in CGroups. In this specific case: /dev/mapper/mpathd was passed as a disk to the VM. The device consists of /dev/sdd:

# multipath -ll
mpathe (360014057931a93b90f04cdf98ab20cd6) dm-6 LIO-ORG ,device.logical- 
size=100M features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  `- 9:0:0:1 sdf 8:80 active ready running
mpathd (360014054f261f4509ef473d822956916) dm-4 LIO-ORG ,device.logical- 
size=1.0G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  `- 8:0:0:0 sdd 8:48 active ready running
mpathc (360014058826db48b5334ecf993c1312a) dm-3 LIO-ORG ,device.logical- 
size=1.0G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  `- 7:0:0:0 sdc 8:32 active ready running
dm1 (36001405ed38b021021d42aa888ce8cd1) dm-5 LIO-ORG ,device.logical- 
size=100M features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  `- 9:0:0:0 sde 8:64 active ready running
mpathf (36001405fc7afc9b4a9c479eb6dd5c96a) dm-7 LIO-ORG ,device.logical- 
size=100M features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  `- 9:0:0:2 sdg 8:96 active ready running
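
To see at a glance which path devices (and which major:minor numbers) sit under each multipath map, the listing above can be reduced to a small table. This is a rough sketch that assumes the output layout shown here; the function name parse_multipath and the regular expressions are illustrative, and other multipath-tools versions may print a different format.

```python
import re

# Map header, e.g. "mpathd (3600...916) dm-4 LIO-ORG ,device.logical-"
MAP_RE = re.compile(r"^(\S+) \((\w+)\) (dm-\d+)")
# Path line, e.g. "  `- 8:0:0:0 sdd 8:48 active ready running"
PATH_RE = re.compile(r"(\d+:\d+:\d+:\d+) (\w+) (\d+):(\d+)\s")

# Two of the maps from the listing above.
SAMPLE = """\
mpathd (360014054f261f4509ef473d822956916) dm-4 LIO-ORG ,device.logical-
size=1.0G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  `- 8:0:0:0 sdd 8:48 active ready running
mpathc (360014058826db48b5334ecf993c1312a) dm-3 LIO-ORG ,device.logical-
size=1.0G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  `- 7:0:0:0 sdc 8:32 active ready running
"""


def parse_multipath(text):
    """Return {map_name: [(path_dev, major, minor), ...]}."""
    maps = {}
    current = None
    for line in text.splitlines():
        header = MAP_RE.match(line)
        if header:
            current = header.group(1)
            maps[current] = []
            continue
        path = PATH_RE.search(line)
        if path and current is not None:
            maps[current].append(
                (path.group(2), int(path.group(3)), int(path.group(4)))
            )
    return maps


for name, paths in parse_multipath(SAMPLE).items():
    print(name, paths)
```

For mpathd this yields the single path sdd with device number 8:48, which is the number to look for in the VM's CGroup.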



But the VM's CGroup did not contain /dev/sdd (8:48):

# grep 48 /sys/fs/cgroup/devices/machine.slice/machine-qemu\\x2d5\\x2davocado\\x2dvt\\x2dvm1.scope/devices.list | wc -l
0


Debugging continues.

Comment 22 Michal Privoznik 2020-06-05 14:51:14 UTC
Alright. I think I know what the problem is. The AVC mentioned in comment 21 is unrelated (the bug reproduces regardless of SELinux mode).

As a result of bug 1823976 libvirt does stat() + dm_is_dm_major(major(sb.st_dev)) over each path before asking devmapper for targets. Well, in this specific case the stat() returns the following:

91          if (stat(path, &sb) < 0) {
(gdb) p path
$2 = 0x7fc424012b70 "/dev/mapper/mpathd"
(gdb) n
97          if (!dm_is_dm_major(major(sb.st_dev)))
(gdb) p sb
$3 = {st_dev = 5, st_ino = 157089, st_nlink = 1, st_mode = 25008, st_uid = 107, st_gid = 107, __pad0 = 0, st_rdev = 64772, st_size = 0, st_blksize = 4096, st_blocks = 0, st_atim = {tv_sec = 1591366190, tv_nsec = 517861746}, st_mtim = {tv_sec = 1591366190, 
    tv_nsec = 517861746}, st_ctim = {tv_sec = 1591366191, tv_nsec = 795843727}, __unused = {0, 0, 0}}
(gdb) call gnu_dev_major(sb.st_dev)
$4 = 0
(gdb) call dm_is_dm_major(gnu_dev_major(sb.st_dev))
$5 = 0
(gdb) n
173     }
(gdb) p $eax


In other words: stat() returns 0:5 for /dev/mapper/mpathd and the subsequent dm_is_dm_major() returns false (of course it does). Therefore, libvirt doesn't ask for dependencies (dm_task_create(DM_DEVICE_DEPS)) and, as a result, neither allows dependent devices in CGroups nor creates them in the namespace. Incidentally, qemu-pr-helper (or a library it links with) expects to see ALL deps, but it doesn't, hence the failure.
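
The two device numbers in the gdb dump above can be decoded by hand to show the difference. This is a small worked example using Python's os.major() and os.minor(), which wrap the platform's major()/minor() macros; the expected values in the comments assume Linux, and split_devno is just an illustrative name.

```python
import os

ST_DEV = 5        # sb.st_dev from the gdb dump
ST_RDEV = 64772   # sb.st_rdev from the gdb dump (0xfd04)


def split_devno(devno):
    """Split a dev_t value into (major, minor)."""
    return os.major(devno), os.minor(devno)


print(split_devno(ST_DEV))    # (0, 5) on Linux: the device holding the node
print(split_devno(ST_RDEV))   # (253, 4) on Linux: the block device itself
```

Major 0 is what was handed to dm_is_dm_major() in the session above, which is why the check returned 0.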

Indeed, running stat returns a weird major:minor:

# stat -L /dev/mapper/mpathd
  File: ‘/dev/mapper/mpathd’
  Size: 0               Blocks: 0          IO Block: 4096   block special file
Device: 5h/5d   Inode: 157089      Links: 1     Device type: fd,4
Access: (0660/brw-rw----)  Uid: (  107/    qemu)   Gid: (  107/    qemu)
Context: system_u:object_r:svirt_image_t:s0:c187,c500
Access: 2020-06-05 10:09:50.517861746 -0400
Modify: 2020-06-05 10:09:50.517861746 -0400
Change: 2020-06-05 10:09:51.795843727 -0400
 Birth: -


The truth is that sb.st_rdev contains the correct major (253, or 0xfd). Alasdair, which stat member should be passed to dm_is_dm_major(): major(sb.st_dev), major(sb.st_rdev), or both (with it being sufficient if one succeeds)?

Comment 23 Alasdair Kergon 2020-06-10 13:30:00 UTC
man 2 stat:
               dev_t     st_rdev;    /* device ID (if special file) */
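
Put together, a check based on st_rdev could look roughly like the sketch below. It is not libvirt's code and does not call libdevmapper: as an assumption, it stands in for dm_is_dm_major() by looking up the "device-mapper" entry in the "Block devices:" section of /proc/devices. The function names and the sample text are illustrative.

```python
import os
import stat

# Illustrative excerpt in the /proc/devices layout; 253 matches this report.
SAMPLE_PROC_DEVICES = """\
Character devices:
  1 mem
 10 misc

Block devices:
  8 sd
253 device-mapper
"""


def dm_majors(proc_devices_text):
    """Collect block majors registered as 'device-mapper'."""
    majors = set()
    in_block = False
    for line in proc_devices_text.splitlines():
        if line.startswith("Block devices:"):
            in_block = True
            continue
        if line.startswith("Character devices:"):
            in_block = False
            continue
        fields = line.split()
        if in_block and len(fields) == 2 and fields[1] == "device-mapper":
            majors.add(int(fields[0]))
    return majors


def block_devno(path):
    """Return (major, minor) from st_rdev, or None if not a block device."""
    st = os.stat(path)
    if not stat.S_ISBLK(st.st_mode):
        return None
    # st_rdev identifies the device the node represents; st_dev only
    # identifies the filesystem that holds the node.
    return os.major(st.st_rdev), os.minor(st.st_rdev)


def is_dm_device(path, majors):
    """True if path is a block device whose major is a device-mapper major."""
    devno = block_devno(path)
    return devno is not None and devno[0] in majors


print(dm_majors(SAMPLE_PROC_DEVICES))
```

On a live host, the majors would be read from /proc/devices itself and is_dm_device() would be called with a path such as /dev/mapper/mpathd.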

Comment 24 Michal Privoznik 2020-06-11 11:45:42 UTC
Thanks for confirming that. Patch proposed here:

https://www.redhat.com/archives/libvir-list/2020-June/msg00482.html

Comment 25 Michal Privoznik 2020-06-15 12:53:06 UTC
Merged upstream as:

d53ab9f54e virDevMapperGetTargetsImpl: Check for dm major properly
dfa0e118f7 util: Move virIsDevMapperDevice() to virdevmapper.c

v6.4.0-95-gd53ab9f54e

Comment 44 gaojianan 2020-07-17 07:15:12 UTC
Verified at libvirt version: 
libvirt-4.5.0-36.el7_7.9.1.x86_64

Steps as in https://bugzilla.redhat.com/show_bug.cgi?id=1849095#c6
work as expected, so verified.

Comment 47 errata-xmlrpc 2020-09-29 21:18:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (libvirt bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4070

Comment 48 Red Hat Bugzilla 2023-09-14 06:01:09 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days