Bug 1643707
| Summary: | LSM is failing with L2 permissions failure. | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Frank DeLorey <fdelorey> |
| Component: | libvirt | Assignee: | Peter Krempa <pkrempa> |
| Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Han Han <hhan> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | ||
| Version: | 7.5 | CC: | abpatil, aperotti, aromito, bcholler, branpise, bzlotnik, dyuan, fdelorey, gveitmic, gwatson, hhan, jdenemar, jsuchane, lichen, lsurette, michal.skrivanek, mjankula, mkalinin, mtessun, nashok, nsoffer, pkrempa, smaudet, srevivo, tnisan, xuzhang, yafu, yalzhang, ycui |
| Target Milestone: | rc | Keywords: | Reopened |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2019-09-10 10:59:02 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1417161, 1597019, 1660265 | ||
Please add logs. Bug without logs is pretty much useless When filing RHV bugs, please include relevant software versions. I have extracted these from sosreport-ucaprdrvh05.hq.tnb.com.my.02239297-20181027030125 but ovirt-engine is interesting, too. vdsm-4.19.51-1.el7ev.x86_64 libvirt-daemon-kvm-3.9.0-14.el7_5.2.x86_64 qemu-kvm-rhev-2.10.0-21.el7_5.1.x86_64 running kernel-3.10.0-862.el7.x86_64 what is the backing storage for source and destination? NFS? iSCSI? (I'm only trying to collect more data, it is unlikely that I continue the research here) The migration is from the storage domain "f219a555-9187-4158-b7ed-60ba2cc51098" to "2cde64b7-aa85-45fe-846f-65dcd3e5ced7".
It failed when it does "blockJobAbort" with the error below.
===
2018-10-27 02:34:42,578+0800 INFO (jsonrpc/1) [vdsm.api] START diskReplicateFinish(srcDisk={u'device': u'disk', u'poolID': u'33f59a82-e273-433f-a59f-4ae661527d04', u'volumeID': u'2cde64b7-aa85-45fe-846f-65dcd3e5ced7', u'domainID': u'f219a555-9187-4158-b7ed-60ba2cc51098', u'imageID': u'733646b4-de58-4d09-96eb-8b698b759dd1'}, dstDisk={u'device': u'disk', u'poolID': u'33f59a82-e273-433f-a59f-4ae661527d04', u'volumeID': u'2cde64b7-aa85-45fe-846f-65dcd3e5ced7', u'domainID': u'326f1ee8-cdaf-4ae5-abae-2d37b6fc9cdf', u'imageID': u'733646b4-de58-4d09-96eb-8b698b759dd1'}) from=::ffff: (api:46)
libvirtError: internal error: unable to execute QEMU command 'block-job-complete': Block job 'drive-virtio-disk2' not found
===
As Yuval Turgeman mentioned in the mailing list, there are AVC denied error in the audit log at the time of issue which seems to be the root cause.
===
type=AVC msg=audit(1540578884.718:9218518): avc: denied { write } for pid=6841 comm="qemu-kvm" path="/dev/dm-233" dev="tmpfs" ino=470420506 scontext=system_u:system_r:svirt_t:s0:c107,c132 tcontext=system_u:object_r:fixed_disk_device_t:s0 tclass=blk_file
type=SYSCALL msg=audit(1540578884.718:9218518): arch=c000003e syscall=209 success=no exit=-13 a0=7f89550eb000 a1=1 a2=7f89410f9820 a3=1 items=0 ppid=1 pid=6841 auid=4294967295 uid=107 gid=107 euid=107 suid=107 fsuid=107 egid=107 sgid=107 fsgid=107 tty=(none) ses=4294967295 comm="qemu-kvm" exe="/usr/libexec/qemu-kvm" subj=system_u:system_r:svirt_t:s0:c107,c132 key=(null)
===
It's showing error on "/dev/dm-233" and the context was having is fixed_disk_device_t instead of "svirt_image_t" as per the log above. The dm-233 is the destination storage LV.
lrwxrwxrwx. 1 root root 9 Oct 27 01:25 326f1ee8--cdaf--4ae5--abae--2d37b6fc9cdf-2cde64b7--aa85--45fe--846f--65dcd3e5ced7 -> ../dm-233
The server is having libvirt 3.9.0 where libvirt will use mount namespace. So it's unlikely that any other process(like udev, vdsm) changes the SELinux context. The strace and qemu logs clearly shows "permission denied" error message.
2018-10-26T18:34:44.718660Z qemu-kvm: Failed to flush the L2 table cache: Permission denied
2018-10-26T18:34:44.719401Z qemu-kvm: Failed to flush the refcount block cache: Permission denied
Since the SELinux context is set by libvirtd, I don't think this is an RHV bug. However, I was not able to reproduce this issue yet.
Do we have any known issue with libvirt not setting correct context during blockCopy/blockJobAbort?
Also, the customer was able to migrate the disk without any changes in the next attempt.
(In reply to nijin ashok from comment #10) > The migration is from the storage domain > "f219a555-9187-4158-b7ed-60ba2cc51098" to > "2cde64b7-aa85-45fe-846f-65dcd3e5ced7". > > It failed when it does "blockJobAbort" with the error below. > > === > 2018-10-27 02:34:42,578+0800 INFO (jsonrpc/1) [vdsm.api] START > diskReplicateFinish(srcDisk={u'device': u'disk', u'poolID': > u'33f59a82-e273-433f-a59f-4ae661527d04', u'volumeID': > u'2cde64b7-aa85-45fe-846f-65dcd3e5ced7', u'domainID': > u'f219a555-9187-4158-b7ed-60ba2cc51098', u'imageID': > u'733646b4-de58-4d09-96eb-8b698b759dd1'}, dstDisk={u'device': u'disk', > u'poolID': u'33f59a82-e273-433f-a59f-4ae661527d04', u'volumeID': > u'2cde64b7-aa85-45fe-846f-65dcd3e5ced7', u'domainID': > u'326f1ee8-cdaf-4ae5-abae-2d37b6fc9cdf', u'imageID': > u'733646b4-de58-4d09-96eb-8b698b759dd1'}) from=::ffff: (api:46) > > libvirtError: internal error: unable to execute QEMU command > 'block-job-complete': Block job 'drive-virtio-disk2' not found > === > There is one libvirt bug BZ1506072 similar to this bug. > > As Yuval Turgeman mentioned in the mailing list, there are AVC denied error > in the audit log at the time of issue which seems to be the root cause. > The audit log of BZ1506072: type=AVC msg=audit(1508873458.895:3099876): avc: denied { write } for pid=4211 comm="qemu-kvm" name="dm-143" dev="tmpfs" ino=176589443 scontext=system_u:system_r:svirt_t:s0:c339,c582 tcontext=system_u:object_r:fixed_disk_device_t:s0 tclass=blk_file type=SYSCALL msg=audit(1508873458.895:3099876): arch=x86_64 syscall=open success=no exit=EACCES a0=56074c4f8031 a1=84002 a2=0 a3=0 items=0 ppid=1 pid=4211 auid=4294967295 uid=107 gid=107 euid=107 suid=107 fsuid=107 egid=107 sgid=107 fsgid=107 tty=(none) ses=4294967295 comm=qemu-kvm exe=/usr/libexec/qemu-kvm subj=system_u:system_r:svirt_t:s0:c339,c582 key=(null) > === > type=AVC msg=audit(1540578884.718:9218518): avc: denied { write } for > pid=6841 comm="qemu-kvm" path="/dev/dm-233" dev="tmpfs" ino=470420506 > scontext=system_u:system_r:svirt_t:s0:c107,c132 > tcontext=system_u:object_r:fixed_disk_device_t:s0 tclass=blk_file > type=SYSCALL msg=audit(1540578884.718:9218518): arch=c000003e syscall=209 > success=no exit=-13 a0=7f89550eb000 a1=1 a2=7f89410f9820 a3=1 items=0 ppid=1 > pid=6841 auid=4294967295 uid=107 gid=107 euid=107 suid=107 fsuid=107 > egid=107 sgid=107 fsgid=107 tty=(none) ses=4294967295 comm="qemu-kvm" > exe="/usr/libexec/qemu-kvm" subj=system_u:system_r:svirt_t:s0:c107,c132 > key=(null) > === > > It's showing error on "/dev/dm-233" and the context was having is > fixed_disk_device_t instead of "svirt_image_t" as per the log above. The > dm-233 is the destination storage LV. > > > lrwxrwxrwx. 1 root root 9 Oct 27 01:25 > 326f1ee8--cdaf--4ae5--abae--2d37b6fc9cdf-2cde64b7--aa85--45fe--846f-- > 65dcd3e5ced7 -> ../dm-233 > > > The server is having libvirt 3.9.0 where libvirt will use mount namespace. > So it's unlikely that any other process(like udev, vdsm) changes the SELinux > context. The strace and qemu logs clearly shows "permission denied" error > message. > > 2018-10-26T18:34:44.718660Z qemu-kvm: Failed to flush the L2 table cache: > Permission denied > 2018-10-26T18:34:44.719401Z qemu-kvm: Failed to flush the refcount block > cache: Permission denied > > Since the SELinux context is set by libvirtd, I don't think this is an RHV > bug. However, I was not able to reproduce this issue yet. > > Do we have any known issue with libvirt not setting correct context during > blockCopy/blockJobAbort? > However, BZ1506072 is fixed in libvirt-3.9.0-3.el7. I will try to reproduce this bug on the libvirt version of reporter. Could you tell me what is the exact libvirt version of reporter? I don't know how to find it in the sos reports.... > Also, the customer was able to migrate the disk without any changes in the > next attempt. I find the libvirt-libs version of sos report is: libvirt-libs-3.2.0-14.el7_4.3.x86_64. However, the similar bug is fixed since libvirt-3.2.0-14.el7_4.5(BZ1516717). I will keep tracking and try to reproduce it. Sorry for the wrong libvirt version. I mistook history libvirt version as current libvirt version...
I try to reproduce it on libvirt-3.9.0-14.el7_5.2.x86_64 qemu-kvm-rhev-2.10.0-21.el7_5.1.x86_64 via virsh:
Since LSM in rhv actually invokes libvirt blockCopy with --shallow --reuse-external flag to copy storage, then invokes blockJobAbort with --pivot to switch to new storage. So I tried with following
1. Prepare a running VM, lvm backend, virtio type
# virsh dumpxml rhel|grep '/<disk/,/<\disk/'
<disk type='block' device='disk'>
<driver name='qemu' type='raw'/>
<source dev='/dev/test/base'/>
<target dev='vda' bus='virtio'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
</disk>
2. Create backing file
# qemu-img create -b /dev/test/base -f qcow2 /dev/test/lv1 -o backing_fmt=raw
# qemu-img create -b /dev/test/base -f qcow2 /dev/test/lv2 -o backing_fmt=raw
3. Start vm and create disk snapshot with reused external
# virsh start rhel
Domain rhel started
# virsh snapshot-create-as rhel s1 --no-metadata --reuse-external --disk-only --diskspec vda,file=/dev/test/lv1
Domain snapshot s1 created
4. Do shallow blockcopy with reused external disk. Use --pivot to switch to new storage after storage copied.
# virsh blockcopy rhel vda /dev/test/lv2 --shallow --wait --verbose --transient-job --reuse-external --pivot
Block Copy: [100 %]
Successfully pivoted
# virsh blockcopy rhel vda /dev/test/lv1 --shallow --wait --verbose --transient-job --reuse-external --pivot
Block Copy: [100 %]
Successfully pivoted
This bug is not reproduced.
(In reply to Han Han from comment #13) > This bug is not reproduced. I was not able to reproduce in my test environment as well. I tested it multiple times using RHV with same vdsm/libvirt version. From the customer's vdsm log, I can see that the issue has happened on multiple VMs. And the customer is having libvirt 3.9.0-14.el7_5.2 which already have the fix for BZ 1516717 and BZ 1506072. Also, the issue is fixed automatically if he tried the LSM multiple times. I was trying to get libvirt debug logs and output of "ls -lRZ /dev/*" within the mount namespace in a loop during the migration to have more understanding on this if he is able to reproduce the issue again. Do you need any other specific information if the customer is able to reproduce the issue again so that we can understand the root cause? (In reply to nijin ashok from comment #14) > (In reply to Han Han from comment #13) > > This bug is not reproduced. > > I was not able to reproduce in my test environment as well. I tested it > multiple times using RHV with same vdsm/libvirt version. > > From the customer's vdsm log, I can see that the issue has happened on > multiple VMs. And the customer is having libvirt 3.9.0-14.el7_5.2 which > already have the fix for BZ 1516717 and BZ 1506072. Also, the issue is fixed > automatically if he tried the LSM multiple times. > > I was trying to get libvirt debug logs and output of "ls -lRZ /dev/*" within > the mount namespace in a loop during the migration to have more > understanding on this if he is able to reproduce the issue again. > > Do you need any other specific information if the customer is able to > reproduce the issue again so that we can understand the root cause? No more extra info needed. It is better to get libvirtd.log, but libvirtd log was not enable in the env of reporter:) I will check the libvirt part of vdsm.log with problem tomorrow. And try to reproduce it by following the vdsm log... Also, the bug 1597019 looks similar to this one which was opened by RHV QE. I can see same "permission denied" error messages in the qemu logs for the VM in that BZ as well. Looks like they were not able to reproduce it after that. Following the operations in vdsm, I wrote a script to reproduce it:
#!/bin/bash -x
DOM=rhel
DISK=vda
IMG_SRC=/var/lib/libvirt/images/rhel.qcow2
BASE=/datacenter/storage/base
MIR1=/datacenter/storage/lv1
MIR2=/datacenter/storage/lv2
function blockcopy_check() {
virsh blockcopy $DOM $DISK $1 --shallow --wait --verbose --pivot --transient-job --reuse-external
if [ $? -ne 0 ];then
echo fail
exit 2
fi
}
#qemu-img convert $IMG_SRC -O raw $BASE
qemu-img create -b $BASE -f qcow2 $MIR1 -o backing_fmt=raw
qemu-img create -b $BASE -f qcow2 $MIR2 -o backing_fmt=raw
# preparation
# Please prepare symlink of block backend: $BASE $MIR1 $MIR2
useradd vdsm -u 36 -G kvm
DIR=$(dirname $BASE)
chmod 777 $DIR -R
chcon -u system_u -r object_r -t virt_var_run_t -l s0 $DIR -R
while true;do
virsh start $DOM && sleep 30
virsh snapshot-create-as $DOM s1 --no-metadata --disk-only --reuse-external --diskspec $DISK,file=$MIR1
blockcopy_check $MIR2
blockcopy_check $MIR1
virsh destroy $DOM
virsh detach-disk $DOM $DISK --config
virsh attach-disk $DOM $BASE $DISK --config
done
Hope it can hit this bug sometime :)
I tried to produce it on:
libvirt-3.9.0-14.el7_5.8.x86_64
qemu-kvm-rhev-2.10.0-21.el7_5.7.x86_64
selinux-policy-3.13.1-192.el7_5.7.noarch
1. Create logical volumes on fc storage
2. Create symbol links of the LVs
# ls /rhev/ -alZ
drwxr-xr-x. root root unconfined_u:object_r:mnt_t:s0 .
dr-xr-xr-x. root root system_u:object_r:root_t:s0 ..
lrwxrwxrwx. root root unconfined_u:object_r:mnt_t:s0 lv1 -> /dev/fc/lv
lrwxrwxrwx. root root unconfined_u:object_r:mnt_t:s0 lv2 -> /dev/fc/lv1
# ls /dev/fc -alZ
drwxr-xr-x. root root system_u:object_r:device_t:s0 .
drwxr-xr-x. root root system_u:object_r:device_t:s0 ..
lrwxrwxrwx. root root system_u:object_r:device_t:s0 lv -> ../dm-3
lrwxrwxrwx. root root system_u:object_r:device_t:s0 lv1 -> ../dm-4
3. Create a VM based /rhev/lv1
Disk xml:
<disk type='block' device='disk' snapshot='no'>
<driver name='qemu' type='raw' cache='none' error_policy='stop' io='native'/>
<source dev='/rhev/lv1'/>
<backingStore/>
<target dev='vda' bus='virtio'/>
</disk>
Start vm and blockcopy to another logical volume:
# virsh blockcopy fc vda /rhev/lv2 --wait --verbose --pivot
Block Copy: [100 %]
Successfully pivoted
Copy back:
# virsh blockcopy fc vda /rhev/lv1 --wait --verbose --pivot
Block Copy: [100 %]
Successfully pivoted
Bug not reproduced.
Hi Raz, are they that vdsm does when preparing fc storage? What vdsm exactly does when preparing fc?
Hi Han Han, +Elad who discovered this bug (https://bugzilla.redhat.com/show_bug.cgi?id=1597019) on RHV QE side and looks exactly the same. We probably can reproduce it on our side? Hi Han Han, the following are the commands used for volume and VM creation. All taken from the logs uploaded to bug 1597019 in https://bugzilla.redhat.com/attachment.cgi?id=1455763 hypervisor-lynx01_vdsm.log: LV creation: 2018-06-29 14:58:20,773+0300 DEBUG (tasks/4) [storage.Misc.excCmd] /usr/bin/taskset --cpu-list 0-7 /usr/bin/sudo -n /sbin/lvm lvcreate --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [ '\''a|/dev/mapper/3600a09803830447a4f244c4657595057|'\'', '\''r|.*|'\'' ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min = 50 retain_days = 0 } ' --autobackup n --contiguous n --size 1024m --addtag OVIRT_VOL_INITIALIZING --name 45468d0d-77ab-4ff1-a60c-6e68e1c7e0bb f54afc10-74ad-4ae5-99b5-32f1ac874c44 (cwd None) (commands:65) hypervisor-lynx03_vdsm.log: VM XML for domain create (start VM): <domain type="kvm" xmlns:ovirt-tune="http://ovirt.org/vm/tune/1.0" xmlns:ovirt-vm="http://ovirt.org/vm/1.0"><name>vm_type_2914583186_from_both</name><uuid>a09ab6e7-57bf-4010-b616-003df33ff5b5</uuid><memory>1048576</memory><currentMemory>1048576</currentMemory><maxMemory slots="16">4194304</maxMemory><vcpu current="1">16</vcpu><sysinfo type="smbios"><system><entry name="manufacturer">oVirt</entry><entry name="product">OS-NAME:</entry><entry name="version">OS-VERSION:</entry><entry name="serial">HOST-SERIAL:</entry><entry name="uuid">a09ab6e7-57bf-4010-b616-003df33ff5b5</entry></system></sysinfo><clock offset="variable" adjustment="0"><timer name="rtc" tickpolicy="catchup"></timer><timer name="pit" tickpolicy="delay"></timer><timer name="hpet" present="no"></timer></clock><features><acpi></acpi><vmcoreinfo></vmcoreinfo></features><cpu match="exact"><model>Westmere</model><topology cores="1" threads="1" sockets="16"></topology><numa><cell id="0" cpus="0" memory="1048576"></cell></numa></cpu><cputune></cputune><devices><input type="mouse" bus="ps2"></input><channel type="unix"><target type="virtio" name="ovirt-guest-agent.0"></target><source mode="bind" path="/var/lib/libvirt/qemu/channels/a09ab6e7-57bf-4010-b616-003df33ff5b5.ovirt-guest-agent.0"></source></channel><channel type="unix"><target type="virtio" name="org.qemu.guest_agent.0"></target><source mode="bind" path="/var/lib/libvirt/qemu/channels/a09ab6e7-57bf-4010-b616-003df33ff5b5.org.qemu.guest_agent.0"></source></channel><controller type="virtio-serial" index="0" ports="16"><alias name="ua-1e1a4b3e-822e-4474-bcef-9a7a062f7fbe"></alias></controller><video><model type="qxl" vram="8192" heads="1" ram="65536" vgamem="16384"></model><alias name="ua-3caa0ebc-f90a-4450-a72c-3006e313c9ed"></alias></video><memballoon model="virtio"><stats period="5"></stats><alias name="ua-4f67f598-d26c-4700-b2b1-71443e4fb2f4"></alias></memballoon><sound model="ich6"><alias name="ua-58891468-8889-43ac-8661-f1b02fcb610e"></alias></sound><controller type="usb" model="piix3-uhci" index="0"></controller><graphics type="spice" port="-1" autoport="yes" passwd="*****" passwdValidTo="1970-01-01T00:00:01" tlsPort="-1"><channel name="main" mode="secure"></channel><channel name="inputs" mode="secure"></channel><channel name="cursor" mode="secure"></channel><channel name="playback" mode="secure"></channel><channel name="record" mode="secure"></channel><channel name="display" mode="secure"></channel><channel name="smartcard" mode="secure"></channel><channel name="usbredir" mode="secure"></channel><listen type="network" network="vdsm-ovirtmgmt"></listen></graphics><rng model="virtio"><backend model="random">/dev/urandom</backend><alias name="ua-ab6e48fd-84ee-4681-95e9-5bf3ac10be0e"></alias></rng><controller type="scsi" model="virtio-scsi" index="0"><alias name="ua-b54c7cc0-0810-46ef-a893-977030002d54"></alias></controller><channel type="spicevmc"><target type="virtio" name="com.redhat.spice.0"></target></channel><interface type="bridge"><model type="virtio"></model><link state="up"></link><source bridge="ovirtmgmt"></source><alias name="ua-aceaac08-f924-4f3a-ab96-a7a1ac0cc711"></alias><mac address="00:1a:4a:16:88:0f"></mac><filterref filter="vdsm-no-mac-spoofing"></filterref><bandwidth></bandwidth></interface><disk type="file" device="cdrom" snapshot="no"><driver name="qemu" type="raw" error_policy="report"></driver><source file="" startupPolicy="optional"></source><target dev="hdc" bus="ide"></target><readonly></readonly><alias name="ua-84ff7fe0-9caf-4c6b-8d02-9e76ec9fe314"></alias></disk><disk snapshot="no" type="block" device="disk"><target dev="vda" bus="virtio"></target><source dev="/rhev/data-center/mnt/blockSD/f54afc10-74ad-4ae5-99b5-32f1ac874c44/images/fee6faec-ebfc-440a-8e88-84fe4ecf68d1/cad8fb18-bfdc-4585-aee6-1a8218d18c09"></source><driver name="qemu" io="native" type="qcow2" error_policy="stop" cache="none"></driver><alias name="ua-fee6faec-ebfc-440a-8e88-84fe4ecf68d1"></alias><boot order="1"></boot><serial>fee6faec-ebfc-440a-8e88-84fe4ecf68d1</serial></disk></devices><pm><suspend-to-disk enabled="no"></suspend-to-disk><suspend-to-mem enabled="no"></suspend-to-mem></pm><os><type arch="x86_64" machine="pc-i440fx-rhel7.5.0">hvm</type><smbios mode="sysinfo"></smbios></os><metadata><ovirt-tune:qos></ovirt-tune:qos><ovirt-vm:vm><minGuaranteedMemoryMb type="int">1024</minGuaranteedMemoryMb><clusterVersion>4.2</clusterVersion><ovirt-vm:custom></ovirt-vm:custom><ovirt-vm:device mac_address="00:1a:4a:16:88:0f"><ovirt-vm:custom></ovirt-vm:custom></ovirt-vm:device><ovirt-vm:device devtype="disk" name="vda"><ovirt-vm:poolID>5eb0ed42-7acf-11e8-ae76-00163e7be006</ovirt-vm:poolID><ovirt-vm:volumeID>cad8fb18-bfdc-4585-aee6-1a8218d18c09</ovirt-vm:volumeID><ovirt-vm:imageID>fee6faec-ebfc-440a-8e88-84fe4ecf68d1</ovirt-vm:imageID><ovirt-vm:domainID>f54afc10-74ad-4ae5-99b5-32f1ac874c44</ovirt-vm:domainID></ovirt-vm:device><launchPaused>false</launchPaused><resumeBehavior>auto_resume</resumeBehavior></ovirt-vm:vm></metadata></domain>'} The failure: 2018-06-29 15:04:31,212+0300 INFO (jsonrpc/3) [virt.vm] (vmId='a09ab6e7-57bf-4010-b616-003df33ff5b5') Disabling drive monitoring (drivemonitor:58) 2018-06-29 15:04:31,276+0300 ERROR (jsonrpc/3) [virt.vm] (vmId='a09ab6e7-57bf-4010-b616-003df33ff5b5') Unable to stop the replication for the drive: vda (vm:4750) Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 4747, in diskReplicateFinish self._dom.blockJobAbort(drive.name, blockJobFlags) File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 98, in f ret = attr(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 130, in wrapper ret = f(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 92, in wrapper return func(inst, *args, **kwargs) File "/usr/lib64/python2.7/site-packages/libvirt.py", line 773, in blockJobAbort if ret == -1: raise libvirtError ('virDomainBlockJobAbort() failed', dom=self) libvirtError: internal error: unable to execute QEMU command 'block-job-complete': Block job 'drive-ua-fee6faec-ebfc-440a-8e88-84fe4ecf68d1' not found Hi Han Han, unfortunately, we still don't have the exact steps that trigger the issue. You can track the correspondings in bug 1660265 which has several customer tickets attached. I looked through some of the logs attached but none of those I looked at contained libvirtd debug logs. Without that I can't help much. It is fishy though that 'block-job-complete' was called in the situations where the job is missing as libvirt makes sure that the job is in ready state before sending the command or that it can't be completed. Both should not have happened. Unfortunately without debug logs or an reproducer it's very hard to figure out what's happening. (In reply to Peter Krempa from comment #35) > It is fishy though that 'block-job-complete' was called in the situations > where the job is missing as libvirt makes sure that the job is in ready > state before sending the command or that it can't be completed. Both should > not have happened. Peter, we had a similar issue in blockCommit in the past, when we failed to check if the block job is ready in the xml before trying to pivot. Maybe we have similar race? What vdsm i doing is (simplified): # Looking for the replication blockJob info (checking its presence) blkJobInfo = self._dom.blockJobInfo(drive.name, 0) # Checking if we reached the replication mode ("mirroring" in libvirt # and qemu terms) if blkJobInfo['cur'] != blkJobInfo['end']: fail... # Stopping the replication self._dom.blockJobAbort(drive.name, blockJobFlags) In active layer commit, we do: def _activeLayerCommitReady(self, jobInfo, drive): if (jobInfo['cur'] == jobInfo['end'] and jobInfo['type'] == pivot): # Check the job state in the xml to make sure the job is # ready. We know about two interesting corner cases: # # - cur == 0 and end == 0 when a job starts. Trying to pivot # succeeds, but the xml never updates after that. # See https://bugzilla.redhat.com/1442266. # # - cur == end and cur != 0, but the job is not ready yet, and # blockJobAbort raises an error. # See https://bugzilla.redhat.com/1376580 self.log.debug("Checking xml for drive %r", drive.name) root = ET.fromstring(self._dom.XMLDesc(0)) disk_xpath = "./devices/disk/target[@dev='%s'].." % drive.name disk = root.find(disk_xpath) if disk is None: self.log.warning("Unable to find %r in vm xml", drive) return False return disk.find("./mirror[@ready='yes']") is not None return False Should we do something similar for blockCopy? Or maybe we should register VIR_DOMAIN_EVENT_ID_BLOCK_JOB_2 and VIR_DOMAIN_BLOCK_JOB_READY to detect that that a blockCopy job is ready? Currently we register libvirt.VIR_DOMAIN_EVENT_ID_BLOCK_JOB, but we only log the event, and take no action. (In reply to Nir Soffer from comment #37) > (In reply to Peter Krempa from comment #35) > > It is fishy though that 'block-job-complete' was called in the situations > > where the job is missing as libvirt makes sure that the job is in ready > > state before sending the command or that it can't be completed. Both should > > not have happened. > > Peter, we had a similar issue in blockCommit in the past, when we failed to > check > if the block job is ready in the xml before trying to pivot. Maybe we have > similar > race? You are right in the regard that active block-commit and block-copy have exactly the same internals in libvirt ... > > What vdsm i doing is (simplified): > > # Looking for the replication blockJob info (checking its presence) > blkJobInfo = self._dom.blockJobInfo(drive.name, 0) > > # Checking if we reached the replication mode ("mirroring" in libvirt > # and qemu terms) > if blkJobInfo['cur'] != blkJobInfo['end']: > > fail... > > # Stopping the replication > > self._dom.blockJobAbort(drive.name, blockJobFlags) I presume this includes _PIVOT here. You are right that the scenario described below can happen. The thing is though that it will have different behaviour regarding the error messages reported: If we assume that the _PIVOT flag is used we perform the following check: 17223 if (disk->mirrorState != VIR_DOMAIN_DISK_MIRROR_STATE_READY) { 17224 virReportError(VIR_ERR_BLOCK_COPY_ACTIVE, 17225 _("disk '%s' not ready for pivot yet"), 17226 disk->dst); 17227 goto cleanup; 17228 } Libvirt updates the mirror state only according to events. (Historically (e.g. in libvirt-3.9.0 as described ) there was also code which updated the state if it was not READY, but that actually checked the 'ready' flag in qemu (if qemu supported it) and thus should not have caused problems)). I'm not sure which version introduced the 'ready' state. The error that you can see e.g. in comment 31 happens later as a response to using block-job-complete command: libvirtError: internal error: unable to execute QEMU command 'block-job-complete': Block job 'drive-ua-fee6faec-ebfc-440a-8e88-84fe4ecf68d1' not found > In active layer commit, we do: > > def _activeLayerCommitReady(self, jobInfo, drive): > > if (jobInfo['cur'] == jobInfo['end'] and jobInfo['type'] == pivot): > > # Check the job state in the xml to make sure the job is > # ready. We know about two interesting corner cases: > # > # - cur == 0 and end == 0 when a job starts. Trying to pivot > # succeeds, but the xml never updates after that. > # See https://bugzilla.redhat.com/1442266. > # > # - cur == end and cur != 0, but the job is not ready yet, and > # blockJobAbort raises an error. > # See https://bugzilla.redhat.com/1376580 There is also a case when cur == end and its non-zero, but the job is not ready for pivot yet. I've recently pushed a fix to this which decrements "cur" by one if the job is not ready: commit 73ce3911aa0404f070468ae00c78369a5b1ac071 Author: Peter Krempa <pkrempa> Date: Tue Jan 29 17:17:29 2019 +0100 qemu: blockjob: Don't report block job progress at 100% if job isn't ready Some clients poll virDomainGetBlockJobInfo rather than wait for the VIR_DOMAIN_BLOCK_JOB_READY event. In some cases qemu can get to 100% and still not reach the synchronised phase. Initiating a pivot in that case will fail. Given that computers are interacting here, the error that the job can't be finalized yet is not handled very well by those specific implementations. Our docs now correctly state to use the event. We already do a similar output adjustment in case when the progress is not available from qemu as in that case we'd report 0 out of 0, which some apps also incorrectly considered as 100% complete. In this case we subtract 1 from the progress if the ready state is not signalled by qemu if the progress was at 100% otherwise. Signed-off-by: Peter Krempa <pkrempa> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 96cb108a0e..2485dd93eb 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -17440,6 +17440,14 @@ qemuBlockJobInfoTranslate(qemuMonitorBlockJobInfoPtr rawInfo, } } + /* If qemu reports that it's not ready yet don't make the job go to + * cur == end as some apps wrote code polling this instead of waiting for + * the ready event */ + if (rawInfo->ready == 0 && + info->cur == info->end && + info->cur > 0) + info->cur -= 1; + info->type = rawInfo->type; if (info->type == VIR_DOMAIN_BLOCK_JOB_TYPE_COMMIT && disk->mirrorJob == VIR_DOMAIN_BLOCK_JOB_TYPE_ACTIVE_COMMIT) v5.0.0-180-g73ce3911aa > > self.log.debug("Checking xml for drive %r", drive.name) > root = ET.fromstring(self._dom.XMLDesc(0)) > disk_xpath = "./devices/disk/target[@dev='%s'].." % drive.name > disk = root.find(disk_xpath) > if disk is None: > self.log.warning("Unable to find %r in vm xml", drive) > return False > return disk.find("./mirror[@ready='yes']") is not None > return False > > Should we do something similar for blockCopy? > > Or maybe we should register VIR_DOMAIN_EVENT_ID_BLOCK_JOB_2 and > VIR_DOMAIN_BLOCK_JOB_READY to detect that that a blockCopy job > is ready? That is the best solution. Also new libvirt munges the output if the job is not ready for pivot now. > > Currently we register libvirt.VIR_DOMAIN_EVENT_ID_BLOCK_JOB, but we only > log the event, and take no action. VIR_DOMAIN_EVENT_ID_BLOCK_JOB_2 is better as it reports the disk 'target' and thus works also for network disks where we don't report the first event. Since we cannot reproduce after multiple attempts and we do not have the logs available, let's close the bug for now. The issue is documented in KCS: https://access.redhat.com/solutions/3771541. If anyone hits the issue again and has the libvirt logs from that same time, please upload to the BZ and reopen. FWIW, it is possible to edit libvirt log level settings without restarting libvirt, using virt-admin command, as documented here: https://access.redhat.com/solutions/53568 Hi Peter and Jaroslav, Would you be kind enough to specify the log_filters needed to further investigate this problem? We are currently proposing adding a small subset of libvirt logs in RHV by default via BZ1749634 ("1:qemu" at the moment) and are interested in what filters could help solving this bug. This is because this bug is very hard to reproduce and enabling debug logs after the problem happened is not very useful, so the idea is if we could run by default with some logging that would help solve this bug too. Or we don't know the specifics and something more broad is needed, like below? ~~~ log_filters="1:libvirt 1:qemu 1:conf 1:security 3:event 3:json 3:file 3:object 1:util" ~~~ Thanks, (In reply to Germano Veit Michel from comment #52) > Hi Peter and Jaroslav, > > Would you be kind enough to specify the log_filters needed to further > investigate this problem? > > We are currently proposing adding a small subset of libvirt logs in RHV by > default via BZ1749634 ("1:qemu" at the moment) and are interested in what > filters could help solving this bug. > This is because this bug is very hard to reproduce and enabling debug logs > after the problem happened is not very useful, so the idea is if we could > run by default with some logging that would help solve this bug too. > > Or we don't know the specifics and something more broad is needed, like > below? > ~~~ > log_filters="1:libvirt 1:qemu 1:conf 1:security 3:event 3:json 3:file > 3:object 1:util" > ~~~ Yes, this setting should be fine. Thanks. I propose to add insights rule to check whether libvirt logs are enabled. Should the rule check if libvirt logs are size 0 or don't exist? Or check the `/etc/libvirt/libvirtd.conf` if it exists and what is the config in it? Petr, would you have any suggestions maybe on what to check? Logging can be done in various ways. The best way is to check whether the proper logging is configured in /etc/libvirt/libvirtd.conf Depending on version the 'log_level', 'log_filters' and 'log_outputs' variables configure logging. Please see https://wiki.libvirt.org/page/DebugLogs for more details. |
Description of problem: Customer is unable to do Live Storage Migration of the disk "diskname", it fails. Version-Release number of selected component (if applicable): 4.0.7 How reproducible: Currently stuck and cannot migrate this drive. Steps to Reproduce: 1. 2. 3. Actual results: LSM of a drive fails and reports unable to flush L2 cache: permission denied Expected results: LSM should move the drive Additional info: From strace: 2:34:44 read(10, "\x01\x00\x00\x00\x00\x00\x00\x00", 512) = 8 02:34:44 ppoll([{fd=24, events=POLLIN|POLLERR|POLLHUP}, {fd=10, events=POLLIN|POLLERR|POLLHUP}], 2, {0, 0}, NULL, 8) = 0 (Timeout) 02:34:44 ppoll([{fd=24, events=POLLIN|POLLERR|POLLHUP}, {fd=10, events=POLLIN|POLLERR|POLLHUP}], 2, {0, 0}, NULL, 8) = 0 (Timeout) 02:34:44 io_submit(140227814273024, 1, [{pwritev, fildes=23, iovec=[{"\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01"..., 65536}], offset=131072, resfd=24}] ) = -1 EACCES (Permission denied) 02:34:44 io_submit(140227814273024, 1, [{pwritev, fildes=23, iovec=[{"\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01"..., 65536}], offset=131072, resfd=24}] ) = -1 EACCES (Permission denied) 02:34:44 ppoll([{fd=24, events=POLLIN|POLLERR|POLLHUP}, {fd=10, events=POLLIN|POLLERR|POLLHUP}], 2, {0, 0}, NULL, 8) = 0 (Timeout) 02:34:44 ppoll([{fd=24, events=POLLIN|POLLERR|POLLHUP}, {fd=10, events=POLLIN|POLLERR|POLLHUP}], 2, {0, 0}, NULL, 8) = 0 (Timeout) 02:34:44 ppoll([{fd=24, events=POLLIN|POLLERR|POLLHUP}, {fd=10, events=POLLIN|POLLERR|POLLHUP}], 2, {0, 0}, NULL, 8) = 0 (Timeout) 02:34:44 io_submit(140227814273024, 1, [{pwritev, fildes=23, iovec=[{"\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01"..., 65536}], offset=131072, resfd=24}] ) = -1 EACCES (Permission denied) 02:34:44 ppoll([{fd=24, events=POLLIN|POLLERR|POLLHUP}, {fd=10, events=POLLIN|POLLERR|POLLHUP}], 2, {0, 0}, NULL, 8) = 0 (Timeout) 02:34:44 io_submit(140227814273024, 1, [{pwritev, fildes=23, iovec=[{"\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01"..., 65536}], offset=131072, resfd=24}] ) = -1 EACCES (Permission denied) 02:34:44 ppoll([{fd=24, events=POLLIN|POLLERR|POLLHUP}, {fd=10, events=POLLIN|POLLERR|POLLHUP}], 2, {0, 0}, NULL, 8) = 0 (Timeout) 02:34:44 write(2, "2018-10-26T18:34:44.718660Z ", 28) = 28 02:34:44 write(2, "qemu-kvm:", 9) = 9 02:34:44 write(2, " ", 1) = 1 02:34:44 write(2, "Failed to flush the L2 table cac"..., 53) = 53 02:34:44 write(2, "\n", 1) = 1 02:34:44 io_submit(140227814273024, 1, [{pwritev, fildes=23, iovec=[{"\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01"..., 65536}], offset=131072, resfd=24}] ) = -1 EACCES (Permission denied) The strace is too large to attach to this case.