Bug 1413773

Summary: new regression on GIT: Error:An error occurred, but the cause is unknown
Product: [Community] Virtualization Tools
Reporter: sL1pKn07 <sl1pkn07>
Component: libvirt
Assignee: Michal Privoznik <mprivozn>
Status: CLOSED NEXTRELEASE
QA Contact:
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: unspecified
CC: eskultet, libvirt-maint, me, mprivozn, pkrempa, rbalakri, sl1pkn07, zman0900
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-02-13 07:30:32 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
libvirt_client.log (flags: none)
windoze.log (flags: none)
libvirtd.log (flags: none)

Description sL1pKn07 2017-01-16 22:54:35 UTC
Description of problem:


The VM can no longer be launched after commit 095f042ed68b0162c784ad5d0eb30c289d2fcd90.

The configuration is the same as in https://bugzilla.redhat.com/show_bug.cgi?id=1406837

How reproducible:


Steps to Reproduce:
1. Build the latest code from libvirt git
2. Restart the libvirtd and virtlogd services
3. Run the VM with 'virsh -c qemu:///system start VM' as a normal user

Actual results:

The launch fails with:

Error:An error occurred, but the cause is unknown

when the VM is started with:

virsh -c qemu:///system start VM


Expected results:

The VM launches without problems.


Additional info:


Bisecting the commits shows that the failure starts with commit 095f042ed68b0162c784ad5d0eb30c289d2fcd90:

https://libvirt.org/git/?p=libvirt.git;a=commit;h=095f042ed68b0162c784ad5d0eb30c289d2fcd90

Comment 1 sL1pKn07 2017-01-16 23:37:24 UTC
*** Bug 1413771 has been marked as a duplicate of this bug. ***

Comment 2 sL1pKn07 2017-01-16 23:39:37 UTC
*** Bug 1413772 has been marked as a duplicate of this bug. ***

Comment 3 Peter Krempa 2017-01-17 07:50:16 UTC
commit 095f042ed68b0162c784ad5d0eb30c289d2fcd90
Author: Michal Privoznik <mprivozn>
Date:   Thu Dec 15 16:47:15 2016 +0100

    qemu: Use transactions from security driver
    
    So far if qemu is spawned under separate mount namespace in order
    to relabel everything it needs an access to the security driver
    to run in that namespace too. This has a very nasty down side -
    it is being run in a separate process, so any internal state
    transition is NOT reflected in the daemon. This can lead to many
    sleepless nights. Therefore, use the transaction APIs so that
    libvirt developers can sleep tight again.

Comment 4 Erik Skultety 2017-01-17 15:35:52 UTC
Fixed upstream by:

commit 7e8b2da74f1322050a993ca988bfbea997a84355
Author:     Erik Skultety <eskultet>
AuthorDate: Tue Jan 17 12:22:14 2017 +0100
Commit:     Erik Skultety <eskultet>
CommitDate: Tue Jan 17 15:49:57 2017 +0100

    security: SELinux: fix the transaction model's list append
    
    The problem is in the way how the list item is created prior to
    appending it to the transaction list - the @path argument is just a
    shallow copy instead of deep copy of the hostdev device's path.
    Unfortunately, the hostdev devices from which the @path is extracted, in
    order to add them into the transaction list, are only temporary and
    freed before the buildup of the qemu namespace, thus making the @path
    attribute in the transaction list NULL, causing 'permission denied' or
    'double free' or 'unknown cause' errors.
    
    Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1413773

commit df7f42d5bea7b98483aab510748eded5f6e8f437
Author:     Erik Skultety <eskultet>
AuthorDate: Tue Jan 17 12:21:27 2017 +0100
Commit:     Erik Skultety <eskultet>
CommitDate: Tue Jan 17 15:49:57 2017 +0100

    security: DAC: fix the transaction model's list append
    
    The problem is in the way how the list item is created prior to
    appending it to the transaction list - the @path attribute is just a
    shallow copy instead of deep copy of the hostdev device's path.
    Unfortunately, the hostdev devices from which the @path is extracted, in
    order to add them into the transaction list, are only temporary and
    freed before the buildup of the qemu namespace, thus making the @path
    attribute in the transaction list NULL, causing 'permission denied' or
    'double free' or 'unknown cause' errors.
    
    Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1413773
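The shallow-versus-deep copy problem described in the two commit messages above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not libvirt's actual code: the `TxItem` and `TmpDev` types, every function name, and the example path are hypothetical stand-ins for a transaction list item and a temporary hostdev object.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one entry in a transaction list. */
typedef struct {
    char *path;
} TxItem;

/* Hypothetical stand-in for a temporary hostdev object that owns its path. */
typedef struct {
    char *path;
} TmpDev;

TmpDev *tmp_dev_new(const char *path)
{
    TmpDev *dev = malloc(sizeof(*dev));
    if (!dev)
        return NULL;
    dev->path = strdup(path);
    if (!dev->path) {
        free(dev);
        return NULL;
    }
    return dev;
}

void tmp_dev_free(TmpDev *dev)
{
    if (!dev)
        return;
    free(dev->path);
    free(dev);
}

/* Buggy pattern: the item only borrows the device's pointer. Once the
 * temporary device is freed, item->path no longer points at valid memory,
 * and freeing it again later would be a double free. */
void tx_item_set_shallow(TxItem *item, const TmpDev *dev)
{
    item->path = dev->path;
}

/* Fixed pattern: the item owns a private copy of the string, so it stays
 * valid no matter when the temporary device is freed. */
int tx_item_set_deep(TxItem *item, const TmpDev *dev)
{
    item->path = strdup(dev->path);
    return item->path ? 0 : -1;
}

/* Release a deep-copied path. */
void tx_item_clear(TxItem *item)
{
    free(item->path);
    item->path = NULL;
}
```

With `tx_item_set_deep()`, the item can be used after `tmp_dev_free()` has run. With `tx_item_set_shallow()`, any later use of `item->path` is undefined behaviour, which fits the varying symptoms listed in the commit messages.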

Comment 5 sL1pKn07 2017-01-17 15:46:39 UTC
No, the same issue persists.

Note: my system does not use SELinux.

Comment 6 Michal Privoznik 2017-01-17 15:50:36 UTC
Can you please share the debug logs from both daemon and the domain then?

http://wiki.libvirt.org/page/DebugLogs

Comment 7 sL1pKn07 2017-01-17 15:53:49 UTC
┌─┤[$]|[sl1pkn07]|[sL1pKn07]|[~/aplicaciones/murmur-git]|
└───╼  LC_ALL=C sudo libvirtd
2017-01-17 15:52:45.120+0000: 2285: info : libvirt version: 3.0.0
2017-01-17 15:52:45.120+0000: 2285: info : hostname: sL1pKn07
2017-01-17 15:52:45.120+0000: 2285: error : virCgroupMakeGroup:1092 : Failed to create controller cpu for group: No such file or directory
2017-01-17 15:52:57.844+0000: 2204: warning : qemuDomainObjTaint:4011 : Domain id=2 name='windoze' uuid=167cfa49-c88f-46df-a6bf-3127d5bf4d38 is tainted: custom-argv
2017-01-17 15:52:57.844+0000: 2204: warning : qemuDomainObjTaint:4011 : Domain id=2 name='windoze' uuid=167cfa49-c88f-46df-a6bf-3127d5bf4d38 is tainted: host-cpu

┌─┤[$]|[sl1pkn07]|[sL1pKn07]|[~]|
└───╼  LC_ALL=C virsh -c qemu:///system start windoze
error: Failed to start domain windoze
error: An error occurred, but the cause is unknown

Comment 8 sL1pKn07 2017-01-17 15:55:17 UTC
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/opt/cuda/bin:/usr/lib/jvm/default/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/usr/ps2/bin:/usr/ps2/sdk/bin HOME=/root USER=root LOGNAME=root QEMU_AUDIO_DRV=none /usr/bin/qemu-system-x86_64 -name guest=windoze,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-2-windoze/master-key.aes -machine pc-i440fx-2.8,accel=kvm,usb=off,dump-guest-core=off -cpu host,hv_time,hv_vendor_id=SomeString,kvm=off -bios /usr/share/edk2.git/ovmf-x64/OVMF_CODE-pure-efi.fd -m 12288 -realtime mlock=off -smp 6,sockets=1,cores=6,threads=1 -uuid 167cfa49-c88f-46df-a6bf-3127d5bf4d38 -display none -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-2-windoze/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime -no-shutdown -boot menu=off,strict=on -device ahci,id=sata0,bus=pci.0,addr=0x4 -drive file=/dev/disk/by-id/ata-KINGSTON_SV300S37A240G_50026B775B0399A7,format=raw,if=none,id=drive-virtio-disk0 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=21,id=hostnet0,vhost=on,vhostfd=23 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:e4:b0:b4,bus=pci.0,addr=0x9 -device vfio-pci,host=07:00.0,id=hostdev0,bus=pci.0,addr=0x7 -device vfio-pci,host=07:00.1,id=hostdev1,bus=pci.0,addr=0x3 -device vfio-pci,host=14:00.0,id=hostdev2,bus=pci.0,addr=0x5 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -machine max-ram-below-4g=1G -msg timestamp=on
Domain id=2 is tainted: custom-argv
Domain id=2 is tainted: host-cpu
libvirt:  error : libvirtd quit during handshake: Input/output error
2017-01-17 15:52:57.912+0000: shutting down, reason=failed

Comment 9 Erik Skultety 2017-01-17 16:12:50 UTC
As per comment 6, we really need proper debug logs. What you posted in comments 7 and 8 doesn't help much, as we need @log_level set to 1; see http://wiki.libvirt.org/page/DebugLogs again. As Michal asked, please also attach the machine log from /var/log/libvirt/qemu/<vm>.log. Ideally, create attachments for the logs instead of pasting them into the text box.

Comment 10 sL1pKn07 2017-01-17 16:25:00 UTC
Created attachment 1241879 [details]
libvirt_client.log

Comment 11 sL1pKn07 2017-01-17 16:26:12 UTC
Created attachment 1241881 [details]
windoze.log

Comment 12 sL1pKn07 2017-01-17 16:27:43 UTC
Created attachment 1241883 [details]
libvirtd.log

Comment 13 Michal Privoznik 2017-01-18 11:18:41 UTC
I think this is the result of a relabelling bug. Can you please try this branch and see if it works for you?

https://github.com/zippy2/libvirt/tree/qemu_ns_fixes

Comment 14 sL1pKn07 2017-01-18 15:01:00 UTC
With that branch/repo:

└───╼  LC_ALL=C virsh -c qemu:///system start windoze
error: Failed to start domain windoze
error: internal error: child reported: unable to set user and group to '1000:78' on '/dev/disk/by-id/ata-KINGSTON_SV300S37A240G_50026B775B0399A7': No such file or directory
┌─┤[$]|[sl1pkn07]|[sL1pKn07]|[~]|
└───╼  ls /dev/disk/by-id/ata-KINGSTON_SV300S37A240G_50026B775B0399A7
lrwxrwxrwx 1 root root 9 ene 13 19:42 /dev/disk/by-id/ata-KINGSTON_SV300S37A240G_50026B775B0399A7 -> ../../sdi
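The output above shows that the path is a symlink that exists on the host, yet the child process reports "No such file or directory". One plausible reading, offered here only as an assumption, is that the link or its target is missing from the /dev that the child sees. Since chown() follows symlinks, it fails with ENOENT in either case. The hypothetical helper below, which is not part of libvirt, tells the two cases apart by comparing lstat() (which inspects the link itself) with stat() (which follows it).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical classification of a path, for diagnostic purposes only. */
typedef enum {
    PATH_MISSING,        /* nothing exists at this path */
    PATH_DANGLING_LINK,  /* a symlink exists but its target does not */
    PATH_OK,             /* the path resolves to an existing file */
    PATH_ERROR           /* some other error, e.g. permission denied */
} PathState;

PathState classify_path(const char *path)
{
    struct stat sb;

    /* lstat() does not follow symlinks: it reports on the link itself. */
    if (lstat(path, &sb) < 0)
        return errno == ENOENT ? PATH_MISSING : PATH_ERROR;

    /* stat() follows symlinks: ENOENT here means the target is absent. */
    if (stat(path, &sb) < 0)
        return errno == ENOENT ? PATH_DANGLING_LINK : PATH_ERROR;

    return PATH_OK;
}
```

Running such a check inside the same mount namespace as the failing child would show whether the by-id link itself or the device node it points to is absent there.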

Comment 15 Michal Privoznik 2017-01-19 17:33:35 UTC
Okay, I've updated my branch again:

https://github.com/zippy2/libvirt/tree/qemu_ns_fixes

Can you please give it a try?

Comment 16 sL1pKn07 2017-01-20 00:18:58 UTC
It works without problems.

The logs are clean, except for:

2017-01-20 00:05:20.728+0000: 12738: error : x86FeatureInData:780 : internal error: unknown CPU feature __kvm_hv_vendor_id

2017-01-20 00:05:42.166+0000: 12740: error : virProcessRunInMountNamespace:1145 : internal error: child reported: Kernel does not provide mount namespace: No such file or directory

These look like normal messages, right? (I had already noticed the first one in older versions; this is the first time I have seen the second.)

Greetings

Comment 17 Michal Privoznik 2017-01-20 13:19:56 UTC
Patches posted to the list:

https://www.redhat.com/archives/libvir-list/2017-January/msg00894.html

Comment 18 sL1pKn07 2017-02-11 14:33:10 UTC
Hi,

I'm completely lost.

Have the patches landed upstream?

The qemu_ns_fixes branch of https://github.com/zippy2/libvirt is gone. Is qemu_ns_fixes_next the new branch?

Greetings

Comment 19 Erik Skultety 2017-02-13 07:30:32 UTC
Yes, the series mentioned in comment 17 has been merged into the master branch and the issue should be fixed properly now.