Bug 2175582

Summary: virtqemud sometimes dumps core when trying to start a KVM guest with interface type=null/vds
Product: Red Hat Enterprise Linux 9
Reporter: yalzhang <yalzhang>
Component: libvirt
Assignee: Peter Krempa <pkrempa>
libvirt sub component: General
QA Contact: yalzhang <yalzhang>
Status: CLOSED ERRATA
Docs Contact:
Severity: unspecified
Priority: unspecified
CC: jdenemar, lmen, mkletzan, pkrempa, virt-maint
Version: 9.2
Keywords: Triaged
Target Milestone: rc
Flags: pm-rhel: mirror+
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: libvirt-9.2.0-1.el9
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-11-07 08:30:47 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version: 9.2.0
Embargoed:
Attachments:
Description: the coredump info
Flags: none

Description yalzhang@redhat.com 2023-03-06 00:33:17 UTC
Created attachment 1948239 [details]
the coredump info

Description of problem:
virtqemud sometimes dumps core when trying to start a VM with interface type='null'.

Version-Release number of selected component (if applicable):
libvirt-9.0.0-7.el9.x86_64
qemu-kvm-7.2.0-10.el9.x86_64

How reproducible:
60%

Steps to Reproduce:
1. Try to start a KVM VM with interface type='null':
# pidof virtqemud
15926
# virsh dumpxml rhel --xpath //interface
<domain type='kvm'>
  <name>rhel</name>
...
 <devices>
<interface type="null">
  <mac address="52:54:00:22:c9:42"/>
  <model type="rtl8139"/>
  <address type="pci" domain="0x0000" bus="0x10" slot="0x01" function="0x0"/>
</interface>
</devices>
</domain>

2. The VM fails to start and virtqemud dumps core:
# virsh start rhel
error: Disconnected from qemu:///system due to end of file
error: Failed to start domain 'rhel'
error: End of file while reading data: Input/output error

# pidof virtqemud
16019

# virsh start rhel
error: Failed to start domain 'rhel'
error: Requested operation is not valid: Setting different SELinux label on /var/lib/libvirt/swtpm/cef2bbc7-cf89-4ef0-9827-e7ed121762a7/tpm2/tpm2-00.permall which is already in use

# pidof virtqemud
16019

# coredumpctl list
TIME                          PID UID GID SIG     COREFILE EXE                    SIZE
Sun 2023-03-05 04:22:17 EST 15926   0   0 SIGSEGV present  /usr/sbin/virtqemud    1.0M

Actual results:
virtqemud crashes when starting a KVM VM with interface type='null'.

Expected results:
virtqemud should not crash; the VM should either start successfully or fail to start with a reasonable error message.

Additional info:

Comment 1 yalzhang@redhat.com 2023-03-06 07:49:36 UTC
virtqemud also dumps core when trying to start a VM with the vds interface type.
This is a negative scenario for KVM VMs: the VM should fail to start with a reasonable error message, but virtqemud should not crash.

# pidof virtqemud
11828

# virsh dumpxml test
<domain type='kvm'>
  <name>test</name>
...
 <devices>
...
<interface type='vds'>
      <mac address='52:54:00:22:c9:42'/>
      <source switchid='12345678-1234-1234-1234-123456789abc' portid='6' portgroupid='pg-4321' connectionid='12345'/>
      <model type='rtl8139'/>
      <address type='pci' domain='0x0000' bus='0x10' slot='0x01' function='0x0'/>
    </interface>
 </devices>
</domain>

# virsh start test 
error: Disconnected from qemu:///system due to end of file
error: Failed to start domain 'test'
error: End of file while reading data: Input/output error

# pidof virtqemud
11918

Comment 2 Peter Krempa 2023-03-06 08:00:13 UTC
#0  0x00007f38c54f75fc in virJSONValueObjectInsert (object=object@entry=0x0, 
    key=key@entry=0x7f38c079089d "id", value=value@entry=0x7f38c1ffaf80, 
    prepend=prepend@entry=false) at ../src/util/virjson.c:552
        pair = {key = 0x9 <error: Cannot access memory at address 0x9>, 
          value = 0x18}
        ret = <optimized out>
        __FUNCTION__ = "virJSONValueObjectInsert"
#1  0x00007f38c54f7de5 in virJSONValueObjectInsertString (object=0x0, 
    key=0x7f38c079089d "id", value=<optimized out>, prepend=<optimized out>)
    at ../src/util/virjson.c:599
        jvalue = 0x7f38a402d810
#2  0x00007f38c54f7ef4 in virJSONValueObjectAppendStringPrintf (object=0x0, 
    key=key@entry=0x7f38c079089d "id", fmt=fmt@entry=0x7f38c077cdca "host%s")
    at ../src/util/virjson.c:625
        ap = {{gp_offset = 32, fp_offset = 48, 
            overflow_arg_area = 0x7f38c1ffb0a0, 
            reg_save_area = 0x7f38c1ffafd0}}
        str = 0x7f38a402c700 "hostnet0"
#3  0x00007f38c06987d8 in qemuBuildHostNetProps (vm=<optimized out>, 
    net=0x7f38ac049650) at ../src/qemu/qemu_command.c:3996
        netType = <optimized out>
        i = <optimized out>
        netpriv = <optimized out>
        netprops = 0x0
        __FUNCTION__ = "qemuBuildHostNetProps"


We apparently try to call virJSONValueObjectAppendStringPrintf on an object which is NULL.

Comment 3 Peter Krempa 2023-03-06 08:06:34 UTC
This was caused by commit:

commit b6738ffc9f8be5a2a61236cd9bef7fd317982f01
Author: Peter Krempa <pkrempa>
Date:   Thu May 14 22:50:59 2020 +0200

    qemu: command: Generate -netdev command line via JSON->cmdline conversion
    
    The 'netdev_add' command was recently formally described in qemu via the
    QMP schema. This means that it also requires the arguments to be
    properly formatted. Our current approach is to generate the command line
    and then use qemuMonitorJSONKeywordStringToJSON to get the JSON
    properties for the monitor. This will not work if we need to pass some
    fields as numbers or booleans.
    
    In this step we re-do internals of qemuBuildHostNetStr to format a JSON
    object which is converted back via virQEMUBuildNetdevCommandlineFromJSON
    to the equivalent command line. This will later allow fixing of the
    monitor code to use the JSON object directly rather than rely on the
    conversion.

$ git desc b6738ffc9f8
v6.3.0-139-gb6738ffc9f

Comment 4 Peter Krempa 2023-03-06 08:10:50 UTC
Ah, so the commit I've mentioned above introduced code that made it easy to make a mistake when adding a new interface type, which is what actually caused the code to crash. Previously the code adding the alias could not be reached without the object being allocated.
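
The failure mode described in comments 2 and 4 can be sketched in isolation. This is only an illustration under stated assumptions: DemoNetType, DemoProps, demo_alloc_props and demo_append_id are hypothetical stand-ins, not libvirt types or functions. The point is the shape of the problem: some switch cases leave the object unallocated and do not return, and a shared tail then appends to it. In the backtrace the append helper dereferenced the NULL object; the sketch guards against that so it can be run safely.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; these are not libvirt types. */
typedef enum { DEMO_NET_USER, DEMO_NET_NULL, DEMO_NET_VDS } DemoNetType;

typedef struct {
    const char *type;
    char id[32];
} DemoProps;

/* Allocate the properties object only for types the driver handles.
 * Unhandled types leave the switch with 'props' still NULL and, as in
 * the problematic code path, without reporting an error. */
static DemoProps *demo_alloc_props(DemoNetType type)
{
    DemoProps *props = NULL;

    switch (type) {
    case DEMO_NET_USER:
        props = calloc(1, sizeof(*props));
        if (props)
            props->type = "user";
        break;
    case DEMO_NET_NULL:
    case DEMO_NET_VDS:
        break; /* no allocation, no error, no early return */
    }
    return props;
}

/* Shared tail that sets the "host<alias>" id. The NULL check is what
 * turns the crash into a detectable failure in this sketch. */
static bool demo_append_id(DemoProps *props, const char *alias)
{
    if (!props)
        return false;
    snprintf(props->id, sizeof(props->id), "host%s", alias);
    return true;
}
```

With DEMO_NET_USER the id becomes "hostnet0" for alias "net0", matching the 'str' value in frame #2; with DEMO_NET_NULL or DEMO_NET_VDS the append step receives NULL.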

Comment 5 Peter Krempa 2023-03-06 13:52:07 UTC
Fixed upstream:

commit f3a73384092f798261d99bf1ce099c8ea6cc91a9
Author: Peter Krempa <pkrempa>
Date:   Mon Mar 6 09:18:51 2023 +0100

    qemuBuildHostNetProps: Report proper errors for unhandled interface types
    
    VIR_DOMAIN_NET_TYPE_NULL and VIR_DOMAIN_NET_TYPE_VDS are not implemented
    for the qemu driver but the formatter code in 'qemuBuildHostNetProps'
    didn't report an error for them and didn't even return from the function
    when they were encountered.
    
    This caused a crash in 'virJSONValueObjectAppendStringPrintf' which
    does not tolerate NULL JSON object to append to when the unsupported
    devices were used.
    
    Properly report error when unhandled devices are encountered. This also
    includes the case for VIR_DOMAIN_NET_TYPE_HOSTDEV, but that code path
    should never be reached.
    
    Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2175582
    Fixes: bac6b266fb6a / 6457619d186
    Fixes: 0225483adce
    Signed-off-by: Peter Krempa <pkrempa>
    Reviewed-by: Ján Tomko <jtomko>

v9.1.0-84-gf3a7338409
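
The approach named in the commit message (report an error for unhandled interface types and return before the shared code runs) might look roughly like the sketch below. It is not the actual patch: FixNetType, FixProps, fix_net_type_name and fix_build_props are hypothetical names, and errors are written into a caller-supplied buffer purely for illustration. The message wording follows the virsh output quoted in this report.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; these are not libvirt types. */
typedef enum { FIX_NET_USER, FIX_NET_NULL, FIX_NET_VDS } FixNetType;

typedef struct {
    char id[32];
} FixProps;

static const char *fix_net_type_name(FixNetType type)
{
    switch (type) {
    case FIX_NET_USER: return "user";
    case FIX_NET_NULL: return "null";
    case FIX_NET_VDS:  return "vds";
    }
    return "unknown";
}

/* Returns a heap-allocated object, or NULL with 'err' filled in. */
static FixProps *fix_build_props(FixNetType type, const char *alias,
                                 char *err, size_t errlen)
{
    FixProps *props = NULL;

    switch (type) {
    case FIX_NET_USER:
        props = calloc(1, sizeof(*props));
        break;
    case FIX_NET_NULL:
    case FIX_NET_VDS:
        /* Report and return before the shared tail can touch 'props'. */
        snprintf(err, errlen,
                 "unsupported configuration: network device type '%s' "
                 "is not supported by this hypervisor",
                 fix_net_type_name(type));
        return NULL;
    }

    /* Covers allocation failure and out-of-range enum values. */
    if (!props) {
        snprintf(err, errlen, "failed to build network properties");
        return NULL;
    }

    snprintf(props->id, sizeof(props->id), "host%s", alias);
    return props;
}
```

Returning early keeps every path that reaches the final snprintf on an allocated object, so the shared tail needs no special casing per interface type.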

Comment 6 yalzhang@redhat.com 2023-04-04 16:00:40 UTC
The issue does not occur in the automated functional test job with libvirt-9.2.0-1.el9.x86_64.

Comment 7 yalzhang@redhat.com 2023-04-04 16:22:19 UTC
Sorry, please ignore comment 6.
Tested with libvirt-9.2.0-1.el9.x86_64; the result is as expected: virtqemud does not dump core and the error message makes sense.

1. Try to start a VM with interface type "null":
# virsh dumpxml avocado-vt-vm1 --xpath //interface 
<interface type="null">
  <mac address="52:54:00:22:c9:42"/>
  <model type="virtio"/>
  <address type="pci" domain="0x0000" bus="0x01" slot="0x00" function="0x0"/>
</interface>

# virsh start avocado-vt-vm1 
error: Failed to start domain 'avocado-vt-vm1'
error: unsupported configuration: network device type 'null' is not supported by this hypervisor

2. Try to start a VM with interface type "vds":
# virsh dumpxml avocado-vt-vm1 --xpath //interface 
<interface type="vds">
  <mac address="52:54:00:22:c9:42"/>
  <source switchid="12345678-1234-1234-1234-123456789abc" portid="6" portgroupid="pg-4321" connectionid="12345"/>
  <model type="rtl8139"/>
  <address type="pci" domain="0x0000" bus="0x10" slot="0x01" function="0x0"/>
</interface>

# virsh start avocado-vt-vm1 
error: Failed to start domain 'avocado-vt-vm1'
error: unsupported configuration: network device type 'vds' is not supported by this hypervisor

Comment 10 yalzhang@redhat.com 2023-05-19 02:07:13 UTC
Tested on libvirt-9.3.0-2.el9.x86_64 with the scenarios in comment 7; the results are the same, all as expected.

Comment 12 errata-xmlrpc 2023-11-07 08:30:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: libvirt security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6409