Bug 1071181 - Libvirtd crashed during a light loop of starting a domain with CTRL_IP_LEARNING=dhcp
Summary: Libvirtd crashed during a light loop of starting a domain with CTRL_IP_LEARNING=dhcp
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.0
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Laine Stump
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: 731059 744225
 
Reported: 2014-02-28 08:32 UTC by Hu Jianwei
Modified: 2019-04-08 16:20 UTC
CC List: 12 users

Fixed In Version: libvirt-1.1.1-29.el7
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-06-13 10:44:30 UTC
Target Upstream Version:
Embargoed:


Attachments
libvirtd_debug.log (64.45 KB, text/plain), attached 2014-02-28 08:32 UTC by Hu Jianwei


Links
IBM Linux Technology Center 105593 (last updated 2019-04-08 17:03:38 UTC)

Description Hu Jianwei 2014-02-28 08:32:49 UTC
Created attachment 868890 [details]
libvirtd_debug.log

Description of problem:
libvirtd crashed during a light loop of starting a domain with CTRL_IP_LEARNING=dhcp

Version-Release number of selected component (if applicable):
libvirt-1.1.1-25.el7.x86_64
qemu-kvm-rhev-1.5.3-50.el7.x86_64
kernel-3.10.0-97.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Set "dhcp" value to CTRL_IP_LEARNING
[root@intel-e5530-8-2 ~]# virsh dumpxml r7 | grep interface -A8
    <interface type='network'>
      <mac address='52:54:00:7f:44:cb'/>
      <source network='default'/>
      <model type='rtl8139'/>
      <filterref filter='clean-traffic'>
        <parameter name='CTRL_IP_LEARNING' value='dhcp'/>
      </filterref>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
...

2. Start the domain in a light loop:
[root@intel-e5530-8-2 ~]# i=1;while true; do echo -----------------$i---------------; virsh start r7;i=$((i + 1)); done

-----------------1---------------
error: Failed to start domain r7
error: An error occurred, but the cause is unknown

...
-----------------27---------------
error: Failed to start domain r7
error: An error occurred, but the cause is unknown

-----------------28---------------
error: Failed to start domain r7
error: End of file while reading data: Input/output error
error: One or more references were leaked after disconnect from the hypervisor
error: Failed to reconnect to the hypervisor

-----------------29---------------
error: failed to connect to the hypervisor
error: no valid connection
error: Failed to connect socket to '/var/run/libvirt/libvirt-sock': Connection refused

Actual results:
As shown above, the domain failed to start and libvirtd crashed.

[root@intel-e5530-8-2 ccpp-2014-02-28-14:06:44-4926]# service libvirtd status
Redirecting to /bin/systemctl status  libvirtd.service
libvirtd.service - Virtualization daemon
   Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled)
   Active: failed (Result: core-dump) since Fri 2014-02-28 14:06:45 CST; 3min 34s ago
  Process: 4926 ExecStart=/usr/sbin/libvirtd $LIBVIRTD_ARGS (code=dumped, signal=ABRT)
 Main PID: 4926 (code=dumped, signal=ABRT)

Feb 28 14:06:44 intel-e5530-8-2.englab.nay.redhat.com libvirtd[4926]: 7f67c9707000-7f67c9708000 rw-p 00000000 00:00 0
Feb 28 14:06:44 intel-e5530-8-2.englab.nay.redhat.com libvirtd[4926]: 7f67c9708000-7f67c975f000 r-xp 00000000 08:01 68017482                   /usr/sbin/libvirtd
Feb 28 14:06:44 intel-e5530-8-2.englab.nay.redhat.com libvirtd[4926]: 7f67c995f000-7f67c9961000 r--p 00057000 08:01 68017482                   /usr/sbin/libvirtd
Feb 28 14:06:44 intel-e5530-8-2.englab.nay.redhat.com libvirtd[4926]: 7f67c9961000-7f67c9965000 rw-p 00059000 08:01 68017482                   /usr/sbin/libvirtd
Feb 28 14:06:44 intel-e5530-8-2.englab.nay.redhat.com libvirtd[4926]: 7f67cadca000-7f67cae9e000 rw-p 00000000 00:00 0                          [heap]
Feb 28 14:06:44 intel-e5530-8-2.englab.nay.redhat.com libvirtd[4926]: 7fff082cc000-7fff082ed000 rw-p 00000000 00:00 0                          [stack]
Feb 28 14:06:44 intel-e5530-8-2.englab.nay.redhat.com libvirtd[4926]: 7fff0830b000-7fff0830d000 r-xp 00000000 00:00 0                          [vdso]
Feb 28 14:06:44 intel-e5530-8-2.englab.nay.redhat.com libvirtd[4926]: ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
Feb 28 14:06:45 intel-e5530-8-2.englab.nay.redhat.com systemd[1]: libvirtd.service: main process exited, code=dumped, status=6/ABRT
Feb 28 14:06:45 intel-e5530-8-2.englab.nay.redhat.com systemd[1]: Unit libvirtd.service entered failed state.


Expected results:
The following two problems should both be handled:
1. For "error: An error occurred, but the cause is unknown": a domain with CTRL_IP_LEARNING=dhcp should start successfully.
2. For the crash: libvirtd should keep running even after hitting the error above.

Additional info:
(gdb) bt
#0  0x00007f67c5dea989 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007f67c5dec098 in __GI_abort () at abort.c:90
#2  0x00007f67c5e2b177 in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7f67c5f33b48 "*** Error in `%s': %s: 0x%s ***\n")
    at ../sysdeps/unix/sysv/linux/libc_fatal.c:196
#3  0x00007f67c5e30f87 in malloc_printerr (action=<optimized out>, str=0x7f67c5f31238 "corrupted double-linked list", ptr=<optimized out>) at malloc.c:4972
#4  0x00007f67c5e325ba in _int_free (av=0x7f6790000020, p=0x7f6790002010, have_lock=0) at malloc.c:3960
#5  0x00007f67c8c62f8a in virFree (ptrptr=0x7f6790009720) at util/viralloc.c:566
#6  0x00007f67c8c8176b in virHashFree (table=0x7f6790009720) at util/virhash.c:264
#7  0x00007f67c8cf1a77 in virNWFilterHashTableFree (table=table@entry=0x7f6790009700) at conf/nwfilter_params.c:684
#8  0x00007f67b39feb5b in virNWFilterInstantiate (forceWithPendingReq=false, driver=0x633a34343a66373a, macaddr=0x7f67ac222fe4, teardownOld=true, 
    foundNewFilter=<optimized out>, useNewFilter=INSTANTIATE_ALWAYS, vars=0x7f67900104a0, linkdev=0x0, ifindex=747, ifname=0x7f6790007a90 "vnet0", 
    filter=0x7f67ac016cd0, nettype=VIR_DOMAIN_NET_TYPE_NETWORK, techdriver=<optimized out>, vmuuid=<optimized out>) at nwfilter/nwfilter_gentech_driver.c:780
#9  __virNWFilterInstantiateFilter (driver=driver@entry=0x7f67ac01e9f0, 
    vmuuid=vmuuid@entry=0x7f67ac27e1c8 "!\236'\261\366\071L\277\200\241\236\324[6W\362p\350'\254g\177", teardownOld=teardownOld@entry=true, 
    ifname=0x7f6790007a90 "vnet0", ifindex=747, linkdev=linkdev@entry=0x0, nettype=VIR_DOMAIN_NET_TYPE_NETWORK, macaddr=macaddr@entry=0x7f67ac222fe4, 
    filtername=0x7f67ac27e720 "clean-traffic", filterparams=0x7f67ac27e740, useNewFilter=useNewFilter@entry=INSTANTIATE_ALWAYS, 
    forceWithPendingReq=forceWithPendingReq@entry=false, foundNewFilter=foundNewFilter@entry=0x7f67b83bfac7) at nwfilter/nwfilter_gentech_driver.c:894
#10 0x00007f67b39ff131 in _virNWFilterInstantiateFilter (driver=0x7f67ac01e9f0, 
    vmuuid=0x7f67ac27e1c8 "!\236'\261\366\071L\277\200\241\236\324[6W\362p\350'\254g\177", net=0x7f67ac222fe0, teardownOld=teardownOld@entry=true, 
    useNewFilter=useNewFilter@entry=INSTANTIATE_ALWAYS, foundNewFilter=foundNewFilter@entry=0x7f67b83bfac7) at nwfilter/nwfilter_gentech_driver.c:949
#11 0x00007f67b39ff29b in virNWFilterInstantiateFilter (driver=<optimized out>, vmuuid=<optimized out>, net=<optimized out>)
    at nwfilter/nwfilter_gentech_driver.c:1020
#12 0x00007f67b2389c04 in qemuNetworkIfaceConnect (def=def@entry=0x7f67ac27e1c0, conn=conn@entry=0x7f678800a930, driver=driver@entry=0x7f67ac1b20f0, 
    net=net@entry=0x7f67ac222fe0, qemuCaps=qemuCaps@entry=0x7f67900034b0, tapfd=0x7f6790008a70, tapfdSize=tapfdSize@entry=0x7f67b83bfcb0)
    at qemu/qemu_command.c:400
#13 0x00007f67b23970de in qemuBuildInterfaceCommandLine (vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, bootindex=0, vlan=-1, qemuCaps=0x7f67900034b0, 
    net=0x7f67ac222fe0, def=0x7f67ac27e1c0, conn=0x7f678800a930, driver=0x7f67ac1b20f0, cmd=0x7f6790010390) at qemu/qemu_command.c:7386
#14 qemuBuildCommandLine (conn=conn@entry=0x7f678800a930, driver=driver@entry=0x7f67ac1b20f0, def=<optimized out>, monitor_chr=<optimized out>, 
    monitor_json=<optimized out>, qemuCaps=0x7f67900034b0, migrateFrom=migrateFrom@entry=0x0, migrateFd=migrateFd@entry=-1, snapshot=snapshot@entry=0x0, 
    vmop=vmop@entry=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, callbacks=0x7f67b2655b50 <buildCommandLineCallbacks>) at qemu/qemu_command.c:8473
#15 0x00007f67b23c0464 in qemuProcessStart (conn=conn@entry=0x7f678800a930, driver=driver@entry=0x7f67ac1b20f0, vm=vm@entry=0x7f67ac224830, 
    migrateFrom=migrateFrom@entry=0x0, stdin_fd=stdin_fd@entry=-1, stdin_path=stdin_path@entry=0x0, snapshot=snapshot@entry=0x0, 
    vmop=vmop@entry=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=flags@entry=1) at qemu/qemu_process.c:3828
#16 0x00007f67b240c1b3 in qemuDomainObjStart (conn=0x7f678800a930, driver=driver@entry=0x7f67ac1b20f0, vm=vm@entry=0x7f67ac224830, flags=flags@entry=0)
    at qemu/qemu_driver.c:6136
#17 0x00007f67b240c762 in qemuDomainCreateWithFlags (dom=0x7f6790004260, flags=0) at qemu/qemu_driver.c:6190
#18 0x00007f67c8d332a7 in virDomainCreate (domain=domain@entry=0x7f6790004260) at libvirt.c:9511
#19 0x00007f67c973d7b7 in remoteDispatchDomainCreate (server=<optimized out>, msg=<optimized out>, args=<optimized out>, rerr=0x7f67b83c0c80, 
    client=0x7f67cae6de50) at remote_dispatch.h:2888
#20 remoteDispatchDomainCreateHelper (server=<optimized out>, client=0x7f67cae6de50, msg=<optimized out>, rerr=0x7f67b83c0c80, args=<optimized out>, 
    ret=<optimized out>) at remote_dispatch.h:2866
#21 0x00007f67c8d8b09a in virNetServerProgramDispatchCall (msg=0x7f67cae6dd80, client=0x7f67cae6de50, server=0x7f67cae5ed00, prog=0x7f67cae6ab40)
    at rpc/virnetserverprogram.c:435
#22 virNetServerProgramDispatch (prog=0x7f67cae6ab40, server=server@entry=0x7f67cae5ed00, client=0x7f67cae6de50, msg=0x7f67cae6dd80)
    at rpc/virnetserverprogram.c:305
#23 0x00007f67c8d85c08 in virNetServerProcessMsg (msg=<optimized out>, prog=<optimized out>, client=<optimized out>, srv=0x7f67cae5ed00)
    at rpc/virnetserver.c:166
#24 virNetServerHandleJob (jobOpaque=<optimized out>, opaque=0x7f67cae5ed00) at rpc/virnetserver.c:187
#25 0x00007f67c8caa4d5 in virThreadPoolWorker (opaque=opaque@entry=0x7f67cadf1860) at util/virthreadpool.c:144
#26 0x00007f67c8ca9e6e in virThreadHelper (data=<optimized out>) at util/virthreadpthread.c:194
#27 0x00007f67c6584df3 in start_thread (arg=0x7f67b83c1700) at pthread_create.c:308
#28 0x00007f67c5eab39d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
(gdb)

Comment 1 Stefan Berger 2014-03-03 02:49:53 UTC
Do you see an error like the following in the libvirt log?

virNWFilterSnoopDHCPOpen:1116 : internal error: setup of pcap handle failed

I think this is the root cause of the VM not starting even the first time. This error does not occur with libvirt-1.1.1 on F18 (libpcap-1.3.0-2); RHEL 7 has libpcap-1.5.3-3.

Comment 2 Stefan Berger 2014-03-03 04:00:02 UTC
With libpcap-1.5.3 on FC18 I also need to supply at least a 128 KB buffer via libpcap's pcap_set_buffer_size(); otherwise there is an error related to mmapping the RX buffer ring when pcap_activate() is called.

I will submit a patch for this tomorrow (3/3).
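
For illustration only, here is a minimal C sketch (not the actual libvirt patch) of the pcap setup described above: create the handle, enlarge the capture buffer to 128 KB before activation, and report libpcap's own error string on failure. The helper name open_snoop_handle() and the error handling are invented for this example.

#include <pcap.h>
#include <stdio.h>

/* Hypothetical helper, not libvirt code: open a capture handle on ifname
 * with a buffer large enough for libpcap >= 1.5. */
static int open_snoop_handle(const char *ifname)
{
    char errbuf[PCAP_ERRBUF_SIZE];
    pcap_t *handle = pcap_create(ifname, errbuf);

    if (!handle) {
        fprintf(stderr, "pcap_create: %s\n", errbuf);
        return -1;
    }

    /* libpcap 1.5+ needs a larger capture buffer; 128 KB avoids the
     * mmap failure of the RX ring seen in pcap_activate(). */
    if (pcap_set_buffer_size(handle, 128 * 1024) < 0 ||
        pcap_activate(handle) < 0) {
        fprintf(stderr, "pcap setup failed: %s\n", pcap_geterr(handle));
        pcap_close(handle);
        return -1;
    }

    /* ... use the handle for DHCP snooping ... */
    pcap_close(handle);
    return 0;
}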

Comment 3 IBM Bug Proxy 2014-03-12 11:50:42 UTC
fyi .. update in LTC bug 73493 - RH731059- [7.0 FEAT] libvirt: Support DHCP Snooping and Dynamic ARP Inspection (DCN)
"
Patches to fix this issue have been pushed to upstream repo.

commit 49b59a151f60b0a178b023b727bac30f80bd6000
Author: Stefan Berger <stefanb.ibm.com>
Date:   Mon Mar 3 15:13:50 2014 -0500

nwfilter: Increase buffer size for libpcap

Libpcap 1.5 requires a larger buffer than previous pcap versions.
Adjust the size of the buffer to 128kb.

This patch should address symptoms in BZ 1071181 and BZ 731059

Signed-off-by: Stefan Berger <stefanb.ibm.com>

commit 64df4c75189b42799f82a8d8816c7c55598d2b6e
Author: Stefan Berger <stefanb.ibm.com>
Date:   Mon Mar 3 15:13:47 2014 -0500

nwfilter: Display the pcap error message

Display the pcap error message in the log.

Signed-off-by: Stefan Berger <stefanb.ibm.com>
"
...

Comment 4 Laine Stump 2014-03-18 03:05:00 UTC
Do the patches above specifically fix the crash reported here? Or do they just improve functionality of DHCP Snooping (Bug 731059)?

Do we need to move this BZ to 7.0 and set blocker?

Comment 5 IBM Bug Proxy 2014-03-18 14:41:10 UTC
(In reply to comment #7)
> Do the patches above specifically fix the crash reported here? Or do they
> just improve functionality of DHCP Snooping (Bug 731059)?
>
> Do we need to move this BZ to 7.0 and set blocker?
.
Comment from Stefan Berger 2014-03-18 10:24:44 EDT

With these patches applied the crash did not occur anymore.

Comment 6 Stefan Berger 2014-03-19 18:25:35 UTC
This commit fixes the actual cause of the libvirtd crash, which was in the error path. The above-mentioned patches (comment 3) masked the crash by fixing the problem that triggered that error path.

commit 963dcf905c5ee0358d6b0b74b124ff340cbbbd2b
Author: Stefan Berger <stefanb.ibm.com>
Date:   Wed Mar 19 13:38:44 2014 -0400

    nwfilter: Fix double free of pointer

    https://bugzilla.redhat.com/show_bug.cgi?id=1071181

    Commit 49b59a15 fixed one problem but masks another one related to pointer
    freeing.

    Avoid putting of the virNWFilterSnoopReq once the thread has been started.
    It belongs to the thread and the thread will call virNWFilterSnoopReqPut() on it.

    Signed-off-by: Stefan Berger <stefanb.ibm.com>
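
To illustrate the ownership rule the commit describes, here is a small, self-contained C sketch, assuming made-up names (snoop_req, snoop_req_put(), start_snoop()) rather than the real virNWFilterSnoopReq API: once the worker thread has been started it owns the caller's reference, so an extra put from the caller is exactly the kind of double free fixed above.

#include <pthread.h>
#include <stdlib.h>

struct snoop_req {                 /* hypothetical stand-in for virNWFilterSnoopReq */
    int refcnt;
    pthread_mutex_t lock;
};

static void snoop_req_put(struct snoop_req *req)
{
    pthread_mutex_lock(&req->lock);
    int last = (--req->refcnt == 0);
    pthread_mutex_unlock(&req->lock);
    if (last) {
        pthread_mutex_destroy(&req->lock);
        free(req);
    }
}

static void *snoop_thread(void *opaque)
{
    struct snoop_req *req = opaque;
    /* ... DHCP snooping work would happen here ... */
    snoop_req_put(req);            /* the thread drops the reference it owns */
    return NULL;
}

static int start_snoop(struct snoop_req *req)
{
    pthread_t tid;

    /* The caller's reference is handed over to the thread. */
    if (pthread_create(&tid, NULL, snoop_thread, req) != 0) {
        snoop_req_put(req);        /* thread never started: caller must clean up */
        return -1;
    }
    pthread_detach(tid);

    /* Do NOT put the request again here on success; that extra put is
     * the double free the commit above removes. */
    return 0;
}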

Comment 11 Jincheng Miao 2014-03-26 07:37:06 UTC
With the latest libvirt-1.1.1-29.el7, the crash is gone:
# rpm -q libvirt
libvirt-1.1.1-29.el7.x86_64

# virsh dumpxml r6 | grep -A10 '<interface'
    <interface type='network'>
      <mac address='52:54:00:21:0d:9b'/>
      <source network='default'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <filterref filter='clean-traffic'>
        <parameter name='CTRL_IP_LEARNING' value='dhcp'/>
      </filterref>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>

# i=1;while true; do echo -----------------$i---------------; virsh start r6;i=$((i + 1)); done
-----------------1---------------
Domain r6 started

-----------------2---------------
error: Domain is already active

-----------------3---------------
error: Domain is already active

-----------------4---------------
error: Domain is already active

.
.
.

-----------------725---------------
error: Domain is already active

-----------------726---------------
error: Domain is already active

-----------------727---------------
error: Domain is already active

No crash happened, so I am changing the status to VERIFIED.

Comment 12 Jincheng Miao 2014-04-01 06:31:03 UTC
Change the status to VERIFIED

Comment 13 Stefan Berger 2014-04-04 23:54:31 UTC
VERIFIED also here now that libvirt-1.1.1-29.el7.x86_64 is available.

Comment 14 Ludek Smid 2014-06-13 10:44:30 UTC
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.

Comment 16 IBM Bug Proxy 2019-04-08 16:20:40 UTC
------- Comment From hannsj_uhl.com 2019-04-08 10:20 EDT-------

