Bug 874549

Summary: libvirt_lxc segfaults when starting lxc through openstack
Product: Red Hat Enterprise Linux 6 Reporter: unicell <unicell>
Component: libvirt Assignee: Libvirt Maintainers <libvirt-maint>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 6.3 CC: acathrow, ajia, dyasny, dyuan, eblake, jdenemar, mjenner, mzhan, rwu
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: libvirt-0.10.2-7.el6 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-02-21 07:26:18 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description unicell 2012-11-08 12:26:08 UTC
Description of problem:

Launching an LXC container through OpenStack on CentOS causes libvirt_lxc to segfault.

Version-Release number of selected component (if applicable):

libvirt-python-0.9.10-21.el6_3.5.x86_64

How reproducible:

/usr/libexec/libvirt_lxc --name instance-0000006f --console 23 --handshake 26 --background --veth veth1

<domain type='lxc'>
  <name>instance-00000069</name>
  <uuid>5abb4ca2-9e9b-4b33-b489-b09d301b1e8f</uuid>
  <memory unit='KiB'>524288</memory>
  <currentMemory unit='KiB'>524288</currentMemory>
  <vcpu placement='static'>2</vcpu>
  <os>
    <type arch='x86_64'>exe</type>
    <init>/sbin/init</init>
    <cmdline>console=ttyS0</cmdline>
  </os>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/libexec/libvirt_lxc</emulator>
    <filesystem type='mount' accessmode='passthrough'>
      <source dir='/home/stack/nova_state/instances/instance-00000069/rootfs'/>
      <target dir='/'/>
    </filesystem>
    <interface type='bridge'>
      <mac address='fa:16:3e:24:b3:65'/>
      <source bridge='br100'/>
      <filterref filter='nova-instance-instance-00000069-fa163e24b365'>
        <parameter name='DHCPSERVER' value='10.48.253.1'/>
        <parameter name='IP' value='10.48.253.2'/>
        <parameter name='PROJMASK' value='255.255.255.0'/>
        <parameter name='PROJNET' value='10.48.253.0'/>
      </filterref>
    </interface>
    <console type='pty'>
      <target type='lxc' port='0'/>
    </console>
  </devices>
</domain>

Additional info:


I have already verified that the following upstream patch fixes it:
http://libvirt.org/git/?p=libvirt.git;a=commit;h=57349ffc10290eed2cb25ca7cfb4b34ab5003156

Relevant info on Novell bugzilla:
https://bugzilla.novell.com/show_bug.cgi?id=767448

Comment 4 Alex Jia 2012-11-15 06:02:19 UTC
I can still reproduce this issue on libvirt-0.10.2-8.el6.x86_64:

# /usr/libexec/libvirt_lxc --name instance-0000006f --console 23 --handshake 26 --background --veth veth1
Segmentation fault (core dumped)

==4406== Invalid read of size 8
==4406==    at 0x411755: main (lxc_controller.c:1596)
==4406==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==4406==
==4406==
==4406== Process terminating with default action of signal 11 (SIGSEGV)
==4406==  Access not within mapped region at address 0x0
==4406==    at 0x411755: main (lxc_controller.c:1596)

Comment 8 Eric Blake 2012-11-15 12:27:35 UTC
(In reply to comment #4)
> I still can reproduce this issue on libvirt-0.10.2-8.el6.x86_64:
> 
> # /usr/libexec/libvirt_lxc --name instance-0000006f --console 23 --handshake
> 26 --background --veth veth1
> Segmentation fault (core dumped)
> 
> ==4406== Invalid read of size 8
> ==4406==    at 0x411755: main (lxc_controller.c:1596)
> ==4406==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
> ==4406==
> ==4406==
> ==4406== Process terminating with default action of signal 11 (SIGSEGV)
> ==4406==  Access not within mapped region at address 0x0
> ==4406==    at 0x411755: main (lxc_controller.c:1596)

That sounds like a completely different problem.  It may be the same symptoms of a crash on an invalid read, but line 1596 of lxc_controller in 0.10.2-8.el6 corresponds to:

    VIR_DEBUG("Security model %s type %s label %s imagelabel %s",
              NULLSTR(ctrl->def->seclabels[0]->model),
              virDomainSeclabelTypeToString(ctrl->def->seclabels[0]->type),
              NULLSTR(ctrl->def->seclabels[0]->label),
              NULLSTR(ctrl->def->seclabels[0]->imagelabel));

while the original crash report was inside random_r().
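A read of address 0x0 on that line would be consistent with `ctrl->def->seclabels[0]` being dereferenced when the domain has no security labels; the XML in the description contains no `<seclabel>` element. This is an assumption, since the thread does not state the cause. The following is a minimal sketch of the kind of guard that would avoid such a crash. The structs, `first_seclabel()`, `format_seclabel()` and the local `NULLSTR` macro are simplified, hypothetical stand-ins, not libvirt's actual definitions or its actual fix.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins for the libvirt structures. */
struct seclabel_def {
    const char *model;
    const char *label;
    const char *imagelabel;
};

struct domain_def {
    size_t nseclabels;               /* number of entries in seclabels */
    struct seclabel_def **seclabels; /* may be NULL when there are none */
};

/* Local stand-in for a "print NULL strings safely" macro. */
#define NULLSTR(s) ((s) ? (s) : "(null)")

/* Return the first security label, or NULL if the domain has none. */
static const struct seclabel_def *first_seclabel(const struct domain_def *def)
{
    if (def == NULL || def->seclabels == NULL || def->nseclabels == 0)
        return NULL;
    return def->seclabels[0];
}

/* Format the debug message without dereferencing a missing label.
 * Returns the number of characters snprintf() would have written. */
static int format_seclabel(const struct domain_def *def, char *buf, size_t buflen)
{
    const struct seclabel_def *sec = first_seclabel(def);

    if (sec == NULL)
        return snprintf(buf, buflen, "Security model (none)");

    return snprintf(buf, buflen, "Security model %s label %s imagelabel %s",
                    NULLSTR(sec->model),
                    NULLSTR(sec->label),
                    NULLSTR(sec->imagelabel));
}
```

With an empty label array, `format_seclabel()` writes a placeholder instead of reading through a NULL pointer; `NULLSTR` alone cannot help in that case because it only protects the string fields, not the `seclabels[0]` pointer itself.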

Comment 9 Alex Jia 2012-11-16 02:50:43 UTC
(In reply to comment #8)
> That sounds like a completely different problem.  It may be the same
> symptoms of a crash on an invalid read, but line 1596 of lxc_controller in
> 0.10.2-8.el6 corresponds to:
> 
>     VIR_DEBUG("Security model %s type %s label %s imagelabel %s",
>               NULLSTR(ctrl->def->seclabels[0]->model),
>               virDomainSeclabelTypeToString(ctrl->def->seclabels[0]->type),
>               NULLSTR(ctrl->def->seclabels[0]->label),
>               NULLSTR(ctrl->def->seclabels[0]->imagelabel));
> 
> while the original crash report was inside random_r().

Eric, yes, it is as you said; I have just double-checked it.

1596     VIR_DEBUG("Security model %s type %s label %s imagelabel %s",
1597               NULLSTR(ctrl->def->seclabels[0]->model),
1598               virDomainSeclabelTypeToString(ctrl->def->seclabels[0]->type),
1599               NULLSTR(ctrl->def->seclabels[0]->label),
1600               NULLSTR(ctrl->def->seclabels[0]->imagelabel));

Do I need to file a new bug to track this? Thanks.

Alex

Comment 10 Alex Jia 2012-11-26 06:52:26 UTC
Eric, do we plan to fix the new issue under this bug? The new problem will probably block us from continuing to verify this bug.

I can reproduce this issue on libvirt-python-0.9.10-21.el6_3.5.x86_64:

==16397== Invalid read of size 4
==16397==    at 0x33F68369C8: random_r (random_r.c:375)
==16397==    by 0x43CD70: virRandomBits (virrandom.c:81)
==16397==    by 0x433C53: virHashCreateFull (virhash.c:134)
==16397==    by 0x469C9C: virNWFilterHashTableCreate (nwfilter_params.c:669)
==16397==    by 0x46A467: virNWFilterParseParamAttributes (nwfilter_params.c:776)
==16397==    by 0x44E7B4: virDomainNetDefParseXML (domain_conf.c:4473)
==16397==    by 0x45EB93: virDomainDefParseXML (domain_conf.c:8419)
==16397==    by 0x4611E1: virDomainDefParseNode (domain_conf.c:9120)
==16397==    by 0x461ACA: virDomainDefParse (domain_conf.c:9070)
==16397==    by 0x40D9E4: main (lxc_controller.c:1782)
==16397==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==16397== 
==16397== 
==16397== Process terminating with default action of signal 11 (SIGSEGV)
==16397==  Access not within mapped region at address 0x0
==16397==    at 0x33F68369C8: random_r (random_r.c:375)
==16397==    by 0x43CD70: virRandomBits (virrandom.c:81)
==16397==    by 0x433C53: virHashCreateFull (virhash.c:134)
==16397==    by 0x469C9C: virNWFilterHashTableCreate (nwfilter_params.c:669)
==16397==    by 0x46A467: virNWFilterParseParamAttributes (nwfilter_params.c:776)
==16397==    by 0x44E7B4: virDomainNetDefParseXML (domain_conf.c:4473)
==16397==    by 0x45EB93: virDomainDefParseXML (domain_conf.c:8419)
==16397==    by 0x4611E1: virDomainDefParseNode (domain_conf.c:9120)
==16397==    by 0x461ACA: virDomainDefParse (domain_conf.c:9070)
==16397==    by 0x40D9E4: main (lxc_controller.c:1782)
==16397==  If you believe this happened as a result of a stack
==16397==  overflow in your program's main thread (unlikely but
==16397==  possible), you can try to increase the size of the
==16397==  main thread stack using the --main-stacksize= flag.
==16397==  The main thread stack size used in this run was 10485760.

It works fine on libvirt-0.10.2-7.el6, so I am moving the bug to VERIFIED.

For the new issue (see Comment 8 or Comment 9), I will file a new bug to track it.
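The trace above ends in glibc's `random_r()` reading address 0x0, which is what one would expect if the generator state passed to it was never set up with `initstate_r()`. Assuming that is the cause here (the thread itself does not spell it out), the sketch below shows the required ordering. `rng_initialize()` and `rng_next()` are hypothetical wrappers for illustration, not libvirt's virRandom code.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* expose glibc's random_r()/initstate_r() */
#endif
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper state, not libvirt's actual implementation. */
static struct random_data rng_data;
static char rng_state[128];
static int rng_ready = 0;

/* Must be called once before rng_next(). Returns 0 on success, -1 on error. */
static int rng_initialize(unsigned int seed)
{
    /* glibc expects the random_data buffer to be zeroed before initstate_r(). */
    memset(&rng_data, 0, sizeof(rng_data));
    if (initstate_r(seed, rng_state, sizeof(rng_state), &rng_data) < 0)
        return -1;
    rng_ready = 1;
    return 0;
}

/* Fetch one pseudo-random value. Refuses to call random_r() on
 * uninitialized state instead of crashing. Returns 0 on success. */
static int rng_next(int32_t *out)
{
    if (!rng_ready || out == NULL)
        return -1;
    return random_r(&rng_data, out);
}
```

A standalone helper binary such as the LXC controller would need to perform this initialization itself early in `main()`, because it does not inherit any setup done by the daemon process.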

Comment 11 Alex Jia 2012-11-26 07:14:39 UTC
(In reply to comment #10)
> 
> For new question(see Comment 8 or Comment 9), I will file a new bug to track
> it.

For the record:

https://bugzilla.redhat.com/show_bug.cgi?id=880064

Comment 12 errata-xmlrpc 2013-02-21 07:26:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0276.html