Bug 836326 - virsh net-create with SRV records kills libvirtd
Summary: virsh net-create with SRV records kills libvirtd
Status: CLOSED UPSTREAM
Alias: None
Product: Virtualization Tools
Classification: Community
Component: libvirt
Version: unspecified
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Laine Stump
Reported: 2012-06-28 17:29 UTC by Stephen Gordon
Modified: 2016-04-27 01:50 UTC (History)

Last Closed: 2016-03-23 22:20:54 UTC

Attachments
backtrace (3.21 KB, text/plain)
2012-06-28 17:42 UTC, Dave Allan

Description Stephen Gordon 2012-06-28 17:29:48 UTC
Description of problem:

I ran virsh net-create on an XML file defining a network, and libvirtd crashed. There may or may not be an issue with the network in my XML file (probably around the SRV entries), but I would not have expected that to result in the daemon dying.

Output from virsh net-create:

# virsh net-create /tmp/demo.redhat.com-network.xml 
error: Failed to create network from /tmp/demo.redhat.com-network.xml
error: End of file while reading data: Input/output error

Additional Info:

    <network>
    <name>demo-redhat</name>
    <bridge name='virbr9' />
    <forward mode='nat' />
    <domain name='demo.redhat.com' />
    <dns>
    <srv service='_kerberos' protocol='_tcp' domain='pony.demo.redhat.com' target='pony.demo.redhat.com' port='88'/>
    <srv service='_kerberos' protocol='_udp' domain='pony.demo.redhat.com' target='pony.demo.redhat.com' port='88'/>
    <srv service='_ldap' protocol='_tcp' domain='pony.demo.redhat.com' target='pony.demo.redhat.com' port='389'/>
    <host ip='192.168.200.1'><hostname>pony.demo.redhat.com</hostname></host>
    </dns>
    <ip address='192.168.200.254' netmask='255.255.255.0'>
    <dhcp>
    <range start='192.168.200.1' end='192.168.200.1' />
    </dhcp>
    </ip>
    </network>

Comment 2 Stephen Gordon 2012-06-28 17:35:46 UTC
As an aside, it's unclear to me how the SRV records should really be written. The examples on libvirt.org set protocol="tcp" rather than "_tcp", but that results in a nonsensical srv-host entry being sent to dnsmasq:

--srv-host=ldap.tcp.rhevm31.demo.redhat.com,rhevm31.demo.redhat.com,389,...

Instead of:

--srv-host=_ldap._tcp.pony.demo.redhat.com,rhevm31.demo.redhat.com,389,...

Resulting, I think, in this failure:

error: Failed to create network from /tmp/demo.redhat.com-network.xml
error: internal error Child process (/sbin/dnsmasq --strict-order --bind-interfaces --domain demo.redhat.com --pid-file=/var/run/libvirt/network/demo.redhat.pid --conf-file= --except-interface lo --srv-host=kerberos.tcp.pony.demo.redhat.com,pony.demo.redhat.com,88,1095468768,32544 --srv-host=kerberos.udp.pony.demo.redhat.com,pony.demo.redhat.com,88,1095468768,32544 --srv-host=ldap.tcp.pony.demo.redhat.com,pony.demo.redhat.com,389,1095468768,32544 --listen-address 192.168.200.254 --dhcp-range 192.168.200.1,192.168.200.1 --dhcp-leasefile=/var/lib/libvirt/dnsmasq/demo.redhat.leases --dhcp-lease-max=1 --dhcp-no-override --expand-hosts --addn-hosts=/var/lib/libvirt/dnsmasq/demo.redhat.addnhosts) status unexpected: exit status 1

So there are a few issues here:

1) Why does a failed net-create kill libvirt? This shouldn't happen.

2) What is the correct syntax expected when defining SRV entries in the network definition, and does it actually result in valid SRV entries?

Comment 3 Stephen Gordon 2012-06-28 17:39:15 UTC
libvirt-0.9.11.3-1.fc17.x86_64

Comment 4 Dave Allan 2012-06-28 17:41:19 UTC
I've reproduced this with the current git HEAD.  It's 100% reproducible for me.

Comment 5 Dave Allan 2012-06-28 17:42:08 UTC
Created attachment 595096
backtrace

Comment 7 Peter Krempa 2012-06-28 22:07:39 UTC
The fix is now committed upstream:

commit 96ebb4fe586512487f83b4696d20923315889796
Author: Peter Krempa <pkrempa>
Date:   Thu Jun 28 23:42:50 2012 +0200

    network_conf: Don't free uninitialized pointers while parsing DNS SRV

Stephen,

this patch solves problem 1) of your report. As for problem 2), the code checks the protocol attribute for the values "tcp" and "udp" (without leading underscores). With the patch applied, you now get "error: Invalid protocol attribute value '_tcp'" instead of a daemon crash. When I remove the underscores I get another error:

error: internal error Child process (/usr/sbin/dnsmasq --strict-order --bind-interfaces --domain demo.redhat.com --pid-file=/var/run/libvirt/network/demo-redhat.pid --conf-file= --except-interface lo --srv-host=_kerberos.tcp.pony.demo.redhat.com,pony.demo.redhat.com,88,-1686014515,32560 --srv-host=_kerberos.udp.pony.demo.redhat.com,pony.demo.redhat.com,88,-1686014515,32560 --srv-host=_ldap.tcp.pony.demo.redhat.com,pony.demo.redhat.com,389,-1686014515,32560 --listen-address 192.168.200.254 --dhcp-range 192.168.200.1,192.168.200.1 --dhcp-leasefile=/var/lib/libvirt/dnsmasq/demo-redhat.leases --dhcp-lease-max=1 --dhcp-no-override --expand-hosts --addn-hosts=/var/lib/libvirt/dnsmasq/demo-redhat.addnhosts) status unexpected: exit status 1

I'm not familiar with SRV DNS records in libvirt, so I'll leave this bug open for someone else to follow up on this problem.
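
For illustration only, the class of bug named in the commit subject above (freeing uninitialized pointers on a parse error path) can be sketched as follows. This is not the libvirt code: `struct srv_record`, `dup_string` and `parse_srv` are hypothetical names made up for this example. The point it tries to show is that every pointer the cleanup path frees starts out as NULL, so an early exit such as a rejected protocol value cannot hand garbage to free().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a parsed <srv> element; not libvirt's struct. */
struct srv_record {
    char *service;
    char *protocol;
};

/* Duplicate a string with malloc; returns NULL on allocation failure. */
static char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);

    if (copy)
        memcpy(copy, s, n);
    return copy;
}

/*
 * Fill rec from the given attribute values.
 * Returns 0 on success, -1 on error (rec is left untouched on error).
 */
static int parse_srv(const char *service, const char *protocol,
                     struct srv_record *rec)
{
    char *svc = NULL;   /* initialized, so free() is safe on any path */
    char *proto = NULL; /* initialized, so free() is safe on any path */
    int ret = -1;

    if (!service || !protocol || !rec)
        goto cleanup;

    if (!(svc = dup_string(service)))
        goto cleanup;

    /* Only "tcp" and "udp" are accepted; proto is still NULL here. */
    if (strcmp(protocol, "tcp") != 0 && strcmp(protocol, "udp") != 0)
        goto cleanup;

    if (!(proto = dup_string(protocol)))
        goto cleanup;

    rec->service = svc;
    rec->protocol = proto;
    svc = proto = NULL; /* ownership moved to rec */
    ret = 0;

cleanup:
    free(svc);   /* free(NULL) is a no-op */
    free(proto);
    return ret;
}
```

With this shape, `parse_srv("_kerberos", "_tcp", &rec)` returns -1 and releases only what was actually allocated.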

Comment 8 Stephen Gordon 2012-06-28 22:21:19 UTC
I am pretty sure the issue is that the tcp in the srv-host line(s) should be prefixed with an underscore (and the same for the udp entries), but as you can see, dnsmasq gives little specific information about the failure.
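
A minimal sketch of the name construction being suggested here, assuming the `_service._protocol.domain` form shown earlier in this report. The helper `srv_owner_name` is a hypothetical name for this example and is not taken from libvirt; it adds the leading underscore only when the caller has not already supplied one, so both `tcp` and `_tcp` produce the same result.

```c
#include <stdio.h>

/*
 * Write "_service._protocol.domain" into buf, adding a leading underscore
 * to service and protocol only when the caller did not supply one.
 * Returns 0 on success, -1 if an argument is missing or buf is too small.
 */
static int srv_owner_name(char *buf, size_t buflen, const char *service,
                          const char *protocol, const char *domain)
{
    int n;

    if (!buf || !service || !protocol || !domain)
        return -1;

    n = snprintf(buf, buflen, "%s%s.%s%s.%s",
                 service[0] == '_' ? "" : "_", service,
                 protocol[0] == '_' ? "" : "_", protocol,
                 domain);

    /* snprintf reports the length it needed; reject truncated output. */
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

Called with `"ldap"`, `"tcp"` and `"pony.demo.redhat.com"`, this yields `_ldap._tcp.pony.demo.redhat.com`.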

Comment 9 Dave Allan 2012-06-29 01:28:36 UTC
I've verified that the crash is fixed in the upstream git HEAD.

Comment 10 Gunannan Ren 2012-07-04 13:55:43 UTC
According to RFC 2782, the service and protocol fields should have leading underscores. In practice it is fine not to follow this rule; I will try to post a patch for it after digging into the code.

The format of the SRV RR should be <name>,<target>,<port>,<priority>,<weight>,
so use the following XML for the <srv> element:
<srv service='kerberos' protocol='tcp' domain='pony.demo.redhat.com' target='pony.demo.redhat.com' port='88' priority='10' weight='10'/>

instead of
<srv service='_kerberos' protocol='_tcp' domain='pony.demo.redhat.com' target='pony.demo.redhat.com' port='88'/>

Your XML omits priority and weight, which leads to random values being passed for <priority> and <weight>, like:
--srv-host=kerberos.tcp.pony.demo.redhat.com,pony.demo.redhat.com,88,1095468768,32544

Comment 11 Stephen Gordon 2012-07-04 15:10:45 UTC
Ok, but the docs say that only service name and protocol are mandatory arguments and seem to indicate the reason they are optional is because they are defined that way in that same RFC:

http://libvirt.org/formatnetwork.html#elementsAddress

In my view, where these optional fields are not provided in the XML, no value for them should be sent to dnsmasq (random or otherwise). Certainly you don't have to set them when interacting with dnsmasq directly.

Comment 12 Stephen Gordon 2012-07-04 15:14:36 UTC
s/seem to indicate the reason they/seem to indicate the reason the others/

Comment 13 Gunannan Ren 2012-07-05 05:32:14 UTC
Yes, that's right. dnsmasq uses zero for these optional values when they are missing, so libvirt should align itself with dnsmasq. I will try to fix it.
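
One way the behaviour discussed in the last few comments could look, as a sketch only. It is not the patch that was posted: `struct srv_args` and `format_srv_host` are hypothetical names, and the explicit `has_priority` / `has_weight` flags are an assumption about how "attribute not present in the XML" might be tracked. Unset trailing fields are left off the argument entirely, and a missing priority is written as 0 only when a weight follows it, since the fields are positional.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical description of one SRV entry; field names are illustrative. */
struct srv_args {
    const char *name;       /* e.g. "_ldap._tcp.pony.demo.redhat.com" */
    const char *target;
    unsigned int port;
    bool has_priority;      /* true only if the XML carried priority= */
    unsigned int priority;
    bool has_weight;        /* true only if the XML carried weight= */
    unsigned int weight;
};

/*
 * Build a --srv-host argument without ever reading an unset field.
 * Returns 0 on success, -1 if an argument is missing or buf is too small.
 */
static int format_srv_host(char *buf, size_t buflen, const struct srv_args *a)
{
    int n;

    if (!buf || !a || !a->name || !a->target)
        return -1;

    if (a->has_weight) {
        /* Weight is positional, so priority must be emitted as well. */
        n = snprintf(buf, buflen, "--srv-host=%s,%s,%u,%u,%u",
                     a->name, a->target, a->port,
                     a->has_priority ? a->priority : 0u, a->weight);
    } else if (a->has_priority) {
        n = snprintf(buf, buflen, "--srv-host=%s,%s,%u,%u",
                     a->name, a->target, a->port, a->priority);
    } else {
        n = snprintf(buf, buflen, "--srv-host=%s,%s,%u",
                     a->name, a->target, a->port);
    }

    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

For the Kerberos entry from the original XML, with neither optional attribute set, this produces an argument that ends at the port (`...,pony.demo.redhat.com,88`).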

Comment 14 Gunannan Ren 2012-07-08 10:54:17 UTC
Patches sent upstream:
https://www.redhat.com/archives/libvir-list/2012-July/msg00234.html

