Description of problem:

Ran virsh net-create on a given XML file defining a network, libvirt crashed. There may or may not be an issue with my network in the XML file (probably around the SRV entries) but I would not have expected that to result in the daemon dying...

Output from virsh net-create:

    # virsh net-create /tmp/demo.redhat.com-network.xml
    error: Failed to create network from /tmp/demo.redhat.com-network.xml
    error: End of file while reading data: Input/output error

Additional Info:

    <network>
      <name>demo-redhat</name>
      <bridge name='virbr9' />
      <forward mode='nat' />
      <domain name='demo.redhat.com' />
      <dns>
        <srv service='_kerberos' protocol='_tcp' domain='pony.demo.redhat.com' target='pony.demo.redhat.com' port='88'/>
        <srv service='_kerberos' protocol='_udp' domain='pony.demo.redhat.com' target='pony.demo.redhat.com' port='88'/>
        <srv service='_ldap' protocol='_tcp' domain='pony.demo.redhat.com' target='pony.demo.redhat.com' port='389'/>
        <host ip='192.168.200.1'><hostname>pony.demo.redhat.com</hostname></host>
      </dns>
      <ip address='192.168.200.254' netmask='255.255.255.0'>
        <dhcp>
          <range start='192.168.200.1' end='192.168.200.1' />
        </dhcp>
      </ip>
    </network>
As an aside, it's unclear to me how the SRV records should really be written. The examples on libvirt.org set protocol="tcp" rather than "_tcp", but that results in a nonsensical srv-host entry being sent to dnsmasq:

    --srv-host=ldap.tcp.rhevm31.demo.redhat.com,rhevm31.demo.redhat.com,389,...

Instead of:

    --srv-host=_ldap._tcp.pony.demo.redhat.com,rhevm31.demo.redhat.com,389,...

Resulting, I think, in this failure:

    error: Failed to create network from /tmp/demo.redhat.com-network.xml
    error: internal error Child process (/sbin/dnsmasq --strict-order --bind-interfaces --domain demo.redhat.com --pid-file=/var/run/libvirt/network/demo.redhat.pid --conf-file= --except-interface lo --srv-host=kerberos.tcp.pony.demo.redhat.com,pony.demo.redhat.com,88,1095468768,32544 --srv-host=kerberos.udp.pony.demo.redhat.com,pony.demo.redhat.com,88,1095468768,32544 --srv-host=ldap.tcp.pony.demo.redhat.com,pony.demo.redhat.com,389,1095468768,32544 --listen-address 192.168.200.254 --dhcp-range 192.168.200.1,192.168.200.1 --dhcp-leasefile=/var/lib/libvirt/dnsmasq/demo.redhat.leases --dhcp-lease-max=1 --dhcp-no-override --expand-hosts --addn-hosts=/var/lib/libvirt/dnsmasq/demo.redhat.addnhosts) status unexpected: exit status 1

So there are a few issues here:

1) Why does a failed net-create kill libvirt? This shouldn't happen.
2) What is the correct syntax expected when defining SRV entries in the network definition, and does it actually result in valid SRV entries?
libvirt-0.9.11.3-1.fc17.x86_64
I've reproduced this with the current git HEAD. It's 100% reproducible for me.
Created attachment 595096: backtrace
Fix: http://www.redhat.com/archives/libvir-list/2012-June/msg01295.html
Fix is now committed upstream:

    commit 96ebb4fe586512487f83b4696d20923315889796
    Author: Peter Krempa <pkrempa>
    Date:   Thu Jun 28 23:42:50 2012 +0200

        network_conf: Don't free uninitialized pointers while parsing DNS SRV

Stephen, this patch solves problem 1) of your report.

As for problem 2), the code checks for the values "tcp" and "udp" (without leading underscores) in the protocol attribute. With the patch mentioned above, you now get "error: Invalid protocol attribute value '_tcp'" instead of a daemon crash. When I remove the underscores I get another error:

    error: internal error Child process (/usr/sbin/dnsmasq --strict-order --bind-interfaces --domain demo.redhat.com --pid-file=/var/run/libvirt/network/demo-redhat.pid --conf-file= --except-interface lo --srv-host=_kerberos.tcp.pony.demo.redhat.com,pony.demo.redhat.com,88,-1686014515,32560 --srv-host=_kerberos.udp.pony.demo.redhat.com,pony.demo.redhat.com,88,-1686014515,32560 --srv-host=_ldap.tcp.pony.demo.redhat.com,pony.demo.redhat.com,389,-1686014515,32560 --listen-address 192.168.200.254 --dhcp-range 192.168.200.1,192.168.200.1 --dhcp-leasefile=/var/lib/libvirt/dnsmasq/demo-redhat.leases --dhcp-lease-max=1 --dhcp-no-override --expand-hosts --addn-hosts=/var/lib/libvirt/dnsmasq/demo-redhat.addnhosts) status unexpected: exit status 1

I'm not familiar with DNS SRV records in libvirt, so I'll leave this bug open for someone else to follow up on this problem.
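For anyone who wants to check a network XML file before handing it to virsh, the protocol check described in the comment above can be approximated with a minimal Python sketch. This is not libvirt's actual C code: the helper name check_srv_protocols is hypothetical, and the only things taken from this report are the accepted values ("tcp" and "udp") and the wording of the error message.

```python
import xml.etree.ElementTree as ET

# Values the comment above says libvirt accepts for the protocol attribute.
ALLOWED_PROTOCOLS = {"tcp", "udp"}


def check_srv_protocols(network_xml):
    """Return one error string per <srv> element with an unexpected protocol.

    Hypothetical helper for illustration only; it is not part of libvirt.
    """
    root = ET.fromstring(network_xml)
    errors = []
    for srv in root.findall("./dns/srv"):
        protocol = srv.get("protocol")
        if protocol not in ALLOWED_PROTOCOLS:
            errors.append("Invalid protocol attribute value '%s'" % protocol)
    return errors


# Cut-down version of the network definition from this report.
EXAMPLE = """
<network>
  <name>demo-redhat</name>
  <dns>
    <srv service='_kerberos' protocol='_tcp' domain='pony.demo.redhat.com' target='pony.demo.redhat.com' port='88'/>
    <srv service='ldap' protocol='tcp' domain='pony.demo.redhat.com' target='pony.demo.redhat.com' port='389'/>
  </dns>
</network>
"""

if __name__ == "__main__":
    for message in check_srv_protocols(EXAMPLE):
        print("error:", message)
```

Only the first <srv> element in the example is flagged, because its protocol carries a leading underscore.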
I am pretty sure the issue is that the tcp in the srv-host line(s) should be prefixed with an underscore (and the same for the udp entries), but as you can see, dnsmasq gives little specific information about the failure.
I've verified that the crash is fixed in the upstream git HEAD.
According to RFC 2782, the service and protocol fields should have leading underscores. In practice it is fine not to follow this rule; I will try to post a patch for it after digging into the code.

The format of the SRV RR should be <name>,<target>,<port>,<priority>,<weight>, so use the following XML for the <srv> element:

    <srv service='kerberos' protocol='tcp' domain='pony.demo.redhat.com' target='pony.demo.redhat.com' port='88' priority='10' weight='10'/>

instead of:

    <srv service='_kerberos' protocol='_tcp' domain='pony.demo.redhat.com' target='pony.demo.redhat.com' port='88'/>

In your case, the missing priority and weight attributes lead to random values being passed for <priority> and <weight>, as in:

    --srv-host=kerberos.tcp.pony.demo.redhat.com,pony.demo.redhat.com,88,1095468768,32544
Ok, but the docs say that only service name and protocol are mandatory arguments and seem to indicate the reason they are optional is because they are defined that way in that same RFC:

http://libvirt.org/formatnetwork.html#elementsAddress

To me, where these optional fields are not provided in the XML, no value for them should be sent to dnsmasq (random or otherwise). Certainly you don't have to set them when interacting with dnsmasq directly.
s/seem to indicate the reason they/seem to indicate the reason the others/
Yes, that's right. dnsmasq uses zero for these optional values when they are missing, so libvirt should align itself with dnsmasq. I will try to fix it.
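To make the behaviour agreed on above concrete, here is a rough Python sketch of how an srv-host argument could be assembled: service and protocol get the leading underscores RFC 2782 asks for, and optional fields that are absent are left out so that dnsmasq applies its own defaults. The helper name build_srv_host is hypothetical, the positional order (name, target, port, priority, weight) is taken from the format quoted earlier in this bug, and nothing here is meant to describe what the libvirt patches actually do.

```python
def build_srv_host(service, protocol, domain=None, target=None,
                   port=None, priority=None, weight=None):
    """Build a dnsmasq --srv-host argument from SRV record fields.

    Hypothetical helper for illustration only. The fields are positional,
    so output stops at the first missing optional field instead of
    inventing a value for it.
    """
    def underscore(label):
        # RFC 2782 names look like _service._proto.domain
        return label if label.startswith("_") else "_" + label

    name = underscore(service) + "." + underscore(protocol)
    if domain:
        name += "." + domain

    fields = [name]
    for value in (target, port, priority, weight):
        if value is None:
            break
        fields.append(str(value))
    return "--srv-host=" + ",".join(fields)


if __name__ == "__main__":
    # Same record as in the report, with no priority or weight set.
    print(build_srv_host("kerberos", "tcp",
                         domain="pony.demo.redhat.com",
                         target="pony.demo.redhat.com",
                         port=88))
```

With the record from the report this yields an argument that ends after the port, so no priority or weight (random or otherwise) reaches dnsmasq.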
Patches sent upstream: https://www.redhat.com/archives/libvir-list/2012-July/msg00234.html