+++ This bug was initially created as a clone of Bug #512350 +++

Description of problem:
I'm using libvirt's python wrapper and get into lots of trouble when having concurrent TLS connections (to xen://localhost/). I assume that is the bug in gnutls which is described on the Internet:
http://groups.google.com/group/google-gadgets-for-linux-user/browse_thread/thread/2c718e4e56be7a49/d16868d743ac18b4?lnk=raot
http://bugzilla.gnome.org/show_bug.cgi?id=172813

Version-Release number of selected component (if applicable):
libvirt-0.6.4 (from Debian lenny stable repo)

How reproducible:
Always, makes libvirt unusable for me

Steps to Reproduce:
Create two concurrent connections to xen://localhost/, e.g. using multithreading (and yes, global locks did *not* help, it's really a gnutls problem) and use the connection somehow. In my test case thread A was getting domain information and thread B was to shutdown a domain.

Actual results:
The following errors occur and the python process is aborted immediately:
python: ath.c:193: _gcry_ath_mutex_lock: Assertion `*lock == ((ath_mutex_t) 0)' failed
Aborted

Expected results:
Simultaneous TLS connections should not crash the application.

Additional info:
As to the links given above (especially http://lists.gnupg.org/pipermail/gcrypt-devel/2006-January/000911.html), it might help to call the correct gcrypt initialization functions (for threading) when starting libvirtd. In libvirt-0.6.4, gcrypt functions do not appear at all. *Quickly* porting to OpenSSL as a workaround is not an option, I guess ;-)
I will now try to use a singleton connection object as a workaround but please could somebody solve this problem?
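The singleton-connection workaround mentioned in the description could look roughly like the sketch below. It is only an illustration: the class name SharedConnection and the injected opener argument are hypothetical, and the report does not confirm that sharing one connection actually avoids the gcrypt assertion.

```python
import threading


class SharedConnection:
    """Lazily open one connection and hand the same object to every thread.

    `opener` is any callable taking a URI and returning a connection,
    for example libvirt.open (assumption: the caller passes it in).
    """

    def __init__(self, opener, uri):
        self._opener = opener
        self._uri = uri
        self._conn = None
        self._lock = threading.Lock()

    def get(self):
        # The lock guarantees the opener runs at most once, even when
        # several threads ask for the connection at the same moment.
        with self._lock:
            if self._conn is None:
                self._conn = self._opener(self._uri)
            return self._conn
```

With the real binding this would be used as SharedConnection(libvirt.open, "xen://localhost/").get(). The lock only protects creation of the connection; whether calls made on the shared object also need to be serialized is not covered here.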
Actually, I am changing my mind about this. It is not a regression. The server side only ever uses GNUTLS from one single thread, so it can't be impacted. The client side was always broken.
A proof of concept has been posted upstream:
http://www.redhat.com/archives/libvir-list/2009-December/msg00486.html
Created attachment 379825 [details]
Backport of the upstream patch to the 0.6.3 RHEL-5 tree

This is nearly the upstream patch with just some context changed and the extra configure bits.

Daniel
libvirt-0.6.3-26.el5 has been built into dist-5E-qu-candidate with the fix.

Daniel
Connecting over TLS fails; an error message is raised for both the kvm and xen hypervisors:

[root@dhcp-66-70-173 ~]# virsh -c qemu+tls://10.66.70.62/system
error: unable to connect to '10.66.70.62': Invalid argument
error: failed to connect to the hypervisor

but an SSH connection is OK:

[root@dhcp-66-70-173 ~]# virsh -c qemu+ssh://10.66.70.62/system list --all
root@10.66.70.62's password:
 Id Name                 State
----------------------------------
  - rhel5u5              shut off

I am not sure whether something is missing between client and server. It seems that the certificates are OK.

Steps to Reproduce:
server --> 10.66.70.62
client --> 10.66.70.173

1. Setting up a Certificate Authority (on server), and
   -- scp cacert.pem 10.66.70.173:/etc/pki/CA/
   -- scp cakey.pem 10.66.70.173:/etc/pki/CA/
2. Issuing server certificates, and
   -- mkdir -p /etc/pki/libvirt/private/
   -- cp serverkey.pem /etc/pki/libvirt/private/
   -- cp servercert.pem /etc/pki/libvirt/
3. Issuing client certificates, and
   -- mkdir -p /etc/pki/libvirt/private/
   -- cp clientkey.pem /etc/pki/libvirt/private/
   -- cp clientcert.pem /etc/pki/libvirt/
4. Turning on libvirtd listening on the server
   -- uncomment LIBVIRTD_ARGS="--listen"
   -- enable listen_tls = 1 in libvirtd.conf (/etc/libvirt/libvirtd.conf)
   -- service libvirtd restart
   -- service iptables stop
5. Remote connection from the client to the server libvirtd
   -- virsh -c qemu+tls://10.66.70.62/system

Version-Release number of selected component (if applicable):
[root@dhcp-66-70-173 clientkey]# uname -a
Linux dhcp-66-70-173.nay.redhat.com 2.6.18-183.el5 #1 SMP Mon Dec 21 18:37:42 EST 2009 x86_64 x86_64 x86_64 GNU/Linux
[root@dhcp-66-70-173 clientkey]# lsmod|grep kvm
kvm_intel              86664  0
kvm                   223648  2 ksm,kvm_intel
[root@dhcp-66-70-173 clientkey]# rpm -qa|grep libvirt
libvirt-debuginfo-0.6.3-29.el5
libvirt-python-0.6.3-29.el5
libvirt-0.6.3-29.el5
[root@dhcp-66-70-173 clientkey]# rpm -qa|grep kvm
kvm-tools-83-140.el5
kvm-qemu-img-83-140.el5
kmod-kvm-83-140.el5
etherboot-zroms-kvm-5.4.4-13.el5
kvm-83-140.el5
Created attachment 382191 [details] Setting up a Certificate Authority
Created attachment 382192 [details] Issuing server certificates
Created attachment 382193 [details] Issuing client certificates
Please check that port 16514 is open on the firewall for the server running libvirtd, e.g.

telnet 10.66.70.62 16514

and see if that works.
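For anyone without telnet at hand, the same reachability check can be sketched in Python. The helper name port_is_open and the 3 second default timeout are illustrative assumptions; a successful TCP connect only shows that the port is reachable, not that the TLS handshake will succeed.

```python
import socket


def port_is_open(host, port=16514, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host and performs the TCP connect;
        # the with-block closes the socket again straight away.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For the server in this report the call would be port_is_open("10.66.70.62").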
I redid the test; "telnet 10.66.70.62 16541" is working and shows that the connection is set up.

On the client, the errors are still reported like this:

error: unable to connect to '10.66.70.62': Invalid argument
error: failed to connect to the hypervisor

On the libvirtd server side, with the option "--verbose" added to the /usr/sbin/libvirtd command line, it shows the following while the client is connecting:

07:34:46.331: error : gnutls_record_recv : A TLS packet with unexpected length was received
It looks like your x509 server certificate is not correct. In the attachment in comment 9, I see a subject line of:

Subject: O=Red Hat Emerging Technologies,CN=oirase

The 'CN=oirase' bit is supposed to be using the hostname of your server. 'oirase' is the example hostname from the libvirt documentation - you need to replace this with your own hostname when creating certificates. This must match the hostname used in the libvirt URI *exactly*, so since your URI is

qemu+tls://10.66.70.62/system

the server certificate should end up showing CN=10.66.70.62
Yup, I know what you mean, and I did it as you said. It still reports the same errors as in comment 12.
sorry, I will do it again before adding the above comments
I tried again; it still reports errors on the libvirtd server side while the client is connecting:

19:53:52.569: error : gnutls_record_recv: A TLS packet with unexpected length was received.

Here is the relevant information.

On the libvirtd server (10.66.70.62):

# ps -ef|grep libvirt
nobody    3257     1  0 16:35 ?        00:00:00 /usr/sbin/dnsmasq --strict-order --bind-interfaces --pid-file=/var/run/libvirt/network/default.pid --conf-file= --listen-address 192.168.122.1 --except-interface lo --dhcp-range 192.168.122.2,192.168.122.254 --dhcp-lease-max=253
root      5869  4797 14 19:53 pts/5    00:00:00 /usr/sbin/libvirtd --listen

On the client (10.66.70.64):

# virsh -c qemu+tls://10.66.70.62/system
error: unable to connect to '10.66.70.62': Invalid argument
error: failed to connect to the hypervisor
Created attachment 384904 [details] cacert.pem
Created attachment 384905 [details] servercert.pem
Created attachment 384906 [details] clientcert.pem
You still have mismatched hostnames. You are connecting based on IP address

# virsh -c qemu+tls://10.66.70.62/system

But the servercert.pem contains

Subject: O=RedHat test,CN=dhcp-66-70-62.nay.redhat.com

You need to use *EXACTLY* the same hostname for both; you can't mix & match hostnames with IP addresses, or vice versa. So you need to try connecting with

# virsh -c qemu+tls://dhcp-66-70-62.nay.redhat.com/system
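The matching rule described here can be checked before connecting with a small sketch like the one below. It only mirrors the rule as stated in this comment and is not the verification code used by libvirt or GnuTLS. The helper names are hypothetical, the subject parsing is deliberately naive (it splits on commas and would mishandle values that contain commas), and comparing case-insensitively is an assumption.

```python
from urllib.parse import urlsplit


def uri_host(uri):
    """Return the host part of a libvirt-style URI such as qemu+tls://host/system."""
    return urlsplit(uri).hostname


def subject_cn(subject):
    """Pull the CN value out of a subject string like 'O=RedHat test,CN=host'."""
    for part in subject.split(","):
        key, _, value = part.strip().partition("=")
        if key.upper() == "CN":
            return value
    return None


def host_matches_cert(uri, subject):
    """True only if the URI host and the certificate CN are the same string."""
    host = uri_host(uri)
    cn = subject_cn(subject)
    # Assumption: the comparison ignores case, since host names are not case-sensitive.
    return host is not None and cn is not None and host.lower() == cn.lower()
```

With the values from this comment, the IP-based URI does not match the subject, while the URI using dhcp-66-70-62.nay.redhat.com does.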
Yep, using the hostname is right; the TLS connection is now successful. Next, I will try to create concurrent connections using threading to reproduce the bug.
The bug has been fixed in libvirt-0.6.3-30.el5.

I reproduced the bug on libvirt-0.6.3-24.el5:

# python multhread.py rhel5u5_x86_64_kvm
Thread:No 1
Thread:No 2
python: ath.c:193: _gcry_ath_mutex_lock: Assertion `*lock == ((ath_mutex_t) 0)' failed.
Aborted

On libvirt-0.6.3-30.el5, there is no problem.
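A reproducer along the lines described in this bug (one connection per thread, each thread using its own connection) might be structured as in the sketch below. This is not the attached multhread.py script, whose contents are not shown here; the function name run_concurrent and the injected opener are hypothetical.

```python
import threading


def run_concurrent(opener, uri, jobs):
    """Run each job in its own thread, each on its own connection.

    `opener` is a callable such as libvirt.open (assumption: passed in by
    the caller); each job is a callable that receives the connection.
    Returns the list of exceptions raised inside the threads.
    """
    errors = []

    def worker(job):
        try:
            conn = opener(uri)  # a separate connection for every thread
            job(conn)
        except Exception as exc:
            errors.append(exc)

    threads = [threading.Thread(target=worker, args=(job,)) for job in jobs]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return errors
```

With the real binding the call could be run_concurrent(libvirt.open, uri, [lambda c: c.lookupByName(name).info(), lambda c: c.getHostname()]), where uri and name are placeholders. On an affected build the assertion aborts the whole process, so the returned error list would never be reached.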
Created attachment 385667 [details] The scripts verifying the bug
Verified this bug PASS with libvirt-0.6.3-31.el5 on RHEL-5.5-Server-x86_64-xen
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0205.html