Bug 512367 - Application using libvirt crashes when having concurrent TLS connections (gnutls problem)
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: libvirt
Version: 5.4
Hardware: i386 Linux
Priority: low
Severity: urgent
Target Milestone: rc
Target Release: ---
Assigned To: Daniel Berrange
QA Contact: Virtualization Bugs
Reported: 2009-07-17 10:42 EDT by Daniel Berrange
Modified: 2014-06-23 23:33 EDT
Fixed In Version: libvirt-0.6.3-26.el5
Doc Type: Bug Fix
Clone Of: 512350
Last Closed: 2010-03-30 04:10:16 EDT


Attachments
Backport of the upstream patch to the 0.6.3 RHEL-5 tree (2.47 KB, patch)
  2009-12-22 08:56 EST, Daniel Veillard
Setting up a Certificate Authority (3.54 KB, text/plain)
  2010-01-07 05:23 EST, Alex Jia
Issuing server certificates (3.86 KB, text/plain)
  2010-01-07 05:24 EST, Alex Jia
Issuing client certificates (4.28 KB, text/plain)
  2010-01-07 05:25 EST, Alex Jia
cacert.pem (709 bytes, text/plain)
  2010-01-17 06:59 EST, Gunannan Ren
servercert.pem (830 bytes, text/plain)
  2010-01-17 07:00 EST, Gunannan Ren
clientcert.pem (895 bytes, text/plain)
  2010-01-17 07:01 EST, Gunannan Ren
The scripts verifying the bug (1.12 KB, text/x-python)
  2010-01-20 07:36 EST, Gunannan Ren

Description Daniel Berrange 2009-07-17 10:42:09 EDT
+++ This bug was initially created as a clone of Bug #512350 +++

Description of problem:
I'm using libvirt's python wrapper and get into lots of trouble when having concurrent TLS connections (to xen://localhost/). I assume that is the bug in gnutls which is described on the Internet:

http://groups.google.com/group/google-gadgets-for-linux-user/browse_thread/thread/2c718e4e56be7a49/d16868d743ac18b4?lnk=raot
http://bugzilla.gnome.org/show_bug.cgi?id=172813

Version-Release number of selected component (if applicable):
libvirt-0.6.4 (from Debian lenny stable repo)

How reproducible:
Always, makes libvirt unusable for me

Steps to Reproduce:
Create two concurrent connections to xen://localhost/, e.g. using multithreading (and yes, global locks did *not* help; it really is a gnutls problem), and use the connections somehow. In my test case, thread A was getting domain information and thread B was shutting down a domain.
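
For illustration only, the reproduction steps above could be sketched roughly as follows. This is not the reporter's script: run_concurrently is a hypothetical helper, and the connection opener is passed in as a parameter so the sketch runs without libvirt installed. With the libvirt Python binding one would presumably pass libvirt.open; the domain name in the commented usage is a placeholder.

```python
import threading


def run_concurrently(open_conn, uri, actions):
    """Run each action in its own thread, each on its own connection.

    open_conn: callable taking a URI and returning a connection object
               (assumption: libvirt.open from the libvirt Python binding).
    actions:   list of callables, each taking a connection.
    Returns the results in the order the actions were given; an exception
    raised inside a thread is stored in place of its result.
    """
    results = [None] * len(actions)

    def worker(index, action):
        try:
            conn = open_conn(uri)          # one connection per thread
            results[index] = action(conn)
        except Exception as exc:           # keep the error for the caller
            results[index] = exc

    threads = [
        threading.Thread(target=worker, args=(i, action))
        for i, action in enumerate(actions)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# Hypothetical usage with the libvirt binding ("some-domain" is a placeholder):
#   import libvirt
#   run_concurrently(libvirt.open, "xen://localhost/", [
#       lambda c: c.listDomainsID(),                       # thread A: query
#       lambda c: c.lookupByName("some-domain").shutdown() # thread B: shutdown
#   ])
```

Note that the crash described in this report is an abort() inside gcrypt, so on an affected build the Python process would die rather than return an exception in the result list.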
  
Actual results:
The following errors occur and the python process is aborted immediately:
  python: ath.c:193: _gcry_ath_mutex_lock: Assertion `*lock == ((ath_mutex_t) 0)' failed
  Aborted

Expected results:
Simultaneous TLS connections should not crash the application.

Additional info:
According to the links given above (especially http://lists.gnupg.org/pipermail/gcrypt-devel/2006-January/000911.html), it might help to call the correct gcrypt initialization functions (for threading) when starting libvirtd. In libvirt-0.6.4, gcrypt functions do not appear at all.
*Quickly* porting to OpenSSL as a workaround is not an option, I guess ;-)

I will now try to use a singleton connection object as a workaround but please could somebody solve this problem?
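
A minimal sketch of the singleton workaround mentioned above, assuming a single shared connection guarded by a lock. SharedConnection is a hypothetical name and the opener is again a parameter (assumed to be libvirt.open in real use). The report does not confirm whether this actually avoids the gcrypt assertion; it only ensures that a single connection is ever opened.

```python
import threading


class SharedConnection:
    """Lazily open one connection and share it between all threads."""

    def __init__(self, open_conn, uri):
        self._open_conn = open_conn    # assumption: e.g. libvirt.open
        self._uri = uri
        self._conn = None
        self._lock = threading.Lock()

    def call(self, action):
        """Run action(conn) while holding the lock, so calls never overlap."""
        with self._lock:
            if self._conn is None:
                # Opened at most once, on first use.
                self._conn = self._open_conn(self._uri)
            return action(self._conn)
```

Every thread would then go through shared.call(...) instead of opening its own connection.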
Comment 1 Daniel Berrange 2009-07-17 10:46:04 EDT
Actually, changing my mind about this: it is not a regression. The server side only ever uses GnuTLS from a single thread, so it cannot be affected. The client side was always broken.
Comment 3 Daniel Berrange 2009-12-16 13:50:14 EST
A proof of concept has been posted upstream:

http://www.redhat.com/archives/libvir-list/2009-December/msg00486.html
Comment 4 Daniel Veillard 2009-12-22 08:56:35 EST
Created attachment 379825 [details]
Backport of the upstream patch to the 0.6.3 RHEL-5 tree

This is nearly the upstream patch with just some context changed and
the extra configure bits.

Daniel
Comment 5 Daniel Veillard 2009-12-22 10:25:58 EST
libvirt-0.6.3-26.el5 has been built into dist-5E-qu-candidate with the fix

Daniel
Comment 7 Alex Jia 2010-01-07 05:20:04 EST
The TLS connection fails; an error message is raised for both the KVM and Xen hypervisors:
[root@dhcp-66-70-173 ~]# virsh -c qemu+tls://10.66.70.62/system
error: unable to connect to '10.66.70.62': Invalid argument
error: failed to connect to the hypervisor

But the SSH connection works:
[root@dhcp-66-70-173 ~]# virsh -c qemu+ssh://10.66.70.62/system list --all
root@10.66.70.62's password:
 Id Name                 State
----------------------------------
  - rhel5u5              shut off

I am not sure whether something is missing between client and server. The certificates seem to be OK.


Steps to Reproduce:
server --> 10.66.70.62
client --> 10.66.70.173
1. Set up a Certificate Authority (on server) and
  -- scp cacert.pem 10.66.70.173:/etc/pki/CA/
  -- scp cakey.pem 10.66.70.173:/etc/pki/CA/
2. Issue server certificates and
  -- mkdir -p /etc/pki/libvirt/private/
  -- cp serverkey.pem /etc/pki/libvirt/private/
  -- cp servercert.pem /etc/pki/libvirt/
3. Issue client certificates and
  -- mkdir -p /etc/pki/libvirt/private/
  -- cp clientkey.pem /etc/pki/libvirt/private/
  -- cp clientcert.pem /etc/pki/libvirt/
4. Turn on libvirtd listening on the server
  -- uncomment LIBVIRTD_ARGS="--listen"
  -- enable listen_tls = 1 in libvirtd.conf (/etc/libvirt/libvirtd.conf)
  -- service libvirtd restart
  -- service iptables stop
5. Connect remotely from the client to libvirtd on the server
  -- virsh -c qemu+tls://10.66.70.62/system

Version-Release number of selected component (if applicable):
[root@dhcp-66-70-173 clientkey]# uname -a
Linux dhcp-66-70-173.nay.redhat.com 2.6.18-183.el5 #1 SMP Mon Dec 21 18:37:42 EST 2009 x86_64 x86_64 x86_64 GNU/Linux
[root@dhcp-66-70-173 clientkey]# lsmod|grep kvm
kvm_intel              86664  0
kvm                   223648  2 ksm,kvm_intel
[root@dhcp-66-70-173 clientkey]# rpm -qa|grep libvirt
libvirt-debuginfo-0.6.3-29.el5
libvirt-python-0.6.3-29.el5
libvirt-0.6.3-29.el5
[root@dhcp-66-70-173 clientkey]# rpm -qa|grep kvm
kvm-tools-83-140.el5
kvm-qemu-img-83-140.el5
kmod-kvm-83-140.el5
etherboot-zroms-kvm-5.4.4-13.el5
kvm-83-140.el5
Comment 8 Alex Jia 2010-01-07 05:23:16 EST
Created attachment 382191 [details]
Setting up a Certificate Authority
Comment 9 Alex Jia 2010-01-07 05:24:14 EST
Created attachment 382192 [details]
Issuing server certificates
Comment 10 Alex Jia 2010-01-07 05:25:24 EST
Created attachment 382193 [details]
Issuing client certificates
Comment 11 Daniel Berrange 2010-01-12 12:27:27 EST
Please check that port 16514 is open on the firewall for the server running libvirtd, e.g.

  telnet 10.66.70.62  16514

and see if that works.
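
The same reachability check can be scripted instead of using telnet. A minimal sketch, where port_is_open is a hypothetical helper; it only tests that a TCP connection can be established and says nothing about whether the TLS handshake or certificates are valid.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Example call for the server in this report:
#   port_is_open("10.66.70.62", 16514)
```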
Comment 12 Gunannan Ren 2010-01-15 07:56:20 EST
I redid the test; "telnet 10.66.70.62 16541" works and shows that the connection is established.
On the client, the errors are still reported like this:
error: unable to connect to '10.66.70.62': Invalid argument
error: failed to connect to the hypervisor

On the libvirtd server side, with the "--verbose" option added to the /usr/sbin/libvirtd command line, the following is shown while the client is connecting:

07:34:46.331: error : gnutls_record_recv : A TLS packet with unexpected length was received
Comment 13 Daniel Berrange 2010-01-15 09:10:43 EST
It looks like your x509 server certificate is not correct. In the attachment in comment 9, I see a subject line of:

Subject: O=Red Hat Emerging Technologies,CN=oirase


The 'CN=oirase' bit is supposed to be the hostname of your server. 'oirase' is the example hostname from the libvirt documentation; you need to replace it with your own hostname when creating certificates. This must match the hostname used in the libvirt URI *exactly*, so since your URI is

 qemu+tls://10.66.70.62/system 

The server certificate should end up showing

  CN=10.66.70.62
Comment 14 Gunannan Ren 2010-01-15 09:19:33 EST
Yes, I know what you mean; I did it as you said.
It still reports the errors shown in comment 12.
Comment 15 Gunannan Ren 2010-01-15 09:24:14 EST
Sorry, I should have redone the test before adding the above comment; I will do it again.
Comment 16 Gunannan Ren 2010-01-17 06:57:53 EST
I tried again; the libvirtd server still reports an error while the client is connecting:

19:53:52.569: error : gnutls_record_recv: A TLS packet with unexpected length was received.

Here is the relevant information:

on libvirtd server(10.66.70.62):

#ps -ef|grep libvirt
nobody    3257     1  0 16:35 ?        00:00:00 /usr/sbin/dnsmasq --strict-order --bind-interfaces --pid-file=/var/run/libvirt/network/default.pid --conf-file=  --listen-address 192.168.122.1 --except-interface lo --dhcp-range 192.168.122.2,192.168.122.254 --dhcp-lease-max=253
root      5869  4797 14 19:53 pts/5    00:00:00 /usr/sbin/libvirtd --listen

on the client(10.66.70.64):

# virsh -c qemu+tls://10.66.70.62/system
error: unable to connect to '10.66.70.62': Invalid argument
error: failed to connect to the hypervisor
Comment 17 Gunannan Ren 2010-01-17 06:59:39 EST
Created attachment 384904 [details]
cacert.pem
Comment 18 Gunannan Ren 2010-01-17 07:00:38 EST
Created attachment 384905 [details]
servercert.pem
Comment 19 Gunannan Ren 2010-01-17 07:01:14 EST
Created attachment 384906 [details]
clientcert.pem
Comment 20 Daniel Berrange 2010-01-18 05:46:56 EST
You still have mis-matched hostnames. You are connecting based on IP address

# virsh -c qemu+tls://10.66.70.62/system

But the servercert.pem contains

        Subject: O=RedHat test,CN=dhcp-66-70-62.nay.redhat.com


You need to use *EXACTLY* the same hostname for both; you can't mix & match hostnames with IP addresses, or vice versa.

So you need to try connecting with

  #virsh -c qemu+tls://dhcp-66-70-62.nay.redhat.com/system
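
The mismatch described above can be checked before connecting. The following is a simplified sketch: the helper names are hypothetical, the subject parsing assumes a plain 'K=V,K=V' string like the ones quoted in this report (no escaped commas), and the real verification is of course performed by GnuTLS, not by this code.

```python
from urllib.parse import urlsplit


def uri_host(uri):
    """Return the host part of a libvirt-style URI, or None if absent."""
    return urlsplit(uri).hostname


def subject_cn(subject):
    """Extract the CN value from a simple 'K=V,K=V' subject string."""
    for part in subject.split(","):
        key, _, value = part.strip().partition("=")
        if key.upper() == "CN":
            return value
    return None


def host_matches_cert(uri, subject):
    """True only if the URI host equals the certificate CN (ignoring case)."""
    host, cn = uri_host(uri), subject_cn(subject)
    return host is not None and cn is not None and host.lower() == cn.lower()


# With the values from this report:
#   subject = "O=RedHat test,CN=dhcp-66-70-62.nay.redhat.com"
#   host_matches_cert("qemu+tls://10.66.70.62/system", subject)
#       is False (IP address in the URI, hostname in the certificate)
#   host_matches_cert("qemu+tls://dhcp-66-70-62.nay.redhat.com/system", subject)
#       is True
```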
Comment 21 Gunannan Ren 2010-01-19 06:26:13 EST
Yes, using the hostname is right; the TLS connection now succeeds.
Next, I will try to create concurrent connections using threading to reproduce the bug.
Comment 22 Gunannan Ren 2010-01-20 07:33:16 EST
The bug has been fixed in libvirt-0.6.3-30.el5.

I reproduced the bug on libvirt-0.6.3-24.el5:
# python multhread.py rhel5u5_x86_64_kvm
Thread:No 1
Thread:No 2
python: ath.c:193: _gcry_ath_mutex_lock: Assertion `*lock == ((ath_mutex_t) 0)' failed.
Aborted

On libvirt-0.6.3-30.el5, there is no problem.
Comment 23 Gunannan Ren 2010-01-20 07:36:29 EST
Created attachment 385667 [details]
The scripts verifying the bug
Comment 25 Johnny Liu 2010-02-02 05:43:20 EST
Verified: this bug passes with libvirt-0.6.3-31.el5 on RHEL-5.5-Server-x86_64-xen.
Comment 27 errata-xmlrpc 2010-03-30 04:10:16 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0205.html
