Description of problem:
This error is printed all the time in vdsm.log. I'm not sure what the actual impact is, but I didn't see this error until is9.

BindingXMLRPC::ERROR::2013-08-08 03:59:12,639::BindingXMLRPC::72::vds::(threaded_start) xml-rpc handler exception
Traceback (most recent call last):
  File "/usr/share/vdsm/BindingXMLRPC.py", line 68, in threaded_start
    self.server.handle_request()
  File "/usr/lib64/python2.6/SocketServer.py", line 268, in handle_request
    self._handle_request_noblock()
  File "/usr/lib64/python2.6/SocketServer.py", line 278, in _handle_request_noblock
    request, client_address = self.get_request()
  File "/usr/lib64/python2.6/SocketServer.py", line 446, in get_request
    return self.socket.accept()
  File "/usr/lib64/python2.6/site-packages/vdsm/SecureXMLRPCServer.py", line 117, in accept
    client, address = self.connection.accept()
  File "/usr/lib64/python2.6/site-packages/M2Crypto/SSL/Connection.py", line 167, in accept
    ssl.accept_ssl()
  File "/usr/lib64/python2.6/site-packages/M2Crypto/SSL/Connection.py", line 156, in accept_ssl
    return m2.ssl_accept(self.ssl, self._timeout)
SSLError: sslv3 alert certificate unknown

Version-Release number of selected component (if applicable):
vdsm-4.12.0-37.gitfd6a1b7.el6ev.x86_64

How reproducible:
100%

Logs can be found here:
http://jenkins.qa.lab.tlv.redhat.com:8080/view/Storage/view/3.3/job/3.3-storage_export_import_desktop-iscsi-sdk/28/artifact/logs/jenkins-3.3-storage_export_import_desktop-iscsi-sdk-28__08082013_03-59-33.tar.bz2
This message means that the digital certificate used by the host isn't valid from the point of view of the engine. This may happen if you are using a host that has digital certificates from an old installation of the engine, with a different CA. This is the counterpart message in the engine log:

2013-08-08 03:59:30,799 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-93) Command GetCapabilitiesVDS execution failed. Exception: VDSNetworkException: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
Meital, is this a configuration issue on your side or a bug?
I got the exact same error on a fresh rhev-h install.

Additional problem: I can't get this host to work fully in the rhev-m web management; the status always stays "non responsive". This is an evaluation environment. The rhev-h instance registers fine with the management from the TUI. I'm also able to reboot the server via ipmilan, but it always stays "non responsive". What I haven't tried is reinstalling rhev-h.

The hypervisor version is: RHEV Hypervisor 6.4-20130709.0.el6_4 (downloaded from RHN, checksum verified).

Please let me know if you need any additional info, thanks!
Sven, can you perform the following verifications?

1. Copy the VDSM certificate of the RHEV-H host to the RHEV-M machine. On the host, this certificate should be in the file /etc/pki/vdsm/certs/vdsmcert.pem.

2. Once you have the VDSM certificate on the engine machine, verify that it has been signed by the certificate authority of the engine:

# openssl verify -CAfile /etc/pki/ovirt-engine/ca.pem vdsmcert.pem
vdsmcert.pem: OK

As in the example above, the result should be "OK"; if you get anything else, there is a problem.

3. Check that the CA certificate used by both RHEV-H and RHEV-M is the same. In RHEV-H it is in /etc/pki/vdsm/certs/cacert.pem; in RHEV-M it is in /etc/pki/ovirt-engine/ca.pem.

4. From the RHEV-M machine, verify that you can establish an SSL connection to the VDSM running on the RHEV-H machine:

# openssl s_client \
  -connect the_ip_of_the_rhev_h:54321 \
  -cert /etc/pki/ovirt-engine/certs/engine.cer \
  -key /etc/pki/ovirt-engine/keys/engine_id_rsa \
  -CAfile /etc/pki/ovirt-engine/ca.pem

This is the same thing that RHEV-M does when connecting to RHEV-H. The output of this command can be very useful to determine what is failing.

Please also include the version of RHEV-M that you are using (rpm -q rhevm).
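For step 3, comparing the two CA files byte for byte can be misleading, because a PEM file may carry human-readable text before the "-----BEGIN CERTIFICATE-----" line. A minimal sketch of a fingerprint comparison that ignores such text follows. The function names cert_fingerprint and same_certificate are hypothetical helpers written for this illustration; they are not part of VDSM or the engine, and the sketch only looks at the first certificate in each file.

```python
import base64
import hashlib
import re

# Matches the base64 body of a PEM certificate block. Any text outside
# the BEGIN/END markers (for example a textual dump of the certificate
# fields) is ignored.
_PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----(.*?)-----END CERTIFICATE-----",
    re.DOTALL,
)


def cert_fingerprint(pem_text):
    """Return the SHA-256 hex digest of the first certificate in pem_text.

    Hypothetical helper for illustration only.
    """
    match = _PEM_BLOCK.search(pem_text)
    if match is None:
        raise ValueError("no PEM certificate block found")
    # Drop line breaks and spaces, then decode the base64 body to DER bytes.
    der = base64.b64decode("".join(match.group(1).split()))
    return hashlib.sha256(der).hexdigest()


def same_certificate(pem_a, pem_b):
    """Return True if both PEM texts contain the same first certificate."""
    return cert_fingerprint(pem_a) == cert_fingerprint(pem_b)


# Example use, assuming cacert.pem was copied from the host to the
# current directory on the engine machine:
#
#   with open("/etc/pki/ovirt-engine/ca.pem") as engine_ca:
#       with open("cacert.pem") as host_ca:
#           print(same_certificate(engine_ca.read(), host_ca.read()))
```

If the result is False, the host trusts a different CA than the one the engine uses, which would be consistent with the "certificate unknown" alert in vdsm.log.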
Hi, thanks for your help. I get the following output:

1. & 2.:

openssl verify -CAfile /etc/pki/ovirt-engine/ca.pem vdsmcert.pem
vdsmcert.pem: O = VDSM Certificate, CN = localhost.localdomain
error 20 at 0 depth lookup:unable to get local issuer certificate

3. They are not completely the same. The ca.pem on the rhev-m host has additional info at the beginning, which looks like this:

Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 1 (0x1)
    Signature Algorithm: sha1WithRSAEncryption
        Issuer: C=US, O=localdomain, CN=CA-rhevtest.localdomain.16822
        Validity
            Not Before: Aug 21 13:19:02 2013
            Not After : Aug 20 13:19:02 2023 GMT
        Subject: C=US, O=localdomain, CN=CA-rhevtest.localdomain.16822
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
...
...
-----BEGIN CERTIFICATE-----

After "-----BEGIN CERTIFICATE-----" they are exactly the same, but the cacert.pem from the hypervisor is somehow missing this meta-information.

4.

[root@rhevtest ~]# openssl s_client -connect $rhev-h-ip:54321 -cert /etc/pki/ovirt-engine/certs/engine.cer -key /etc/pki/ovirt-engine/keys/engine_id_rsa -CAfile /etc/pki/ovirt-engine/ca.pem
CONNECTED(00000003)
depth=1 CN = VDSM Certificate Authority
verify error:num=19:self signed certificate in certificate chain
verify return:0
140044209932104:error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca:s3_pkt.c:1197:SSL alert number 48
140044209932104:error:140790E5:SSL routines:SSL23_WRITE:ssl handshake failure:s23_lib.c:184:
---
Certificate chain
 0 s:/O=VDSM Certificate/CN=localhost.localdomain
   i:/CN=VDSM Certificate Authority
 1 s:/CN=VDSM Certificate Authority
   i:/CN=VDSM Certificate Authority
---
Server certificate
-----BEGIN CERTIFICATE-----
$I_OMMITED_THIS_PART
-----END CERTIFICATE-----
subject=/O=VDSM Certificate/CN=localhost.localdomain
issuer=/CN=VDSM Certificate Authority
---
No client certificate CA names sent
---
SSL handshake has read 1737 bytes and written 2722 bytes
---
New, TLSv1/SSLv3, Cipher is AES256-SHA
Server public key is 2048 bit
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
SSL-Session:
    Protocol  : TLSv1
    Cipher    : AES256-SHA
    Session-ID:
    Session-ID-ctx:
    Master-Key: $MASTERKEY
    Key-Arg   : None
    Krb5 Principal: None
    PSK identity: None
    PSK identity hint: None
    Start Time: 1377611360
    Timeout   : 300 (sec)
    Verify return code: 19 (self signed certificate in certificate chain)
---

Can you point me to the documentation that explains how to set up the certificates correctly?

Output of rpm -q rhevm is:
rhevm-3.2.2-0.41.el6ev.noarch

Thanks for your help so far. It seems something is wrong with my certificate setup, but I don't know why.
Sorry, there was some wrong info regarding point 3: the certificates are _not_ the same, as I just found out, not even after "-----BEGIN CERTIFICATE-----".
This means that the installation of your RHEV-H didn't finish correctly. We don't have a procedure to perform the setup manually. I would suggest completely removing and reinstalling this RHEV-H and trying the installation again. If it fails again, we can study the logs to see why it happens.
Reinstallation of RHEV-H did solve this problem for me. Thanks again for your help in debugging this problem!
Sven, you are welcome. Meital, can you run these same verifications in your environment?
It stopped happening on the same host that it used to happen on. I think that we should close this bug.
This usually happens when the same host is added to 2 different rhevm servers. I have seen it far too many times. So, if this is not happening any more, I'll simply CLOSE NOTABUG.
Go for it.