Bug 1368556

Summary: Agent autoupgrade fails when 2-way authentication is used
Product: [JBoss] JBoss Operations Network
Reporter: Filip Brychta <fbrychta>
Component: Security, Agent
Assignee: Josejulio Martínez <jmartine>
Status: CLOSED ERRATA
QA Contact: Filip Brychta <fbrychta>
Severity: urgent
Priority: high
Version: JON 3.3.5
CC: jmartine, loleary, spinder
Target Milestone: ER01
Keywords: Triaged
Target Release: JON 3.3.8
Hardware: Unspecified
OS: Unspecified
Last Closed: 2017-02-16 18:45:16 UTC
Type: Bug
Attachments:
Description                       Flags
configuration files and stores    none
agent log containing ssl debug    none
server log with ssl debug         none

Description Filip Brychta 2016-08-19 18:09:33 UTC
Created attachment 1192258 [details]
configuration files and stores

Description of problem:
Agent autoupgrade fails with "Received fatal alert: bad_certificate" when 2-way authentication is used.

Version-Release number of selected component (if applicable):
Tried on 3.3.5, but the problem is probably present in all versions.

How reproducible:
2/2

Steps to Reproduce:
1. install JON 3.3.5 + 1 agent, use hostnames instead of IPs everywhere
2. enable 2-way authentication (use your own hostnames):
a) generate a key on the server:
   keytool -genkey -dname "CN=jon-335-sec.bc.jonqe.lab.eng.bos.redhat.com" -keystore server1-keystore.dat -validity 3650 -alias server1 -keyalg RSA -storetype JKS -keypass secret -storepass secret
b) generate a key on the agent:
   keytool -genkey -dname "CN=fbr-ag1-sec.bc.jonqe.lab.eng.bos.redhat.com" -keystore server2-keystore.dat -validity 3650 -alias server2 -keyalg RSA -storetype JKS -keypass secret -storepass secret
c) export the certificates of both server and agent, e.g. for the agent:
   keytool -export -keystore server2-keystore.dat -alias server2 -storetype JKS -storepass secret -file server2-cert
d) import BOTH certificates into the truststore, e.g. for the agent certificate (and similarly for server1-cert):
   keytool -import -keystore truststore.dat -alias server2 -storetype JKS -file server2-cert -noprompt -keypass secret -storepass secret
e) copy the matching keystores and truststores to the default locations defined in rhq-server.properties and agent-configuration.xml, or change the configuration to point to the correct locations
f) set rhq-server.properties and the agent configuration according to the attached examples (make sure to use the correct hostnames)
3. start server
4. start agent with --cleanconfig to get new configuration
5. there are no exceptions in the logs and resources are up
6. stop server: rhqctl stop
7. apply CP7: jon-server-3.3.0.GA-update-07/apply-updates.sh jon-server-3.3.0.GA
8. start server: rhqctl start
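
Before starting the server and agent, it can help to sanity-check the stores produced in step 2: the keystore should hold a private key entry, and the truststore should hold certificate entries for both aliases (server1 and server2). The following is only a minimal sketch of such a check using the standard java.security.KeyStore API. The class name StoreCheckSketch and its command-line arguments are hypothetical and not part of JON.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.security.KeyStore;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration only; not part of JON/RHQ.
public class StoreCheckSketch {

    /** Returns one line per entry, e.g. "server2 (key)" or "server1 (cert)". */
    static List<String> describe(KeyStore store) throws Exception {
        List<String> lines = new ArrayList<>();
        for (String alias : Collections.list(store.aliases())) {
            String kind;
            if (store.isKeyEntry(alias)) {
                kind = "key";   // private key plus certificate chain
            } else if (store.isCertificateEntry(alias)) {
                kind = "cert";  // trusted certificate only
            } else {
                kind = "other";
            }
            lines.add(alias + " (" + kind + ")");
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: StoreCheckSketch <storepass> <store file>...");
            return;
        }
        char[] password = args[0].toCharArray();
        for (int i = 1; i < args.length; i++) {
            // The steps above create JKS stores, so JKS is assumed here.
            KeyStore store = KeyStore.getInstance("JKS");
            try (InputStream in = new FileInputStream(args[i])) {
                store.load(in, password);
            }
            System.out.println(args[i] + ": " + describe(store));
        }
    }
}
```

Running it with the store password followed by the keystore and truststore paths lists the aliases in each file. A keystore that shows no "key" entry could not be used to present a client certificate.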




Actual results:
Server and co-located agent are started correctly.
Remote agent fails to upgrade itself:
2016-08-19 07:40:44,031 DEBUG [RHQ Agent Update Thread] (org.rhq.enterprise.agent.AgentUpdateVersion)- {AgentUpdateVersion.update-version-retrieval}Getting the agent update version via URL [https://jon-335-sec.bc.jonqe.lab.eng.bos.redhat.com:7443/agentupdate/version]
2016-08-19 07:40:44,083 FATAL [RHQ Agent Update Thread] (org.rhq.enterprise.agent.AgentUpdateThread)- {PromptCommand.update.download-failed}Failed to download the agent update binary. Cause: Received fatal alert: bad_certificate
2016-08-19 07:40:44,084 FATAL [RHQ Agent Update Thread] (org.rhq.enterprise.agent.AgentUpdateThread)- {AgentUpdateThread.exception}The agent update thread encountered an exception: javax.net.ssl.SSLHandshakeException:Received fatal alert: bad_certificate -> javax.net.ssl.SSLHandshakeException:Received fatal alert: bad_certificate
2016-08-19 07:40:44,084 FATAL [RHQ Agent Update Thread] (org.rhq.enterprise.agent.AgentUpdateThread)- {AgentUpdateThread.cannot-restart-retry}The agent cannot restart after the aborted update, will try to update again in [60,000]ms

Expected results:
Remote agent is autoupgraded correctly

Additional info:
In attachment:
agent-configuration-remote.xml -- agent configuration used by remote agent
rhq-server.properties 
rhq.truststore -- truststore used by JON server and co-located agent
rhq.keystore -- keystore used by JON server and co-located agent
truststore.dat -- truststore for remote agent
keystore.dat -- keystore for remote agent

Everything seems to work except the agent autoupgrade.
When client authentication is disabled on the server (rhq.server.tomcat.security.client-auth-mode=false), the autoupgrade works. After the upgrade, client authentication can be enabled again and everything works.

SSL debug logs are attached (server.log and rhq-agent-wrapper.log).
The interesting part is in server.log:
06:35:15,160 INFO  [stdout] (http-10.16.23.205:7443-2) http-10.16.23.205:7443-2, IOException in getSession():  javax.net.ssl.SSLHandshakeException: null cert chain
Could that mean that the agent did not send its certificate?

Comment 1 Filip Brychta 2016-08-19 18:10:09 UTC
Created attachment 1192259 [details]
agent log containing ssl debug

Comment 2 Filip Brychta 2016-08-19 18:10:45 UTC
Created attachment 1192260 [details]
server log with ssl debug

Comment 5 Josejulio Martínez 2016-08-23 19:05:58 UTC
A closer inspection reveals that the agent does not load the keystore: https://github.com/rhq-project/rhq/pull/271

Before:
> update --version
Failed to get the agent update version from [https://avalanche.local:7443/agentupdate/version]. Cause: javax.net.ssl.SSLHandshakeException: Received fatal alert: bad_certificate

After:
> update --version
Agent update version obtained from [https://avalanche.local:7443/agentupdate/version]:
Agent Update Binary Version: 4.14.0-SNAPSHOT (dc20141)
         This Agent Version: 4.14.0-SNAPSHOT (dc20141)
Agent is up to date.
> update --download
Downloaded the agent update binary to [/home/josejulio/rhq/rhq-enterprise-agent-4.14.0-SNAPSHOT.jar]
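
For readers less familiar with JSSE, "loading the keystore" for a secure connection can be sketched roughly as follows. This is not the code from the pull request, only a minimal illustration using standard javax.net.ssl classes. The class name TwoWaySslSketch, its method names and its command-line arguments are hypothetical. The point is that the SSLContext must be initialized with key managers built from the agent keystore. If only trust managers are supplied, the client has no certificate to present, which is consistent with the "null cert chain" message seen on the server.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.security.KeyStore;
import javax.net.ssl.KeyManagerFactory;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

// Hypothetical illustration only; not the RHQ agent implementation.
public class TwoWaySslSketch {

    /** Loads a JKS store from the given file. */
    static KeyStore loadJks(String path, char[] password) throws Exception {
        KeyStore store = KeyStore.getInstance("JKS");
        try (InputStream in = new FileInputStream(path)) {
            store.load(in, password);
        }
        return store;
    }

    /** Builds an SSLContext that can present a client certificate. */
    static SSLContext buildContext(KeyStore keyStore, char[] keyPassword,
                                   KeyStore trustStore) throws Exception {
        // Key managers supply the client's own certificate and private key.
        KeyManagerFactory kmf =
            KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
        kmf.init(keyStore, keyPassword);

        // Trust managers decide whether the server certificate is accepted.
        TrustManagerFactory tmf =
            TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(trustStore);

        SSLContext context = SSLContext.getInstance("TLS");
        // Passing null instead of kmf.getKeyManagers() would leave the
        // client without a certificate to send during the handshake.
        context.init(kmf.getKeyManagers(), tmf.getTrustManagers(), null);
        return context;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.out.println("usage: TwoWaySslSketch <keystore> <truststore> <password>");
            return;
        }
        char[] password = args[2].toCharArray();
        SSLContext context = buildContext(
            loadJks(args[0], password), password, loadJks(args[1], password));
        // An HTTPS client would then take context.getSocketFactory() and set it
        // on its connection, e.g. with HttpsURLConnection.setSSLSocketFactory(...).
        System.out.println("SSLContext ready, protocol: " + context.getProtocol());
    }
}
```

The sketch assumes the same password protects the store and the key, as in the reproduction steps, where both -keypass and -storepass are "secret".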

Comment 6 Josejulio Martínez 2016-08-24 17:51:21 UTC
commit cd4339d4e56e39075ce45f89b98007fe15074e9b
Author: Josejulio Martínez <jmartine>
Date:   Tue Aug 23 12:14:31 2016 -0500

    Bug 1368556 - Loads keystore when creating a secure connection.

Comment 10 Filip Brychta 2017-01-25 14:36:54 UTC
Verified on update from 3.3.7 to 3.3.8.ER01

Comment 11 errata-xmlrpc 2017-02-16 18:45:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2017-0285.html