Bug 1071020

Summary: [RHEV] Upgrading rhevm-reports from 3.2 to 3.3 without Apache running fails and prevents any future rhevm-reports-setup from succeeding
Product: Red Hat Enterprise Virtualization Manager Reporter: James W. Mills <jamills>
Component: ovirt-engine-reports Assignee: Yedidyah Bar David <didi>
Status: CLOSED ERRATA QA Contact: Barak Dagan <bdagan>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 3.3.0 CC: acathrow, adahms, asegundo, bazulay, didi, iheim, jraju, meverett, pablo.iranzo, pstehlik, Rhev-m-bugs, yeylon
Target Milestone: --- Keywords: TestOnly
Target Release: 3.4.0   
Hardware: All   
OS: Linux   
Whiteboard: integration
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Previously, attempting to upgrade the Reports feature when the httpd service was not running would cause the upgrade operation to fail after creating the new engine_reports user but before the randomly generated password for that user could be written to the configuration for Red Hat Enterprise Virtualization. Now, the logic used to write the randomly generated password has been revised so that the password is written immediately after it is generated, making it possible to access these credentials even in the event that the upgrade operation fails.
Story Points: ---
Clone Of:
: 1072466 Environment:
Last Closed: 2014-06-09 15:27:13 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1072068, 1077775, 1086003, 1111749, 1121792    
Bug Blocks: 1072466, 1078909, 1142926    

Description James W. Mills 2014-02-27 22:56:58 UTC
Description of problem:

After upgrading rhevm-reports from 3.2 to 3.3, running rhevm-reports-setup while Apache is shut down will cause the ssl2jkstrust.py script to fail.  This failure causes the rhevm-reports-setup to exit with an error *before* the 10-setup-database.conf file is written.  So, the new engine_reports user has been created with a random password, and that password has been put into the jasperreports configuration, but the actual configuration file containing the password does not exist.

Further attempts to re-run rhevm-reports-setup will fail because the engine_reports user has a password that the script cannot know.


Version-Release number of selected component (if applicable):

rhevm-reports-3.3.0-28.el6ev.noarch

How reproducible:

100%

Steps to Reproduce:
1. Upgrade a functional 3.2 RHEV/rhevm-reports configuration to 3.3
2. Run engine-setup
3. Run rhevm-dwh-setup
4. Stop ovirt-engine, ovirt-engine-dwhd, and httpd
5. Run rhevm-reports-setup

Actual results:

A failure that prevents any future successes without manual intervention.

Expected results:

Success.

Additional info:

Here is the original error, when ssl2jkstrust.py fails:

2014-02-27 22:18:12::DEBUG::common_utils::1018::root:: Executing command --> '/usr/share/ovirt-engine-reports/ssl2jkstrust.py --host=rhevm32.awayfar.org --port=443 --keystore=/etc/ovirt-engine/ovirt-engine-reports/trust.jks --storepass=mypass' in working directory '/root'
2014-02-27 22:18:12::DEBUG::common_utils::1073::root:: output =
2014-02-27 22:18:12::DEBUG::common_utils::1074::root:: stderr = Traceback (most recent call last):
  File "/usr/share/ovirt-engine-reports/ssl2jkstrust.py", line 116, in <module>
    main()
  File "/usr/share/ovirt-engine-reports/ssl2jkstrust.py", line 95, in main
    for c in getChainFromSSL((options.host, int(options.port)))[1:]:
  File "/usr/share/ovirt-engine-reports/ssl2jkstrust.py", line 45, in getChainFromSSL
    sock.connect(host)
  File "/usr/lib64/python2.6/site-packages/M2Crypto/SSL/Connection.py", line 181, in connect
    self.socket.connect(addr)
  File "<string>", line 1, in connect
socket.error: [Errno 111] Connection refused

2014-02-27 22:18:12::DEBUG::common_utils::1075::root:: retcode = 1
2014-02-27 22:18:12::ERROR::rhevm-reports-setup::1280::root:: Failed to complete the setup of the reports package!
2014-02-27 22:18:12::DEBUG::rhevm-reports-setup::1281::root:: Traceback (most recent call last):
  File "/usr/bin/rhevm-reports-setup", line 1255, in main
    updateApplicationSecurity()
  File "/usr/bin/rhevm-reports-setup", line 930, in updateApplicationSecurity
    failOnError=True,
  File "/usr/share/ovirt-engine-reports/common_utils.py", line 1078, in execCmd
    raise Exception(msg)
Exception: Return Code is not zero


I believe this error prompted BZ#1058016 and BZ#1064827.  However, looking at the upstream commit, I do not *think* that this is a fix for Apache not running at all.

I believe there should be some early-on checking of whether or not httpd is running, and what the status of its trust store is.
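
To illustrate the kind of early check suggested here, a minimal sketch in Python 3 (the setup scripts in this report run on Python 2.6, so this is not the actual setup code). The function names `is_port_open` and `preflight` are hypothetical, and the sketch only tests whether something accepts TCP connections on the HTTPS port; it does not inspect the trust store.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "Connection refused" (Errno 111), timeouts, and DNS failures.
        return False


def preflight(host, port=443):
    """Fail early, before any roles or passwords are created."""
    if not is_port_open(host, port):
        raise RuntimeError(
            "Cannot connect to %s:%d; is httpd running?" % (host, port)
        )
```

Calling something like `preflight("rhevm32.awayfar.org")` at the very start of setup would turn the traceback above into a clear message while the database is still untouched.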

The workaround for this is to extract the "CREATE ROLE" lines from the first rhevm-reports-setup log, and manually create /etc/ovirt-engine-reports/ovirt-engine-reports.conf.d/10-setup-database.conf.  Unfortunately, it is unlikely the customer will know what variable names to use.
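
A rough sketch of the extraction step, in Python 3. It assumes the log records the statement with the literal text "CREATE ROLE" on a single line; the exact log format is not shown in this report, so the pattern may need adjusting. The helper name `find_create_role_lines` is hypothetical, the log path is left to the caller, and the sketch does not attempt to write 10-setup-database.conf because the variable names are not given here.

```python
import re

# Assumption: the statement appears on a single log line.
CREATE_ROLE = re.compile(r"CREATE\s+ROLE\b.*", re.IGNORECASE)


def find_create_role_lines(log_path):
    """Return every 'CREATE ROLE ...' fragment found in the given log file."""
    matches = []
    with open(log_path, errors="replace") as log:
        for line in log:
            found = CREATE_ROLE.search(line)
            if found:
                matches.append(found.group(0).rstrip())
    return matches
```

The returned lines are expected to contain the generated password, so the output should be handled as carefully as the log itself.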

Comment 1 James W. Mills 2014-02-27 23:08:01 UTC
Further attempts to run rhevm-reports-setup after the initial failure look like this:

2014-02-27 22:42:28::DEBUG::common_utils::1018::root:: Executing command --> '/usr/bin/psql -U engine_reports -d rhevmreports -h localhost -p 5432 -f /usr/share/jasperreports-server-pro/buildomatic/install_resources/sql/postgresql/upgrade-postgresql-4.7.0-5.0.0-pro.sql' in working directory '/root'
2014-02-27 22:42:28::DEBUG::common_utils::1073::root:: output = 
2014-02-27 22:42:28::DEBUG::common_utils::1074::root:: stderr = psql: FATAL:  password authentication failed for user "engine_reports"

2014-02-27 22:42:28::DEBUG::common_utils::1075::root:: retcode = 2
2014-02-27 22:42:28::ERROR::decorators::26::root:: Traceback (most recent call last):
  File "/usr/share/ovirt-engine-reports/decorators.py", line 19, in wrapped_f
    output = f(*args)
  File "/usr/bin/rhevm-reports-setup", line 180, in updateDbSchema
    envDict={'ENGINE_PGPASS': TEMP_PGPASS},
  File "/usr/share/ovirt-engine-reports/common_utils.py", line 1078, in execCmd
    raise Exception(msg)
Exception: Return Code is not zero

Comment 4 Yedidyah Bar David 2014-03-04 15:47:23 UTC
25339 makes setup save the credentials (including the randomly-generated password) right after creating them and changing ownership in the db.

Verified that it helps the described flow by doing:

install and setup 3.2 engine/dwh/reports
upgrade engine to 3.3
upgrade dwh to 3.3
yum update rhevm-reports to 3.3
stop services ovirt-engine ovirt-engine-dwhd httpd
rhevm-reports-setup
see that it fails as described, but 10-setup-database.conf is created
start these services
rhevm-reports-setup - this time it finished successfully.

Checking access to httpd etc. has not been addressed yet.

Comment 6 Yedidyah Bar David 2014-03-04 16:15:29 UTC
Moving to ON_QA as the relevant code was rewritten in 3.4.

Comment 11 Barak Dagan 2014-04-24 16:07:56 UTC
Verified on av6.1.

Installation succeeded when the httpd service was down.

Comment 13 errata-xmlrpc 2014-06-09 15:27:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-0602.html