Bug 1101299 - Hosted engine upgrade from 3.3 to 3.4, ovirt-ha-agent die after three errors
Summary: Hosted engine upgrade from 3.3 to 3.4, ovirt-ha-agent die after three errors
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-hosted-engine-ha
Version: 3.4.0
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 3.4.0
Assignee: Jiri Moskovcak
QA Contact: Artyom
URL:
Whiteboard: sla
Duplicates: 1099395
Depends On:
Blocks:
Reported: 2014-05-26 16:45 UTC by Artyom
Modified: 2016-02-10 20:13 UTC (History)

Fixed In Version: ovirt-hosted-engine-ha-1.1.2-5.el6ev
Doc Type: Bug Fix
Doc Text:
Previously, upgrading an environment in a hosted engine configuration would fail under certain conditions. This was caused by an error in the code used to store the state of the engine during the upgrade process, whereby the state could be correctly parsed in a Red Hat Enterprise Virtualization 3.4 environment, but not in a Red Hat Enterprise Virtualization 3.3 environment. Now, this code has been updated so that the state of the engine can be correctly parsed by both versions.
Clone Of: 1092075
Environment:
Last Closed: 2014-06-09 14:26:52 UTC
oVirt Team: SLA


Links
System | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2014:0671 | normal | SHIPPED_LIVE | ovirt-hosted-engine-ha bug fix and enhancement update | 2014-06-09 18:25:01 UTC
oVirt gerrit | 28239 | None | None | None | Never
oVirt gerrit | 28240 | None | None | None | Never
Red Hat Bugzilla | 1092075 | None | None | None | Never

Description Artyom 2014-05-26 16:45:30 UTC
Description of problem:
An exception appeared while upgrading the whole environment from 3.3 to 3.4.
I have a hosted engine environment.
At the step where the environment has one host_3.3 with hosted-engine 3.3 (the engine VM is running on it)
and another host_3.4 with 3.4, the agent on host_3.3 shuts down after 3 errors:
MainThread::WARNING::2014-05-26 19:12:00,466::hosted_engine::336::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Unexpected error
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 329, in start_monitoring
    self._collect_all_host_stats()
  File "/usr/lib/python2.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 885, in _collect_all_host_stats
    in json.loads(md['engine-status']).iteritems()])
  File "/usr/lib64/python2.6/json/__init__.py", line 307, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.6/json/decoder.py", line 319, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python2.6/json/decoder.py", line 336, in raw_decode
    obj, end = self._scanner.iterscan(s, **kw).next()
  File "/usr/lib64/python2.6/json/scanner.py", line 55, in iterscan
    rval, next_pos = action(m, context)
  File "/usr/lib64/python2.6/json/decoder.py", line 171, in JSONObject
    raise ValueError(errmsg("Expecting property name", s, end))
ValueError: Expecting property name: line 1 column 1 (char 1)
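
The final ValueError ("Expecting property name" at char 1) means the decoder saw an opening brace and then something other than a double-quoted key. The report does not show the actual engine-status payload, so the following is only a minimal sketch of one way such an error can arise: a Python dict written out with str() instead of json.dumps(). The field names "vm" and "health" are assumptions made for illustration.

```python
import json

# Hypothetical payload: a Python dict repr uses single quotes,
# which is not valid JSON (assumed here, not taken from the bug).
bad_status = str({"vm": "up", "health": "good"})

# The same data serialized properly.
good_status = json.dumps({"vm": "up", "health": "good"})


def try_parse(raw):
    """Return (parsed, None) on success or (None, error text) on failure."""
    try:
        return json.loads(raw), None
    except ValueError as exc:  # JSON decode errors are ValueError subclasses
        return None, str(exc)


parsed, error = try_parse(bad_status)
print(parsed, error)
print(try_parse(good_status))
```

The exact error text differs between Python versions, but in each case the failure is reported at the first character after the opening brace.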

Version-Release number of selected component (if applicable):
Before upgrade:
2 Host - ovirt-hosted-engine-ha.noarch 0:1.0.0-3.el6ev
engine vm - is36.4
After upgrade:
2 Host - ovirt-hosted-engine-ha.noarch 0:1.0.0-3.el6ev
engine vm - av9.2

How reproducible:
Always

Steps to Reproduce:
Have a hosted engine environment (hosts and engine VM on 3.3)
1. From one of the hosts, run hosted-engine --set-maintenance --mode=global
2. Put the host that is not running the engine VM into maintenance
3. Upgrade engine vm to 3.4
4. Upgrade the host to 3.4 (it is now host_3.4), then run service vdsmd restart && service ovirt-ha-broker restart && service ovirt-ha-agent restart
5. Upgrade engine vm to 3.4
6. hosted-engine --set-maintenance --mode=none
7. Wait a few minutes until the error appears in agent.log of host_3.3
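
For the last step, a small helper can count the relevant warnings instead of watching the log by hand. This is a rough sketch that assumes agent.log lines keep the "::"-separated layout seen in the warning quoted in the description. The function name is hypothetical, and the caller supplies the log lines.

```python
def count_unexpected_errors(lines):
    """Count WARNING entries whose message ends with 'Unexpected error'.

    Assumes the '::'-separated layout of the quoted log line:
    thread::level::timestamp::module::lineno::logger::message
    """
    count = 0
    for line in lines:
        parts = line.rstrip("\n").split("::", 6)
        if len(parts) != 7:
            continue  # traceback lines and other continuation lines
        level, message = parts[1], parts[6]
        if level == "WARNING" and message.endswith("Unexpected error"):
            count += 1
    return count


sample = [
    "MainThread::WARNING::2014-05-26 19:12:00,466::hosted_engine::336::"
    "ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::"
    "(start_monitoring) Unexpected error",
    "Traceback (most recent call last):",
    "ValueError: Expecting property name: line 1 column 1 (char 1)",
]
print(count_unexpected_errors(sample))
```

Passing an open file object for agent.log as `lines` works as well, since the function only iterates over lines.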

Actual results:
The error appears in agent.log and the HA agent fails to start

Expected results:
No error in agent.log and the HA agent starts successfully

Additional info:
If, after all the steps, I run hosted-engine --set-maintenance --mode=global
and also upgrade the second host (the one not in maintenance, which runs the VM), everything goes back to normal and the HA agent runs successfully on host_3.4

Comment 1 Jiri Moskovcak 2014-05-29 14:07:01 UTC
*** Bug 1099395 has been marked as a duplicate of this bug. ***

Comment 2 Jiri Moskovcak 2014-05-30 09:05:41 UTC
I got this during the update, but not sure if it's connected: 

warning: /etc/vdsm/vdsm.conf created as /etc/vdsm/vdsm.conf.rpmnew

Checking configuration status...

Traceback (most recent call last):
  File "/usr/bin/vdsm-tool", line 145, in <module>
    sys.exit(main())
  File "/usr/bin/vdsm-tool", line 142, in main
    return tool_command[cmd]["command"](*args[1:])
  File "/usr/lib64/python2.6/site-packages/vdsm/tool/configurator.py", line 230, in configure
    service.service_stop(s)
  File "/usr/lib64/python2.6/site-packages/vdsm/tool/service.py", line 370, in service_stop
    return _runAlts(_srvStopAlts, srvName)
  File "/usr/lib64/python2.6/site-packages/vdsm/tool/service.py", line 351, in _runAlts
    "%s failed" % alt.func_name, out, err)
vdsm.tool.service.ServiceOperationError: ServiceOperationError: _serviceStop failed
Sending stop signal sanlock (7145): [  OK  ]
Waiting for sanlock (7145) to stop:[FAILED]

Comment 3 Jiri Moskovcak 2014-05-30 13:23:03 UTC
As it turned out, it is a bug in 3.4 that makes 3.4 generate wrong JSON, which the 3.3 code is not able to parse -> moving to 3.4
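
Independently of the fix on the writing side, a reader could be made tolerant of a peer's unparsable status so that one bad entry does not abort the whole monitoring loop. The sketch below only illustrates that idea and is not the change that shipped in the errata. Both function names and the shape of `all_metadata` (host id mapped to a dict with an "engine-status" string) are assumptions.

```python
import json


def parse_engine_status(raw):
    """Parse an engine-status string; return None instead of raising."""
    try:
        status = json.loads(raw)
    except (TypeError, ValueError):
        return None  # missing value or invalid JSON
    return status if isinstance(status, dict) else None


def collect_engine_statuses(all_metadata):
    """Map host id -> parsed status, or None where the status is unreadable."""
    return dict(
        (host_id, parse_engine_status(md.get("engine-status")))
        for host_id, md in all_metadata.items()
    )


# Hypothetical metadata for two hosts; host 2 publishes invalid JSON.
metadata = {
    1: {"engine-status": '{"vm": "up", "health": "good"}'},
    2: {"engine-status": "{'vm': 'down'}"},
}
print(collect_engine_statuses(metadata))
```

Whether to treat an unreadable status as "unknown" or to keep failing loudly is a design choice; silently skipping could hide exactly the kind of incompatibility reported here.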

Comment 5 Artyom 2014-06-01 10:40:54 UTC
Verified on ovirt-hosted-engine-ha-1.1.2-5.el6ev.noarch

Comment 6 errata-xmlrpc 2014-06-09 14:26:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-0671.html

