Bug 1535796 - Undeployment of HE is not graceful
Summary: Undeployment of HE is not graceful
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.1.8
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ovirt-4.4.0
Target Release: ---
Assignee: Evgeny Slutsky
QA Contact: Nikolai Sednev
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-01-18 06:45 UTC by Germano Veit Michel
Modified: 2021-03-11 19:48 UTC
CC List: 8 users

Fixed In Version: rhv-4.4.0-28
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-08-04 13:16:05 UTC
oVirt Team: Integration
Target Upstream Version:
Embargoed:
lsvaty: testing_plan_complete-


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 1535268 | 0 | medium | CLOSED | HE Undeploy leaves host metadata on shared storage. | 2021-05-01 16:35:15 UTC
Red Hat Product Errata | RHSA-2020:3247 | 0 | None | None | None | 2020-08-04 13:16:42 UTC

Internal Links: 1535268

Description Germano Veit Michel 2018-01-18 06:45:52 UTC
Description of problem:

During Hosted-Engine undeploy on host reinstall, the hosted-engine.conf file is apparently removed first, and only then is the HA daemon stopped.

The shutdown (undeploy) is not clean: the agent goes into a failed state (systemd) and fills the logs with unnecessary errors.

Deploy logs:
2018-01-11 21:11:08 DEBUG otopi.filetransaction filetransaction.prepare:219 backup '/etc/ovirt-hosted-engine/hosted-engine.conf'->'/etc/ovirt-hosted-engine/hosted-engine.conf.20180111211108'                                                                 
2018-01-11 21:11:24 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:813 execute: ('/bin/systemctl', 'stop', 'ovirt-ha-agent.service'), executable='None', cwd='None', env=None                                                                    

From agent side:

MainThread::ERROR::2018-01-11 21:11:26,620::config::163::ovirt_hosted_engine_ha.lib.storage_server.StorageServer.config::(_load_single_conf_file) Configuration file '/etc/ovirt-hosted-engine/hosted-engine.conf' not available [[Errno 2] No such file or directory: '/etc/ovirt-hosted-engine/hosted-engine.conf']                          
MainThread::WARNING::2018-01-11 21:11:26,620::hosted_engine::469::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Error while monitoring engine: 'Configuration value not found: file=/etc/ovirt-hosted-engine/hosted-engine.conf, key=domainType'                                                                 
MainThread::WARNING::2018-01-11 21:11:26,620::hosted_engine::472::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Unexpected error                                                                                                 
Traceback (most recent call last):                                                                                                                                                                                                                             
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 437, in start_monitoring                                                                                                                                         
    self._initialize_storage_images()                                                                                                                                                                                                                          
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 631, in _initialize_storage_images                                                                                                                               
    sserver = storage_server.StorageServer()                                                                                                                                                                                                                   
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/storage_server.py", line 39, in __init__                                                                                                                                                   
    self._domain_type = self._config.get(config.ENGINE, config.DOMAIN_TYPE)                                                                                                                                                                                    
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/env/config.py", line 226, in get                                                                                                                                                               
    key                                                                                                                                                                                                                                                        
KeyError: 'Configuration value not found: file=/etc/ovirt-hosted-engine/hosted-engine.conf, key=domainType'  

MainThread::INFO::2018-01-11 21:12:31,841::agent::144::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Agent shutting down                                                                                                                                     

With several errors in between.
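
The ordering can be read straight from the timestamps quoted above. Below is a minimal sketch in Python that sorts the four quoted events and prints their offsets. The event labels are shorthand introduced here, not log fields. The host-deploy timestamps have no sub-second part, so ",000" is appended as an assumption, and both logs are assumed to share the same host clock.

```python
from datetime import datetime

# Timestamps copied from the log excerpts in this report.
# Labels are illustrative; ",000" is appended where the log has no milliseconds.
EVENTS = [
    ("hosted-engine.conf backed up by otopi", "2018-01-11 21:11:08,000"),
    ("systemctl stop ovirt-ha-agent.service", "2018-01-11 21:11:24,000"),
    ("agent: configuration file not available", "2018-01-11 21:11:26,620"),
    ("agent: shutting down", "2018-01-11 21:12:31,841"),
]

FMT = "%Y-%m-%d %H:%M:%S,%f"


def timeline(events):
    """Return (label, seconds since first event) pairs in chronological order."""
    parsed = sorted((datetime.strptime(ts, FMT), label) for label, ts in events)
    start = parsed[0][0]
    return [(label, (when - start).total_seconds()) for when, label in parsed]


if __name__ == "__main__":
    for label, offset in timeline(EVENTS):
        print(f"+{offset:7.3f}s  {label}")
```

Read this way, the configuration file is touched 16 seconds before the stop request, and the agent keeps logging for roughly 67 seconds after the stop request before it reports shutting down.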

Version-Release number of selected component (if applicable):
rhevm-4.1.8.2-0.1.el7.noarch

How reproducible:
100%

Steps to Reproduce:
1. Undeploy Hosted-Engine
2. Check ha-agent status and logs

Actual results:
ha-agent is in a failed state. Logs show errors.

Expected results:
ha-agent stopped, clean logs with graceful shutdown.
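
For illustration only, the expected ordering could be sketched as follows. This is not the code of ovirt-host-deploy or of the engine. The function names (`undeploy`, `stop_unit`, `remove_file`) are hypothetical, and stopping the agent before the broker is an assumption. Only the unit names and the configuration path come from the logs in this report.

```python
import os
import subprocess

# Unit names and the config path are taken from the logs in this report.
HA_UNITS = ("ovirt-ha-agent.service", "ovirt-ha-broker.service")
HE_CONF = "/etc/ovirt-hosted-engine/hosted-engine.conf"


def stop_unit(unit):
    """Stop a systemd unit and raise if systemctl reports failure."""
    subprocess.run(["systemctl", "stop", unit], check=True)


def remove_file(path):
    """Remove a file, ignoring the case where it is already gone."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass


def undeploy(stop=stop_unit, remove=remove_file):
    """Hypothetical ordering: stop the HA daemons first, then drop the config."""
    for unit in HA_UNITS:
        stop(unit)
    # The config is removed only after every daemon that reads it has stopped.
    remove(HE_CONF)
```

The `stop` and `remove` parameters exist only so the ordering can be exercised without touching a real host, for example by passing functions that record their arguments.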

Comment 2 Daniel Gur 2019-08-28 13:13:49 UTC
sync2jira

Comment 4 Sandro Bonazzola 2019-11-13 08:45:24 UTC
Is this still reproducible?

Comment 5 Sandro Bonazzola 2020-03-02 13:53:13 UTC
Moving to QE and marking as test only. If this is reproducible we'll dig into fresh data.

Comment 6 Sandro Bonazzola 2020-03-17 13:57:39 UTC
Moving to the engine, since ovirt-host-deploy is not going to be shipped in 4.4 and this should work with the Ansible deployment as well.

Comment 8 Nikolai Sednev 2020-04-06 18:21:19 UTC
ovirt-ha-broker.service was still running after undeployment of the secondary HA host.


ovirt-ha-broker.service - oVirt Hosted Engine High Availability Communications Broker
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-broker.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2020-04-06 20:56:43 IDT; 19min ago
 Main PID: 38446 (ovirt-ha-broker)
    Tasks: 13 (limit: 178310)
   Memory: 43.2M
   CGroup: /system.slice/ovirt-ha-broker.service
           └─38446 /usr/libexec/platform-python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker
● ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring Agent
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Mon 2020-04-06 21:14:39 IDT; 2min 0s ago
  Process: 38618 ExecStart=/usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent (code=exited, status=0/SUCCESS)
 Main PID: 38618 (code=exited, status=0/SUCCESS)

MainThread::INFO::2020-04-06 21:14:39,282::hosted_engine::639::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_domain_monitor) Stopped VDSM domain monitor
MainThread::INFO::2020-04-06 21:14:39,283::agent::89::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Agent shutting down

Agent was shut down gracefully without any errors.
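
A check like the one above can be scripted against pasted `systemctl status` output. The following is a rough sketch, assuming the output has the layout shown in this comment. The helper name `unit_states` and the regular expressions are introduced here for illustration, and the sample is abbreviated from the output above.

```python
import re

# Matches the unit header line, with or without the leading status bullet.
UNIT_RE = re.compile(r"^\s*(?:●\s*)?(\S+\.service) - ")
# Captures the first word after "Active:", e.g. "active" or "inactive".
ACTIVE_RE = re.compile(r"^\s*Active:\s+(\S+)")


def unit_states(status_text):
    """Map unit name -> Active: state from pasted `systemctl status` output."""
    states = {}
    current = None
    for line in status_text.splitlines():
        unit = UNIT_RE.match(line)
        if unit:
            current = unit.group(1)
            continue
        active = ACTIVE_RE.match(line)
        if active and current:
            states[current] = active.group(1)
    return states


# Abbreviated from the status output quoted in this comment.
SAMPLE = """\
ovirt-ha-broker.service - oVirt Hosted Engine High Availability Communications Broker
   Active: active (running) since Mon 2020-04-06 20:56:43 IDT; 19min ago
● ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring Agent
   Active: inactive (dead) since Mon 2020-04-06 21:14:39 IDT; 2min 0s ago
"""

if __name__ == "__main__":
    print(unit_states(SAMPLE))
```

On the sample, the broker is reported as active and the agent as inactive, matching the observation that only the agent was stopped.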

Comment 9 Nikolai Sednev 2020-04-06 18:22:42 UTC
Tested on:
ovirt-hosted-engine-ha-2.4.2-1.el8ev.noarch
ovirt-hosted-engine-setup-2.4.4-1.el8ev.noarch
rhvm-appliance.x86_64 2:4.4-20200403.0.el8ev
Linux 4.18.0-193.el8.x86_64 #1 SMP Fri Mar 27 14:35:58 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux release 8.2 (Ootpa)

Comment 17 errata-xmlrpc 2020-08-04 13:16:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: RHV Manager (ovirt-engine) 4.4 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:3247

