Bug 1287195 - ovirt-ha-agent should explicitly fail if the configuration volume is not valid
Summary: ovirt-ha-agent should explicitly fail if the configuration volume is not valid
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-hosted-engine-ha
Classification: oVirt
Component: Agent
Version: 1.3.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: low
Target Milestone: ovirt-3.6.1
Target Release: 1.3.3.4
Assignee: Simone Tiraboschi
QA Contact: Nikolai Sednev
URL:
Whiteboard:
Depends On:
Blocks: ovirt-hosted-engine-ha-1.3.4.3
Reported: 2015-12-01 17:50 UTC by Simone Tiraboschi
Modified: 2016-02-18 10:53 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
If, for any reason (e.g. data corruption on disk), the configuration volume on the shared domain was not valid, ovirt-ha-agent logged the problem only at debug level and failed silently. The log level has been increased so that the issue is clearly visible.
Clone Of:
Environment:
Last Closed: 2016-02-18 10:53:26 UTC
oVirt Team: Integration
Embargoed:
rule-engine: ovirt-3.6.z+
bmcclain: planning_ack+
sbonazzo: devel_ack+
mavital: testing_ack+


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 49574 0 master MERGED log: increasing the log level on shared conf errors 2021-01-25 09:27:53 UTC
oVirt gerrit 50237 0 ovirt-hosted-engine-ha-1.3 MERGED log: increasing the log level on shared conf errors 2021-01-25 09:27:53 UTC

Description Simone Tiraboschi 2015-12-01 17:50:08 UTC
Description of problem:
ovirt-ha-agent reports an invalid configuration volume only with a debug message and silently falls back to the local copy of the configuration files.
This can make issues with the configuration volume less evident.

Version-Release number of selected component (if applicable):


How reproducible:
Only as the result of a failed setup

Steps to Reproduce:
1. Manually destroy the configuration volume with dd and restart ha-agent

Actual results:
It silently falls back to the local copy of the configuration files, with only a few debug messages.

Expected results:
It should explicitly report the issue.
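
The expected behaviour can be sketched roughly as follows. This is an illustration only, not the code in ovirt-hosted-engine-ha: the logger name, `REQUIRED_KEYS` and `choose_conf` are hypothetical, and the sketch assumes the configuration has already been parsed into a dict. It keeps the fallback to the local copy and only changes the level at which the problem is reported.

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("he_conf_sketch")

# Assumption: a shared configuration without these keys counts as invalid.
REQUIRED_KEYS = ("version",)


def choose_conf(shared_conf, local_conf):
    """Return (conf, source), preferring the shared copy when it looks valid."""
    missing = [key for key in REQUIRED_KEYS if key not in shared_conf]
    if missing:
        # Report at ERROR instead of DEBUG so the problem is visible
        # with the default log settings.
        logger.error(
            "shared configuration is not valid (missing: %s), "
            "falling back to the local copy",
            ", ".join(missing),
        )
        return local_conf, "local"
    return shared_conf, "shared"
```

With a wiped volume the parsed shared configuration would be empty, so `choose_conf({}, local_conf)` returns the local copy and emits one error-level record.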

Additional info:

Comment 1 Nikolai Sednev 2015-12-20 12:03:16 UTC
Hi Simone,
Could you please describe the reproduction steps in a bit more detail?
Might the reproduction flow from https://bugzilla.redhat.com/show_bug.cgi?id=1116469 match our case?

Comment 2 Simone Tiraboschi 2015-12-28 20:03:52 UTC
Hi Nikolai,

Deploy hosted-engine as usual, then manually wipe the configuration volume on the shared storage (you can find its UUID in hosted-engine.conf) and restart the agent.
Previously this failed with just a single line at debug level (and normally we only log at info level and above); now the problem should be reported at error level.
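
A minimal sketch for looking up the volume UUID, assuming hosted-engine.conf is a plain key=value file and that the key is named `conf_volume_UUID` (the name that shows up in Comment 3). The helper `read_conf_value` is hypothetical, and the location of hosted-engine.conf is left to the caller.

```python
def read_conf_value(text, key):
    """Return the value stored for `key` in key=value text, or None if absent."""
    for line in text.splitlines():
        line = line.strip()
        # Skip comments and anything that is not a key=value pair.
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name.strip() == key:
            return value.strip()
    return None


def conf_volume_uuid(conf_path):
    """Read the configuration volume UUID from the given hosted-engine.conf path."""
    with open(conf_path) as conf_file:
        return read_conf_value(conf_file.read(), "conf_volume_UUID")
```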

Comment 3 Nikolai Sednev 2016-02-01 14:05:01 UTC
Ran dd against the configuration volume (conf_volume_UUID=98d26505-2bcf-43e3-9425-a958efefad68) and got the following error messages in /var/log/ovirt-hosted-engine-ha/broker.log:

Thread-24289::ERROR::2016-02-01 16:48:24,175::heconflib::111::ovirt_hosted_engine_ha.broker.notifications.Notifications.config::(validateConfImage) 'version' is not stored in the HE configuration image
Thread-24289::ERROR::2016-02-01 16:48:24,177::notifications::35::ovirt_hosted_engine_ha.broker.notifications.Notifications::(send_email) [Errno 111] Connection refused
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/notifications.py", line 24, in send_email
    server = smtplib.SMTP(cfg["smtp-server"], port=cfg["smtp-port"])
  File "/usr/lib64/python2.7/smtplib.py", line 255, in __init__
    (code, msg) = self.connect(host, port)
  File "/usr/lib64/python2.7/smtplib.py", line 315, in connect
    self.sock = self._get_socket(host, port, self.timeout)
  File "/usr/lib64/python2.7/smtplib.py", line 290, in _get_socket
    return socket.create_connection((host, port), timeout)
  File "/usr/lib64/python2.7/socket.py", line 571, in create_connection
    raise err
error: [Errno 111] Connection refused
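
For verification, error-level entries like the ones above can be picked out of broker.log with a small script. This is a sketch that assumes each entry sits on a single line with `::`-separated fields in the order seen above (thread, level, timestamp, module, line number, logger, then `(function) message`). The names `LogEntry`, `parse_line` and `error_entries` are made up for this example.

```python
import re
from collections import namedtuple

LogEntry = namedtuple(
    "LogEntry",
    "thread level timestamp module line logger function message",
)

# The last field looks like "(function_name) free-form message".
_FUNC_RE = re.compile(r"^\((\w+)\)\s*(.*)$", re.DOTALL)


def parse_line(line):
    """Parse one log line into a LogEntry, or return None if it does not match."""
    parts = line.rstrip("\n").split("::", 6)
    if len(parts) != 7:
        return None  # e.g. traceback lines, which carry no '::' separators
    thread, level, timestamp, module, lineno, logger_name, rest = parts
    match = _FUNC_RE.match(rest)
    if match is None or not lineno.isdigit():
        return None
    return LogEntry(thread, level, timestamp, module, int(lineno),
                    logger_name, match.group(1), match.group(2))


def error_entries(lines):
    """Return the parsed entries whose level is ERROR."""
    parsed = (parse_line(line) for line in lines)
    return [entry for entry in parsed if entry is not None and entry.level == "ERROR"]
```

Traceback lines do not match the field layout and are skipped, so only the two `ERROR` records would be returned for the excerpt above.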


Components on host:
sanlock-3.2.4-1.el7.x86_64
ovirt-host-deploy-1.4.1-1.el7ev.noarch
ovirt-vmconsole-1.0.0-1.el7ev.noarch
libvirt-client-1.2.17-13.el7_2.2.x86_64
mom-0.5.1-1.el7ev.noarch
qemu-kvm-rhev-2.3.0-31.el7_2.4.x86_64
ovirt-hosted-engine-ha-1.3.3.7-1.el7ev.noarch
ovirt-vmconsole-host-1.0.0-1.el7ev.noarch
vdsm-4.17.18-0.el7ev.noarch
Red Hat Enterprise Virtualization Hypervisor (Beta) release 7.2 (20160126.0.el7ev)


Engine:
rhevm-3.6.3-0.1.el6.noarch
Linux version 2.6.32-573.12.1.el6.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-16) (GCC) ) #1 SMP Mon Nov 23 12:55:32 EST 2015

