Bug 1364286 - The agent got stuck if the broker takes more than 30 seconds to reach the SMTP server
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-hosted-engine-ha
Version: 3.6.0
Hardware: x86_64
OS: Linux
Priority: high
Severity: urgent
Target Milestone: ovirt-3.6.9
Target Release: 3.6.9
Assignee: Andrej Krejcir
QA Contact: Nikolai Sednev
URL:
Whiteboard: sla
Depends On: 1359059
Blocks:
 
Reported: 2016-08-05 00:31 UTC by Germano Veit Michel
Modified: 2021-08-30 12:25 UTC (History)
CC List: 17 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, the HA broker waited for a non-responsive SMTP server without timing out. This meant that the HA agent waited indefinitely for the HA broker. Now, a timeout has been added to the connection between the HA broker and the SMTP server. This means that the HA broker and the HA agent no longer wait indefinitely for an SMTP response.
Clone Of: 1359059
Environment:
Last Closed: 2016-09-21 17:54:47 UTC
oVirt Team: SLA
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | RHV-43233 | 0 | None | None | None | 2021-08-30 12:25:53 UTC
Red Hat Knowledge Base (Solution) | 2486301 | 0 | None | None | None | 2016-08-05 00:56:21 UTC
Red Hat Product Errata | RHBA-2016:1924 | 0 | normal | SHIPPED_LIVE | ovirt-hosted-engine-ha bug fix update for 3.6.9 | 2016-09-21 21:47:05 UTC
oVirt gerrit | 61948 | 0 | master | MERGED | Add a timeout to the notification sender | 2016-08-23 09:53:30 UTC
oVirt gerrit | 62718 | 0 | ovirt-hosted-engine-ha-1.3 | MERGED | Add a timeout to the notification sender | 2016-08-23 11:43:34 UTC

Comment 3 Martin Sivák 2016-08-05 08:10:40 UTC
This is an easy fix and I believe we should backport it to 3.6. I am setting all the right flags to ask for that.

Comment 8 Nikolai Sednev 2016-08-29 16:47:49 UTC
With the SMTP server blocked and therefore unreachable, the agent keeps running and is not being restarted.
I no longer see the original error messages, and the broker does not restart the agent, hence I'm closing this bug as verified.

In the broker's log I see this:
Thread-116::ERROR::2016-08-29 19:41:30,220::notifications::39::ovirt_hosted_engine_ha.broker.notifications.Notifications::(send_email) timed out
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/notifications.py", line 26, in send_email
    timeout=float(cfg["smtp-timeout"]))
  File "/usr/lib64/python2.7/smtplib.py", line 255, in __init__
    (code, msg) = self.connect(host, port)
  File "/usr/lib64/python2.7/smtplib.py", line 315, in connect
    self.sock = self._get_socket(host, port, self.timeout)
  File "/usr/lib64/python2.7/smtplib.py", line 290, in _get_socket
    return socket.create_connection((host, port), timeout)
  File "/usr/lib64/python2.7/socket.py", line 571, in create_connection
    raise err
timeout: timed out
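
For readers following the traceback, here is a minimal sketch of the general technique the fix relies on: passing a timeout to smtplib.SMTP so that an unreachable or silent server produces an error instead of blocking the caller. It is not the actual notifications.py code. Only the "smtp-timeout" key appears in the traceback above; the function name send_notification, the logger name, and the other configuration keys are assumptions made for illustration, and the sketch targets Python 3 rather than the Python 2.7 shown in the log.

```python
import logging
import smtplib
import socket

log = logging.getLogger("notifications_sketch")  # hypothetical logger name


def send_notification(cfg, message):
    """Try to send `message`; return True on success, False on any failure.

    Every cfg key except "smtp-timeout" is an assumed name for this sketch.
    """
    try:
        # The timeout bounds the TCP connect and every later socket read,
        # so an unreachable or silent server cannot block the caller forever.
        with smtplib.SMTP(cfg["smtp-server"], int(cfg["smtp-port"]),
                          timeout=float(cfg["smtp-timeout"])) as server:
            server.sendmail(cfg["source-email"],
                            cfg["destination-emails"].split(","),
                            message)
        return True
    except (socket.error, smtplib.SMTPException) as exc:
        # A connect timeout surfaces as socket.timeout (a socket.error
        # subclass); failures while talking to the server surface as
        # SMTPException. Either way the caller gets control back.
        log.error("notification not sent: %s", exc)
        return False
```

With a small timeout value, a call against a blocked server returns False after roughly that many seconds, which matches the behaviour seen in the log: the error is recorded and the broker carries on.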


Works for me with the following components:
Host:
libvirt-client-1.2.17-13.el7_2.5.x86_64
vdsm-4.17.34-1.el7ev.noarch
ovirt-vmconsole-1.0.4-1.el7ev.noarch
qemu-kvm-rhev-2.3.0-31.el7_2.21.x86_64
ovirt-hosted-engine-setup-1.3.7.3-1.el7ev.noarch
sanlock-3.2.4-3.el7_2.x86_64
ovirt-setup-lib-1.0.1-1.el7ev.noarch
rhevm-appliance-20160620.0-1.el7ev.noarch
mom-0.5.5-1.el7ev.noarch
rhevm-sdk-python-3.6.8.0-1.el7ev.noarch
rhev-release-3.6.9-1-001.noarch
ovirt-hosted-engine-ha-1.3.5.8-1.el7ev.noarch
ovirt-vmconsole-host-1.0.4-1.el7ev.noarch
ovirt-host-deploy-1.4.1-1.el7ev.noarch
Linux version 3.10.0-327.36.1.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC) ) #1 SMP Wed Aug 17 03:02:37 EDT 2016
Linux 3.10.0-327.36.1.el7.x86_64 #1 SMP Wed Aug 17 03:02:37 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.2 (Maipo)
rhevm-appliance-20160620.0-1.el7ev.noarch

Engine:
ovirt-engine-extension-aaa-jdbc-1.0.7-2.el6ev.noarch
ovirt-setup-lib-1.0.1-1.el6ev.noarch
ovirt-vmconsole-1.0.4-1.el6ev.noarch
rhevm-setup-plugin-ovirt-engine-3.6.9-0.1.el6.noarch
ovirt-vmconsole-proxy-1.0.4-1.el6ev.noarch
rhevm-setup-plugin-ovirt-engine-common-3.6.9-0.1.el6.noarch
ovirt-host-deploy-1.4.1-1.el6ev.noarch
ovirt-host-deploy-java-1.4.1-1.el6ev.noarch
rhevm-image-uploader-3.6.1-2.el6ev.noarch
rhevm-webadmin-portal-3.6.9-0.1.el6.noarch
rhevm-spice-client-x64-cab-3.6-7.el6.noarch
rhevm-setup-plugins-3.6.5-1.el6ev.noarch
rhevm-setup-base-3.6.9-0.1.el6.noarch
rhevm-setup-3.6.9-0.1.el6.noarch
rhevm-tools-backup-3.6.9-0.1.el6.noarch
rhevm-branding-rhev-3.6.0-10.el6ev.noarch
rhevm-setup-plugin-ovirt-engine-3.6.9-0.1.el6.noarch
rhevm-tools-3.6.9-0.1.el6.noarch
rhevm-restapi-3.6.9-0.1.el6.noarch
rhevm-spice-client-x86-cab-3.6-7.el6.noarch
rhevm-guest-agent-common-1.0.11-6.el6ev.noarch
rhevm-sdk-python-3.6.9.0-2.el6ev.noarch
rhevm-setup-plugin-vmconsole-proxy-helper-3.6.9-0.1.el6.noarch
rhevm-vmconsole-proxy-helper-3.6.9-0.1.el6.noarch
rhevm-backend-3.6.9-0.1.el6.noarch
rhevm-3.6.9-0.1.el6.noarch
rhevm-log-collector-3.6.1-1.el6ev.noarch
rhevm-spice-client-x86-msi-3.6-7.el6.noarch
rhev-release-3.6.9-1-001.noarch
rhevm-lib-3.6.9-0.1.el6.noarch
rhevm-setup-plugin-ovirt-engine-common-3.6.9-0.1.el6.noarch
rhevm-cli-3.6.9.0-1.el6ev.noarch
rhevm-extensions-api-impl-3.6.9-0.1.el6.noarch
rhevm-websocket-proxy-3.6.9-0.1.el6.noarch
rhevm-doc-3.6.8-1.el6eng.noarch
rhevm-userportal-3.6.9-0.1.el6.noarch
rhevm-setup-plugin-websocket-proxy-3.6.9-0.1.el6.noarch
rhevm-dependencies-3.6.1-1.el6ev.noarch
rhev-guest-tools-iso-3.6-6.el6ev.noarch
rhevm-dbscripts-3.6.9-0.1.el6.noarch
rhevm-spice-client-x64-msi-3.6-7.el6.noarch
rhevm-iso-uploader-3.6.0-1.el6ev.noarch
Linux version 2.6.32-642.el6.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-17) (GCC) ) #1 SMP Wed Apr 13 00:51:26 EDT 2016
Linux 2.6.32-642.el6.x86_64 #1 SMP Wed Apr 13 00:51:26 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 6.8 (Santiago)


I deployed a clean hosted engine from the appliance on an iSCSI storage domain, then upgraded the appliance's components to the latest bits.

Moving to verified.

Comment 10 errata-xmlrpc 2016-09-21 17:54:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-1924.html

