Bug 1591150 - ovirt-ha-broker restarts in loops and fails to get started on RHVH after upgrade.
Keywords:
Status: CLOSED DUPLICATE of bug 1585028
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-hosted-engine-ha
Version: 4.2.4
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Martin Sivák
QA Contact: meital avital
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-06-14 07:39 UTC by Nikolai Sednev
Modified: 2019-05-16 13:06 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-06-14 08:38:02 UTC
oVirt Team: Integration
Target Upstream Version:
Embargoed:


Attachments
Screenshot from 2018-06-14 10-39-15.png (109.93 KB, image/png)
2018-06-14 07:39 UTC, Nikolai Sednev
Screenshot from 2018-06-14 10-40-07.png (37.05 KB, image/png)
2018-06-14 07:40 UTC, Nikolai Sednev
sosreport from the engine (9.98 MB, application/x-xz)
2018-06-14 07:42 UTC, Nikolai Sednev

Description Nikolai Sednev 2018-06-14 07:39:09 UTC
Description of problem:
Opening this bug from https://bugzilla.redhat.com/show_bug.cgi?id=1585028#c7.

[root@alma04 ~]# systemctl status ovirt-ha-broker -l
● ovirt-ha-broker.service - oVirt Hosted Engine High Availability Communications Broker
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-broker.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2018-06-14 10:09:30 IDT; 7ms ago
 Main PID: 56511 (ovirt-ha-broker)
    Tasks: 1
   Memory: 228.0K
   CGroup: /system.slice/ovirt-ha-broker.service
           └─56511 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker

Jun 14 10:09:30 alma04.qa.lab.tlv.redhat.com systemd[1]: Started oVirt Hosted Engine High Availability Communications Broker.
Jun 14 10:09:30 alma04.qa.lab.tlv.redhat.com systemd[1]: Starting oVirt Hosted Engine High Availability Communications Broker...
[root@alma04 ~]# systemctl status ovirt-ha-broker -l
● ovirt-ha-broker.service - oVirt Hosted Engine High Availability Communications Broker
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-broker.service; enabled; vendor preset: disabled)
   Active: failed (Result: start-limit) since Thu 2018-06-14 10:09:39 IDT; 2s ago
  Process: 56553 ExecStart=/usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker (code=exited, status=1/FAILURE)
 Main PID: 56553 (code=exited, status=1/FAILURE)

Jun 14 10:09:39 alma04.qa.lab.tlv.redhat.com systemd[1]: Unit ovirt-ha-broker.service entered failed state.
Jun 14 10:09:39 alma04.qa.lab.tlv.redhat.com systemd[1]: ovirt-ha-broker.service failed.
Jun 14 10:09:39 alma04.qa.lab.tlv.redhat.com systemd[1]: ovirt-ha-broker.service holdoff time over, scheduling restart.
Jun 14 10:09:39 alma04.qa.lab.tlv.redhat.com systemd[1]: start request repeated too quickly for ovirt-ha-broker.service
Jun 14 10:09:39 alma04.qa.lab.tlv.redhat.com systemd[1]: Failed to start oVirt Hosted Engine High Availability Communications Broker.
Jun 14 10:09:39 alma04.qa.lab.tlv.redhat.com systemd[1]: Unit ovirt-ha-broker.service entered failed state.
Jun 14 10:09:39 alma04.qa.lab.tlv.redhat.com systemd[1]: ovirt-ha-broker.service failed.
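
The two status outputs above differ mainly in their "Active:" line: the first catches the broker in the short window after a restart, the second shows systemd giving up with Result: start-limit. A minimal sketch of how that line could be checked from a script follows. The function names and the SAMPLE text are illustrative and not part of ovirt-hosted-engine-ha; the sample lines are copied from the output above.

```python
import re

# Matches the "Active:" line of `systemctl status` output, for example:
#   Active: failed (Result: start-limit) since Thu 2018-06-14 10:09:39 IDT; 2s ago
ACTIVE_RE = re.compile(r"^\s*Active:\s+(\S+)\s+\(([^)]*)\)", re.MULTILINE)


def parse_active_line(status_text):
    """Return (state, detail) from systemctl status text, or None if no Active line."""
    match = ACTIVE_RE.search(status_text)
    if match is None:
        return None
    return match.group(1), match.group(2)


def hit_start_limit(status_text):
    """True when the unit is in the failed state with Result: start-limit."""
    return parse_active_line(status_text) == ("failed", "Result: start-limit")


# Sample text taken from the second status output in this report.
SAMPLE = """\
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-broker.service; enabled; vendor preset: disabled)
   Active: failed (Result: start-limit) since Thu 2018-06-14 10:09:39 IDT; 2s ago
"""

if __name__ == "__main__":
    print(parse_active_line(SAMPLE))
    print(hit_start_limit(SAMPLE))
```

A single "active (running)" reading with an uptime of a few milliseconds, as in the first output, is not evidence of a healthy service, so a check like this is only meaningful when repeated over several seconds.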


Version-Release number of selected component (if applicable):
redhat-virtualization-host-image-update-4.2-20180605.0.el7_5.noarch.rpm
ovirt-hosted-engine-ha-2.2.13-1.el7ev.noarch
ovirt-hosted-engine-setup-2.2.22-1.el7ev.noarch
vdsm-4.20.29-1.el7ev.x86_64
qemu-kvm-rhev-2.10.0-21.el7_5.3.x86_64
libvirt-3.9.0-14.el7_5.5.x86_64
sanlock-3.6.0-1.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Deploy 4.1 SHE over NFS on a pair of 4.1.11 RHVHs.
2. Upgrade the engine to the latest 4.2 components.
3. Start upgrading the 4.1.11 RHVHs to 4.2 RHVHs.

Actual results:
Upgraded RHVHs can't start ovirt-ha-broker; the broker is continuously restarted.

Expected results:
ha-broker and ha-agent should run without interruption.

Additional info:
Sosreport from host.
Sosreport from the engine.

Comment 1 Nikolai Sednev 2018-06-14 07:39:49 UTC
Created attachment 1451185
Screenshot from 2018-06-14 10-39-15.png

Comment 2 Nikolai Sednev 2018-06-14 07:40:30 UTC
Created attachment 1451186
Screenshot from 2018-06-14 10-40-07.png

Comment 3 Nikolai Sednev 2018-06-14 07:42:32 UTC
Created attachment 1451188
sosreport from the engine

Comment 4 Nikolai Sednev 2018-06-14 07:59:31 UTC
Sosreport from host alma04.
https://drive.google.com/file/d/1f5CmhnfMRSVi_CgZMJGQ1VBLOD4dhWdf/view?usp=sharing

Comment 5 Martin Sivák 2018-06-14 08:09:27 UTC
Where is the new issue? As far as I can see, this is still related to the log permissions. What exactly is the bug here?

Comment 6 Nikolai Sednev 2018-06-14 08:32:27 UTC
You're right, it's related to the initial issue. The only difference is that in the initial case the UID/GID were root:root; now they're fine (vdsm:kvm), but the permissions themselves are different.

Comment 7 Simone Tiraboschi 2018-06-14 08:38:02 UTC
See https://bugzilla.redhat.com/show_bug.cgi?id=1585028#c9

*** This bug has been marked as a duplicate of bug 1585028 ***

Comment 8 Nikolai Sednev 2018-06-14 08:40:11 UTC
In https://bugzilla.redhat.com/show_bug.cgi?id=1585028, the UID/GID were fixed from root:root back to the normal vdsm:kvm, but the permission bits are wrong ("----------.").
Please see https://bugzilla.redhat.com/show_bug.cgi?id=1585028#c10 for more details.
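
The distinction drawn here (ownership correct, mode bits empty) can be checked with a short script. This is a minimal sketch, not taken from either bug: the helper names are illustrative, and the default path /var/log/ovirt-hosted-engine-ha/broker.log is an assumption, since this report does not name the affected file. The trailing "." in "----------." is added by ls for the SELinux context and is not part of the mode string returned by stat.filemode.

```python
import grp
import os
import pwd
import stat
import sys

# Assumed location of the broker log; the report itself does not state the path.
DEFAULT_LOG_PATH = "/var/log/ovirt-hosted-engine-ha/broker.log"


def describe(path):
    """Return (owner, group, mode_string) for path, like the first columns of `ls -l`."""
    info = os.stat(path)
    try:
        owner = pwd.getpwuid(info.st_uid).pw_name
    except KeyError:
        owner = str(info.st_uid)  # UID with no passwd entry
    try:
        group = grp.getgrgid(info.st_gid).gr_name
    except KeyError:
        group = str(info.st_gid)  # GID with no group entry
    return owner, group, stat.filemode(info.st_mode)


def owner_can_write(path):
    """True when the owner write bit is set on path."""
    return bool(os.stat(path).st_mode & stat.S_IWUSR)


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else DEFAULT_LOG_PATH
    if os.path.exists(target):
        print(target, describe(target), "owner writable:", owner_can_write(target))
    else:
        print(target, "does not exist on this machine")
```

With a mode of "----------" the owner write bit is clear, so a non-root owner such as vdsm cannot open the file for writing even though the ownership looks right.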

Comment 9 Franta Kust 2019-05-16 13:06:54 UTC
BZ<2>Jira Resync

