Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1584237

Summary: Cluster become unresponsive adding second host to hosted engine deployment
Product: [oVirt] ovirt-engine
Reporter: Sven Vogel <sven.vogel>
Component: General
Assignee: Ido Rosenzwig <irosenzw>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Elad <ebenahar>
Severity: high
Docs Contact:
Priority: high
Version: 4.2.3.5
CC: bugs, irosenzw, sven.vogel, tnisan, ylavi
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Flags: rule-engine: ovirt-4.2+
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-09-21 07:26:18 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Integration
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description                           Flags
engine.log                            none
vdsm.log                              none
engine.log                            none
Overview                              none
kvm01-vdsm.log                        none
kvm02-vdsm.log                        none
ovirt01-engine.log                    none
/var/log/messages                     none
/var/log/ovirt-engine/host-deploy/*   none

Description Sven Vogel 2018-05-30 14:21:18 UTC
Description of problem:
After adding a second host, I cannot get the cluster running.

oVirt message:
"Invalid status on Data Center Default. Setting status to Non Responsive."

engine.log
2018-05-30 15:48:24,740+02 INFO  [org.ovirt.engine.core.bll.storage.pool.SetStoragePoolStatusCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-10) [4b23f270] Running command: SetStoragePoolStatusCommand internal: true. Entities affected :  ID: 1ef4518e-6383-11e8-83ff-00163e1ef844 Type: StoragePool
2018-05-30 15:48:24,744+02 INFO  [org.ovirt.engine.core.vdsbroker.storage.StoragePoolDomainHelper] (EE-ManagedThreadFactory-engineScheduled-Thread-10) [4b23f270] Storage Pool '1ef4518e-6383-11e8-83ff-00163e1ef844' - Updating Storage Domain 'b8c8bd16-ccea-4c58-96f2-0f981f933ba8' status from 'Active' to 'Unknown', reason: null
2018-05-30 15:48:24,786+02 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-10) [4b23f270] EVENT_ID: SYSTEM_CHANGE_STORAGE_POOL_STATUS_PROBLEMATIC(980), Invalid status on Data Center Default. Setting status to Non Responsive.
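
The excerpt above mixes routine INFO lines with the single WARN event that matters. As a minimal sketch for narrowing a large engine.log down to such lines, assuming the log keeps the layout shown above (date, time, then the level as the third whitespace-separated field). The function name engine_log_problems is hypothetical and not part of oVirt:

```shell
# Print only WARN and ERROR records from an engine.log-style file.
# Assumption: the third whitespace-separated field is the log level,
# as in the excerpt above. Continuation lines of stack traces are not followed.
engine_log_problems() {
    awk '$3 == "WARN" || $3 == "ERROR"' "$@"
}

# Example (the path is the usual default location; adjust as needed):
#   engine_log_problems /var/log/ovirt-engine/engine.log
```

With no file argument the function reads standard input, so it can also sit at the end of a pipeline.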

For storage I use NFS on Ceph.

Is there a tip on how I can fix this?

Thanks

Sven

Comment 1 Sven Vogel 2018-05-30 14:25:39 UTC
Created attachment 1445912 [details]
engine.log

Comment 2 Idan Shaby 2018-06-05 12:56:49 UTC
Hi Sven,

It seems that you added a host ~ half an hour before you got the warnings in the description of this bug.
Can you please clear the logs (both engine and vdsm), and attach the new ones along with clear steps to reproduce?

Thanks!

Comment 3 Sven Vogel 2018-06-05 20:09:13 UTC
Hi Idan,

No problem, I added the new files.

Thanks

Sven

Comment 4 Sven Vogel 2018-06-05 20:11:13 UTC
Created attachment 1447979 [details]
vdsm.log

Comment 5 Sven Vogel 2018-06-05 20:12:21 UTC
Created attachment 1447980 [details]
engine.log

Comment 6 Idan Shaby 2018-06-06 06:49:42 UTC
Thanks Sven!
Question - what did you do when these logs were written?
Did you manage to reproduce the bug again?
Can you add clear steps to reproduce?

Comment 7 Sven Vogel 2018-06-08 19:24:59 UTC
Hi Idan,

1. I provisioned a hosted engine. That worked without any problem. I use NFS-Ganesha on Ceph. The hosted engine was moved perfectly.
2. I added a second host to the cluster. That's all :(

Comment 8 Idan Shaby 2018-06-10 06:09:23 UTC
I can't see any error or an interesting warning in these logs.
I think that they don't contain the failure and the time when your DC went into a non-responsive state.
Can you clear them again and try to reproduce the bug, so I can see the relevant errors?

Comment 9 Sven Vogel 2018-06-11 07:55:43 UTC
Hi Idan,

Where can I see the error? I added a picture. You will see that the cluster is not available even though the hosts have been added.

Comment 10 Sven Vogel 2018-06-11 07:56:12 UTC
Created attachment 1449916 [details]
Overview

Comment 11 Sven Vogel 2018-06-11 08:01:22 UTC
Which log files do you need? If I try to move the hosted engine, it tells me that there is no other host available.

Regards

Comment 12 Sven Vogel 2018-06-11 08:05:54 UTC
[root@kvm01 ~]# hosted-engine --set-maintenance --mode=local
Unable to enter local maintenance mode: there are no available hosts capable of running the engine VM.

Comment 13 Idan Shaby 2018-06-11 08:23:24 UTC
I am not sure that I understand what you're asking.
What I need you to do is clear the engine and vdsm logs, run the scenario that got your DC into a non-responsive state, and then attach those new logs to this BZ so I can see the reported errors in them.
Can you do that, please?
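
One way to start from clean logs without deleting files that the services still hold open is to keep a timestamped copy and truncate in place. This is only a sketch: snapshot_and_truncate is a hypothetical helper, and the paths in the example are the usual default locations, which may differ on a given installation.

```shell
# Copy each existing log aside with a timestamp suffix, then empty it in place.
# Truncating (instead of removing) keeps the file the writing service has open.
snapshot_and_truncate() {
    stamp="$(date +%Y%m%d-%H%M%S)"
    for f in "$@"; do
        [ -f "$f" ] || continue          # skip paths that do not exist
        cp -p "$f" "$f.$stamp.bak"       # preserve the old content
        : > "$f"                         # truncate to zero bytes
    done
}

# Example (run as root; engine.log is on the engine VM, vdsm.log on each host):
#   snapshot_and_truncate /var/log/ovirt-engine/engine.log
#   snapshot_and_truncate /var/log/vdsm/vdsm.log
```

After truncating, reproduce the scenario and attach the fresh files, so that everything in them belongs to the failing run.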

Comment 14 Sven Vogel 2018-06-15 14:48:06 UTC
Created attachment 1451950 [details]
kvm01-vdsm.log

Comment 15 Sven Vogel 2018-06-15 14:49:18 UTC
Created attachment 1451951 [details]
kvm02-vdsm.log

Comment 16 Sven Vogel 2018-06-15 14:50:14 UTC
Created attachment 1451953 [details]
ovirt01-engine.log

Comment 17 Sven Vogel 2018-06-15 14:54:16 UTC
Hi Idan,

Okay, maybe I don't understand you correctly.

1. I have one host, kvm01, with the hosted engine. If I add a second host, kvm02, I get the picture above (overview.png). Does that look normal or not? That is my question to you. I add the second host in the GUI via Add Host, entering the hostname, name and password in the fields. I have done nothing more!
2. I cleared all logs and attached the following for you: kvm01-vdsm.log, kvm02-vdsm.log and ovirt01-engine.log.
3. If you don't see an error in the files, my second question is why no second host is available when I try to run "hosted-engine --set-maintenance --mode=local". I think there must be a problem with the cluster, but I don't know which one.
4. I thought a problem with the cluster is the reason the maintenance command does not work.

Is it clear for you now?

Thanks

Sven

Comment 18 Idan Shaby 2018-06-17 08:25:59 UTC
Hi Sven,

Thanks a lot for the clear logs!
I have no idea why your cluster is not available.
The logs seem fine to me, as I still do not see any errors or warnings, neither before nor after the new host's installation.

Moving this bug to Integration as they may answer at least your 3rd question, and I see no storage issue here.

Comment 19 Sandro Bonazzola 2018-06-18 09:56:02 UTC
Ido, can you please have a look?

Comment 20 Ido Rosenzwig 2018-06-21 10:16:25 UTC
Hello Sven,

To better understand the problem, please attach the following logs from the engine VM:
1. /var/log/ovirt-engine/host-deploy/*
2. /var/log/messages

In addition, please provide the output of the following command for the engine VM and the kvm01 machine:
# rpm -qa | grep ovirt

Comment 21 Sven Vogel 2018-06-24 14:41:40 UTC
Created attachment 1454140 [details]
/var/log/messages

Comment 22 Sven Vogel 2018-06-24 14:42:45 UTC
Created attachment 1454141 [details]
/var/log/ovirt-engine/host-deploy/*

Comment 23 Sven Vogel 2018-06-24 14:44:07 UTC
Hi Ido,

First: I added the logs above. If you need more logs, please send me a message.

Second: the output of your command:
[root@kvm01 ~]# rpm -qa | grep ovirt
ovirt-host-4.2.2-2.el7.centos.x86_64
ovirt-engine-appliance-4.2-20180504.1.el7.centos.noarch
ovirt-hosted-engine-ha-2.2.11-1.el7.centos.noarch
ovirt-host-dependencies-4.2.2-2.el7.centos.x86_64
ovirt-vmconsole-1.0.5-4.el7.centos.noarch
ovirt-imageio-daemon-1.3.1.2-0.el7.centos.noarch
ovirt-provider-ovn-driver-1.2.10-1.el7.centos.noarch
ovirt-host-deploy-1.7.3-1.el7.centos.noarch
ovirt-hosted-engine-setup-2.2.20-1.el7.centos.noarch
ovirt-setup-lib-1.1.4-1.el7.centos.noarch
ovirt-release42-4.2.3.1-1.el7.noarch
ovirt-imageio-common-1.3.1.2-0.el7.centos.noarch
python-ovirt-engine-sdk4-4.2.6-2.el7.centos.x86_64
ovirt-vmconsole-host-1.0.5-4.el7.centos.noarch
cockpit-ovirt-dashboard-0.11.24-1.el7.centos.noarch
ovirt-engine-sdk-python-3.6.9.1-1.el7.noarch

Comment 24 Ido Rosenzwig 2018-06-25 13:35:46 UTC
Hello Sven,

I tried to reproduce the issue myself a couple of times without success.
I suggest updating the packages using 'yum update' and trying again.

Please comment if it solved the issue (or not). Good luck!

Comment 25 Ido Rosenzwig 2018-07-02 08:04:23 UTC
Closing the bug as NOTABUG since the reporter has not responded in a week.
If the problem continues please reopen.

Comment 26 Sven Vogel 2018-07-02 09:54:18 UTC
I would update if I could!

yum update gives me the following error:

--> Abhängigkeit liburcu-cds.so.1()(64bit) wird für Paket lttng-ust-2.4.1-4.el7.x86_64 verarbeitet
--> Abhängigkeitsauflösung beendet
Fehler: Paket: lttng-ust-2.4.1-4.el7.x86_64 (@epel)
            Benötigt: liburcu-bp.so.1()(64bit)
            Entfernen: userspace-rcu-0.7.16-1.el7.x86_64 (@epel)
                liburcu-bp.so.1()(64bit)
            Aktualisiert durch: userspace-rcu-0.10.0-3.el7.x86_64 (ovirt-4.2-centos-gluster312)
               ~liburcu-bp.so.6()(64bit)
            Verfügbar: userspace-rcu-0.7.7-1.el7.x86_64 (ovirt-4.2-centos-ovirt42)
                liburcu-bp.so.1()(64bit)
Fehler: Paket: lttng-ust-2.4.1-4.el7.x86_64 (@epel)
            Benötigt: liburcu-cds.so.1()(64bit)
            Entfernen: userspace-rcu-0.7.16-1.el7.x86_64 (@epel)
                liburcu-cds.so.1()(64bit)
            Aktualisiert durch: userspace-rcu-0.10.0-3.el7.x86_64 (ovirt-4.2-centos-gluster312)
               ~liburcu-cds.so.6()(64bit)
            Verfügbar: userspace-rcu-0.7.7-1.el7.x86_64 (ovirt-4.2-centos-ovirt42)
                liburcu-cds.so.1()(64bit)
 Sie können versuchen, mit --skip-broken das Problem zu umgehen.
 Sie könnten Folgendes versuchen: rpm -Va --nofiles --nodigest
Uploading Enabled Repositories Report
Geladene Plugins: fastestmirror, priorities, product-id, subscription-manager
This system is not registered with an entitlement server. You can use subscription-manager to register.
Cannot upload enabled repos report, is this client registered?

Comment 27 Ido Rosenzwig 2018-07-02 10:38:41 UTC
Hello Sven,

Please don't be mad. Usually when a reporter stops responding, it's because the issue was resolved. That's why I asked you to reopen the bug if the problem continues.

Sven, can you please attach the output in English?
Use: # LANG=en_US.UTF-8 yum update

Comment 28 Ido Rosenzwig 2018-07-02 14:03:28 UTC
Sven, see above request.

Comment 29 Sven Vogel 2018-07-03 15:07:12 UTC
Hi Ido,

Here it is in English.

---> Package kernel.x86_64 0:3.10.0-693.el7 will be erased
---> Package userspace-rcu.x86_64 0:0.7.16-1.el7 will be updated
--> Processing Dependency: liburcu-bp.so.1()(64bit) for package: lttng-ust-2.4.1-4.el7.x86_64
--> Processing Dependency: liburcu-cds.so.1()(64bit) for package: lttng-ust-2.4.1-4.el7.x86_64
--> Finished Dependency Resolution
Error: Package: lttng-ust-2.4.1-4.el7.x86_64 (@epel)
           Requires: liburcu-cds.so.1()(64bit)
           Removing: userspace-rcu-0.7.16-1.el7.x86_64 (@epel)
               liburcu-cds.so.1()(64bit)
           Updated By: userspace-rcu-0.10.0-3.el7.x86_64 (ovirt-4.2-centos-gluster312)
              ~liburcu-cds.so.6()(64bit)
           Available: userspace-rcu-0.7.7-1.el7.x86_64 (ovirt-4.2-centos-ovirt42)
               liburcu-cds.so.1()(64bit)
Error: Package: lttng-ust-2.4.1-4.el7.x86_64 (@epel)
           Requires: liburcu-bp.so.1()(64bit)
           Removing: userspace-rcu-0.7.16-1.el7.x86_64 (@epel)
               liburcu-bp.so.1()(64bit)
           Updated By: userspace-rcu-0.10.0-3.el7.x86_64 (ovirt-4.2-centos-gluster312)
              ~liburcu-bp.so.6()(64bit)
           Available: userspace-rcu-0.7.7-1.el7.x86_64 (ovirt-4.2-centos-ovirt42)
               liburcu-bp.so.1()(64bit)
 You could try using --skip-broken to work around the problem
 You could try running: rpm -Va --nofiles --nodigest
Uploading Enabled Repositories Report
Loaded plugins: fastestmirror, priorities, product-id, subscription-manager
This system is not registered with an entitlement server. You can use subscription-manager to register.
Cannot upload enabled repos report, is this client registered?

I can't update because there is a dependency conflict.
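
Read literally, the error says that lttng-ust (installed from EPEL) requires the .so.1 libraries of userspace-rcu, while the update would replace userspace-rcu 0.7.16 with 0.10.0 from the ovirt-4.2-centos-gluster312 repository, which provides .so.6 instead. A minimal sketch for condensing such output into one line per conflict follows. The name summarize_yum_conflicts is hypothetical, and the parsing assumes the English message layout shown above:

```shell
# Reduce yum dependency errors to "<package> requires <lib>; replaced by <update> <repo>".
# Assumption: input uses the English "Error: Package:", "Requires:" and
# "Updated By:" labels exactly as in the output quoted above.
summarize_yum_conflicts() {
    awk '
        $1 == "Error:" && $2 == "Package:" { pkg = $3 }
        $1 == "Requires:"                  { need = $2 }
        $1 == "Updated" && $2 == "By:"     {
            printf "%s requires %s; replaced by %s %s\n", pkg, need, $3, $4
        }
    ' "$@"
}

# Example, with the yum output saved to a file first (file name is illustrative):
#   summarize_yum_conflicts yum-update.txt
```

The summary only restates what yum printed; it does not decide which of the two packages should give way.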

Comment 30 Ido Rosenzwig 2018-07-04 08:49:40 UTC
Again, I was not able to reproduce.

Did you configure any repos other than what the documentation states?
(http://resources.ovirt.org/pub/yum-repo/ovirt-release42.rpm)

Please attach the output of:
# LANG=en_US.UTF-8 yum repolist

Comment 31 Sven Vogel 2018-07-05 09:12:29 UTC
Hi Ido,

Yes. Maybe it looks good?

[root@kvm01 ~]# LANG=en_US.UTF-8 yum repolist
Loaded plugins: enabled_repos_upload, fastestmirror, package_upload, priorities, product-id, search-disabled-repos, subscription-manager
This system is not registered with an entitlement server. You can use subscription-manager to register.
Loading mirror speeds from cached hostfile
 * base: ftp.hosteurope.de
 * epel: ftp.uni-stuttgart.de
 * extras: centosmirror.netcup.net
 * ovirt-4.2: ftp.snt.utwente.nl
 * ovirt-4.2-epel: ftp.uni-stuttgart.de
 * updates: centos.mirror.net-d-sign.de
repo id                                                                                                           repo name                                                                                                                                               status
base/7/x86_64                                                                                                     CentOS-7 - Base                                                                                                                                          9,911
centos-qemu-ev/7/x86_64                                                                                           CentOS-7 - QEMU EV                                                                                                                                          59
centos-sclo-rh-release/x86_64                                                                                     CentOS-7 - SCLo rh                                                                                                                                       7,554
ceph                                                                                                              Ceph packages                                                                                                                                              596
epel/x86_64                                                                                                       Extra Packages for Enterprise Linux 7 - x86_64                                                                                                          12,604
extras/7/x86_64                                                                                                   CentOS-7 - Extras                                                                                                                                          314
ovirt-4.2/7                                                                                                       Latest oVirt 4.2 Release                                                                                                                                 1,787
ovirt-4.2-centos-gluster312/x86_64                                                                                CentOS-7 - Gluster 3.12                                                                                                                                    189
ovirt-4.2-centos-opstools/x86_64                                                                                  CentOS-7 - OpsTools - release                                                                                                                              564
ovirt-4.2-centos-ovirt42/x86_64                                                                                   CentOS-7 - oVirt 4.2                                                                                                                                       444
ovirt-4.2-centos-qemu-ev/x86_64                                                                                   CentOS-7 - QEMU EV                                                                                                                                          59
ovirt-4.2-epel/x86_64                                                                                             Extra Packages for Enterprise Linux 7 - x86_64                                                                                                          12,604
ovirt-4.2-virtio-win-latest                                                                                       virtio-win builds roughly matching what will be shipped in upcoming RHEL                                                                                    37
updates/7/x86_64                                                                                                  CentOS-7 - Updates                                                                                                                                         946
repolist: 47,668
Uploading Enabled Repositories Report
Loaded plugins: fastestmirror, priorities, product-id, subscription-manager
This system is not registered with an entitlement server. You can use subscription-manager to register.
Cannot upload enabled repos report, is this client registered?
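
In the listing above, two repository names appear twice under different ids: epel/x86_64 and ovirt-4.2-epel/x86_64 (12,604 packages each), and centos-qemu-ev/7/x86_64 and ovirt-4.2-centos-qemu-ev/x86_64 (59 packages each). Whether this overlap contributes to the dependency conflict is not established in this report. A small sketch for spotting such duplicates in yum repolist output, assuming the columns are separated by two or more spaces as shown above. The helper name duplicate_repo_names is hypothetical:

```shell
# List repository names that occur under more than one repo id.
# Assumption: the table starts after the "repo id" header line, ends at the
# "repolist:" line, and its columns are separated by at least two spaces.
duplicate_repo_names() {
    awk -F '  +' '
        /^repo id/   { in_table = 1; next }
        /^repolist:/ { in_table = 0 }
        in_table && NF >= 3 {
            if ($2 in first) print $2 ": " first[$2] " and " $1
            else first[$2] = $1
        }
    ' "$@"
}

# Example:
#   LANG=en_US.UTF-8 yum repolist | duplicate_repo_names
```

Matching on the repo name is a heuristic: two ids with the same name usually point at the same content, but the repo files under /etc/yum.repos.d/ remain the authoritative place to check.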

Comment 32 Sandro Bonazzola 2018-09-21 07:26:18 UTC
Closing this with insufficient data for reproducing the issue.
Please reopen if you can provide enough details to reproduce the issue.