Bug 1247098 - [hosted-engine] [GlusterFS support] Creation of a Gluster storage domain in the hosted-engine setup causes the VM to become unreachable
Status: CLOSED WORKSFORME
Product: oVirt
Classification: Community
Component: ovirt-engine-core
Version: 3.6
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: m1
Target Release: 3.6.0
Assigned To: Ala Hino
QA Contact: Elad
Whiteboard: storage
Depends On:
Blocks: Hosted_Engine_HC
Reported: 2015-07-27 06:35 EDT by Elad
Modified: 2016-03-10 01:18 EST
CC: 10 users

Doc Type: Bug Fix
Last Closed: 2015-07-29 04:37:51 EDT
Type: Bug
oVirt Team: Storage

Description Elad 2015-07-27 06:35:02 EDT
Description of problem:
I successfully deployed hosted-engine over GlusterFS.
I then tried to create a Gluster storage domain (backed by a replica 3 volume) in the setup, and the VM immediately became unreachable, although its status is still reported as up:


[root@green-vdsc 7]# hosted-engine --vm-status


--== Host 1 status ==--

Status up-to-date                  : True
Hostname                           : green-vdsc.qa.lab.tlv.redhat.com
Host ID                            : 1
Engine status                      : {"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "up"}
Score                              : 2400
stopped                            : False
Local maintenance                  : False
crc32                              : f9f6e4f7
Host timestamp                     : 1284496


Trying to power off the VM leaves it stuck in the 'Powering down' state, and killing the qemu process doesn't help either. I'll file a separate bug for that.
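
For context, the "failed liveliness check" above means the HA agent could not get a healthy answer from the engine's HTTP health page. A minimal way to run the same kind of probe by hand (the URL is the standard oVirt health servlet and the FQDN is a placeholder; this is a sketch of the check, not the agent's exact code):

ENGINE_FQDN=hosted-engine.example.com   # placeholder, substitute the real engine FQDN
# A healthy engine answers with a short health-status message; a timeout or
# refused connection is what surfaces in --vm-status as "failed liveliness check".
curl -sS --max-time 10 "http://${ENGINE_FQDN}/ovirt-engine/services/health"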


Version-Release number of selected component (if applicable):
Hypervisor:

ovirt-hosted-engine-ha-1.3.0-0.0.master.20150615153650.20150615153645.git5f8c290.el7.noarch
ovirt-hosted-engine-setup-1.3.0-0.0.master.20150723145342.gitc6bc631.el7.noarch
vdsm-xmlrpc-4.17.0-1198.git6ede99a.el7.noarch
vdsm-python-4.17.0-1198.git6ede99a.el7.noarch
vdsm-4.17.0-1198.git6ede99a.el7.noarch
vdsm-infra-4.17.0-1198.git6ede99a.el7.noarch
vdsm-jsonrpc-4.17.0-1198.git6ede99a.el7.noarch
vdsm-yajsonrpc-4.17.0-1198.git6ede99a.el7.noarch
vdsm-cli-4.17.0-1198.git6ede99a.el7.noarch
libvirt-client-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-driver-secret-1.2.8-16.el7_1.3.x86_64
libvirt-lock-sanlock-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-driver-nwfilter-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-driver-interface-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-driver-qemu-1.2.8-16.el7_1.3.x86_64
libvirt-python-1.2.8-7.el7_1.1.x86_64
libvirt-daemon-driver-nodedev-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-driver-network-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-kvm-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-config-nwfilter-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-driver-storage-1.2.8-16.el7_1.3.x86_64
libvirt-daemon-1.2.8-16.el7_1.3.x86_64
qemu-kvm-tools-ev-2.1.2-23.el7_1.4.1.x86_64
qemu-img-ev-2.1.2-23.el7_1.4.1.x86_64
qemu-kvm-common-ev-2.1.2-23.el7_1.4.1.x86_64
ipxe-roms-qemu-20130517-6.gitc4bce43.el7.noarch
qemu-kvm-ev-2.1.2-23.el7_1.4.1.x86_64
sanlock-3.2.2-2.el7.x86_64
selinux-policy-3.13.1-23.el7_1.7.noarch


Engine:
ovirt-engine-3.6.0-0.0.master.20150627185750.git6f063c1.el6.noarch


How reproducible:
Always

Steps to Reproduce:
1. Deploy hosted-engine over GlusterFS storage using a replica 3 volume.
2. Once deployment is done, create some storage domains in the setup.
3. Create a Gluster domain in the setup (a rough CLI equivalent of steps 1 and 3 is sketched after this list).
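
For reference, a rough CLI equivalent of steps 1 and 3 (the engine FQDN, credentials, Gluster server, volume name and host name below are placeholders, and the REST call is only a sketch of how a GlusterFS data domain can be created via the v3 API, not the exact webadmin flow used here):

# Step 1: deploy hosted-engine on the host, choosing glusterfs as the storage
# type and pointing it at the replica 3 volume
hosted-engine --deploy

# Step 3: create a GlusterFS data domain through the REST API instead of the
# webadmin dialog (all names below are placeholders)
ENGINE_FQDN=engine.example.com
curl -k -u 'admin@internal:PASSWORD' \
     -H 'Content-Type: application/xml' \
     -d '<storage_domain>
           <name>gluster_data</name>
           <type>data</type>
           <storage>
             <type>glusterfs</type>
             <address>gluster-server.example.com</address>
             <path>gluster_data_volume</path>
             <vfs_type>glusterfs</vfs_type>
           </storage>
           <host><name>hypervisor_host</name></host>
         </storage_domain>' \
     "https://${ENGINE_FQDN}/ovirt-engine/api/storagedomains"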

Actual results:
As soon as OK is clicked in the webadmin dialog for the storage domain creation, the hosted-engine VM becomes unreachable.

It might be that the host got disconnected from the Gluster server during storage domain creation.

The VM is unreachable even though it is reported as Up by both vdsm and libvirt. I tried 'hosted-engine --vm-poweroff', but the VM got stuck in 'Powering down'; killing the qemu process didn't help either.
Therefore, for now, I can't examine the engine.log.
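
For reference, the commands behind the power-off attempts above, plus a quick way to check whether the host's gluster mounts are still responsive (the /rhev/data-center/mnt/glusterSD path is the usual vdsm mount location and is assumed here, not taken from the logs):

# Ask the HA agent to power the VM off (this is the step that got stuck in 'Powering down')
hosted-engine --vm-poweroff

# See what libvirt thinks the VM state is (read-only connection, no SASL auth needed)
virsh -r list --all

# Last resort: kill the qemu process directly (did not help here either)
pgrep -af qemu-kvm
kill -9 <qemu PID>

# Check whether the gluster mounts on the host are still alive
grep glusterfs /proc/mounts
timeout 5 ls /rhev/data-center/mnt/glusterSD/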

Expected results:
Gluster storage domain should be created successfully.

Additional info:

sosreport: http://file.tlv.redhat.com/ebenahar/sosreport-green-vdsc.qa.lab.tlv.redhat.com-20150727100546.tar.xz

Gluster volume configuration (identical for both volumes: the one used for the hosted-engine VM image and the one for the Gluster storage domain); a CLI sketch for recreating this configuration follows the output below:

[root@gluster-storage-03 ~]# gluster volume info elad1
 
Volume Name: elad1
Type: Replicate
Volume ID: 34a9bdeb-30b3-4868-921c-2c6c2cfd83b4
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.35.160.6:/gluster_volumes/elad1
Brick2: 10.35.160.202:/gluster_volumes/elad1
Brick3: 10.35.160.203:/gluster_volumes/elad1
Options Reconfigured:
server.allow-insecure: on
cluster.server-quorum-type: server
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
auth.allow: *
network.ping-timeout: 10
cluster.quorum-type: auto
storage.owner-uid: 36
storage.owner-gid: 36
performance.readdir-ahead: on
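
For completeness, a sketch of how a volume with the configuration above can be created from the gluster CLI (brick hosts and paths are taken from the output above; treat this as an illustration rather than the exact commands that were run):

gluster volume create elad1 replica 3 \
    10.35.160.6:/gluster_volumes/elad1 \
    10.35.160.202:/gluster_volumes/elad1 \
    10.35.160.203:/gluster_volumes/elad1

# Apply the options listed above (virt-store style settings plus 36:36 ownership for vdsm)
gluster volume set elad1 server.allow-insecure on
gluster volume set elad1 cluster.server-quorum-type server
gluster volume set elad1 network.remote-dio enable
gluster volume set elad1 cluster.eager-lock enable
gluster volume set elad1 performance.stat-prefetch off
gluster volume set elad1 performance.io-cache off
gluster volume set elad1 performance.read-ahead off
gluster volume set elad1 performance.quick-read off
gluster volume set elad1 auth.allow '*'
gluster volume set elad1 network.ping-timeout 10
gluster volume set elad1 cluster.quorum-type auto
gluster volume set elad1 storage.owner-uid 36
gluster volume set elad1 storage.owner-gid 36
gluster volume set elad1 performance.readdir-ahead on

gluster volume start elad1
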
Comment 1 Elad 2015-07-29 04:37:51 EDT
Cannot reproduce, closing.

Will re-open if I encounter it again.
