Bug 1242448 - [hosted-engine-setup] Deployment fails due to a sanlock exception creating temporary Posix storage domain on a loopback device
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-hosted-engine-setup
Classification: oVirt
Component: General
Version: ---
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ovirt-3.6.0-rc
Target Release: 1.3.0
Assignee: Simone Tiraboschi
QA Contact: Elad
URL:
Whiteboard: integration
Duplicates: 1225366 1238313 1247165 1247181
Depends On:
Blocks: 1036731 1153278 1205663
Reported: 2015-07-13 11:22 UTC by Elad
Modified: 2015-11-04 13:36 UTC (History)
CC List: 16 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2015-11-04 13:36:26 UTC
oVirt Team: ---
Embargoed:
rule-engine: ovirt-3.6.0+
ylavi: planning_ack+
rule-engine: devel_ack+
rule-engine: testing_ack+



Links
System: oVirt gerrit
ID: 43540
Private: 0
Priority: master
Status: MERGED
Summary: packaging: setup: setting SELinux context on the loopback device
Last Updated: Never

Description Elad 2015-07-13 11:22:12 UTC
Description of problem:
Deploying hosted-engine over Fibre Channel (FC) fails with a sanlock exception during createStoragePool.

Version-Release number of selected component (if applicable):
ovirt-3.6.0-3
ovirt-hosted-engine-setup-1.3.0-0.0.master.20150623153111.git68138d4.el7.noarch
vdsm-4.17.0-1054.git562e711.el7.noarch
sanlock-3.2.2-2.el7.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Deploy hosted-engine using FC

Actual results:
The deployment fails in createStoragePool due to a sanlock exception:

Setup.log

2015-07-13 14:00:30 DEBUG otopi.context context._executeMethod:155 method exception
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/otopi/context.py", line 145, in _executeMethod
    method['method']()
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/ovirt-hosted-engine-setup/storage/storage.py", line 1212, in _misc
    self._createStoragePool()
  File "/usr/share/ovirt-hosted-engine-setup/scripts/../plugins/ovirt-hosted-engine-setup/storage/storage.py", line 783, in _createStoragePool
    raise RuntimeError(status['status']['message'])
RuntimeError: Cannot acquire host id: ('e45c292e-3167-4efc-ac3a-ce2e216f6590', SanlockException(19, 'Sanlock lockspace add failure', 'No such device'))
2015-07-13 14:00:30 ERROR otopi.context context._executeMethod:164 Failed to execute stage 'Misc configuration': Cannot acquire host id: ('e45c292e-3167-4efc-ac3a-ce2e216f6590', SanlockException(19, 'Sanlock lockspace add failure', 'No such device'))


vdsm.log:

Thread-40::ERROR::2015-07-13 14:00:30,338::task::866::Storage.TaskManager.Task::(_setError) Task=`297d0697-6c4e-4967-a65e-c17ac2891810`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 49, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 1003, in createStoragePool
    leaseParams)
  File "/usr/share/vdsm/storage/sp.py", line 580, in create
    self._acquireTemporaryClusterLock(msdUUID, leaseParams)
  File "/usr/share/vdsm/storage/sp.py", line 512, in _acquireTemporaryClusterLock
    msd.acquireHostId(self.id)
  File "/usr/share/vdsm/storage/sd.py", line 477, in acquireHostId
    self._clusterLock.acquireHostId(hostId, async)
  File "/usr/share/vdsm/storage/clusterlock.py", line 237, in acquireHostId
    raise se.AcquireHostIdFailure(self._sdUUID, e)
AcquireHostIdFailure: Cannot acquire host id: ('e45c292e-3167-4efc-ac3a-ce2e216f6590', SanlockException(19, 'Sanlock lockspace add failure', 'No such device'))

Sanlock.log:

2015-07-13 14:00:29+0300 899 [672]: s1 lockspace e45c292e-3167-4efc-ac3a-ce2e216f6590:250:/rhev/data-center/mnt/_var_lib_ovirt-hosted-engine-setup_tmp9HMYnA/e45c292e-3167-4efc-ac3a-ce2e216f6590/dom_md/ids:0
2015-07-13 14:00:29+0300 899 [3842]: open error -13 /rhev/data-center/mnt/_var_lib_ovirt-hosted-engine-setup_tmp9HMYnA/e45c292e-3167-4efc-ac3a-ce2e216f6590/dom_md/ids
2015-07-13 14:00:29+0300 899 [3842]: s1 open_disk /rhev/data-center/mnt/_var_lib_ovirt-hosted-engine-setup_tmp9HMYnA/e45c292e-3167-4efc-ac3a-ce2e216f6590/dom_md/ids error -13
2015-07-13 14:00:30+0300 900 [672]: s1 add_lockspace fail result -19


Expected results:
Deployment should succeed over FC

Additional info:
sosreport

Comment 2 Simone Tiraboschi 2015-07-13 11:44:14 UTC
Elad, ban you please check SELinux logs?

Comment 3 Simone Tiraboschi 2015-07-13 11:44:50 UTC
(In reply to Simone Tiraboschi from comment #2)
> Elad, ban you please check SELinux logs?

can, sorry :-)

Comment 4 Elad 2015-07-13 12:25:02 UTC
Auditd was disabled.
Reproduced again with auditd enabled.
Here is the sosreport:
http://file.tlv.redhat.com/ebenahar/sosreport-green-vdsb.qa.lab.tlv.redhat.com-20150713151711.tar.xz

Comment 5 Elad 2015-07-13 12:43:01 UTC
From audit.log:

type=CRED_DISP msg=audit(1436789811.558:357): pid=3672 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:virtd_t:s0-s0:c0.c1023 msg='op=PAM:setcred grantors=pam_env,pam_unix acct="root" exe="/usr/bin/sudo" hostname=? addr=? terminal=? res=success'
type=AVC msg=audit(1436789811.564:358): avc:  denied  { read write } for  pid=3677 comm="sanlock" name="ids" dev="loop1" ino=16390 scontext=system_u:system_r:sanlock_t:s0-s0:c0.c1023 tcontext=system_u:object_r:unlabeled_t:s0 tclass=file
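
The AVC record above can be split into its fields to make the denial easier to read: the sanlock process (type sanlock_t) was denied read/write on the file "ids" on loop1, whose context is unlabeled_t. A small Python sketch of such a parser, written only for illustration (parse_avc is a hypothetical helper, not an audit tool API):

```python
import re

AVC_LINE = (
    'type=AVC msg=audit(1436789811.564:358): avc:  denied  { read write } for  '
    'pid=3677 comm="sanlock" name="ids" dev="loop1" ino=16390 '
    'scontext=system_u:system_r:sanlock_t:s0-s0:c0.c1023 '
    'tcontext=system_u:object_r:unlabeled_t:s0 tclass=file'
)


def parse_avc(line):
    """Extract the denied permissions and key=value fields from an AVC record."""
    perms = re.search(r"\{([^}]*)\}", line)
    # Values are either double-quoted strings or runs of non-whitespace.
    fields = dict(re.findall(r'(\w+)=("[^"]*"|\S+)', line))
    fields = {key: value.strip('"') for key, value in fields.items()}
    fields["permissions"] = perms.group(1).split() if perms else []
    return fields


if __name__ == "__main__":
    record = parse_avc(AVC_LINE)
    source_type = record["scontext"].split(":")[2]
    target_type = record["tcontext"].split(":")[2]
    print("%s denied %s on %s (%s, dev %s)" % (
        source_type, record["permissions"], record["name"],
        target_type, record["dev"]))
```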

Comment 6 Simone Tiraboschi 2015-07-13 12:52:42 UTC
Does it work in permissive mode?

Comment 7 Elad 2015-07-13 13:03:51 UTC
(In reply to Simone Tiraboschi from comment #6)
> Does it work in permissive mode?
Tested with permissive and it works
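
For anyone repeating this check, the current SELinux mode can be read from the selinuxfs "enforce" file, which holds 1 for enforcing and 0 for permissive. A minimal Python sketch, assuming selinuxfs is mounted at the usual /sys/fs/selinux location (selinux_mode is an illustrative helper name):

```python
import os


def selinux_mode(enforce_path="/sys/fs/selinux/enforce"):
    """Return 'enforcing', 'permissive', or 'disabled' based on selinuxfs."""
    if not os.path.exists(enforce_path):
        # No selinuxfs entry: SELinux is disabled or not available.
        return "disabled"
    with open(enforce_path) as handle:
        return "enforcing" if handle.read().strip() == "1" else "permissive"


if __name__ == "__main__":
    print(selinux_mode())
```

A deployment that succeeds only when this reports "permissive" is consistent with the AVC denial being the cause of the sanlock failure.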

Comment 8 Elad 2015-07-13 13:57:00 UTC
The HE VM is not started automatically once HE installation is finished. 
Checked for its status and got the following exception:


[root@green-vdsb ovirt-hosted-engine-setup]# hosted-engine --vm-status
Traceback (most recent call last):
  File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/vm_status.py", line 117, in <module>
    if not status_checker.print_status():
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/vm_status.py", line 60, in print_status
    all_host_stats = ha_cli.get_all_host_stats()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 157, in get_all_host_stats
    return self.get_all_stats(self.StatModes.HOST)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 102, in get_all_stats
    stats = broker.get_stats_from_storage(service)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 232, in get_stats_from_storage
    result = self._checked_communicate(request)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 260, in _checked_communicate
    .format(message or response))
ovirt_hosted_engine_ha.lib.exceptions.RequestError: Request failed: <type 'exceptions.OSError'>


So I started the VM manually:

[root@green-vdsb ovirt-hosted-engine-setup]# hosted-engine --vm-start

a7567b69-c29c-46a2-a643-799acf0b1a87
        Status = WaitForLaunch
        nicModel = rtl8139,pv
        statusTime = 4300682480
        emulatedMachine = rhel6.5.0
        pid = 0
        vmName = HostedEngine
        devices = [{'index': '2', 'iface': 'ide', 'specParams': {}, 'readonly': 'true', 'deviceId': '1a732367-113d-4e6a-8dcb-9adb45e3e1de', 'address': {'bus': '1', 'controller': '0', 'type': 'drive', 'target': '0', 'unit': '0'}, 'device': 'cdrom', 'shared': 'false', 'path': '', 'type': 'disk'}, {'index': '0', 'iface': 'virtio', 'format': 'raw', 'bootOrder': '1', 'poolID': '00000000-0000-0000-0000-000000000000', 'volumeID': '6a80ef55-6f15-492d-b962-123615bf27cf', 'imageID': 'df02f4f1-e1c7-474b-8075-b839e4bc1c95', 'specParams': {}, 'readonly': 'false', 'domainID': '4a5d3450-655b-452f-8dda-2ef7e051b1a8', 'optional': 'false', 'deviceId': 'df02f4f1-e1c7-474b-8075-b839e4bc1c95', 'address': {'slot': '0x06', 'bus': '0x00', 'domain': '0x0000', 'type': 'pci', 'function': '0x0'}, 'device': 'disk', 'shared': 'exclusive', 'propagateErrors': 'off', 'type': 'disk'}, {'device': 'scsi', 'model': 'virtio-scsi', 'type': 'controller'}, {'nicModel': 'pv', 'macAddr': '00:16:3E:76:D5:D5', 'linkActive': 'true', 'network': 'ovirtmgmt', 'filter': 'vdsm-no-mac-spoofing', 'specParams': {}, 'deviceId': 'a4c22ecc-0e5b-4548-b10a-5ca884d22946', 'address': {'slot': '0x03', 'bus': '0x00', 'domain': '0x0000', 'type': 'pci', 'function': '0x0'}, 'device': 'bridge', 'type': 'interface'}, {'device': 'console', 'specParams': {}, 'type': 'console', 'deviceId': '131f7a43-a609-4795-ba03-9f25f327f6f9', 'alias': 'console0'}]
        guestDiskMapping = {}
        vmType = kvm
        clientIp = 
        displaySecurePort = -1
        memSize = 4096
        displayPort = -1
        cpuType = Conroe
        spiceSecureChannels = smain,sdisplay,sinputs,scursor,splayback,srecord,ssmartcard,susbredir
        smp = 2
        displayIp = 0
        display = vnc


Checked again for the status and got the same exception.

Simone, is this failure to start the VM automatically and the mentioned exception related to the fact I'm working in permissive?

Comment 9 Simone Tiraboschi 2015-07-13 14:19:26 UTC
(In reply to Elad from comment #8)
> Simone, is this failure to start the VM automatically and the mentioned
> exception related to the fact I'm working in permissive?

No, it isn't. It looks like a different issue.

Comment 10 Simone Tiraboschi 2015-07-15 08:33:32 UTC
*** Bug 1238313 has been marked as a duplicate of this bug. ***

Comment 11 Simone Tiraboschi 2015-07-28 07:52:29 UTC
*** Bug 1247165 has been marked as a duplicate of this bug. ***

Comment 12 Simone Tiraboschi 2015-07-28 07:52:45 UTC
*** Bug 1225366 has been marked as a duplicate of this bug. ***

Comment 13 Simone Tiraboschi 2015-07-28 08:06:12 UTC
*** Bug 1247181 has been marked as a duplicate of this bug. ***

Comment 14 Elad 2015-11-03 09:04:59 UTC
Hosted-engine deployment over FC completed successfully.
Note that the import of the HE storage domain is blocked due to https://bugzilla.redhat.com/show_bug.cgi?id=1273378

Verified using:
ovirt-hosted-engine-setup-1.3.0-1.el7ev.noarch
vdsm-4.17.10-5.el7ev.noarch
selinux-policy-3.13.1-60.el7.noarch

Comment 15 Sandro Bonazzola 2015-11-04 13:36:26 UTC
oVirt 3.6.0 was released on November 4th, 2015 and should fix this issue.
If problems still persist, please open a new BZ and reference this one.

