Bug 1255026

Summary: Hosted engine does not set up, stuck on AAA stage
Product: Red Hat Enterprise Virtualization Manager Reporter: Sagi Shnaidman <sshnaidm>
Component: ovirt-engineAssignee: Simone Tiraboschi <stirabos>
Status: CLOSED CURRENTRELEASE QA Contact: Nikolai Sednev <nsednev>
Severity: high Docs Contact:
Priority: medium    
Version: 3.6.0CC: amureini, bazulay, eedri, gklein, lsurette, mperina, oourfali, rbalakri, Rhev-m-bugs, sbonazzo, sshnaidm, stirabos, tlitovsk, yeylon, ykaul, ylavi
Target Milestone: ovirt-3.6.0-rcKeywords: Automation, Triaged
Target Release: 3.6.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ovirt-engine-3.6.0 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-03-11 07:33:13 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Integration RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
package versions
none
ovirt-hosted-engine-setup log
none
ovirt-engine-setup log none

Description Sagi Shnaidman 2015-08-19 13:03:00 UTC
Created attachment 1064839 [details]
package versions

Description of problem:

Run hosted engine setup

Version-Release number of selected component (if applicable):

see in attached package_versions file

How reproducible:

always

Steps to Reproduce:
1. Run hosted-engine --deploy


Actual results:

engine setup is stuck on AAA stage

Expected results:
          |- [ INFO  ] Creating/refreshing Engine database schema
          |- [ INFO  ] Creating/refreshing Engine AAA database schema
[ ERROR ] Engine setup got stuck on the appliance
[ ERROR ] Failed to execute stage 'Closing up': Engine setup is stalled on the appliance since 300 seconds ago. Please check its log on the appliance. 
[ INFO  ] Stage: Clean up
[ INFO  ] Generating answer file '/var/lib/ovirt-hosted-engine-setup/answers/answers-20150819061244.conf'
[ INFO  ] Stage: Pre-termination
[ INFO  ] Stage: Termination

Comment 1 Sagi Shnaidman 2015-08-19 13:03:41 UTC
Created attachment 1064840 [details]
ovirt-hosted-engine-setup log

Comment 2 Sagi Shnaidman 2015-08-19 13:04:12 UTC
Created attachment 1064842 [details]
ovirt-engine-setup log

Comment 3 Oved Ourfali 2015-08-24 06:31:40 UTC
Martin - this is fixed already, right?
Is it a duplicate?

Comment 4 Martin Perina 2015-08-25 08:32:38 UTC
According to attached ovirt-engine-setup.log all AAA related actions finished successfully and last engine-setup output is "Upgrading CA", so if the log is complete, this issue has nothing to do with aaa-jdbc.

Comment 5 Simone Tiraboschi 2015-08-25 09:55:59 UTC
engine-setup, running on the appliance VM, got stuck here:

2015-08-19 03:13:06 DEBUG otopi.context context._executeMethod:141 Stage misc METHOD otopi.plugins.ovirt_engine_setup.ovirt_engine.pki.ca.Plugin._miscUpgrade
2015-08-19 03:13:06 INFO otopi.plugins.ovirt_engine_setup.ovirt_engine.pki.ca ca._miscUpgrade:484 Upgrading CA

running the CA scripts [1].

It's no reproducible on my environment so it could be something related to a lack of entropy on our CI environment.
Alon?

[1] - http://jenkins-ci.eng.lab.tlv.redhat.com/job/hosted_engine_3.6_el7_staticip_nfs3_install_on_7.1/53/artifact/logs/ovirt-engine-setup-20150825031708-st2evd.log

Comment 6 Martin Perina 2015-08-25 15:10:05 UTC
I tried to debug the issue and when I connected to stucked engine-setup process using command 'gdb -batch -ex "t a a py-bt" python PID' following  frames appeared:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00007f83c95179e0 in __write_nocancel () at ../sysdeps/unix/syscall-template.S:81
81	T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)

Thread 1 (Thread 0x7f83ca6ee740 (LWP 2531)):
#6 <built-in method write of file object at remote 0x2a3b780>
#8 Frame 0x2b3a710, for file /usr/lib64/python2.7/logging/__init__.py, line 872, in emit (self=<StreamHandler(stream=<file at remote 0x2a3b780>, level=20, lock=<_RLock(_Verbose__verbose=False, _RLock__owner=140204013709120, _RLock__block=<thread.lock at remote 0x29ad1f0>, _RLock__count=1) at remote 0x2ab46d0>, _name=None, filters=[], formatter=<_MyFormatter(_parent=<Plugin(_context=<Context(_notifications=[<instancemethod at remote 0x2ac39b0>, <instancemethod at remote 0x2ac38c0>, <instancemethod at remote 0x2ac3960>], _services=<Plugin(_logger=<Logger(name='otopi.plugins.otopi.services.systemd', parent=<Logger(name='otopi', parent=<RootLogger(name='root', parent=None, handlers=[], level=30, disabled=0, propagate=1, filters=[]) at remote 0x7f83c1c9b110>, handlers=[<NullHandler(lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<thread.lock at remote 0x7f83ca66f390>, _RLock__count=0) at remote 0x215ca10>, level=0, formatter=None, filters=[], _name=None) at remote 0x213ab10>, <StreamHandler(st...(truncated)
    stream.write(fs % msg)
#11 Frame 0x2b33980, for file /usr/lib64/python2.7/logging/__init__.py, line 749, in handle (self=<StreamHandler(stream=<file at remote 0x2a3b780>, level=20, lock=<_RLock(_Verbose__verbose=False, _RLock__owner=140204013709120, _RLock__block=<thread.lock at remote 0x29ad1f0>, _RLock__count=1) at remote 0x2ab46d0>, _name=None, filters=[], formatter=<_MyFormatter(_parent=<Plugin(_context=<Context(_notifications=[<instancemethod at remote 0x2ac39b0>, <instancemethod at remote 0x2ac38c0>, <instancemethod at remote 0x2ac3960>], _services=<Plugin(_logger=<Logger(name='otopi.plugins.otopi.services.systemd', parent=<Logger(name='otopi', parent=<RootLogger(name='root', parent=None, handlers=[], level=30, disabled=0, propagate=1, filters=[]) at remote 0x7f83c1c9b110>, handlers=[<NullHandler(lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<thread.lock at remote 0x7f83ca66f390>, _RLock__count=0) at remote 0x215ca10>, level=0, formatter=None, filters=[], _name=None) at remote 0x213ab10>, <StreamHandler(...(truncated)
    self.emit(record)
#14 Frame 0x2b33780, for file /usr/lib64/python2.7/logging/__init__.py, line 1318, in callHandlers (self=<Logger(name='otopi.plugins.ovirt_engine_setup.ovirt_engine.config.aaajdbc', parent=<Logger(name='otopi', parent=<RootLogger(name='root', parent=None, handlers=[], level=30, disabled=0, propagate=1, filters=[]) at remote 0x7f83c1c9b110>, handlers=[<NullHandler(lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<thread.lock at remote 0x7f83ca66f390>, _RLock__count=0) at remote 0x215ca10>, level=0, formatter=None, filters=[], _name=None) at remote 0x213ab10>, <StreamHandler(stream=<file at remote 0x2a3b660>, level=10, lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<thread.lock at remote 0x29adc90>, _RLock__count=0) at remote 0x2ab4590>, _name=None, filters=[], formatter=<_MyFormatter(datefmt='%Y-%m-%d %H:%M:%S', _fmt='%(asctime)s %(levelname)s %(name)s %(module)s.%(funcName)s:%(lineno)d %(message)s', _environment={'CORE/modifiedFiles': ['/var/lib/pgsql/data/pg_hba.co...(truncated)
    hdlr.handle(record)
#17 Frame 0x2b333d0, for file /usr/lib64/python2.7/logging/__init__.py, line 1278, in handle (self=<Logger(name='otopi.plugins.ovirt_engine_setup.ovirt_engine.config.aaajdbc', parent=<Logger(name='otopi', parent=<RootLogger(name='root', parent=None, handlers=[], level=30, disabled=0, propagate=1, filters=[]) at remote 0x7f83c1c9b110>, handlers=[<NullHandler(lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<thread.lock at remote 0x7f83ca66f390>, _RLock__count=0) at remote 0x215ca10>, level=0, formatter=None, filters=[], _name=None) at remote 0x213ab10>, <StreamHandler(stream=<file at remote 0x2a3b660>, level=10, lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<thread.lock at remote 0x29adc90>, _RLock__count=0) at remote 0x2ab4590>, _name=None, filters=[], formatter=<_MyFormatter(datefmt='%Y-%m-%d %H:%M:%S', _fmt='%(asctime)s %(levelname)s %(name)s %(module)s.%(funcName)s:%(lineno)d %(message)s', _environment={'CORE/modifiedFiles': ['/var/lib/pgsql/data/pg_hba.conf', '...(truncated)
    self.callHandlers(record)
#20 Frame 0x2b2dce0, for file /usr/lib64/python2.7/logging/__init__.py, line 1268, in _log (self=<Logger(name='otopi.plugins.ovirt_engine_setup.ovirt_engine.config.aaajdbc', parent=<Logger(name='otopi', parent=<RootLogger(name='root', parent=None, handlers=[], level=30, disabled=0, propagate=1, filters=[]) at remote 0x7f83c1c9b110>, handlers=[<NullHandler(lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<thread.lock at remote 0x7f83ca66f390>, _RLock__count=0) at remote 0x215ca10>, level=0, formatter=None, filters=[], _name=None) at remote 0x213ab10>, <StreamHandler(stream=<file at remote 0x2a3b660>, level=10, lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<thread.lock at remote 0x29adc90>, _RLock__count=0) at remote 0x2ab4590>, _name=None, filters=[], formatter=<_MyFormatter(datefmt='%Y-%m-%d %H:%M:%S', _fmt='%(asctime)s %(levelname)s %(name)s %(module)s.%(funcName)s:%(lineno)d %(message)s', _environment={'CORE/modifiedFiles': ['/var/lib/pgsql/data/pg_hba.conf', '/v...(truncated)
    self.handle(record)
#25 Frame 0x2b2d760, for file /usr/lib64/python2.7/logging/__init__.py, line 1149, in info (self=<Logger(name='otopi.plugins.ovirt_engine_setup.ovirt_engine.config.aaajdbc', parent=<Logger(name='otopi', parent=<RootLogger(name='root', parent=None, handlers=[], level=30, disabled=0, propagate=1, filters=[]) at remote 0x7f83c1c9b110>, handlers=[<NullHandler(lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<thread.lock at remote 0x7f83ca66f390>, _RLock__count=0) at remote 0x215ca10>, level=0, formatter=None, filters=[], _name=None) at remote 0x213ab10>, <StreamHandler(stream=<file at remote 0x2a3b660>, level=10, lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<thread.lock at remote 0x29adc90>, _RLock__count=0) at remote 0x2ab4590>, _name=None, filters=[], formatter=<_MyFormatter(datefmt='%Y-%m-%d %H:%M:%S', _fmt='%(asctime)s %(levelname)s %(name)s %(module)s.%(funcName)s:%(lineno)d %(message)s', _environment={'CORE/modifiedFiles': ['/var/lib/pgsql/data/pg_hba.conf', '/v...(truncated)
    self._log(INFO, msg, args, **kwargs)
#29 Frame 0x33b5850, for file /usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/config/aaajdbc.py, line 94, in _setupSchema (self=<Plugin(_logger=<Logger(name='otopi.plugins.ovirt_engine_setup.ovirt_engine.config.aaajdbc', parent=<Logger(name='otopi', parent=<RootLogger(name='root', parent=None, handlers=[], level=30, disabled=0, propagate=1, filters=[]) at remote 0x7f83c1c9b110>, handlers=[<NullHandler(lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<thread.lock at remote 0x7f83ca66f390>, _RLock__count=0) at remote 0x215ca10>, level=0, formatter=None, filters=[], _name=None) at remote 0x213ab10>, <StreamHandler(stream=<file at remote 0x2a3b660>, level=10, lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<thread.lock at remote 0x29adc90>, _RLock__count=0) at remote 0x2ab4590>, _name=None, filters=[], formatter=<_MyFormatter(datefmt='%Y-%m-%d %H:%M:%S', _fmt='%(asctime)s %(levelname)s %(name)s %(module)s.%(funcName)s:%(lineno)d %(message)s',...(truncated)
    _('Creating/refreshing Engine AAA database schema')
#32 Frame 0x33b3d10, for file /usr/share/ovirt-engine/setup/bin/../plugins/ovirt-engine-setup/ovirt-engine/config/aaajdbc.py, line 331, in _misc (self=<Plugin(_logger=<Logger(name='otopi.plugins.ovirt_engine_setup.ovirt_engine.config.aaajdbc', parent=<Logger(name='otopi', parent=<RootLogger(name='root', parent=None, handlers=[], level=30, disabled=0, propagate=1, filters=[]) at remote 0x7f83c1c9b110>, handlers=[<NullHandler(lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<thread.lock at remote 0x7f83ca66f390>, _RLock__count=0) at remote 0x215ca10>, level=0, formatter=None, filters=[], _name=None) at remote 0x213ab10>, <StreamHandler(stream=<file at remote 0x2a3b660>, level=10, lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<thread.lock at remote 0x29adc90>, _RLock__count=0) at remote 0x2ab4590>, _name=None, filters=[], formatter=<_MyFormatter(datefmt='%Y-%m-%d %H:%M:%S', _fmt='%(asctime)s %(levelname)s %(name)s %(module)s.%(funcName)s:%(lineno)d %(message)s', _envi...(truncated)
    self._setupSchema()
#35 Frame 0x2b34b20, for file /usr/lib/python2.7/site-packages/otopi/context.py, line 146, in _executeMethod (self=<Context(_notifications=[<instancemethod at remote 0x2ac39b0>, <instancemethod at remote 0x2ac38c0>, <instancemethod at remote 0x2ac3960>], _services=<Plugin(_logger=<Logger(name='otopi.plugins.otopi.services.systemd', parent=<Logger(name='otopi', parent=<RootLogger(name='root', parent=None, handlers=[], level=30, disabled=0, propagate=1, filters=[]) at remote 0x7f83c1c9b110>, handlers=[<NullHandler(lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<thread.lock at remote 0x7f83ca66f390>, _RLock__count=0) at remote 0x215ca10>, level=0, formatter=None, filters=[], _name=None) at remote 0x213ab10>, <StreamHandler(stream=<file at remote 0x2a3b660>, level=10, lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<thread.lock at remote 0x29adc90>, _RLock__count=0) at remote 0x2ab4590>, _name=None, filters=[], formatter=<_MyFormatter(datefmt='%Y-%m-%d %H:%M:%S', _fmt=...(truncated)
    method['method']()
#38 Frame 0x2b306b0, for file /usr/lib/python2.7/site-packages/otopi/context.py, line 442, in runSequence (self=<Context(_notifications=[<instancemethod at remote 0x2ac39b0>, <instancemethod at remote 0x2ac38c0>, <instancemethod at remote 0x2ac3960>], _services=<Plugin(_logger=<Logger(name='otopi.plugins.otopi.services.systemd', parent=<Logger(name='otopi', parent=<RootLogger(name='root', parent=None, handlers=[], level=30, disabled=0, propagate=1, filters=[]) at remote 0x7f83c1c9b110>, handlers=[<NullHandler(lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<thread.lock at remote 0x7f83ca66f390>, _RLock__count=0) at remote 0x215ca10>, level=0, formatter=None, filters=[], _name=None) at remote 0x213ab10>, <StreamHandler(stream=<file at remote 0x2a3b660>, level=10, lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<thread.lock at remote 0x29adc90>, _RLock__count=0) at remote 0x2ab4590>, _name=None, filters=[], formatter=<_MyFormatter(datefmt='%Y-%m-%d %H:%M:%S', _fmt='%(...(truncated)
    self._executeMethod(self._currentStage, methodinfo)
#41 Frame 0x2177180, for file /usr/lib/python2.7/site-packages/otopi/main.py, line 167, in execute (self=<Otopi(_insource=False, _context=<Context(_notifications=[<instancemethod at remote 0x2ac39b0>, <instancemethod at remote 0x2ac38c0>, <instancemethod at remote 0x2ac3960>], _services=<Plugin(_logger=<Logger(name='otopi.plugins.otopi.services.systemd', parent=<Logger(name='otopi', parent=<RootLogger(name='root', parent=None, handlers=[], level=30, disabled=0, propagate=1, filters=[]) at remote 0x7f83c1c9b110>, handlers=[<NullHandler(lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<thread.lock at remote 0x7f83ca66f390>, _RLock__count=0) at remote 0x215ca10>, level=0, formatter=None, filters=[], _name=None) at remote 0x213ab10>, <StreamHandler(stream=<file at remote 0x2a3b660>, level=10, lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<thread.lock at remote 0x29adc90>, _RLock__count=0) at remote 0x2ab4590>, _name=None, filters=[], formatter=<_MyFormatter(datefmt='%Y...(truncated)
    self.context.runSequence()
#44 Frame 0x20ff3f0, for file /usr/lib/python2.7/site-packages/otopi/__main__.py, line 103, in main (self=<Installer(_debug=0) at remote 0x7f83c1c9b510>, installer=<Otopi(_insource=False, _context=<Context(_notifications=[<instancemethod at remote 0x2ac39b0>, <instancemethod at remote 0x2ac38c0>, <instancemethod at remote 0x2ac3960>], _services=<Plugin(_logger=<Logger(name='otopi.plugins.otopi.services.systemd', parent=<Logger(name='otopi', parent=<RootLogger(name='root', parent=None, handlers=[], level=30, disabled=0, propagate=1, filters=[]) at remote 0x7f83c1c9b110>, handlers=[<NullHandler(lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<thread.lock at remote 0x7f83ca66f390>, _RLock__count=0) at remote 0x215ca10>, level=0, formatter=None, filters=[], _name=None) at remote 0x213ab10>, <StreamHandler(stream=<file at remote 0x2a3b660>, level=10, lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<thread.lock at remote 0x29adc90>, _RLock__count=0) at remote 0x2ab4590>, ...(truncated)
    installer.execute()
#47 Frame 0x2074c40, for file /usr/lib/python2.7/site-packages/otopi/__main__.py, line 137, in <module> ()
    sys.exit(installer.main())
#51 Frame 0x208bbe0, for file /usr/lib64/python2.7/runpy.py, line 72, in _run_code (code=<code at remote 0x7f83ca5edc30>, run_globals={'_': <function at remote 0x2165410>, 'main': <module at remote 0x7f83ca5618a0>, 'installer': <Installer(_debug=0) at remote 0x7f83c1c9b510>, '__builtins__': <module at remote 0x7f83ca6aeb08>, '__file__': '/usr/lib/python2.7/site-packages/otopi/__main__.py', 'traceback': <module at remote 0x7f83ca602fa0>, 'gettext': <module at remote 0x7f83ca5ac2f0>, '__package__': 'otopi', 'sys': <module at remote 0x7f83ca6aebb0>, 'shlex': <module at remote 0x7f83ca551718>, 'Installer': <type at remote 0x212dee0>, '__name__': '__main__', '__loader__': <ImpLoader(code=<code at remote 0x7f83ca5edc30>, source='#\n# otopi -- plugable installer\n# Copyright (C) 2012-2013 Red Hat, Inc.\n#\n# This library is free software; you can redistribute it and/or\n# modify it under the terms of the GNU Lesser General Public\n# License as published by the Free Software Foundation; either\n# version 2.1 of the Licen...(truncated)
    exec code in run_globals
#55 Frame 0x2015dd0, for file /usr/lib64/python2.7/runpy.py, line 162, in _run_module_as_main (mod_name='otopi.__main__', alter_argv=1, loader=<ImpLoader(code=<code at remote 0x7f83ca5edc30>, source='#\n# otopi -- plugable installer\n# Copyright (C) 2012-2013 Red Hat, Inc.\n#\n# This library is free software; you can redistribute it and/or\n# modify it under the terms of the GNU Lesser General Public\n# License as published by the Free Software Foundation; either\n# version 2.1 of the License, or (at your option) any later version.\n#\n# This library is distributed in the hope that it will be useful,\n# but WITHOUT ANY WARRANTY; without even the implied warranty of\n# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU\n# Lesser General Public License for more details.\n#\n# You should have received a copy of the GNU Lesser General Public\n# License along with this library; if not, write to the Free Software\n# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA\n#\n\n\n"""o...(truncated)
    "__main__", fname, loader, pkg_name)


Last line in setup log is:

2015-08-25 14:33:11 INFO otopi.plugins.ovirt_engine_setup.ovirt_engine.config.aaajdbc aaajdbc._setupSchema:94 Creating/refreshing Engine AAA database schema

So it seems that python is stucked when trying to write into log file.

Comment 7 Sagi Shnaidman 2015-08-26 11:57:41 UTC
I see there "idle in transaction" for postgresql:

[root@hengine ~]# ps aux | grep engine
root      3823  0.6  0.8 324704 32504 ?        S    11:43   0:04 /bin/python -B -m otopi.__main__  "APPEND:BASE/pluginPath=str:/usr/share/ovirt-engine/setup/bin/../plugins" APPEND:BASE/pluginGroups=str:ovirt-engine-common:ovirt-engine-setup  OVESETUP_CORE/offlinePackager=bool:True PACKAGER/yumpackagerEnabled=bool:False "APPEND:CORE/configFileAppend=str:/root/ovirt-engine-answers" "APPEND:CORE/configFileAppend=str:/root/heanswers.conf" 
postgres 23246  0.0  0.2 239540 10284 ?        Ss   11:49   0:00 postgres: engine engine ::1(48395) idle in transaction
root     23339  0.0  0.0 112640   964 pts/0    S+   11:55   0:00 grep --color=auto engine

Comment 8 Martin Perina 2015-09-02 09:08:31 UTC
Sagi was it resolved by adding more resources to CI VMs? If so can we close the bug?

Comment 9 Simone Tiraboschi 2015-09-02 09:37:54 UTC
Also if so at least we have to enforce it as  minimum required configuration explicitly avoid trying to deploy the engine VM with less resources.

Comment 10 Oved Ourfali 2015-09-02 09:55:47 UTC
Also, does it reproduce?
If not, it is definitely neither urgent nor a blocker.

Comment 11 Sagi Shnaidman 2015-09-02 12:17:45 UTC
It's difficult to say now, because it's blocked by VDSM bug:  1259310
I can not check now if it works.

Comment 12 Oved Ourfali 2015-09-02 12:24:32 UTC
Reducing priority until this gets cleared.

Comment 13 Alon Bar-Lev 2015-09-03 14:14:50 UTC
Please close it as WORKSFORME

Comment 14 Sandro Bonazzola 2015-09-04 12:07:09 UTC
*** Bug 1252858 has been marked as a duplicate of this bug. ***

Comment 15 Yaniv Lavi 2015-09-08 11:11:26 UTC
Sagi, please reopen if you have additional issues.

Comment 16 Simone Tiraboschi 2015-09-16 09:53:05 UTC
It's still happening and it's not systematic:

Here for instance:
http://jenkins-ci.eng.lab.tlv.redhat.com/job/hosted_engine_3.6_el7_staticip_nfs4_install_on_latest/19/console

06:00:57 [cinteg31.ci.lab.tlv.redhat.com] out:           |- [ INFO  ] Creating/refreshing Engine database schema
06:05:58 [cinteg31.ci.lab.tlv.redhat.com] out: [ ERROR ] Engine setup got stuck on the appliance
06:05:58 [cinteg31.ci.lab.tlv.redhat.com] out: [ ERROR ] Failed to execute stage 'Closing up': Engine setup is stalled on the appliance since 300 seconds ago. Please check its log on the appliance. 
06:05:58 [cinteg31.ci.lab.tlv.redhat.com] out: [ INFO  ] Stage: Clean up

On another instance 
http://jenkins-ci.eng.lab.tlv.redhat.com/job/hosted_engine_3.6_el7_staticip_nfs4_install_on_7.1/33/console
we also got:
06:44:25 [cinteg07.ci.lab.tlv.redhat.com] out:           |- [ INFO  ] Configuring PostgreSQL
06:44:35 [cinteg07.ci.lab.tlv.redhat.com] out:           |- [ INFO  ] Creating/refreshing Engine database schema
06:45:05 [cinteg07.ci.lab.tlv.redhat.com] out: 
06:45:05 [cinteg07.ci.lab.tlv.redhat.com] out: gMessage from syslogd@hehost at Sep 16 06:45:05 ...
06:45:05 [cinteg07.ci.lab.tlv.redhat.com] out:  kernel:BUG: soft lockup - CPU#0 stuck for 23s! [qemu-kvm:15149]
06:50:00 [cinteg07.ci.lab.tlv.redhat.com] out: [ ERROR ] Engine setup got stuck on the appliance

With a KVM backtrace in /var/log/messages:
http://jenkins-ci.eng.lab.tlv.redhat.com/job/hosted_engine_3.6_el7_staticip_nfs4_install_on_7.1/33/artifact/logs/messages


So it could be something on kvm on the jenkins slave: we are using nested virtualization for that CI jobs.

Comment 17 Sandro Bonazzola 2015-09-18 06:17:08 UTC
Reduced severity since it happens in CI only probably triggering a kvm bug in nested virtualization. Not a common use case.

Comment 18 Nikolai Sednev 2015-11-25 08:10:31 UTC
Works for me on these components:
# rpm -qa vdsm libvirt-client sanlock qemu-kvm-rhev mom ovirt*
ovirt-vmconsole-host-1.0.1-0.0.master.20151105234454.git3e5d52e.el7.noarch
ovirt-release36-002-2.noarch
ovirt-engine-sdk-python-3.6.0.4-0.2.20151123.gita2f81ed.el7.centos.noarch
sanlock-3.2.4-1.el7.x86_64
ovirt-setup-lib-1.0.1-0.0.master.20151119123055.gitfa908be.el7.centos.noarch
qemu-kvm-rhev-2.3.0-31.el7_2.3.x86_64
ovirt-hosted-engine-ha-1.3.3-0.0.master.20151118145556.20151118145552.git71b535e.el7.noarch
ovirt-vmconsole-1.0.1-0.0.master.20151105234454.git3e5d52e.el7.noarch
ovirt-release36-snapshot-002-2.noarch
libvirt-client-1.2.17-13.el7.x86_64
ovirt-hosted-engine-setup-1.3.1-0.0.master.20151118143825.gitc013638.el7.centos.noarch
ovirt-host-deploy-1.4.2-0.0.master.20151122153544.gitfc808fc.el7.noarch
vdsm-4.17.10.1-0.el7ev.noarch
mom-0.5.1-2.el7.noarch