Bug 1262550
| Summary: | 3.6.0 nighty hosted-engine deploy quits silently on second host | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | [oVirt] ovirt-hosted-engine-setup | Reporter: | acrow | ||||||
| Component: | General | Assignee: | Simone Tiraboschi <stirabos> | ||||||
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Nikolai Sednev <nsednev> | ||||||
| Severity: | urgent | Docs Contact: | |||||||
| Priority: | urgent | ||||||||
| Version: | 1.3.0 | CC: | alonbl, bugs, didi, gklein, hawk, sbonazzo | ||||||
| Target Milestone: | ovirt-3.6.0-beta | Keywords: | TestOnly, Triaged | ||||||
| Target Release: | 1.3.0 | Flags: | rule-engine:
                ovirt-3.6.0+ rule-engine: blocker+ ylavi: planning_ack+ sbonazzo: devel_ack+ gklein: testing_ack+ | ||||||
| Hardware: | x86_64 | ||||||||
| OS: | Linux | ||||||||
| Whiteboard: | integration | ||||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | Environment: | ||||||||
| Last Closed: | 2015-11-27 07:52:42 UTC | Type: | Bug | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Embargoed: | |||||||||
| Bug Depends On: | 1255638 | ||||||||
| Bug Blocks: | |||||||||
| Attachments: | 
 | ||||||||
| Created attachment 1072739 [details]
hosted-engine --deploy done from scratch.
Firewall off, selinux disabled. Still bails out with no obvious error.
Alex
Looks really weird. Can you please try to run this under strace and post the log? Something like: strace -tt -s 4096 -f -o /tmp/hosted-engine-trace hosted-engine --deploy Also please check/post vdsm logs. Thanks. Created attachment 1072902 [details]
strace and logs
By the way this is from a fresh setup on a "first host", so it's not restricted to the second host use case. I think it's a very recent change as a few days ago (I think on 9/9/15, possibly a bit earlier) I could deploy the hosted engine with no issues. ALex It seams that the issue is on the connect with VDSM:
3800  15:39:46.143821 socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 10
3800  15:39:46.143887 fcntl(10, F_GETFL) = 0x2 (flags O_RDWR)
3800  15:39:46.143912 fcntl(10, F_SETFL, O_RDWR|O_NONBLOCK) = 0
3800  15:39:46.143946 connect(10, {sa_family=AF_INET, sin_port=htons(54321), sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EINPROGRESS (Operation now in progress)
3800  15:39:46.144069 poll([{fd=10, events=POLLOUT}], 1, 900000) = 1 ([{fd=10, revents=POLLOUT}])
3800  15:39:46.144106 getsockopt(10, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
3800  15:39:46.144159 getpeername(10, {sa_family=AF_INET, sin_port=htons(54321), sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
....
3800  15:39:46.152220 close(10)         = 0
3800  15:39:46.152612 rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x7efe5af03130}, {0x7efe5b221620, [], SA_RESTORER, 0x7efe5af03130}, 8) = 0
3800  15:39:46.152660 rt_sigaction(SIGHUP, {SIG_DFL, [], SA_RESTORER, 0x7efe5af03130}, {0x7efe5b221620, [], SA_RESTORER, 0x7efe5af03130}, 8) = 0
3800  15:39:46.152693 rt_sigaction(SIGQUIT, {SIG_DFL, [], SA_RESTORER, 0x7efe5af03130}, {0x7efe5b221620, [], SA_RESTORER, 0x7efe5af03130}, 8) = 0
3800  15:39:46.152719 rt_sigaction(SIGABRT, {SIG_DFL, [], SA_RESTORER, 0x7efe5af03130}, {0x7efe5b221620, [], SA_RESTORER, 0x7efe5af03130}, 8) = 0
3800  15:39:46.152744 rt_sigaction(SIGPIPE, {SIG_DFL, [], SA_RESTORER, 0x7efe5af03130}, {SIG_IGN, [], SA_RESTORER, 0x7efe5af03130}, 8) = 0
3800  15:39:46.152768 rt_sigaction(SIGTERM, {SIG_DFL, [], SA_RESTORER, 0x7efe5af03130}, {0x7efe5b221620, [], SA_RESTORER, 0x7efe5af03130}, 8) = 0
3800  15:39:46.152805 rt_sigaction(SIGIO, {SIG_DFL, [], SA_RESTORER, 0x7efe5af03130}, {0x7efe5b221620, [], SA_RESTORER, 0x7efe5af03130}, 8) = 0
3800  15:39:46.177030 exit_group(1)     = ?
3800  15:39:46.180417 +++ exited with 1 +++
OK, found running it with OTOPI_DEBUG=1
2015-09-14 12:12:43 DEBUG otopi.plugins.ovirt_hosted_engine_setup.system.vdsmenv vdsmenv._connect:62 {'status': {'message': 'Done', 'code': 0}, 'info': {'systemProductName': 'oVirt Node', 'systemUUID': '8d225eaa-6dcd-441a-80a1-1e43335dc7b0', 'systemSerialNumber': '4C4C4544-0056-4B10-8044-B9C04F513032', 'systemVersion': '20-4', 'systemManufacturer': 'oVirt'}}
('FATAL Internal error (main): <Fault 1: \'<type \\\'exceptions.Exception\\\'>:method "__unicode__" is not supported\'>',)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/otopi/__main__.py", line 103, in main
    installer.execute()
  File "/usr/lib/python2.7/site-packages/otopi/main.py", line 167, in execute
    self.context.runSequence()
  File "/usr/lib/python2.7/site-packages/otopi/context.py", line 443, in runSequence
    self.dumpEnvironment(old=oldEnvironment)
  File "/usr/lib/python2.7/site-packages/otopi/context.py", line 492, in dumpEnvironment
    value = common.toStr(self.environment[key])
  File "/usr/lib/python2.7/site-packages/otopi/common.py", line 88, in toStr
    return toStr(o.__unicode__())
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1224, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1578, in __request
    verbose=self.__verbose
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1264, in request
    return self.single_request(host, handler, request_body, verbose)
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1297, in single_request
    return self.parse_response(response)
  File "/usr/lib/python2.7/site-packages/vdsm/vdscli.py", line 43, in wrapped_parse_response
    return old_parse_response(*args, **kwargs)
  File "/usr/lib64/python2.7/xmlrpclib.py", line 1473, in parse_response
    return u.close()
  File "/usr/lib64/python2.7/xmlrpclib.py", line 793, in close
    raise Fault(**self._stack[0])
Fault: <Fault 1: '<type \'exceptions.Exception\'>:method "__unicode__" is not supported'>
It's logging the fault only with OTOPI_DEBUG=1, otherwise is silently quitting:
it would be better to find a way to have it logged also non in debug mode.
I suspect that it has been introduced with this one:
https://gerrit.ovirt.org/#/c/45990
witch tries __unicode__() before __repr__()
Works for me on these components: # rpm -qa vdsm libvirt-client sanlock qemu-kvm-rhev mom ovirt* ovirt-vmconsole-host-1.0.1-0.0.master.20151105234454.git3e5d52e.el7.noarch ovirt-release36-002-2.noarch ovirt-engine-sdk-python-3.6.0.4-0.2.20151123.gita2f81ed.el7.centos.noarch sanlock-3.2.4-1.el7.x86_64 ovirt-setup-lib-1.0.1-0.0.master.20151119123055.gitfa908be.el7.centos.noarch qemu-kvm-rhev-2.3.0-31.el7_2.3.x86_64 ovirt-hosted-engine-ha-1.3.3-0.0.master.20151118145556.20151118145552.git71b535e.el7.noarch ovirt-vmconsole-1.0.1-0.0.master.20151105234454.git3e5d52e.el7.noarch ovirt-release36-snapshot-002-2.noarch libvirt-client-1.2.17-13.el7.x86_64 ovirt-hosted-engine-setup-1.3.1-0.0.master.20151118143825.gitc013638.el7.centos.noarch ovirt-host-deploy-1.4.2-0.0.master.20151122153544.gitfc808fc.el7.noarch vdsm-4.17.10.1-0.el7ev.noarch mom-0.5.1-2.el7.noarch Deployed 2 hosts over NFS SD. Since oVirt 3.6.0 has been released, moving from verified to closed current release. | 
Description of problem: I've manged to get a 3.6.0-pre self-hosted engine running on one Centos 7.1 host (glusterfs-backed) but on the second host it the deploy process just quits with no error before I even get to configure the storage Version-Release number of selected component (if applicable): 1.3.0-0.0.master.20150909090214.git794400d.el7.centos How reproducible: Always Steps to Reproduce: 1. Set up clean Centos 7.1 host 2. Install glusterfs 3.7 repos 3. Install ovirt-3.6 and ovirt-3.6 snapshot repos 4. Install ovirt-hosted-engine-setup 5. Run hosted-engine --deploy Actual results: Install just quits: [root@olympia ~]# hosted-engine --deploy [ INFO ] Stage: Initializing [ INFO ] Generating a temporary VNC password. [ INFO ] Stage: Environment setup Continuing will configure this host for serving as hypervisor and create a VM where you have to install oVirt Engine afterwards. Are you sure you want to continue? (Yes, No)[Yes]: yes Configuration files: [] Log file: /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20150912174714-4jr1ru.log Version: otopi-1.4.0_master (otopi-1.4.0-0.0.master.20150910103318.gitdd73099.el7) [ INFO ] Hardware supports virtualization [ INFO ] Stage: Environment packages setup [ INFO ] Stage: Programs detection [ INFO ] Stage: Environment setup [root@olympia ~]# Expected results: Install continues to ask required questions and sets up second host for engine. Additional info: Tail end of setup log: 2015-09-12 17:15:12 DEBUG otopi.context context.dumpEnvironment:510 ENV COMMAND/reboot=str:'/sbin/reboot' 2015-09-12 17:15:12 DEBUG otopi.context context.dumpEnvironment:510 ENV COMMAND/remote-viewer=str:'/bin/remote-viewer' 2015-09-12 17:15:12 DEBUG otopi.context context.dumpEnvironment:510 ENV COMMAND/service=str:'/sbin/service' 2015-09-12 17:15:12 DEBUG otopi.context context.dumpEnvironment:510 ENV COMMAND/sshd=str:'/sbin/sshd' 2015-09-12 17:15:12 DEBUG otopi.context context.dumpEnvironment:510 ENV COMMAND/sudo=str:'/bin/sudo' 2015-09-12 17:15:12 DEBUG otopi.context context.dumpEnvironment:510 ENV COMMAND/systemctl=str:'/bin/systemctl' 2015-09-12 17:15:12 DEBUG otopi.context context.dumpEnvironment:510 ENV COMMAND/truncate=str:'/bin/truncate' 2015-09-12 17:15:12 DEBUG otopi.context context.dumpEnvironment:510 ENV COMMAND/umount=str:'/bin/umount' 2015-09-12 17:15:12 DEBUG otopi.context context.dumpEnvironment:510 ENV COMMAND/vdsm-tool=str:'/bin/vdsm-tool' 2015-09-12 17:15:12 DEBUG otopi.context context.dumpEnvironment:514 ENVIRONMENT DUMP - END 2015-09-12 17:15:12 DEBUG otopi.context context._executeMethod:142 Stage programs METHOD otopi.plugins.otopi.services.systemd.Plugin._prog rams 2015-09-12 17:15:12 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:828 execute: ('/bin/systemctl', 'show-environment'), exec utable='None', cwd='None', env=None 2015-09-12 17:15:12 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:878 execute-result: ('/bin/systemctl', 'show-environment' ), rc=0 2015-09-12 17:15:12 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:936 execute-output: ('/bin/systemctl', 'show-environment') stdout: PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin LANG=en_GB.UTF-8 2015-09-12 17:15:12 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:941 execute-output: ('/bin/systemctl', 'show-environment') stderr: 2015-09-12 17:15:12 DEBUG otopi.plugins.otopi.services.systemd systemd._programs:64 registering systemd provider 2015-09-12 17:15:12 DEBUG otopi.context context._executeMethod:142 Stage programs METHOD otopi.plugins.otopi.services.rhel.Plugin._programs 2015-09-12 17:15:12 DEBUG otopi.plugins.otopi.services.rhel plugin.executeRaw:828 execute: ('/bin/systemctl', 'show-environment'), executable='None', cwd='None', env=None 2015-09-12 17:15:12 DEBUG otopi.plugins.otopi.services.rhel plugin.executeRaw:878 execute-result: ('/bin/systemctl', 'show-environment'), rc=0 2015-09-12 17:15:12 DEBUG otopi.plugins.otopi.services.rhel plugin.execute:936 execute-output: ('/bin/systemctl', 'show-environment') stdout: PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin LANG=en_GB.UTF-8 2015-09-12 17:15:12 DEBUG otopi.plugins.otopi.services.rhel plugin.execute:941 execute-output: ('/bin/systemctl', 'show-environment') stderr: 2015-09-12 17:15:12 DEBUG otopi.context context._executeMethod:142 Stage programs METHOD otopi.plugins.otopi.services.openrc.Plugin._programs 2015-09-12 17:15:12 DEBUG otopi.context context._executeMethod:142 Stage programs METHOD otopi.plugins.ovirt_hosted_engine_setup.ha.ha_services.Plugin._programs 2015-09-12 17:15:12 DEBUG otopi.plugins.otopi.services.systemd systemd.status:105 check service ovirt-ha-agent status 2015-09-12 17:15:12 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:828 execute: ('/bin/systemctl', 'status', 'ovirt-ha-agent.service'), executable='None', cwd='None', env=None 2015-09-12 17:15:12 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:878 execute-result: ('/bin/systemctl', 'status', 'ovirt-ha-agent.service'), rc=3 2015-09-12 17:15:12 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:936 execute-output: ('/bin/systemctl', 'status', 'ovirt-ha-agent.service') stdout: ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring Agent Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; disabled) Active: inactive (dead) 2015-09-12 17:15:12 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:941 execute-output: ('/bin/systemctl', 'status', 'ovirt-ha-agent.service') stderr: 2015-09-12 17:15:12 DEBUG otopi.plugins.otopi.services.systemd systemd.status:105 check service ovirt-ha-broker status 2015-09-12 17:15:12 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:828 execute: ('/bin/systemctl', 'status', 'ovirt-ha-broker.service'), executable='None', cwd='None', env=None 2015-09-12 17:15:12 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:878 execute-result: ('/bin/systemctl', 'status', 'ovirt-ha-broker.service'), rc=3 2015-09-12 17:15:12 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:936 execute-output: ('/bin/systemctl', 'status', 'ovirt-ha-broker.service') stdout: ovirt-ha-broker.service - oVirt Hosted Engine High Availability Communications Broker Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-broker.service; disabled) Active: inactive (dead) 2015-09-12 17:15:12 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:941 execute-output: ('/bin/systemctl', 'status', 'ovirt-ha-broker.service') stderr: 2015-09-12 17:15:12 INFO otopi.context context.runSequence:427 Stage: Environment setup 2015-09-12 17:15:12 DEBUG otopi.context context.runSequence:431 STAGE late_setup 2015-09-12 17:15:12 DEBUG otopi.context context._executeMethod:142 Stage late_setup METHOD otopi.plugins.ovirt_hosted_engine_setup.vdsmd.vdsmconf.Plugin._late_setup 2015-09-12 17:15:12 DEBUG otopi.context context._executeMethod:142 Stage late_setup METHOD otopi.plugins.ovirt_hosted_engine_setup.system.vdsmenv.Plugin._late_setup 2015-09-12 17:15:12 DEBUG otopi.plugins.otopi.services.systemd systemd.status:105 check service vdsmd status 2015-09-12 17:15:12 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:828 execute: ('/bin/systemctl', 'status', 'vdsmd.service'), executable='None', cwd='None', env=None 2015-09-12 17:15:12 DEBUG otopi.plugins.otopi.services.systemd plugin.executeRaw:878 execute-result: ('/bin/systemctl', 'status', 'vdsmd.service'), rc=0 2015-09-12 17:15:12 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:936 execute-output: ('/bin/systemctl', 'status', 'vdsmd.service') stdout: vdsmd.service - Virtual Desktop Server Manager Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled) Active: active (running) since Sat 2015-09-12 17:13:53 BST; 1min 19s ago Process: 3140 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start (code=exited, status=0/SUCCESS) Main PID: 3234 (vdsm) CGroup: /system.slice/vdsmd.service └─3234 /usr/bin/python /usr/share/vdsm/vdsm Sep 12 17:13:55 olympia python[3234]: DIGEST-MD5 client step 1 Sep 12 17:13:55 olympia python[3234]: DIGEST-MD5 ask_user_info() Sep 12 17:13:55 olympia python[3234]: DIGEST-MD5 client step 1 Sep 12 17:13:55 olympia python[3234]: DIGEST-MD5 ask_user_info() Sep 12 17:13:55 olympia python[3234]: DIGEST-MD5 make_client_response() Sep 12 17:13:55 olympia python[3234]: DIGEST-MD5 client step 2 Sep 12 17:13:55 olympia python[3234]: DIGEST-MD5 parse_server_challenge() Sep 12 17:13:55 olympia python[3234]: DIGEST-MD5 ask_user_info() Sep 12 17:13:55 olympia python[3234]: DIGEST-MD5 make_client_response() Sep 12 17:13:55 olympia python[3234]: DIGEST-MD5 client step 3 2015-09-12 17:15:12 DEBUG otopi.plugins.otopi.services.systemd plugin.execute:941 execute-output: ('/bin/systemctl', 'status', 'vdsmd.service') stderr: 2015-09-12 17:15:12 DEBUG otopi.plugins.ovirt_hosted_engine_setup.system.vdsmenv vdsmenv._connect:62 {'status': {'message': 'Done', 'code': 0}, 'info': {'systemProductName': 'X9DR3-F', 'systemSerialNumber': '0123456789', 'systemFamily': 'To be filled by O.E.M.', 'systemVersion': '0123456789', 'systemUUID': '00000000-0000-0000-0000-002590E5DAD8', 'systemManufacturer': 'Supermicro'}}