Bug 1128033 - [vdsm-reg el7]rhevh7.0 for rhev 3.5 build register to RHEV-M 3.5 failed
Summary: [vdsm-reg el7]rhevh7.0 for rhev 3.5 build register to RHEV-M 3.5 failed
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-host-deploy
Classification: oVirt
Component: Plugins.node
Version: 1.2.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 1.3.0
Assignee: Douglas Schilling Landgraf
QA Contact: yeylon@redhat.com
URL:
Whiteboard: node
Duplicates: 1144279 (view as bug list)
Depends On:
Blocks: 1083138 rhevh-7.0 rhev35betablocker rhev35rcblocker rhev35gablocker
 
Reported: 2014-08-08 06:59 UTC by haiyang,dong
Modified: 2016-04-18 06:59 UTC
25 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-02-12 14:03:22 UTC
oVirt Team: Node
Embargoed:
pm-rhel: blocker+
fdeutsch: devel_ack+


Attachments
attached Screenshot the output of journalctl -u vdsmd.service -b (14.23 KB, text/plain)
2014-08-08 08:09 UTC, haiyang,dong
no flags Details
attached engine.log (32.48 KB, text/plain)
2014-09-05 08:41 UTC, haiyang,dong
no flags Details
host deploy logs (259.94 KB, text/plain)
2014-10-19 10:39 UTC, Ranjith Rajaram
no flags Details


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 33090 0 master MERGED persist.py: use persist from utils when possible Never
oVirt gerrit 33462 0 master MERGED node: fixup broken I4b24131 code to properly import modules Never

Description haiyang,dong 2014-08-08 06:59:37 UTC
Description of problem:
After installing rhev-h, configuring the network, and registering to rhevm vt1.5, rhev-h reported that
the registration succeeded. But the NIC is still named "em1" (it was not renamed to "rhevm"), and in fact
the host did not register: it is not displayed in the RHEV-M web admin portal.

Also, there is no vdsm-reg.log in /var/log/vdsm-reg/.

The vdsmd service fails to start:
-------------------------
[root@dhcp-9-33 admin]# service vdsmd status
Redirecting to /bin/systemctl status  vdsmd.service
vdsmd.service - Virtual Desktop Server Manager
   Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled)
   Active: failed (Result: start-limit) since Fri 2014-08-08 05:54:16 UTC; 1min 35s ago
  Process: 5911 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start (code=exited, status=1/FAILURE)

Aug 08 05:54:16 dhcp-9-33.nay.redhat.com systemd[1]: vdsmd.service holdoff time over, scheduling restart.
Aug 08 05:54:16 dhcp-9-33.nay.redhat.com systemd[1]: Stopping Virtual Desktop Server Manager...
Aug 08 05:54:16 dhcp-9-33.nay.redhat.com systemd[1]: Starting Virtual Desktop Server Manager...
Aug 08 05:54:16 dhcp-9-33.nay.redhat.com systemd[1]: vdsmd.service start request repeated too quickly, refusing t...art.
Aug 08 05:54:16 dhcp-9-33.nay.redhat.com systemd[1]: Failed to start Virtual Desktop Server Manager.
Aug 08 05:54:16 dhcp-9-33.nay.redhat.com systemd[1]: Unit vdsmd.service entered failed state.
Hint: Some lines were ellipsized, use -l to show in full.


Version-Release number of selected component (if applicable):
rhev-hypervisor7-7.0-20140807.0.iso
ovirt-node-3.1.0-0.6.20140731git2c8e71f.el7.noarch
vdsm-reg-4.16.0-1.el7.noarch
vdsm-4.16.0-1.el7.x86_64
ovirt-node-plugin-vdsm-0.2.0-3.el7.noarch
Red Hat Enterprise Virtualization Manager Version: 3.5.0-0.6.master.el6_5

How reproducible:
100%

Steps to Reproduce:
1. Install rhev-hypervisor7-7.0-20140807.0.iso
2. Configure network for rhev-h
3. Register to rhevm vt1.5.


Actual results:
After step 3, registering to rhevm vt1.5 fails.

Expected results:
Registering rhev-h to RHEV-M vt1.5 succeeds.

Additional info:

Comment 1 Fabian Deutsch 2014-08-08 07:46:05 UTC
Haiyang, can you please provide the output of
journalctl -u vdsmd.service -b

And some vdsm-related logs.

Comment 2 haiyang,dong 2014-08-08 08:09:29 UTC
Created attachment 925116 [details]
attached Screenshot the output of journalctl -u vdsmd.service -b

Comment 4 Fabian Deutsch 2014-08-08 10:12:25 UTC
Toni, could you take a look at the traceback in comment 2?

It looks like augeas fails to load libpython2.7.

But if I run
$ python
>>> import augeas
>>> augeas.Augeas()
<augeas-Augeas object at …>

Then everything works, so I wonder whether vdsm-tool is doing something with paths.
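
For reference, a minimal environment-comparison sketch (illustrative only; the script name is arbitrary and it uses nothing beyond the standard os/sys modules and the python-augeas binding shown above, not any vdsm-tool internals). Running it both from a plain shell and from the environment vdsm-tool executes in should show whether the interpreter or library paths differ:

# check_augeas_env.py - report where the augeas binding and interpreter load from
import os
import sys

print("executable: %s" % sys.executable)
print("PYTHONPATH: %s" % os.environ.get("PYTHONPATH"))
print("LD_LIBRARY_PATH: %s" % os.environ.get("LD_LIBRARY_PATH"))

try:
    import augeas
    print("augeas module: %s" % augeas.__file__)
    augeas.Augeas()  # raises if libaugeas or libpython cannot be loaded
    print("Augeas() instantiated fine")
except Exception as exc:
    print("augeas failed: %s" % exc)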

Comment 5 Douglas Schilling Landgraf 2014-08-19 02:46:43 UTC
(In reply to haiyang,dong from comment #0)
> Description of problem:
> After Installed rhev-h,then configure network and 
> register to rhevm vt1.5 , Although rhev-h shown that register into rhevm
> success
> But the nic still named "em1", didn't rename "rhevm",and in fact didn't
> register.
> Couldn't display rhevh host in rhevm web admin portal.
> 
> Also no vdsm-reg.log in /var/log/vdsm-reg/
> 
> service vdsmd start failed
> -------------------------
> [root@dhcp-9-33 admin]# service vdsmd status
> Redirecting to /bin/systemctl status  vdsmd.service
> vdsmd.service - Virtual Desktop Server Manager
>    Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled)
>    Active: failed (Result: start-limit) since Fri 2014-08-08 05:54:16 UTC;
> 1min 35s ago
>   Process: 5911 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh
> --pre-start (code=exited, status=1/FAILURE)
> 
> Aug 08 05:54:16 dhcp-9-33.nay.redhat.com systemd[1]: vdsmd.service holdoff
> time over, scheduling restart.
> Aug 08 05:54:16 dhcp-9-33.nay.redhat.com systemd[1]: Stopping Virtual
> Desktop Server Manager...
> Aug 08 05:54:16 dhcp-9-33.nay.redhat.com systemd[1]: Starting Virtual
> Desktop Server Manager...
> Aug 08 05:54:16 dhcp-9-33.nay.redhat.com systemd[1]: vdsmd.service start
> request repeated too quickly, refusing t...art.
> Aug 08 05:54:16 dhcp-9-33.nay.redhat.com systemd[1]: Failed to start Virtual
> Desktop Server Manager.
> Aug 08 05:54:16 dhcp-9-33.nay.redhat.com systemd[1]: Unit vdsmd.service
> entered failed state.
> Hint: Some lines were ellipsized, use -l to show in full.
> 
> 
> Version-Release number of selected component (if applicable):
> rhev-hypervisor7-7.0-20140807.0.iso
> ovirt-node-3.1.0-0.6.20140731git2c8e71f.el7.noarch
> vdsm-reg-4.16.0-1.el7.noarch
> vdsm-4.16.0-1.el7.x86_64
> ovirt-node-plugin-vdsm-0.2.0-3.el7.noarch
> Red Hat Enterprise Virtualization Manager Version: 3.5.0-0.6.master.el6_5
> 
> How reproducible:
> 100%
> 
> Steps to Reproduce:
> 1. Install rhev-hypervisor7-7.0-20140807.0.iso
> 2. Configure network for rhev-h
> 3. Register to rhevm vt1.5.
> 
> 
> Actual results:
> After step3. Register to  rhevm vt1.5 failed.
> 
> Expected results:
> Supporting that register rhev-h to RHEV-M vt1.5
> 
> Additional info:

Hi haiyang,dong,

Can you please attach /var/log/vdsm/supervdsm.log and /var/log/vdsm/vdsm.log?
I would like to confirm whether you received the message:

"netconfpersistence::134::root::(_getConfigs) Non-existing 
config set."

rhev-hypervisor7-7.0-20140807.0.iso
ovirt-node-3.1.0-0.6.20140731git2c8e71f.el7.noarch
vdsm-reg-4.16.0-1.el7.noarch
vdsm-4.16.0-1.el7.x86_64
ovirt-node-plugin-vdsm-0.2.0-3.el7.noarch

Dan/Toni, is this a known issue?

Comment 6 haiyang,dong 2014-08-19 03:37:43 UTC
(In reply to Douglas Schilling Landgraf from comment #5)
> 
> Hi haiyang,dong,
> 
> Can you please attach /var/log/vdsm/supervdsm.log and /var/log/vdsm/vdsm.log?
> I would like to confirm if you did receive the messsage:
> 
> "netconfpersistence::134::root::(_getConfigs) Non-existing 
> config set."

I got the same error info as yours from /var/log/vdsm/supervdsm.log
[root@dhcp-10-75 admin]# cat /var/log/vdsm/supervdsm.log 
MainThread::DEBUG::2014-08-19 03:20:40,681::netconfpersistence::134::root::(_getConfigs) Non-existing config set.
MainThread::DEBUG::2014-08-19 03:20:40,681::netconfpersistence::134::root::(_getConfigs) Non-existing config set.
....

No log info from /var/log/vdsm/vdsm.log
[root@dhcp-10-75 admin]# cat /var/log/vdsm/vdsm.log 
[root@dhcp-10-75 admin]# 
> 
> rhev-hypervisor7-7.0-20140807.0.iso
> ovirt-node-3.1.0-0.6.20140731git2c8e71f.el7.noarch
> vdsm-reg-4.16.0-1.el7.noarch
> vdsm-4.16.0-1.el7.x86_64
> ovirt-node-plugin-vdsm-0.2.0-3.el7.noarch
>

Comment 7 Dan Kenigsberg 2014-08-20 13:32:12 UTC
vdsm-4.16.0-1 is built on top of upstream v4.16.0-17-ge6a84e1 which is known to be broken due to /var/log/vdsm/connectivity.log ownership and other bugs.

This could cause the failure to start Vdsm automatically.

Please retry with a vdsm build that includes v4.16.1-31-gfaf9b14 (currently no such el7 build is available).

Comment 8 Douglas Schilling Landgraf 2014-08-20 13:38:43 UTC
(In reply to Dan Kenigsberg from comment #7)
> vdsm-4.16.0-1 is built on top of upstream v4.16.0-17-ge6a84e1 which is known
> to be broken due to /var/log/vdsm/connectivity.log ownership and other bugs.
> 
> This could cause the failure to start Vdsm automatically.
> 
> Please retry with a vdsm build that includes v4.16.1-31-gfaf9b14 (currently
> no such el7 build is available).

Thanks Dan! Eyal, please let us know when we have a new build of VDSM for EL7, so we can create a new RHEV-H image.

Comment 9 haiyang,dong 2014-09-05 08:41:08 UTC
Created attachment 934703 [details]
attached engine.log

Test version:
ovirt-node-3.1.0-0.10.20140904gitb828c37.el7.noarch
rhev-hypervisor7-7.0-20140904.0.el7ev
vdsm-4.16.3-2.el7.x86_64
vdsm-reg-4.16.3-2.el7.noarch
ovirt-node-plugin-vdsm-0.2.0-3.el7.noarch
Red Hat Enterprise Virtualization Manager Version: 3.5.0-0.10.master.el6ev

After installing rhev-h, configuring the network, and registering to rhevm vt2.2, rhev-h reported that
the registration succeeded: the NIC is now named "rhevm", the vdsmd service starts successfully, and the
rhevh host is displayed in the RHEV-M web admin portal.

But approving the rhevh host (bringing it Up) failed with the following errors:

2014-Sep-05, 16:06  Host cshao-test installation failed. Command returned failure code 1 during SSH session 'root.9.232'.
2014-Sep-05, 16:06  Installing Host cshao-test. Stage: Termination.
2014-Sep-05, 16:06  Installing Host cshao-test. Retrieving installation logs to: '/var/log/ovirt-engine/host-deploy/ovirt-20140905160616-10.66.9.232-135db263.log'.
2014-Sep-05, 16:06  Installing Host cshao-test. Stage: Pre-termination.
2014-Sep-05, 16:06  Failed to install Host cshao-test. Failed to execute stage 'Closing up': could not import gobject (could not find _PyGObject_API object).
2014-Sep-05, 16:05  Installing Host cshao-test. Starting vdsm.
...

See the attached engine.log for details.

So the bug needs to be re-assigned.

Comment 10 Douglas Schilling Landgraf 2014-09-05 20:48:34 UTC
Hello hadong,

I believe this is a different issue, and it would be better to open a separate bugzilla so we do not mix them. Can you please attach the following data to the bugzilla:

- /var/log/vdsm/ and /var/log/vdsm-reg logs from node
- Output of vdsClient -s 0 getVdsCaps from the node
- host-deploy logs (/var/log/ovirt-engine/host-deploy on rhev-m)

I will try to reproduce it locally.

Thanks!
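
For anyone collecting the same data, a minimal collection sketch (illustrative only; the log paths and the vdsClient invocation are exactly the ones requested above, while the script and archive names are arbitrary):

# collect_node_debug.py - bundle the requested logs and getVdsCaps output
import os
import subprocess
import tarfile

ARCHIVE = "/tmp/node-debug.tar.gz"   # arbitrary destination
CAPS_FILE = "/tmp/getVdsCaps.txt"
LOG_DIRS = ["/var/log/vdsm", "/var/log/vdsm-reg"]

# Capture the capabilities report; continue even if vdsm is down.
with open(CAPS_FILE, "w") as out:
    subprocess.call(["vdsClient", "-s", "0", "getVdsCaps"], stdout=out, stderr=out)

with tarfile.open(ARCHIVE, "w:gz") as tar:
    for path in LOG_DIRS + [CAPS_FILE]:
        if os.path.exists(path):
            tar.add(path)

print("wrote %s" % ARCHIVE)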

Comment 11 Douglas Schilling Landgraf 2014-09-07 03:30:29 UTC
Ok, I got a reproducer:

2014-09-07 02:43:21 DEBUG otopi.context context._executeMethod:138 Stage closeup METHOD otopi.plugins.otopi.system.reboot.Plugin._closeup
2014-09-07 02:43:21 DEBUG otopi.context context._executeMethod:144 condition False
2014-09-07 02:43:21 DEBUG otopi.context context._executeMethod:138 Stage closeup METHOD otopi.plugins.ovirt_host_deploy.node.persist.Plugin._closeup
2014-09-07 02:43:21 DEBUG otopi.context context._executeMethod:152 method exception
Traceback (most recent call last):
  File "/tmp/ovirt-f8x53hYLji/pythonlib/otopi/context.py", line 142, in _executeMethod
    method['method']()
  File "/tmp/ovirt-f8x53hYLji/otopi-plugins/ovirt-host-deploy/node/persist.py", line 51, in _closeup
    from ovirtnode import ovirtfunctions
  File "/usr/lib/python2.7/site-packages/ovirtnode/ovirtfunctions.py", line 34, in <module>
ImportError: could not import gobject (could not find _PyGObject_API object)
2014-09-07 02:43:21 ERROR otopi.context context._executeMethod:161 Failed to execute stage 'Closing up': could not import gobject (could not find _PyGObject_API object)
2014-09-07 02:43:21 DEBUG otopi.context context.dumpEnvironment:468 ENVIRONMENT DUMP - BEGIN
2014-09-07 02:43:21 DEBUG otopi.context context.dumpEnvironment:478 ENV BASE/error=bool:'True'
2014-09-07 02:43:21 DEBUG otopi.context context.dumpEnvironment:478 ENV BASE/exceptionInfo=list:'[(<type 'exceptions.ImportError'>, ImportError('could not import gobject (could not find _PyGObject_API object)',), <traceback object at 0x7f19c9090ab8>)]'
2014-09-07 02:43:21 DEBUG otopi.context context.dumpEnvironment:482 ENVIRONMENT DUMP - END
2014-09-07 02:43:21 INFO otopi.context context.runSequence:395 Stage: Pre-termination
2014-09-07 02:43:21 DEBUG otopi.context context.runSequence:399 STAGE pre-terminate
2014-09-07 02:43:21 DEBUG otopi.context context._executeMethod:138 Stage pre-terminate METHOD otopi.plugins.otopi.core.misc.Plugin._preTerminate
2014-09-07 02:43:21 DEBUG otopi.context context.dumpEnvironment:468 ENVIRONMENT DUMP - BEGIN
2014-09-07 02:43:21 DEBUG otopi.context context.dumpEnvironment:478 ENV BASE/aborted=bool:'False'
2014-09-07 02:43:21 DEBUG otopi.context context.dumpEnvironment:478 ENV BASE/debug=int:'0'
2014-09-07 02:43:21 DEBUG otopi.context context.dumpEnvironment:478 ENV BASE/error=bool:'True'
2014-09-07 02:43:21 DEBUG otopi.context context.dumpEnvironment:478 ENV BASE/exceptionInfo=list:'[(<type 'exceptions.ImportError'>, ImportError('could not import gobject (could not find _PyGObject_API object)',), <traceback object at 0x7f19c9090ab8>)]'
2014-09-07 02:43:21 DEBUG otopi.context context.dumpEnvironment:478 ENV BASE/executionDirectory=str:'/root'
2014-09-07 02:43:21 DEBUG otopi.context context.dumpEnvironment:478 ENV BASE/log=bool:'True'
2014-09-07 02:43:21 DEBUG otopi.context context.dumpEnvironment:478 ENV BASE/pluginGroups=str:'otopi:ovirt-host-deploy'

Comment 12 Douglas Schilling Landgraf 2014-09-08 02:37:52 UTC
Hi,

Just to share that I have been investigating this. The statements below all succeed when run on the node:

- 'import gobject'
- 'from ovirtnode import ovirtfunctions'
- sending and executing the above statements in a script over ssh (a minimal version is sketched below)
- rpm -Va and the logs don't show anything helpful at the moment
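
A minimal version of such a check script (illustrative only; it just runs the two imports listed above and reports which, if any, fails):

# check_imports.py - run the same imports that host-deploy performs
for statement in ("import gobject", "from ovirtnode import ovirtfunctions"):
    try:
        exec(statement)
        print("%s: OK" % statement)
    except Exception as exc:
        print("%s: FAILED (%s)" % (statement, exc))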

Comment 13 Ying Cui 2014-09-18 12:41:54 UTC
Hey Douglas, since this bug is a testblocker, what is its status now? It is blocking further test cases. Thanks.

Comment 14 Douglas Schilling Landgraf 2014-09-20 20:50:13 UTC
(In reply to Ying Cui from comment #13)
> Hey Douglas, as this bug is testblocker, how it status now? It blocked more
> test cases testing. Thanks.

Hi Ying,

Sorry about the blocker; this bug is hard to track down. I have tried different approaches, but the one that worked is moving the code away from ovirtfunctions and onto the new ovirt-node API for persisting files. I have sent a patch for review.
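
In rough terms, the fix makes host-deploy persist files through the newer ovirt-node utilities instead of importing the legacy ovirtfunctions module (whose import of gobject is what fails in comment 11). A minimal sketch of that idea follows; the fallback structure and function names are illustrative assumptions, and the actual change is in the gerrit patches linked above (33090/33462):

# Sketch: prefer the newer ovirt-node persistence API, and only fall back to
# the legacy ovirtfunctions module (which pulls in gobject) if it is missing.
def persist_file(path):
    try:
        from ovirt.node.utils import fs          # assumed newer node API
        fs.Config().persist(path)
    except ImportError:
        from ovirtnode import ovirtfunctions     # legacy path; imports gobject
        ovirtfunctions.ovirt_store_config(path)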

Comment 15 Ying Cui 2014-09-23 07:08:01 UTC
"Testonly" was set while this bug was against the ovirt-node component; removing it now, because a patch is ready (so this is not test-only) and the bug has moved to the ovirt-host-deploy component.

Comment 17 Fabian Deutsch 2014-10-08 10:36:01 UTC
Looking at comment 11 and considering what we saw over the last weeks, I believe that the import error is a side effect of SELinux issues.


Ying, can this be reproduced with enforcing=0?

Comment 18 haiyang,dong 2014-10-08 11:44:04 UTC
(In reply to Fabian Deutsch from comment #17)
> Looking at comment 11 and considering what we saw over the last weeks, I
> believe that the import error is a side effect of SELinux issues.
> 
> 
> Ying, can this be reproduced with enforcing=0?

Test version:
rhev-hypervisor7-7.0-20141006.0.el7ev.noarch
ovirt-node-3.1.0-0.20.20141006gitc421e04.el7.noarch
ovirt-node-plugin-vdsm-0.2.0-9.el7.noarch
vdsm-reg-4.16.6-1.el7.noarch
vdsm-4.16.6-1.el7.x86_64


The comment 9 test result is still reproduced, with or without enforcing=0.

So the bug needs to be re-assigned again.

Comment 19 haiyang,dong 2014-10-08 11:54:12 UTC
*** Bug 1144279 has been marked as a duplicate of this bug. ***

Comment 20 haiyang,dong 2014-10-09 09:18:37 UTC
(In reply to haiyang,dong from comment #18)
> (In reply to Fabian Deutsch from comment #17)
> > Looking at comment 11 and considering what we saw over the last weeks, I
> > believe that the import error is a side effect of SELinux issues.
> > 
> > 
> > Ying, can this be reproduced with enforcing=0?
> 
> Test version:
> rhev-hypervisor7-7.0-20141006.0.el7ev.noarch
> ovirt-node-3.1.0-0.20.20141006gitc421e04.el7.noarch
> ovirt-node-plugin-vdsm-0.2.0-9.el7.noarch
> vdsm-reg-4.16.6-1.el7.noarch
> vdsm-4.16.6-1.el7.x86_64
> 
> 
> still reproduced Comment 9 test result with or without enforcing=0.
> 
> so need to re-assigned it again.

After re-deploying the rhevm vt5 environment, the comment 9 issue could not be reproduced again,
so this bug has been fixed in the following versions:
rhev-hypervisor7-7.0-20141006.0.el7ev.noarch
ovirt-node-3.1.0-0.20.20141006gitc421e04.el7.noarch
ovirt-node-plugin-vdsm-0.2.0-9.el7.noarch
vdsm-reg-4.16.6-1.el7.noarch
vdsm-4.16.6-1.el7.x86_64

Once the status moves to "ON_QA", I will verify it.

Comment 21 Ying Cui 2014-10-09 10:04:31 UTC
Could RHEV QE give qa_ack+ on this bug, or should I do it instead? I am not sure here.

Comment 23 haiyang,dong 2014-10-09 10:47:14 UTC
According to comment 20, verifying this bug.

Comment 24 Ranjith Rajaram 2014-10-19 10:39:33 UTC
Created attachment 948220 [details]
host deploy logs

Comment 28 Fabian Deutsch 2015-02-12 14:03:22 UTC
RHEV 3.5.0 has been released. I am closing this bug, because it has been VERIFIED.

Comment 29 Fabian Deutsch 2015-04-24 06:46:27 UTC
*** Bug 1214158 has been marked as a duplicate of this bug. ***

