Bug 1229227 - Auto register rhevh6 to rhevm3.5.3 failed using management_server=$rhevm_ip:443 during the auto-install
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-node
Version: 3.5.3
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: ovirt-3.6.0-rc3
Target Release: 3.6.0
Assignee: Yaniv Bronhaim
QA Contact: wanghui
URL:
Whiteboard:
Depends On: 1203422 1233059 1249396 1249397
Blocks: 1240288
Reported: 2015-06-08 10:05 UTC by wanghui
Modified: 2016-03-09 14:28 UTC

Fixed In Version: ovirt-node-3.3.0-0.13.20151008git03eefb5.el7ev.noarch
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1240288 (view as bug list)
Environment:
Last Closed: 2016-03-09 14:28:47 UTC
oVirt Team: Node
Target Upstream Version:
Embargoed:


Attachments
/var/log files (108.43 KB, application/x-gzip)
2015-06-08 10:05 UTC, wanghui
vdsm logs (15.17 KB, application/x-gzip)
2015-06-08 18:46 UTC, Douglas Schilling Landgraf


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1235350 0 high CLOSED Keep the upstart libvirtd file to make the flow similar to RHEL 2021-02-22 00:41:40 UTC
Red Hat Bugzilla 1235591 0 high CLOSED HE deployment fails due to libvirtError: internal error client socket is closed 2021-02-22 00:41:40 UTC
Red Hat Product Errata RHBA-2016:0378 0 normal SHIPPED_LIVE ovirt-node bug fix and enhancement update for RHEV 3.6 2016-03-09 19:06:36 UTC

Internal Links: 1235350 1235591

Description wanghui 2015-06-08 10:05:50 UTC
Created attachment 1036278
/var/log files

Description of problem:
Auto registration to rhevm3.5.3 fails when management_server=$rhevm_ip:443 is used during the auto-install. The rhevh loses all network connectivity and cannot come up in rhevm3.5.3. The issue can be reproduced by waiting a few minutes before logging in to the rhevh.

Version-Release number of selected component (if applicable):
rhev-hypervisor6-6.6-20150603.0
ovirt-node-3.2.3-3.el6.noarch
ovirt-node-plugin-vdsm-0.2.0-24.el6ev.noarch
Red Hat Enterprise Virtualization Manager Version: 3.5.3.1-1.4.el6ev

How reproducible:
50%

Steps to Reproduce:
1. Auto install rhev-hypervisor6-6.6-20150603.0 with the following parameters.
   BOOTIF=eth2 storage_init=/dev/sda management_server=10.8.51.171:443 adminpw=4DHc2Jl0D05xk firstboot
2. After the installation finishes, wait 5 minutes before logging in to the rhevh.
3. Log in to the rhevh and check the IP address.
4. Bring the rhevh up in rhevm3.5.3.
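
For anyone scripting this reproduction, the boot parameters in step 1 can be split into key/value pairs as sketched below. This is only an illustration in Python 3: the helper names `parse_boot_params` and `split_management_server` are hypothetical and are not part of ovirt-node, and the fallback port of 443 is an assumption. IPv6 addresses are not handled.

```python
def parse_boot_params(cmdline):
    """Split a kernel-style command line into a dict; bare flags map to True."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else True
    return params


def split_management_server(value, default_port=443):
    """Split 'host:port' into (host, port). The default port is an assumption."""
    host, sep, port = value.partition(":")
    return host, int(port) if sep else default_port


# Parameters copied from step 1 of this report.
cmdline = ("BOOTIF=eth2 storage_init=/dev/sda "
           "management_server=10.8.51.171:443 adminpw=4DHc2Jl0D05xk firstboot")
params = parse_boot_params(cmdline)
host, port = split_management_server(params["management_server"])
```

Here `params["BOOTIF"]` holds the interface named on the command line, which is the one to look for when checking the ifcfg files in step 3.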

Actual results:
1. After step 3, the rhevh has lost its network connection. There are no configuration files for eth2 or rhevm.
2. The rhevh cannot come up in rhevm3.5.3.

Expected results:
1. After step 3, the rhevh should have an IP address on the rhevm bridge.
2. After step 4, the rhevh should come up in rhevm3.5.3.

Additional info:

Comment 2 Douglas Schilling Landgraf 2015-06-08 18:45:42 UTC
I can reproduce the report.

Few points:

0) Registration happens, RHEV-H is available in RHEV-M to approve. So the network, was available to communicate with Engine befo
1) The netconf link is broken:
   # ls -la /var/lib/vdsm/persistence
   netconf -> /var/lib/vdsm/persistence/netconf.1433775746165103281

2) There is no ifcfg-eth0 or ifcfg-rhevm in:
   /etc/sysconfig/network-scripts/ 

3) virsh # net-list --all

Name        State     Autostart    Persistent
;vdsmdummy; activate  no           no
default     inactive  no           yes
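
Checks 1) and 2) above could be automated roughly as follows. This is a minimal Python 3 sketch, not code from vdsm: the function names are hypothetical, and only the two paths are taken from this comment.

```python
import os


def is_dangling_symlink(path):
    """True if path is a symlink whose target does not exist."""
    # os.path.exists() follows symlinks, so it is False for a broken link.
    return os.path.islink(path) and not os.path.exists(path)


def missing_ifcfg(names, scripts_dir="/etc/sysconfig/network-scripts"):
    """Return the device names that have no ifcfg-<name> file in scripts_dir."""
    return [name for name in names
            if not os.path.isfile(os.path.join(scripts_dir, "ifcfg-" + name))]
```

On an affected host, `is_dangling_symlink("/var/lib/vdsm/persistence/netconf")` would be expected to return True and `missing_ifcfg(["eth0", "rhevm"])` to list both names.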


Some logs from supervdsm.log
================================
sourceRoute::WARNING::2015-06-08 15:02:14,364::utils::129::root::(rmFile) File: /var/run/vdsm/trackedInterfaces/eth0 already removed
sourceRoute::DEBUG::2015-06-08 15:02:14,364::sourceroutethread::39::root::(process_IN_CLOSE_WRITE_filePath) Responding to DHCP response in /var/run/vdsm/sourceRoutes/1433775683
sourceRoute::INFO::2015-06-08 15:02:14,365::sourceroutethread::60::root::(process_IN_CLOSE_WRITE_filePath) interface eth0 is not a libvirt interface
sourceRoute::WARNING::2015-06-08 15:02:14,365::utils::129::root::(rmFile) File: /var/run/vdsm/trackedInterfaces/eth0 already removed
sourceRoute::DEBUG::2015-06-08 15:02:14,365::sourceroutethread::39::root::(process_IN_CLOSE_WRITE_filePath) Responding to DHCP response in /var/run/vdsm/sourceRoutes/1433775688
sourceRoute::INFO::2015-06-08 15:02:14,365::sourceroute::166::root::(remove) Removing gateway - device: rhevm
sourceRoute::DEBUG::2015-06-08 15:02:14,365::utils::739::root::(execCmd) /sbin/ip rule (cwd None)
sourceRoute::DEBUG::2015-06-08 15:02:14,378::utils::759::root::(execCmd) SUCCESS: <err> = ''; <rc> = 0
sourceRoute::ERROR::2015-06-08 15:02:14,378::sourceroute::153::root::(_getRules) Routing rules not found for device rhevm
sourceRoute::DEBUG::2015-06-08 15:02:14,378::sourceroutethread::39::root::(process_IN_CLOSE_WRITE_filePath) Responding to DHCP response in /var/run/vdsm/sourceRoutes/1433775690
sourceRoute::INFO::2015-06-08 15:02:14,379::sourceroutethread::60::root::(process_IN_CLOSE_WRITE_filePath) interface rhevm is not a libvirt interface
sourceRoute::WARNING::2015-06-08 15:02:14,379::utils::129::root::(rmFile) File: /var/run/vdsm/trackedInterfaces/rhevm already removed
sourceRoute::DEBUG::2015-06-08 15:02:14,379::sourceroutethread::39::root::(process_IN_CLOSE_WRITE_filePath) Responding to DHCP response in /var/run/vdsm/sourceRoutes/1433775693
sourceRoute::INFO::2015-06-08 15:02:14,380::sourceroutethread::60::root::(process_IN_CLOSE_WRITE_filePath) interface rhevm is not a libvirt interface
sourceRoute::WARNING::2015-06-08 15:02:14,380::utils::129::root::(rmFile) File: /var/run/vdsm/trackedInterfaces/rhevm already removed
MainThread::DEBUG::2015-06-08 15:02:28,668::netconfpersistence::134::root::(_getConfigs) Non-existing config set.
MainThread::DEBUG::2015-06-08 15:02:28,668::netconfpersistence::134::root::(_getConfigs) Non-existing config set.
MainThread::DEBUG::2015-06-08 15:02:28,668::vdsm-restore-net-config::60::root::(unified_restoration) Removing all networks ({}) and bonds ({}) in running config.
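
To pick the WARNING and ERROR entries out of a larger supervdsm.log, the lines can be split on the `::` separator seen above. The sketch below is Python 3 and infers the field layout only from the sample lines in this comment; it is not an official description of the vdsm log format, and `LogRecord` and `parse_line` are hypothetical names.

```python
import re
from collections import namedtuple

LogRecord = namedtuple(
    "LogRecord", "thread level timestamp module lineno logger func message")

# Matches an optional "(funcName) " prefix at the start of the message part.
_FUNC_RE = re.compile(r"^\((\w+)\)\s*(.*)$", re.S)


def parse_line(line):
    """Parse one log line; return None if it does not fit the assumed layout."""
    parts = line.rstrip("\n").split("::", 6)
    if len(parts) != 7 or not parts[4].isdigit():
        return None
    thread, level, timestamp, module, lineno, logger, rest = parts
    match = _FUNC_RE.match(rest)
    func, message = (match.group(1), match.group(2)) if match else (None, rest)
    return LogRecord(thread, level, timestamp, module, int(lineno),
                     logger, func, message)


def problems(lines):
    """Yield only the records logged at WARNING or ERROR level."""
    for line in lines:
        record = parse_line(line)
        if record and record.level in ("WARNING", "ERROR"):
            yield record
```

Lines that do not match the layout, such as the `====` separator, are skipped rather than raising an error.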

Comment 3 Douglas Schilling Landgraf 2015-06-08 18:46:20 UTC
Created attachment 1036455
vdsm logs

Comment 6 Ying Cui 2015-06-25 12:18:30 UTC
Note that this bug affects rhevh 6.6 for 3.5.3. If it also affects RHEVH 6.7 for rhev 3.5.4, let's consider it a blocker. Thanks.

Comment 7 wanghui 2015-06-30 09:03:15 UTC
The same issue still occurs on rhev-hypervisor6-6.7-20150609.0.iso.

Comment 8 Ido Barkan 2015-06-30 10:22:30 UTC
Looking at the logs, this does not seem like VDSM's fault. At least not the network part of it. So I can already say that this will probably not be solved in 3.5.4.

But I do see that vdsm-reg is failing when trying to create a bridge. It fails because libvirt is down. Last time I saw this on rhev-h, libvirt refused to come up if there were no interfaces with an IP to bind to (not sure if lo was enough for it).

I think vdsm-reg tried to connect to the engine even though the bridge creation failed.

Douglas, can you please take a look at /var/log/vdsm-reg/vdsm-reg.log?

Comment 9 Fabian Deutsch 2015-06-30 10:42:07 UTC
Nice findings. Maybe bug 1235350 and the vdsm part will help to improve this.

But maybe Douglas will also find another reason why libvirtd does not come up.

Comment 10 Yaniv Lavi 2015-06-30 11:38:49 UTC
(In reply to Ido Barkan from comment #8)
> looking at the logs, this does not seem like VDSM's fault. At least not the
> network part of it. So I can already say that this will probably not be
> solved in 3.5.4.
> 
> But I do see that vdsm-reg is failing when trying to create a bridge. It
> fails because libvirt is down. Last time I saw this on rhev-h, libvirt
> refused to go up if there were no interfaces with IP to bind to (not sure if
> lo was enough for it).
> 
> I think vdsm-reg tried to connect to the engine although the bridge creation
> failed.
> 
> Douglas can you please take a look at /var/log/vdsm-reg/vdsm-reg.log ?

Could the ONBOOT=no caused by the other bug affect this?

Comment 11 Dan Kenigsberg 2015-06-30 11:45:07 UTC
MainThread::DEBUG::2015-06-08 09:25:40,055::deployUtil::453::root::_getMGTIface IP=10.8.51.171 strIface=em1
MainThread::DEBUG::2015-06-08 09:25:40,056::deployUtil::1059::root::makeBridge found the following bridge paramaters: ['BOOTPROTO=dhcp', 'IPV6INIT=no', 'IPV6_AUTOCONF=no', 'ONBOOT=yes', 'PEERNTP=yes']
MainThread::DEBUG::2015-06-08 09:25:40,057::deployUtil::140::root::['/usr/share/vdsm/addNetwork', 'rhevm', '', '', 'em1', 'BOOTPROTO=dhcp', 'IPV6INIT=no', 'IPV6_AUTOCONF=no', 'ONBOOT=yes', 'PEERNTP=yes', 'blockingdhcp=true']
MainThread::DEBUG::2015-06-08 09:25:50,799::deployUtil::149::root::
MainThread::DEBUG::2015-06-08 09:25:50,803::deployUtil::150::root::libvirt: XML-RPC error : Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory

To me it smells like another manifestation of bug 1235591.
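
The failure in the last log line comes down to the libvirt socket not existing when addNetwork runs. A quick way to test for that condition is sketched below in Python 3. The socket path is the one from the log; the function name `socket_present` is hypothetical and this is not how vdsm-reg itself checks for libvirt.

```python
import os
import stat

# Path taken from the vdsm-reg log line above.
LIBVIRT_SOCK = "/var/run/libvirt/libvirt-sock"


def socket_present(path=LIBVIRT_SOCK):
    """True if path exists and is a UNIX domain socket."""
    try:
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        # Missing file or missing parent directory: libvirtd is not listening.
        return False
```

A True result only shows that the socket file exists; it does not prove that libvirtd is accepting connections on it.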


(In reply to Yaniv Dary from comment #10)
> > Douglas can you please take a look at /var/log/vdsm-reg/vdsm-reg.log ?
> 
> Can the ONBOOT=no due to the other bug affect this?

It does not seem related at all - the error above takes place before vdsm has the chance to write ifcfg files.

Comment 12 Fabian Deutsch 2015-06-30 13:59:29 UTC
The nasty thing about bug 1235591 is that we cannot reproduce it anymore, and thus no fix was introduced for it.

Comment 13 Yaniv Bronhaim 2015-07-02 08:29:42 UTC
Now, with the libvirtd upstart script, if libvirt crashes on el6 it respawns quickly, so that might fix it. Crashes are never intentional, though, and we need to figure out why it happened, but we can't proceed without seeing the issue and understanding why libvirtd stopped. Again, I would suggest enabling the libvirt debug log to check this once we are able to reproduce it. I don't see the problem with the latest image I checked.

Closing this bug. If this issue is raised again please re-open quickly.
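
If libvirtd is respawned after a crash, a caller that needs the socket could poll for it for a bounded time instead of failing on the first attempt. The following is only a Python 3 sketch of that idea under stated assumptions: `wait_for_path`, the timeout and the polling interval are hypothetical, and nothing here is taken from the actual fix.

```python
import os
import time


def wait_for_path(path, timeout=30.0, interval=0.5):
    """Poll until path exists or timeout seconds pass; return True if it appeared."""
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

For example, `wait_for_path("/var/run/libvirt/libvirt-sock")` would give a respawned libvirtd up to 30 seconds to recreate its socket. Waiting like this can hide the underlying crash, so it does not replace collecting the libvirt debug log.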

Comment 14 Yaniv Lavi 2015-07-02 09:06:04 UTC
Moving to ON_QA to make sure this is tested.

Comment 15 Yaniv Lavi 2015-07-02 09:07:56 UTC
Please provide acks, clone and move to ON_QA for testing.

Comment 16 Ying Cui 2015-07-02 09:12:51 UTC
This bug affects rhevh6.

Comment 22 wanghui 2015-10-27 08:14:08 UTC
Test version:
rhev-hypervisor7-7.2-20151025.0.el7ev
ovirt-node-3.3.0-0.18.20151022git82dc52c.el7ev.noarch
Red Hat Enterprise Virtualization Manager Version: 3.6.0-0.18.el6

Test steps:
1. Auto install rhev-hypervisor6-6.6-20150603.0 with the following parameters.
   BOOTIF=em1 storage_init=/dev/sda management_server=10.8.51.171:443 adminpw=4DHc2Jl0D05xk firstboot
2. After the installation finishes, wait 5 minutes before logging in to the rhevh.
3. Log in to the rhevh and check the IP address.
4. Bring the rhevh up in rhevm3.6.0.

Test result:
1. After step 4, the rhevh can come up in rhevm3.6.0.

So this issue is fixed in ovirt-node-3.3.0-0.18.20151022git82dc52c.el7ev.noarch. Changing the status to VERIFIED.

Comment 24 errata-xmlrpc 2016-03-09 14:28:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0378.html

