Description of problem:

I upgraded my RHEL host from vdsm-4.14.18-4.el6ev to vdsm-4.16.13.1-1.el6ev. After the upgrade, the rhevm logical network disappeared. Checking the config files, the network had been written out with ONBOOT=no and DEFROUTE=no:

~~~
# cat /etc/sysconfig/network-scripts/ifcfg-rhevm
# Generated by VDSM version 4.16.13.1-1.el6ev
DEVICE=rhevm
TYPE=Bridge
DELAY=0
STP=off
ONBOOT=no
BOOTPROTO=dhcp
MTU=1500
DEFROUTE=no
NM_CONTROLLED=no
HOTPLUG=no
~~~

~~~
# cat /var/lib/vdsm/persistence/netconf/nets/rhevm
{"nic": "eth0", "mtu": 1500, "bootproto": "dhcp", "stp": false, "bridged": true, "defaultRoute": false}
~~~

After modifying the persistence settings and running the restore script, the network came back up:

~~~
# vim /var/lib/vdsm/persistence/netconf/nets/rhevm
# rm -f /var/run/vdsm/nets_restored
# /usr/share/vdsm/vdsm-restore-net-config
~~~

I will attach some logs now.
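For reference, the manual edit to the persistence file boils down to flipping the defaultRoute flag in that JSON blob. A minimal sketch (assuming, as in my case, that defaultRoute was the only key that needed changing; fix_default_route is just an illustrative helper name):

```python
import json

def fix_default_route(conf_text):
    """Return a vdsm net persistence JSON blob with defaultRoute set to true."""
    conf = json.loads(conf_text)
    conf["defaultRoute"] = True
    return json.dumps(conf)

# The persistence content from this report, before the fix:
before = ('{"nic": "eth0", "mtu": 1500, "bootproto": "dhcp", '
          '"stp": false, "bridged": true, "defaultRoute": false}')
after = fix_default_route(before)
```

After writing the result back to /var/lib/vdsm/persistence/netconf/nets/rhevm, removing /var/run/vdsm/nets_restored and rerunning vdsm-restore-net-config restored the network as shown above.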
Created attachment 1026060 [details] supervdsm.log
The upgrade happened somewhere after 2015-05-14 16:32:23.
Any chance you can provide the output of `route -n` and `ip route show table all` from prior to the upgrade? /var/log/vdsm/upgrade.log is a must for debugging upgrade-related issues.
Created attachment 1026764 [details] ip route show table all
~~~
0.0.0.0 10.10.183.254 0.0.0.0 UG 0 0 0 rhevm
~~~

seems perfect. The lack of upgrade.log is most disturbing. Have you ever seen an upgraded 3.5 setup lacking it?

/var/log/messages-20150517 has disturbing entries. Could it be that vdsm was upgraded while it was running? Was the host put into maintenance beforehand? We try hard to keep that working, but it is not the recommended way.

~~~
May 14 17:01:59 cisco-b200m3-01 abrt: detected unhandled Python exception in '/usr/share/vdsm/supervdsmServer'
May 14 17:01:59 cisco-b200m3-01 abrtd: New client connected
May 14 17:01:59 cisco-b200m3-01 abrtd: Directory 'pyhook-2015-05-14-17:01:59-28780' creation detected
May 14 17:01:59 cisco-b200m3-01 abrt-server[28786]: Saved Python crash dump of pid 28780 to /var/spool/abrt/pyhook-2015-05-14-17:01:59-28780
May 14 17:01:59 cisco-b200m3-01 respawn: slave '/usr/share/vdsm/supervdsmServer --sockfile /var/run/vdsm/svdsm.sock --pidfile /var/run/vdsm/supervdsmd.pid' died too quickly, respawning slave
~~~

and many repetitions of:

~~~
May 15 16:22:09 cisco-b200m3-01 kernel: Neighbour table overflow.
~~~
The host was in maintenance, of course. However, I could not get to it until long after the upgrade, and I could not reach its console (UCS blades...), so I had to reboot it (using power management). Once it came back, I could ssh into it but could not reach the storage. That is how I discovered it did not have a default route.
Please reopen if you have a reproduction environment.