Bug 1188230 - F21 guest loses network connection after 30 min if open-vm-tools installed
Summary: F21 guest loses network connection after 30 min if open-vm-tools installed
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: open-vm-tools
Version: 21
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Ravindra Kumar
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-02-02 11:28 UTC by David Keegan
Modified: 2015-02-05 14:23 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-02-04 14:53:22 UTC
Type: Bug
Embargoed:


Attachments
Logfiles for approx 1 hour from reboot to loss of network. (98.86 KB, application/x-gzip)
2015-02-03 16:43 UTC, David Keegan

Description David Keegan 2015-02-02 11:28:29 UTC
Description of problem:

The LXDE spin of F21 is installed in an ESXi VM. About 30 minutes after it boots, it loses network connectivity, specifically because the default route is removed from the routing table. Sometimes the DHCP-assigned IPv4 address is lost as well. There was no such problem with F19.

How reproducible:

Every time.

Steps to Reproduce:
1. Install LXDE spin of Fedora in VM.
2. Power on the VM.

Actual results:
VM network works for about 30 min, then the default route is removed from the routing table.

Expected results:
Network connectivity should persist indefinitely.


Additional info:
Can usually get the network working again by restarting NetworkManager.
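
For anyone hitting the same symptom, here is a minimal shell sketch of the check-and-restart workaround described above. The function names are made up for illustration, and it assumes the `ip` and `systemctl` commands are available and that the restart is run as root. It only papers over the symptom; it does not address the cause.

```shell
#!/bin/sh
# Sketch only: detect the symptom from this report (the default route
# disappearing) and apply the reporter's workaround.

# Succeeds if the routing-table text on stdin contains a default route.
has_default_route() {
    grep -q '^default '
}

# Hypothetical helper: restart NetworkManager when no IPv4 default route
# is present. Must be run as root.
restore_network_if_needed() {
    if ip -4 route show | has_default_route; then
        echo "default route present"
    else
        echo "default route missing, restarting NetworkManager"
        systemctl restart NetworkManager
    fi
}
```

Calling `restore_network_if_needed` periodically (for example from a cron job) would be one way to use it.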

Comment 1 Richard W.M. Jones 2015-02-02 11:32:29 UTC
I notice you commented already on bug 1184173.  What makes you
think there is a problem caused by open-vm-tools?

Comment 2 David Keegan 2015-02-02 11:40:21 UTC
Sorry, I forgot to mention: if I disable the vmtools daemon with "sudo systemctl disable vmtoolsd", the network connectivity stays up (for at least 12 hours).

Comment 3 Ravindra Kumar 2015-02-02 19:40:41 UTC
AFAIK, there is no 30min timer in open-vm-tools. This needs some investigation. Could you please turn on vmtoolsd logs and collect them?

You will have to create the /etc/vmware-tools/tools.conf file in the guest with the following contents:

# Start
[logging]
vmtoolsd.level=debug
vmtoolsd.handler=file
vmtoolsd.data=/tmp/vmtoolsd.log

vmsvc.level=debug
vmsvc.handler=file
vmsvc.data=/tmp/vmsvc.log
# End

Comment 4 Ravindra Kumar 2015-02-02 21:28:04 UTC
Along with vmtoolsd logs, it will greatly help if you could collect vmware.log as well for the VM.

While supplying logs, please share the timezone differences between host and guest if there are any.

Comment 5 David Keegan 2015-02-03 08:22:59 UTC
> AFAIK, there is no 30min timer in open-vm-tools. This needs some investigation.
> Could you please turn on vmtoolsd logs and collect?

The DHCP lease seems to be 1 hour, and I thought the 30m might be related to that.
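
As a worked example of why a 1 hour lease could line up with a 30 minute symptom: under the DHCP defaults in RFC 2131, a client first tries to renew at T1 = 0.5 × lease time and rebinds at T2 = 0.875 × lease time. The sketch below just does that arithmetic. It assumes the DHCP server here leaves T1 and T2 at their defaults, which the thread does not confirm, and the function name is hypothetical.

```shell
#!/bin/sh
# Sketch: default DHCP renewal (T1) and rebinding (T2) times for a lease,
# using the RFC 2131 defaults of 0.5 and 0.875 of the lease duration.
# Argument: lease time in seconds.
dhcp_timers() {
    lease=$1
    t1=$((lease / 2))
    t2=$((lease * 7 / 8))
    echo "T1=${t1}s T2=${t2}s"
}
```

For a 3600 second lease, `dhcp_timers 3600` gives a first renewal attempt at 1800 seconds, i.e. 30 minutes after the lease was obtained.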

I will collect logfiles today.

Comment 6 David Keegan 2015-02-03 16:43:03 UTC
Created attachment 987683
Logfiles for approx 1 hour from reboot to loss of network.

The timezone seems to be UTC on the ESXi host and is GMT on the VM (which should be the same as UTC this time of year).

Comment 7 David Keegan 2015-02-03 17:05:40 UTC
The clock on the ESXi host is set approximately 2 hours ahead of the correct time of day. The VM has the chrony NTP daemon enabled, so the VM gets set to the correct time after a while. You can see the VM time being adjusted by substantial amounts in the messages file.

The error in the host time setting is a factor because if I adjust the host to the correct time the problem goes away, or at least doesn't occur as quickly.
However, my Fedora 19 VM is running on the same host and doesn't have this problem. Could the changes in the VM clock affect DHCP and cause it to mess up the routing table?

I'm not sure whether or not open-vm-tools is adjusting the time of day in the VM, or how to configure that. Can I control that via /etc/tools.conf in the VM? If so is there any documentation covering /etc/tools.conf? The vmware documentation mentions a tools configuration utility, but that doesn't seem to exist in open-vm-tools.

Edit settings options for the VM in vSphere client shows some settings for vmtools. The script settings are greyed out in the F20 VM, but not in the F19 VM.
Also the time sync option is unset in both, but I'm not sure if that is effective.

Comment 8 Ravindra Kumar 2015-02-04 00:26:30 UTC
Thanks for the logs David.

(In reply to David Keegan from comment #7)
> The error in the host time setting is a factor because if I adjust the host
> to the correct time the problem goes away, or at least doesn't occur as
> quickly.

I also don't see anything obvious in the Tools logs apart from time moving backwards. timesync is a plugin in open-vm-tools and is responsible for updating the guest clock.

> However my Fedora 19 VM is running on the same host and doesn't have this
> problem. Could the changes in the VM clock affect DHCP and cause it to mess
> up the routing table.

I think it is very likely the clock. I think we can try a couple of things:

1. Renew the lease after the guest boots and vmtoolsd has made the required time adjustments
-OR-
2. Power off the VM, disable timesync, and power it on again

In either case, you should be able to verify that the issue is gone once you do this.

> I'm not sure whether or not open-vm-tools is adjusting the time of day in
> the VM, or how to configure that.

open-vm-tools does adjust the guest clock at specific events. Please refer to KB article https://kb.vmware.com/kb/1189 for more details, including the procedure to disable it.

> Can I control that via /etc/tools.conf in the VM? If so is there any documentation covering /etc/tools.conf?

No.

> The vmware documentation mentions a tools configuration utility, but that
> doesn't seem to exist in open-vm-tools.

vmware-toolbox-cmd does exist in open-vm-tools.
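
A rough sketch of checking and turning off periodic time sync from inside the guest with vmware-toolbox-cmd follows. It assumes `timesync status` prints a line starting with "Enabled" or "Disabled"; `TOOLBOX_CMD` and the function name are hypothetical, added only so the command can be swapped out. This controls periodic sync only. The one-off adjustments at specific events are governed by the settings in the KB article.

```shell
#!/bin/sh
# Assumption: vmware-toolbox-cmd is on PATH. TOOLBOX_CMD is a hypothetical
# override, not something open-vm-tools itself reads.
TOOLBOX_CMD=${TOOLBOX_CMD:-vmware-toolbox-cmd}

# Disable periodic time sync if it is reported as enabled, then print
# the resulting status.
ensure_timesync_disabled() {
    status=$("$TOOLBOX_CMD" timesync status 2>/dev/null || true)
    case "$status" in
        Enabled*) "$TOOLBOX_CMD" timesync disable ;;
    esac
    "$TOOLBOX_CMD" timesync status || true
}
```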

> Edit settings options for the VM in vSphere client shows some settings for
> vmtools. The script settings are greyed out in the F20 VM, but not in the
> F19 VM.

This is unrelated, but if you power off the VM you should be able to modify those settings. It's not F19 vs F20; I believe your F19 VM might be powered off.

> Also the time sync option is unset in both, but I'm not sure if that is
> effective.

Please refer to https://kb.vmware.com/kb/1189 to disable time sync.

Comment 9 Ravindra Kumar 2015-02-04 00:32:33 UTC
I'm also curious to know why your host clock is not configured correctly? Do you really intend to run the host with misconfigured clocks? I think fixing the host clock might be the solution for the problem once you have tried the options I mentioned above. Or, completely disable the timesync using steps in the KB article because you have NTP configured for the VM.
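
For reference, a sketch of what completely disabling timesync looks like in practice. The option names below are recalled from the KB article rather than taken from this thread, so treat them as an assumption and check them against the article before relying on them. The function name is hypothetical; it only prints the lines, which would then be added to the VM's .vmx configuration while the VM is powered off.

```shell
#!/bin/sh
# Sketch: print the .vmx options that, as far as I recall from the KB
# article, turn off both periodic and event-driven time sync.
# Verify the list against the article before use.
print_timesync_vmx_settings() {
    cat <<'EOF'
tools.syncTime = "0"
time.synchronize.continue = "0"
time.synchronize.restore = "0"
time.synchronize.resume.disk = "0"
time.synchronize.shrink = "0"
time.synchronize.tools.startup = "0"
time.synchronize.tools.enable = "0"
time.synchronize.resume.host = "0"
EOF
}
```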

Comment 10 David Keegan 2015-02-04 08:55:13 UTC
Ravindra, many thanks for your suggestions. I was completely stuck with this issue.

I agree our host should be set to the correct time of day. However I'm concerned if I just do that it will defer the loss of network and not eliminate it. I think the clock error may be exposing a bug in F20 versus F19, not necessarily a bug in open-vm-tools, but perhaps in the DHCP infrastructure. Losing network connectivity is a heavy price to pay for a time-of-day error.

I agree it looks like the time changes are the cause of the problem, and since I don't need that feature of open-vm-tools I'll disable timesync as per your instructions and see what happens.

Comment 11 David Keegan 2015-02-04 14:53:22 UTC
I followed your instructions to disable all time syncing in open-vm-tools. The time changes are no longer visible in /var/log/messages and the network stays up, so this is a viable workaround for the issue.

However when I export my VM to ovf/ova the time sync parameters are not included in the .ovf, so a VM deployed from the export will revert to time sync enabled and bad behaviour. Therefore I have reluctantly decided that the only option is to remove open-vm-tools entirely as I need to deploy this VM to multiple hosts in different environments and I can't rely on all target ESXi hosts having the clock set correctly.

I am closing this as I don't think this is really an open-vm-tools issue. Thanks again for your help.

Comment 12 Ravindra Kumar 2015-02-04 21:50:09 UTC
If you really want to get rid of timesync from the host, a not-so-clean workaround is to remove the /usr/libXX/open-vm-tools/plugins/vmsvc/libtimeSync.so plugin file from the guest and restart the vmtoolsd service.
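
A minimal sketch of that plugin workaround, for illustration only. Because the library directory differs between systems, it searches for the plugin instead of hard-coding the path, and it moves the file to a backup directory rather than deleting it so the change can be undone. The function names, `BACKUP_DIR`, and its default location are all hypothetical. A package update of open-vm-tools may put the file back.

```shell
#!/bin/sh
# Locate the timeSync plugin under a given root (default: /usr).
find_timesync_plugin() {
    find "${1:-/usr}" -path '*/open-vm-tools/plugins/vmsvc/libtimeSync.so' 2>/dev/null
}

# Move the plugin out of the plugins directory so it is no longer loaded.
# Must be run as root. BACKUP_DIR is a hypothetical override.
disable_timesync_plugin() {
    backup=${BACKUP_DIR:-/root/open-vm-tools-disabled}
    mkdir -p "$backup"
    find_timesync_plugin "$@" | while IFS= read -r plugin; do
        mv -- "$plugin" "$backup/" && echo "moved $plugin to $backup"
    done
}
```

After running `disable_timesync_plugin` as root, restart the service with `systemctl restart vmtoolsd` so the change takes effect.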

BTW, here is the correct URL for the KB, http://kb.vmware.com/kb/1189.

Comment 13 David Keegan 2015-02-05 14:23:17 UTC
Thanks for the workaround of removing the plugin. That works for me and allows me to keep open-vm-tools installed.

