Description of problem:
Almost all VMs on one hypervisor in our internal RHEV cluster are generating the message "The Balloon driver on VM ... on host ... is requested but unavailable."

Version-Release number of selected component (if applicable):
# hypervisor: RHEL 6.5.z
mom-0.4.0-1.el6ev.noarch
vdsm-4.14.7-3.el6ev.x86_64
# RHEV-M:
Version 3.3.3-0.52.el6ev
# Guest: Fedora 20
ovirt-guest-agent-common-1.0.9-1.fc20.noarch
kernel-3.13.10-200.fc20.x86_64

How reproducible:
It has been happening for two days now.

Steps to Reproduce:
? It happens on our internal cluster.

Actual results:
The error message 'The Balloon driver on VM ... on host ... is requested but unavailable.' is shown.

Expected results:
The error message should not be there.

Additional info:
Information from one of the affected guests:

vm-233 ~]# lsmod | grep virtio
virtio_console         23843  1
virtio_net             28024  0
virtio_balloon         13530  0
virtio_blk             17972  3
virtio_pci             17677  0
virtio_ring            19975  5 virtio_blk,virtio_net,virtio_pci,virtio_balloon,virtio_console
virtio                 14172  5 virtio_blk,virtio_net,virtio_pci,virtio_balloon,virtio_console

vm-233 ~]# service ovirt-guest-agent status
Redirecting to /bin/systemctl status ovirt-guest-agent.service
ovirt-guest-agent.service - oVirt Guest Agent
   Loaded: loaded (/usr/lib/systemd/system/ovirt-guest-agent.service; enabled)
   Active: active (running) since Út 2014-07-01 10:47:47 CEST; 2 weeks 1 days ago
 Main PID: 456 (python)
   CGroup: /system.slice/ovirt-guest-agent.service
           └─456 /usr/bin/python /usr/share/ovirt-guest-agent/ovirt-guest-agent.py

I'm attaching logs from the affected hypervisor. List of compressed files follows:
log/getAllVmStats.log
log/getVdsCapabilities.log
log/getVdsStats.log
log/mom.log
log/mom.log.1
log/mom.log.2
log/vdsm.log
log/vdsm.log.1
log/vdsm.log.10
log/vdsm.log.11
log/vdsm.log.12
log/vdsm.log.2
log/vdsm.log.3
log/vdsm.log.4
log/vdsm.log.5
log/vdsm.log.6
log/vdsm.log.7
log/vdsm.log.8
log/vdsm.log.9
log/webadmin.log
log/webadmin-log.pdf
As far as I can see, all affected VMs have the "Memory Balloon Device Enabled" checkbox enabled. The interesting thing is that this didn't happen in the last two weeks for some reason. Maybe VDSM/the hypervisors were restarted in the meantime...
From the logs it seems that the balloon works fine (the memory changes) and the problem is in the check in the engine code:

    if (isBalloonDeviceActiveOnVm(vmInternalData)
            && (Objects.equals(balloonInfo.getCurrentMemory(), balloonInfo.getBalloonMaxMemory())
                || !Objects.equals(balloonInfo.getCurrentMemory(), balloonInfo.getBalloonTargetMemory()))) {
        vmBalloonDriverIsRequestedAndUnavailable(vmId);

getCurrentMemory() and getBalloonTargetMemory() return *almost* the same number (probably because of some memory alignment or rounding error), so the balloon works, but the condition fails because it requires the numbers to be exactly the same. We can add some allowed difference to the check so it is not so strict, but still verifies that the balloon works (the memory changes).
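A tolerance-based comparison along those lines could look like the sketch below. The class name, method name, and the 1 MiB threshold are illustrative assumptions, not the actual engine patch:

```java
// Sketch of a tolerance-based balloon check. The names and the 1 MiB
// threshold are illustrative assumptions, not the actual engine patch.
public class BalloonCheck {

    // Allowed difference (in KiB) between current and target balloon size,
    // to tolerate guest-side alignment/rounding.
    private static final long TOLERANCE_KB = 1024;

    // Returns true when the current memory is close enough to the target,
    // i.e. the balloon is reacting even if the values are not identical.
    public static boolean balloonReachedTarget(long currentKb, long targetKb) {
        return Math.abs(currentKb - targetKb) <= TOLERANCE_KB;
    }

    public static void main(String[] args) {
        // Odd target: the guest settles 1 KiB away from the requested value.
        System.out.println(balloonReachedTarget(1048576L, 1048577L)); // true
        // A large gap still indicates the balloon driver is not responding.
        System.out.println(balloonReachedTarget(1048576L, 2097152L)); // false
    }
}
```

With such a check, a small alignment difference would no longer trigger the "requested but unavailable" warning, while a balloon that ignores its target entirely would still be flagged.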
Can you please provide verification steps for how to simulate this?
(In reply to Lukas Svaty from comment #5)
> Can you provide any verification steps please how to simulate this?

Just try to set an odd value (not divisible by 2) as the balloon target. The balloon should get a different (but close enough) amount of memory, and there should be no warning.
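The rounding effect behind this verification step can be sketched as follows, assuming the guest balloon driver can only inflate or deflate in whole 4 KiB pages (the names here are illustrative, not engine code):

```java
// Sketch: why an odd KiB balloon target never matches exactly, assuming
// the guest balloon driver works in whole 4 KiB pages (illustrative only).
public class BalloonRounding {

    static final long PAGE_KB = 4;

    // The guest can only adjust memory in page multiples, so the achieved
    // size is the target rounded down to a page boundary.
    public static long achievedKb(long targetKb) {
        return (targetKb / PAGE_KB) * PAGE_KB;
    }

    public static void main(String[] args) {
        long target = 1048577L;                  // odd KiB target
        long achieved = achievedKb(target);
        System.out.println(achieved);            // 1048576
        System.out.println(achieved == target);  // false: strict equality fails
    }
}
```

So with an odd target the strict equality check in the engine can never pass, even though the balloon did move to the closest achievable size.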
Tested multiple times on av13.4; it seems to be working. If this bug reappears, feel free to reopen it.
I can confirm having the exact same problem with RHEVH 3.4
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-0158.html