Bug 1002360 - virtio_balloon virtio2: Out of puff! Can't get 256 pages
Status: CLOSED NOTABUG
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.5
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Luiz Capitulino
QA Contact: Virtualization Bugs
Depends On:
Blocks:
Reported: 2013-08-28 22:16 EDT by Sibiao Luo
Modified: 2014-01-06 09:15 EST
CC List: 12 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-01-06 09:15:41 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Sibiao Luo 2013-08-28 22:16:12 EDT
Description of problem:
Boot a RHEL guest with 4G of memory and a virtio-balloon device, and check that used memory is about 442M. Then balloon the guest down to 600M: the guest kernel log is flooded with the message "virtio_balloon virtio2: Out of puff! Can't get 256 pages".

Version-Release number of selected component (if applicable):
host info:
# uname -r && rpm -q qemu-kvm
2.6.32-413.el6.x86_64
qemu-kvm-0.12.1.2-2.398.el6.x86_64
guest info:
2.6.32-414.el6.x86_64

How reproducible:
always

Steps to Reproduce:
1. Boot a RHEL guest with 4G of memory and a virtio-balloon device.
# /usr/libexec/qemu-kvm -M rhel6.5.0 -cpu SandyBridge -enable-kvm -m 4096 -smp 4,sockets=2,cores=2,threads=1...-device virtio-balloon-pci,id=ballooning,bus=pci.0,addr=0x5...
2. Check used memory after the guest boots.
# free -m
             total       used       free     shared    buffers     cached
Mem:          3830        442       3388          0         29        141
-/+ buffers/cache:        271       3559
Swap:         4095          0       4095
3. Balloon the guest down to 600M.
(qemu) info balloon 
balloon: actual=4096
(qemu) balloon 1000
(qemu) balloon 600
(qemu) info balloon 
balloon: actual=600

{"execute":"qmp_capabilities"}
{"return": {}}

{"timestamp": {"seconds": 1377688003, "microseconds": 627581}, "event": "BALLOON_CHANGE", "data": {"actual": 4293918720}}
{"timestamp": {"seconds": 1377688004, "microseconds": 320211}, "event": "BALLOON_CHANGE", "data": {"actual": 1048576000}}
{"execute": "query-balloon"}
{"return": {"actual": 1048576000}}

{"timestamp": {"seconds": 1377688019, "microseconds": 710245}, "event": "BALLOON_CHANGE", "data": {"actual": 1047527424}}
{"timestamp": {"seconds": 1377688020, "microseconds": 643593}, "event": "BALLOON_CHANGE", "data": {"actual": 653672448}}
{"timestamp": {"seconds": 1377688021, "microseconds": 668561}, "event": "BALLOON_CHANGE", "data": {"actual": 639987712}}
{"timestamp": {"seconds": 1377688022, "microseconds": 684533}, "event": "BALLOON_CHANGE", "data": {"actual": 629145600}}

{"execute": "query-balloon"}
{"return": {"actual": 629145600}}
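
The monitor command takes megabytes while the QMP events report bytes, so the two views are easier to compare after a conversion. The following is a small sketch that converts the "actual" values copied from the BALLOON_CHANGE events above; the helper name to_mib is made up for illustration.

```python
MIB = 1024 * 1024

# "actual" values copied from the BALLOON_CHANGE events above, in bytes.
actual_bytes = [
    4293918720,
    1048576000,
    1047527424,
    653672448,
    639987712,
    629145600,
]


def to_mib(n_bytes):
    """Convert a byte count to mebibytes (hypothetical helper)."""
    return n_bytes / MIB


for value in actual_bytes:
    print(f"{value:>12} bytes = {to_mib(value):8.2f} MiB")
```

The last value converts to exactly 600 MiB, which matches the "balloon 600" request and the "actual=600" reported by "info balloon".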

Actual results:
After step 3, the guest does balloon down to 600M, but the guest kernel log is flooded with the message "virtio_balloon virtio2: Out of puff! Can't get 256 pages".
# nc -U /tmp/ttyS0 
virtio_balloon virtio2: Out of puff! Can't get 256 pages
virtio_balloon virtio2: Out of puff! Can't get 256 pages
virtio_balloon virtio2: Out of puff! Can't get 256 pages
virtio_balloon virtio2: Out of puff! Can't get 256 pages
virtio_balloon virtio2: Out of puff! Can't get 256 pages
virtio_balloon virtio2: Out of puff! Can't get 256 pages
virtio_balloon virtio2: Out of puff! Can't get 256 pages
virtio_balloon virtio2: Out of puff! Can't get 256 pages
virtio_balloon virtio2: Out of puff! Can't get 256 pages
virtio_balloon virtio2: Out of puff! Can't get 256 pages
__ratelimit: 16 callbacks suppressed
virtio_balloon virtio2: Out of puff! Can't get 256 pages
virtio_balloon virtio2: Out of puff! Can't get 256 pages
virtio_balloon virtio2: Out of puff! Can't get 256 pages
^C

After that, if the balloon target is raised to 2G, the guest recovers to 2G successfully.
(qemu) balloon 2048
(qemu) info balloon 
balloon: actual=2048

{"timestamp": {"seconds": 1377688168, "microseconds": 590644}, "event": "BALLOON_CHANGE", "data": {"actual": 630194176}}
{"timestamp": {"seconds": 1377688168, "microseconds": 834471}, "event": "BALLOON_CHANGE", "data": {"actual": 2147483648}}

Expected results:
There should be no such message in the guest kernel log.

Additional info:
Comment 1 Sibiao Luo 2013-08-28 22:19:14 EDT
My qemu-kvm command line:
# /usr/libexec/qemu-kvm -M rhel6.5.0 -cpu SandyBridge -enable-kvm -m 4096 -smp 4,sockets=2,cores=2,threads=1 -no-kvm-pit-reinjection -name sluo -uuid 43425b70-86e5-4664-bf2c-3b76699b8bec -rtc base=localtime,clock=host,driftfix=slew -device virtio-serial-pci,id=virtio-serial0,max_ports=16,vectors=0,bus=pci.0,addr=0x3 -chardev socket,id=channel1,path=/tmp/helloworld1,server,nowait -device virtserialport,chardev=channel1,name=com.redhat.rhevm.vdsm.1,bus=virtio-serial0.0,id=port1,nr=1 -chardev socket,id=channel2,path=/tmp/helloworld2,server,nowait -device virtserialport,chardev=channel2,name=com.redhat.rhevm.vdsm.2,bus=virtio-serial0.0,id=port2,nr=2 -drive file=/home/rhel6.5-64-copy.qcow2,if=none,id=drive-system-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop,serial="QEMU-DISK1" -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-system-disk,id=system-disk,bootindex=1 -device virtio-balloon-pci,id=ballooning,bus=pci.0,addr=0x5 -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -netdev tap,id=hostnet0,vhost=off,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=2C:41:38:B6:40:21,bus=pci.0,addr=0x6,bootindex=2 -k en-us -boot menu=on -qmp tcp:0:4444,server,nowait -serial unix:/tmp/ttyS0,server,nowait -vnc :1 -spice port=5931,disable-ticketing -monitor stdio

My host cpu info:
# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                8
On-line CPU(s) list:   0-7
Thread(s) per core:    2
Core(s) per socket:    4
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 42
Stepping:              7
CPU MHz:               3401.000
BogoMIPS:              6782.68
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              8192K
NUMA node0 CPU(s):     0-7
Comment 3 Luiz Capitulino 2013-09-09 11:10:36 EDT
As far as I can tell, you're pushing the balloon operation to the limit and it's failing. Even if free reports only 442MB used, the guest also holds cached memory, and memory usage can vary a little during the balloon operation.
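
To put a rough number on "to the limit", here is a back-of-the-envelope sketch using the figures from the description. It assumes that memory taken by the balloon comes straight off the total the guest sees in free, and that the guest's own usage stays constant during the operation; neither is guaranteed. The function name is made up for illustration.

```python
# Figures from the report; all values in MiB.
configured_ram = 4096     # -m 4096 on the qemu-kvm command line
mem_total_before = 3830   # "total" column of free -m in the guest
used_minus_cache = 271    # "used" on the "-/+ buffers/cache" line
balloon_target = 600      # (qemu) balloon 600


def headroom_after_balloon(configured, total_before, used, target):
    """Estimate free guest memory once the balloon reaches its target.

    Assumption: every MiB handed to the balloon is subtracted from the
    total the guest sees, and guest usage does not change meanwhile.
    """
    ballooned_away = configured - target
    total_after = total_before - ballooned_away
    return total_after - used


print(headroom_after_balloon(
    configured_ram, mem_total_before, used_minus_cache, balloon_target))
```

Under those assumptions the guest is left with roughly 63 MiB of slack, and that is before counting the 29 MiB of buffers and 141 MiB of cache it would have to give up along the way.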

When the balloon driver fails to allocate memory, it prints that "Out of puff" message, sleeps briefly, and then retries. It keeps trying and never returns a failure to the user or the host.
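
The retry behavior described above can be modeled in a few lines. This is a simplified user-space model, not the driver code: the batch size of 256 is taken from the log message, while the 0.2 second delay, the function name inflate, and the try_alloc_page callback are assumptions made for the illustration.

```python
import time

BATCH_PAGES = 256     # matches "Can't get 256 pages" in the guest log
RETRY_DELAY_S = 0.2   # assumed back-off between attempts


def inflate(target_pages, try_alloc_page, log=print, sleep=time.sleep):
    """Keep allocating pages until target_pages are held.

    try_alloc_page() returns True when one page could be allocated.
    A failed batch is logged and retried after a short sleep; the
    function never reports failure to its caller, it just keeps going.
    """
    held = 0
    while held < target_pages:
        want = min(BATCH_PAGES, target_pages - held)
        got = 0
        while got < want and try_alloc_page():
            got += 1
        held += got
        if got < want:
            log(f"Out of puff! Can't get {want} pages")
            sleep(RETRY_DELAY_S)
    return held


# Demo: the first three allocation attempts fail, then memory frees up.
attempts = {"count": 0}


def flaky_alloc():
    attempts["count"] += 1
    return attempts["count"] > 3


print(inflate(512, flaky_alloc, sleep=lambda seconds: None))
```

In the demo the message is logged three times before the target is reached, which mirrors what the report shows: a burst of messages while memory is tight, followed by a successful operation.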

I wouldn't call it a bug; it's how it was designed to work. To change this behavior we could hack it (say, print the message only once) or change it to give up and return a failure to the user somehow (which would require changes to QEMU, and possibly to the virtio spec). I'd say both solutions are unlikely to fly upstream, and I doubt they are worth trying.

I'll keep this open, but IMO this is very low priority.
Comment 4 Luiz Capitulino 2014-01-02 14:32:19 EST
I'm revisiting this bug, let me confirm something: you keep getting those messages even _after_ the operation finished? I mean, you do the following:

1. balloon 600
2. info balloon and query-balloon report the operation finished at 600MB
3. you log into the guest and free also reports the operation finished
4. you keep getting the error messages in the guest

If this is what happens then this might be a bug. If you get those messages while the operation is taking place then comment 3 applies.
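
Steps 1 and 2 above could be scripted against the QMP socket from the reporter's command line (-qmp tcp:0:4444,server,nowait). The sketch below is untested against this qemu-kvm build; it uses the qmp_capabilities and query-balloon commands shown in the description plus the QMP balloon command, and the helper names and timeouts are made up for illustration. Checking the guest log (step 4) still has to be done separately once the function returns.

```python
import json
import socket
import time

MIB = 1024 * 1024


def balloon_command(target_mib):
    """Build the QMP 'balloon' request; QMP expects the value in bytes."""
    return {"execute": "balloon", "arguments": {"value": target_mib * MIB}}


def read_reply(stream):
    """Return the next command reply, skipping the greeting and events."""
    for line in stream:
        message = json.loads(line)
        if "return" in message or "error" in message:
            return message
    raise ConnectionError("QMP socket closed before a reply arrived")


def wait_for_balloon(host, port, target_mib, poll_s=1.0, timeout_s=120.0):
    """Request a balloon target and poll until query-balloon matches it."""
    with socket.create_connection((host, port)) as sock:
        stream = sock.makefile("rw", encoding="utf-8", newline="\n")

        def execute(command):
            stream.write(json.dumps(command) + "\n")
            stream.flush()
            return read_reply(stream)

        execute({"execute": "qmp_capabilities"})
        execute(balloon_command(target_mib))
        deadline = time.monotonic() + timeout_s
        while time.monotonic() < deadline:
            reply = execute({"execute": "query-balloon"})
            if reply.get("return", {}).get("actual") == target_mib * MIB:
                return True
            time.sleep(poll_s)
        return False
```

With the reporter's setup, a call such as wait_for_balloon("127.0.0.1", 4444, 600) would return True once the operation has finished at 600MB; any "Out of puff" lines that show up on the guest console after that point would be the case that might be a bug.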
Comment 5 Qunfang Zhang 2014-01-06 00:10:31 EST
Hi, Sibiao

Could you please check Luiz's comment 4 and reply to it? Thanks.
Comment 6 Sibiao Luo 2014-01-06 01:50:43 EST
(In reply to Luiz Capitulino from comment #4)
> I'm revisiting this bug, let me confirm something: you keep getting those
> messages even _after_ the operation finished? I mean, you do the following:
> 
> 1. balloon 600
> 2. info balloon and query-balloon report the operation finished at 600MB
> 3. you log into the guest and free also reports the operation finished
> 4. you keep getting the error messages in the guest
> 
> If this is what happens then this might be a bug. If you get those messages
> while the operation is taking place then comment 3 applies.
The guest kernel log is flooded with the message "virtio_balloon virtio2: Out of puff! Can't get 256 pages" while the balloon operation is in progress.

Once the balloon operation has finished at 600MB, no further error messages are displayed.

Best Regards,
sluo
Comment 7 Luiz Capitulino 2014-01-06 09:15:41 EST
As explained in comment 3, this is how the balloon was designed to work. If you push the balloon operation to the limit (that is, you reduce the guest to the point where it may start running out of memory), then those messages will be printed and one of two things may happen: the guest runs out of memory and dies, or the balloon operation eventually succeeds. The latter happened in your case.
