Bug 1470558 - [qmp] qemu-kvm process aborted after issuing QMP 'memsave' command on Power9
Status: VERIFIED
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm
Version: 7.4-Alt
Hardware: ppc64le Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: 7.4-Alt
Assigned To: Laurent Vivier
QA Contact: yilzhang
Depends On:
Blocks: 1440030 1457423
Reported: 2017-07-13 04:09 EDT by yilzhang
Modified: 2017-10-12 01:49 EDT

Fixed In Version: qemu-kvm-2.9.0-18.el7
Type: Bug

External Trackers
Tracker                     | ID     | Priority | Status | Summary | Last Updated
IBM Linux Technology Center | 156784 | None     | None   | None    | 2017-07-18 09:36 EDT

Description yilzhang 2017-07-13 04:09:11 EDT
Description of problem:
On Power9, boot a guest and then save memory with the QMP 'memsave' command. The qemu-kvm process on the host aborts immediately after the 'memsave' command is issued.
x86 and Power8 don't have this issue.


Version-Release number of selected component (if applicable):
Host:  kernel: 4.11.0-10.el7a.ppc64le
       qemu-kvm-2.9.0-16.el7a.ppc64le
Guest: 4.11.0-10.el7a.ppc64le

How reproducible: 100%


Steps to Reproduce:
1. Boot the guest with QMP enabled, for example: -qmp tcp:0:4444,server
2. From any machine with a telnet client, run: # telnet $HostIP 4444
3. Run {"execute": "qmp_capabilities"}
4. After the guest is up and running well, start memsave:
   {"execute":"memsave","arguments":{"val":1,"size":65535,"filename":"aaaa"}}
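
The steps above can also be scripted instead of typed into telnet. The following is a minimal sketch, assuming a QMP server reachable over TCP as in step 1 and that each QMP message arrives on a single line (no pretty-printing). The helper names (qmp_command, read_reply, run_memsave) and the default host and port are illustrative assumptions, not part of the original report.

```python
import json
import socket


def qmp_command(name, arguments=None):
    """Serialize one QMP command as a line-terminated JSON message."""
    msg = {"execute": name}
    if arguments is not None:
        msg["arguments"] = arguments
    return (json.dumps(msg) + "\r\n").encode("utf-8")


def read_reply(reader):
    """Read lines until a command reply arrives, skipping asynchronous events."""
    while True:
        line = reader.readline()
        if not line:
            # An empty read means the peer closed the socket, which is what
            # one would expect to see if the qemu-kvm process aborted.
            raise ConnectionError("QMP connection closed")
        reply = json.loads(line)
        if "return" in reply or "error" in reply:
            return reply


def run_memsave(host="127.0.0.1", port=4444, val=1, size=65535, filename="aaaa"):
    """Replay steps 2-4: connect, negotiate capabilities, issue memsave."""
    with socket.create_connection((host, port)) as sock:
        reader = sock.makefile("rb")
        json.loads(reader.readline())  # consume the QMP greeting
        sock.sendall(qmp_command("qmp_capabilities"))
        read_reply(reader)
        args = {"val": val, "size": size, "filename": filename}
        sock.sendall(qmp_command("memsave", args))
        return read_reply(reader)
```

Calling run_memsave(host, port) against a running guest returns the parsed reply. If QEMU aborts as described in this report, the call would be expected to raise a connection error instead of returning.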



Actual results:
qemu-kvm process aborts abnormally
(qemu) VNC server running on ::1:5900
(qemu) qemu: fatal: Unknown or invalid MMU model
NIP c0000000000b9724   LR c000000000908b5c CTR c000000000908b00 XER 0000000000000000 CPU#0
MSR 800000000280b033 HID0 0000000000000000  HF 8000000000000000 iidx 3 didx 3
TB 00000000 00000000 DECR 00000000
GPR00 0000000022002044 c00000000128fd40 c000000001291b00 0000000000000000
GPR04 c000000001224b60 0000000000000000 0000000000000000 0000000000000000
GPR08 0000000000000000 c000000001164400 0000000000000001 0000000000002700
GPR12 c000000000908b00 c000000007b80000 0000000002000000 0000000002d8c4e0
GPR16 0000000000000000 c000000001224b60 c0000001fee837e0 0000000000000001
GPR20 0000000000000000 c0000000011a7580 c00000000128c080 c00000000128c080
GPR24 c00000000128c080 0000000000000000 000000093fec2ac6 c000000001224b60
GPR28 c000000001224b78 0000000000000000 00000002a7809ec0 0000000000000000
CR 22002044  [ E  E  -  -  E  -  G  G  ]             RES ffffffffffffffff
FPR00 0000000082004000 c1f0000000000000 0000000000000000 41f0000000000000
FPR04 0000000000000000 0000000000000000 656d697420726f66 746e692072656d69
FPR08 20000a3020000a30 3120202020202020 0000000000000000 7ff0000000000000
FPR12 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR16 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR20 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR24 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR28 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPSCR 0000000082004000
 SRR0 c0000000000b9724  SRR1 8000000002803033    PVR 00000000004e0100 VRSAVE 0000000000000000
SPRG0 0000000000000000 SPRG1 c000000007b80000  SPRG2 c000000007b80000  SPRG3 0000000000000000
SPRG4 0000000000000000 SPRG5 0000000000000000  SPRG6 0000000000000000  SPRG7 0000000000000000
HSRR0 0000000000000000 HSRR1 0000000000000000
 CFAR 0000000000000000
 LPCR 0000000003d4f41f
  DAR 00003fffad480000  DSISR 0000000042000000
basic.sh: line 17: 18940 Aborted                 (core dumped) /usr/libexec/qemu-kvm -name yilzhang_vm -smp 6,maxcpus=8,sockets=2,cores=4,threads=1 -m 8192 -serial unix:/tmp/myserial.log,server,nowait -monitor stdio -nodefaults -enable-kvm -qmp tcp:0:9999,server,nowait -device virtio-scsi-pci,bus=pci.0,addr=0x3,id=scsi0 -drive file=/home/yilzhang/rhel7.4-alt-20170626.4.qcow2,if=none,id=drive-system-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop -device scsi-hd,bus=scsi0.0,drive=drive-system-disk,id=system-disk,bootindex=1 -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=52:54:00:c3:e7:84,bus=pci.0,addr=0x4,ioeventfd=off -device virtio-net-pci,mac=9a:dc:dd:de:df:e0,id=idEO7ydl,vectors=4,netdev=iddGDuW8,bus=pci.0,addr=05,disable-legacy=off,disable-modern=on -netdev tap,id=iddGDuW8,vhost=on

# gdb /usr/libexec/qemu-kvm  core.18940 
(gdb) bt
#0  0x00003fffb6c9eff0 in raise () from /lib64/libc.so.6
#1  0x00003fffb6ca136c in abort () from /lib64/libc.so.6
#2  0x0000000052c834b0 in cpu_abort (cpu=0x6d3d0000, fmt=0x530d6930 "Unknown or invalid MMU model\n") at /usr/src/debug/qemu-2.9.0/exec.c:962
#3  0x0000000052e1dd8c in get_physical_address (env=0x6d3d8460, ctx=0x3fffd436b020, eaddr=0, rw=0, access_type=32)
    at /usr/src/debug/qemu-2.9.0/target/ppc/mmu_helper.c:1409
#4  0x0000000052e1f150 in ppc_cpu_get_phys_page_debug (cs=<optimized out>, addr=0) at /usr/src/debug/qemu-2.9.0/target/ppc/mmu_helper.c:1450
#5  0x0000000052c91fc0 in cpu_get_phys_page_attrs_debug (attrs=0x3fffd436b0b4, addr=0, cpu=0x6d3d0000)
    at /usr/src/debug/qemu-2.9.0/include/qom/cpu.h:560
#6  cpu_memory_rw_debug (cpu=0x6d3d0000, addr=1, buf=0x3fffd436b1c8 "\220\262\066\324\377?", len=1024, is_write=<optimized out>)
    at /usr/src/debug/qemu-2.9.0/exec.c:3282
#7  0x0000000052cde388 in qmp_memsave (addr=1, size=65535, filename=<optimized out>, has_cpu=<optimized out>, cpu_index=<optimized out>, 
    errp=0x3fffd436b648) at /usr/src/debug/qemu-2.9.0/cpus.c:1955
#8  0x0000000052e820d8 in qmp_marshal_memsave (args=<optimized out>, ret=<optimized out>, errp=0x3fffd436b708) at qmp-marshal.c:1834
#9  0x0000000053079f74 in do_qmp_dispatch (errp=0x3fffd436b700, request=<optimized out>, cmds=0x532eb2f0 <qmp_commands>)
    at qapi/qmp-dispatch.c:104
#10 qmp_dispatch (cmds=0x532eb2f0 <qmp_commands>, request=<optimized out>) at qapi/qmp-dispatch.c:131
#11 0x0000000052ce2020 in handle_qmp_command (parser=<optimized out>, tokens=<optimized out>) at /usr/src/debug/qemu-2.9.0/monitor.c:3850
#12 0x0000000053082180 in json_message_process_token (lexer=0x6cd43688, input=0x6cdc4640, type=<optimized out>, x=<optimized out>, 
    y=<optimized out>) at qobject/json-streamer.c:105
#13 0x00000000530ab738 in json_lexer_feed_char (lexer=0x6cd43688, ch=<optimized out>, flush=false) at qobject/json-lexer.c:319
#14 0x00000000530ab874 in json_lexer_feed (lexer=0x6cd43688, buffer=<optimized out>, size=<optimized out>) at qobject/json-lexer.c:369
#15 0x00000000530822dc in json_message_parser_feed (parser=<error reading variable: value has been optimized out>, buffer=<optimized out>, 
    size=<optimized out>) at qobject/json-streamer.c:124
#16 0x0000000052cdf8f4 in monitor_qmp_read (opaque=<optimized out>, buf=<optimized out>, size=<optimized out>)
    at /usr/src/debug/qemu-2.9.0/monitor.c:3893
#17 0x000000005300acfc in qemu_chr_be_write_impl (len=<optimized out>, buf=<optimized out>, s=<optimized out>) at chardev/char.c:284
#18 qemu_chr_be_write (s=<optimized out>, buf=<optimized out>, len=<optimized out>) at chardev/char.c:296
#19 0x0000000053014768 in tcp_chr_read (chan=<optimized out>, cond=<optimized out>, opaque=<optimized out>) at chardev/char-socket.c:414
#20 0x0000000053028cf4 in qio_channel_fd_source_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>)
    at io/channel-watch.c:84
#21 0x00003fffb71e3ab0 in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
#22 0x000000005308a824 in glib_pollfds_poll () at util/main-loop.c:213
#23 os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:261
#24 main_loop_wait (nonblocking=<optimized out>) at util/main-loop.c:517
#25 0x0000000052c7afd4 in main_loop () at vl.c:1909
#26 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4733



Expected results:
Per Bug 1116315 - fail to execute the 'memsave' QMP command in qemu-kvm-rhev-2.0.0, the 'memsave' command is expected to fail.
However, the qemu-kvm process and the guest should both keep working.


Additional info:
1. On x86 and Power8, the 'memsave' command in step 4 fails with this error message:
{"execute":"memsave","arguments":{"val":1,"size":65535,"filename":"aaaa"}}
{"error": {"class": "GenericError", "desc": "Invalid addr 0x0000000000000001/size 65535 specified"}}
According to Bug 1116315, this error is expected.

2. Qemu cli:
/usr/libexec/qemu-kvm \
 -name yilzhang_vm \
-smp 6,maxcpus=8,sockets=2,cores=4,threads=1 -m 8192 \
-serial unix:/tmp/myserial.log,server,nowait \
-monitor stdio \
-nodefaults -enable-kvm \
-qmp tcp:0:9999,server,nowait \
-device virtio-balloon-pci,id=balloon0 \
-device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6,disable-legacy=off \
-chardev socket,path=/tmp/qga.sock,server,nowait,id=qga0 \
-device virtserialport,bus=virtio-serial0.0,chardev=qga0,id=qemu-ga0,name=org.qemu.guest_agent.0 \
\
-device virtio-scsi-pci,bus=pci.0,addr=0x3,id=scsi0 \
-drive file=/home/yilzhang/rhel7.4-alt-20170626.4.qcow2,if=none,id=drive-system-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop \
-device scsi-hd,bus=scsi0.0,drive=drive-system-disk,id=system-disk,bootindex=1 \
\
-netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup \
-device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=52:54:00:c3:e7:84,bus=pci.0,addr=0x4,ioeventfd=off \
-device virtio-net-pci,mac=9a:dc:dd:de:df:e0,id=idEO7ydl,vectors=4,netdev=iddGDuW8,bus=pci.0,addr=05,disable-legacy=off,disable-modern=on  \
-netdev tap,id=iddGDuW8,vhost=on
Comment 2 Laurent Vivier 2017-07-13 10:07:37 EDT
Missing commits seem to be:

0922f1e target/ppc/POWER9: Add POWERPC_MMU_V3 bit
b289949 target/ppc/POWER9: Add POWER9 mmu fault handler
6a04282 target/ppc: Refactor tcg radix mmu code
95cb065 target/ppc: Add debug function for radix mmu translation
Comment 3 Laurent Vivier 2017-07-13 10:17:07 EDT
The first two are already in 2.9, so we only need to backport:

6a04282 target/ppc: Refactor tcg radix mmu code
95cb065 target/ppc: Add debug function for radix mmu translation
Comment 4 Laurent Vivier 2017-07-13 11:06:09 EDT
The bug can be triggered easily from HMP with "x 0".
Comment 5 Miroslav Rezanina 2017-07-18 03:57:35 EDT
Fix included in qemu-kvm-2.9.0-18.el7
Comment 6 yilzhang 2017-07-19 23:24:08 EDT
This bug is verified as fixed on Power9 with qemu-kvm-2.9.0-18.el7a.

Bug verification is as follows:
Host version:
4.11.0-10.el7a.ppc64le
qemu-kvm-2.9.0-18.el7a.ppc64le
SLOF-20170303-4.git66d250e.el7.noarch
Guest:  4.11.0-10.el7a.ppc64le


Steps: the same as in the bug description.
Result: the 'memsave' command fails, but the qemu-kvm process and the guest both keep working.
This is the expected result (per Bug 1116315 - fail to execute the 'memsave' QMP command in qemu-kvm-rhev-2.0.0).

So this bug is fixed in qemu-kvm-2.9.0-18.el7a.
Comment 8 yilzhang 2017-08-08 04:36:24 EDT
This bug has been verified against the following version of components:

host:  kernel-4.11.0-22.el7a.ppc64le
qemu-kvm-2.9.0-19.el7a.ppc64le
SLOF-20170303-4.git66d250e.el7.noarch
guest kernel: 4.11.0-16.el7a.ppc64le


Actual results: the 'memsave' command fails (see below), but the qemu-kvm process and the guest both keep working. Per Bug 1116315, this is the expected result.

{"execute":"memsave","arguments":{"val":1,"size":65535,"filename":"aaaa"}}
{"error": {"class": "GenericError", "desc": "Invalid addr 0x0000000000000001/size 65535 specified"}}
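
The pass criterion used in this verification (the command fails cleanly with the error shown above, rather than crashing QEMU) can be expressed as a small check on the reply. This is a minimal sketch, assuming the reply has been captured as one JSON line; is_expected_memsave_failure is a hypothetical helper name, and matching on the "Invalid addr" prefix is an assumption based only on the message quoted in this report.

```python
import json


def is_expected_memsave_failure(reply_line):
    """Return True if the reply is the GenericError 'Invalid addr' failure."""
    reply = json.loads(reply_line)
    error = reply.get("error")
    if not isinstance(error, dict):
        # A "return" reply (success) or an event is not the expected failure.
        return False
    return (error.get("class") == "GenericError"
            and error.get("desc", "").startswith("Invalid addr"))
```

A reply of this form indicates only that the command was rejected; whether the guest is still healthy has to be checked separately, as was done in the verification steps.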


So, this bug is fixed.
