Bug 1820402

Summary: Sometimes hit "error: kvm run failed Bad address" when launching a guest on Power8
Product: Red Hat Enterprise Linux 8
Component: kernel
Kernel sub component: KVM
Version: 8.2
Hardware: ppc64le
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: high
Keywords: Regression, Triaged, ZStream
Reporter: Yihuang Yu <yihyu>
Assignee: David Gibson <dgibson>
QA Contact: Zhenyu Zhang <zhenyzha>
CC: dgibson, juzhang, lagarcia, lvivier, mdeng, mtessun, ngu, qzhang, virt-maint, xianwang, xuma, zhenyzha
Target Milestone: rc
Target Release: 8.3
Fixed In Version: kernel-4.18.0-200.el8
Cloned As: 1890883
Last Closed: 2020-11-04 01:12:11 UTC
Type: Bug
Bug Blocks: 1826160, 1890883

Description Yihuang Yu 2020-04-03 01:30:31 UTC
Description of problem:
In recent rounds of acceptance testing, QEMU sometimes reports "error: kvm run failed Bad address" and the guest stops.
If I downgrade the host kernel to 4.18.0-187.el8.ppc64le, all tests pass (200/200).

Version-Release number of selected component (if applicable):
Host kernel version: 4.18.0-193.el8.ppc64le
qemu version: qemu-kvm-4.2.0-17.module+el8.2.0+6131+4e715f3b.ppc64le
guest kernel: 4.18.0-193.el8.ppc64le

How reproducible:
3/100

Steps to Reproduce:
1. Launch a guest
/usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1'  \
    -S \
    -sandbox on  \
    -machine pseries  \
    -nodefaults \
    -device VGA,bus=pci.0,addr=0x2 \
    -m 27648  \
    -smp 6,maxcpus=6,cores=3,threads=1,sockets=2  \
    -cpu 'host' \
    -chardev socket,path=/var/tmp/avocado_05oqo3uk/serial-serial0-20200331-214431-IiTIsZHE,server,id=chardev_serial0,nowait \
    -device spapr-vty,id=serial0,reg=0x30000000,chardev=chardev_serial0 \
    -device qemu-xhci,id=usb1,bus=pci.0,addr=0x3 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=0x4 \
    -blockdev node-name=file_image1,driver=file,aio=threads,filename=/home/kar/vt_test_images/rhel820-ppc64le-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_image1,driver=qcow2,cache.direct=on,cache.no-flush=off,file=file_image1 \
    -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
    -device virtio-net-pci,mac=9a:5f:3b:b1:c3:38,id=idVWGTmh,netdev=idGI7vS7,bus=pci.0,addr=0x5  \
    -netdev tap,id=idGI7vS7,vhost=on,vhostfd=21,fd=17 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -vnc :0  \
    -rtc base=utc,clock=host  \
    -boot menu=off,order=cdn,once=c,strict=off \
    -enable-kvm \
    -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6

2. Resume the guest
{"execute": "cont"}

Actual results:
14:08:10 INFO | [qemu output] error: kvm run failed Bad address
14:08:10 INFO | [qemu output] NIP c0000000001c2198   LR c0000000001a5f54 CTR c0000000001ac5f0 XER 0000000000000000 CPU#6
14:08:10 INFO | [qemu output] MSR 8000000040009033 HID0 0000000000000000  HF 8000000000000000 iidx 3 didx 3
14:08:10 INFO | [qemu output] TB 00000000 00000000 DECR 00000000
14:08:10 INFO | [qemu output] GPR00 c0000000001a5e70 c000003e61937740 c000000001920a00 c000003e70298c00
14:08:10 INFO | [qemu output] GPR04 c000003e618eeb00 0000000000000000 0000000000000000 c000003e618eeb00
14:08:10 INFO | [qemu output] GPR08 0000000000000001 c000003e70298c00 0000000000000000 0000000000000001
14:08:10 INFO | [qemu output] GPR12 c0000000001ac5f0 c000000007ff7800 c000003e61937f90 0000000000000000
14:08:10 INFO | [qemu output] GPR16 c0000000019520d8 0000000000000000 0000000000000800 c000000000245250
14:08:10 INFO | [qemu output] GPR20 0000004e5f4a97ea 0000000000000001 0000000000000002 0000000000000000
14:08:10 INFO | [qemu output] GPR24 0000000000000000 c000003e618eeb00 0000003e6efd0000 c00000000194dd70
14:08:10 INFO | [qemu output] GPR28 c0000000012c8c00 0000000000000030 c000003e70298c00 c0000000012c8c00
14:08:10 INFO | [qemu output] CR 88000222  [ L  L  -  -  -  E  E  E  ]             RES ffffffffffffffff
14:08:10 INFO | [qemu output] FPR00 00000000b3004100 ffffffffffffffff ffffffffffffffff ffffffffffffffff
14:08:10 INFO | [qemu output] FPR04 0000000000000000 0000000000000000 0000000000000000 0000000000000000
14:08:10 INFO | [qemu output] FPR08 0000000000000000 0000000000000000 0000000000000000 0000000000000000
14:08:10 INFO | [qemu output] FPR12 0000000000000000 0000000000000000 0000000000000000 0000000000000000
14:08:10 INFO | [qemu output] FPR16 0000000000000000 0000000000000000 0000000000000000 0000000000000000
14:08:10 INFO | [qemu output] FPR20 0000000000000000 0000000000000000 0000000000000000 0000000000000000
14:08:10 INFO | [qemu output] FPR24 0000000000000000 0000000000000000 0000000000000000 0000000000000000
14:08:10 INFO | [qemu output] FPR28 0000000000000000 0000000000000000 0000000000000000 0000000000000000
14:08:10 INFO | [qemu output] FPSCR 00000000b3004100
14:08:10 INFO | [qemu output]  SRR0 c0000000000090b0  SRR1 9000000000001033    PVR 00000000004c0100 VRSAVE 0000000000000000
14:08:10 INFO | [qemu output] SPRG0 0000000000000000 SPRG1 c000000007ff7800  SPRG2 c000000007ff7800  SPRG3 0000000000000006
14:08:10 INFO | [qemu output] SPRG4 0000000000000000 SPRG5 0000000000000000  SPRG6 0000000000000000  SPRG7 0000000000000000
14:08:10 INFO | [qemu output] HSRR0 0000000000000000 HSRR1 0000000000000000
14:08:10 INFO | [qemu output]  CFAR 0000000000000000
14:08:10 INFO | [qemu output]  LPCR 000000000384f001
14:08:10 INFO | [qemu output]  SDR1 0000000000000000   DAR 00007ffeeffb0300  DSISR 0000000042000000

QMP logs:
2020-04-02 14:02:15: {"execute": "query-status", "id": "uSSk9GOJ"}
2020-04-02 14:02:15: {"return": {"status": "prelaunch", "singlestep": false, "running": false}, "id": "uSSk9GOJ"}
2020-04-02 14:02:15: {"execute": "cont", "id": "TdnqPfLP"}
2020-04-02 14:02:15: {"timestamp": {"seconds": 1585850535, "microseconds": 592268}, "event": "RESUME"}
2020-04-02 14:02:15: {"return": {}, "id": "TdnqPfLP"}
2020-04-02 14:02:15: {"execute": "query-status", "id": "7mab2AFa"}
2020-04-02 14:02:15: {"return": {"status": "running", "singlestep": false, "running": true}, "id": "7mab2AFa"}
2020-04-02 14:02:15: {"execute": "query-status", "id": "8I8m5aio"}
2020-04-02 14:02:15: {"return": {"status": "running", "singlestep": false, "running": true}, "id": "8I8m5aio"}
2020-04-02 14:37:34: {"timestamp": {"seconds": 1585850577, "microseconds": 761903}, "event": "NIC_RX_FILTER_CHANGED", "data": {"name": "idrsSzlS", "path": "/machine/peripheral/idrsSzlS/virtio-backend"}}
2020-04-02 14:37:34: {"timestamp": {"seconds": 1585850584, "microseconds": 586535}, "event": "RTC_CHANGE", "data": {"offset": 1}}
2020-04-02 14:37:34: {"timestamp": {"seconds": 1585850890, "microseconds": 68324}, "event": "STOP"}
2020-04-02 14:37:34: {"execute": "query-status", "id": "Aw0K1gee"}
2020-04-02 14:37:34: {"return": {"status": "internal-error", "singlestep": false, "running": false}, "id": "Aw0K1gee"}
2020-04-02 14:37:55: {"execute": "query-status", "id": "dlL6N9GO"}
2020-04-02 14:37:55: {"return": {"status": "internal-error", "singlestep": false, "running": false}, "id": "dlL6N9GO"}
2020-04-02 14:38:16: {"execute": "query-status", "id": "Bk9lLnEf"}
2020-04-02 14:38:16: {"return": {"status": "internal-error", "singlestep": false, "running": false}, "id": "Bk9lLnEf"}
2020-04-02 14:38:37: {"execute": "query-status", "id": "zfAZ1zeZ"}
2020-04-02 14:38:37: {"return": {"status": "internal-error", "singlestep": false, "running": false}, "id": "zfAZ1zeZ"}
2020-04-02 14:38:59: {"execute": "query-status", "id": "vSxsQ53z"}
2020-04-02 14:38:59: {"return": {"status": "internal-error", "singlestep": false, "running": false}, "id": "vSxsQ53z"}
2020-04-02 14:39:20: {"execute": "query-status", "id": "miCF0jm4"}
2020-04-02 14:39:20: {"return": {"status": "internal-error", "singlestep": false, "running": false}, "id": "miCF0jm4"}

Expected results:
The guest should launch successfully and work fine.

Additional info:
This problem is only hit on Power8, and it is not easy to reproduce, so I cannot provide a reliable reproducer yet. I will try to bisect the kernel versions between -187 and -193.

Comment 1 Min Deng 2020-04-03 03:07:18 UTC
It seems there isn't "max-cpu-compat=power8" in cmdline.

Comment 2 Yihuang Yu 2020-04-03 06:03:19 UTC
(In reply to Min Deng from comment #1)
> It seems there isn't "max-cpu-compat=power8" in cmdline.

I mean this problem only occurs on Power8 hosts.

Comment 3 David Gibson 2020-04-09 03:21:58 UTC
I've reproduced this myself, with a script that starts a guest repeatedly.

Because there are very few plausibly relevant patches between -187 and -193, I suspect the problem was introduced by b940de4c6e68468c2817e072fd94a56d5399924a, which is the downstream backport of cd758a9b57ee85f0733c759e60f42b969c81f27b "[kvm] KVM: PPC: Book3S HV: Use __gfn_to_pfn_memslot in HPT page fault handler".  I'm working on confirming this now.

Comment 4 David Gibson 2020-04-09 04:23:54 UTC
Confirmed that I can reproduce the problem with downstream commit b940de4c6e68468c2817e072fd94a56d5399924a.  Now trying the parent commit (942a59196ff8cbdaac59f6d9aa02399d2bf4d3a2), to see if I can reproduce there.

Comment 5 David Gibson 2020-04-09 05:52:57 UTC
I wasn't able to reproduce with 942a59196ff8cbdaac59f6d9aa02399d2bf4d3a2 in a little over 100 attempts, which seems to confirm that b940de4c6e68468c2817e072fd94a56d5399924a introduced the problem downstream.  Next I'll try upstream.

Comment 6 David Gibson 2020-04-09 07:03:41 UTC
Reproduced upstream with cd758a9b57ee85f0733c759e60f42b969c81f27b, though it took quite a while (38th attempt).

Comment 7 David Gibson 2020-04-09 08:04:27 UTC
Also reproduced with upstream master (ae46d2aa6a7fbe8ca0946f24b061b6ccdc6c3f25).

Comment 8 David Gibson 2020-04-09 11:31:00 UTC
Failed to reproduce with upstream cd758a9b57ee85f0733c759e60f42b969c81f27b^ = 1c482452d5db0f52e4e8eed95bd7314eec537d78 in over 200 attempts, so I think we can be pretty confident the problem was introduced by cd758a9b57ee85f0733c759e60f42b969c81f27b and is not yet fixed upstream.

Comment 9 David Gibson 2020-04-15 07:06:11 UTC
Did a bunch of debugging on the upstream kernel.  The fault is occurring in the KVM page fault handler kvmppc_book3s_hv_page_fault(), where we exit because the Linux PTE is cache-inhibited but the corresponding hash PTE is not.

Specifically:
    Linux PTE:    0x400000079788003d
    Hash PTE (R): 0x00000000001e1190
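
[Editorial note: for context, here is a minimal userspace model of the consistency check that trips here. It is a sketch only: the masks (LINUX_PTE_CI, HPTE_R_WIMG, HPTE_R_M) and helper names are illustrative assumptions, not the kernel's definitions; the real check lives in arch/powerpc/kvm/book3s_64_mmu_hv.c.]

/* Hypothetical model of the cache-attribute consistency check in
 * kvmppc_book3s_hv_page_fault().  All masks below are assumptions
 * chosen for illustration, not values from the kernel headers. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define LINUX_PTE_CI  0x30ULL  /* assumed cache-inhibited field in the Linux PTE */
#define HPTE_R_WIMG   0x78ULL  /* assumed WIMG field in the hash PTE's R word */
#define HPTE_R_M      0x10ULL  /* assumed "normal cacheable" WIMG encoding */

int main(void)
{
    uint64_t linux_pte = 0x400000079788003dULL;  /* value from this comment */
    uint64_t hpte_r    = 0x00000000001e1190ULL;  /* value from this comment */

    bool linux_ci = (linux_pte & LINUX_PTE_CI) == LINUX_PTE_CI;
    bool hpte_ci  = (hpte_r & HPTE_R_WIMG) != HPTE_R_M;

    if (linux_ci != hpte_ci) {
        /* KVM treats the disagreement as unrecoverable and returns -EFAULT,
         * which QEMU reports as "kvm run failed Bad address". */
        printf("cache-inhibit mismatch: Linux PTE says %d, hash PTE says %d\n",
               linux_ci, hpte_ci);
    }
    return 0;
}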

Comment 10 David Gibson 2020-04-16 02:52:06 UTC
Since this bug exists upstream, it will be useful to be able to reference it publicly.  Making it public.

Comment 11 David Gibson 2020-04-16 03:55:23 UTC
Tried a change suggested by Paul Mackerras, and it seems to be holding up across >140 attempts so far.

I've posted the change upstream.

Comment 12 David Gibson 2020-04-28 04:53:05 UTC
Now merged upstream as ae49dedaa92b55258544aace7c585094b862ef79 "KVM: PPC: Book3S HV: Handle non-present PTEs in page fault functions".
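
[Editorial note: judging from the commit subject, the fix checks that the Linux PTE is present before trusting its attribute bits. Below is a minimal sketch of that idea; it is a hypothetical model, not the actual patch, and PTE_PRESENT and all names are assumptions.]

/* Sketch of the fix's idea: a PTE caught while non-present (e.g. in the
 * middle of being invalidated) must not have its cache flags compared;
 * retry the fault instead.  All names here are illustrative assumptions. */
#include <stdbool.h>
#include <stdint.h>

#define PTE_PRESENT 0x8000000000000000ULL  /* assumed "present" bit */

enum fault_result { RETRY_IN_GUEST, FAIL_EFAULT, MAP_PAGE };

static enum fault_result handle_hpt_fault(uint64_t linux_pte, bool cache_flags_ok)
{
    if (!(linux_pte & PTE_PRESENT))
        return RETRY_IN_GUEST;  /* the added check: just let the guest retry */

    if (!cache_flags_ok)
        return FAIL_EFAULT;     /* pre-fix, a stale PTE could land here and
                                   surface as "kvm run failed Bad address" */
    return MAP_PAGE;
}

int main(void)
{
    /* Under these assumptions, the Linux PTE from comment 9 takes the
     * retry path instead of the -EFAULT path. */
    return handle_hpt_fault(0x400000079788003dULL, false) == RETRY_IN_GUEST ? 0 : 1;
}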

Brewing test kernel at:

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=28222807

Comment 17 Frantisek Hrbata 2020-05-17 17:39:07 UTC
Patch(es) available on kernel-4.18.0-200.el8

Comment 24 Zhenyu Zhang 2020-05-25 01:38:13 UTC
Test Environment:
Host Distro: RHEL-8.3.0-20200520.n.4 BaseOS ppc64le 
Host Kernel: 4.18.0-201.el8.ppc64le
Qemu-kvm: qemu-kvm-5.0.0-0.module+el8.3.0+6620+5d5e1420
SLOF: SLOF-20200327-1.git8e012d6f.module+el8.3.0+6612+6b86f0c9 
Guest OS: RHEL.8.3.0, RHEL.7.9

log:
http://10.0.136.47/4270815/results.html


The fast-virt-6620 acceptance test did not hit this issue with kernel-4.18.0-201.el8.ppc64le on Power8.

Comment 25 Zhenyu Zhang 2020-06-04 01:45:30 UTC
Set bug to VERIFIED according to comment 22 and comment 24.

Comment 26 Zhenyu Zhang 2020-08-21 06:46:33 UTC
Hi David,

Hit this issue on RHEL-8.2.1-z.

Since this bug was fixed in RHEL-8.3.0, do we plan to fix it in RHEL-8.2.1-z as well?


Test Environment:
Host Distro: RHEL-8.2.0 BaseOS ppc64le 
Host Kernel: 4.18.0-193.el8.ppc64le
Qemu-kvm: qemu-kvm-4.2.0-29.module+el8.2.1+7712+3c3fe332.2
SLOF: SLOF-20191022-3.git899d9883.module+el8.2.0+5449+efc036dd
Guest OS: RHEL.8.2.1, RHEL.7.9

log:
http://10.0.136.47/4506164/results.html

Comment 27 David Gibson 2020-08-24 01:21:31 UTC
Martin, I always forget the support durations on the various y-streams.  Do you think we need to fix this for RHEL8.2?

Comment 28 David Gibson 2020-09-07 23:53:46 UTC
After discussion on Monday's call, I think this should be backported to 8.2.1.z

Comment 29 Zhenyu Zhang 2020-09-17 02:53:57 UTC
(In reply to David Gibson from comment #28)
> After discussion on Monday's call, I think this should be backported to
> 8.2.1.z

Hi David,

Thanks for the feedback.
Since the current bug status is VERIFIED, should we change it to track the 8.2.1.z fix?

Hit this issue on RHEL-8.2.1-z, qemu-kvm-4.2.0-29.module+el8.2.1+7983+71fbcacb.3.

Comment 30 David Gibson 2020-09-17 03:01:48 UTC
Not yet.  We need to wait for the zstream approval, then the bug will be cloned to track the zstream variant.

Comment 31 Zhenyu Zhang 2020-09-17 03:08:33 UTC
(In reply to David Gibson from comment #30)
> Not yet.  We need to wait for the zstream approval, then the bug will be
> cloned to track the zstream variant.

OK, thanks for the quick reply.
Please let me know when this bug gets cloned.

Comment 33 Zhenyu Zhang 2020-10-13 07:28:11 UTC
(In reply to David Gibson from comment #30)
> Not yet.  We need to wait for the zstream approval, then the bug will be
> cloned to track the zstream variant.

Hi Martin,

Hit this issue again on RHEL AV 8.2.1 - Batch 2 - Compose

So do we still need to wait for zstream approval?

Test Environment:
Host Distro: RHEL-8.2.0 BaseOS ppc64le
Host Kernel: kernel-4.18.0-193.el8.ppc64le
virt repo url: http://download.eng.bos.redhat.com/rhel-8/rel-eng/updates/ADVANCED-VIRT-8/ADVANCED-VIRT-8.2.1-RHEL-8-20201012.0/compose/Advanced-virt/ppc64le/os
qemu-kvm: qemu-kvm-4.2.0-29.module+el8.2.1+7990+27f1e480.4.ppc64le
SLOF: SLOF-20191022-3.git899d9883.module+el8.2.0+5449+efc036dd.noarch


07:52:56 DEBUG| Send command: {'execute': 'cont', 'id': 'uhMrPsH4'}
07:52:56 INFO | Boot a guest with stg0("aio=threads,cache=none").
07:52:56 DEBUG| Attempting to log into 'avocado-vt-vm1' (timeout 360s)
07:53:19 INFO | [qemu output] error: kvm run failed Bad address
07:53:19 INFO | [qemu output] error: kvm run failed Bad address
07:53:19 INFO | [qemu output] NIP c0000000001c26c8   LR c0000000001a6044 CTR c0000000001bbc80 XER 0000000020000000 CPU#1
07:53:19 INFO | [qemu output] MSR 8000000040009033 HID0 0000000000000000  HF 8000000000000000 iidx 3 didx 3
07:53:19 INFO | [qemu output] TB 00000000 00000000 DECR 0
07:53:19 INFO | [qemu output] GPR00 c0000000001a5fe0 c000000007c27b70 c000000001920b00 c00000073dd98c00
07:53:19 INFO | [qemu output] GPR04 c00000073dd98c80 00000000000003fe 0000000000000000 c00000073859d600
07:53:19 INFO | [qemu output] GPR08 0000000000000000 c00000073dd98c00 c000000007bf3180 0000000000000001
07:53:19 INFO | [qemu output] GPR12 0000000000004400 c000000003ecee00 0000000000000000 00007fffab340000
07:53:19 INFO | [qemu output] GPR16 0000000126f2fd38 0000000126f2fcd8 0000000126f2fd08 c000000000245930
07:53:19 INFO | [qemu output] GPR20 00000000d62925ce 0000000000000001 0000000000000000 0000000000000000
07:53:19 INFO | [qemu output] GPR24 0000000000000000 c000000007bf3180 000000073cad0000 c00000000194dd70
07:53:19 INFO | [qemu output] GPR28 c0000000012c8c00 0000000000000008 c00000073dd98c00 c0000000012c8c00
07:53:19 INFO | [qemu output] CR 88004488  [ L  L  -  -  G  G  L  L  ]             RES ffffffffffffffff
07:53:19 INFO | [qemu output]  SRR0 0000000126f23088  SRR1 800000000000f033    PVR 00000000004b0201 VRSAVE 00000000ffffffff
07:53:19 INFO | [qemu output] SPRG0 0000000000000000 SPRG1 c000000003ecee00  SPRG2 00007fffa94067c0  SPRG3 0000000000000001
07:53:19 INFO | [qemu output] SPRG4 0000000000000000 SPRG5 0000000000000000  SPRG6 0000000000000000  SPRG7 0000000000000000
07:53:19 INFO | [qemu output] HSRR0 0000000000000000 HSRR1 0000000000000000
07:53:19 INFO | [qemu output]  CFAR 0000000000000000
07:53:19 INFO | [qemu output]  LPCR 000000000384f001
07:53:19 INFO | [qemu output]  SDR1 0000000000000000   DAR 00000100374f2990  DSISR 0000000000000000
07:53:19 INFO | [qemu output] NIP c0000000001c26c8   LR c0000000001a6044 CTR c0000000001bbc80 XER 0000000020000000 CPU#3
07:53:19 INFO | [qemu output] MSR 8000000040009033 HID0 0000000000000000  HF 8000000000000000 iidx 3 didx 3
07:53:19 INFO | [qemu output] TB 00000000 00000000 DECR 0
07:53:19 INFO | [qemu output] GPR00 c0000000001a5fe0 c000000007c1fb70 c000000001920b00 c00000073df98c00
07:53:19 INFO | [qemu output] GPR04 c00000073df98c80 00000000000003ff 0000000000000000 c000000738591a00
07:53:19 INFO | [qemu output] GPR08 0000000000000000 c00000073df98c00 c000000007bebc00 0000000000000001
07:53:19 INFO | [qemu output] GPR12 0000000000004400 c000000003ecae00 0000000000000000 00007fffab340000
07:53:19 INFO | [qemu output] GPR16 0000000126f2fd38 0000000126f2fcd8 0000000126f2fd08 c000000000245930
07:53:19 INFO | [qemu output] GPR20 00000000d6292113 0000000000000001 0000000000000000 0000000000000000
07:53:19 INFO | [qemu output] GPR24 0000000000000000 c000000007bebc00 000000073ccd0000 c00000000194dd70
07:53:19 INFO | [qemu output] GPR28 c0000000012c8c00 0000000000000018 c00000073df98c00 c0000000012c8c00
07:53:19 INFO | [qemu output] CR 88004488  [ L  L  -  -  G  G  L  L  ]             RES ffffffffffffffff
07:53:19 INFO | [qemu output]  SRR0 0000000126f23088  SRR1 800000000000d033    PVR 00000000004b0201 VRSAVE 00000000ffffffff
07:53:19 INFO | [qemu output] SPRG0 0000000000000000 SPRG1 c000000003ecae00  SPRG2 00007fffa40067c0  SPRG3 0000000000000003
07:53:19 INFO | [qemu output] SPRG4 0000000000000000 SPRG5 0000000000000000  SPRG6 0000000000000000  SPRG7 0000000000000000
07:53:19 INFO | [qemu output] HSRR0 0000000000000000 HSRR1 0000000000000000
07:53:19 INFO | [qemu output]  CFAR 0000000000000000
07:53:19 INFO | [qemu output]  LPCR 000000000384f001
07:53:19 INFO | [qemu output]  SDR1 0000000000000000   DAR d001f7ffffd027b8  DSISR 0000000000000000

Comment 38 errata-xmlrpc 2020-11-04 01:12:11 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: kernel security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:4431