Bug 1826160
Summary: [ppc64le][dump] executing kdump test in multiple guests will cause error in both guest and host

Product: Red Hat Enterprise Linux Advanced Virtualization
Component: qemu-kvm (sub component: General)
Version: 8.2
Target Release: 8.3
Target Milestone: rc
Hardware: ppc64le
OS: Linux
Status: CLOSED CURRENTRELEASE
Last Closed: 2020-09-17 00:26:47 UTC
Severity: high
Priority: high
Keywords: Triaged
Type: Bug
Reporter: Min Deng <mdeng>
Assignee: Virtualization Maintenance <virt-maint>
QA Contact: Min Deng <mdeng>
CC: bfu, bugproxy, dgibson, fnovak, hannsj_uhl, mdeng, ngu, qzhang, virt-maint, xianwang, xuma, yihyu
Bug Depends On: 1820402
Bug Blocks: 1776265
Description
Min Deng
2020-04-21 06:42:57 UTC
I set the ITR to 8.3.0 since this is a high/crash type problem. Feel free to adjust to 8.2.1 if a fix is possible there, or leave it at 8.3.0 if a future rebase is the best way.

AFAICT, qemu isn't doing anything wrong here. The guest kdump kernel is crashing while trying to dump, which causes qemu to report an error. So it looks like the real problem is in kdump. What are the guest kernel and userspace versions?

The build information:
qemu-kvm-common-4.2.0-19.module+el8.2.0+6296+6b821950.ppc64le or qemu-kvm-5.0.0-0.scrmod+el8.3.0+6312+cee4f348.ppc64le
kernel-4.18.0-193.el8.ppc64le
host kernel: kernel-4.18.0-193.9.el8.ppc64le

Tried the issue on P9 as well and hit a similar issue.

rpm -qa | grep SLOF
SLOF-20200327-1.git8e012d6f.el8.noarch
[root@ibm-p9b-42 ~]# rpm -qa | grep qemu-kvm
qemu-kvm-block-curl-5.0.0-0.scrmod+el8.3.0+6312+1f7d6182.ppc64le
qemu-kvm-tests-5.0.0-0.scrmod+el8.3.0+6312+1f7d6182.ppc64le
qemu-kvm-tests-debuginfo-5.0.0-0.scrmod+el8.3.0+6312+1f7d6182.ppc64le
qemu-kvm-debugsource-5.0.0-0.scrmod+el8.3.0+6312+1f7d6182.ppc64le
qemu-kvm-common-5.0.0-0.scrmod+el8.3.0+6312+1f7d6182.ppc64le
qemu-kvm-block-ssh-5.0.0-0.scrmod+el8.3.0+6312+1f7d6182.ppc64le
qemu-kvm-block-iscsi-5.0.0-0.scrmod+el8.3.0+6312+1f7d6182.ppc64le
qemu-kvm-5.0.0-0.scrmod+el8.3.0+6312+1f7d6182.ppc64le
qemu-kvm-block-rbd-debuginfo-5.0.0-0.scrmod+el8.3.0+6312+1f7d6182.ppc64le
qemu-kvm-block-curl-debuginfo-5.0.0-0.scrmod+el8.3.0+6312+1f7d6182.ppc64le
qemu-kvm-block-ssh-debuginfo-5.0.0-0.scrmod+el8.3.0+6312+1f7d6182.ppc64le
qemu-kvm-block-iscsi-debuginfo-5.0.0-0.scrmod+el8.3.0+6312+1f7d6182.ppc64le
qemu-kvm-debuginfo-5.0.0-0.scrmod+el8.3.0+6312+1f7d6182.ppc64le
qemu-kvm-block-rbd-5.0.0-0.scrmod+el8.3.0+6312+1f7d6182.ppc64le
qemu-kvm-core-5.0.0-0.scrmod+el8.3.0+6312+1f7d6182.ppc64le
qemu-kvm-core-debuginfo-5.0.0-0.scrmod+el8.3.0+6312+1f7d6182.ppc64le
qemu-kvm-common-debuginfo-5.0.0-0.scrmod+el8.3.0+6312+1f7d6182.ppc64le

I haven't been able to reproduce this, despite matching as many parameters as I could think of that looked relevant. This error message:

[ 0.403976] Failed to execute /init (error -2)

suggests that the kdump initrd has been incorrectly constructed within this guest image and is missing the init file (error -2 is ENOENT). If you rebuild the kdump initrd with "kdumpctl rebuild" inside the guest, does kdump work afterwards?

Hi David,
Please do this before triggering a crash; the problem can be reproduced 100%.
Steps:
1. # service kdump stop
   Redirecting to /bin/systemctl stop kdump.service
2. # echo c >/proc/sysrq-trigger
Actual result: qemu-kvm terminated right away.
Expected result: the guest should keep working; for example, it should be able to reboot, generate a dump file, and so on.
Thanks.

The behavior you describe in comment 6 is expected.

You're explicitly disabling kdump, so the dump service is not active. That means that the guest kernel will report the panic to qemu, which will terminate it.

You should have the same behaviour on x86 if a pvpanic device is attached (POWER guests always have an equivalent device attached; it's part of the firmware functionality).

If you want qemu to keep running with the crashed guest, to trigger a dump using the monitor, for example, you can use the -no-shutdown option.

(In reply to David Gibson from comment #7)
> The behavior you describe in comment 6 is expected.
>
> You're explicitly disabling kdump, so the dump service is not active. That
> means that the guest kernel will report the panic to qemu, which will
> terminate it.
> > You should have the same behaviour on x86, if a pvpanic device is attached > (POWER guests always have an equivalent device attached, it's part of the > firmware functionality). > > If you want qemu to keep running with the crashed guest, to trigger a dump > using the monitor, for example, you can use the -no-shutdown option. Hi David, QE understood above points,thanks for that. QE run some automation test for kdump on multiple guests today,hit one issue,so I had better paste it here for discussion.I will upload some logs to the bug too as well as step's instruction.It is not always reproducible since I hit one time among 4 or 5 times.But I failed to reproduce it manually but I will still try it in the following days.Any problems please let me know,thanks. In automation's log, 03:45:48 DEBUG| Kdump service status before our testing: kdump.service - Crash recovery kernel arming Loaded: loaded (/usr/lib/systemd/system/kdump.service; enabled; vendor preset: enabled) Active: active (exited) since Thu 2020-05-07 15:45:47 CST; 352ms ago Process: 3438 ExecStop=/usr/bin/kdumpctl stop (code=exited, status=0/SUCCESS) Process: 3447 ExecStart=/usr/bin/kdumpctl start (code=exited, status=0/SUCCESS) Main PID: 3447 (code=exited, status=0/SUCCESS) May 07 15:45:43 dhcp19-129-162.pnr.lab.eng.bos.redhat.com systemd[1]: Starting. May 07 15:45:47 dhcp19-129-162.pnr.lab.eng.bos.redhat.com kdumpctl[3447]: Modif May 07 15:45:47 dhcp19-129-162.pnr.lab.eng.bos.redhat.com kdumpctl[3447]: kexec May 07 15:45:47 dhcp19-129-162.pnr.lab.eng.bos.redhat.com kdumpctl[3447]: Start May 07 15:45:47 dhcp19-129-162.pnr.lab.eng.bos.redhat.com systemd[1]: Started . Hint: Some lines were ellipsized, use -l to show in full. 03:45:48 INFO | Triggering crash on vcpu 0 ... 03:45:48 INFO | Context: Kdump Testing, force the Linux kernel to crash 03:45:48 DEBUG| Attempting to log into 'vm2' (timeout 360s) 03:45:48 DEBUG| Found/Verified IP 10.19.129.105 for VM vm2 NIC 0 03:45:49 INFO | [qemu output] error: kvm run failed Bad address 03:45:49 INFO | [qemu output] NIP c000000008008ac0 LR 0000000000000670 CTR c000000008dd3d60 XER 0000000000000000 CPU#35 03:45:49 INFO | [qemu output] MSR c00000000008dc30 HID0 0000000000000000 HF 8000000000000001 iidx 3 didx 3 03:45:49 INFO | [qemu output] TB 00000000 00000000 DECR 0 03:45:49 INFO | [qemu output] GPR00 0000000000000670 c00000000f644590 c000000009966300 0000000000000000 03:45:49 INFO | [qemu output] GPR04 fffffffffffd5704 0000000000000670 0000000000000008 c000000008dd3d60 03:45:49 INFO | [qemu output] GPR08 feeeeeeeeeeeeeee ffffffffff0c576c c0000000081fdc08 0000000000000381 03:45:49 INFO | [qemu output] GPR12 c000000008dd3d60 c00000000ff4c680 c0000000f6cbbf90 0000000000000000 03:45:49 INFO | [qemu output] GPR16 c0000000019a21e0 0000000000000000 0000000000000800 0000000000000001 03:45:49 INFO | [qemu output] GPR20 c000000001265608 0000000000000023 0000000000000000 0000000000000000 03:45:49 INFO | [qemu output] GPR24 0000000000000023 c000000009265808 feeeeeeeeeeeeeee c0000000090310c0 03:45:49 INFO | [qemu output] GPR28 000000000000000b c00000000f6446a0 c00000000f644570 c0000000081fdc08 03:45:49 INFO | [qemu output] CR 88008228 [ L L - - L E E L ] RES ffffffffffffffff 03:45:49 INFO | [qemu output] SRR0 c000000008008ac0 SRR1 c0000000081fdc00 PVR 00000000004e1202 VRSAVE 0000000000000000 03:45:49 INFO | [qemu output] SPRG0 0000000000000000 SPRG1 c00000000ff4c680 SPRG2 c00000000ff4c680 SPRG3 0000000000000023 03:45:49 INFO | [qemu output] SPRG4 0000000000000000 SPRG5 0000000000000000 SPRG6 
0000000000000000 SPRG7 0000000000000000 03:45:49 INFO | [qemu output] HSRR0 0000000000000000 HSRR1 0000000000000000 03:45:49 INFO | [qemu output] CFAR 0000000000000000 03:45:49 INFO | [qemu output] LPCR 0000000003d6f41f 03:45:49 INFO | [qemu output] PTCR 0000000000000000 DAR beeeeeeef815fe8e DSISR 0000000000000000 03:45:49 INFO | [qemu output] error: kvm run failed Bad address 03:45:49 INFO | [qemu output] NIP c000000008008ac0 LR 0000000000000670 CTR c000000008dd3d60 XER 0000000000000000 CPU#26 03:45:49 INFO | [qemu output] MSR c00000000008dc30 HID0 0000000000000000 HF 8000000000000001 iidx 3 didx 3 03:45:49 INFO | [qemu output] TB 00000000 00000000 DECR 0 03:45:49 INFO | [qemu output] GPR00 0000000000000670 c00000000f2a2e40 c000000009966300 0000000000000000 03:45:49 INFO | [qemu output] GPR04 fffffffffffd5704 0000000000000670 0000000000000008 c000000008dd3d60 03:45:49 INFO | [qemu output] GPR08 feeeeeeeeeeeeeee ffffffffff0c576c c0000000081fdc08 0000000000000381 03:45:49 INFO | [qemu output] GPR12 c000000008dd3d60 c00000000ff59e80 c0000000f6cf7f90 0000000000000000 03:45:49 INFO | [qemu output] GPR16 c0000000019a21e0 0000000000000000 0000000000000800 0000000000000001 03:45:49 INFO | [qemu output] GPR20 c000000001265608 000000000000001a 0000000000000000 0000000000000000 03:45:49 INFO | [qemu output] GPR24 000000000000001a 0000000000000000 0000000000000000 c0000000090310c0 03:45:49 INFO | [qemu output] GPR28 000000000000000b c00000000f2a2f50 c00000000f2a2e20 c0000000081fdc08 03:45:49 INFO | [qemu output] CR 88008228 [ L L - - L E E L ] RES ffffffffffffffff 03:45:49 INFO | [qemu output] SRR0 c000000008008ac0 SRR1 c0000000081fdc00 PVR 00000000004e1202 VRSAVE 0000000000000000 03:45:49 INFO | [qemu output] SPRG0 0000000000000000 SPRG1 c00000000ff59e80 SPRG2 c00000000ff59e80 SPRG3 000000000000001a 03:45:49 INFO | [qemu output] SPRG4 0000000000000000 SPRG5 0000000000000000 SPRG6 0000000000000000 SPRG7 0000000000000000 03:45:49 INFO | [qemu output] HSRR0 0000000000000000 HSRR1 0000000000000000 03:45:49 INFO | [qemu output] CFAR 0000000000000000 03:45:49 INFO | [qemu output] LPCR 0000000003d6f41f 03:45:49 INFO | [qemu output] PTCR 0000000000000000 DAR beeeeeeef815fe8e DSISR 0000000000000000 host console, [82048.653732] KVM: Got unsupported MMU fault [82048.654544] KVM: Got unsupported MMU fault [82947.506516] watchdog: CPU 0 detected hard LOCKUP on other CPUs 3 [82947.506569] watchdog: CPU 0 TB:42545620888830, last SMP heartbeat TB:42537684573427 (15500ms ago) [82947.507429] watchdog: CPU 3 Hard LOCKUP [82947.507434] watchdog: CPU 3 TB:42545621013192, last heartbeat TB:42537428563546 (16000ms ago) [82947.507437] Modules linked in: xt_CHECKSUM ipt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_chain_route_ipv6 nft_chain_nat_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 nft_counter nft_chain_route_ipv4 nft_chain_nat_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack nf_tables vhost_net vhost tap tun nfnetlink bluetooth ecdh_generic rfkill rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache bridge stp llc kvm_hv kvm i2c_dev sunrpc ses at24 ofpart enclosure powernv_flash scsi_transport_sas xts uio_pdrv_genirq mtd uio ipmi_powernv ipmi_devintf ibmpowernv vmx_crypto ipmi_msghandler opal_prd ip_tables xfs libcrc32c sd_mod sg ast i2c_algo_bit drm_vram_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm i40e aacraid drm_panel_orientation_quirks dm_mirror dm_region_hash dm_log dm_mod [82947.507579] CPU: 3 PID: 99750 Comm: 
qemu-kvm Kdump: loaded Not tainted 4.18.0-193.13.el8.ppc64le #1 [82947.507582] NIP: c0080000080bd7cc LR: c0080000080bd7cc CTR: c00000000001aba0 [82947.507587] REGS: c000000bf2a27748 TRAP: 0100 Not tainted (4.18.0-193.13.el8.ppc64le) [82947.507589] MSR: 9000000102803033 <SF,HV,VEC,VSX,FP,ME,IR,DR,RI,LE,TM[E]> CR: 24422222 XER: 00000000 [82947.507615] CFAR: c000000001b35ca8 IRQMASK: 40000003d6f41f [82947.507621] GPR00: c0080000080bd7cc c000000bf2a278b0 c000000001965300 c000000bf2a27748 [82947.507632] GPR04: c009dff19f2f4198 0000000000000000 0000000000000000 0000000000000003 [82947.507643] GPR08: 0000000000000003 0000000000000000 0000000ffb1d0000 c0080000080cfd10 [82947.507654] GPR12: c00000000001aba0 c000000fffffbb80 00007fffa4e70000 00007ffd737f0000 [82947.507665] GPR16: 00007fffa3d24410 c000200df0a6a558 0000000000000001 c0000000012f5cf0 [82947.507675] GPR20: 0000000000000003 c000000001b35ca8 c000200df0a6a558 0000000000000003 [82947.507686] GPR24: 0000000000000003 ffffffffffffffff 0040000003d6f41f 0000000000000100 [82947.507697] GPR28: c000200df0a60000 c000000bf92c0000 c000000fec3e4c00 c000000bef1d2a40 [82947.507709] NIP [c0080000080bd7cc] kvmhv_run_single_vcpu+0x7a4/0xda0 [kvm_hv] [82947.507713] LR [c0080000080bd7cc] kvmhv_run_single_vcpu+0x7a4/0xda0 [kvm_hv] [82947.507716] Call Trace: [82947.507719] [c000000bf2a278b0] [c0080000080bd7cc] kvmhv_run_single_vcpu+0x7a4/0xda0 [kvm_hv] (unreliable) [82947.507727] [c000000bf2a27980] [c0080000080be748] kvmppc_vcpu_run_hv+0x980/0x1060 [kvm_hv] [82947.507732] [c000000bf2a27a90] [c00800000841de5c] kvmppc_vcpu_run+0x34/0x48 [kvm] [82947.507738] [c000000bf2a27ab0] [c008000008418f8c] kvm_arch_vcpu_ioctl_run+0x364/0x820 [kvm] [82947.507744] [c000000bf2a27ba0] [c008000008403298] kvm_vcpu_ioctl+0x460/0x7d0 [kvm] [82947.507749] [c000000bf2a27d10] [c00000000052c490] do_vfs_ioctl+0xe0/0xaa0 [82947.507754] [c000000bf2a27de0] [c00000000052d024] sys_ioctl+0xc4/0x160 [82947.507759] [c000000bf2a27e30] [c00000000000b408] system_call+0x5c/0x70 [82947.507763] Instruction dump: [82947.507767] 614af804 7fa95000 409efcf8 7fe3fb78 48012d0d e8410018 4bfffce8 60000000 [82947.507782] 60000000 2f9b0100 409efbfc 48012549 <e8410018> 4bfffbf0 60000000 60000000 [82947.517421] watchdog: CPU 3 became unstuck TB:42545626476761 [82947.517432] CPU: 3 PID: 99750 Comm: qemu-kvm Kdump: loaded Not tainted 4.18.0-193.13.el8.ppc64le #1 [82947.517445] NIP: c00000000000a8fc LR: c00000000001ae54 CTR: c00000000802dca0 [82947.517467] REGS: c000000bf2a27610 TRAP: 0901 Not tainted (4.18.0-193.13.el8.ppc64le) [82947.517477] MSR: 900000010280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> CR: 28422244 XER: 20040000 [82947.517496] CFAR: 0000000000000874 IRQMASK: 0 [82947.517496] GPR00: c0080000080bd414 c000000bf2a27890 c000000001965300 0000000000000900 [82947.517496] GPR04: 0000000ffb1d0000 000026b1eca5dc35 000026b022d693ff c000000fffffbb80 [82947.517496] GPR08: 0000000800000000 0000000028008228 b000000000001003 0000000000000009 [82947.517496] GPR12: 0000000024842828 c000000fffffbb80 [82947.517559] NIP [c00000000000a8fc] replay_interrupt_return+0x0/0x4 [82947.517572] LR [c00000000001ae54] arch_local_irq_restore+0x74/0x90 [82947.517582] Call Trace: [82947.517588] [c000000bf2a27890] [c000000fec3e4c00] 0xc000000fec3e4c00 (unreliable) [82947.517603] [c000000bf2a278b0] [c0080000080bd414] kvmhv_run_single_vcpu+0x3ec/0xda0 [kvm_hv] [82947.537629] [c000000bf2a27980] [c0080000080be748] kvmppc_vcpu_run_hv+0x980/0x1060 [kvm_hv] [82947.537660] [c000000bf2a27a90] [c00800000841de5c] kvmppc_vcpu_run+0x34/0x48 
[kvm] [82947.537690] [c000000bf2a27ab0] [c008000008418f8c] kvm_arch_vcpu_ioctl_run+0x364/0x820 [kvm] [82947.537729] [c000000bf2a27ba0] [c008000008403298] kvm_vcpu_ioctl+0x460/0x7d0 [kvm] [82947.537764] [c000000bf2a27d10] [c00000000052c490] do_vfs_ioctl+0xe0/0xaa0 [82947.537793] [c000000bf2a27de0] [c00000000052d024] sys_ioctl+0xc4/0x160 [82947.537815] [c000000bf2a27e30] [c00000000000b408] system_call+0x5c/0x70 [82947.537834] Instruction dump: [82947.537850] 7d200026 618c8000 2c030900 4182e7e8 2c030500 4182f2e0 2c030f00 4182f3f8 [82947.537884] 2c030a00 4182ff9c 2c030e60 4182f088 <4e800020> 7c781b78 48000385 4800039d Message from syslogd@ibm-p9b-42 at May 7 04:00:47 ... kernel:watchdog: CPU 0 detected hard LOCKUP on other CPUs 3 Message from syslogd@ibm-p9b-42 at May 7 04:00:47 ... kernel:watchdog: CPU 0 TB:42545620888830, last SMP heartbeat TB:42537684573427 (15500ms ago) Message from syslogd@ibm-p9b-42 at May 7 04:00:47 ... kernel:watchdog: CPU 3 Hard LOCKUP Message from syslogd@ibm-p9b-42 at May 7 04:00:47 ... kernel:watchdog: CPU 3 TB:42545621013192, last heartbeat TB:42537428563546 (16000ms ago) Message from syslogd@ibm-p9b-42 at May 7 04:00:47 ... kernel:watchdog: CPU 3 became unstuck TB:42545626476761 [84119.553375] watchdog: CPU 0 detected hard LOCKUP on other CPUs 2 [84119.553437] watchdog: CPU 0 TB:43145708882094, last SMP heartbeat TB:43137521676261 (15990ms ago) [84119.554277] watchdog: CPU 2 Hard LOCKUP [84119.554281] watchdog: CPU 2 TB:43145709018683, last heartbeat TB:43137516557332 (16000ms ago) [84119.554283] Modules linked in: xt_CHECKSUM ipt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_chain_route_ipv6 nft_chain_nat_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 nft_counter nft_chain_route_ipv4 nft_chain_nat_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack nf_tables vhost_net vhost tap tun nfnetlink bluetooth ecdh_generic rfkill rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache bridge stp llc kvm_hv kvm i2c_dev sunrpc ses at24 ofpart enclosure powernv_flash scsi_transport_sas xts uio_pdrv_genirq mtd uio ipmi_powernv ipmi_devintf ibmpowernv vmx_crypto ipmi_msghandler opal_prd ip_tables xfs libcrc32c sd_mod sg ast i2c_algo_bit drm_vram_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm i40e aacraid drm_panel_orientation_quirks dm_mirror dm_region_hash dm_log dm_mod [84119.554421] CPU: 2 PID: 99745 Comm: qemu-kvm Kdump: loaded Not tainted 4.18.0-193.13.el8.ppc64le #1 [84119.554424] NIP: c0080000080bd7cc LR: c0080000080bd7cc CTR: c00000000001aba0 [84119.554428] REGS: c000000d885e3748 TRAP: 0100 Not tainted (4.18.0-193.13.el8.ppc64le) [84119.554430] MSR: 9000000102803033 <SF,HV,VEC,VSX,FP,ME,IR,DR,RI,LE,TM[E]> CR: 24422222 XER: 00000000 [84119.554456] CFAR: c000000001b35ca8 IRQMASK: 40000003d6f41f [84119.554462] GPR00: c0080000080bd7cc c000000d885e38b0 c000000001965300 c000000d885e3748 [84119.554472] GPR04: c009dff19f2f4198 0000000000000000 0000000000000000 0000000000000002 [84119.554483] GPR08: 0000000000000003 0000000000000000 0000000ffb120000 c0080000080cfd10 [84119.554494] GPR12: c00000000001aba0 c000000fffffcd00 00007fffa4e70000 00007ffd927d0000 [84119.554503] GPR16: 00007fffa3d24410 c000200df0a6a558 0000000000000001 c0000000012f5cf0 [84119.554514] GPR20: 0000000000000002 c000000001b35ca8 c000200df0a6a558 0000000000000002 [84119.554524] GPR24: 0000000000000002 ffffffffffffffff 0040000003d6f41f 0000000000000100 [84119.554534] GPR28: c000200df0a60000 c000000bfbd40000 c000000fec3ed900 
c000000bef1b66c0 [84119.554547] NIP [c0080000080bd7cc] kvmhv_run_single_vcpu+0x7a4/0xda0 [kvm_hv] [84119.554550] LR [c0080000080bd7cc] kvmhv_run_single_vcpu+0x7a4/0xda0 [kvm_hv] [84119.554552] Call Trace: [84119.554556] [c000000d885e38b0] [c0080000080bd7cc] kvmhv_run_single_vcpu+0x7a4/0xda0 [kvm_hv] (unreliable) [84119.554564] [c000000d885e3980] [c0080000080be748] kvmppc_vcpu_run_hv+0x980/0x1060 [kvm_hv] [84119.554569] [c000000d885e3a90] [c00800000841de5c] kvmppc_vcpu_run+0x34/0x48 [kvm] [84119.554575] [c000000d885e3ab0] [c008000008418f8c] kvm_arch_vcpu_ioctl_run+0x364/0x820 [kvm] [84119.554579] [c000000d885e3ba0] [c008000008403298] kvm_vcpu_ioctl+0x460/0x7d0 [kvm] [84119.554584] [c000000d885e3d10] [c00000000052c490] do_vfs_ioctl+0xe0/0xaa0 [84119.554590] [c000000d885e3de0] [c00000000052d024] sys_ioctl+0xc4/0x160 [84119.554595] [c000000d885e3e30] [c00000000000b408] system_call+0x5c/0x70 [84119.554598] Instruction dump: [84119.554602] 614af804 7fa95000 409efcf8 7fe3fb78 48012d0d e8410018 4bfffce8 60000000 [84119.554616] 60000000 2f9b0100 409efbfc 48012549 <e8410018> 4bfffbf0 60000000 60000000 [84119.564517] watchdog: CPU 2 became unstuck TB:43145714590195 [84119.564531] CPU: 2 PID: 99745 Comm: qemu-kvm Kdump: loaded Not tainted 4.18.0-193.13.el8.ppc64le #1 [84119.564569] NIP: c00000000000a8fc LR: c00000000001ae54 CTR: c00000000802dca0 [84119.564600] REGS: c000000d885e3610 TRAP: 0901 Not tainted (4.18.0-193.13.el8.ppc64le) [84119.564620] MSR: 900000010280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> CR: 28422244 XER: 20040000 [84119.564660] CFAR: 0000000000000874 IRQMASK: 0 [84119.564660] GPR00: c0080000080bd414 c000000d885e3890 c000000001965300 0000000000000900 [84119.564660] GPR04: 0000000ffb120000 0000273da4addc65 0000273bdadead42 c000000fffffcd00 [84119.564660] GPR08: 0000000800000000 0000000028008268 b000000000001003 0000000000000009 [84119.564660] GPR12: 0000000024842828 c000000fffffcd00 [84119.564778] NIP [c00000000000a8fc] replay_interrupt_return+0x0/0x4 [84119.564809] LR [c00000000001ae54] arch_local_irq_restore+0x74/0x90 [84119.564837] Call Trace: [84119.564843] [c000000d885e3890] [c000000fec3ed900] 0xc000000fec3ed900 (unreliable) [84119.564883] [c000000d885e38b0] [c0080000080bd414] kvmhv_run_single_vcpu+0x3ec/0xda0 [kvm_hv] [84119.564932] [c000000d885e3980] [c0080000080be748] kvmppc_vcpu_run_hv+0x980/0x1060 [kvm_hv] [84119.564973] [c000000d885e3a90] [c00800000841de5c] kvmppc_vcpu_run+0x34/0x48 [kvm] [84119.565013] [c000000d885e3ab0] [c008000008418f8c] kvm_arch_vcpu_ioctl_run+0x364/0x820 [kvm] [84119.565051] [c000000d885e3ba0] [c008000008403298] kvm_vcpu_ioctl+0x460/0x7d0 [kvm] [84119.565085] [c000000d885e3d10] [c00000000052c490] do_vfs_ioctl+0xe0/0xaa0 [84119.565105] [c000000d885e3de0] [c00000000052d024] sys_ioctl+0xc4/0x160 [84119.565127] [c000000d885e3e30] [c00000000000b408] system_call+0x5c/0x70 [84119.565160] Instruction dump: [84119.565168] 7d200026 618c8000 2c030900 4182e7e8 2c030500 4182f2e0 2c030f00 4182f3f8 [84119.565207] 2c030a00 4182ff9c 2c030e60 4182f088 <4e800020> 7c781b78 48000385 4800039d Message from syslogd@ibm-p9b-42 at May 7 04:20:19 ... kernel:watchdog: CPU 0 detected hard LOCKUP on other CPUs 2 Message from syslogd@ibm-p9b-42 at May 7 04:20:19 ... kernel:watchdog: CPU 0 TB:43145708882094, last SMP heartbeat TB:43137521676261 (15990ms ago) Message from syslogd@ibm-p9b-42 at May 7 04:20:19 ... kernel:watchdog: CPU 2 Hard LOCKUP Message from syslogd@ibm-p9b-42 at May 7 04:20:19 ... 
kernel:watchdog: CPU 2 TB:43145709018683, last heartbeat TB:43137516557332 (16000ms ago) Message from syslogd@ibm-p9b-42 at May 7 04:20:19 ... kernel:watchdog: CPU 2 became unstuck TB:43145714590195 Created attachment 1686207 [details]
auto-debug-log
Created attachment 1686209 [details]
auto-instruction
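For reference, a minimal sketch of the -no-shutdown workaround David describes above, i.e. keeping qemu alive after a guest panic so a dump can be taken from the host side. The machine type, memory size, and image path are placeholders rather than the command line used by the automation; -no-shutdown and the HMP dump-guest-memory/system_reset commands are standard qemu features.

  /usr/libexec/qemu-kvm -machine pseries -m 8G -smp 8 \
      -drive file=/path/to/guest.qcow2,if=virtio \
      -monitor stdio -no-shutdown
  # ... guest panics (e.g. echo c >/proc/sysrq-trigger with kdump stopped),
  # qemu stays running instead of exiting ...
  (qemu) dump-guest-memory /tmp/guest-vmcore    # write a guest memory dump from the host
  (qemu) system_reset                           # or reboot the crashed guest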
Build ibm-p9b-42.pnr.lab.eng.bos.redhat.com kernel-4.18.0-193.13.el8.ppc64le - host kernel-4.18.0-195.el8.ppc64le - guest SLOF-20200327-1.git8e012d6f.scrmod+el8.3.0+6495+1936fa11.wrb.noarch qemu-kvm-5.0.0-0.scrmod+el8.3.0+6495+1936fa11.wrb200506.ppc64le I see the guest side error described in comment 8: 03:45:49 INFO | [qemu output] error: kvm run failed Bad address This is probably caused by bug 1820402. I have posted a fix for that, but we're still waiting for it to be merged downstream. I'm not sure if the host side errors are related to this or not. I think we need to retest once we have a fix for bug 1820402 in order to check. Min, Now that we have a fix for bug 1820402, can you please retest this. You'll need your *host* kernel updated with the fix from bug 1820402. Already set up the test,will update the final result as soon as getting it,thanks. Fortunately,the similar problem still can be reproduced and uploaded log to this bug too. build information, kernel-4.18.0-200.el8.ppc64le host kernel-4.18.0-201.el8.ppc64le guest qemu-kvm-5.0.0-0.module+el8.3.0+6620+5d5e1420.ppc64le May 21 10:37:11 dhcp16-213-134.lab2.eng.bos.redhat.com systemd[1]: Starting Cr. May 21 10:37:14 dhcp16-213-134.lab2.eng.bos.redhat.com kdumpctl[7747]: Modified May 21 10:37:14 dhcp16-213-134.lab2.eng.bos.redhat.com kdumpctl[7747]: kexec: l May 21 10:37:14 dhcp16-213-134.lab2.eng.bos.redhat.com kdumpctl[7747]: Startin] May 21 10:37:15 dhcp16-213-134.lab2.eng.bos.redhat.com systemd[1]: Started Cra. Hint: Some lines were ellipsized, use -l to show in full. 22:37:16 INFO | Triggering crash on vcpu 0 ... 22:37:16 INFO | Context: Kdump Testing, force the Linux kernel to crash 22:37:16 DEBUG| Attempting to log into 'vm2' (timeout 360s) 22:37:16 DEBUG| Found/Verified IP 10.16.213.131 for VM vm2 NIC 0 22:37:16 INFO | [qemu output] error: kvm run failed Bad address 22:37:16 INFO | [qemu output] NIP c000000008008ac0 LR 000000000000066f CTR c000000008ddaff0 XER 0000000000000000 CPU#38 22:37:16 INFO | [qemu output] MSR c00000000008d330 HID0 0000000000000000 HF 8000000000000001 iidx 3 didx 3 22:37:16 INFO | [qemu output] TB 00000000 00000000 DECR 0 22:37:16 INFO | [qemu output] GPR00 000000000000066f c00000000f840c90 c000000009977900 0000000000000000 22:37:16 INFO | [qemu output] GPR04 fffffffaff26a3b8 000000000000066f 0000000000000008 c000000008ddaff0 22:37:16 INFO | [qemu output] GPR08 feeeeeeeeeeeeeee 00000004ffc7e880 c00000000805d338 0000000000000381 22:37:16 INFO | [qemu output] GPR12 9000000000001003 c00000000ff57a80 c0000000f7cd3f90 0000000000000000 22:37:16 INFO | [qemu output] GPR16 c0000000019b21d8 0000000000000001 0000000000000800 0000000000000001 22:37:16 INFO | [qemu output] GPR20 c000000001275608 0000000000000026 0000000000000001 0000000000000000 22:37:16 INFO | [qemu output] GPR24 0000000000000026 c000000009275808 feeeeeeeeeeeeeee c000000009041e40 22:37:16 INFO | [qemu output] GPR28 000000000000000b c00000000f840da0 c00000000f840c70 c00000000805d338 22:37:16 INFO | [qemu output] CR 88008228 [ L L - - L E E L ] RES ffffffffffffffff 22:37:16 INFO | [qemu output] SRR0 c000000008008ac0 SRR1 c00000000805d330 PVR 00000000004e1202 VRSAVE 0000000000000000 22:37:16 INFO | [qemu output] SPRG0 0000000000000000 SPRG1 c00000000ff57a80 SPRG2 c00000000ff57a80 SPRG3 0000000000000026 22:37:16 INFO | [qemu output] SPRG4 0000000000000000 SPRG5 0000000000000000 SPRG6 0000000000000000 SPRG7 0000000000000000 22:37:16 INFO | [qemu output] HSRR0 0000000000000000 HSRR1 0000000000000000 22:37:16 INFO | [qemu output] CFAR 
0000000000000000 22:37:16 INFO | [qemu output] LPCR 0000000003d6f41f 22:37:16 INFO | [qemu output] PTCR 0000000000000000 DAR beeeeeeef81646f6 DSISR 0000000000000000 22:37:16 INFO | [qemu output] error: kvm run failed Bad address 22:37:16 INFO | [qemu output] NIP c000000008008ac0 LR 000000000000066f CTR c000000008ddaff0 XER 0000000000000000 CPU#31 22:37:16 INFO | [qemu output] MSR c00000000008d5b0 HID0 0000000000000000 HF 8000000000000001 iidx 3 didx 3 22:37:16 INFO | [qemu output] TB 00000000 00000000 DECR 0 22:37:16 INFO | [qemu output] GPR00 000000000000066f c00000000f56dcb0 c000000009977900 0000000000000000 22:37:16 INFO | [qemu output] GPR04 fffffffaff40a638 000000000000066f 0000000000000008 c000000008ddaff0 22:37:16 INFO | [qemu output] GPR08 feeeeeeeeeeeeeee 00000004ffc7e880 c0000000081fd5b8 0000000000000381 22:37:16 INFO | [qemu output] GPR12 9000000000001003 c00000000ff62280 c0000000f7ce7f90 0000000000000000 22:37:16 INFO | [qemu output] GPR16 c0000000019b21d8 0000000000000000 0000000000000800 0000000000000001 22:37:16 INFO | [qemu output] GPR20 c000000001275608 000000000000001f 0000000000000000 0000000000000000 22:37:16 INFO | [qemu output] GPR24 000000000000001f c000000009275808 feeeeeeeeeeeeeee c000000009275808 22:37:16 INFO | [qemu output] GPR28 000000000000000b c00000000f56ddc0 c00000000f56dc90 c0000000081fd5b8 22:37:16 INFO | [qemu output] CR 88008228 [ L L - - L E E L ] RES ffffffffffffffff 22:37:16 INFO | [qemu output] SRR0 c000000008008ac0 SRR1 c0000000081fd5b0 PVR 00000000004e1202 VRSAVE 0000000000000000 22:37:16 INFO | [qemu output] SPRG0 0000000000000000 SPRG1 c00000000ff62280 SPRG2 c00000000ff62280 SPRG3 000000000000001f 22:37:16 INFO | [qemu output] SPRG4 0000000000000000 SPRG5 0000000000000000 SPRG6 0000000000000000 SPRG7 0000000000000000 22:37:16 INFO | [qemu output] HSRR0 0000000000000000 HSRR1 0000000000000000 Created attachment 1690489 [details]
newlog
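As a side note for reruns like the one above, a quick guest-side sanity check can confirm the crash kernel is actually armed before the sysrq trigger is written. This uses only the stock kdumpctl and kexec interfaces, nothing specific to this bug's automation:

  # kdumpctl status                      # should report that kdump is operational
  # cat /sys/kernel/kexec_crash_loaded   # 1 means a crash kernel is loaded
  # kdumpctl rebuild                     # rebuild the kdump initrd if it is stale or broken
  # echo c >/proc/sysrq-trigger          # then trigger the crash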
QE also tried the issue on older build and hit issues like this,it seems it was hardware issue,which was different with above comments.Also uploaded log to the bug.Thanks. build information, kernel-4.18.0-200.el8.ppc64le host kernel-4.18.0-193.el8.ppc64le guest,there's test result for kernel-4.18.0-201.el8.ppc64le,see above comment. qemu-kvm-5.0.0-0.module+el8.3.0+6620+5d5e1420.ppc64le 00:37:33 INFO | Triggering crash on vcpu 0 ... 00:37:33 INFO | Context: Kdump Testing, force the Linux kernel to crash 00:37:33 DEBUG| Attempting to log into 'vm2' (timeout 360s) 00:37:33 DEBUG| Found/Verified IP 10.16.213.142 for VM vm2 NIC 0 00:37:35 INFO | [qemu output] KVM: unknown exit, hardware reason 3b8 00:37:35 INFO | [qemu output] NIP 0000000000000700 LR c0000000081f6580 CTR c000000008dd9528 XER 0000000000000000 CPU#24 00:37:35 INFO | [qemu output] MSR 8000000000001001 HID0 0000000000000000 HF 8000000000000001 iidx 3 didx 3 00:37:35 INFO | [qemu output] TB 00000000 00000000 DECR 0 00:37:35 INFO | [qemu output] GPR00 c0000000081f6580 c00000000fbfe670 00007fff8872ee60 00007fff889005b8 00:37:35 INFO | [qemu output] GPR04 c000000008370de4 0000000000000669 0000000000000008 c000000008da8700 00:37:35 INFO | [qemu output] GPR08 feeeeeeeeeeeeeee 0000000040000000 0000000080000018 0000000000000381 00:37:35 INFO | [qemu output] GPR12 c000000008dd9528 c00000000ff6ca00 c0000000f7cd7f90 0000000000000000 00:37:35 INFO | [qemu output] GPR16 c0000000019520d8 0000000000000000 0000000000000800 0000000000000001 00:37:35 INFO | [qemu output] GPR20 c000000001235608 0000000000000018 c0000000016a7fa8 0000000000000000 00:37:35 INFO | [qemu output] GPR24 0000000000000018 0000000000000000 0000000000000019 c000000008ffc6c8 00:37:35 INFO | [qemu output] GPR28 00007fff889005b8 0000000000000669 0000000000000008 0000000000000000 00:37:35 INFO | [qemu output] CR 48008244 [ G L - - L E G G ] RES ffffffffffffffff 00:37:35 INFO | [qemu output] SRR0 c000000008dd952c SRR1 8000000000001003 PVR 00000000004e1202 VRSAVE 0000000000000000 00:37:35 INFO | [qemu output] SPRG0 0000000000000000 SPRG1 c00000000ff6ca00 SPRG2 c00000000ff6ca00 SPRG3 0000000000000018 00:37:35 INFO | [qemu output] SPRG4 0000000000000000 SPRG5 0000000000000000 SPRG6 0000000000000000 SPRG7 0000000000000000 00:37:35 INFO | [qemu output] HSRR0 0000000000000000 HSRR1 0000000000000000 00:37:35 INFO | [qemu output] CFAR 0000000000000000 00:37:35 INFO | [qemu output] LPCR 0000000003d6f41f 00:37:35 INFO | [qemu output] PTCR 0000000000000000 DAR beeeeeeef812fe8e DSISR 0000000000000000 00:37:36 DEBUG| Trying to SCP with command 'scp -r -v -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o PreferredAuthentications=password -P 22 root@\[10.16.213.142\]:/etc/kdump.conf /home/kar/workspace/job-results/job-2020-05-19T23.04-0ddb0eb/test-results/5-Host_RHEL.m8.u3.product_av.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.8.2.0.ppc64le.io-github-autotest-qemu.kdump.multi_vms/kdump.conf-vm2-test', timeout 600s Created attachment 1690509 [details]
old-build-log
Discussed with mdeng to clarify the situation.

The current issue is the one shown in comment 15. With several guests running in an automated test, one of them is falling over with a Bad Address error (-EFAULT) from KVM_RUN. This is despite having the fix for bug 1820402 present. The comment 17 trace isn't really interesting because that kernel does *not* have the bug 1820402 bug fix present, which means we don't know if it's showing the issue we're still trying to find or the known problem from bug 1820402.

Unfortunately the traces in comment 15 give us almost no useful information. They are all guest state, but an -EFAULT from KVM_RUN almost certainly indicates a host kernel or qemu error.

Next steps: we'll need at least one of these two things to proceed from here:
1) The *host* dmesg logs when the error occurs (unfortunately those weren't captured in the comment 15 case)
2) Instructions on how to set up and run the automated test case that has triggered this problem

(In reply to David Gibson from comment #19)
> Discussed with mdeng to clarify the situation.
>
> The current issue is the one shown in comment 15. With several guests
> running in an automated test, one of them is falling over with a Bad Address
> error (-EFAULT) from KVM_RUN. This is despite having the fix for bug
> 1820402 present. The comment 17 trace isn't really interesting because that
> kernel does *not* have the bug 1820402 bug fix present, which means we don't
> know if it's showing the issue we're still trying to find or the known
> problem from bug 1820402.
>
> Unfortunately the traces in comment 15 give us almost no useful information.
> They are all guest state, but an -EFAULT from KVM_RUN almost certainly
> indicates a host kernel or qemu error.
>
> Next steps: we'll need at least one of these two things to proceed from here:
>
> 1) The *host* dmesg logs when the error occurs (unfortunately those weren't
> captured in the comment 15 case)
> 2) Instructions on how to set up and run the automated test case that has
> triggered this problem

QE will try it later; it will probably take quite some time. As soon as the result is available, the bug will be updated accordingly. Thanks.

Tried it, but still hit the same issue without any explicit error message from the host. I will try it on another host, thanks.
Build information:
qemu-kvm-5.0.0-0.scrmod+el8.3.0+7150+88a2c83e.wrb200624.ppc64le

Min,

Have you been able to gather any more information about how this bug triggers?

Tried the bug on both P8 and P9, and I guess it is related to the number of vCPUs, since in the automation command line every guest consumes half of the host's CPUs.

P8 results: the host has 24 CPUs and each guest has 12 in the command line, but I failed to reproduce it manually.

01:33:32 INFO | Triggering crash on vcpu 0 ...
01:33:32 INFO | Context: Check the vmcore file after triggering a crash 01:33:37 INFO | Context: Check the vmcore file after triggering a crash --> Waiting for kernel crash dump to complete 01:33:37 DEBUG| Attempting to log into 'avocado-vt-vm1' (timeout 1200s) 01:33:38 DEBUG| Found/Verified IP 10.0.1.218 for VM avocado-vt-vm1 NIC 0 01:35:08 DEBUG| Attempting to log into 'avocado-vt-vm1' via serial console (timeout 10s) 01:35:31 WARNI| Error occur when update VM address cache: Login timeout expired (output: 'exceeded 10 s timeout') 01:37:35 DEBUG| Attempting to log into 'avocado-vt-vm1' via serial console (timeout 10s) 01:37:57 WARNI| Error occur when update VM address cache: Login timeout expired (output: 'exceeded 10 s timeout') 01:38:32 INFO | [qemu output] qemu-kvm: OS terminated: OS panic: System is deadlocked on memory 01:38:32 INFO | [qemu output] 01:38:32 INFO | [qemu output] (Process terminated with status 0) 01:38:33 WARNI| registers is not alive. Can't query the avocado-vt-vm1 status 01:40:02 DEBUG| Attempting to log into 'avocado-vt-vm1' via serial console (timeout 10s) 01:40:13 WARNI| Error occur when update VM address cache: VM is dead detail: 'qemu-kvm: OS terminated: OS panic: System is deadlocked on memory\n\n' 01:42:17 DEBUG| Attempting to log into 'avocado-vt-vm1' via serial console (timeout 10s) 01:42:18 WARNI| Error occur when update VM address cache: VM is dead detail: 'qemu-kvm: OS terminated: OS panic: System is deadlocked on memory\n\n' 01:44:23 DEBUG| Attempting to log into 'avocado-vt-vm1' via serial console (timeout 10s) 01:44:23 WARNI| Error occur when update VM address cache: VM is dead detail: 'qemu-kvm: OS terminated: OS panic: System is deadlocked on memory\n\n' 01:46:28 DEBUG| Attempting to log into 'avocado-vt-vm1' via serial console (timeout 10s) 01:46:29 WARNI| Error occur when update VM address cache: VM is dead detail: 'qemu-kvm: OS terminated: OS panic: System is deadlocked on memory\n\n' 01:48:33 DEBUG| Attempting to log into 'avocado-vt-vm1' via serial console (timeout 10s) 01:48:34 WARNI| Error occur when update VM address cache: VM is dead detail: 'qemu-kvm: OS terminated: OS panic: System is deadlocked on memory\n\n' 01:50:39 DEBUG| Attempting to log into 'avocado-vt-vm1' via serial console (timeout 10s) 01:50:39 WARNI| Error occur when update VM address cache: VM is dead detail: 'qemu-kvm: OS terminated: OS panic: System is deadlocked on memory\n\n' 01:52:21 WARNI| IPv6 address sniffing is not supported yet by using TShark, please fallback to use other sniffers by uninstalling TShark when testing with IPv6 01:52:44 DEBUG| Attempting to log into 'avocado-vt-vm1' via serial console (timeout 10s) 01:52:44 WARNI| Error occur when update VM address cache: VM is dead detail: 'qemu-kvm: OS terminated: OS panic: System is deadlocked on memory\n\n' 01:53:47 ERROR| Can't get guest network status information, reason: Client process terminated (status: 1, output: '') 01:53:47 DEBUG| Attempting to log into 'avocado-vt-vm1' (timeout 360s) 02:00:03 DEBUG| Attempting to log into 'avocado-vt-vm1' via serial console (timeout 360s) 02:00:03 WARNI| Error occur when update VM address cache: VM is dead detail: 'qemu-kvm: OS terminated: OS panic: System is deadlocked on memory\n\n' P9's results, Got the same issue but with error was as following, ... Message from syslogd@ibm-p9wr-04 at Aug 3 04:26:21 ... kernel:kvmppc_emulate_mmio: emulation failed (7ce01828) Notes, the debug logs also were attached in the bug. Created attachment 1703288 [details]
for P8 and P9
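To make future runs more useful for the host-side dmesg request above, a capture could be left running on the host for the whole multi-guest test. This is only a sketch; the log file names and the point at which the automated test is launched are placeholders, while the journalctl/dmesg options are the standard ones:

  # On the host, before starting the automated multi-guest kdump test:
  journalctl -k -f > /tmp/host-kmsg-$(date +%Y%m%d-%H%M%S).log &
  # ... run the automated kdump test against the guests ...
  # afterwards, also keep a one-shot snapshot for cross-checking:
  dmesg -T > /tmp/host-dmesg-after.log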
Build information, kernel-4.18.0-230.el8.ppc64le qemu-kvm-5.1.0-0.scrmod+el8.3.0+7493+a5e196a4.wrb200729.ppc64le P8's host [root@ibm-p8-11 home]# lscpu Architecture: ppc64le Byte Order: Little Endian CPU(s): 192 On-line CPU(s) list: 0,8,16,24,32,40,48,56,64,72,80,88,96,104,112,120,128,136,144,152,160,168,176,184 Off-line CPU(s) list: 1-7,9-15,17-23,25-31,33-39,41-47,49-55,57-63,65-71,73-79,81-87,89-95,97-103,105-111,113-119,121-127,129-135,137-143,145-151,153-159,161-167,169-175,177-183,185-191 Thread(s) per core: 1 Core(s) per socket: 6 Socket(s): 4 NUMA node(s): 4 Model: 2.1 (pvr 004b 0201) Model name: POWER8E (raw), altivec supported CPU max MHz: 3923.0000 CPU min MHz: 2061.0000 L1d cache: 64K L1i cache: 32K L2 cache: 512K L3 cache: 8192K NUMA node0 CPU(s): 0,8,16,24,32,40 NUMA node1 CPU(s): 48,56,64,72,80,88 NUMA node16 CPU(s): 96,104,112,120,128,136 NUMA node17 CPU(s): 144,152,160,168,176,184 [root@ibm-p8-11 home]# free -m total used free shared buff/cache available Mem: 518925 11433 491251 42 16240 504346 Swap: 4095 0 4095 P9 host, [root@ibm-p9wr-04 home]# lscpu Architecture: ppc64le Byte Order: Little Endian CPU(s): 128 On-line CPU(s) list: 0-127 Thread(s) per core: 4 Core(s) per socket: 16 Socket(s): 2 NUMA node(s): 2 Model: 2.2 (pvr 004e 1202) Model name: POWER9, altivec supported CPU max MHz: 3800.0000 CPU min MHz: 2300.0000 L1d cache: 32K L1i cache: 32K L2 cache: 512K L3 cache: 10240K NUMA node0 CPU(s): 0-63 NUMA node8 CPU(s): 64-127 [root@ibm-p9wr-04 home]# free -m total used free shared buff/cache available Mem: 257550 4673 241671 33 11205 251164 Swap: 4095 0 4095 Do a summary for this bug, 1.The bug was reproduced on comment15 Build information, kernel-4.18.0-200.el8.ppc64le host kernel-4.18.0-201.el8.ppc64le guest qemu-kvm-5.0.0-0.module+el8.3.0+6620+5d5e1420.ppc64le 2.QE also hit other situations on comment23 Build information, kernel-4.18.0-230.el8.ppc64le qemu-kvm-5.1.0-0.scrmod+el8.3.0+7493+a5e196a4.wrb200729.ppc64le please have a look on the attachment if you need. QE tried this bug on the latest build as followings, kernel-4.18.0-236.el8.ppc64le qemu-kvm-5.1.0-6.module+el8.3.0+8041+42ff16b8.ppc64le P9:ibm-p9wr-14.ibm2.lab.eng.bos.redhat.com P8:ibm-p8-garrison-01.rhts.eng.bos.redhat.com After running the test for multiple times [about 10 times], now the bug can't be reproduced. Thanks. Any issues please let me know. Created attachment 1714820 [details]
latest build
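For completeness, the repeated re-verification described in the comment above could be scripted roughly as below. The guest address, iteration count, and sleep time are made-up placeholders, and the vmcore path assumes the default /var/crash layout:

  GUEST=10.0.0.2          # hypothetical guest address
  for i in $(seq 1 10); do
      # trigger a crash in the guest; the ssh session dies when the kernel panics
      ssh root@$GUEST 'kdumpctl status && echo c >/proc/sysrq-trigger' || true
      sleep 180           # give the kdump kernel time to boot and write the vmcore
      # verify a fresh vmcore was written before the next iteration
      ssh root@$GUEST 'ls -lt /var/crash/*/vmcore | head -1'
  done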
Based on comment 27, this can be closed as CURRENTRELEASE. If there are any concerns, please just let me know, thanks a lot.

Great news. Thanks for rechecking this, Min.