Bug 1506882
| Field | Value |
| --- | --- |
| Summary | Call trace showed up in dmesg after migrating guest when "stress-ng --numa 2" was running inside guest |
| Product | Red Hat Enterprise Linux 7 |
| Component | qemu-kvm-rhev |
| Version | 7.5 |
| Hardware | ppc64le |
| OS | Unspecified |
| Status | CLOSED ERRATA |
| Severity | medium |
| Priority | medium |
| Target Milestone | rc |
| Reporter | Min Deng <mdeng> |
| Assignee | Laurent Vivier <lvivier> |
| QA Contact | Min Deng <mdeng> |
| CC | dgibson, knoel, lmiksik, lvivier, mdeng, michen, mrezanin, qzhang, virt-maint, xianwang |
| Fixed In Version | qemu-kvm-rhev-2.10.0-9.el7 |
| Doc Type | If docs needed, set a value |
| Type | Bug |
| Last Closed | 2018-04-11 00:44:15 UTC |
Description (Min Deng, 2017-10-27 05:10:36 UTC)
QE will try it on x86 later, as soon as we get the test result.

It won't be reproduced without migration.

It won't be reproduced while the guest has more than one node.

(In reply to Min Deng from comment #1)
> QE will try it on x86 later, as soon as we get the test result.

QE tried the bug on the following builds and it could not be reproduced on the x86 platform, so mark the bug as POWER-only. Thanks a lot.
qemu-kvm-rhev-2.10.0-3.el7.x86_64
guest: kernel-3.10.0-693.el7.x86_64 / kernel-3.10.0-759.el7.x86_64.rpm

We need a bit more information here:
1. Does the call trace appear in the guest or the host?
2. Can we have more context from the call trace? Comment 0 mentions a trace in an attachment, but I don't see any attachment on this BZ.

(In reply to David Gibson from comment #4)
> 1. Does the call trace appear in the guest or the host?

The call trace happened in the guest.

> 2. Can we have more context from the call trace? Comment 0 mentions a trace in an attachment, but I don't see any attachment on this BZ.

I will double check and upload it, thanks a lot.

Created attachment 1351778 [details]
calltracelogfromguest
The call trace seems to be a soft lockup with a ludicrously long time. It looks like something is messed up in the migration of the time. Laurent, I believe you were looking at another bug with a similar symptom; this is likely to be a dupe.

Hi Laurent,
I hit a problem that is similar to this bug, but I am not sure. The result is as follows: on P9, execute "# stress --cpu 4 --io 4 --vm 2"; after migration, the migration status on the source host is "completed" and the VM status on the destination host is "running", but the VM hangs on the destination host and there is a call trace in dmesg:

[   24.472855] Call Trace:
[   24.472858] [c0000001f593bc50] [c0000001f5b84970] 0xc0000001f5b84970 (unreliable)
[   24.472861] [c0000001f593bc80] [c00000000012e770] copy_mm+0x48c/0x63c
[   24.472865] [c0000001f593bd40] [c00000000012b09c] copy_process.isra.31.part.32+0x72c/0x10a0
[   24.472868] [c0000001f593bdc0] [c00000000012c970] _do_fork+0xf0/0x4e0
[   24.472873] [c0000001f593be30] [c00000000000b410] ppc_clone+0x8/0xc
[   24.472874] Instruction dump:
[   24.472878] 7c7f1b78 48000008 e8410018 e93f00a8 e95f00a0 e9090008 e94a00f0 e90806d8
[   24.472885] 7d0a5278 7d4a0074 794ad182 694a0001 <0b0a0000> 39490030 7d005028 31080001
[   24.472893] ---[ end trace b01f99a09aec16ae ]---
[   34.500801] nr_pdflush_threads exported in /proc is scheduled for removal
[   35.280012] sctp: Hash tables configured (bind 4096/4096)

Could you help to check whether the problem is the same as this bug? If not, I will file a new BZ, thanks.

(In reply to xianwang from comment #8)
> Could you help to check whether the problem is the same as this bug? If not, I will file a new BZ, thanks.

The interesting part in the comment 6 attachment is:

NMI watchdog: BUG: soft lockup - CPU#0 stuck for 3489660870s! [stress-ng-numa:2697]
NMI watchdog: BUG: soft lockup - CPU#1 stuck for 3489660884s! [stress-ng-numa:2698]

Do you have the same traces with this new bug?

Bisected to:

commit 42043e4f1241eeb77f87f5816b5cf0b6e9583ed7
Author: Laurent Vivier <lvivier>
Date:   Fri Jan 27 13:24:58 2017 +0100

    spapr: clock should count only if vm is running

    This is a port to ppc of the i386 commit:
      00f4d64 kvmclock: clock should count only if vm is running

    We remove the timebase_post_load function, and use the VM state change
    handler to save and restore the guest_timebase (on stop and continue).

    We keep timebase_pre_save to reduce the clock difference on migration,
    like in:
      6053a86 kvmclock: reduce kvmclock difference on migration

    The time base offset was originally introduced by commit
      98a8b52 spapr: Add support for time base offset migration

    So while the VM is paused, the time is stopped. This allows us to get
    the same result from date (based on the Time Base Register) and hwclock
    (based on the "get-time-of-day" RTAS call).

    Moreover, in TCG mode the Time Base is always paused, so this patch
    also aligns the behavior between TCG and KVM.

    The VM state field "time_of_the_day_ns" is now useless, but we keep it
    to be able to migrate to older versions of the machine.

    As the vmstate_ppc_timebase structure (with the timebase_pre_save() and
    timebase_post_load() functions) was only used by vmstate_spapr, we
    register the VM state change handler only in ppc_spapr_init().
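As background for the commit message above, the sketch below shows the general shape of a QEMU VM state change handler that freezes the guest Time Base while the VM is stopped and re-applies an offset when it resumes. This is only an illustration, not the code from hw/ppc/ppc.c: saved_guest_tb, read_host_tb() and current_tb_offset() are invented names, while qemu_add_vm_change_state_handler(), kvm_set_one_reg() and KVM_REG_PPC_TB_OFFSET are the real interfaces involved.

/*
 * Simplified sketch (not the actual hw/ppc/ppc.c code) of the stop/continue
 * handling described in the commit message, assuming QEMU 2.10-era APIs.
 */
static uint64_t saved_guest_tb;   /* guest TB value captured when the VM stops */

static void timebase_state_change(void *opaque, int running, RunState state)
{
    if (running) {
        /* VM resumes: recompute the offset so the guest TB continues from
         * the value it had when the VM was stopped, then push it to KVM. */
        int64_t tb_offset = saved_guest_tb - read_host_tb();
        CPUState *cpu;

        CPU_FOREACH(cpu) {
            kvm_set_one_reg(cpu, KVM_REG_PPC_TB_OFFSET, &tb_offset);
        }
    } else {
        /* VM stops (e.g. for migration): remember the current guest TB. */
        saved_guest_tb = read_host_tb() + current_tb_offset();
    }
}

static void timebase_register_handler(void)
{
    /* Registered once at machine init, as the commit message says is done
     * in ppc_spapr_init(). */
    qemu_add_vm_change_state_handler(timebase_state_change, NULL);
}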
Commit 42043e4 adds a call to kvm_set_one_reg(cpu, KVM_REG_PPC_TB_OFFSET, ...):

...
@@ -904,9 +890,44 @@ static int timebase_post_load(void *opaque, int version_id)
     CPU_FOREACH(cpu) {
         PowerPCCPU *pcpu = POWERPC_CPU(cpu);
         pcpu->env.tb_env->tb_offset = tb_off_adj;
+#if defined(CONFIG_KVM)
+        kvm_set_one_reg(cpu, KVM_REG_PPC_TB_OFFSET,
+                        &pcpu->env.tb_env->tb_offset);
+#endif
     }
+}
...

and it seems to be the cause of the problem. If I remove it, everything works fine.

I've checked the content of the TBR on the source and destination, and the content is okay before and after migration under stress:

#include <stdio.h>

#define FREQ 512000000  /* Time Base frequency: 512 MHz */

int main(void)
{
    unsigned long timebase;

    asm("mftb %0" : "=r" (timebase));   /* read the Time Base Register */
    printf("%lu.%lu\n", timebase / FREQ, timebase % FREQ);
    return 0;
}

If I use an upstream kernel 4.14.0+ (cf9b077) in the guest, the problem cannot be reproduced.

Smells like a host kernel bug setting the TB_OFFSET that got fixed upstream.

(In reply to Laurent Vivier from comment #13)
> If I use an upstream kernel 4.14.0+ (cf9b077) in the guest, the problem cannot be reproduced.

Sorry, I made a mistake: I had used the old qemu-kvm-rhev-2.9 package. I have updated the packages to qemu-kvm-rhev-2.10.0-6.el7 and the problem is gone, so please ignore comment 8, thanks.

(In reply to xianwang from comment #15)
> I have updated the packages to qemu-kvm-rhev-2.10.0-6.el7 and the problem is gone, so please ignore comment 8, thanks.

Now I have updated to the latest packages as follows:
Host: 4.14.0-2.el7a.ppc64le, qemu-kvm-rhev-2.10.0-6.el7.ppc64le, SLOF-20170724-2.git89f519f.el7.noarch
Guest: 4.14.0-2.el7a.ppc64le
If I stress the guest with "# stress --io 4" the problem is not hit, but with "# stress --cpu 4 --io 4 --vm 2 --vm-bytes 128M" this bug is sometimes reproduced as in comment 8. But maybe, as Laurent said, it is fixed upstream.

(In reply to Laurent Vivier from comment #13)
> If I use an upstream kernel 4.14.0+ (cf9b077) in the guest, the problem cannot be reproduced.

Bisecting doesn't converge on a commit that could fix the problem (nothing related to POWER or KVM), so I think the problem simply cannot be reproduced reliably with the latest kernel and is not actually fixed.

The timestamp is computed using get_timestamp(), which uses running_clock(). On POWER, running_clock() is:

unsigned long long running_clock(void)
{
        ...
        if (firmware_has_feature(FW_FEATURE_LPAR) &&
            cpu_has_feature(CPU_FTR_ARCH_207S))
                return mulhdu(get_vtb() - boot_tb, tb_to_ns_scale) << tb_to_ns_shift;
        ...
        return local_clock() - cputime_to_nsecs(kcpustat_this_cpu->cpustat[CPUTIME_STEAL]);
}

In our case both features are true, and the cause of the problem is that boot_tb > get_vtb(): get_vtb() does not return a correct value after migration. The problem is in QEMU, which doesn't save/restore the content of the VTB.
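This also explains why the watchdog printed times on the order of billions of seconds ("stuck for 3489660870s"): when the VTB is not carried across migration, the destination's VTB can be smaller than the boot_tb value sampled at guest boot, and the unsigned subtraction get_vtb() - boot_tb wraps around to an enormous tick count. The standalone sketch below is not kernel code and the concrete numbers are made up; it only illustrates the wraparound:

#include <stdio.h>
#include <stdint.h>

#define TB_FREQ 512000000ULL   /* POWER Time Base frequency: 512 MHz */

int main(void)
{
    /* Hypothetical values: boot_tb sampled at guest boot, VTB restarted
     * near zero on the destination because it was never migrated. */
    uint64_t boot_tb = 123456789012345ULL;
    uint64_t vtb     = 98765ULL;

    /* Unsigned subtraction wraps around to roughly 2^64 - boot_tb ticks. */
    uint64_t ticks = vtb - boot_tb;

    printf("apparent elapsed time: %llu seconds\n",
           (unsigned long long)(ticks / TB_FREQ));
    return 0;
}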
The following patch fixes the problem:

diff --git a/target/ppc/translate_init.c b/target/ppc/translate_init.c
index b9c49c2..4e11e6f 100644
--- a/target/ppc/translate_init.c
+++ b/target/ppc/translate_init.c
@@ -8081,10 +8081,10 @@ static void gen_spr_power8_ebb(CPUPPCState *env)
 /* Virtual Time Base */
 static void gen_spr_vtb(CPUPPCState *env)
 {
-    spr_register(env, SPR_VTB, "VTB",
+    spr_register_kvm(env, SPR_VTB, "VTB",
                  SPR_NOACCESS, SPR_NOACCESS,
                  &spr_read_tbl, SPR_NOACCESS,
-                 0x00000000);
+                 KVM_REG_PPC_VTB, 0x00000000);
 }

But the fix is not enough on its own, as we can still have a problem when migrating from a previous release and trying to restore an invalid value.

I've tested a migration from qemu-2.8.0 and pseries-2.8.0 to a patched master (qemu-2.11.0-rc1). The VTB received is 0 (as it is not initialized by qemu-2.8.0), but the value seems to be ignored by KVM and a non-zero VTB is used by the kernel. I have no explanation for that.

Sent upstream: http://patchwork.ozlabs.org/patch/840170/

Fix included in qemu-kvm-rhev-2.10.0-9.el7

Reproduced the bug on
kernel-3.10.0-747.el7.ppc64le
qemu-kvm-rhev-2.10.0-3.el7.ppc64le

Verified the bug on
kernel-3.10.0-804.el7.ppc64le
qemu-kvm-rhev-2.10.0-10.el7.ppc64le

For the steps, please refer to comment 0.

To draw a conclusion: the original issue has been fixed, so the bug can be moved to VERIFIED, thanks a lot.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:1104