Red Hat Bugzilla – Bug 1259171
[RHEL.7.2] got stack trace when test ksm serial splitting and merging
Last modified: 2018-03-30 17:38:28 EDT
Description of problem:
guest got stack trace when run autotest script to test ksm serial splitting and merging
Version-Release number of selected component (if applicable):
Steps to Reproduce:
run autotest script ksm_serial, autotest will boot up 4 guests.
one guest got call trace:
2015-08-31 19:51:37: [ 5544.443056] Call Trace:
2015-08-31 19:51:37: [ 5544.443056] [<ffffffff811c5f89>] migrate_pages+0x1e9/0x790
2015-08-31 19:51:37: [ 5544.443056] [<ffffffff8118d120>] ? isolate_freepages_block+0x380/0x380
2015-08-31 19:51:37: [ 5544.443056] [<ffffffff8118e119>] compact_zone+0x299/0x400
2015-08-31 19:51:37: [ 5544.443056] [<ffffffff8118e31c>] compact_zone_order+0x9c/0xf0
2015-08-31 19:51:37: [ 5544.443056] [<ffffffff8118e6d1>] try_to_compact_pages+0x121/0x1a0
2015-08-31 19:51:37: [ 5544.443056] [<ffffffff8162e218>] __alloc_pages_direct_compact+0xac/0x196
2015-08-31 19:51:37: [ 5544.443056] [<ffffffff811723a8>] __alloc_pages_nodemask+0x788/0xb90
2015-08-31 19:51:37: [ 5544.443056] [<ffffffff811b665a>] alloc_pages_vma+0x9a/0x140
2015-08-31 19:51:37: [ 5544.443056] [<ffffffff811cb093>] do_huge_pmd_anonymous_page+0x123/0x510
2015-08-31 19:51:37: [ 5544.443056] [<ffffffff81195b38>] handle_mm_fault+0x1a8/0xf30
2015-08-31 19:51:37: [ 5544.443056] [<ffffffff8163ea42>] __do_page_fault+0x152/0x420
2015-08-31 19:51:37: [ 5544.443056] [<ffffffff81186909>] ? vm_mmap_pgoff+0xb9/0xe0
2015-08-31 19:51:37: [ 5544.443056] [<ffffffff8163edd3>] trace_do_page_fault+0x43/0x110
2015-08-31 19:51:37: [ 5544.443056] [<ffffffff8163e4f9>] do_async_page_fault+0x29/0xe0
2015-08-31 19:51:37: [ 5544.443056] [<ffffffff8163b078>] async_page_fault+0x28/0x30
2015-08-31 19:51:37: [ 5544.443056] Code: ff ff 07 00 77 46 4c 89 e2 48 c1 ea 16 48 8b 14 d5 80 77 ed 81 48 85 d2 74 32 83 e0 7f 48 c1 e0 05 48 01 d0 74 26 f6 00 02 74 21 <49> 8b 45 00 48 89 c2 48 c1 e8 29 48 c1 ea 36 25 00 18 00 00 48
another guest got stack trace too:
2015-08-31 19:29:09: [ 4336.736054] Stack:
2015-08-31 19:29:09: [ 4336.736054] ffffffff81166f9a 00000003ba85b000 ffff880af2e87d60 ffffffff81168c5f
2015-08-31 19:29:09: [ 4336.736054] ffff880af2e87d08 ffff880af2e87e70 00000003ba85b000 0000000000001000
2015-08-31 19:29:09: [ 4336.736054] ffff880032b8d080 ffff880af2e87fd8 00000000fa259f00 0000000000000000
2015-08-31 19:29:09: [ 4336.736054] Call Trace:
2015-08-31 19:29:09: [ 4336.736054] [<ffffffff81166f9a>] ? iov_iter_copy_from_user_atomic+0x6a/0xa0
2015-08-31 19:29:09: [ 4336.736054] [<ffffffff81168c5f>] generic_file_buffered_write+0x14f/0x290
2015-08-31 19:29:09: [ 4336.736054] [<ffffffff8116a155>] __generic_file_aio_write+0x1d5/0x3e0
2015-08-31 19:29:09: [ 4336.736054] [<ffffffff8116a3bd>] generic_file_aio_write+0x5d/0xc0
2015-08-31 19:29:09: [ 4336.736054] [<ffffffff811dcb1d>] do_sync_write+0x8d/0xd0
2015-08-31 19:29:09: [ 4336.736054] [<ffffffff811dd33d>] vfs_write+0xbd/0x1e0
2015-08-31 19:29:09: [ 4336.736054] [<ffffffff811ddddf>] SyS_write+0x7f/0xe0
2015-08-31 19:29:09: [ 4336.736054] [<ffffffff81643509>] system_call_fastpath+0x16/0x1b
2015-08-31 19:29:09: [ 4336.736054] Code: 74 30 83 fa 08 72 27 89 f9 83 e1 07 74 15 83 e9 08 f7 d9 29 ca 8a 06 88 07 48 ff c6 48 ff c7 ff c9 75 f2 89 d1 c1 e9 03 83 e2 07 <f3> 48 a5 89 d1 f3 a4 31 c0 66 66 90 c3 0f 1f 00 66 66 90 21 d2
no error .
1) host cpu:
processor : 31
vendor_id : AuthenticAMD
cpu family : 21
model : 1
model name : AMD Opteron(TM) Processor 6272
stepping : 2
microcode : 0x600063d
cpu MHz : 2100.007
cache size : 2048 KB
physical id : 1
siblings : 16
core id : 7
cpu cores : 8
apicid : 79
initial apicid : 47
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid amd_dcm aperfmperf pni pclmulqdq monitor ssse3 cx16 sse4_1 sse4_2 popcnt aes xsave avx lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 nodeid_msr topoext perfctr_core perfctr_nb arat cpb hw_pstate npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold
this seems to crash in the page migration code of the host kernel (guest is irrelevant for this bug).
If KSM is into the equation and the page migration otherwise works fine (NUMA balancing and memory compaction are the two main users of page migration) this could mean something's wrong in the migration of KSM pages.
Unfortunately both stack traces are truncated so it's uncertain if it's really the migration of KSM pages that tripped on a bug, so above I'm guessing.
To proceed with proper debugging:
1) could you retrieve a full log? Before the Call Trace: and Stack: lines there's other useful information. Perhaps it's still in /var/log.
2) is a kernel crash dump available with kexec? could you reproduce with that enabled?
3) does it go away if you run the below commands shortly after boot before running the testcase?
echo NO_NUMA > /sys/kernel/debug/sched_features
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo 1000000000 > /sys/kernel/mm/transparent_hugepage/khugepaged/scan_sleep_millisecs
Ignore if the first command fails if the host is not NUMA.
The point 3) really intends to disable page migration, but there's no global tweak to disable it, and no global tweak to disable compaction either. Even after point 3) is applied, drivers will still invoke compaction at will, so the crash may still happen even after point 3) sometime, but it should happen less frequently if page migration of KSM pages is the problem.
Andrea, This looks very ugly. I can't tell how often this happens. We don't want these messages filling up the logs. Do you know the cause?
You said "crash" but this looks like just a backtrace. You said host, but Yang Meng said the BT was in the guest. Which is it?
Yang Meng, What is the qemu-kvm command line? Is there NUMA set up in the guests? Thanks!
I think I confused them for host stack traces because I wouldn't have expected KSM to be able to destabilize the guest like this and the main problem is still that they're truncated and we don't know if it's a crash or just a warning, could be a VM_BUG_ON a WARN_ON or an harmless guest softlockup induced by a host inefficiency.
They are definitely guest stacktraces (like the first comment correctly said), because do_async_page_fault is in one of them.
The header above that stack trace would tell it. We need the full logs.
Considering the other KSM bug I've been working on, I wonder if the host tried to migrate a KSM page and these are harmless guest softlockups that happened because KSM in the host computed for too long while trying to migrate a KSM page. I assume KSM has been running in the host only, if yes, I still think the bug is unlikely in the guest, but we need the full logs around the stacktraces to make progress on this one.
(In reply to Karen Noel from comment #3)
> Andrea, This looks very ugly. I can't tell how often this happens. We don't
> want these messages filling up the logs. Do you know the cause?
> You said "crash" but this looks like just a backtrace. You said host, but
> Yang Meng said the BT was in the guest. Which is it?
> Yang Meng, What is the qemu-kvm command line? Is there NUMA set up in the
> guests? Thanks!
hi, numa didn't set up ,and the following is the commandline:
-name 'virt-tests-vm1' \
-sandbox off \
-machine pc \
-vga cirrus \
-chardev socket,id=qmp_id_qmpmonitor1,path=/tmp/monitor-qmpmonitor1-20150831-181355-S3Y85kan,server,nowait \
-mon chardev=qmp_id_qmpmonitor1,mode=control \
-chardev socket,id=qmp_id_catch_monitor,path=/tmp/monitor-catch_monitor-20150831-181355-S3Y85kan,server,nowait \
-mon chardev=qmp_id_catch_monitor,mode=control \
-device pvpanic,ioport=0x505,id=idRGYTYM \
-chardev socket,id=serial_id_serial0,path=/tmp/serial-serial0-20150831-181355-S3Y85kan,server,nowait \
-device isa-serial,chardev=serial_id_serial0 \
-chardev socket,id=seabioslog_id_20150831-181355-S3Y85kan,path=/tmp/seabios-20150831-181355-S3Y85kan,server,nowait \
-device isa-debugcon,chardev=seabioslog_id_20150831-181355-S3Y85kan,iobase=0x402 \
-device ich9-usb-uhci1,id=usb1,bus=pci.0,addr=03 \
-drive id=drive_image1,if=none,cache=none,snapshot=on,aio=native,format=qcow2,file=/home/autotest-devel/client/tests/virt/shared/data/images/RHEL-Server-7.2-64.qcow2 \
-device ide-hd,id=image1,drive=drive_image1,bus=ide.0,unit=0 \
-device e1000,mac=9a:18:19:1a:1b:1c,id=idB5ehSJ,netdev=idvI6NWv,bus=pci.0,addr=04 \
-netdev tap,id=idvI6NWv,fd=21 \
-m 44958 \
-smp 16,maxcpus=16,cores=8,threads=1,sockets=2 \
-cpu 'Opteron_G4',+kvm_pv_unhalt \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-vnc :0 \
-rtc base=utc,clock=host,driftfix=slew \
-boot order=cdn,once=c,menu=off,strict=off \
-pidfile /tmp/virt-tests-vm1-20150831-181352-bHLc.pid \
which other lines are in /var/log just before/around 19:51:37? and 19:29:09?
(In reply to Andrea Arcangeli from comment #6)
> Hello Yang,
> which other lines are in /var/log just before/around 19:51:37? and 19:29:09?
the log is not got from /var/log ,it's just the log generated by autotest script, do you need the full log of autotest ?
If the guest images aren't wiped out automatically after the autotest script run, you could try to restsart the guest and try to grab those logs from `journalctl` or /var/log/. If those two fragments are the only thing that gets logged into the autotest log, there shall be more kernel messages before those Stack/Call trace lines into /var/log or journalctl. If there aren't other kernel messages in the autotest logs I probably don't need the full log of autotest, other kernel messages would have the exact same timestamp: 2015-08-31 19:29:09 and 2015-08-31 19:51:37. If there are other lines with those two timestamps I need those yes.