Description of problem:
Installed a RHEL 8.1 (4.18.0-119.el8.ppc64le) guest and built LTP from git (git clone https://github.com/linux-test-project/ltp.git). While running the LTP cpuset_hotplug test (controllers runtest file), the guest reports "Processor 1 is stuck." and "task irqbalance:4305 blocked for more than 120 seconds".

Version-Release number of selected component (if applicable):
host:
# lscpu
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 128
On-line CPU(s) list: 0-127
Thread(s) per core: 4
Core(s) per socket: 16
Socket(s): 2
NUMA node(s): 2
Model: 2.3 (pvr 004e 1203)
Model name: POWER9, altivec supported
CPU max MHz: 3800.0000
CPU min MHz: 2300.0000
L1d cache: 32K
L1i cache: 32K
L2 cache: 512K
L3 cache: 10240K
NUMA node0 CPU(s): 0-63
NUMA node8 CPU(s): 64-127

# update_flash_nv -d
Firmware version:
Product Version : witherspoon-ibm-OP9-v2.2-3.5
Product Extra : bmc-firmware-version-2.03
Product Extra : buildroot-2019.02.1-16-ge01dcd0
Product Extra : capp-ucode-p9-dd2-v4
Product Extra : hcode-hw040319a.940
Product Extra : hostboot-e5622fb
Product Extra : hostboot-binaries-hw021419a.930
Product Extra : linux-5.0.5-openpower1-p4b42b5c
Product Extra : machine-xml-e3e9aef
Product Extra : occ-58e422d
Product Extra : petitboot-v1.10.3
Product Extra : sbe-1410677
Product Extra : skiboot-v6.3-rc1-p1ce8930

# uname -r
4.18.0-120.el8.ppc64le
# /usr/libexec/qemu-kvm -version
QEMU emulator version 4.0.0 (qemu-kvm-4.0.0-5.module+el8.1.0+3622+5812d9bf)

guest:
# lscpu
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 64
On-line CPU(s) list: 0,2-63
Off-line CPU(s) list: 1
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 32
NUMA node(s): 1
Model: 2.3 (pvr 004e 1203)
Model name: POWER9 (architected), altivec supported
Hypervisor vendor: KVM
Virtualization type: para
L1d cache: 32K
L1i cache: 32K
NUMA node0 CPU(s): 0,2-63

# uname -r
4.18.0-119.el8.ppc64le

How reproducible:
3/3

Steps to Reproduce:
1. Install a RHEL 8.1 (4.18.0-119.el8.ppc64le) guest.
2. Boot the guest:
# /usr/libexec/qemu-kvm \
    -name zhenyzha-RHEL-8.1 \
    -sandbox off \
    -machine pseries,cap-nested-hv=on \
    -m 120G \
    -nodefaults \
    -vga std \
    -device nec-usb-xhci,id=xhci \
    -device usb-tablet,id=usb-tablet0 \
    -device usb-kbd,id=usb-kbd0 \
    -smp 64,cores=2,threads=1,sockets=32 \
    -vnc :30 \
    -serial stdio \
    -rtc base=utc,clock=host \
    -boot order=cdn,menu=off,strict=off \
    -enable-kvm \
    -qmp unix:/var/tmp/qmp-monitor-zhenyzha,server,nowait \
    -qmp tcp:0:3001,server,nowait \
    -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup \
    -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=40:f2:e9:5d:9c:07 \
    -device virtio-scsi-pci,bus=pci.0,addr=0x06,id=scsi-pci-0 \
    -drive id=my0,format=qcow2,media=disk,if=none,file=os.qcow2 \
    -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=my0,id=virtio0-0-1,bootindex=0
3. Build LTP on the guest:
# git clone https://github.com/linux-test-project/ltp.git
# make autotools
# ./configure
# make
# make install
# cd /opt/ltp
4. Run the LTP test case:
# /opt/ltp/runltp -f controllers

The guest /var/log/messages shows:
Jul 25 10:42:30 localhost LTP: starting cpuset_hotplug (cpuset_hotplug_test.sh)
Jul 25 10:42:30 localhost kernel: cpu 1 (hwid 8) Ready to die...
Jul 25 10:42:30 localhost systemd[1]: Started /usr/lib/udev/kdump-udev-throttler.
Jul 25 10:42:31 localhost kernel: cpu 1 (hwid 8) Ready to die...
Jul 25 10:42:31 localhost kernel: Querying DEAD? cpu 1 (8) shows 2
Jul 25 10:42:32 localhost kdump-udev-throttler[41941]: kexec: unloaded kdump kernel
Jul 25 10:42:32 localhost kdump-udev-throttler[41941]: Stopping kdump: [OK]
Jul 25 10:42:32 localhost kdump-udev-throttler[41941]: Modified cmdline:BOOT_IMAGE=/vmlinuz-4.18.0-119.el8.ppc64le ro console=ttyS0,115200 biosdevname=0 net.ifnames=0 rhgb console=tty0 irqpoll maxcpus=1 noirqdistrib reset_devices cgroup_disable=memory numa=off udev.children-max=2 ehea.use_mcs=0 panic=10 rootflags=nofail kvm_cma_resv_ratio=0 transparent_hugepage=never novmcoredd elfcorehdr=158272K
Jul 25 10:42:32 localhost kdump-udev-throttler[41941]: kexec: loaded kdump kernel
Jul 25 10:42:32 localhost kdump-udev-throttler[41941]: Starting kdump: [OK]
Jul 25 10:45:01 localhost kernel: Processor 1 is stuck.
Jul 25 10:47:31 localhost kernel: INFO: task irqbalance:4305 blocked for more than 120 seconds. -----------------------------------------blocked
Jul 25 10:47:31 localhost kernel: Not tainted 4.18.0-119.el8.ppc64le #1
Jul 25 10:47:31 localhost kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 25 10:47:31 localhost kernel: irqbalance D 0 4305 1 0x00040080
Jul 25 10:47:31 localhost kernel: Call Trace:
Jul 25 10:47:31 localhost kernel: [c000001d967df850] [c000001d75582f80] 0xc000001d75582f80 (unreliable)
Jul 25 10:47:31 localhost kernel: [c000001d967dfa20] [c00000000001fa10] __switch_to+0x2e0/0x4e0
Jul 25 10:47:31 localhost kernel: [c000001d967dfa80] [c000000000d411c4] __schedule+0x2c4/0xb20
Jul 25 10:47:31 localhost kernel: [c000001d967dfb50] [c000000000d41a68] schedule+0x48/0xb0
Jul 25 10:47:31 localhost kernel: [c000001d967dfb70] [c000000000d42060] schedule_preempt_disabled+0x20/0x30
Jul 25 10:47:31 localhost kernel: [c000001d967dfb90] [c000000000d43ac8] __mutex_lock.isra.1+0x3b8/0x6f0
Jul 25 10:47:31 localhost kernel: [c000001d967dfc20] [c0000000001e18f8] irq_lock_sparse+0x28/0x40
Jul 25 10:47:31 localhost kernel: [c000001d967dfc40] [c0000000001ee54c] show_interrupts+0x18c/0x550
Jul 25 10:47:31 localhost kernel: [c000001d967dfd00] [c00000000050b66c] seq_read+0x1bc/0x640
Jul 25 10:47:31 localhost kernel: [c000001d967dfda0] [c000000000594264] proc_reg_read+0x84/0x100
Jul 25 10:47:31 localhost kernel: [c000001d967dfdd0] [c0000000004c489c] sys_read+0x10c/0x310
Jul 25 10:47:31 localhost kernel: [c000001d967dfe30] [c00000000000b388] system_call+0x5c/0x70
Jul 25 10:47:31 localhost kernel: Processor 1 is stuck.
Jul 25 10:49:02 localhost rhsmd[42188]: In order for Subscription Manager to provide your system with updates, your system must be registered with the Customer Portal. Please enter your Red Hat login to ensure your system is up-to-date.
Jul 25 10:50:01 localhost kernel: Processor 1 is stuck.
Jul 25 10:50:02 localhost systemd[1]: Starting system activity accounting tool...
Jul 25 10:50:02 localhost systemd[1]: Started system activity accounting tool.
Jul 25 10:52:32 localhost kernel: Processor 1 is stuck.
Jul 25 10:55:03 localhost kernel: Processor 1 is stuck.
Jul 25 10:57:33 localhost kernel: Processor 1 is stuck.
Jul 25 10:59:48 localhost kernel: INFO: task irqbalance:4305 blocked for more than 120 seconds.
Jul 25 10:59:48 localhost kernel: Not tainted 4.18.0-119.el8.ppc64le #1
Jul 25 10:59:48 localhost kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 25 10:59:48 localhost kernel: irqbalance D 0 4305 1 0x00040080
Jul 25 10:59:48 localhost kernel: Call Trace:
Jul 25 10:59:48 localhost kernel: [c000001d967df850] [000000010005b820] 0x10005b820 (unreliable)
Jul 25 10:59:48 localhost kernel: [c000001d967dfa20] [c00000000001fa10] __switch_to+0x2e0/0x4e0
Jul 25 10:59:48 localhost kernel: [c000001d967dfa80] [c000000000d411c4] __schedule+0x2c4/0xb20
Jul 25 10:59:48 localhost kernel: [c000001d967dfb50] [c000000000d41a68] schedule+0x48/0xb0
Jul 25 10:59:48 localhost kernel: [c000001d967dfb70] [c000000000d42060] schedule_preempt_disabled+0x20/0x30
Jul 25 10:59:48 localhost kernel: [c000001d967dfb90] [c000000000d43ac8] __mutex_lock.isra.1+0x3b8/0x6f0
Jul 25 10:59:48 localhost kernel: [c000001d967dfc20] [c0000000001e18f8] irq_lock_sparse+0x28/0x40
Jul 25 10:59:48 localhost kernel: [c000001d967dfc40] [c0000000001ee54c] show_interrupts+0x18c/0x550
Jul 25 10:59:48 localhost kernel: [c000001d967dfd00] [c00000000050b66c] seq_read+0x1bc/0x640
Jul 25 10:59:48 localhost kernel: [c000001d967dfda0] [c000000000594264] proc_reg_read+0x84/0x100
Jul 25 10:59:48 localhost kernel: [c000001d967dfdd0] [c0000000004c489c] sys_read+0x10c/0x310
Jul 25 10:59:48 localhost kernel: [c000001d967dfe30] [c00000000000b388] system_call+0x5c/0x70
Jul 25 11:00:03 localhost kernel: Processor 1 is stuck.
Jul 25 11:00:04 localhost systemd[1]: Starting system activity accounting tool...
Jul 25 11:00:04 localhost LTP: starting cpuset_memory (cpuset_memory_testset.sh)
Jul 25 11:00:04 localhost systemd[1]: Started system activity accounting tool.
Jul 25 11:00:04 localhost LTP: starting cpuset_memory_pressure (cpuset_memory_pressure_testset.sh)
Jul 25 11:00:04 localhost LTP: starting cpuset_memory_spread (cpuset_memory_spread_testset.sh)
Jul 25 11:00:04 localhost LTP: starting cpuset_regression_test (cpuset_regression_test.sh)
Jul 25 11:00:04 localhost LTP: starting cgroup_xattr
Jul 25 11:00:04 localhost kernel: new mount options do not match the existing superblock, will be ignored

# cat /opt/ltp/results/LTP_RUN_ON-2019_07_25-09h_57m_04s.log | grep FAIL
memcg_max_usage_in_bytes FAIL 2
memcg_stat FAIL 1
memcg_use_hierarchy FAIL 1
cpuset_hotplug FAIL 1
cpuset_regression_test FAIL 1

Actual results:
The guest logs "Processor 1 is stuck." and "task irqbalance:4305 blocked for more than 120 seconds".

Expected results:
The guest runs the test without a call trace.

Additional info:
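For triage, the two symptoms in the log above (the repeated "Processor 1 is stuck." message and the hung-task reports) can be pulled out of a syslog file with standard tools. This is only a minimal sketch, not part of LTP or of the original report: the function name summarize_hotplug_log is hypothetical, and it assumes the message wording shown in this report.

```shell
#!/bin/sh
# Sketch only: summarise CPU-hotplug symptoms in a syslog-style file.
# summarize_hotplug_log is a hypothetical helper name.

# summarize_hotplug_log [LOGFILE]   (defaults to /var/log/messages)
summarize_hotplug_log() {
    # Count "Processor N is stuck" messages per CPU number.
    grep -oE 'Processor [0-9]+ is stuck' "${1:-/var/log/messages}" \
        | awk '{ n[$2]++ } END { for (c in n) printf "cpu %s stuck x%d\n", c, n[c] }' \
        | sort
    # List each task flagged by the hung-task detector once, as name:pid.
    grep -oE 'task [^ ]+:[0-9]+ blocked for more than [0-9]+ seconds' "${1:-/var/log/messages}" \
        | awk '{ print "blocked " $2 }' \
        | sort -u
}
```

On the guest this would be run as `summarize_hotplug_log /var/log/messages` after the controllers run finishes.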
Created attachment 1593342
Guest /var/log
(In reply to zhenyzha from comment #0)
...
> # cat /opt/ltp/results/LTP_RUN_ON-2019_07_25-09h_57m_04s.log | grep FAIL
> memcg_max_usage_in_bytes FAIL 2
> memcg_stat FAIL 1
> memcg_use_hierarchy FAIL 1

These 3 failures are tracked in BZ 1732785 and are not related to the cpuset_hotplug error.
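Since the memcg failures are tracked separately, it can help to filter them out when checking a results log for this bug. A minimal sketch, assuming the three-column "Testcase Result Exit Value" layout of the LTP results log quoted in this report; ltp_failed_tests is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: list failed test names from an LTP results log.
# ltp_failed_tests is a hypothetical helper name.

# ltp_failed_tests LOGFILE [EXCLUDE_REGEX]
# Prints the first column of every row whose second column is FAIL,
# skipping test names that match EXCLUDE_REGEX when one is given.
ltp_failed_tests() {
    awk -v skip="${2:-}" '
        $2 == "FAIL" && (skip == "" || $1 !~ skip) { print $1 }
    ' "$1"
}
```

For example, `ltp_failed_tests /opt/ltp/results/LTP_RUN_ON-2019_07_25-09h_57m_04s.log '^memcg_'` would leave only the cpuset failures from the run above.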
zhenyzha, Can you retest now that we have official builds based on qemu-4.1?
(In reply to David Gibson from comment #7)
> zhenyzha,
>
> Can you retest now that we have official builds based on qemu-4.1?

OK, I will update with the results later.
Hit this issue again on qemu-4.1.

# cat results/LTP_RUN_ON-2019_08_26-16h_58m_15s.log | grep FAIL
......
cpuset_hotplug FAIL 1
cpuset_regression_test FAIL 1

Guest /var/log/messages:
Aug 26 17:44:39 dhcp19-129-175 LTP: starting cpuset_hotplug (cpuset_hotplug_test.sh)
Aug 26 17:44:40 dhcp19-129-175 kernel: Querying DEAD? cpu 1 (8) shows 2
Aug 26 17:44:40 dhcp19-129-175 kernel: cpu 1 (hwid 8) Ready to die...
Aug 26 17:44:40 dhcp19-129-175 systemd[1]: Started /usr/lib/udev/kdump-udev-throttler.
Aug 26 17:44:40 dhcp19-129-175 kernel: Querying DEAD? cpu 1 (8) shows 2
Aug 26 17:44:41 dhcp19-129-175 kdump-udev-throttler[41319]: kexec: unloaded kdump kernel
Aug 26 17:44:41 dhcp19-129-175 kdump-udev-throttler[41319]: Stopping kdump: [OK]
Aug 26 17:44:42 dhcp19-129-175 kdump-udev-throttler[41319]: Modified cmdline:BOOT_IMAGE=/vmlinuz-4.18.0-136.el8.ppc64le ro console=ttyS0,115200 biosdevname=0 net.ifnames=0 rhgb console=tty0 irqpoll maxcpus=1 noirqdistrib reset_devices cgroup_disable=memory numa=off udev.children-max=2 ehea.use_mcs=0 panic=10 rootflags=nofail kvm_cma_resv_ratio=0 transparent_hugepage=never novmcoredd elfcorehdr=158272K
Aug 26 17:44:42 dhcp19-129-175 kdump-udev-throttler[41319]: kexec: loaded kdump kernel
Aug 26 17:44:42 dhcp19-129-175 kdump-udev-throttler[41319]: Starting kdump: [OK]
Aug 26 17:47:10 dhcp19-129-175 kernel: WARNING: CPU: 1 PID: 0 at arch/powerpc/platforms/pseries/hotplug-cpu.c:159 pseries_mach_cpu_die+0xbc/0x2f0
Aug 26 17:47:10 dhcp19-129-175 kernel: Modules linked in: loop fuse nf_tables_set nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 nft_chain_route_ipv6 nft_chain_nat_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nft_chain_route_ipv4 nf_conntrack ip6_tables ip_tables nft_compat ip_set nf_tables nfnetlink xfs libcrc32c bochs_drm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm xts vmx_crypto virtio_net net_failover virtio_blk failover drm_panel_orientation_quirks dm_multipath dm_mirror dm_region_hash dm_log dm_mod be2iscsi bnx2i cnic uio cxgb4i cxgb4 libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi
Aug 26 17:47:10 dhcp19-129-175 kernel: CPU: 1 PID: 0 Comm: swapper/1 Kdump: loaded Not tainted 4.18.0-136.el8.ppc64le #1
Aug 26 17:47:10 dhcp19-129-175 kernel: NIP: c0000000000fc61c LR: c0000000000fc5d8 CTR: c000000007ffee00
Aug 26 17:47:10 dhcp19-129-175 kernel: REGS: c0000018f674fab0 TRAP: 0700 Not tainted (4.18.0-136.el8.ppc64le)
Aug 26 17:47:10 dhcp19-129-175 kernel: MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE> CR: 24000444 XER: 20040000
Aug 26 17:47:10 dhcp19-129-175 kernel: CFAR: c0000000000b2bb8 IRQMASK: 1 #012GPR00: c0000000000fc5d8 c0000018f674fd30 c000000001662900 0000000000000001 #012GPR04: c0000018fe16b570 c0000018fe16b570 0000000000000001 0000000000000010 #012GPR08: c0000018fe16b570 0000000000000001 00000018fcfb0000 00000018fcfb0000 #012GPR12: c0000000000b5200 c000000007ffee00 c0000018f674ff90 0000000000000000 #012GPR16: c0000000016920e8 0000000000000000 0000000000000800 0000000000000001 #012GPR20: c000000001195608 0000000000000001 0000000000000000 00000000000000 0000008 c000000001691ee8
Aug 26 17:47:10 dhcp19-129-175 kernel: NIP [c0000000000fc61c] pseries_mach_cpu_die+0xbc/0x2f0
Aug 26 17:47:10 dhcp19-129-175 kernel: LR [c0000000000fc5d8] pseries_mach_cpu_die+0x78/0x2f0
Aug 26 17:47:10 dhcp19-129-175 kernel: Call Trace:
Aug 26 17:47:10 dhcp19-129-175 kernel: [c0000018f674fd30] [c0000000000fc5d8] pseries_mach_cpu_die+0x78/0x2f0 (unreliable)
Aug 26 17:47:10 dhcp19-129-175 kernel: [c0000018f674fde0] [c0000000000592e8] cpu_die+0x48/0x70
Aug 26 17:47:10 dhcp19-129-175 kernel: [c0000018f674fe00] [c0000000000210c0] arch_cpu_idle_dead+0x20/0x40
Aug 26 17:47:10 dhcp19-129-175 kernel: [c0000018f674fe20] [c000000000199154] do_idle+0x2d4/0x480
Aug 26 17:47:10 dhcp19-129-175 kernel: [c0000018f674fea0] [c000000000199538] cpu_startup_entry+0x38/0x40
Aug 26 17:47:10 dhcp19-129-175 kernel: [c0000018f674fed0] [c000000000058ea0] start_secondary+0x780/0x860
Aug 26 17:47:10 dhcp19-129-175 kernel: [c0000018f674ff90] [c00000000000ac70] start_secondary_prolog+0x10/0x14
Aug 26 17:47:10 dhcp19-129-175 kernel: Instruction dump:
Aug 26 17:47:10 dhcp19-129-175 kernel: 3b1842f0 7d5bf02a 3bb80004 7fa9eb78 7d29502e 2f890001 419e00c0 7d5bf02a
Aug 26 17:47:10 dhcp19-129-175 kernel: 7d3d502e 7d290034 5529d97e 69290001 <0b090000> 38800000 39200000 7f45d378
Aug 26 17:47:10 dhcp19-129-175 kernel: ---[ end trace 9b6249510dc45846 ]---
Aug 26 17:47:10 dhcp19-129-175 kernel: cpu 1 (hwid 8) Ready to die...
Aug 26 17:47:10 dhcp19-129-175 kernel: Processor 1 is stuck. -----------------------------------------stuck
Aug 26 17:49:23 dhcp19-129-175 kernel: INFO: task irqbalance:3711 blocked for more than 120 seconds. -----------------------------------------blocked
Aug 26 17:49:23 dhcp19-129-175 kernel: Tainted: G W --------- - - 4.18.0-136.el8.ppc64le #1
Aug 26 17:49:23 dhcp19-129-175 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 26 17:49:23 dhcp19-129-175 kernel: irqbalance D 0 3711 1 0x00040080
Aug 26 17:49:23 dhcp19-129-175 kernel: Call Trace:
Aug 26 17:49:23 dhcp19-129-175 kernel: [c0000018efe13850] [c0000018efdc5e80] 0xc0000018efdc5e80 (unreliable)
Aug 26 17:49:23 dhcp19-129-175 kernel: [c0000018efe13a20] [c00000000001fa00] __switch_to+0x2e0/0x4e0
Aug 26 17:49:23 dhcp19-129-175 kernel: [c0000018efe13a80] [c000000000d43654] __schedule+0x2c4/0xb20
Aug 26 17:49:23 dhcp19-129-175 kernel: [c0000018efe13b50] [c000000000d43ef8] schedule+0x48/0xb0
Aug 26 17:49:23 dhcp19-129-175 kernel: [c0000018efe13b70] [c000000000d444f0] schedule_preempt_disabled+0x20/0x30
Aug 26 17:49:23 dhcp19-129-175 kernel: [c0000018efe13b90] [c000000000d45f58] __mutex_lock.isra.1+0x3b8/0x6f0
Aug 26 17:49:23 dhcp19-129-175 kernel: [c0000018efe13c20] [c0000000001e1758] irq_lock_sparse+0x28/0x40
Aug 26 17:49:23 dhcp19-129-175 kernel: [c0000018efe13c40] [c0000000001ee3ac] show_interrupts+0x18c/0x550
Aug 26 17:49:23 dhcp19-129-175 kernel: [c0000018efe13d00] [c00000000050d30c] seq_read+0x1bc/0x640
Aug 26 17:49:23 dhcp19-129-175 kernel: [c0000018efe13da0] [c000000000595e94] proc_reg_read+0x84/0x100
Aug 26 17:49:23 dhcp19-129-175 kernel: [c0000018efe13dd0] [c0000000004c668c] sys_read+0x10c/0x310
Aug 26 17:49:23 dhcp19-129-175 kernel: [c0000018efe13e30] [c00000000000b388] system_call+0x5c/0x70
Aug 26 17:49:23 dhcp19-129-175 kernel: INFO: task kworker/53:1:24184 blocked for more than 120 seconds.
Aug 26 17:49:23 dhcp19-129-175 kernel: Tainted: G W --------- - - 4.18.0-136.el8.ppc64le #1
Aug 26 17:49:23 dhcp19-129-175 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 26 17:49:23 dhcp19-129-175 kernel: kworker/53:1 D 0 24184 2 0x00000888
Aug 26 17:49:23 dhcp19-129-175 kernel: Workqueue: events slab_caches_to_rcu_destroy_workfn
Aug 26 17:49:23 dhcp19-129-175 kernel: Call Trace:
Aug 26 17:49:23 dhcp19-129-175 kernel: [c0000017bfc437d0] [c0000017c77ed900] 0xc0000017c77ed900 (unreliable)
Aug 26 17:49:23 dhcp19-129-175 kernel: [c0000017bfc439a0] [c00000000001fa00] __switch_to+0x2e0/0x4e0
Aug 26 17:49:23 dhcp19-129-175 kernel: [c0000017bfc43a00] [c000000000d43654] __schedule+0x2c4/0xb20
Aug 26 17:49:23 dhcp19-129-175 kernel: [c0000017bfc43ad0] [c000000000d43ef8] schedule+0x48/0xb0
Aug 26 17:49:23 dhcp19-129-175 kernel: [c0000017bfc43af0] [c000000000d47ca8] rwsem_down_read_failed+0x138/0x250
Aug 26 17:49:23 dhcp19-129-175 kernel: [c0000017bfc43b70] [c0000000001d1578] __percpu_down_read+0x128/0x130
Aug 26 17:49:23 dhcp19-129-175 kernel: [c0000017bfc43ba0] [c000000000144abc] cpus_read_lock+0x7c/0x90
Aug 26 17:49:23 dhcp19-129-175 kernel: [c0000017bfc43bc0] [c0000000001f7e18] rcu_barrier+0xc8/0x320
Aug 26 17:49:23 dhcp19-129-175 kernel: [c0000017bfc43c30] [c0000000003f96d8] slab_caches_to_rcu_destroy_workfn+0xa8/0x110
Aug 26 17:49:23 dhcp19-129-175 kernel: [c0000017bfc43c70] [c000000000171ef4] process_one_work+0x2f4/0x5c0
Aug 26 17:49:23 dhcp19-129-175 kernel: [c0000017bfc43d10] [c000000000172c50] worker_thread+0x360/0x760
Aug 26 17:49:23 dhcp19-129-175 kernel: [c0000017bfc43dc0] [c00000000017c4dc] kthread+0x1ac/0x1c0
Aug 26 17:49:23 dhcp19-129-175 kernel: [c0000017bfc43e30] [c00000000000b75c] ret_from_kernel_thread+0x5c/0x80
Aug 26 17:49:41 dhcp19-129-175 kernel: Processor 1 is stuck.
Aug 26 17:52:11 dhcp19-129-175 kernel: Processor 1 is stuck.
Aug 26 17:52:11 dhcp19-129-175 systemd[1]: Starting system activity accounting tool...
Aug 26 17:52:11 dhcp19-129-175 systemd[1]: Started system activity accounting tool.
Aug 26 17:54:41 dhcp19-129-175 kernel: Processor 1 is stuck.
Aug 26 17:57:12 dhcp19-129-175 kernel: Processor 1 is stuck.
Aug 26 17:59:37 dhcp19-129-175 kernel: INFO: task kworker/0:3:402 blocked for more than 120 seconds.
Aug 26 17:59:37 dhcp19-129-175 kernel: Tainted: G W --------- - - 4.18.0-136.el8.ppc64le #1
Aug 26 17:59:37 dhcp19-129-175 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 26 17:59:37 dhcp19-129-175 kernel: kworker/0:3 D 0 402 2 0x00000808
Aug 26 17:59:37 dhcp19-129-175 kernel: Workqueue: events vmstat_shepherd

Version-Release number of selected component (if applicable):
host:
# uname -r
4.18.0-137.el8.ppc64le
# /usr/libexec/qemu-kvm -version
QEMU emulator version 4.1.0 (qemu-kvm-4.1.0-4.module+el8.1.0+4020+16089f93)

guest:
# uname -r
4.18.0-136.el8.ppc64le
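The WARNING in pseries_mach_cpu_die and the "Ready to die..." lines point at the CPU offline path itself, so a narrower reproducer than the full controllers run may be possible. The following is a hedged sketch under that assumption (it is not confirmed in this report that offlining alone triggers the hang). It uses the kernel's standard sysfs hotplug control file, /sys/devices/system/cpu/cpuN/online; the helper names set_cpu_online and hotplug_cycle are hypothetical.

```shell
#!/bin/sh
# Sketch only: toggle one CPU offline and online through sysfs.
# set_cpu_online and hotplug_cycle are hypothetical helper names.

# set_cpu_online CPU STATE [SYSFS_ROOT]
# STATE is 0 (offline) or 1 (online). SYSFS_ROOT can be overridden
# to point at a scratch directory for a dry run.
set_cpu_online() {
    ctl="${3:-/sys/devices/system/cpu}/cpu$1/online"
    if [ ! -w "$ctl" ]; then
        echo "cannot write $ctl (need root, or CPU is not hotpluggable)" >&2
        return 1
    fi
    echo "$2" > "$ctl"
}

# hotplug_cycle CPU [COUNT] [SYSFS_ROOT]
# Offlines and re-onlines CPU COUNT times (default 10), stopping on error.
hotplug_cycle() {
    i=0
    while [ "$i" -lt "${2:-10}" ]; do
        set_cpu_online "$1" 0 "${3:-/sys/devices/system/cpu}" || return 1
        set_cpu_online "$1" 1 "${3:-/sys/devices/system/cpu}" || return 1
        i=$((i + 1))
    done
}
```

Run as root on the guest, for example `hotplug_cycle 1 20`, while watching `dmesg` for "Processor 1 is stuck.". If the write itself blocks, that already matches the behaviour reported here.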
Additional info:
The same steps were tested on qemu-kvm-4.0.0-6 and did not hit this issue.

# /usr/libexec/qemu-kvm -version
QEMU emulator version 4.0.0 (qemu-kvm-4.0.0-6.module+el8.1.0+3736+a2aefea3)

cpuset_hotplug PASS
cpuset_regression_test PASS
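Because the outcome differs between qemu-kvm builds, recording the exact package build with every retest makes the results easier to compare. A small sketch, assuming the `-version` output format quoted in this report; qemu_pkg_nvr and report_versions are hypothetical helper names.

```shell
#!/bin/sh
# Sketch only: capture the kernel and qemu-kvm build for a test record.
# qemu_pkg_nvr and report_versions are hypothetical helper names.

# qemu_pkg_nvr: read `qemu-kvm -version` output on stdin and print the
# package build that appears in parentheses on the first line.
qemu_pkg_nvr() {
    sed -n 's/^QEMU emulator version [^ ]* (\(.*\)).*$/\1/p'
}

# report_versions [QEMU_BINARY]
# Prints the running kernel and the qemu-kvm build (empty if not found).
report_versions() {
    printf 'kernel: %s\n' "$(uname -r)"
    printf 'qemu:   %s\n' "$("${1:-/usr/libexec/qemu-kvm}" -version 2>/dev/null | qemu_pkg_nvr)"
}
```

Running `report_versions` on the host and `uname -r` on the guest before each LTP run gives the same information as the version blocks in the comments here.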
Additional info:
The same steps were tested directly on the 4.18.0-137.el8.ppc64le host and did not hit this issue. cpuset_hotplug still fails there, but without any call trace:

cpuset_hotplug FAIL ----------------but no Call Trace
cpuset_regression_test PASS

[root@ibm-p9b-11 results]# cat LTP_RUN_ON-2019_08_29-02h_47m_11s.log
Test Start Time: Thu Aug 29 02:47:12 2019
-----------------------------------------
Testcase                       Result     Exit Value
--------                       ------     ----------
cpuset_hotplug                 FAIL       1

-----------------------------------------------
Total Tests: 1
Total Skipped Tests: 0
Total Failures: 1
Kernel Version: 4.18.0-137.el8.ppc64le
Machine Architecture: ppc64le
Hostname: ibm-p9b-11.pnr.lab.eng.bos.redhat.com

Host /var/log/messages:
Aug 29 02:47:11 ibm-p9b-11 kernel: loop: module loaded
Aug 29 02:47:12 ibm-p9b-11 LTP: starting cpuset_hotplug (cpuset_hotplug_test.sh)
Aug 29 02:47:12 ibm-p9b-11 kernel: IRQ 31: no longer affine to CPU1
Aug 29 02:47:12 ibm-p9b-11 kernel: IRQ 110: no longer affine to CPU1
Aug 29 02:47:12 ibm-p9b-11 kernel: IRQ 187: no longer affine to CPU1
Aug 29 02:47:12 ibm-p9b-11 kernel: IRQ 244: no longer affine to CPU1
Aug 29 02:47:12 ibm-p9b-11 kernel: IRQ 263: no longer affine to CPU1
Aug 29 02:47:12 ibm-p9b-11 kernel: IRQ 439: no longer affine to CPU1
Aug 29 02:47:12 ibm-p9b-11 kernel: IRQ 460: no longer affine to CPU1
Aug 29 02:47:12 ibm-p9b-11 kernel: IRQ 536: no longer affine to CPU1
Aug 29 02:47:12 ibm-p9b-11 systemd[1]: Started /usr/lib/udev/kdump-udev-throttler.
Aug 29 02:47:13 ibm-p9b-11 systemd[1]: Started /usr/lib/udev/kdump-udev-throttler.
Aug 29 02:47:13 ibm-p9b-11 kdump-udev-throttler[66336]: Throttling kdump restart for concurrent udev event
Aug 29 02:47:13 ibm-p9b-11 kdump-udev-throttler[66196]: kexec: unloaded kdump kernel
Aug 29 02:47:13 ibm-p9b-11 kdump-udev-throttler[66196]: Stopping kdump: [OK]
Aug 29 02:47:14 ibm-p9b-11 kdump-udev-throttler[66196]: Modified cmdline:ro irqpoll maxcpus=1 noirqdistrib reset_devices cgroup_disable=memory numa=off udev.children-max=2 ehea.use_mcs=0 panic=10 rootflags=nofail kvm_cma_resv_ratio=0 transparent_hugepage=never novmcoredd elfcorehdr=158272K
Aug 29 02:47:14 ibm-p9b-11 kdump-udev-throttler[66196]: kexec: loaded kdump kernel
Aug 29 02:47:14 ibm-p9b-11 kdump-udev-throttler[66196]: Starting kdump: [OK]
Aug 29 02:47:15 ibm-p9b-11 systemd[1]: Started /usr/lib/udev/kdump-udev-throttler.
Aug 29 02:47:16 ibm-p9b-11 kdump-udev-throttler[67035]: kexec: unloaded kdump kernel
Aug 29 02:47:16 ibm-p9b-11 kdump-udev-throttler[67035]: Stopping kdump: [OK]
Aug 29 02:47:16 ibm-p9b-11 kdump-udev-throttler[67035]: Modified cmdline:ro irqpoll maxcpus=1 noirqdistrib reset_devices cgroup_disable=memory numa=off udev.children-max=2 ehea.use_mcs=0 panic=10 rootflags=nofail kvm_cma_resv_ratio=0 transparent_hugepage=never novmcoredd elfcorehdr=158272K
Aug 29 02:47:16 ibm-p9b-11 kdump-udev-throttler[67035]: kexec: loaded kdump kernel
Aug 29 02:47:16 ibm-p9b-11 kdump-udev-throttler[67035]: Starting kdump: [OK]
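The "FAIL but no call trace" observation can be checked mechanically by looking only at the syslog lines written while a given test was running. A minimal sketch, assuming the "LTP: starting NAME" markers seen in the logs in this report delimit each test; ltp_test_window and call_traces_during are hypothetical helper names, and the test name is used as a plain regular expression.

```shell
#!/bin/sh
# Sketch only: inspect the syslog lines logged during one LTP test.
# ltp_test_window and call_traces_during are hypothetical helper names.

# ltp_test_window TESTNAME LOGFILE
# Prints the lines from "LTP: starting TESTNAME" up to, but not
# including, the next "LTP: starting" marker.
ltp_test_window() {
    awk -v t="$1" '
        / LTP: starting / { inside = ($0 ~ (" LTP: starting " t "( |$)")) }
        inside
    ' "$2"
}

# call_traces_during TESTNAME LOGFILE
# Prints how many "Call Trace:" lines were logged in that window.
call_traces_during() {
    ltp_test_window "$1" "$2" | grep -c 'Call Trace:' || true
}
```

For the host run above, `call_traces_during cpuset_hotplug /var/log/messages` would be expected to print 0, while the guest logs from the failing runs contain at least one trace in the same window.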
The same steps were tested on qemu-kvm-4.1.0-5.module+el8.1.0+4076+b5e41ebc and did not hit this issue.

host:
# uname -r
4.18.0-137.el8.ppc64le
# /usr/libexec/qemu-kvm -version
QEMU emulator version 4.1.0 (qemu-kvm-4.1.0-5.module+el8.1.0+4076+b5e41ebc)

guest:
# uname -r
4.18.0-139.el8.ppc64le

cpuset_hotplug PASS
cpuset_regression_test PASS

Closing this bug.