Bug 1659127 - Stress guest and stop it, then do live migration, guest hit call trace on destination end
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: qemu-kvm
Version: 8.0
Hardware: s390x
OS: Linux
Priority: medium
Severity: medium
Target Milestone: pre-dev-freeze
Target Release: 8.0
Assignee: David Hildenbrand
QA Contact: xianwang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-12-13 16:07 UTC by David Hildenbrand
Modified: 2019-11-12 00:10 UTC
CC List: 11 users

Fixed In Version: qemu-kvm-3.1.0-9.module+el8+2731+e40e7b84
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1653569
Environment:
Last Closed: 2019-05-29 16:04:59 UTC
Type: Bug
Target Upstream Version:
Embargoed:




Links
System                   ID               Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata   RHBA-2019:1293   0        None      None    None     2019-05-29 16:05:20 UTC

Comment 1 Danilo de Paula 2018-12-14 12:04:16 UTC
QA_ACK please?

Comment 5 Danilo de Paula 2019-01-29 14:10:15 UTC
Fix included in qemu-kvm-3.1.0-9.module+el8+2731+e40e7b84

Comment 6 xianwang 2019-02-02 11:05:11 UTC
Because our s390x hosts are not available, I had to try this on an s390x VM reserved in Beaker.

Bug reproduction:
Host(L1vm):
4.18.0-64.el8.s390x
qemu-kvm-3.1.0-2.module+el8+2606+2c716ad7.s390x

Guest(L2vm):
4.18.0-57.el8.s390x

The steps are the same as in the original bug report; the call trace is as follows:
[ 3410.137893] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 3410.137985] rcu: 	(detected by 1, t=22180 jiffies, g=265545, q=0)
[ 3410.137996] rcu: All QSes seen, last rcu_sched kthread activity 22180 (4295278284-4295256104), jiffies_till_next_fqs=1, root ->qsmask 0x0
[ 3410.138002] stress          R  running task        0  2509   2507 0x02000000
[ 3410.138021] Call Trace:
[ 3410.138068] ([<000000000033a4c8>] iterate_supers+0x88/0x168)
[ 3410.138074]  [<0000000000374b6a>] ksys_sync+0x5a/0xb0 
[ 3410.138076]  [<0000000000374bea>] sys_sync+0x2a/0x40 
[ 3410.138089]  [<000000000077e14e>] system_call+0x2aa/0x2c8 
[ 3410.138098] rcu: rcu_sched kthread starved for 22180 jiffies! g265545 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1
[ 3410.138099] rcu: RCU grace-period kthread stack dump:
[ 3410.138101] rcu_sched       R  running task        0    10      2 0x00000000
[ 3410.138105] Call Trace:
[ 3410.138109] ([<0000000000778b9e>] __schedule+0x2ce/0x8e8)
[ 3410.138110]  [<0000000000779202>] schedule+0x4a/0xa8 
[ 3410.138112]  [<000000000077d0ae>] schedule_timeout+0x1ce/0x458 
[ 3410.138125]  [<00000000001bfcac>] rcu_gp_kthread+0x2fc/0x630 
[ 3410.138130]  [<0000000000169774>] kthread+0x14c/0x168 
[ 3410.138132]  [<000000000077e19a>] kernel_thread_starter+0xa/0x10 
[ 3410.138134]  [<000000000077e190>] kernel_thread_starter+0x0/0x10 

[ 3411.575446] systemd[1]: systemd-journald.service: Main process exited, code=dumped, status=6/ABRT
[ 3411.579917] systemd[1]: systemd-journald.service: Failed with result 'watchdog'.
[ 3411.651198] systemd[1]: systemd-journald.service: Service has no hold-off time (RestartSec=0), scheduling restart.
[ 3411.652508] systemd[1]: systemd-journald.service: Scheduled restart job, restart counter is at 2.
[ 3411.656274] systemd[1]: Stopped Flush Journal to Persistent Storage.
[ 3411.657380] systemd[1]: Stopping Flush Journal to Persistent Storage...
[ 3411.657697] systemd[1]: Stopped Journal Service.
[ 3411.766634] systemd[1]: Starting Journal Service...
[ 3412.041858] systemd-journald[2563]: File /run/log/journal/de3755463f414686bb01ec7bb05b37c1/system.journal corrupted or uncleanly shut down, renaming and replacing.
[ 3412.176527] systemd[1]: Started Journal Service.
[ 3414.404264] systemd-coredump[2560]: MESSAGE=Process 703 (systemd-journal) of user 0 dumped core.
[ 3414.404323] systemd-coredump[2560]: Coredump diverted to /var/lib/systemd/coredump/core.systemd-journal.0.4527f4ea9aa14529be1909460be28029.703.1549103297000000.lz4
[ 3414.404335] systemd-coredump[2560]: Stack trace of thread 703:
[ 3414.404339] systemd-coredump[2560]: #0  0x000002aa02509460 open_journal (systemd-journald)
[ 3414.404341] systemd-coredump[2560]: #1  0x000002aa0250f85c system_journal_open (systemd-journald)
[ 3414.404343] systemd-coredump[2560]: #2  0x000002aa0250faa6 find_journal (systemd-journald)
[ 3414.404345] systemd-coredump[2560]: #3  0x000002aa0250c02c dispatch_message_real (systemd-journald)
[ 3414.404348] systemd-coredump[2560]: #4  0x000002aa0250ff4c server_dispatch_message (systemd-journald)
[ 3414.404350] systemd-coredump[2560]: #5  0x000002aa02508374 server_read_dev_kmsg (systemd-journald)
[ 3414.404352] systemd-coredump[2560]: #6  0x000003ff97fe39cc source_dispatch (libsystemd-shared-239.so)
[ 3414.408438] systemd-coredum: 5 output lines suppressed due to ratelimiting

Bug verification:
Host(L1vm):
4.18.0-64.el8.s390x
qemu-kvm-3.1.0-11.module+el8+2747+40c9b77e.s390x

The steps are the same as above.

Result:
Migration completed; there is no call trace, and the VM works well after migration.

Comment 7 xianwang 2019-02-03 01:59:23 UTC
According to comment 6, this issue is verified on a VM. Do you think we could verify this bug? Thanks.

Comment 8 xianwang 2019-02-03 02:10:18 UTC
Hi, David,
According to comment 6, this issue is verified on a VM. Do you think we could verify this bug? Thanks.

Comment 11 errata-xmlrpc 2019-05-29 16:04:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:1293

