Bug 1926381 - Guest OS crash with a kernel panic : unable to handle kernel paging request
Summary: Guest OS crash with a kernel panic : unable to handle kernel paging request
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Virtualization
Version: 2.3.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 2.6.2
Assignee: Vladik Romanovsky
QA Contact: zhe peng
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-02-08 18:31 UTC by Jean-Francois Saucier
Modified: 2024-03-25 18:08 UTC (History)

Fixed In Version: virt-launcher-container-v2.6.2-4
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-04 20:09:13 UTC
Target Upstream Version:
Embargoed:
rnetser: needinfo-


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2021:1502 0 None None None 2021-05-04 20:09:23 UTC

Description Jean-Francois Saucier 2021-02-08 18:31:50 UTC
Description of problem:

When running a custom guest OS (based on Fedora Core 12), the guest crashes with a kernel panic after a few days:

[  134.251405] BUG: unable to handle kernel paging request at ffffffff8104e8f7
[  134.252284] IP: [<ffffffff8104e8f7>] kvm_kick_cpu+0x27/0x30
[  134.252979] PGD 1c0a067 [  134.253266] PUD 1c0b063 
PMD 10001e1 [  134.253733] 
[  134.253936] Oops: 0003 [#1] SMP
[  134.254316] Modules linked in: softdog virtio_net cpuid tg3 hwmon e1000e ptp pps_core e1000 vmxnet3 i2c_i801 i2c_smbus virtio_console xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink xt_addrtype iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack overlay [last unloaded: virtio_net]
[  134.258187] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W       4.9.58 #1
[  134.259039] Hardware name: KubeVirt None, BIOS 1.12.0-5.module+el8.1.1+5309+6d656f05 04/01/2014
[  134.259256] task: ffffffff81c0e500 task.stack: ffffffff81c00000
[  134.259256] RIP: 0010:[<ffffffff8104e8f7>]  [<ffffffff8104e8f7>] kvm_kick_cpu+0x27/0x30
[  134.259256] RSP: 0018:ffff88107fc03df0  EFLAGS: 00010046
[  134.259256] RAX: 0000000000000005 RBX: 0000000000000000 RCX: 0000000000000002
[  134.259256] RDX: ffff88107fc80000 RSI: 0000000000000003 RDI: 0000000000000002
[  134.259256] RBP: ffff88107fc03df8 R08: 0000000000000068 R09: ffff88107ffe5640
[  134.259256] R10: 0000000000000100 R11: ffffea004093f9c0 R12: ffff881036fa0f00
[  134.259256] R13: 0000000000000001 R14: ffff881038a09000 R15: 0000000000000082
[  134.259256] FS:  0000000000000000(0000) GS:ffff88107fc00000(0000) knlGS:0000000000000000
[  134.259256] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  134.259256] CR2: ffffffff8104e8f7 CR3: 0000001038131000 CR4: 00000000001406b0
[  134.259256] Stack:
[  134.259256]  ffff881036fa4800 ffff88107fc03e08 ffffffff810a2c90 ffff88107fc03e58
[  134.259256]  ffffffff810a2665 ffffea004093f9c0 ffff881036818350 0000000000000000
[  134.259256]  ffff881036818340 ffff881036fa4808 0000000000000003 0000000000000000
[  134.259256] Call Trace:
[  134.259256]  <IRQ> [  134.259256]  [<ffffffff810a2c90>] __pv_queued_spin_unlock_slowpath+0xa0/0xe0
[  134.259256]  [<ffffffff810a2665>] __raw_callee_save___pv_queued_spin_unlock_slowpath+0x15/0x24
[  134.259256]  [<ffffffff810a2694>] .slowpath+0x9/0x15
[  134.259256]  [<ffffffff81758aa0>] _raw_spin_unlock_irqrestore+0x10/0x20
[  134.259256]  [<ffffffff81415ad8>] virtblk_done+0xc8/0xe0
[  134.259256]  [<ffffffff8139a301>] vring_interrupt+0x31/0x50
[  134.259256]  [<ffffffff810aa4c1>] __handle_irq_event_percpu+0x41/0x1b0
[  134.259256]  [<ffffffff810aa653>] handle_irq_event_percpu+0x23/0x60
[  134.259256]  [<ffffffff810aa6cb>] handle_irq_event+0x3b/0x60
[  134.259256]  [<ffffffff810ad98b>] handle_edge_irq+0xab/0x150
[  134.259256]  [<ffffffff8102cadd>] handle_irq+0x1d/0x30
[  134.259256]  [<ffffffff8175b35d>] do_IRQ+0x4d/0xd0
[  134.259256]  [<ffffffff817598bf>] common_interrupt+0x7f/0x7f
[  134.259256]  <EOI> [  134.259256]  [<ffffffff81758816>] ? native_safe_halt+0x6/0x10
[  134.259256]  [<ffffffff81758563>] default_idle+0x23/0xd0
[  134.259256]  [<ffffffff810341ff>] arch_cpu_idle+0xf/0x20
[  134.259256]  [<ffffffff81758973>] default_idle_call+0x23/0x40
[  134.259256]  [<ffffffff810a05bc>] cpu_startup_entry+0xec/0x1e0
[  134.259256]  [<ffffffff817520b7>] rest_init+0x77/0x80
[  134.259256]  [<ffffffff81d230cf>] start_kernel+0x430/0x43d
[  134.259256]  [<ffffffff81d22a8d>] ? set_init_arg+0x55/0x55
[  134.259256]  [<ffffffff81d22120>] ? early_idt_handler_array+0x120/0x120
[  134.259256]  [<ffffffff81d225d6>] x86_64_start_reservations+0x2a/0x2c
[  134.259256]  [<ffffffff81d22715>] x86_64_start_kernel+0x13d/0x14c
[  134.259256] Code: 00 00 00 00 66 66 66 66 90 48 63 ff 55 48 c7 c0 3a a1 00 00 48 8b 14 fd c0 13 a7 81 48 89 e5 53 31 db 0f b7 0c 10 b8 05 00 00 00 <0f> 01 c1 5b 5d c3 0f 1f 00 66 66 66 66 90 55 65 48 8b 04 25 80 
[  134.259256] RIP  [<ffffffff8104e8f7>] kvm_kick_cpu+0x27/0x30
[  134.259256]  RSP <ffff88107fc03df0>
[  134.259256] CR2: ffffffff8104e8f7
[    0.000000] do_IRQ: 0.97 No irq handler for vector
doing setup
waiting for dump device vda5
dumping to vda5
Copying data                                      : [100.0 %] /           eta: 0s
dump complete
rebooting
[   18.867620] reboot: Restarting system 
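
For readers triaging the trace above, the "Oops: 0003" line can be decoded with the standard x86 page-fault error-code bits. The following is a minimal sketch in Python, assuming the usual layout (bit 0 = protection violation, bit 1 = write, bit 2 = user mode, bit 3 = reserved bit set, bit 4 = instruction fetch). The helper name decode_oops_code is illustrative only and is not part of any tool mentioned in this report.

```python
# Sketch: decode the hexadecimal error code printed on an x86 "Oops:" line.
# Assumption: the code follows the x86 page-fault error code bit layout.
PF_BITS = [
    # (mask, meaning when set, meaning when clear or None to omit)
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set", None),
    (0x10, "instruction fetch", None),
]


def decode_oops_code(code_hex: str) -> list[str]:
    """Return human-readable descriptions for each relevant bit."""
    code = int(code_hex, 16)
    descriptions = []
    for mask, if_set, if_clear in PF_BITS:
        if code & mask:
            descriptions.append(if_set)
        elif if_clear is not None:
            descriptions.append(if_clear)
    return descriptions


if __name__ == "__main__":
    # The value reported in the trace above.
    print(decode_oops_code("0003"))
```

Under these assumptions, 0003 reads as a kernel-mode write that hit a protection violation on a present page. In the trace, CR2 (the faulting address) equals the RIP inside kvm_kick_cpu.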



Version-Release number of selected component (if applicable):
Openshift 4.4.3
CNV 2.3


How reproducible:
Intermittent. Sometimes the guest OS runs fine, but at other times it crashes.


Steps to Reproduce:
1. Deploy the custom guest OS on CNV 2.3
2. Run the guest OS for a couple of days


Actual results:
The guest OS sometimes crashes with a kernel panic.


Expected results:
Guest OS runs fine

Comment 1 Vladik Romanovsky 2021-02-08 21:23:02 UTC
This is about supporting very old or custom guest OS kernels that experience kernel panics when the pvspinlock feature is enabled.

I've posted a PR to allow users to disable this option for their guests.
https://github.com/kubevirt/kubevirt/pull/4972
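
As a rough sketch of how a user might turn the option off once it is available, the snippet below builds a JSON merge patch that disables pvspinlock. The field path spec.template.spec.domain.features.pvspinlock.enabled is an assumption based on the usual KubeVirt VirtualMachine layout (this report only shows the features fragment), and the helper name pvspinlock_patch is hypothetical.

```python
import json


def pvspinlock_patch(enabled: bool = False) -> dict:
    """Build a merge-patch body that sets the pvspinlock feature state.

    Assumption: the feature lives under
    spec.template.spec.domain.features of a VirtualMachine object.
    """
    return {
        "spec": {
            "template": {
                "spec": {
                    "domain": {
                        "features": {
                            "pvspinlock": {"enabled": enabled},
                        }
                    }
                }
            }
        }
    }


if __name__ == "__main__":
    # Print the patch body so it can be passed to a patch command.
    print(json.dumps(pvspinlock_patch(), indent=2))
```

The printed JSON could then be supplied to a merge-type patch of the VM object (for example with `kubectl patch ... --type merge`). The VM needs to be restarted for a domain change like this to show up in the libvirt XML.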

Comment 12 zhe peng 2021-04-27 06:14:35 UTC
Verified with build virt-launcher-container-v2.6.2-4.

Steps:
1. Create a Fedora 12 VM.
2. Add the following section to the VM's YAML file to disable pvspinlock:
....
    features:
      pvspinlock:
        enabled: false
....
3. Start the Fedora 12 VM and check the libvirt XML:
...
 <features>
    <acpi/>
    <pvspinlock state='off'/>
  </features>
...
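
The check in step 3 can also be scripted. The following is a minimal sketch in Python that parses a libvirt domain XML and reports the pvspinlock state. The enclosing <domain> element in the sample and the helper name pvspinlock_state are assumptions added for illustration, since only the <features> fragment is shown above.

```python
import xml.etree.ElementTree as ET


def pvspinlock_state(domain_xml: str):
    """Return the 'state' attribute of <features><pvspinlock/>, or None.

    None means the element (or its state attribute) is absent from the XML.
    """
    root = ET.fromstring(domain_xml)
    node = root.find("./features/pvspinlock")
    if node is None:
        return None
    return node.get("state")


# Sample document: the <features> block from step 3, wrapped in an
# assumed <domain> root so that it parses as a complete XML document.
SAMPLE = """<domain type='kvm'>
  <features>
    <acpi/>
    <pvspinlock state='off'/>
  </features>
</domain>"""

if __name__ == "__main__":
    print(pvspinlock_state(SAMPLE))
```

In practice the XML would come from the running domain (for example via `virsh dumpxml` in the virt-launcher pod) rather than from an inline string.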

Moving to VERIFIED.

Comment 21 errata-xmlrpc 2021-05-04 20:09:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Virtualization 2.6.2 Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2021:1502

Comment 22 Red Hat Bugzilla 2023-09-15 01:00:47 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days

