596223 – Kdump on Intel fails because of misrouted timer IRQs

Summary: Kdump on Intel fails because of misrouted timer IRQs

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Red Hat Enterprise Linux 6
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	6.0
Hardware:	x86_64
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	rc
Target Release:	---
Assignee:	Chris Lalancette
QA Contact:	Qian Cai
Docs Contact:
URL:
Whiteboard:	see also bug 418501
Depends On:
Blocks:	524819

Reported:	2010-05-26 12:19 UTC by Dor Laor
Modified:	2016-04-26 14:40 UTC (History)
CC List:	21 users (show)
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:	505527
Environment:
Last Closed:	2010-11-11 15:45:04 UTC
Target Upstream Version:
Embargoed:

Comment 1 RHEL Program Management 2010-05-26 12:36:44 UTC

This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for
inclusion.

Comment 2 Chris Lalancette 2010-06-04 20:03:32 UTC

I've been looking at this problem and made some progress, although it's not all encouraging news.

I think I've described the basic problem enough, so I'll skip it here and just launch into specifics. In the KVM code, there are a number of places where we make incorrect assumptions regarding the timer and which vcpu to deliver it to:

1) In the i8254.c code when the hrtimer representing the PIT expires. In this case, when we get the callback, we kick only the BSP.

2) In the i8254.c code when a vcpu is migrated from one processor to another. In this case we only migrate the PIT timer if the vcpu to be migrated is the BSP.

3) In the lapic code when deciding whether to accept a PIC interrupt, we only accept interrupts on the BSP.

4) In the irq_comm.c code when calling kvm_irq_delivery_to_apic(). The problem here is that we don't take into account the fact that an LAPIC might be disabled when trying to deliver an interrupt in DM_LOWEST mode. Further, on a kexec, the processor that we are kexec'ing *to* gets its APIC ID rewritten to the BSP APIC ID. What it means in the end is that we are currently still matching against the BSP even though vcpu 1 (where the kexec is happening) would match if we let it.
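
To illustrate point 4), here is a minimal sketch of lowest-priority destination selection. It is a toy model, not the KVM code or the patch discussed here: struct model_vcpu, model_pick_lowest_prio() and the skip_disabled flag are hypothetical names made up for this example. It assumes two vcpus share the BSP APIC ID after a kexec and that the BSP's LAPIC is disabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one vcpu's local APIC state. */
struct model_vcpu {
    int     index;          /* vcpu number */
    uint8_t apic_id;        /* current APIC ID (a kexec may rewrite it) */
    bool    lapic_enabled;  /* false once the LAPIC has been disabled */
    uint8_t priority;       /* lower value wins lowest-priority arbitration */
};

/*
 * Pick the vcpu that should receive a lowest-priority interrupt aimed at
 * dest_id. With skip_disabled == false the model behaves like the problem
 * described above: a disabled LAPIC can still win the match. With
 * skip_disabled == true, disabled LAPICs are ignored.
 * Returns the chosen vcpu index, or -1 if no vcpu matches.
 */
int model_pick_lowest_prio(const struct model_vcpu *vcpus, size_t n,
                           uint8_t dest_id, bool skip_disabled)
{
    const struct model_vcpu *best = NULL;

    assert(vcpus != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        const struct model_vcpu *v = &vcpus[i];

        if (v->apic_id != dest_id)
            continue;                 /* not addressed to this vcpu */
        if (skip_disabled && !v->lapic_enabled)
            continue;                 /* cannot accept the interrupt */
        if (best == NULL || v->priority < best->priority)
            best = v;                 /* ties keep the first match */
    }
    return best ? best->index : -1;
}
```

With vcpu 0 (the BSP) disabled and vcpu 1 carrying the BSP APIC ID, the model selects vcpu 0 when skip_disabled is false and vcpu 1 when it is true, which is the behaviour change point 4) asks for.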

I have a patch currently that can take care of 1), 3), and 4), and works in my testing (it needs to be cleaned up a bit to not be so inefficient, but it should work). However, problem 2) is pretty sticky. The reason we are currently migrating the PIT timer around with the BSP is pretty well explained in commit 2f5997140f22f68f6390c49941150d3fa8a95cb7. With my new patch, though, we are no longer guaranteeing that we are going to inject onto CPU 0. I think we can do something where when the hrtimer expires, we figure out which processor will get the timer interrupt and IPI to that processor to cause the VMEXIT. Unfortunately it's racy because the expiration of the hrtimer is de-coupled from the setting of the IRQ for the interrupt. That means that the hrtimer could expire, we could choose vcpu 2 (say), IPI to cause a VMEXIT, but by the time it goes to VMENTER the guest has changed something in the (IO)APIC(s) so that the set_irq logic chooses vcpu 3 to do the injection. This would result in a delayed injection that 2f5997 is trying to avoid.
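
The race described above can be sketched as a time-of-check versus time-of-use mismatch. This is a deterministic toy model under stated assumptions, not kernel code: struct model_routing, model_on_hrtimer_expire(), model_on_set_irq() and model_injection_delayed() are hypothetical names, and the guest's (IO)APIC state is reduced to a single destination vcpu.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the guest's current timer interrupt routing. */
struct model_routing {
    int dest_vcpu;   /* vcpu the (IO)APIC setup currently routes the timer to */
};

/* Step 1: the hrtimer expires and we decide which vcpu to kick (IPI). */
int model_on_hrtimer_expire(const struct model_routing *r)
{
    return r->dest_vcpu;   /* routing as seen at expiry time */
}

/* Step 2: later, the set_irq logic picks the vcpu that gets the injection. */
int model_on_set_irq(const struct model_routing *r)
{
    return r->dest_vcpu;   /* routing as seen at injection time */
}

/*
 * The injection is delayed when the vcpu that was kicked out of guest mode
 * is not the one that set_irq ends up choosing.
 */
bool model_injection_delayed(int kicked_vcpu, const struct model_routing *r)
{
    return kicked_vcpu != model_on_set_irq(r);
}
```

If the routing points at vcpu 2 when the hrtimer expires and the guest reprograms it to vcpu 3 before the injection, model_injection_delayed() returns true; if the routing is unchanged, it returns false.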

Taking a step back, it seems to me that something along the lines of my previous patchset (where we do set_irq directly from the hrtimer callback) is the right way to go. We would still need to IPI to the appropriate physical cpu to cause a VMEXIT on the cpu we care about, but we would avoid the race I describe in the previous paragraph. Unfortunately that patchset is also much more risky.

Thoughts? I'll also post a similar mail to kvm@ to gauge opinions there, but does anyone have any thoughts here?

Chris Lalancette

Comment 3 Aristeu Rozanski 2010-07-01 16:13:04 UTC

Patch(es) available on kernel-2.6.32-42.el6

Comment 8 releng-rhel@redhat.com 2010-11-11 15:45:04 UTC

Red Hat Enterprise Linux 6.0 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.
