Bug 1417049 - kernel panic due to wrong init of percpu rh_cpu_info.logical_proc_id for half of the CPUs on vmware rhel-7 VM
Summary: kernel panic due to wrong init of percpu rh_cpu_info.logical_proc_id for half of the CPUs on vmware rhel-7 VM
Keywords:
Status: CLOSED DUPLICATE of bug 1414054
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: kernel
Version: 7.3
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: 7.3
Assignee: Red Hat Kernel Manager
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Duplicates: 1451659
Depends On:
Blocks: 1298243
 
Reported: 2017-01-27 05:47 UTC by Supreet
Modified: 2020-12-14 08:02 UTC
CC: 31 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-06-02 07:49:03 UTC
Target Upstream Version:


Attachments (Terms of Use)
traceback (308.49 KB, text/plain)
2017-01-27 05:52 UTC, Supreet
Patch for qemu-1.5.3 to simulate broken FW behavior (633 bytes, patch)
2017-05-29 08:14 UTC, Sergio Lopez

Description Supreet 2017-01-27 05:47:49 UTC
Description of problem:

The RHEL 7.3 kernel panics, with the following variability:

1) ipsec must be started; if this is done with systemctl and the above 
configuration, a panic happens ~80% of the time immediately

2) the more rules defined in iptables, the more likely the problem is to occur

3) the more vCPUs, the more likely the problem is to occur.  In fact, with 
4 vCPUs the problem has not been seen, with 6 it is infrequent, and with 8 
it is very frequent.

4) the problem has not been seen in KVM-based VMs with a greater vCPU 
count (20) and the same iptables and ipsec setups.

5) if ipsec's pluto process is started during boot as we do with 
systemd, the problem happens within a few seconds.  If that process is 
started by hand, there can be a substantial delay (tens of minutes) before 
there is a panic.  There is no problem if the ipsec modules are loaded 
but pluto is not started.

6) when fewer processes (i.e., applications) are started, the likelihood 
of the problem decreases somewhat but does not go to zero.

But the exact same setup with 3.10.0-327.36.1.el7.x86_64 does not have the 
problem.

The problem is always seen on fresh RHEL 7.3 installs.

Version-Release number of selected component (if applicable):
kernel-3.10.0-514.el7.x86_64                     

How reproducible:
a) VMware guest
b) at least 6 vCPUs, preferably 8
c) iptables rules defined
d) ipsec's pluto process started; no configuration established


Actual results:
The kernel panics.

Expected results:
The kernel should not panic.

Additional info:

The iptables rules with which we see the problem most frequently are:
[root@test ~]# iptables -S
-P INPUT DROP
-P FORWARD ACCEPT
-P OUTPUT ACCEPT
-A INPUT -p udp -m udp --dport 1112:1124 -j DROP
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -p udp -m udp --sport 67:68 -j ACCEPT
-A INPUT -p udp -m udp --sport 53 -j ACCEPT
-A INPUT -p udp -m udp --dport 3784 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 50001:50005 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 10300:10304 -j ACCEPT
-A INPUT -p esp -j ACCEPT
-A INPUT -p ah -j ACCEPT
-A INPUT -p udp -m udp --dport 60000 -j ACCEPT
-A INPUT -p udp -m udp --dport 2123 -j ACCEPT
-A INPUT -p udp -m udp --dport 23272 -j ACCEPT
-A INPUT -p udp -m multiport --dports 49900,50000,49999 -j ACCEPT
-A INPUT -i eth0 -j ACCEPT
-A OUTPUT -o eth0 -j ACCEPT

Comment 1 Supreet 2017-01-27 05:52:33 UTC
Created attachment 1244991 [details]
traceback

Comment 4 scott.otto 2017-02-03 06:43:21 UTC
What progress has been made on this issue?

Comment 11 Hannes Frederic Sowa 2017-02-07 15:57:04 UTC
Brew build of kernel-3.10.0-514.el7.x86_64 with patchwork.usersys.redhat.com/patch/161244/ applied:

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=12503333

Comment 12 scott.otto 2017-02-13 16:17:54 UTC
Any progress made on this issue?

Comment 13 Hannes Frederic Sowa 2017-02-13 16:34:28 UTC
Hello,

We have already proposed a test kernel which might fix this issue. Could you contact GSS or Supreet to get access to it?

Thanks,
Hannes

Comment 20 scott.otto 2017-02-23 02:07:04 UTC
Additional information needed?   I had previously provided feedback:

Most recent comment: On 2017-02-17 07:49:04, Otto, Scott commented:
"Problem is same with 3.10.0-514.el7.bz1417049.x86_64"

Most recent comment: On 2017-02-17 08:26:07, Randhawa, Supreet commented:
"Hello Scott,

I have informed the engineering team about the same.

Thanks & Regards,
Supreet Kaur Randhawa
GSS, Red Hat"

Comment 21 Hannes Frederic Sowa 2017-02-23 11:05:42 UTC
Hi Scott, I think I am fine right now, I just need more time to review some changes. I'll come back to you in case I need further information. Thanks!

Comment 22 scott.otto 2017-03-06 20:40:30 UTC
Hi,
May we have an update on this issue?
Thanks,
Scott

Comment 30 Hannes Frederic Sowa 2017-03-17 10:28:20 UTC
Hello Scott,

after dissecting all the vmcores without getting any more clues out of them, I tried to set up an 8-vCPU VMware box and reinstall the same iptables and ipsec rules (basically empty, with wildcard policies). Even after adding some programs to early boot-up in systemd that cause quite a bit of network activity, I was still not able to reproduce the issue. At this point I unfortunately have to say I don't know where else to look.

My analysis so far shows that even the ipsec system itself might not be the cause of the problem.

The entries in the flow cache table, which cause the kernel panic, are not even managed by reference counts or shared between subsystems. They are solely populated by some dense hash table management code. Further examining the invalid pointer in the hash table, it appears to point to random kmalloc'ed data of size 128 to 192 bytes; I found structures from platform management code (sensor/cooling/fan drivers) as well as some timer structures.

Another hint is that the pointer is always at the same location, and only in the per-CPU hash table of CPU 3. This leads me to believe that we might be seeing pointer corruption from another subsystem. I also investigated the vmxnet3 driver but haven't seen anything more there. I think we need support from Nokia to install some debugging code on their systems, or support to reproduce this in-house.

Thanks a lot,
Hannes

Comment 32 scott.otto 2017-03-17 15:00:19 UTC
Hannes,
I'm not surprised at the difficulty in reproducing what looks to be a race condition.  I'm sure there are variables in those conditions we haven't identified.

I suspected that what pluto does when starting and interacting with kernel space affects the timing of those conditions rather than being a direct cause, but since that is the most solid aspect of the problem, it has become the area of investigation.

Note that you have only two vmcores of the problem; on other test boxes I recall seeing vCPUs other than #3 having the fault, and I can't necessarily say the pointer value is always the same in those vmcores.

I just got a call from RH regarding setting up remote access to one of our test boxes.  I have the box that can be used for that; we'll have to work out the policies and then procedures for allowing such access.

In the meantime, I think it would be futile to set up a box within RH. I'm perfectly willing to have the product made available so that I can run the experiments.

Regards,
Scott

Comment 33 Hannes Frederic Sowa 2017-03-20 16:09:12 UTC
Hi Scott,

I am still hoping that the address which causes the fault doesn't change too often. In those crashes it was always the address 0xffffffff983bc06c.

I would suggest that we build a scenario where we can hopefully get the kernel to crash at the same address every time, with a hardware breakpoint installed on that particular address.

I would suggest that we install a very early systemd unit that runs perf:

perf record -e 'mem:0xffffffff983bc06c/8:rwx' -a -R -g -o /root/hw_trace.dat

Do you think we can do this? I think it would show the fastest progress.
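
For reference, a minimal sketch of such a unit (the file name, dependencies, and install target are assumptions, not a tested configuration; the breakpoint address is the one from the vmcores and would need to match the crashing box):

```ini
# /etc/systemd/system/hw-trace.service  (hypothetical name)
[Unit]
Description=Record accesses to the faulting address with a hw breakpoint
DefaultDependencies=no
After=local-fs.target
Before=network-pre.target

[Service]
Type=simple
# 8-byte read/write/execute hardware breakpoint, system-wide, with call graphs
ExecStart=/usr/bin/perf record -e 'mem:0xffffffff983bc06c/8:rwx' -a -R -g -o /root/hw_trace.dat

[Install]
WantedBy=sysinit.target
```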

Thanks,
Hannes

Comment 34 Hannes Frederic Sowa 2017-03-21 13:15:21 UTC
Hello Scott,

as a test, would it also be possible to switch from e1000 NIC to vmxnet3?

Thanks,
Hannes

Comment 52 Hannes Frederic Sowa 2017-05-19 19:33:38 UTC
*** Bug 1451659 has been marked as a duplicate of this bug. ***

Comment 72 Sergio Lopez 2017-05-29 08:11:14 UTC
A patch for the FW CPUID discrepancy was already applied downstream on version 3.10.0-514.21.1.el7:

* Sat Apr 22 2017 Frantisek Hrbata <fhrbata@hrbata.com> [3.10.0-514.21.1.el7]
(...)
- [x86] x86/smpboot: Make logical package management more robust (Prarit Bhargava) [1441643 1414054]
- [x86] x86/cpu: Deal with broken firmware (VMWare/XEN) (Prarit Bhargava) [1441643 1414054]
(...)

I'm testing this with a small QEMU patch which allows simulating the broken behavior on a KVM host:

 - kernel-3.10.0-514.16.1.el7.x86_64

crash> px rh_cpu_info:all | grep logical_proc_id
  logical_proc_id = 0x0
  logical_proc_id = 0x2
  logical_proc_id = 0x4
  logical_proc_id = 0x6
  logical_proc_id = 0x8
  logical_proc_id = 0xa
  logical_proc_id = 0xffff
  logical_proc_id = 0xffff
  logical_proc_id = 0xffff
  logical_proc_id = 0xffff
  logical_proc_id = 0xffff
  logical_proc_id = 0xffff


 - kernel-3.10.0-514.21.1.el7.x86_64

crash> px rh_cpu_info:all | grep logical_proc_id
  logical_proc_id = 0x0
  logical_proc_id = 0x1
  logical_proc_id = 0x2
  logical_proc_id = 0x3
  logical_proc_id = 0x4
  logical_proc_id = 0x5
  logical_proc_id = 0x6
  logical_proc_id = 0x7
  logical_proc_id = 0x8
  logical_proc_id = 0x9
  logical_proc_id = 0xa
  logical_proc_id = 0xb


Also, kernel-3.10.0-514.21.1 shows the following in dmesg:

[    0.198057] [Firmware Bug]: CPU1: APIC id mismatch. Firmware: 1 APIC: 2
[    0.219476] [Firmware Bug]: CPU2: APIC id mismatch. Firmware: 2 APIC: 4
[    0.240860] [Firmware Bug]: CPU3: APIC id mismatch. Firmware: 3 APIC: 6
[    0.262399] [Firmware Bug]: CPU4: APIC id mismatch. Firmware: 4 APIC: 8
[    0.283908] [Firmware Bug]: CPU5: APIC id mismatch. Firmware: 5 APIC: a
[    0.307575] [Firmware Bug]: CPU6: APIC id mismatch. Firmware: 6 APIC: c
[    0.330728] [Firmware Bug]: CPU7: APIC id mismatch. Firmware: 7 APIC: e
[    0.352018] [Firmware Bug]: CPU8: APIC id mismatch. Firmware: 8 APIC: 10
[    0.373506] [Firmware Bug]: CPU9: APIC id mismatch. Firmware: 9 APIC: 12
[    0.394972] [Firmware Bug]: CPU10: APIC id mismatch. Firmware: a APIC: 14
[    0.416442] [Firmware Bug]: CPU11: APIC id mismatch. Firmware: b APIC: 16

Comment 73 Sergio Lopez 2017-05-29 08:14:06 UTC
Created attachment 1283170 [details]
Patch for qemu-1.5.3 to simulate broken FW behavior

Comment 74 Michal Tesar 2017-06-02 07:40:04 UTC
Hello all,
kernel-3.10.0-514.21.1.el7 has been confirmed by the customer as fixing the problem, so this BZ can be closed now.

Thanks to everyone for all the effort put in here.

Michal Tesar

Comment 75 Michal Tesar 2017-06-02 07:49:03 UTC

*** This bug has been marked as a duplicate of bug 1414054 ***

