Bug 1325404 - divide by zero leads to host reboot
Summary: divide by zero leads to host reboot
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: kernel-rt
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Clark Williams
QA Contact: Jiri Kastner
URL:
Whiteboard:
Depends On:
Blocks: 1175461 1274397 1327961 1328121
 
Reported: 2016-04-08 17:57 UTC by Luiz Capitulino
Modified: 2016-11-03 19:45 UTC
CC: 6 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: a merge of RHEL commit ede53d8149707a caused nanoseconds to be used in a jiffies context.
Consequence: time ran forward very fast and caused a divide-by-zero when the counter overflowed.
Fix: correct the merge and make sure all time units are jiffies.
Result: no divide-by-zero.
Clone Of:
Clones: 1327961 1328121
Environment:
Last Closed: 2016-11-03 19:45:30 UTC
Target Upstream Version:
Embargoed:


Attachments: none


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2016:2584 0 normal SHIPPED_LIVE Important: kernel-rt security, bug fix, and enhancement update 2016-11-03 12:08:49 UTC

Description Luiz Capitulino 2016-04-08 17:57:08 UTC
Description of problem:

I get the following exception in the *host* while executing a 24-hour run of a KVM-RT test-case. This exception causes the host to reboot.

[143516.953097] divide error: 0000 [#1] PREEMPT SMP 
[143516.953115] Modules linked in: vhost_net vhost macvtap macvlan xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter iTCO_wdt iTCO_vendor_support ipmi_devintf dcdbas coretemp intel_rapl iosf_mbi kvm_intel kvm crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr ipmi_ssif sb_edac edac_core sg lpc_ich mfd_core ipmi_si ipmi_msghandler wmi acpi_power_meter mei_me mei shpchp nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common crc32c_intel mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit
[143516.953120]  drm_kms_helper ttm drm i2c_core tg3 ixgbe ahci mdio libahci dca libata ptp pps_core megaraid_sas fjes dm_mirror dm_region_hash dm_log dm_mod
[143516.953122] CPU: 4 PID: 11967 Comm: top Not tainted 3.10.0-371.rt56.247.el7.x86_64 #1
[143516.953123] Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.2.6 06/08/2015
[143516.953124] task: ffff880457259780 ti: ffff880442abc000 task.ti: ffff880442abc000
[143516.953130] RIP: 0010:[<ffffffff810b6b80>]  [<ffffffff810b6b80>] cputime_adjust+0x110/0x130
[143516.953130] RSP: 0018:ffff880442abfcc0  EFLAGS: 00010a83
[143516.953131] RAX: fa7b539052a6ecb0 RBX: ffff880450085240 RCX: ffff880442abfde8
[143516.953132] RDX: 0000000000000000 RSI: ffff880459eefa10 RDI: 0000000000000000
[143516.953132] RBP: ffff880442abfcc0 R08: 00003e9ed8d9fb33 R09: 00000000ffffefd0
[143516.953133] R10: ffff880442abfde0 R11: ffff880442abfc38 R12: ffff880442abfde0
[143516.953133] R13: ffff880442abfde8 R14: 0000000000000000 R15: ffff880459eef800
[143516.953134] FS:  00007f068b7e6740(0000) GS:ffff88046ec80000(0000) knlGS:0000000000000000
[143516.953135] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[143516.953135] CR2: 00007f2eb6b0a000 CR3: 000000043f87a000 CR4: 00000000001427e0
[143516.953136] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[143516.953136] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[143516.953136] Stack:
[143516.953138]  ffff880442abfd08 ffffffff810b7ab0 00003db907e9b1c0 ffffefd09529aa00
[143516.953138]  00003e9ed8d9fb33 00000000c8d0d88f ffff88043fb3e480 ffff880450085240
[143516.953139]  ffff880458b40000 ffff880442abfe40 ffffffff8124ce5b ffffffffffffffff
[143516.953139] Call Trace:
[143516.953143]  [<ffffffff810b7ab0>] thread_group_cputime_adjusted+0x50/0x70
[143516.953146]  [<ffffffff8124ce5b>] do_task_stat+0x8eb/0xb60
[143516.953148]  [<ffffffff810b0beb>] ? migrate_enable+0xdb/0x210
[143516.953150]  [<ffffffff8124da44>] proc_tgid_stat+0x14/0x20
[143516.953151]  [<ffffffff812478bd>] proc_single_show+0x4d/0x90
[143516.953154]  [<ffffffff811fd69a>] seq_read+0xfa/0x3a0
[143516.953157]  [<ffffffff811d812e>] vfs_read+0x9e/0x170
[143516.953158]  [<ffffffff811d8c7f>] SyS_read+0x7f/0xe0
[143516.953162]  [<ffffffff81648054>] tracesys+0xdd/0xe2
[143516.953171] Code: 89 fa 48 c1 ea 20 48 85 d2 74 ca 4c 89 ca 48 d1 ef 49 89 c1 48 d1 ea 48 89 d0 eb 9f 0f 1f 80 00 00 00 00 89 c0 31 d2 49 0f af c1 <48> f7 f7 4c 89 c7 48 29 c7 49 89 c0 e9 0a ff ff ff 66 66 66 66 
[143516.953173] RIP  [<ffffffff810b6b80>] cputime_adjust+0x110/0x130
[143516.953173]  RSP <ffff880442abfcc0>


Version-Release number of selected component (if applicable): kernel-rt-3.10.0-371.rt56.247.el7.x86_64


How reproducible:

My system is set up for KVM-RT. I have four KVM-RT guests executing a KVM-RT test-case and top running in the host. This bug triggers in about 10 hours.

Comment 1 Luiz Capitulino 2016-04-08 18:02:37 UTC
Some additional information about this bug:

- I didn't get it with 3.10.0-365.rt56.242, so it must be something added between -366 and -371

- I've got a kdump. However, I'm unable to run crash on it. Still investigating what this could be

Comment 2 Luiz Capitulino 2016-04-11 13:43:35 UTC
I've got some new information about this bug:

1. After a few hours running the test-case, I can see that the CPU time for all qemu-kvm _processes_ is abnormally high. "top" reports a TIME+ field of "5124168h", ps reports "21114971:27":

qemu      12040 107304795  2.5 2093120 826408 ? SLl  10:06 21114971:27 /usr/libexec/qemu-kvm

2. When total=0 in cputime_adjust(), we have the following information:

o The thread having the cputime adjusted _seems_ to be a qemu I/O thread (see item 3 below)

o On cputime_adjust() entry we have:

 - stime=18446611312260584768
 - rtime=262703155899173
 - total=127301701033152
 - curr->stime=18446611312260584768
 - curr->utime=260063150000000
 - curr->sum_exec_runtime=262703155899173

3. Here's some relevant information from crash regarding the thread that's getting its cputime adjusted:

  prio = 120, 
  static_prio = 120, 
  normal_prio = 120, 
  rt_priority = 0, 
  sched_class = 0xffffffff8166b7c0 <fair_sched_class>, 

  exec_start = 122859095823545, 
  sum_exec_runtime = 140335862570, 
  vruntime = 2189359642803, 
  prev_sum_exec_runtime = 140335812859, 
  nr_migrations = 489084, 

  utime = 51817000000, 
  stime = 82809000000, 
  utimescaled = 51817000000, 
  stimescaled = 82809000000, 
  gtime = 0, 
  prev_cputime = {
    utime = 8797807, 
    stime = 12316928
  }, 

The only strange thing is: I/O threads are normally forked off by housekeeping vCPU threads, yet the parent task_struct pointer of this thread points to systemd(?). I thought threads would only be re-parented to systemd when the parent thread exits...

Comment 3 Luiz Capitulino 2016-04-11 14:01:54 UTC
Here's a theory: there's something going wrong with the accounting of system time for real-time tasks running on nohz_full cores. It seems to be an overflow or some miscounting that leads to absurdly high values.

The housekeeping vCPU threads, which have RT priority and run on nohz_full cores, fork off several I/O threads due to our I/O-intensive test-case. Those I/O threads inherit the overflowed accounting values and trigger the problem somehow.

Here's an example of a housekeeping vCPU thread CPU time from crash:

    exec_start = 122859693302461, 
    sum_exec_runtime = 25068487381408, 
    vruntime = 18446744073697551616, 
    prev_sum_exec_runtime = 7036470, 
    nr_migrations = 3, 

  utime = 24889936000000, 
  stime = 14817138242231518656, 
  utimescaled = 24889936000000, 
  stimescaled = 14817138242231518656, 
  gtime = 24889839000000, 
  prev_cputime = {
    utime = 0, 
    stime = 0
  },

  vtime_snap = 4417536930, 
  vtime_snap_whence = VTIME_SYS,

Comment 4 Luiz Capitulino 2016-04-11 21:00:34 UTC
I've created a reproducer that allowed me to find out the root cause of this issue. I'll describe the reproducer first.

reproducer
----------

1. Setup a nohz_full core (eg. nohz_full=15)

2. Create a program that takes 100% of the CPU:

int main(void)
{
   for ( ;; ) ;
   return 0;
}

3. Run this program and pin it to the nohz_full core:

# taskset -c 15 ./prog

4. Run "top -d1" and watch the TIME+ column. You should get an absurdly high value, like "5124095h", in less than a minute.

root cause
----------

This problem happens because parts of the following commit are not present in the -RT tree:

commit ede53d8149707adf896069122b198ec9a93fcfa9
Author: Rik van Riel <riel>
Date:   Mon Mar 7 23:40:11 2016 -0500

    [kernel] sched, time: Switch VIRT_CPU_ACCOUNTING_GEN to jiffy granularity

Two things are missing:

1. The lines that initialize current->vtime_snap (this is the root cause)
2. All the vtime_delta() checks in the vtime_ accounting functions

As current->vtime_snap is initialized with a nanosecond value in the -RT tree, it can become greater than jiffies which makes get_vtime_delta() return negative in __vtime_account_system(), which in turn causes account_system_time() to add an overflowed(?) value to current->stime.

Comment 5 Clark Williams 2016-04-14 00:18:04 UTC
The mis-merge was corrected and the fix went into the kernel-rt-3.10.0-376.rt56.253.el7 build.

Comment 13 Jiri Kastner 2016-09-30 11:16:24 UTC
[indy@localhost code-review]$ ../MRG-RT-tools/check_commit_presence rpmbuild/3.10.0/511.rt56.416.el7/BUILD/kernel-3.10.0/linux-3.10.0.x86_64/ ede53d8149707adf896069122b198ec9a93fcfa9
Reverting ede53d8149707adf896069122b198ec9a93fcfa9 (kernel-3.10.0-367.el7~22) ... Unreversed patch detected!  Ignore -R? [n] 
Apply anyway? [n] 
8 out of 8 hunks ignored -- saving rejects to file kernel/sched/cputime.c.rej
** FAILED **
Restoring  Done
0 patch(es) was found applied.  1 patch(es) was not checked, due to a failure

Comment 14 Luiz Capitulino 2016-09-30 12:25:45 UTC
Jiri,

Could you describe what your verification test is and what it means if it's failing?

IIRC, this issue was debugged down to a merge conflict. I wonder how check_commit_presence is going to check for that.

Comment 15 Jiri Kastner 2016-09-30 13:09:12 UTC
Luiz,
It reverts and re-applies the commit in a build tree prepared with rpmbuild; sometimes it needs more patches applied on top of the investigated patch.

Comment 16 Jiri Kastner 2016-09-30 13:14:49 UTC
Oh, the correct commit is 5cd2685d8dd330cdade14375ffe9626fe21d56c2.

Comment 17 Luiz Capitulino 2016-09-30 13:41:24 UTC
Just a few notes on verifying this BZ:

 - The original issue was reproduced with KVM-RT guests running cyclictest for 24 hours

 - The reproducer from comment 4 came later. I don't remember if this reproducer has ever triggered the divide by zero issue (but it did trigger the initial manifestation of the issue)

 - If one wants to try to reproduce the divide by zero issue with the reproducer from comment 4, expect to run it for 24 hours. However, I'd not recommend doing this, since it is very time-consuming for a single issue, and we run 24-hour tests continuously for KVM-RT, so we'd know if the issue were still present

Setting this to SanityOnly seems appropriate to me.

Comment 18 Jiri Kastner 2016-09-30 14:51:32 UTC
(In reply to Luiz Capitulino from comment #17)
> Setting this to SanityOnly seems appropriate to me.

code review only
###########
[indy@localhost code-review]$ rm -fr rpmbuild/3.10.0/511.rt56.416.el7/BUILD/kernel-3.10.0/ && rpmbuild --quiet -D "_topdir %{getenv:PWD}/rpmbuild/%{version}/%{release}" -bp rpmbuild/3.10.0/511.rt56.416.el7/SPECS/kernel-rt.spec && rpm -q --specfile rpmbuild/3.10.0/511.rt56.416.el7/SPECS/kernel-rt.spec | head -1
kernel-rt-3.10.0-511.rt56.416.el7.x86_64
[indy@localhost code-review]$ ../MRG-RT-tools/check_commit_presence rpmbuild/3.10.0/511.rt56.416.el7/BUILD/kernel-3.10.0/linux-3.10.0.x86_64/ 5cd2685d8dd330cdade14375ffe9626fe21d56c2
Reverting 5cd2685d8dd330cdade14375ffe9626fe21d56c2 (kernel-rt-3.10.0-376.rt56.254.el7~2) ... Applied
Restoring . Done
1 patch(es) was found applied.

Comment 20 errata-xmlrpc 2016-11-03 19:45:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2584.html

