Bug 1630370

Summary: unpredictable out of memory killing multiple processes
Product: [Fedora] Fedora
Reporter: R Bruce Hoffman <bruce>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: unspecified
Version: 28
CC: airlied, bskeggs, ewk, hdegoede, ichavero, itamar, jarodwilson, jglisse, john.j5live, jonathan, josef, kernel-maint, linville, mchehab, mjg59, steved
Hardware: ppc64
OS: Unspecified
Last Closed: 2019-02-21 21:10:16 UTC
Type: Bug
Attachments:
dmesg output after cascade oom killer (flags: none)

Description R Bruce Hoffman 2018-09-18 13:20:53 UTC
Created attachment 1484372
dmesg output after cascade oom killer

Description of problem:
The system randomly runs out of memory, and the OOM killer then starts a cascade of killing off services, resulting in having to reboot the server or restart the partition immediately from the HMC.

This is occurring on FC28 and also occurred on FC27.


Version-Release number of selected component (if applicable):

Linux mailsvrc.ctcodeworks.com 4.18.7-200.fc28.ppc64 #1 SMP Mon Sep 10 15:21:31 UTC 2018 ppc64 ppc64 ppc64 GNU/Linux

This is running as a guest on a partition hosted by IBM i OS V7R3 on Power 8. Others are experiencing the same issue on Power 7 and Power 7+.



How reproducible:

Just let it run.  It seems to eventually happen.


Actual results:

Processes are killed off by oom-killer

Expected results:

Keep running for days/weeks/months.

Additional info:

Attached is the output of dmesg just after a cascade of services was killed off. There is nothing in journalctl: no apparent warnings, and no repeating pattern in which processes get killed when this happens.
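
For anyone triaging a similar dmesg capture, a rough way to check whether the same processes are killed repeatedly is to count the kernel's "Killed process" lines per process name. The following is only a minimal sketch: the function name `summarize_oom_kills` and the sample log lines are hypothetical (they are not taken from the attachment), and the regular expression assumes the message layout used by kernels of the 4.18 era, which may differ on other versions.

```python
import re
from collections import Counter

# Assumed layout (4.18-era kernels):
#   Killed process <pid> (<comm>) total-vm:<n>kB, anon-rss:<n>kB, ...
KILLED_RE = re.compile(r"Killed process (\d+) \((.+)\) total-vm:")


def summarize_oom_kills(dmesg_text):
    """Return a Counter mapping process name -> number of OOM kills seen."""
    kills = Counter()
    for line in dmesg_text.splitlines():
        match = KILLED_RE.search(line)
        if match:
            kills[match.group(2)] += 1
    return kills


# Hypothetical sample lines for illustration only, not from the real attachment.
SAMPLE = """\
[12345.678901] Out of memory: Kill process 2101 (mysqld) score 312 or sacrifice child
[12345.678950] Killed process 2101 (mysqld) total-vm:1843200kB, anon-rss:512000kB, file-rss:0kB, shmem-rss:0kB
[12346.000100] Killed process 1877 (httpd) total-vm:402400kB, anon-rss:96000kB, file-rss:0kB, shmem-rss:0kB
[12346.500200] Killed process 1878 (httpd) total-vm:402400kB, anon-rss:95000kB, file-rss:0kB, shmem-rss:0kB
"""

if __name__ == "__main__":
    # Print the most frequently killed process names first.
    for name, count in summarize_oom_kills(SAMPLE).most_common():
        print(f"{name}: {count}")
```

To use it on a real capture, the saved dmesg text could be read from a file and passed to `summarize_oom_kills` in place of `SAMPLE`.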

Comment 1 Justin M. Forbes 2019-01-29 16:24:12 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There are a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 28 kernel bugs.

Fedora 28 has now been rebased to 4.20.5-100.fc28.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 29, and are still experiencing this issue, please change the version to Fedora 29.

If you experience different issues, please open a new bug report for those.

Comment 2 Justin M. Forbes 2019-02-21 21:10:16 UTC
*********** MASS BUG UPDATE **************
This bug is being closed with INSUFFICIENT_DATA as there has not been a response in 3 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously.