Bug 1277329 - Core dump when running openshift for several days
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Pod
Version: 3.0.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Assigned To: Dan Mace
QA Contact: Jianwei Hou
Depends On:
Blocks:
Reported: 2015-11-02 23:57 EST by Anping Li
Modified: 2016-01-26 14:16 EST
CC List: 7 users

Doc Type: Bug Fix
Last Closed: 2016-01-26 14:16:38 EST
Type: Bug


Attachments: None
Description Anping Li 2015-11-02 23:57:41 EST
Description:
During system testing, which runs a lot of 'oc login' and 'oc new-project' commands, a core dump appeared on the master.


Version-Release number of selected component (if applicable):
openshift version
openshift v3.0.2.903-114-g2849767
kubernetes v1.2.0-alpha.1-1107-g4c8e6f4
etcd 2.1.2

Environment:
  Linux 10.66.79.249 3.10.0-326.el7.x86_64
  8GB RAM | 2 VCPU | 40.0GB Disk

How reproducible:
Twice during testing.

Steps to Reproduce:

1. Set up the OpenShift environment
2. Run the system tests:
   About 500 users log in one by one
   Create some new projects and add new applications

Actual Result:
The master dumps core:
-rw-------. 1 root root 9.3G Nov  3 06:34 /var/lib/origin/core.7271

Expected Result:
No core dump appears.
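
The load described in the steps above can be sketched roughly as follows. This is only an illustration, not the actual test harness used for this report: the user names, project names, server URL, and password are hypothetical placeholders, and only the 'oc login' and 'oc new-project' commands come from the description.

```python
import subprocess


def build_commands(user_count, server, password, projects_per_user=1):
    """Build the oc command lines for a sequential login/new-project loop.

    User and project names are hypothetical placeholders; the report only
    says that about 500 users log in one by one and create projects.
    """
    commands = []
    for i in range(user_count):
        user = f"testuser{i}"  # hypothetical naming scheme
        commands.append(["oc", "login", server, "-u", user, "-p", password])
        for j in range(projects_per_user):
            commands.append(["oc", "new-project", f"{user}-proj{j}"])
    return commands


def run_commands(commands, dry_run=True):
    """Print each command, or execute it when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the loop on the first failing command
            subprocess.run(cmd, check=True)
```

Calling run_commands(build_commands(500, "https://master.example.com:8443", "secret")) prints the commands without touching a cluster; the server URL here is an assumption and must be replaced with a real master before running with dry_run=False.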
Comment 6 Paul Weil 2015-11-09 10:48:06 EST
Based on https://github.com/openshift/origin/issues/5737#issuecomment-154767531 I am marking this for the upcoming release.

https://github.com/openshift/origin/pull/5760 should help and https://github.com/openshift/origin/pull/5791 is being reviewed.
Comment 7 Anping Li 2015-11-09 18:56:28 EST
Just adding a note:
The master process was restarted during a longevity run. Sometimes a core dump was created, and sometimes not.
Comment 8 Anping Li 2015-11-16 01:19:02 EST
I also found core files on the nodes. The file names are listed here.

Node1:
-rw-------. 1 root root 104706048 Nov  2 01:03 core.7407
-rw-------. 1 root root 105684992 Nov  2 01:09 core.7582

Node2:
[root@10 origin]# ll
total 4767976
-rw-------. 1 root root 431923200 Nov  7 23:22 core.103285
-rw-------. 1 root root 342179840 Nov  8 02:51 core.27960
-rw-------. 1 root root 326107136 Nov  8 05:30 core.37090
-rw-------. 1 root root 298287104 Nov  8 06:50 core.43346
-rw-------. 1 root root 294760448 Nov  8 08:10 core.46545
-rw-------. 1 root root 266563584 Nov  8 09:32 core.48717
-rw-------. 1 root root 280465408 Nov  8 11:33 core.50656
-rw-------. 1 root root 238465024 Nov  8 12:01 core.53502
-rw-------. 1 root root 265490432 Nov  8 12:26 core.54257
-rw-------. 1 root root 277458944 Nov  8 13:14 core.54986
-rw-------. 1 root root 243896320 Nov  8 13:56 core.56204
-rw-------. 1 root root 238051328 Nov  8 14:10 core.57402
-rw-------. 1 root root 338337792 Nov  8 20:17 core.57745
-rw-------. 1 root root 248844288 Nov  8 21:00 core.71455
-rw-------. 1 root root 309309440 Nov  8 23:32 core.72688
-rw-------. 1 root root 770785280 Nov  7 04:43 core.976
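
A listing like the ones above could be collected with a small script instead of by hand. The following is a minimal sketch, assuming the core files are named core.<pid> and sit directly in one directory (the report shows /var/lib/origin on the master); the function names are hypothetical and not part of any OpenShift tooling.

```python
from datetime import datetime, timezone
from pathlib import Path


def find_core_files(directory):
    """Return (name, size_bytes, mtime_utc) for each core.* file, oldest first."""
    entries = []
    for path in Path(directory).glob("core.*"):
        if not path.is_file():
            continue  # skip directories that happen to match the pattern
        stat = path.stat()
        mtime = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
        entries.append((path.name, stat.st_size, mtime))
    # Sorting by modification time shows how often the process crashed
    return sorted(entries, key=lambda entry: entry[2])


def summarize(entries):
    """Return (file_count, total_bytes) for a list from find_core_files."""
    return len(entries), sum(size for _, size, _ in entries)
```

For example, summarize(find_core_files("/var/lib/origin")) would give the number of core files and their combined size on a host, which makes it easier to compare runs before and after a fix.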
Comment 9 Anping Li 2015-11-17 20:26:19 EST
I have stored all the core dump files. Leave me a message if anyone needs them.
Comment 11 Dan Mace 2016-01-07 09:02:11 EST
Let's re-test to see if the core dumps are still occurring since the referenced PRs have been merged. Several memory leaks have been plugged since this issue was filed which could have been responsible for the crashes.
Comment 12 Anping Li 2016-01-08 02:32:32 EST
I will run the tests for about 3 days and update with the result.
Comment 13 Anping Li 2016-01-11 22:04:17 EST
I ran reliability testing for 4 days and there was no core dump, so I am moving the bug to verified.
Comment 15 errata-xmlrpc 2016-01-26 14:16:38 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2016:0070
