Bug 1288014

Summary: Panic if redhat/openshift-ovs-multitenant is enabled
Product: OpenShift Container Platform
Component: Networking
Version: 3.1.0
Severity: high
Priority: high
Status: CLOSED ERRATA
Reporter: Alexander Koksharov <akokshar>
Assignee: Dan Winship <danw>
QA Contact: Meng Bo <bmeng>
CC: aos-bugs, bleanhar, eparis, jokerman, mmccomas, pep, pruan, zzhao
Hardware: Unspecified
OS: Unspecified
Type: Bug
Doc Type: Bug Fix
Last Closed: 2016-01-26 19:19:29 UTC
Bug Blocks: 1267746

Description Alexander Koksharov 2015-12-03 10:23:22 UTC
Description of problem:
If redhat/openshift-ovs-multitenant is enabled, the atomic-openshift-node service constantly restarts.
See the attachments for more information.

Version-Release number of selected component (if applicable):


How reproducible:
Enable the multitenant plugin.

Steps to Reproduce:
1.
2.
3.

Actual results:
On a customer's platform, the atomic-openshift-node service crashes and restarts repeatedly.

Expected results:


Additional info:

from gdb:
Core was generated by `/usr/bin/openshift start node --config=/etc/origin/node/node-config.yaml --logl'.
Program terminated with signal 6, Aborted.
#0  0x0000000000440bf4 in ?? ()
"/root/./core.13087" is a core file.

core file and journal logs attached
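
For reference, a typical way to open the attached core against the node binary in gdb would be the following (binary and core paths taken from the output above; without matching debuginfo the frames may only show addresses, as in the "?? ()" frame):

    gdb /usr/bin/openshift /root/core.13087
    (gdb) bt
    (gdb) info threads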

Comment 3 Dan Winship 2015-12-03 16:49:55 UTC
This isn't actually related to multitenant. They'd get the same crash with openshift-ovs-subnet.

There's something odd about some container in this environment that is tripping up the code. Could you get the output of "oc get pods --all-namespaces -o json"?
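
For example, the requested listing can be captured to a file and attached to this bug:

    oc get pods --all-namespaces -o json > pods.json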

Comment 5 Dan Winship 2015-12-04 14:55:18 UTC
Hm... there's nothing in that output that should cause this error... the bad pod must have gotten cleaned up or something.

Anyway, this appears to have already been fixed in git: https://github.com/openshift/openshift-sdn/pull/214

Moving this to ON_QA, although I don't have a reproducer for it so I'm not sure they can really test it...

Comment 8 Meng Bo 2016-01-04 05:36:55 UTC
The issue has been fixed with puddle 2015-12-19.3.

RPM versions:
atomic-openshift-node-3.1.1.0-1.git.0.8632732.el7aos.x86_64
atomic-openshift-3.1.1.0-1.git.0.8632732.el7aos.x86_64
atomic-openshift-sdn-ovs-3.1.1.0-1.git.0.8632732.el7aos.x86_64

Reproduce steps (see the example below):
1. Create a pod that requests a PVC which has not been created yet.
2. Check that the pod is stuck in ContainerCreating.
3. Restart the openshift-node service on the node where the pod is scheduled.

No panic in the node log, and the openshift-node service keeps running.
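
A minimal sketch of such a reproducer, assuming an illustrative project name (pvc-test), pod name (pvc-pod), claim name (missing-claim), and image (busybox); none of these names come from the original report:

    # create a pod that mounts a PVC that does not exist yet
    oc new-project pvc-test
    cat <<EOF | oc create -f -
    apiVersion: v1
    kind: Pod
    metadata:
      name: pvc-pod
    spec:
      containers:
      - name: busybox
        image: busybox
        command: ["sleep", "3600"]
        volumeMounts:
        - name: data
          mountPath: /data
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: missing-claim   # PVC intentionally not created
    EOF

    # the pod should stay in ContainerCreating
    oc get pod pvc-pod

    # on the node hosting the pod, restart the node service and check for panics
    systemctl restart atomic-openshift-node
    journalctl -u atomic-openshift-node | grep -i panic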

Comment 10 errata-xmlrpc 2016-01-26 19:19:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2016:0070