Bug 1295486

Summary: Dead loop on virtual device tun0 causes stack corruption and reboot
Product: OpenShift Container Platform Reporter: Ryan Howe <rhowe>
Component: Networking Assignee: Dan Williams <dcbw>
Status: CLOSED CURRENTRELEASE QA Contact: Meng Bo <bmeng>
Severity: low Docs Contact:
Priority: medium    
Version: 3.1.0 CC: aos-bugs, dale, dcbw, eparis, fleitner, hsowa, jbenc, jkrieger, jsiddle, pep, rhowe, rkhan, yadu
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1297881 Environment:
Last Closed: 2016-05-17 08:03:55 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1286671, 1297881    

Comment 2 Ryan Howe 2016-01-04 16:56:03 UTC
Description of problem:

A hostsubnet object for a single node whose hostip is equal to the IP address of that node's tun0 interface causes the following error in dmesg:

"Dead loop on virtual device tun0, fix it urgently!"

This results in stack corruption and a reboot of all other nodes, even those with a correct hostsubnet configuration.
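
To illustrate the misconfiguration, here is a minimal sketch of a check that flags hostsubnet entries whose host IP falls inside the node's own SDN subnet. It rests on the assumption that the tun0 address is taken from that subnet, so a hostip equal to the tun0 address would be caught. The dictionary keys ("host", "hostIP", "subnet") and the function name are hypothetical, chosen for illustration only; this is not the check implemented in openshift-sdn.

```python
import ipaddress


def find_misconfigured_hostsubnets(hostsubnets):
    """Return the host names whose hostIP lies inside their own SDN subnet.

    Each entry is a dict with the hypothetical keys "host", "hostIP" and
    "subnet" (a CIDR string). Assumption: tun0 gets its address from the
    node's SDN subnet, so a hostIP inside that subnet points back at the
    overlay instead of at the node's real network interface.
    """
    flagged = []
    for entry in hostsubnets:
        host_ip = ipaddress.ip_address(entry["hostIP"])
        subnet = ipaddress.ip_network(entry["subnet"])
        if host_ip in subnet:
            flagged.append(entry["host"])
    return flagged


if __name__ == "__main__":
    # Example data is made up for illustration.
    example = [
        {"host": "node1", "hostIP": "10.1.0.1", "subnet": "10.1.0.0/24"},
        {"host": "node2", "hostIP": "192.168.1.11", "subnet": "10.1.1.0/24"},
    ]
    print(find_misconfigured_hostsubnets(example))
```

With the made-up data above, only node1 is reported, because its hostIP sits inside its own SDN subnet.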

Comment 11 Dan Williams 2016-01-12 19:33:31 UTC
Upstream openshift-sdn pull request to prevent this on the openshift side is https://github.com/openshift/openshift-sdn/pull/245

Comment 13 Dan Williams 2016-04-12 18:42:51 UTC
https://github.com/openshift/openshift-sdn/pull/245 got merged to origin and will be present in OpenShift 3.2 and later.

Comment 14 Yan Du 2016-04-14 07:03:15 UTC
Tested on the latest OSE environment; the issue has been fixed.

openshift v3.2.0.15
kubernetes v1.2.0-36-g4a3f9c5
etcd 2.2.5

After changing the hostip in the hostsubnet on master1 for node1 to equal the IP of its tun0, node2 kept working normally without any crash or reboot, and the pods located on node1 could be moved to node2.
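
When repeating this verification, the kernel log of the other nodes can be scanned for the error from the description. Below is a small sketch, assuming the dmesg output has already been captured as text; the function name is hypothetical.

```python
import re

# Kernel message quoted in the bug description, with the device name captured.
DEAD_LOOP_RE = re.compile(r"Dead loop on virtual device (\S+), fix it urgently!")


def dead_loop_devices(log_text):
    """Return the sorted, de-duplicated device names reported in log_text."""
    return sorted({match.group(1) for match in DEAD_LOOP_RE.finditer(log_text)})


if __name__ == "__main__":
    # Sample log lines are made up for illustration.
    sample = (
        "[  101.000000] Dead loop on virtual device tun0, fix it urgently!\n"
        "[  101.000100] Dead loop on virtual device tun0, fix it urgently!\n"
        "[  102.000000] unrelated kernel message\n"
    )
    print(dead_loop_devices(sample))
```

An empty result on node2 after the hostip change would be consistent with the fix being in place.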

Comment 15 Josep 'Pep' Turro Mauri 2016-05-17 08:03:55 UTC
(In reply to Dan Williams from comment #13)
> https://github.com/openshift/openshift-sdn/pull/245 got merged to origin and
> will be present in OpenShift 3.2 and later.

OSE 3.2 is now available: https://access.redhat.com/errata/RHSA-2016:1064

(In reply to Yan Du from comment #14)
> Test latest OSE env, issue have been fixed.
> 
> openshift v3.2.0.15
> kubernetes v1.2.0-36-g4a3f9c5
> etcd 2.2.5

The released 3.2 version is atomic-openshift-3.2.0.20-1.git.0.f44746c.el7, which should contain the fix.