Bug 1295486 - Dead loop on virtual device tun0 causes stack corruption and reboot
Status: CLOSED CURRENTRELEASE
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 3.1.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: low
Assigned To: Dan Williams
QA Contact: Meng Bo
Depends On:
Blocks: 1286671 1297881
Reported: 2016-01-04 11:45 EST by Ryan Howe
Modified: 2016-05-17 04:03 EDT
CC List: 13 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1297881
Environment:
Last Closed: 2016-05-17 04:03:55 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Comment 2 Ryan Howe 2016-01-04 11:56:03 EST
Description of problem:

A hostsubnet object for a single node that contains a hostip equal to the IP of that node's tun0 interface causes this error in dmesg:

"Dead loop on virtual device tun0, fix it urgently!"

This results in stack corruption and a reboot of all other nodes, even those with a correct hostsubnet configuration.
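
To confirm which nodes are hitting this, the kernel log can be scanned for the message quoted above. The following is a minimal sketch, assuming the log text has already been captured (for example from `dmesg` output); the function name `dead_loop_devices` and the sample log lines are hypothetical and only for illustration.

```python
import re

# Matches the kernel message quoted in this report and captures the device name.
DEAD_LOOP_RE = re.compile(r"Dead loop on virtual device (\S+), fix it urgently!")


def dead_loop_devices(log_text):
    """Return a dict mapping device name to the number of dead-loop messages seen."""
    counts = {}
    for line in log_text.splitlines():
        match = DEAD_LOOP_RE.search(line)
        if match:
            device = match.group(1)
            counts[device] = counts.get(device, 0) + 1
    return counts


if __name__ == "__main__":
    # Made-up sample input; real timestamps and surrounding lines will differ.
    sample = (
        "[ 100.000001] Dead loop on virtual device tun0, fix it urgently!\n"
        "[ 100.000002] some unrelated kernel message\n"
        "[ 100.000003] Dead loop on virtual device tun0, fix it urgently!\n"
    )
    print(dead_loop_devices(sample))
```

A non-empty result that names tun0 would match the symptom described here; an empty result does not rule out other causes of a reboot.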
Comment 11 Dan Williams 2016-01-12 14:33:31 EST
The upstream openshift-sdn pull request to prevent this on the OpenShift side is https://github.com/openshift/openshift-sdn/pull/245
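
For illustration only, the kind of check that would catch the misconfiguration described in this bug can be sketched as below. This is not the code from the pull request, whose contents are not reproduced here; the function name `validate_host_ip`, its parameters, and the sample addresses are all hypothetical. The subnet test is an extra assumption beyond what this report states (the report only mentions a hostip equal to the tun0 address).

```python
import ipaddress


def validate_host_ip(host_ip, tun0_ip, node_subnet):
    """Reject a hostip that equals the tun0 address or lies inside the node's SDN subnet.

    Returns the parsed address when it passes both checks.
    """
    host = ipaddress.ip_address(host_ip)
    if host == ipaddress.ip_address(tun0_ip):
        raise ValueError(f"hostip {host} is the same as the tun0 address")
    # Assumption: a hostip inside the node's own SDN subnet is also treated as invalid.
    if host in ipaddress.ip_network(node_subnet):
        raise ValueError(f"hostip {host} is inside the SDN subnet {node_subnet}")
    return host


if __name__ == "__main__":
    # Made-up addresses for illustration.
    print(validate_host_ip("192.168.1.10", "10.1.0.1", "10.1.0.0/24"))
    try:
        validate_host_ip("10.1.0.1", "10.1.0.1", "10.1.0.0/24")
    except ValueError as err:
        print("rejected:", err)
```

Running a check of this kind before a hostsubnet is accepted would reject the bad entry instead of letting it reach the other nodes.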
Comment 13 Dan Williams 2016-04-12 14:42:51 EDT
https://github.com/openshift/openshift-sdn/pull/245 got merged to origin and will be present in OpenShift 3.2 and later.
Comment 14 Yan Du 2016-04-14 03:03:15 EDT
Tested on the latest OSE environment; the issue has been fixed.

openshift v3.2.0.15
kubernetes v1.2.0-36-g4a3f9c5
etcd 2.2.5

After changing the hostip in the hostsubnet for node1 on master1 to equal the IP of node1's tun0, node2 continued to work normally without any crash or reboot, and the pods located on node1 could be moved to node2.
Comment 15 Josep 'Pep' Turro Mauri 2016-05-17 04:03:55 EDT
(In reply to Dan Williams from comment #13)
> https://github.com/openshift/openshift-sdn/pull/245 got merged to origin and
> will be present in OpenShift 3.2 and later.

OSE 3.2 is now available: https://access.redhat.com/errata/RHSA-2016:1064

(In reply to Yan Du from comment #14)
> Tested on the latest OSE environment; the issue has been fixed.
> 
> openshift v3.2.0.15
> kubernetes v1.2.0-36-g4a3f9c5
> etcd 2.2.5

The released 3.2 version is atomic-openshift-3.2.0.20-1.git.0.f44746c.el7, which should contain the fix.
