1214832 – openshift-tc fails at boot time if node uses a bonded NIC due to init script priority

Bug 1214832 - openshift-tc fails at boot time if node uses a bonded NIC due to init script priority

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Containers
Sub Component:
Version:	2.2.0
Hardware:	All
OS:	Linux
Priority:	high
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	Timothy Williams
QA Contact:	libra bugs
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2015-04-23 15:48 UTC by Josep 'Pep' Turro Mauri
Modified:	2019-07-11 09:01 UTC
CC List:	7 users
Fixed In Version:	rubygem-openshift-origin-node-1.36.2.2-1
Doc Type:	Bug Fix
Doc Text:	Previously, when a bonded NIC was configured as the external interface for a node host, the openshift-tc service failed to start at boot. Bonded interfaces are brought up by the network service, which did not start until after the openshift-tc service. This bug fix changes the priority of the openshift-tc service on nodes so that it starts after the network service by default. As a result, the network service starts first and initializes the bonded interface, and the openshift-tc service starts successfully on boot when a bonded interface is configured as the external interface.
Clone Of:
Environment:
Last Closed:	2015-07-21 19:12:21 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2015:1463	0	normal	SHIPPED_LIVE	Red Hat OpenShift Enterprise 2.2.6 bug fix and enhancement update	2015-07-21 23:11:33 UTC

Description Josep 'Pep' Turro Mauri 2015-04-23 15:48:50 UTC

Description of problem:
openshift-tc's init script has a default start priority of 7, so it is started before the 'network' service (priority 10). As a result, openshift-tc fails to start when the NIC configured for OpenShift (EXTERNAL_ETH_DEV) is only brought up by the network service, as is the case with a bonded interface.

Version-Release number of selected component (if applicable):
rubygem-openshift-origin-node-1.34.1.1-1.el6op.noarch

How reproducible:
Always

Steps to Reproduce:
1. Install a node with NIC bonding and configure EXTERNAL_ETH_DEV=bond0
2. Enable TC (TRAFFIC_CONTROL_ENABLED=true, chkconfig openshift-tc on)
3. boot
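
Before rebooting, the two settings from the steps above can be confirmed with a short sketch like the following. It assumes the node settings live in a shell-style KEY=value file (on OpenShift Enterprise 2 nodes this is typically /etc/openshift/node.conf, which this report does not state). conf_get is a hypothetical helper, not part of any OpenShift tool.

```shell
# Hypothetical helper: print the value of KEY ($1) from a KEY=value file ($2),
# dropping surrounding quotes and any trailing "# comment".
# If the key appears more than once, the last assignment wins.
conf_get() {
    sed -n "s/^[[:space:]]*${1}[[:space:]]*=[[:space:]]*//p" "$2" |
        tail -n 1 |
        sed -e 's/[[:space:]]*#.*$//' -e "s/^[\"']//" -e "s/[\"'][[:space:]]*\$//"
}

# Example (path is an assumption, commented out so the block has no side effects):
# conf_get EXTERNAL_ETH_DEV /etc/openshift/node.conf
# conf_get TRAFFIC_CONTROL_ENABLED /etc/openshift/node.conf
```

A value that itself contains a '#' would be truncated by this sketch, which is acceptable for interface names and booleans.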

Actual results:
openshift-tc fails to start. Extract from boot.log:
......
Starting systemtap: [WARNING]
Calling the system activity data collector (sadc)...
Starting monitoring for VG root_vg:   5 logical volume(s) in volume group "root_vg" monitored
[  OK  ]
Starting cgconfig service: [  OK  ]
Starting multipathd daemon: [  OK  ]
/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.23.9.21/lib/openshift-origin-node/utils/tc.rb:107:in `get_interface_mtu': Unable to determine external network interface IP address. (RuntimeError)
        from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.23.9.21/lib/openshift-origin-node/utils/tc.rb:80:in `initialize'
        from /usr/sbin/oo-admin-ctl-tc:28:in `new'
        from /usr/sbin/oo-admin-ctl-tc:28:in `<main>'
iptables: Applying firewall rules: [  OK  ]
iptables: Loading additional modules: nf_conntrack [  OK  ]
Bringing up loopback interface:  [  OK  ]
Bringing up interface bond0:  Determining if ip address 22.114.234.40 is already in use for device bond0...
[  OK  ]
Bringing up interface bond1:  Determining if ip address 22.115.210.67 is already in use for device bond1...
[  OK  ]
Starting auditd: [  OK  ]
.....
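
The trace reports that the external interface had no determinable IP address, several lines before the log shows bond0 being brought up. A rough way to observe the same condition from a shell is to test whether the device has an IPv4 address yet. In this sketch, has_ipv4 is a hypothetical helper that reads `ip -o -4 addr show` output on standard input; it is only an illustration and not how tc.rb performs its lookup.

```shell
# Hypothetical helper: succeed if device $1 has an IPv4 address according to
# `ip -o -4 addr show` output read from standard input.
# In one-line (-o) output, field 2 is the device name and field 3 is "inet".
has_ipv4() {
    awk -v dev="$1" '$2 == dev && $3 == "inet" { found = 1 } END { exit !found }'
}

# Example (commented out so the block has no side effects):
# ip -o -4 addr show | has_ipv4 bond0 || echo "bond0 has no IPv4 address yet"
```

Run at the point where openshift-tc starts with priority 7, a check like this would be expected to fail for bond0, since the network service has not configured it yet.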

Expected results:
openshift-tc starts successfully at boot, after the network service has brought up bond0.

Additional info:
[root@node4 init.d]# grep chkconfig network openshift-tc
network:# chkconfig: 2345 10 90
openshift-tc:# chkconfig:    345 7 93
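
The comparison made by the grep above can be automated with a small sketch. start_priority and starts_after are hypothetical helper names; they read only the "# chkconfig: <runlevels> <start> <stop>" header of each init script.

```shell
# Hypothetical helper: print the start priority from an init script's
# "# chkconfig: <runlevels> <start> <stop>" header (first match only).
start_priority() {
    sed -n 's/^#[[:space:]]*chkconfig:[[:space:]]*[^[:space:]]*[[:space:]][[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" |
        head -n 1
}

# Succeed only if script $1 has a strictly higher start priority than
# script $2, i.e. it is started later at boot.
starts_after() {
    a=$(start_priority "$1")
    b=$(start_priority "$2")
    [ -n "$a" ] && [ -n "$b" ] && [ "$a" -gt "$b" ]
}

# Example (commented out so the block has no side effects):
# cd /etc/init.d && starts_after openshift-tc network || echo "openshift-tc starts too early"
```

With the headers shown above (7 for openshift-tc, 10 for network), the check fails, which matches the boot failure described in this report.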

Comment 3 Timothy Williams 2015-05-19 20:03:41 UTC

https://github.com/openshift/origin-server/pull/6146

Comment 8 Anping Li 2015-05-22 15:54:00 UTC

Verified; passes.

1) The chkconfig priority has been changed
[root@broker init.d]# grep chkconfig network openshift-tc
network:# chkconfig: 2345 10 90
openshift-tc:# chkconfig:    345 11 89

2) The network service starts at boot.
Starting monitoring for VG vg_dhcp12945:   2 logical volume(s) in volume group "vg_dhcp12945" monitored
[  OK  ]
Starting cgconfig service: [  OK  ]
ip6tables: Applying firewall rules: [  OK  ]
iptables: Applying firewall rules: [  OK  ]
Bringing up loopback interface:  [  OK  ]
Bringing up interface bond0:
Determining IP information for bond0... done.
[  OK  ]
3) openshift-tc is started
[root@broker init.d]# service openshift-tc status
Bandwidth shaping status: 
qdisc htb 1: root refcnt 17 r2q 10 default 0 direct_packets_stat 2075
 Sent 249309 bytes 2079 pkt (dropped 0, overlimits 0 requeues 0) 
 rate 0bit 0pps backlog 0b 0p requeues 0 

class htb 1:1 root prio 0 rate 800000Kbit ceil 800000Kbit burst 1600b cburst 1600b 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 rate 0bit 0pps backlog 0b 0p requeues 0 
 lended: 0 borrowed: 0 giants: 0
 tokens: 250 ctokens: 250

 [OK]
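
At boot, rc runs the S<priority><name> links in a runlevel directory in lexical order, so the effective start order can also be read from the link names. The sketch below assumes the conventional /etc/rc3.d layout, which this report does not show; start_order is a hypothetical helper.

```shell
# Hypothetical helper: list the start links (S<NN><name>) in runlevel
# directory $1, in the sorted order produced by shell glob expansion.
start_order() {
    for link in "$1"/S[0-9][0-9]*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$link" ] || [ -L "$link" ] || continue
        basename "$link"
    done
}

# Example (path is an assumption, commented out so the block has no side effects):
# start_order /etc/rc3.d | grep -n -e network -e openshift-tc
```

With the fixed priorities, a link named S10network would be expected to be listed before S11openshift-tc.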

Comment 10 errata-xmlrpc 2015-07-21 19:12:21 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-1463.html
