== Description of problem:
When the interface is not managed by NetworkManager, the IPsrcaddr resource fails
to stop after a successful IPsrcaddr start and monitor.
== Version-Release number of selected component (if applicable):
resource-agents-3.9.5-105.el7
== How reproducible:
Always
== Steps to Reproduce:
1. Configure interface ensX with IPADDR, GATEWAY and NM_CONTROLLED="no" (for example IPADDR=10.0.0.85, PREFIX=24, GATEWAY=10.0.0.1)
2. Configure an IPaddr2 resource (vip) in the cluster with IP y.y.y.y in the same subnet as ensX but different from IPADDR and GATEWAY (for example y.y.y.y=10.0.0.10)
3. Configure an IPsrcaddr resource (vip_route) in the cluster to use the IPaddr2 address as the source address of the default route (ipaddress=10.0.0.10 cidr_netmask=24)
4. Disable both resources, then run debug-start, debug-monitor and debug-stop on them on a single node
# pcs resource disable vip
# pcs resource disable vip_route
# pcs resource debug-start vip
# pcs resource debug-start vip_route
# pcs resource debug-monitor vip_route
# pcs resource debug-stop vip_route
== Actual results:
# pcs resource debug-start vip
Operation start for vip (ocf:heartbeat:IPaddr2) returned 0
# pcs resource debug-start vip_route
Operation start for vip_route (ocf:heartbeat:IPsrcaddr) returned 0
# pcs resource debug-monitor vip_route
Operation monitor for vip_route (ocf:heartbeat:IPsrcaddr) returned 0
# pcs resource debug-stop vip_route
Error performing operation: Operation not permitted
Operation stop for vip_route (ocf:heartbeat:IPsrcaddr) returned 1
> stderr: Usage: ip route { list | flush } SELECTOR
> stderr: ip route save SELECTOR
> stderr: ip route restore
> stderr: ip route showdump
> stderr: ip route get ADDRESS [ from ADDRESS iif STRING ]
> stderr: [ oif STRING ] [ tos TOS ]
> stderr: [ mark NUMBER ]
> stderr: ip route { add | del | change | append | replace } ROUTE
> stderr: SELECTOR := [ root PREFIX ] [ match PREFIX ] [ exact PREFIX ]
> stderr: [ table TABLE_ID ] [ proto RTPROTO ]
> stderr: [ type TYPE ] [ scope SCOPE ]
> stderr: ROUTE := NODE_SPEC [ INFO_SPEC ]
> stderr: NODE_SPEC := [ TYPE ] PREFIX [ tos TOS ]
> stderr: [ table TABLE_ID ] [ proto RTPROTO ]
> stderr: [ scope SCOPE ] [ metric METRIC ]
> stderr: INFO_SPEC := NH OPTIONS FLAGS [ nexthop NH ]...
> stderr: NH := [ via ADDRESS ] [ dev STRING ] [ weight NUMBER ] NHFLAGS
> stderr: OPTIONS := FLAGS [ mtu NUMBER ] [ advmss NUMBER ]
> stderr: [ rtt TIME ] [ rttvar TIME ] [reordering NUMBER ]
> stderr: [ window NUMBER ] [ cwnd NUMBER ] [ initcwnd NUMBER ]
> stderr: [ ssthresh NUMBER ] [ realms REALM ] [ src ADDRESS ]
> stderr: [ rto_min TIME ] [ hoplimit NUMBER ] [ initrwnd NUMBER ]
> stderr: [ features FEATURES ] [ quickack BOOL ] [ congctl NAME ]
> stderr: [ expires TIME ]
> stderr: TYPE := { unicast | local | broadcast | multicast | throw |
> stderr: unreachable | prohibit | blackhole | nat }
> stderr: TABLE_ID := [ local | main | default | all | NUMBER ]
> stderr: SCOPE := [ host | link | global | NUMBER ]
> stderr: NHFLAGS := [ onlink | pervasive ]
> stderr: RTPROTO := [ kernel | boot | static | NUMBER ]
> stderr: TIME := NUMBER[s|ms]
> stderr: BOOL := [1|0]
> stderr: FEATURES := ecn
> stderr: ocf-exit-reason:command 'ip route replace dev ensX' failed
== Expected results:
# pcs resource debug-start vip
Operation start for vip (ocf:heartbeat:IPaddr2) returned 0
# pcs resource debug-start vip_route
Operation start for vip_route (ocf:heartbeat:IPsrcaddr) returned 0
# pcs resource debug-monitor vip_route
Operation monitor for vip_route (ocf:heartbeat:IPsrcaddr) returned 0
# pcs resource debug-stop vip_route
Operation stop for vip_route (ocf:heartbeat:IPsrcaddr) returned 0
== Additional info:
The issue is not reproducible when the interface is managed by NetworkManager.
Below is the output of 'ip route' with and without NetworkManager in use.
=======
## bond0 without NetworkManager clean start
default via 10.0.0.1 dev bond0
10.0.0.0/24 dev bond0 proto kernel scope link src 10.0.0.85
# after starting IPsrcaddr
default via 10.0.0.1 dev bond0 src 10.0.0.10
10.0.0.0/24 dev bond0 scope link src 10.0.0.10
# attempting to stop IPsrcaddr results in error (exit code 1)
## bond0 with NetworkManager clean start
default via 10.0.0.1 dev bond0 proto static metric 300
10.0.0.0/24 dev bond0 proto kernel scope link src 10.0.0.85 metric 300
# after starting IPsrcaddr
default via 10.0.0.1 dev bond0 proto static src 10.0.0.10 metric 300
10.0.0.0/24 dev bond0 scope link src 10.0.0.10
10.0.0.0/24 dev bond0 proto kernel scope link src 10.0.0.85 metric 300
# after stopping the IPsrcaddr
default via 10.0.0.1 dev bond0 proto static metric 300
10.0.0.0/24 dev bond0 scope link
10.0.0.0/24 dev bond0 proto kernel scope link src 10.0.0.85 metric 300
=======
The customer provided a patch containing a workaround that works in their environment (it contains some hardcoded values, such as the bond1 interface name).
--- /usr/lib/ocf/resource.d/heartbeat/IPsrcaddr 2017-06-23 09:32:28.000000000 +0200
+++ /usr/lib/ocf/resource.d/heartbeat/IPsrcaddr.custom 2017-10-16 15:55:31.617636470 +0200
@@ -172,7 +172,7 @@
rc=$OCF_SUCCESS
ocf_log info "The ip route has been already set.($NETWORK, $INTERFACE, $ROUTE_WO_SRC)"
else
- ip route replace $NETWORK dev $INTERFACE src $1 || \
+ ip route replace $NETWORK dev $INTERFACE proto kernel scope link src $1 || \
errorexit "command 'ip route replace $NETWORK dev $INTERFACE src $1' failed"
$CMDCHANGE $ROUTE_WO_SRC src $1 || \
@@ -204,7 +204,10 @@
[ $rc = 2 ] && errorexit "The address you specified to stop does not match the preferred source address"
- ip route replace $NETWORK dev $INTERFACE || \
+# WORKAROUND !!!!
+ PRIMARY=`ip -4 addr show bond1 primary | fgrep inet | awk -c '{ print $2; }' | cut -f1 -d\/`
+
+ ip route replace $NETWORK dev $INTERFACE proto kernel scope link src ${PRIMARY} || \
errorexit "command 'ip route replace $NETWORK dev $INTERFACE' failed"
$CMDCHANGE $ROUTE_WO_SRC || \
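The workaround above hardcodes bond1 when it looks up the primary address. A minimal sketch of the same lookup without the hardcoded name follows; it assumes the agent's $INTERFACE variable would be passed to 'ip' instead. The helper name primary_ipv4 and the sample input are hypothetical and are not taken from the customer's system or from the shipped fix.

```shell
#!/bin/sh
# Hypothetical helper: read "ip -4 addr show dev <if> primary" style output
# on stdin and print the first IPv4 address without its prefix length.
primary_ipv4() {
    awk '$1 == "inet" { sub(/\/.*/, "", $2); print $2; exit }'
}

# Illustrative sample input only, not captured from the reporter's system.
sample='2: bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    inet 10.0.0.85/24 brd 10.0.0.255 scope global bond0
       valid_lft forever preferred_lft forever'

printf '%s\n' "$sample" | primary_ipv4

# On a live node the call might look like this (INTERFACE as set by the agent):
# PRIMARY=$(ip -4 addr show dev "$INTERFACE" primary | primary_ipv4)
```

Parsing in a single awk call replaces the fgrep, awk and cut pipeline from the patch, and stopping at the first match keeps the result to one address.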
Additional tests:
- Test with the NetworkManager service completely off
- no change in behaviour (systemctl disable NetworkManager and reboot; verified that NM was stopped)
- Test adding the same route that NetworkManager adds, to see whether stop then works
- adding the route _after_ starting IPsrcaddr gets us into a state from which things start to work correctly.
However, we cannot add the same route _before_ starting IPsrcaddr, as the start of IPsrcaddr would then fail.
The following lines of the agent seem to be causing the failure:
207 ip route replace $NETWORK dev $INTERFACE || \
471 INTERFACE=`echo $findif_out | awk '{print $1}'`
472 NETWORK=`ip route list dev $INTERFACE scope link proto kernel match $ipaddress|grep -o '^[^ ]*'`
- In the scenario where it fails, we see the following:
> stderr: + 17:02:43: srca_stop:207: ip route replace dev ens7
- In the scenario where it works, we see the following:
> stderr: + 17:03:05: srca_stop:207: ip route replace 10.0.0.0/24 dev ens7
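The two traces differ only in whether a prefix reaches 'ip'. A minimal sketch of why an empty, unquoted variable disappears from the command line, which is what makes 'ip route replace' print its usage text, is shown here. The helpers count_args and require_network are hypothetical; the guard is only one possible way to fail with a clearer message and is not claimed to be the fix that was shipped.

```shell
#!/bin/sh
# Hypothetical helper: print how many arguments were received.
count_args() {
    echo "$#"
}

# Hypothetical guard: fail early with a readable message when the prefix is empty.
require_network() {
    if [ -z "$1" ]; then
        echo "NETWORK is empty, refusing to run 'ip route replace'" >&2
        return 1
    fi
}

INTERFACE=ens7

# Unquoted on purpose, mirroring line 207 of the agent.
NETWORK=10.0.0.0/24
count_args ip route replace $NETWORK dev $INTERFACE

# With an empty value the word vanishes entirely, leaving "ip route replace dev ens7".
NETWORK=
count_args ip route replace $NETWORK dev $INTERFACE

require_network "$NETWORK" || echo "stop would abort here with a clear message"
```

The first call reports six words and the second five, matching the working and failing traces above.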
So it seems that $NETWORK is empty. The output below shows how NETWORK is determined in each case:
# ip route
default via 10.0.0.1 dev ens7 src 10.0.0.10
10.0.0.0/24 dev ens7 scope link src 10.0.0.10
...
> stderr: ++ 17:02:43: 472: ip route list dev ens7 scope link proto kernel match 10.0.0.10
> stderr: ++ 17:02:43: 472: grep -o '^[^ ]*'
> stderr: + 17:02:43: 472: NETWORK=
# ip route
default via 10.0.0.1 dev ens7 src 10.0.0.10
10.0.0.0/24 dev ens7 scope link src 10.0.0.10
10.0.0.0/24 dev ens7 proto kernel scope link src 10.0.0.85 metric 100
...
> stderr: ++ 17:08:13: 472: ip route list dev ens7 scope link proto kernel match 10.0.0.10
> stderr: ++ 17:08:13: 472: grep -o '^[^ ]*'
> stderr: + 17:08:13: 472: NETWORK=10.0.0.0/24
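So it looks like NETWORK is not detected correctly in some cases: when no 'proto kernel' route exists for the subnet, as in the first listing above, the lookup on line 472 returns nothing.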
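The lookup can be replayed offline against the two route listings quoted above. The sketch below is an approximation: it emulates the 'scope link proto kernel' selector of line 472 with grep on captured text instead of calling 'ip', and the function name network_from_routes is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: emulate the agent's line-472 filter on captured text.
# Keeps only "scope link" routes that also carry "proto kernel" and prints
# the first field (the network prefix).
network_from_routes() {
    printf '%s\n' "$1" | grep 'scope link' | grep 'proto kernel' | grep -o '^[^ ]*'
}

# Route table from the failing case (interface not managed by NetworkManager).
routes_failing='default via 10.0.0.1 dev ens7 src 10.0.0.10
10.0.0.0/24 dev ens7 scope link src 10.0.0.10'

# Route table from the working case (the proto kernel route is still present).
routes_working='default via 10.0.0.1 dev ens7 src 10.0.0.10
10.0.0.0/24 dev ens7 scope link src 10.0.0.10
10.0.0.0/24 dev ens7 proto kernel scope link src 10.0.0.85 metric 100'

echo "failing: NETWORK=$(network_from_routes "$routes_failing")"
echo "working: NETWORK=$(network_from_routes "$routes_working")"
```

In the failing table the only link-scope route for the subnet is the one written by the start operation, which carries no 'proto kernel' attribute, so the filter has nothing to return.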
Comment 2 Oyvind Albrigtsen
2017-11-01 15:11:27 UTC
Bumping to 7.6.
Comment 11 Oyvind Albrigtsen
2019-04-05 08:43:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2019:2012