Bug 1400172 - heartbeat: IPsrcaddr: fails unsetting due to duplicate route lines
Summary: heartbeat: IPsrcaddr: fails unsetting due to duplicate route lines
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: resource-agents
Version: 7.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Oyvind Albrigtsen
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-11-30 15:45 UTC by Tzafrir Cohen
Modified: 2020-12-14 07:54 UTC

Fixed In Version: resource-agents-3.9.5-88.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-08-01 14:57:40 UTC
Target Upstream Version:
Embargoed:
tzafrir: needinfo-


Attachments
Fix route editing for IPSrcaddr.sh (3.65 KB, message/rfc822)
2016-12-22 11:51 UTC, Tzafrir Cohen


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Knowledge Base (Solution) | 3037861 | 0 | None | None | None | 2017-05-18 14:07:46 UTC
Red Hat Product Errata | RHBA-2017:1844 | 0 | normal | SHIPPED_LIVE | resource-agents bug fix and enhancement update | 2017-08-01 17:49:20 UTC

Description Tzafrir Cohen 2016-11-30 15:45:33 UTC
Description of problem:
The 'start' operation of /usr/lib/ocf/resource.d/heartbeat/IPsrcaddr runs 'ip route change' to set the source route. This adds a second route line for the local network. This duplicate entry then confuses the 'stop' operation of the script.

Version-Release number of selected component (if applicable):
3.9.5-54.el7_2.17

How reproducible:
Always.

Steps to Reproduce:
Run:

export OCF_ROOT=/usr/lib/ocf
export PATH="/usr/sbin:/sbin:$PATH"
export OCF_RESKEY_ipaddress=192.168.0.253
export OCF_RESKEY_cidr_netmask=20

export OCF_RESKEY_nic=eth0
export OCF_RESKEY_ip=192.168.0.253
export OCF_RESKEY_cidr_netmask=20

/usr/lib/ocf/resource.d/heartbeat/IPaddr2 start

ip route show dev eth0

/usr/lib/ocf/resource.d/heartbeat/IPsrcaddr start

ip route show dev eth0

/usr/lib/ocf/resource.d/heartbeat/IPsrcaddr stop

ip route show dev eth0

/usr/lib/ocf/resource.d/heartbeat/IPaddr2 stop


Actual results:


default via 192.168.0.1  proto static  metric 100 
192.168.0.0/20  proto kernel  scope link  src 192.168.0.196  metric 100 

default via 192.168.0.1  proto static  src 192.168.0.253  metric 100 
192.168.0.0/20  scope link  src 192.168.0.253 
192.168.0.0/20  proto kernel  scope link  src 192.168.0.196  metric 100 

Error: either "to" is duplicate, or "192.168.0.0/20" is a garbage.
ocf-exit-reason:command 'ip route replace 192.168.0.0/20
192.168.0.0/20 dev eth0' failed

default via 192.168.0.1  proto static  src 192.168.0.253  metric 100 
192.168.0.0/20  scope link  src 192.168.0.253 
192.168.0.0/20  proto kernel  scope link  src 192.168.0.196  metric 100 



Expected results:

This is what I get after applying the patch mentioned below.

default via 192.168.0.1 dev eth0  proto static  metric 100 
192.168.0.0/20 dev eth0  proto kernel  scope link  src 192.168.0.196  metric 100

default via 192.168.0.1 dev eth0  proto static  src 192.168.0.253  metric 100 
192.168.0.0/20 dev eth0  scope link  src 192.168.0.253 
192.168.0.0/20 dev eth0  proto kernel  scope link  src 192.168.0.196  metric 100

default via 192.168.0.1  proto static  metric 100 
192.168.0.0/20  scope link 
192.168.0.0/20  proto kernel  scope link  src 192.168.0.196  metric 100 

Additional info:

The fix I applied was to further filter the output of ip route:

--- /usr/lib/ocf/resource.d/heartbeat/IPsrcaddr.orig    2016-11-30 15:43:23.896352263 +0000
+++ /usr/lib/ocf/resource.d/heartbeat/IPsrcaddr 2016-11-30 15:43:39.419058286 +0000
@@ -469,7 +469,7 @@
 }
 
 INTERFACE=`echo $findif_out | awk '{print $1}'`
-NETWORK=`ip route list dev $INTERFACE scope link match $ipaddress|grep -o '^[^ ]*'`
+NETWORK=`ip route list dev $INTERFACE scope link proto kernel match $ipaddress|grep -o '^[^ ]*'`
 
 case $1 in
        start)          srca_start $ipaddress
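
To illustrate the effect of the extra `proto kernel` selector without touching a live routing table, the sketch below runs the same `grep -o '^[^ ]*'` extraction over the route lines shown under "Actual results". The `routes` sample and the `extract_network` helper are hypothetical, and a plain `grep 'proto kernel'` stands in for the selector that `ip route list` would apply itself.

```shell
#!/bin/sh
# Routing table after 'IPsrcaddr start', copied from "Actual results".
routes='default via 192.168.0.1  proto static  src 192.168.0.253  metric 100
192.168.0.0/20  scope link  src 192.168.0.253
192.168.0.0/20  proto kernel  scope link  src 192.168.0.196  metric 100'

# Hypothetical helper: mimic the NETWORK= extraction on canned text.
# $1 is an optional pattern standing in for the 'proto kernel' selector;
# without it, every 'scope link' line is kept.
extract_network() {
    printf '%s\n' "$routes" | grep 'scope link' | grep -e "${1:-.}" | grep -o '^[^ ]*'
}

unfiltered=$(extract_network)
filtered=$(extract_network 'proto kernel')

# Count the lines each variant would place into NETWORK.
echo "without filter: $(printf '%s\n' "$unfiltered" | wc -l) line(s)"
echo "with filter:    $(printf '%s\n' "$filtered" | wc -l) line(s)"
```

Without the filter, the variable holds the network prefix twice, separated by a newline, which matches the malformed `ip route replace` command in the error message above.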

Comment 2 Oyvind Albrigtsen 2016-12-21 14:37:25 UTC
Tested and working patch: https://github.com/ClusterLabs/resource-agents/pull/904

Comment 3 Tzafrir Cohen 2016-12-22 11:46:32 UTC
Thanks for that. However, I realized that things are a bit more complicated. 

The reason we get a duplicate route line is that 'ip route replace' was run with parameters that are different from the actual route. This causes a new route to be created.

We replaced the parsing of the routing table. A new patch will shortly be added.
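
As a rough illustration of the point above, the sketch below builds a `replace` command that carries over the attributes of the existing route, so that only `src` differs. It assumes that the metric is one of the parameters that did not match (the comment does not say which ones did); the `route_attr` and `build_replace` helpers are hypothetical and are not taken from the attached patch. The command is only printed, never executed.

```shell
#!/bin/sh
# Existing kernel route line, copied from the report.
route='192.168.0.0/20 dev eth0  proto kernel  scope link  src 192.168.0.196  metric 100'

# Hypothetical helper: print the value that follows keyword $2 in route line $1.
route_attr() {
    printf '%s\n' "$1" | awk -v key="$2" '
        { for (i = 1; i < NF; i++) if ($i == key) { print $(i + 1); exit } }'
}

# Hypothetical helper: build (but do not run) a replace command that keeps
# the original dev, proto and metric and changes only the source address.
build_replace() {
    line=$1
    new_src=$2
    prefix=${line%% *}
    dev=$(route_attr "$line" dev)
    proto=$(route_attr "$line" proto)
    metric=$(route_attr "$line" metric)
    echo "ip route replace $prefix dev $dev proto $proto scope link src $new_src${metric:+ metric $metric}"
}

cmd=$(build_replace "$route" 192.168.0.253)
echo "$cmd"
```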

Comment 4 Tzafrir Cohen 2016-12-22 11:51:33 UTC
Created attachment 1234728
Fix route editing for IPSrcaddr.sh

Comment 6 Oyvind Albrigtsen 2017-02-24 13:34:05 UTC
(In reply to Tzafrir Cohen from comment #4)
> Created attachment 1234728
> Fix route editing for IPSrcaddr.sh

Can you send me some more information about your setup?

I don't see any issues when I change only the NETWORK= line (so it seems the first part of my patch isn't necessary), so I guess that should be enough to solve the issue, unless there's some special case I'm not hitting.

-NETWORK=`ip route list dev $INTERFACE scope link match $ipaddress|grep -o '^[^ ]*'`
+NETWORK=`ip route list dev $INTERFACE scope link proto kernel match $ipaddress|grep -o '^[^ ]*'`

Comment 8 michal novacek 2017-06-15 07:22:29 UTC
I have verified that the source address is correctly added to the default route when
the IPsrcaddr agent is running in resource-agents-3.9.5-104.el7

----

* configure a cluster with IPaddr2 and IPsrcaddr resources in a group [1]
* disable the group [2]

before the patch (resource-agents-3.9.5-80.el7)
===============================================

[root@host-035 ~]# pcs resource
...
 Resource Group: vip-g
     vip        (ocf::heartbeat:IPaddr2):       Stopped (disabled)
     vip-src    (ocf::heartbeat:IPsrcaddr):     Stopped (disabled)

[root@host-035 ~]# ip ro
> default via 10.15.107.254 dev eth0 proto static metric 100 
10.15.104.0/22 dev eth0 scope link 
10.15.104.0/22 dev eth0 proto kernel scope link src 10.15.105.35 metric 100 
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 
192.168.1.0/24 dev eth1 proto kernel scope link src 192.168.1.35 
192.168.1.0/24 dev eth1 proto kernel scope link src 192.168.1.35 metric 100 

[root@host-035 ~]# pcs resource enable vip-g
[root@host-035 ~]# pcs resource
...
 Resource Group: vip-g
     vip        (ocf::heartbeat:IPaddr2):       Started host-035
     vip-src    (ocf::heartbeat:IPsrcaddr):     Started host-035

[root@host-035 ~]# ip ro
> default via 10.15.107.254 dev eth0 proto static metric 100 
10.15.104.0/22 dev eth0 scope link 
10.15.104.0/22 dev eth0 proto kernel scope link src 10.15.105.35 metric 100 
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 
192.168.1.0/24 dev eth1 proto kernel scope link src 192.168.1.35 
192.168.1.0/24 dev eth1 proto kernel scope link src 192.168.1.35 metric 100 

after the patch (resource-agents-3.9.5-104.el7)
===============================================

[root@host-035 ~]# pcs resource
...
 Resource Group: vip-g
     vip        (ocf::heartbeat:IPaddr2):       Stopped (disabled)
     vip-src    (ocf::heartbeat:IPsrcaddr):     Stopped (disabled)

[root@host-035 ~]# ip ro
default via 10.15.107.254 dev eth0 proto static metric 100 
10.15.104.0/22 dev eth0 scope link 
10.15.104.0/22 dev eth0 proto kernel scope link src 10.15.105.35 metric 100 
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 
192.168.1.0/24 dev eth1 proto kernel scope link src 192.168.1.35 
192.168.1.0/24 dev eth1 proto kernel scope link src 192.168.1.35 metric 100 

[root@host-035 ~]# pcs resource enable vip-g
[root@host-035 ~]# pcs resource
...
 Resource Group: vip-g
     vip        (ocf::heartbeat:IPaddr2):       Started host-035
     vip-src    (ocf::heartbeat:IPsrcaddr):     Started host-035

[root@host-035 ~]# ip ro
> default via 10.15.107.254 dev eth0 proto static src 10.15.107.150 metric 100 
10.15.104.0/22 dev eth0 scope link src 10.15.107.150 
10.15.104.0/22 dev eth0 proto kernel scope link src 10.15.105.35 metric 100 
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 
192.168.1.0/24 dev eth1 proto kernel scope link src 192.168.1.35 
192.168.1.0/24 dev eth1 proto kernel scope link src 192.168.1.35 metric 100
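
The before/after comparison above comes down to whether the default route carries `src 10.15.107.150` once the group is enabled. A minimal sketch of that check on canned text follows; the `default_has_src` helper is hypothetical, and the two sample lines are copied from the `ip ro` output above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the default route in $1 carries 'src $2'.
default_has_src() {
    printf '%s\n' "$1" | awk -v want="$2" '
        $1 == "default" {
            for (i = 2; i < NF; i++)
                if ($i == "src" && $(i + 1) == want) found = 1
        }
        END { exit (found ? 0 : 1) }'
}

# Default route with the group enabled, before and after the patch.
before='default via 10.15.107.254 dev eth0 proto static metric 100'
after='default via 10.15.107.254 dev eth0 proto static src 10.15.107.150 metric 100'

if default_has_src "$before" 10.15.107.150; then echo "before: src set"; else echo "before: src missing"; fi
if default_has_src "$after" 10.15.107.150; then echo "after: src set"; else echo "after: src missing"; fi
```

On a live node the same helper could be fed the output of `ip route show` instead of the canned strings.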


-----

> [1] pcs-config
[root@host-034 ~]# pcs config
Cluster Name: STSRHTS3691
Corosync Nodes:
 host-034 host-035
Pacemaker Nodes:
 host-034 host-035

Resources:
 Clone: dlm-clone
  Meta Attrs: interleave=true ordered=true 
  Resource: dlm (class=ocf provider=pacemaker type=controld)
   Operations: monitor interval=30s on-fail=fence (dlm-monitor-interval-30s)
               start interval=0s timeout=90 (dlm-start-interval-0s)
               stop interval=0s timeout=100 (dlm-stop-interval-0s)
 Clone: clvmd-clone
  Meta Attrs: interleave=true ordered=true 
  Resource: clvmd (class=ocf provider=heartbeat type=clvm)
   Attributes: with_cmirrord=1
   Operations: monitor interval=30s on-fail=fence (clvmd-monitor-interval-30s)
               start interval=0s timeout=90 (clvmd-start-interval-0s)
               stop interval=0s timeout=90 (clvmd-stop-interval-0s)
 Group: vip-g
  Resource: vip (class=ocf provider=heartbeat type=IPaddr2)
   Attributes: cidr_netmask=22 ip=10.15.107.150
   Operations: monitor interval=10s timeout=20s (vip-monitor-interval-10s)
               start interval=0s timeout=20s (vip-start-interval-0s)
               stop interval=0s timeout=20s (vip-stop-interval-0s)
  Resource: vip-src (class=ocf provider=heartbeat type=IPsrcaddr)
   Attributes: cidr_netmask=22 ipaddress=10.15.107.150
   Operations: monitor interval=10 timeout=20s (vip-src-monitor-interval-10)
               start interval=0s timeout=20s (vip-src-start-interval-0s)
               stop interval=0s timeout=20s (vip-src-stop-interval-0s)

Stonith Devices:
 Resource: fence-host-034 (class=stonith type=fence_xvm)
  Attributes: delay=5 pcmk_host_check=static-list pcmk_host_list=host-034 pcmk_host_map=host-034:host-034.virt.lab.msp.redhat.com
  Operations: monitor interval=60s (fence-host-034-monitor-interval-60s)
 Resource: fence-host-035 (class=stonith type=fence_xvm)
  Attributes: pcmk_host_check=static-list pcmk_host_list=host-035 pcmk_host_map=host-035:host-035.virt.lab.msp.redhat.com
  Operations: monitor interval=60s (fence-host-035-monitor-interval-60s)
Fencing Levels:

Location Constraints:
Ordering Constraints:
  start dlm-clone then start clvmd-clone (kind:Mandatory)
Colocation Constraints:
  clvmd-clone with dlm-clone (score:INFINITY)
Ticket Constraints:

Alerts:
 No alerts defined

Resources Defaults:
 No defaults set
Operations Defaults:
 No defaults set

Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: STSRHTS3691
 dc-version: 1.1.16-10.el7-94ff4df
 have-watchdog: false
 last-lrm-refresh: 1497508285
 no-quorum-policy: freeze

Quorum:
  Options:

> [2] pcs-status
[root@host-034 ~]# pcs status
Cluster name: STSRHTS3691
Stack: corosync
Current DC: host-035 (version 1.1.16-10.el7-94ff4df) - partition with quorum
Last updated: Thu Jun 15 01:46:02 2017
Last change: Thu Jun 15 01:45:59 2017 by root via cibadmin on host-034

2 nodes configured
8 resources configured (4 DISABLED)

Online: [ host-034 host-035 ]

Full list of resources:

 fence-host-034 (stonith:fence_xvm):    Started host-035
 fence-host-035 (stonith:fence_xvm):    Started host-034
 Clone Set: dlm-clone [dlm]
     Started: [ host-034 host-035 ]
 Clone Set: clvmd-clone [clvmd]
     Started: [ host-034 host-035 ]
 Resource Group: vip-g
     vip        (ocf::heartbeat:IPaddr2):       Stopped (disabled)
     vip-src    (ocf::heartbeat:IPsrcaddr):     Stopped (disabled)

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

Comment 9 errata-xmlrpc 2017-08-01 14:57:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1844

