Bug 2140619 - [RHEL 8.4] VIP fails to be configured on an interface with a route in a different table from the default one
Keywords:
Status: ASSIGNED
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: resource-agents
Version: 8.4
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: rc
Assignee: Oyvind Albrigtsen
QA Contact: cluster-qe
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-11-07 11:06 UTC by Luca Davidde
Modified: 2023-08-10 15:41 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHELPLAN-138491 0 None None None 2022-11-07 11:31:07 UTC
Red Hat Knowledge Base (Solution) 6987823 0 None None None 2022-11-28 09:38:14 UTC

Description Luca Davidde 2022-11-07 11:06:59 UTC
Description of problem:
Hi,
after an OpenStack minor update (which also updated the resource-agents package),
the pcs cluster became unable to configure a VIP on a specific interface:


pacemaker/pacemaker.log:Oct 11 09:39:46  IPaddr2(ip-172.16.221.35)[533457]:    ERROR: Unable to find nic or netmask.
pacemaker/pacemaker.log:Oct 11 09:39:46  IPaddr2(ip-172.16.221.35)[533457]:    WARNING: [findif] failed
---
 Operation start for ip-172.16.221.35 (ocf:heartbeat:IPaddr2) returned: 'error' (1)
  [....]
  +++ 13:57:31: findif:220: ip -o -f inet route list match 172.16.221.35/32 scope link
  +++ 13:57:31: findif:220: awk 'BEGIN{best=0} /\// { mask=$1; sub(".*/", "", mask); if( int(mask)>=best ) { best=int(mask); best_ln=$0; } } END{print best_ln}'
  ++ 13:57:31: findif:220: set --
  ++ 13:57:31: findif:222: '[' 0 = 0 ']'
  ++ 13:57:31: findif:223: case $OCF_RESKEY_ip in
  ++ 13:57:31: findif:223: case $OCF_RESKEY_ip in
  ++ 13:57:31: findif:229: '[' -z '' -o -z 32 ']'
  ++ 13:57:31: findif:230: '[' 0 = 0 ']'       <---- There was nothing returned from "ip" command so length is 0. 
  ++ 13:57:31: findif:231: ocf_log err 'Unable to find nic or netmask.'
  [....]
  ++ 13:57:31: hadate:175: date '+%b %d %T '
  + 13:57:31: __ha_log:250: echo 'IPaddr2(ip-172.16.221.35)[176579]:      Oct' 13 13:57:31 'ERROR: [findif] failed'
  + 13:57:31: ip_init:537: exit 1
---
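
The failing step in the trace can be reproduced in isolation. The following is a minimal sketch, not the actual findif.sh code: the awk program is copied from the trace line findif:220, while the function names pick_best_route and find_nic are hypothetical, and the nic is assumed to be the third field of the selected route line. It shows that an empty route list leaves no positional parameters, which is the condition that triggers the error.

```shell
#!/bin/sh
# Hypothetical helper: reads route lines on stdin (as printed by
# `ip -o -f inet route list match ADDR scope link`) and prints the line
# with the longest prefix length. Prints an empty line when no input line
# contains a '/'. The awk program is the one shown in the trace.
pick_best_route() {
    awk 'BEGIN{best=0} /\// { mask=$1; sub(".*/", "", mask); if( int(mask)>=best ) { best=int(mask); best_ln=$0; } } END{print best_ln}'
}

# Hypothetical wrapper modeled on the checks at findif:220-231 in the trace.
find_nic() {
    # Word-split the selected route line into positional parameters.
    set -- $(pick_best_route)
    if [ $# = 0 ]; then
        # Nothing came back from the route lookup.
        echo 'Unable to find nic or netmask.' >&2
        return 1
    fi
    # Assumption: the line looks like "PREFIX dev NIC ...", so $3 is the nic.
    echo "$3"
}
```

Feeding it a matching route, for example `echo '172.16.221.32/27 dev vlan10 scope link' | find_nic`, prints vlan10. Feeding it empty input returns 1 with the same error text as in the log.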

After some investigation with the cluster-ha and networking SBRs, it turned out that findif.sh [1], the script the cluster uses to determine which NIC the VIP should be configured on, gets no output from its route lookup because the route for the VIP's network is in a routing table other than the default one (which is a deliberate choice):
IP CONFIGURED ON vlan10 INTERFACE:
$ip_addr 
...
52: vlan10    inet 172.16.221.38/27 brd 172.16.221.63 scope global vlan10\       valid_lft forever preferred_lft forever
...

COMMAND ISSUED BY SCRIPT (simplified)
ip -o -f inet route list match 172.16.221.35/32 scope link <=== NO OUTPUT

ROUTE BELONGING TO VIP's NETWORK THAT SHOULD MATCH
$ grep "^172.16" ip_route_show_table_all
  172.16.221.32/27 dev vlan10 table public scope link      <=== is in table public
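
To spot such routes quickly in a saved "ip route show table all" dump, something like the sketch below may help. The function name table_of_routes is hypothetical, and it assumes that a route line without a "table" keyword belongs to the main table, which is how iproute2 normally prints that dump.

```shell
#!/bin/sh
# Hypothetical helper: reads route lines on stdin and prefixes each one
# with the routing table it belongs to.
table_of_routes() {
    awk '{
        table = "main"    # assumption: no "table" keyword means the main table
        for (i = 1; i < NF; i++)
            if ($i == "table") table = $(i + 1)
        print table ": " $0
    }'
}
```

It can be used as `grep "^172.16" ip_route_show_table_all | table_of_routes`. Any line reported with a table other than main is not seen by a plain "ip route list" lookup.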

We managed to get this working by updating the resource to specify the NIC name explicitly:

pcs resource update ip-172.16.221.35 nic=vlan10


But we'd like to understand whether there is another way to make this work, given that it appears to have worked before the update.
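
For reference, a live node can be checked for the same condition with a small wrapper around the lookup quoted above. This is only a sketch: vip_route_in_table is a hypothetical name and not part of resource-agents, and the table name "public" is taken from this environment.

```shell
#!/bin/sh
# Hypothetical helper, not part of resource-agents.
# Returns 0 if a link-scope route matching ADDR/32 exists in TABLE
# (TABLE defaults to main), 1 otherwise.
vip_route_in_table() {
    addr="$1"
    table="${2:-main}"
    [ -n "$(ip -o -f inet route list match "$addr/32" table "$table" scope link)" ]
}

# Example usage on a cluster node (commented out so the file can be sourced):
# vip_route_in_table 172.16.221.35 main   || echo "no route in main table"
# vip_route_in_table 172.16.221.35 public && echo "route found in table public"
```

If the first check fails and the second succeeds, the node is in the situation described here, where the lookup done by findif.sh comes back empty unless nic= is set on the resource.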



[1] http://opengrok.brq.redhat.com/source/xref/RHEL-8/resource-agents-sap/4.1.1/30.el8/ClusterLabs-resource-agents-e711383f/heartbeat/findif.sh#230



Version-Release number of selected component (if applicable):
RHEL 8.4
resource-agents.x86_64   4.1.1-90.el8_4.11

How reproducible:
In the customer environment

Steps to Reproduce:
1.
2.
3.

Actual results:
The cluster fails to configure the VIP.

Expected results:
The cluster succeeds in configuring the VIP.



Additional info:

