Bug 2140619
| Summary: | [RHEL 8.4] VIP fails to be configured on an interface with a route in a different table from the default one | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Luca Davidde <ldavidde> |
| Component: | resource-agents | Assignee: | Oyvind Albrigtsen <oalbrigt> |
| Status: | ASSIGNED --- | QA Contact: | cluster-qe <cluster-qe> |
| Severity: | high | Docs Contact: | |
| Priority: | medium | ||
| Version: | 8.4 | CC: | agk, cluster-maint, cmuresan, fdinitto, oalbrigt, shtiwari |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description of problem:

Hi, after an OpenStack minor upgrade (which also updated the resource-agents package), the pcs cluster became unable to configure a VIP on a specific interface:

pacemaker/pacemaker.log:Oct 11 09:39:46 IPaddr2(ip-172.16.221.35)[533457]: ERROR: Unable to find nic or netmask.
pacemaker/pacemaker.log:Oct 11 09:39:46 IPaddr2(ip-172.16.221.35)[533457]: WARNING: [findif] failed
---
Operation start for ip-172.16.221.35 (ocf:heartbeat:IPaddr2) returned: 'error' (1)
[....]
+++ 13:57:31: findif:220: ip -o -f inet route list match 172.16.221.35/32 scope link
+++ 13:57:31: findif:220: awk 'BEGIN{best=0} /\// { mask=$1; sub(".*/", "", mask); if( int(mask)>=best ) { best=int(mask); best_ln=$0; } } END{print best_ln}'
++ 13:57:31: findif:220: set --
++ 13:57:31: findif:222: '[' 0 = 0 ']'
++ 13:57:31: findif:223: case $OCF_RESKEY_ip in
++ 13:57:31: findif:223: case $OCF_RESKEY_ip in
++ 13:57:31: findif:229: '[' -z '' -o -z 32 ']'
++ 13:57:31: findif:230: '[' 0 = 0 ']' <---- There was nothing returned from "ip" command so length is 0.
++ 13:57:31: findif:231: ocf_log err 'Unable to find nic or netmask.'
[....]
++ 13:57:31: hadate:175: date '+%b %d %T '
+ 13:57:31: __ha_log:250: echo 'IPaddr2(ip-172.16.221.35)[176579]: Oct' 13 13:57:31 'ERROR: [findif] failed'
+ 13:57:31: ip_init:537: exit 1
---

After some investigation with cluster-ha and networking sbr, it turned out that the script findif.sh [1], which the cluster uses to determine the NIC on which the VIP has to be configured, gets no output because the route for the VIP's network is in a routing table other than the default one (which is a deliberate choice):

IP CONFIGURED ON vlan10 interface:

$ip_addr
...
52: vlan10 inet 172.16.221.38/27 brd 172.16.221.63 scope global vlan10\ valid_lft forever preferred_lft forever
...

COMMAND ISSUED BY SCRIPT (simplified)

ip -o -f inet route list match 172.16.221.35/32 scope link <=== NO OUTPUT

ROUTE BELONGING TO VIP's NETWORK THAT SHOULD MATCH

$ grep "^172.16" ip_route_show_table_all
172.16.221.32/27 dev vlan10 table public scope link <=== is in table public

We managed to get this working by updating the resource to specify the NIC name:

pcs resource update ip-172.16.221.35 nic=vlan10

But we'd like to understand whether there is another way to make this work, considering that it seems to have been working before the update.

[1] http://opengrok.brq.redhat.com/source/xref/RHEL-8/resource-agents-sap/4.1.1/30.el8/ClusterLabs-resource-agents-e711383f/heartbeat/findif.sh#230

Version-Release number of selected component (if applicable):
rhel 8.4
resource-agents.x86_64 4.1.1-90.el8_4.11

How reproducible:
on customer environment

Steps to Reproduce:
1.
2.
3.

Actual results:
cluster fails to configure the VIP

Expected results:
cluster succeeds in configuring the VIP

Additional info:
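For anyone wanting to replay the failing step from the findif trace without access to the affected node, here is a minimal sketch that runs on canned text. The awk program is copied from the trace. The helper names pick_best_route and count_words are hypothetical, and the first input line is an assumed example of what the lookup would print if the same route were in the main table (the line from ip_route_show_table_all with "table public" removed).

```shell
#!/bin/sh
# Sketch only: replays the route-selection step from the findif trace on canned text.
# pick_best_route and count_words are hypothetical names, not functions of findif.sh.

# awk program copied from the trace: keep the matching route with the longest prefix.
pick_best_route() {
    awk 'BEGIN{best=0} /\// { mask=$1; sub(".*/", "", mask); if( int(mask)>=best ) { best=int(mask); best_ln=$0; } } END{print best_ln}'
}

# Mirrors the `set -- ...` step in the trace: count the words of the selected line.
count_words() {
    set -- $1
    echo $#
}

# Case 1 (assumed input): the route is in the main table, so the lookup prints it.
found=$(printf '%s\n' '172.16.221.32/27 dev vlan10 scope link' | pick_best_route)

# Case 2: the route exists only in table "public", so the lookup prints nothing.
missing=$(printf '' | pick_best_route)

echo "found:   $(count_words "$found") words"
echo "missing: $(count_words "$missing") words"
```

In the second case the word count is zero, which is the '[' 0 = 0 ']' condition shown at findif:230 right before the "Unable to find nic or netmask." error.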