Bug 1817988

Summary:	[IPI baremetal] restart keepalived when default route is changed
Product:	OpenShift Container Platform	Reporter:	Petr Horáček <phoracek>
Component:	Machine Config Operator	Assignee:	Yossi Boaron <yboaron>
Status:	CLOSED NEXTRELEASE	QA Contact:	Victor Voronkov <vvoronko>
Severity:	medium	Docs Contact:
Priority:	medium
Version:	4.4	CC:	achernet, amurdaca, asegurap, augol, bperkins, danken, kgarriso, miabbott, smilner, vvoronko, wsun, wzheng, xtian, yboaron
Target Milestone:	---
Target Release:	4.4.z
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:	1751978	Environment:
Last Closed:	2020-07-02 07:45:31 UTC	Type:	---
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:	1751978
Bug Blocks:	1741265

Comment 1 Petr Horáček 2020-03-27 11:47:49 UTC

I observed the issue from https://bugzilla.redhat.com/show_bug.cgi?id=1751978 again on OpenShift 4.4.

I reconfigured the host so that the interface that originally carried the VIPs is now attached to an OVS bridge.

The keepalived-monitor observed the change and rendered a new config, but the keepalived container failed with 'permanent error CONFIG'.

I have a cluster available in case you want to debug it there.

keepalived-monitor logs:
time="2020-03-27T10:54:03Z" level=info msg="Config change detected" new config="{{ostest test.metalkube.org 192.168.111.5 14 192.168.111.2 10 192.168.111.4 93 24 0 } {0 0 0 [] } 192.168.111.23 worker-0  brcnv [192.168.111.1]}"
time="2020-03-27T10:54:03Z" level=info msg="Runtimecfg rendering template" path=/etc/keepalived/keepalived.conf

keepalived.conf:
vrrp_script chk_ingress {
    script "/usr/bin/curl -o /dev/null -kLs http://0:1936/healthz"
    interval 1
    weight 50
}

vrrp_instance ostest_INGRESS {
    state BACKUP
    interface brcnv
    virtual_router_id 93
    priority 40
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass cluster_uuid_ingress_vip
    }
    virtual_ipaddress {
        192.168.111.4/24
    }
    track_script {
        chk_ingress
    }
}

keepalived logs:
The client sent: reload
Opening file '/etc/keepalived/keepalived.conf'.
Stopped
Keepalived_vrrp exited with permanent error CONFIG. Terminating
Stopping
Stopped Keepalived v1.3.5 (03/19,2017), git commit v1.3.5-6-g6fa32f2
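
One thing that can be ruled out quickly when a reload ends this way is whether every interface named in the rendered keepalived.conf actually exists on the host at that moment. This has not been confirmed as the cause here; the following is only a minimal sketch, and the helper names vrrp_interfaces and check_vrrp_interfaces are hypothetical, not part of keepalived or the monitor:

```shell
#!/bin/sh
# Sketch only: list the interfaces referenced by a keepalived config and
# report any that do not exist on this host. Both function names are
# hypothetical helpers introduced for illustration.

# Print the name from every "interface <name>" line in the config file $1.
vrrp_interfaces() {
    awk '$1 == "interface" { print $2 }' "$1"
}

# Return non-zero if any interface referenced in config file $1 is missing.
check_vrrp_interfaces() {
    conf="$1"
    rc=0
    for ifname in $(vrrp_interfaces "$conf"); do
        if [ ! -e "/sys/class/net/$ifname" ]; then
            echo "missing interface: $ifname" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Usage on the node:
#   check_vrrp_interfaces /etc/keepalived/keepalived.conf
```

With the config above, vrrp_interfaces would print brcnv, and check_vrrp_interfaces would fail only if brcnv is absent from /sys/class/net when the check runs.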

Comment 2 Victor Voronkov 2020-03-29 17:50:09 UTC

Deployed a cluster with 4.4.0-rc.4 and tested the same command sequence there:

nmcli con add type bridge ifname br10
sudo nmcli con add type bridge-slave ifname enp5s0 master br10
sudo nmcli con up bridge-slave-enp5s0

keepalived.conf rendered with interface br10 and container restarted, no errors detected, so...

works for me

Comment 3 Petr Horáček 2020-03-30 07:56:36 UTC

Has it applied the VIPs on the new interface br10?
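
A quick way to check this on the node, as a sketch: filter the one-line output of `ip -o -4 addr show` for the VIP on the bridge. The helper name has_vip is hypothetical, and the address 192.168.111.4 is the ingress VIP from the keepalived.conf in comment 1; the VIPs on the test cluster may differ.

```shell
#!/bin/sh
# Sketch only: has_vip is a hypothetical helper, not an existing tool.
# Reads `ip -o -4 addr show` output on stdin and succeeds if address $2
# (without prefix length) is assigned to interface $1.
has_vip() {
    awk -v dev="$1" -v vip="$2" '
        # In -o output, field 2 is the device and field 4 is address/prefix.
        $2 == dev { split($4, a, "/"); if (a[1] == vip) found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Usage on the node (VIP value is an assumption taken from comment 1):
#   ip -o -4 addr show | has_vip br10 192.168.111.4 && echo "VIP is on br10"
```

Note that the VIP is only expected on the node whose keepalived instance currently holds MASTER for that VRRP instance, so a negative result on a BACKUP node is not a failure by itself.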

Comment 5 Steve Milner 2020-04-08 13:55:32 UTC

Is this meant to be for 4.4?

Comment 6 Kirsten Garrison 2020-04-08 17:44:40 UTC

This is duped from a 4.5 bz so this should probably be the 4.4 bz... @Yossi can you confirm?

Comment 7 Yossi Boaron 2020-04-12 13:48:06 UTC

Well, this bug's target release should be 4.5, the original bug's (the one this bug cloned from) target release was 4.2.

Comment 8 Antonio Murdaca 2020-04-15 08:33:18 UTC

(In reply to Yossi Boaron from comment #7)
> Well, this bug's target release should be 4.5, the original bug's (the one
> this bug cloned from) target release was 4.2.

Nope, the 4.5 BZ is the one you cloned from and it's already VERIFIED, so I can't see how you would ship another fix to that. This must target something lower than 4.5. Setting 4.4, but fix it please.