
Bug 2154796

Summary: Add an appctl cmd to reconnect ovn-controller when SB goes down
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: zenghui.shi <zshi>
Component: ovn22.12
Assignee: OVN Team <ovnteam>
Status: CLOSED NOTABUG
QA Contact: Jianlin Shi <jishi>
Severity: medium
Priority: unspecified
Version: FDP 22.L
CC: ctrautma, dceara, i.maximets, jiji
Target Milestone: ---
Keywords: FutureFeature
Target Release: ---
Hardware: Unspecified
OS: Linux
Doc Type: If docs needed, set a value
Story Points: ---
Last Closed: 2022-12-20 02:23:23 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Category: ---
oVirt Team: ---
Cloudforms Team: ---

Description zenghui.shi 2022-12-19 09:59:35 UTC
Description of problem:

ovn-controller inactivity probes are disabled on unix connections. 

The problem is that if somehow the SB goes down ovn-controller might not be able to determine on its own that the connection went away (for example if ovn-controller doesn't need to send data to the SB which is often the case). Although we could restart ovn-controller every time the SB goes down, it would be good to have a way to tell ovn-controller to reconnect, for example an appctl command.

Comment 1 Ilya Maximets 2022-12-19 12:40:19 UTC
(In reply to zenghui.shi from comment #0)
> The problem is that if somehow the SB goes down ovn-controller might not be
> able to determine on its own that the connection went away (for example if
> ovn-controller doesn't need to send data to the SB which is often the case).

Hmm.   I'm not sure if that is actually a problem.  Unix sockets are
typically good at waking up processes that are polling them in case the
other side goes down.  ovn-controller should receive a POLLERR, wake up
and try to re-connect.   It's not the case for the communication over
the network because remote node can go away without signalling.  But on
the same host the kernel always knows that the other process is dead,
so the state of a unix socket connection should always be clear.
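
The behaviour described above can be checked with a small stand-alone sketch. It does not use any OVN or OVS code: a `socket.socketpair()` stands in for the ovn-controller/SB connection, and the function name `peer_close_events` is made up for this illustration. It assumes a Unix-like host where `select.poll` is available. The exact flags reported may vary by platform (a hangup is often reported as POLLHUP together with POLLIN rather than POLLERR), so the sketch only shows that the idle side is woken up and then reads end-of-file.

```python
import select
import socket


def peer_close_events(timeout_ms=1000):
    """Poll an idle unix socket before and after its peer is closed.

    Returns (events_before, event_mask_after, data_read), where
    data_read is what recv() yields once the peer is gone.
    """
    # Connected pair of unix stream sockets: "local" plays the idle
    # client that never sends anything, "peer" plays the server side.
    local, peer = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        poller = select.poll()
        poller.register(local, select.POLLIN)

        # Nothing has happened yet, so a non-blocking poll reports nothing.
        events_before = poller.poll(0)

        # Simulate the other process going away.
        peer.close()

        # The kernel knows the peer is gone and wakes the poller up,
        # even though "local" never tried to send data.
        events_after = poller.poll(timeout_ms)
        mask_after = events_after[0][1] if events_after else 0

        # Reading from a stream socket whose peer has closed returns b"" (EOF).
        data = local.recv(1) if events_after else None
        return events_before, mask_after, data
    finally:
        local.close()
        peer.close()


if __name__ == "__main__":
    before, mask, data = peer_close_events()
    print("events before close:", before)
    print("POLLIN set:", bool(mask & select.POLLIN))
    print("POLLHUP set:", bool(mask & select.POLLHUP))
    print("POLLERR set:", bool(mask & select.POLLERR))
    print("recv returned:", data)
```

A process that polls the socket this way learns about the disconnect without needing an inactivity probe, which is the point being made about same-host connections.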

Comment 2 Dumitru Ceara 2022-12-19 14:35:32 UTC
(In reply to Ilya Maximets from comment #1)
> (In reply to zenghui.shi from comment #0)
> > The problem is that if somehow the SB goes down ovn-controller might not be
> > able to determine on its own that the connection went away (for example if
> > ovn-controller doesn't need to send data to the SB which is often the case).
> 
> Hmm.   I'm not sure if that is actually a problem.  Unix sockets are
> typically good at waking up processes that are polling them in case the
> other side goes down.  ovn-controller should receive a POLLERR, wake up
> and try to re-connect.   It's not the case for the communication over
> the network because remote node can go away without signalling.  But on
> the same host the kernel always knows that the other process is dead,
> so the state of a unix socket connection should always be clear.

You're right.  My bad, I had the (wrong) impression that ovn-controller
won't receive any event.  I think it's safe to close this as NOTABUG.

@zshi what do you think?

Comment 3 zenghui.shi 2022-12-20 02:23:23 UTC
> > Hmm.   I'm not sure if that is actually a problem.  Unix sockets are
> > typically good at waking up processes that are polling them in case the
> > other side goes down.  ovn-controller should receive a POLLERR, wake up
> > and try to re-connect.   It's not the case for the communication over
> > the network because remote node can go away without signalling.  But on
> > the same host the kernel always knows that the other process is dead,
> > so the state of a unix socket connection should always be clear.
> 
> You're right.  My bad, I had the (wrong) impression that ovn-controller
> won't receive any event.  I think it's safe to close this as NOTABUG.
> 
> @zshi what do you think?

Agree, let's close it as NOTABUG.

Thanks Ilya and Dumitru for the quick response and clarification!