Description of problem:

ovn-controller inactivity probes are disabled on unix connections. The problem is that if somehow the SB goes down ovn-controller might not be able to determine on its own that the connection went away (for example if ovn-controller doesn't need to send data to the SB which is often the case). Although we could restart ovn-controller every time the SB goes down to force a reconnect, it would be good to have a way to tell ovn-controller to reconnect, for example an appctl command.
(In reply to zenghui.shi from comment #0)
> The problem is that if somehow the SB goes down ovn-controller might not be
> able to determine on its own that the connection went away (for example if
> ovn-controller doesn't need to send data to the SB which is often the case).

Hmm. I'm not sure if that is actually a problem. Unix sockets are typically good at waking up processes that are polling them in case the other side goes down. ovn-controller should receive a POLLERR, wake up and try to re-connect. It's not the case for the communication over the network because remote node can go away without signalling. But on the same host the kernel always knows that the other process is dead, so the state of a unix socket connection should always be clear.
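For illustration, the wake-up behaviour described above can be checked with a small standalone Python sketch. This is not ovn-controller code: `peer_close_events` is a hypothetical helper, and a `socketpair()` stands in for the ovn-controller to SB unix connection. The exact event flags are platform dependent (on Linux a clean close of the peer typically shows up as POLLIN together with POLLHUP rather than POLLERR), so the sketch only relies on the poller waking up and the subsequent read hitting EOF.

```python
import select
import socket


def peer_close_events(timeout_ms=1000):
    """Poll one end of a unix socket pair before and after the peer closes.

    Returns (events_before, mask_after, saw_eof).
    """
    # Two connected AF_UNIX stream sockets in one process: 'client' plays
    # the role of ovn-controller, 'server' the role of the SB database.
    client, server = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        poller = select.poll()
        poller.register(client.fileno(), select.POLLIN)

        # Idle connection: nothing to read, so a non-blocking poll is empty.
        events_before = poller.poll(0)

        # Simulate the SB process going away.
        server.close()

        # The kernel knows the peer is gone, so the poller wakes up even
        # though the client never tried to send anything.
        events_after = poller.poll(timeout_ms)
        mask_after = events_after[0][1] if events_after else 0

        # A read on the woken socket returns b"" (EOF), which is the cue
        # for a client to tear down the session and reconnect.
        saw_eof = bool(events_after) and client.recv(1) == b""
        return events_before, mask_after, saw_eof
    finally:
        client.close()
        server.close()


if __name__ == "__main__":
    before, mask, eof = peer_close_events()
    print("events before close:", before)
    print("POLLHUP set:", bool(mask & select.POLLHUP))
    print("POLLERR set:", bool(mask & select.POLLERR))
    print("read returned EOF:", eof)
```

The same experiment over TCP between two hosts would not be reliable in this way if the remote node simply disappeared, which is why inactivity probes matter there.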
(In reply to Ilya Maximets from comment #1)
> (In reply to zenghui.shi from comment #0)
> > The problem is that if somehow the SB goes down ovn-controller might not be
> > able to determine on its own that the connection went away (for example if
> > ovn-controller doesn't need to send data to the SB which is often the case).
>
> Hmm. I'm not sure if that is actually a problem. Unix sockets are
> typically good at waking up processes that are polling them in case the
> other side goes down. ovn-controller should receive a POLLERR, wake up
> and try to re-connect. It's not the case for the communication over
> the network because remote node can go away without signalling. But on
> the same host the kernel always knows that the other process is dead,
> so the state of a unix socket connection should always be clear.

You're right. My bad, I had the (wrong) impression that ovn-controller won't receive any event. I think it's safe to close this as NOTABUG.

@zshi what do you think?
> > Hmm. I'm not sure if that is actually a problem. Unix sockets are
> > typically good at waking up processes that are polling them in case the
> > other side goes down. ovn-controller should receive a POLLERR, wake up
> > and try to re-connect. It's not the case for the communication over
> > the network because remote node can go away without signalling. But on
> > the same host the kernel always knows that the other process is dead,
> > so the state of a unix socket connection should always be clear.
>
> You're right. My bad, I had the (wrong) impression that ovn-controller
> won't receive any event. I think it's safe to close this as NOTABUG.
>
> @zshi what do you think?

Agree, let's close it as NOTABUG. Thanks Ilya and Dumitru for the quick response and clarification!