Bug 1862495 - Running hybrid-overlay-node as a windows service exits abruptly on os.Exit()
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Windows Containers
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: 4.6.0
Assignee: Mansi Kulkarni
QA Contact: gaoshang
Depends On:
Reported: 2020-07-31 14:42 UTC by Mansi Kulkarni
Modified: 2020-10-27 16:22 UTC

Fixed In Version:
Doc Text:
Clone Of:
Last Closed: 2020-10-27 16:21:52 UTC
Target Upstream Version:

System: Red Hat Product Errata
ID: RHBA-2020:4196
Private: 0
Priority: None
Status: None
Summary: None
Last Updated: 2020-10-27 16:22:11 UTC

Description Mansi Kulkarni 2020-07-31 14:42:51 UTC
Description of problem:

Running hybrid-overlay-node as a Windows service relies on os.Exit() to stop the service on svc.Stop or service shutdown. This is a concern because os.Exit() terminates the process immediately, so deferred functions do not run and the code gets no chance to exit cleanly (this differs from what happens when SIGINT is raised while running as a daemon or from the command line). The code needs to be updated to avoid os.Exit() and stop the service gracefully.
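
To illustrate the kind of change being asked for, here is a minimal sketch of a graceful stop. It is not the code from the PRs linked in this bug. It uses only the Go standard library; the names runNode and serve and the stop channel are hypothetical. In a real Windows service the stop request would presumably arrive as an svc.Stop or svc.Shutdown change request (assuming the golang.org/x/sys/windows/svc package is in use) rather than on a plain channel.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// runNode stands in for the long-running node logic (hypothetical name).
// It blocks until ctx is cancelled, then returns so its deferred cleanup runs.
func runNode(ctx context.Context, cleanedUp *bool) {
	// Deferred cleanup: this would be skipped if the process called os.Exit().
	defer func() { *cleanedUp = true }()
	<-ctx.Done()
}

// serve starts the worker, waits for a stop request, cancels the worker's
// context and waits for it to return. It reports whether cleanup ran.
// The stop channel is an assumption standing in for a service stop request.
func serve(stop <-chan struct{}) bool {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	cleanedUp := false
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		runNode(ctx, &cleanedUp)
	}()

	<-stop    // stop or shutdown requested
	cancel()  // ask the worker to finish instead of calling os.Exit()
	wg.Wait() // wait until the worker has returned and cleaned up
	return cleanedUp
}

func main() {
	stop := make(chan struct{})
	go func() {
		time.Sleep(10 * time.Millisecond) // simulate a later stop request
		close(stop)
	}()
	fmt.Println("clean shutdown:", serve(stop))
}
```

The point of the sketch is that the stop path only signals cancellation and then waits, so the process ends by returning from main after cleanup has run instead of being cut off mid-way.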

Version-Release number of selected component (if applicable):

How reproducible:

Actual results:
Running hybrid-overlay-node as a Windows service exits abruptly on svc.Stop or shutdown.

Expected results:
Running hybrid-overlay-node as a Windows service exits gracefully on svc.Stop or shutdown.

Additional info:
Some ideas on how to fix this can be found in the PR discussion: https://github.com/ovn-org/ovn-kubernetes/pull/1514

This issue is being tracked upstream at: https://github.com/ovn-org/ovn-kubernetes/issues/1562

Comment 1 Mansi Kulkarni 2020-08-25 14:31:46 UTC
Created a PR at https://github.com/ovn-org/ovn-kubernetes/pull/1577. It has been approved and is waiting for lgtm.

Comment 2 Mansi Kulkarni 2020-08-27 14:39:50 UTC
This PR (https://github.com/ovn-org/ovn-kubernetes/pull/1577) has been merged upstream into https://github.com/ovn-org/ovn-kubernetes

Comment 4 Mansi Kulkarni 2020-08-31 16:23:05 UTC
This PR has been merged downstream into https://github.com/openshift/ovn-kubernetes via https://github.com/openshift/ovn-kubernetes/pull/243

Comment 5 gaoshang 2020-09-07 11:58:00 UTC
@mankulka Could you please give some hints on how to verify this bug? Or should I wait until the feature for running hybrid-overlay-node as a Windows service is finished before testing it? Thanks.

Comment 6 Mansi Kulkarni 2020-09-09 16:55:02 UTC
@sgao As this feature has not been implemented in WMCO yet, we can wait for the ticket for running hybrid-overlay-node as a Windows service (https://issues.redhat.com/browse/WINC-296) before testing it.

Comment 7 gaoshang 2020-10-10 16:26:15 UTC
This bug has been verified on OCP 4.6.0-0.nightly-2020-10-09-224055 and passed, thanks.

windows-machine-config-operator git commit b24e6404aea83c2e4be6da1a0a5b306f496f983d

1. Try to stop hybrid-overlay-node
PS C:\Users\Administrator> Stop-Service hybrid-overlay-node
Stop-Service : Cannot stop service 'hybrid-overlay-node (hybrid-overlay-node)' because it has dependent services. It can only be stopped if the Force flag is set.
At line:1 char:1
+ Stop-Service hybrid-overlay-node
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (System.ServiceProcess.ServiceController:ServiceController) [Stop-Service], ServiceCommandException
    + FullyQualifiedErrorId : ServiceHasDependentServices,Microsoft.PowerShell.Commands.StopServiceCommand
PS C:\Users\Administrator> Get-Service hybrid-overlay-node -DependentServices

Status   Name               DisplayName
------   ----               -----------
Running  kube-proxy         kube-proxy

PS C:\Users\Administrator> Stop-Service kube-proxy
PS C:\Users\Administrator> Stop-Service hybrid-overlay-node
PS C:\Users\Administrator>

Comment 9 errata-xmlrpc 2020-10-27 16:21:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.