Bug 1940499

Summary: hybrid-overlay not logging properly before exiting due to an error
Product: OpenShift Container Platform Reporter: Sebastian Soto <ssoto>
Component: Networking Assignee: Alexander Constantinescu <aconstan>
Networking sub component: ovn-kubernetes QA Contact: gaoshang <sgao>
Status: CLOSED ERRATA Docs Contact:
Severity: low    
Priority: high CC: aconstan, anusaxen, sgao
Version: 4.8   
Target Milestone: ---   
Target Release: 4.8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1940566 Environment:
Last Closed: 2021-07-27 22:54:17 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1940566    

Description Sebastian Soto 2021-03-18 14:57:44 UTC
Description of problem:

hybrid-overlay does not log fatal errors when it is logging to a file
via the logfile flag. The fix makes sure the error is written to the
log before the program exits.
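
The report does not state the root cause, so the following is only a minimal sketch of the general "write, flush, then exit" pattern that the expected behavior implies. It is written in Python for illustration and is not the hybrid-overlay code; the function name `log_fatal`, the `FATAL:` prefix, and the `exit_fn` parameter are hypothetical.

```python
import os


def log_fatal(log_path, message, exit_fn=os._exit):
    """Append a fatal message to log_path, force it to disk, then exit.

    exit_fn is injectable (an assumption made for this sketch) so the
    behavior can be exercised without terminating the interpreter.
    """
    with open(log_path, "a", encoding="utf-8") as log_file:
        log_file.write("FATAL: " + message + "\n")
        log_file.flush()             # push the user-space buffer to the OS
        os.fsync(log_file.fileno())  # ask the OS to persist it to disk
    # Only exit once the message is safely in the file.
    exit_fn(1)
```

The point of the ordering is that a hard exit can skip normal buffer cleanup, so the message has to reach the file before the exit call is made.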

We've been seeing a lot of support issues opened by people attempting to use features that are not available on certain Windows kernels.
Fixing this should reduce the number of issues opened.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Attempt to set up a Windows node on a VXLAN cluster with a Windows Server 2019 image

Actual results:
Hybrid overlay panics and does not log to the log file in /var/log/


Expected results:
Hybrid overlay exits and logs to the log file

Additional info:

Comment 3 Kedar Kulkarni 2021-03-25 15:06:50 UTC
Hi,

I checked with Sebastian today and he confirmed there are some changes that they need to pull into the Windows MCO. Once that is done, this bz will be easy to validate. He will be trying to pull in those changes asap. 

Till then keeping a NEEDINFO open on Dev. 

Thanks,
KK.

Comment 5 Sebastian Soto 2021-04-28 13:29:17 UTC
This has made its way into the WMCO.
If hybrid overlay crashes an error should be present in the Windows node logs, which can be retrieved with:
`oc adm node-logs <NODE_NAME> --path=/hybrid-overlay/hybrid-overlay.log`

Comment 6 Sebastian Soto 2021-04-28 13:31:02 UTC
As a note, only builds from master have this so far; no released build includes it yet.

Comment 7 Anurag saxena 2021-04-30 15:17:56 UTC
@sgao Can you verify it since you would be up to date on WMCO changes? Let SDN team know if you need our help in any way

Comment 8 gaoshang 2021-05-06 15:41:49 UTC
Sure, this bug has been verified on OCP 4.8 + vSphere + Windows Server 2019 and passed, thanks.

Version-Release number of selected component (if applicable):
WMCO built from https://github.com/openshift/windows-machine-config-operator/commit/1ca41c250ff937d1543559ba19e805a7473d45bf
OCP version 4.8.0-0.nightly-2021-04-30-201824

Steps:

1. Install OCP 4.8 with ovn-kubernetes on vSphere, set hybridOverlayVXLANPort: 9898

2. Build WMCO and install it, refer to https://github.com/openshift/windows-machine-config-operator/blob/master/docs/HACKING.md

3. Create Windows machineset with Windows Server 2019

4. In this combination, hybrid-overlay is expected to hit the error. Check that hybrid overlay exits and logs to the log file

$ oc get nodes -l kubernetes.io/os=windows -owide
NAME              STATUS                     ROLES    AGE     VERSION                            INTERNAL-IP     EXTERNAL-IP     OS-IMAGE                       KERNEL-VERSION    CONTAINER-RUNTIME
winworker-hsdrx   Ready,SchedulingDisabled   worker   6m49s   v1.21.0-rc.0.1190+e22a836a8b2659   172.31.249.32   172.31.249.32   Windows Server 2019 Standard   10.0.17763.1697   docker://19.3.14

$ oc adm node-logs winworker-hsdrx --path=/hybrid-overlay/hybrid-overlay.log
I0506 17:26:57.589456    1996 cert_rotation.go:137] Starting client certificate rotation controller
F0506 17:26:57.603194    1996 hybrid-overlay-node.go:53] this version of Windows does not support setting the VXLAN UDP port. Please make sure you install all the KB updates on your system.
F0506 17:26:57.603194    1996 hybrid-overlay-node.go:53] this version of Windows does not support setting the VXLAN UDP port. Please make sure you install all the KB updates on your system.
F0506 17:26:57.603194    1996 hybrid-overlay-node.go:53] this version of Windows does not support setting the VXLAN UDP port. Please make sure you install all the KB updates on your system.
F0506 17:26:57.603194    1996 hybrid-overlay-node.go:53] this version of Windows does not support setting the VXLAN UDP port. Please make sure you install all the KB updates on your system.
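
The lines above appear to follow the klog header layout (severity letter, month and day, time, PID, source file and line, then the message). As a rough sketch, assuming that layout and assuming the node log has been saved as text, the repeated fatal entries could be collapsed like this; the names `KLOG_LINE` and `fatal_entries` are made up for the example.

```python
import re

# Assumed klog-style header: severity letter, MMDD, time, PID,
# source file:line, then "] message".
KLOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<date>\d{4}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<pid>\d+) (?P<source>[^\]]+)\] (?P<message>.*)$"
)


def fatal_entries(log_text):
    """Return unique fatal (F-severity) entries in first-seen order."""
    seen = set()
    result = []
    for line in log_text.splitlines():
        match = KLOG_LINE.match(line.strip())
        if match is None or match.group("severity") != "F":
            continue  # skip non-klog lines and non-fatal severities
        key = (
            match.group("date"),
            match.group("time"),
            match.group("source"),
            match.group("message"),
        )
        if key not in seen:
            seen.add(key)
            result.append(match.groupdict())
    return result
```

Feeding it the output shown here would leave a single fatal entry from `hybrid-overlay-node.go:53`, which is the one to look for when verifying the fix.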

PS C:\Users\Administrator> Get-Service hybrid-overlay-node 

Status   Name               DisplayName
------   ----               -----------
Stopped  hybrid-overlay-... hybrid-overlay-node

Comment 11 errata-xmlrpc 2021-07-27 22:54:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

Comment 12 Red Hat Bugzilla 2023-09-15 01:03:39 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days.