Bug 1896691

Summary: oc must-gather on Windows nodes doesn't collect container runtime logs
Product: OpenShift Container Platform Reporter: OpenShift BugZilla Robot <openshift-bugzilla-robot>
Component: Windows ContainersAssignee: Christian Glombek <cglombek>
Status: CLOSED ERRATA QA Contact: gaoshang <sgao>
Severity: high Docs Contact:
Priority: high    
Version: 4.6CC: aos-bugs, aravindh, cglombek, erich, gmarkley, pmahajan, rgudimet, vhire
Target Milestone: ---   
Target Release: 4.6.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-12-14 13:50:52 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1862643    
Bug Blocks:    

Description OpenShift BugZilla Robot 2020-11-11 09:35:46 UTC
+++ This bug was initially created as a clone of Bug #1862643 +++

oc must-gather PR#165 doesn't collect container runtime logs

https://github.com/openshift/must-gather/pull/165

Actual results:

'oc must-gather' on Windows only pulls the kube-proxy, hybrid-overlay, and kubelet logs.

Expected results:

'oc must-gather' should also retrieve container runtime logs.

If we need to diagnose issues occurring on a Windows node, lack of container runtime logs will severely diminish support's and engineering's ability to troubleshoot the problem, without other means to get these logs (e.g. creating a bastion host).

Not having the runtime logs will likely lead to a negative customer experience due our inability to identify issues in a timely fashion.

--- Additional comment from cglombek on 2020-10-01 20:05:20 UTC ---

Apparently Docker Daemon is hardcoded to log to the WinEvent logs on Windows.

As a possible workaround, I've created a PR to enable streaming WinEvent logs from the nodes through the kubelet: https://github.com/openshift/kubernetes/pull/383

Comment 2 gaoshang 2020-12-07 06:43:29 UTC
This bug has been verified on OCP 4.6.0-0.nightly-2020-12-06-095114 and passed, thanks.

Version:
WMCO built from https://github.com/openshift/windows-machine-config-operator/commit/289f21a28ad0f3d60f26e2254dd2b6f9819149b1
OCP 4.6.0-0.nightly-2020-12-06-095114

Steps:
1, WMCO is running and 2 Windows nodes bootstrapped.
2, Run oc adm node-logs, check Windows container runtime logs could be retrieved.
# oc adm node-logs -l kubernetes.io/os=windows --path=journal -u docker




   ProviderName: docker

   ProviderName: docker

TimeCreated          Id LevelDisplayName Message                                                                       
-----------          -- ---------------- -------                                                                       
12/7/2020 5:53:26 AM  1 Information      Starting up                                                                   
12/7/2020 5:53:26 AM  1 Information      Windows default isolation mode: process                                       
12/7/2020 5:53:28 AM  1 Information      Loading containers: start.                                                    
12/7/2020 5:53:28 AM  1 Information      Restoring existing overlay networks from HNS into docker                      
12/7/2020 5:53:34 AM  1 Information      Loading containers: done.                                                     
12/7/2020 5:53:34 AM 11 Information      Docker daemon [graphdriver(s)=windowsfilter version=19.03.13                  
                                         commit=5b5efb2d49]                                                            
12/7/2020 5:53:34 AM  1 Information      starting telemetry service                                                    
12/7/2020 5:53:34 AM  1 Information      Daemon has completed initialization                                           
12/7/2020 5:53:34 AM  1 Information      API listen on //./pipe/docker_engine                                          


TimeCreated          Id LevelDisplayName Message                                                                       
-----------          -- ---------------- -------                                                                       
12/7/2020 5:53:26 AM  1 Information      Starting up                                                                   
12/7/2020 5:53:26 AM  1 Information      Windows default isolation mode: process                                       
12/7/2020 5:53:28 AM  1 Information      Loading containers: start.                                                    
12/7/2020 5:53:28 AM  1 Information      Restoring existing overlay networks from HNS into docker                      
12/7/2020 5:53:34 AM  1 Information      Loading containers: done.                                                     
12/7/2020 5:53:34 AM  1 Information      starting telemetry service                                                    
12/7/2020 5:53:34 AM 11 Information      Docker daemon [version=19.03.13 commit=5b5efb2d49                             
                                         graphdriver(s)=windowsfilter]                                                 
12/7/2020 5:53:34 AM  1 Information      Daemon has completed initialization                                           
12/7/2020 5:53:34 AM  1 Information      API listen on //./pipe/docker_engine

3, Run must-gather, check it should pull the docker runtime logs from all Windows nodes.
# oc adm must-gather --dest-dir /tmp/must-gather
...
[must-gather-qvp56] OUT host_service_logs/windows/
[must-gather-qvp56] OUT host_service_logs/windows/log_files/
[must-gather-qvp56] OUT host_service_logs/windows/log_files/hybrid-overlay/
[must-gather-qvp56] OUT host_service_logs/windows/log_files/hybrid-overlay/hybrid-overlay.log
[must-gather-qvp56] OUT host_service_logs/windows/log_files/kube-proxy/
[must-gather-qvp56] OUT host_service_logs/windows/log_files/kube-proxy/kube-proxy.exe.ERROR
[must-gather-qvp56] OUT host_service_logs/windows/log_files/kube-proxy/kube-proxy.exe.INFO
[must-gather-qvp56] OUT host_service_logs/windows/log_files/kube-proxy/kube-proxy.exe.WARNING
[must-gather-qvp56] OUT host_service_logs/windows/log_files/kubelet/
[must-gather-qvp56] OUT host_service_logs/windows/log_files/kubelet/kubelet.log
[must-gather-qvp56] OUT host_service_logs/windows/log_winevent/
[must-gather-qvp56] OUT host_service_logs/windows/log_winevent/docker_winevent.log
...

# tree /tmp/must-gather/quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-bca8d68b7c127b8c4d53297e30969b91e4264812c991338a04270a0e03df37c2/host_service_logs/windows/
/tmp/must-gather/quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-bca8d68b7c127b8c4d53297e30969b91e4264812c991338a04270a0e03df37c2/host_service_logs/windows/
├── log_files
│   ├── hybrid-overlay
│   │   └── hybrid-overlay.log
│   ├── kubelet
│   │   └── kubelet.log
│   └── kube-proxy
│       ├── kube-proxy.exe.ERROR
│       ├── kube-proxy.exe.INFO
│       └── kube-proxy.exe.WARNING
└── log_winevent
    └── docker_winevent.log

5 directories, 6 files

# cat /tmp/must-gather/quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-bca8d68b7c127b8c4d53297e30969b91e4264812c991338a04270a0e03df37c2/host_service_logs/windows/log_winevent/docker_winevent.log 




   ProviderName: docker

   ProviderName: docker

TimeCreated          Id LevelDisplayName Message                                                                       
-----------          -- ---------------- -------                                                                       
12/7/2020 5:53:26 AM  1 Information      Starting up                                                                   
12/7/2020 5:53:26 AM  1 Information      Windows default isolation mode: process                                       
12/7/2020 5:53:28 AM  1 Information      Loading containers: start.                                                    
12/7/2020 5:53:28 AM  1 Information      Restoring existing overlay networks from HNS into docker                      
12/7/2020 5:53:34 AM  1 Information      Loading containers: done.                                                     
12/7/2020 5:53:34 AM 11 Information      Docker daemon [graphdriver(s)=windowsfilter version=19.03.13                  
                                         commit=5b5efb2d49]                                                            
12/7/2020 5:53:34 AM  1 Information      starting telemetry service                                                    
12/7/2020 5:53:34 AM  1 Information      Daemon has completed initialization                                           
12/7/2020 5:53:34 AM  1 Information      API listen on //./pipe/docker_engine                                          


TimeCreated          Id LevelDisplayName Message                                                                       
-----------          -- ---------------- -------                                                                       
12/7/2020 5:53:26 AM  1 Information      Starting up                                                                   
12/7/2020 5:53:26 AM  1 Information      Windows default isolation mode: process                                       
12/7/2020 5:53:28 AM  1 Information      Loading containers: start.                                                    
12/7/2020 5:53:28 AM  1 Information      Restoring existing overlay networks from HNS into docker                      
12/7/2020 5:53:34 AM  1 Information      Loading containers: done.                                                     
12/7/2020 5:53:34 AM  1 Information      starting telemetry service                                                    
12/7/2020 5:53:34 AM 11 Information      Docker daemon [version=19.03.13 commit=5b5efb2d49                             
                                         graphdriver(s)=windowsfilter]                                                 
12/7/2020 5:53:34 AM  1 Information      Daemon has completed initialization                                           
12/7/2020 5:53:34 AM  1 Information      API listen on //./pipe/docker_engine

Comment 5 errata-xmlrpc 2020-12-14 13:50:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.6.8 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5259