Bug 1896691 - oc must-gather on Windows nodes doesn't collect container runtime logs
Summary: oc must-gather on Windows nodes doesn't collect container runtime logs
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Windows Containers
Version: 4.6
Hardware: Unspecified
OS: Unspecified
high
high
Target Milestone: ---
: 4.6.z
Assignee: Christian Glombek
QA Contact: gaoshang
URL:
Whiteboard:
Depends On: 1862643
Blocks:
TreeView+ depends on / blocked
 
Reported: 2020-11-11 09:35 UTC by OpenShift BugZilla Robot
Modified: 2020-12-14 13:51 UTC (History)
8 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-12-14 13:50:52 UTC
Target Upstream Version:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Github openshift kubernetes pull 446 0 None closed [release-4.6] Bug 1896691: UPSTREAM: 96120: kubelet: Expose a simple Get-WinEvent shim on the kubelet logs endpoint 2020-12-09 01:47:55 UTC
Github openshift must-gather pull 197 0 None closed [release-4.6] Bug 1896691: Collect docker logs from WinEvent log on Windows machines 2020-12-09 01:47:53 UTC
Github openshift windows-machine-config-operator pull 181 0 None closed Bug 1896691: [k8s] Update submodule to current openshift:release-4.6 2020-12-07 02:13:04 UTC
Red Hat Product Errata RHSA-2020:5259 0 None None None 2020-12-14 13:51:07 UTC

Description OpenShift BugZilla Robot 2020-11-11 09:35:46 UTC
+++ This bug was initially created as a clone of Bug #1862643 +++

oc must-gather PR#165 doesn't collect container runtime logs

https://github.com/openshift/must-gather/pull/165

Actual results:

'oc must-gather' on Windows only pulls the kube-proxy, hybrid-overlay, and kubelet logs.

Expected results:

'oc must-gather' should also retrieve container runtime logs.

If we need to diagnose issues occurring on a Windows node, lack of container runtime logs will severely diminish support's and engineering's ability to troubleshoot the problem, without other means to get these logs (e.g. creating a bastion host).

Not having the runtime logs will likely lead to a negative customer experience due our inability to identify issues in a timely fashion.

--- Additional comment from cglombek@redhat.com on 2020-10-01 20:05:20 UTC ---

Apparently Docker Daemon is hardcoded to log to the WinEvent logs on Windows.

As a possible workaround, I've created a PR to enable streaming WinEvent logs from the nodes through the kubelet: https://github.com/openshift/kubernetes/pull/383

Comment 2 gaoshang 2020-12-07 06:43:29 UTC
This bug has been verified on OCP 4.6.0-0.nightly-2020-12-06-095114 and passed, thanks.

Version:
WMCO built from https://github.com/openshift/windows-machine-config-operator/commit/289f21a28ad0f3d60f26e2254dd2b6f9819149b1
OCP 4.6.0-0.nightly-2020-12-06-095114

Steps:
1, WMCO is running and 2 Windows nodes bootstrapped.
2, Run oc adm node-logs, check Windows container runtime logs could be retrieved.
# oc adm node-logs -l kubernetes.io/os=windows --path=journal -u docker




   ProviderName: docker

   ProviderName: docker

TimeCreated          Id LevelDisplayName Message                                                                       
-----------          -- ---------------- -------                                                                       
12/7/2020 5:53:26 AM  1 Information      Starting up                                                                   
12/7/2020 5:53:26 AM  1 Information      Windows default isolation mode: process                                       
12/7/2020 5:53:28 AM  1 Information      Loading containers: start.                                                    
12/7/2020 5:53:28 AM  1 Information      Restoring existing overlay networks from HNS into docker                      
12/7/2020 5:53:34 AM  1 Information      Loading containers: done.                                                     
12/7/2020 5:53:34 AM 11 Information      Docker daemon [graphdriver(s)=windowsfilter version=19.03.13                  
                                         commit=5b5efb2d49]                                                            
12/7/2020 5:53:34 AM  1 Information      starting telemetry service                                                    
12/7/2020 5:53:34 AM  1 Information      Daemon has completed initialization                                           
12/7/2020 5:53:34 AM  1 Information      API listen on //./pipe/docker_engine                                          


TimeCreated          Id LevelDisplayName Message                                                                       
-----------          -- ---------------- -------                                                                       
12/7/2020 5:53:26 AM  1 Information      Starting up                                                                   
12/7/2020 5:53:26 AM  1 Information      Windows default isolation mode: process                                       
12/7/2020 5:53:28 AM  1 Information      Loading containers: start.                                                    
12/7/2020 5:53:28 AM  1 Information      Restoring existing overlay networks from HNS into docker                      
12/7/2020 5:53:34 AM  1 Information      Loading containers: done.                                                     
12/7/2020 5:53:34 AM  1 Information      starting telemetry service                                                    
12/7/2020 5:53:34 AM 11 Information      Docker daemon [version=19.03.13 commit=5b5efb2d49                             
                                         graphdriver(s)=windowsfilter]                                                 
12/7/2020 5:53:34 AM  1 Information      Daemon has completed initialization                                           
12/7/2020 5:53:34 AM  1 Information      API listen on //./pipe/docker_engine

3, Run must-gather, check it should pull the docker runtime logs from all Windows nodes.
# oc adm must-gather --dest-dir /tmp/must-gather
...
[must-gather-qvp56] OUT host_service_logs/windows/
[must-gather-qvp56] OUT host_service_logs/windows/log_files/
[must-gather-qvp56] OUT host_service_logs/windows/log_files/hybrid-overlay/
[must-gather-qvp56] OUT host_service_logs/windows/log_files/hybrid-overlay/hybrid-overlay.log
[must-gather-qvp56] OUT host_service_logs/windows/log_files/kube-proxy/
[must-gather-qvp56] OUT host_service_logs/windows/log_files/kube-proxy/kube-proxy.exe.ERROR
[must-gather-qvp56] OUT host_service_logs/windows/log_files/kube-proxy/kube-proxy.exe.INFO
[must-gather-qvp56] OUT host_service_logs/windows/log_files/kube-proxy/kube-proxy.exe.WARNING
[must-gather-qvp56] OUT host_service_logs/windows/log_files/kubelet/
[must-gather-qvp56] OUT host_service_logs/windows/log_files/kubelet/kubelet.log
[must-gather-qvp56] OUT host_service_logs/windows/log_winevent/
[must-gather-qvp56] OUT host_service_logs/windows/log_winevent/docker_winevent.log
...

# tree /tmp/must-gather/quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-bca8d68b7c127b8c4d53297e30969b91e4264812c991338a04270a0e03df37c2/host_service_logs/windows/
/tmp/must-gather/quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-bca8d68b7c127b8c4d53297e30969b91e4264812c991338a04270a0e03df37c2/host_service_logs/windows/
├── log_files
│   ├── hybrid-overlay
│   │   └── hybrid-overlay.log
│   ├── kubelet
│   │   └── kubelet.log
│   └── kube-proxy
│       ├── kube-proxy.exe.ERROR
│       ├── kube-proxy.exe.INFO
│       └── kube-proxy.exe.WARNING
└── log_winevent
    └── docker_winevent.log

5 directories, 6 files

# cat /tmp/must-gather/quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-bca8d68b7c127b8c4d53297e30969b91e4264812c991338a04270a0e03df37c2/host_service_logs/windows/log_winevent/docker_winevent.log 




   ProviderName: docker

   ProviderName: docker

TimeCreated          Id LevelDisplayName Message                                                                       
-----------          -- ---------------- -------                                                                       
12/7/2020 5:53:26 AM  1 Information      Starting up                                                                   
12/7/2020 5:53:26 AM  1 Information      Windows default isolation mode: process                                       
12/7/2020 5:53:28 AM  1 Information      Loading containers: start.                                                    
12/7/2020 5:53:28 AM  1 Information      Restoring existing overlay networks from HNS into docker                      
12/7/2020 5:53:34 AM  1 Information      Loading containers: done.                                                     
12/7/2020 5:53:34 AM 11 Information      Docker daemon [graphdriver(s)=windowsfilter version=19.03.13                  
                                         commit=5b5efb2d49]                                                            
12/7/2020 5:53:34 AM  1 Information      starting telemetry service                                                    
12/7/2020 5:53:34 AM  1 Information      Daemon has completed initialization                                           
12/7/2020 5:53:34 AM  1 Information      API listen on //./pipe/docker_engine                                          


TimeCreated          Id LevelDisplayName Message                                                                       
-----------          -- ---------------- -------                                                                       
12/7/2020 5:53:26 AM  1 Information      Starting up                                                                   
12/7/2020 5:53:26 AM  1 Information      Windows default isolation mode: process                                       
12/7/2020 5:53:28 AM  1 Information      Loading containers: start.                                                    
12/7/2020 5:53:28 AM  1 Information      Restoring existing overlay networks from HNS into docker                      
12/7/2020 5:53:34 AM  1 Information      Loading containers: done.                                                     
12/7/2020 5:53:34 AM  1 Information      starting telemetry service                                                    
12/7/2020 5:53:34 AM 11 Information      Docker daemon [version=19.03.13 commit=5b5efb2d49                             
                                         graphdriver(s)=windowsfilter]                                                 
12/7/2020 5:53:34 AM  1 Information      Daemon has completed initialization                                           
12/7/2020 5:53:34 AM  1 Information      API listen on //./pipe/docker_engine

Comment 5 errata-xmlrpc 2020-12-14 13:50:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.6.8 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5259


Note You need to log in before you can comment on or make changes to this bug.