Description of problem:
oc debug <pod-name> does not work for Windows pods. The error thrown is:

$ oc debug win-webserver-79878b949c-5hqw4
Starting pod/win-webserver-79878b949c-5hqw4-debug, command was: powershell.exe -command $listener = New-Object System.Net.HttpListener; $listener.Prefixes.Add('http://*:80/'); $listener.Start();Write-Host('Listening at http://*:80/'); while ($listener.IsListening) { $context = $listener.GetContext(); $response = $context.Response; $content='<html><body><H1>Red Hat OpenShift + Windows Container Workloads</H1></body></html>'; $buffer = [System.Text.Encoding]::UTF8.GetBytes($content); $response.ContentLength64 = $buffer.Length; $response.OutputStream.Write($buffer, 0, $buffer.Length); $response.Close(); };
failed to try resolving symlinks in path "\\var\\log\\pods\\default_win-webserver-79878b949c-5hqw4-debug_6db9ac83-3a9f-428e-9234-605b3d36927d\\windowswebserver\\0.log": CreateFile \var\log\pods\default_win-webserver-79878b949c-5hqw4-debug_6db9ac83-3a9f-428e-9234-605b3d36927d\windowswebserver\0.log: The system cannot find the file specified.
Removing debug pod ...

Version-Release number of selected component (if applicable):
4.9

How reproducible:
Always

Steps to Reproduce:
1. A Windows workload win-webserver is created on a Windows node brought up using a sample service and deployment: https://docs.openshift.com/container-platform/4.8/windows_containers/scheduling-windows-workloads.html#sample-windows-workload-deployment_scheduling-windows-workloads
2. oc debug win-webserver-79878b949c-5hqw4 is executed

Actual results:
The command errors out and removes the debug pod.

Expected results:
The command lands in a debug pod without errors.

Additional info:
After some digging around, it turns out the main issue is that the default debug container command is /bin/sh, which should be cmd for Windows. What oc exec is doing for Windows (https://github.com/openshift/console/blob/8c7a7e60edb4722d4a5069030025bcc238dba714/frontend/public/components/pod-exec.jsx#L60) could be the way forward.

Another issue highlighted by the error that needs a fix: when the debug command errors out, getLogs() is called, and the way it is called is not compatible with Windows log collection.

Tested this on 4.9; pretty sure this would need to be backported to other versions as well.
@maszulik Why has this been passed back to the Windows Container team? This is an issue with how oc handles the debug command for Windows and has nothing to do with the Windows Machine Config Operator, which is what the WinC team owns.
After discussion with the workloads team, passing this bug back to the team. Points to be considered for fixing this bug:

A Windows pod spec is expected to set the node selector and tolerations for the host:

    nodeSelector:
      kubernetes.io/os: windows
      node.kubernetes.io/windows-build: '10.0.17763'
    tolerations:
      - key: "os"
        operator: "Equal"
        value: "windows"
        effect: "NoSchedule"

ref: https://kubernetes.io/docs/setup/production-environment/windows/user-guide-windows-containers/#ensuring-os-specific-workloads-land-on-the-appropriate-container-host

However, the above node selector and tolerations are recommended and not guaranteed to be present. In such cases we should make a best effort for Windows pods and fall back to /bin/sh where the OS can't be figured out.

Fixing the oc debug pod command is also required for a feature the console team is adding to the admin console that allows debugging pod containers from the UI.
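To illustrate the best-effort detection described above, here is a rough sketch in Go. It is not the actual oc code: the PodHints and Toleration types and both function names are hypothetical, trimmed-down stand-ins for the real pod spec, and the only inputs it looks at are the node selector labels and the "os" toleration quoted above. When none of those hints is present it falls back to /bin/sh.

```go
package main

import "fmt"

// Toleration mirrors only the toleration fields used in this sketch.
// Hypothetical type, not the Kubernetes API type.
type Toleration struct {
	Key      string
	Operator string
	Value    string
	Effect   string
}

// PodHints is a hypothetical, trimmed-down view of a pod spec.
type PodHints struct {
	NodeSelector map[string]string
	Tolerations  []Toleration
}

// isProbablyWindows makes a best-effort guess from the recommended
// (but not guaranteed) node selector labels and toleration.
func isProbablyWindows(p PodHints) bool {
	// Reading from a nil map is safe in Go and yields the zero value.
	if p.NodeSelector["kubernetes.io/os"] == "windows" {
		return true
	}
	if _, ok := p.NodeSelector["node.kubernetes.io/windows-build"]; ok {
		return true
	}
	for _, t := range p.Tolerations {
		if t.Key == "os" && t.Value == "windows" {
			return true
		}
	}
	return false
}

// defaultDebugCommand picks cmd for pods that look like Windows pods
// and falls back to /bin/sh when the OS cannot be determined.
func defaultDebugCommand(p PodHints) []string {
	if isProbablyWindows(p) {
		return []string{"cmd"}
	}
	return []string{"/bin/sh"}
}

func main() {
	win := PodHints{NodeSelector: map[string]string{"kubernetes.io/os": "windows"}}
	unknown := PodHints{}
	fmt.Println(defaultDebugCommand(win))
	fmt.Println(defaultDebugCommand(unknown))
}
```

The fallback keeps today's behavior for every pod that carries no Windows hint, so only pods that clearly target Windows nodes would see a different default command.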
This appears to be fixed in the latest oc version; I tried version 4.9.15 and it works.
[root@localhost ~]# oc debug node/ip-10-0-137-166.us-east-2.compute.internal
error: cannot debug ip-10-0-137-166.us-east-2.compute.internal: can't debug Windows nodes
[root@localhost ~]# oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-01-25-023600   True        False         64m     Cluster version is 4.10.0-0.nightly-2022-01-25-023600
@yinzhou, did you verify that `oc debug <podname>` works?