Bug 1947052
| Summary: | Creating the vSphere Windows VM golden image (vSphere IPI + Windows Container) | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Robert Bohne <rbohne> |
| Component: | Documentation | Assignee: | Latha S <lmurthy> |
| Status: | CLOSED WONTFIX | QA Contact: | gaoshang <sgao> |
| Severity: | unspecified | Docs Contact: | Latha S <lmurthy> |
| Priority: | unspecified | ||
| Version: | 4.7 | CC: | aravindh |
| Target Milestone: | --- | Flags: | rbohne: needinfo- |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-09-27 16:16:49 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Thanks for the detailed bug report. Before this gets merged into the docs, dev / QE would need to verify and test this.

Created a Story for the support of Windows Server 2019 LTSC on vSphere without a custom VXLAN port: https://issues.redhat.com/browse/WINC-598

Maybe we can proceed with the docbug without the Windows Server 2019 LTSC topic, because the current documentation is basically broken! I suggest:

Document URL: https://docs.openshift.com/container-platform/4.7/windows_containers/creating_windows_machinesets/creating-windows-machineset-vsphere.html#creating-the-vsphere-windows-vm-golden-image_creating-windows-machineset-vsphere

Section Number and Name: Creating the vSphere Windows VM golden image

Describe the issue: Split the steps a bit better.

Suggestions for improvement:

1) # Original step 1

2) Disable IPv6

Via PowerShell:
```
# List the adapter bindings to find the adapter name, then disable IPv6 on it
Get-NetAdapterBinding
Disable-NetAdapterBinding -Name <Name> -ComponentID ms_tcpip6
```

3) # Original step 4

4) # Original step 5, maybe add the PowerShell example:
```
"exclude-nics=" | Set-Content -Path 'C:\ProgramData\VMware\VMware Tools\tools.conf'
```

5) # Original part of step 3

Installation and configuration of OpenSSH
```
# Install the OpenSSH server and start both services automatically
Add-WindowsCapability -Online -Name OpenSSH.Server~~~~0.0.1.0
Set-Service -Name ssh-agent -StartupType 'Automatic'
Set-Service -Name sshd -StartupType 'Automatic'
Start-Service ssh-agent
Start-Service sshd
# Uncomment public key and password authentication in sshd_config
$pubKeyConf = (Get-Content -path C:\ProgramData\ssh\sshd_config) -replace '#PubkeyAuthentication yes','PubkeyAuthentication yes'
$pubKeyConf | Set-Content -Path C:\ProgramData\ssh\sshd_config
$passwordConf = (Get-Content -path C:\ProgramData\ssh\sshd_config) -replace '#PasswordAuthentication yes','PasswordAuthentication yes'
$passwordConf | Set-Content -Path C:\ProgramData\ssh\sshd_config
Restart-Service sshd
```

6) # Original step 2

Create C:\ProgramData\ssh\administrators_authorized_keys and be careful with the ACL of the file.
```
"ssh-rsa ...." | Set-Content -Path 'C:\ProgramData\ssh\administrators_authorized_keys'
# Fix permission
$acl = Get-Acl C:\ProgramData\ssh\administrators_authorized_keys
# Disable inheritance and drop the inherited rules
$acl.SetAccessRuleProtection($true, $false)
$administratorsRule = New-Object system.security.accesscontrol.filesystemaccessrule("Administrators","FullControl","Allow")
$systemRule = New-Object system.security.accesscontrol.filesystemaccessrule("SYSTEM","FullControl","Allow")
$acl.SetAccessRule($administratorsRule)
$acl.SetAccessRule($systemRule)
$acl | Set-Acl
```

7) Allow incoming connections for container logs # Originally included in step 3.
```
$firewallRuleName = "ContainerLogsPort"
$containerLogsPort = "10250"
New-NetFirewallRule -DisplayName $firewallRuleName -Direction Inbound -Action Allow -Protocol TCP -LocalPort $containerLogsPort -EdgeTraversalPolicy Allow
```

8) Install the container runtime. Currently docker, it will change to containerd.
```
Install-PackageProvider -Name NuGet -MinimumVersion 2.8.5.201 -Force
Set-PSRepository PSGallery -InstallationPolicy Trusted
Install-Module -Name DockerMsftProvider -Repository PSGallery -Force
Install-Package -Name docker -ProviderName DockerMsftProvider -Force
Restart-Computer -Force
```
Or add a link to the Microsoft documentation: https://docs.microsoft.com/en-us/virtualization/windowscontainers/deploy-containers/deploy-containers-on-server

9) # Original step 6

Maybe add an example of how to pull images:

Windows Server 2019 LTSC: `docker pull mcr.microsoft.com/windows/servercore:ltsc2019`
Windows Server 1909 SAC: `docker pull mcr.microsoft.com/windows/servercore:1909`

Here is a list of all available Windows Server Core images: https://hub.docker.com/_/microsoft-windows-servercore?tab=description
or the overall page: https://hub.docker.com/_/microsoft-windows-base-os-images?tab=description

How to get the OS version:

PowerShell:
```
> Get-ComputerInfo | select OsHardwareAbstractionLayer
OsHardwareAbstractionLayer
--------------------------
10.0.19041.488
```

CMD:
```
> ver
Microsoft Windows [Version 10.0.19041.508]
```

[1] LTSC vs SAC: https://docs.microsoft.com/en-us/windows-server/get-started-19/servicing-channels-19
[2] https://docs.microsoft.com/en-us/virtualization/windowscontainers/kubernetes/common-problems#pod-to-pod-connectivity-between-hosts-is-broken-on-my-kubernetes-cluster-running-on-vsphere
[3] https://docs.microsoft.com/en-us/windows/release-health/release-information

Additional information:
- https://coreos.slack.com/archives/C01A97KAN9X/p1617726773019300
- https://coreos.slack.com/archives/CM4ERHBJS/p1617784608180200
- https://docs.microsoft.com/en-us/windows/release-health/release-information

(In reply to Robert Bohne from comment #3)
> 2) Disable IPv6
> Via Powershell:
> ```
> Get-NetAdapterBinding
> Disable-NetAdapterBinding -Name <Name> -ComponentID ms_tcpip6
> ```

Sadly, this did not work very well, because after sysprep the devices are recreated (kind of). I tried to add disabling IPv6 via unattend.xml; this works for the new devices. But after the WMCO run, IPv6 is active on vEthernet (Ethernet0).
<FirstLogonCommands>
  <SynchronousCommand wcm:action="add">
    <Order>1</Order>
    <Description>Disable IPv6 at Ethernet0</Description>
    <CommandLine>cmd.exe /c powershell -Command "Disable-NetAdapterBinding -Name 'Ethernet0' -ComponentID ms_tcpip6"</CommandLine>
    <RequiresUserInput>true</RequiresUserInput>
  </SynchronousCommand>
  <SynchronousCommand wcm:action="add">
    <Order>2</Order>
    <Description>Disable IPv6 at vEthernet (nat)</Description>
    <CommandLine>cmd.exe /c powershell -Command "Disable-NetAdapterBinding -Name 'vEthernet (nat)' -ComponentID ms_tcpip6"</CommandLine>
    <RequiresUserInput>true</RequiresUserInput>
  </SynchronousCommand>
</FirstLogonCommands>

Disabling IPv6 in the golden image via:

reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip6\Parameters" /v DisabledComponents /t REG_DWORD /d 0xffffffff /f

Source: https://docs.microsoft.com/en-us/troubleshoot/windows-server/networking/configure-ipv6-in-windows

Works very well.

Add KSC https://access.redhat.com/solutions/6005141 - Work-in-progress.

This bug is trying to address multiple issues at the same time, i.e. "disable IPv6", use 2019 LTSC in vSphere, and update the SSH installation. Rather, this bug should focus on fixing the docs [0] in their current form, i.e. update the SSH installation, given that SSH utils has been deprecated and we have already tested that. The rest requires more testing.

[0] https://docs.openshift.com/container-platform/4.7/windows_containers/creating_windows_machinesets/creating-windows-machineset-vsphere.html#creating-the-vsphere-windows-vm-golden-image_creating-windows-machineset-vsphere

The 2019 LTSC in vSphere part can be completely ignored, because I created an RFE for that: https://issues.redhat.com/browse/WINC-598

Fixes for the vSphere golden image docs are in QE review: https://github.com/openshift/openshift-docs/pull/35728. This is based on a rewrite of the recommendations given by Engineering on creating the template.
With those changes going in soon, what else is there to address from the original request?

Looks much better; the only point I miss is a hint about IPv6. As far as I know, using IPv6 together with Windows containers is currently not supported. You should disable IPv6 on the Windows node if it is enabled/provided by default in your network.

@Aravindh can you confirm that IPv6 is not supported with Windows containers? If so, any recommended steps for disabling it?

IPv6 is not supported with Windows containers; however, we need to test out disabling it before we can add it to the doc. Please see my previous comment [0].

[0] https://bugzilla.redhat.com/show_bug.cgi?id=1947052#c6

Thank you! I will hold off on the IPv6 disabling docs. Perhaps a note should be added that it's not supported, so customers at least have a heads up until we can nail down exact steps? Or should we wait until steps are available before moving forward?

We can mention what is written in the upstream docs [0], mainly "Overlay (VXLAN) networks on Windows do not support dual-stack networking today."

[0] https://kubernetes.io/docs/setup/production-environment/windows/intro-windows-in-kubernetes/#ipv4-ipv6-dual-stack
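
Until tested steps exist, a check like the one below could at least tell whether the `reg add` approach mentioned earlier in this thread is in effect on a given image. This is a minimal sketch, not a verified or supported procedure. `Test-IPv6DisabledValue` is a hypothetical helper name, and treating "all of the low eight bits set" as "IPv6 disabled" is an assumption that should be checked against the Microsoft article linked above.

```powershell
# Hypothetical helper: interprets a DisabledComponents registry value.
# Assumption: IPv6 counts as disabled when the low eight bits are all set,
# which covers the 0xffffffff value used in this thread as well as 0xFF.
function Test-IPv6DisabledValue {
    param([Parameter(Mandatory)][long]$Value)
    return (($Value -band 0xFF) -eq 0xFF)
}

# On a Windows host, read the value written by the reg add command, if present.
$key = 'HKLM:\SYSTEM\CurrentControlSet\Services\Tcpip6\Parameters'
if (Test-Path $key) {
    $current = (Get-ItemProperty -Path $key -Name DisabledComponents -ErrorAction SilentlyContinue).DisabledComponents
    if ($null -ne $current) {
        # Emits True when the stored value looks like "IPv6 disabled"
        Test-IPv6DisabledValue -Value $current
    }
}
```

Because the thread reports IPv6 becoming active again on vEthernet (Ethernet0) after the WMCO run, looking at `Get-NetAdapterBinding -ComponentID ms_tcpip6` on a configured node is probably still worthwhile in addition to the registry check.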
After a bunch of tests with Windows containers, there are a couple of topics to fix in the documentation.

High-level overview:
* Windows 2019 LTSC works very well on vSphere. A custom VXLAN port (hybridOverlayVXLANPort) is only needed if your vSphere environment uses any overlay networking like NSX-*. This is a huge benefit for our customers!
* Custom VXLAN port is only supported with Windows 1909 SAC. WMCO 2.0 only supports 1909, build 18363.*. Newer versions, Windows Server 2004 (19041.*) or Windows Server 20H2 (19042.*), are not supported. (Planned for WMCO 3.0)
* SSH public key in C:\Users\Administrator\.ssh\authorized_keys is not available after sysprep.
* IPv6 is not supported and has to be disabled in the golden image.
* Split up golden image step 3 into OpenSSH and firewall settings

Document URL: https://docs.openshift.com/container-platform/4.7/windows_containers/creating_windows_machinesets/creating-windows-machineset-vsphere.html#creating-the-vsphere-windows-vm-golden-image_creating-windows-machineset-vsphere

Section Number and Name: Creating the vSphere Windows VM golden image

Describe the issue: Split the steps a bit better.

Suggestions for improvement:

1) Create VM from Windows 2019 LTSC [1] ... (Only supported with default VXLAN port.)

Custom VXLAN port (hybridOverlayVXLANPort) is only supported with Windows 1909 SAC [1]*. A custom VXLAN port is only needed if your vSphere environment uses any overlay network that conflicts with the default VXLAN port 4789.

* WMCO 2.0 only supports Windows Server 1909 SAC, build 18363.*
WMCO 3.0 (WINC-555) will support Windows Server 2004 (19041.*) or Windows Server 20H2 (19042.*). A Windows Server version & build overview is available at Microsoft. [3]

2) Disable IPv6

Via PowerShell:
```
# List the adapter bindings to find the adapter name, then disable IPv6 on it
Get-NetAdapterBinding
Disable-NetAdapterBinding -Name <Name> -ComponentID ms_tcpip6
```

3) # Original step 4

4) # Original step 5, maybe add the PowerShell example:
```
"exclude-nics=" | Set-Content -Path 'C:\ProgramData\VMware\VMware Tools\tools.conf'
```

5) # Original part of step 3

Installation and configuration of OpenSSH
```
# Install the OpenSSH server and start both services automatically
Add-WindowsCapability -Online -Name OpenSSH.Server~~~~0.0.1.0
Set-Service -Name ssh-agent -StartupType 'Automatic'
Set-Service -Name sshd -StartupType 'Automatic'
Start-Service ssh-agent
Start-Service sshd
# Uncomment public key and password authentication in sshd_config
$pubKeyConf = (Get-Content -path C:\ProgramData\ssh\sshd_config) -replace '#PubkeyAuthentication yes','PubkeyAuthentication yes'
$pubKeyConf | Set-Content -Path C:\ProgramData\ssh\sshd_config
$passwordConf = (Get-Content -path C:\ProgramData\ssh\sshd_config) -replace '#PasswordAuthentication yes','PasswordAuthentication yes'
$passwordConf | Set-Content -Path C:\ProgramData\ssh\sshd_config
Restart-Service sshd
```

6) # Original step 2

Create C:\ProgramData\ssh\administrators_authorized_keys and be careful with the ACL of the file.
```
"ssh-rsa ...." | Set-Content -Path 'C:\ProgramData\ssh\administrators_authorized_keys'
# Fix permission
$acl = Get-Acl C:\ProgramData\ssh\administrators_authorized_keys
# Disable inheritance and drop the inherited rules
$acl.SetAccessRuleProtection($true, $false)
$administratorsRule = New-Object system.security.accesscontrol.filesystemaccessrule("Administrators","FullControl","Allow")
$systemRule = New-Object system.security.accesscontrol.filesystemaccessrule("SYSTEM","FullControl","Allow")
$acl.SetAccessRule($administratorsRule)
$acl.SetAccessRule($systemRule)
$acl | Set-Acl
```

7) Allow incoming connections for container logs # Originally included in step 3.
```
$firewallRuleName = "ContainerLogsPort"
$containerLogsPort = "10250"
New-NetFirewallRule -DisplayName $firewallRuleName -Direction Inbound -Action Allow -Protocol TCP -LocalPort $containerLogsPort -EdgeTraversalPolicy Allow
```

8) Install the container runtime. Currently docker, it will change to containerd.
```
Install-PackageProvider -Name NuGet -MinimumVersion 2.8.5.201 -Force
Set-PSRepository PSGallery -InstallationPolicy Trusted
Install-Module -Name DockerMsftProvider -Repository PSGallery -Force
Install-Package -Name docker -ProviderName DockerMsftProvider -Force
Restart-Computer -Force
```

9) # Original step 6

Maybe add an example of how to pull images:

Windows Server 2019 LTSC: `docker pull mcr.microsoft.com/windows/servercore:ltsc2019`
Windows Server 1909 SAC: `docker pull mcr.microsoft.com/windows/servercore:1909`

Here is a list of all available Windows Server Core images: https://hub.docker.com/_/microsoft-windows-servercore?tab=description
or the overall page: https://hub.docker.com/_/microsoft-windows-base-os-images?tab=description

How to get the OS version:

PowerShell:
```
> Get-ComputerInfo | select OsHardwareAbstractionLayer
OsHardwareAbstractionLayer
--------------------------
10.0.19041.488
```

CMD:
```
> ver
Microsoft Windows [Version 10.0.19041.508]
```

[1] LTSC vs SAC: https://docs.microsoft.com/en-us/windows-server/get-started-19/servicing-channels-19
[2] https://docs.microsoft.com/en-us/virtualization/windowscontainers/kubernetes/common-problems#pod-to-pod-connectivity-between-hosts-is-broken-on-my-kubernetes-cluster-running-on-vsphere
[3] https://docs.microsoft.com/en-us/windows/release-health/release-information

Additional information:
- https://coreos.slack.com/archives/C01A97KAN9X/p1617726773019300
- https://coreos.slack.com/archives/CM4ERHBJS/p1617784608180200
- https://docs.microsoft.com/en-us/windows/release-health/release-information
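
Since the supported combinations depend on the build number, the version check could be paired with a small lookup like the one below. This is only a sketch under stated assumptions: `Get-WmcoBuildNote` is a hypothetical helper name, the notes restate what this report says about WMCO 2.0 rather than an official support matrix, and build 17763 for Windows Server 2019 LTSC is not mentioned in the report but taken from Microsoft's release information [3].

```powershell
# Hypothetical helper: maps a Windows version string to the status described
# in this report. Not an official support matrix.
function Get-WmcoBuildNote {
    param([Parameter(Mandatory)][string]$Version)   # e.g. '10.0.18363.1500'
    # The third component of the version string is the build number
    $build = ([version]$Version).Build
    switch ($build) {
        17763   { 'Windows Server 2019 LTSC: reported to work with the default VXLAN port only' }
        18363   { 'Windows Server 1909 SAC: supported by WMCO 2.0, custom VXLAN port possible' }
        19041   { 'Windows Server 2004: not supported by WMCO 2.0' }
        19042   { 'Windows Server 20H2: not supported by WMCO 2.0' }
        default { "Build ${build}: not covered by this report" }
    }
}

# Usage on the VM itself: pass in the running OS version
Get-WmcoBuildNote -Version ([System.Environment]::OSVersion.Version.ToString())
```

The helper takes the version as a string so that the output of `ver` or `Get-ComputerInfo` can be pasted in directly when checking an image from another machine.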