Description of problem:
After exposing a NodePort service on a VM and its pod, the VM is accessible via ssh on the exposed port. If you restart the VM, the service and port are no longer accessible.

Version-Release number of selected component (if applicable):
Client/server version: v0.12.0-alpha.2

How reproducible:
Always

Steps to Reproduce:
1. Create a cirros VM:
# oc create -f cluster/example/vm-cirros.yaml
2. Start the VM:
# virtctl start vm-cirros
3. Verify the VMI is running:
# oc get VMI
NAME        AGE   PHASE     IP            NODENAME
vm-cirros   10m   Running   10.130.0.46   cnv-executor-ysegev-node1.example.com
4. Get the pod name:
# oc get pods
NAME                            READY   STATUS    RESTARTS   AGE
virt-launcher-vm-cirros-bdf7w   2/2     Running   0          11m
5. Expose a NodePort service via the pod, to be used for ssh (i.e. the target port is 22):
# oc expose pod virt-launcher-vm-cirros-bdf7w --name=testnp --port=27017 --target-port=22 --type=NodePort
6. View the service and the exposed port:
# oc get svc testnp
NAME     TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)           AGE
testnp   NodePort   172.30.134.72   <none>        27017:31854/TCP   17m
7. Open an ssh connection to the VM via the IP address of the node which hosts the VM, and the exposed port:
# ssh cirros@10.8.243.240 -p 31854
8. Make sure you are logged in to the VM's ssh console.
9. Exit the ssh console.
10. Restart the VM:
# virtctl stop vm-cirros
# virtctl start vm-cirros
11. After the VM is up again, try to ssh to it again via the same node IP and port:
# ssh cirros@10.8.243.240 -p 31854

Actual results:
ssh: connect to host 10.8.243.240 port 31854: Connection refused

Expected results:
The ssh connection to the VM should be available via the exposed NodePort service, just like before restarting the VM.

Additional info:
Some more investigation, after restarting the VM:
1. The VM's pod loads with a different name than the one before the restart.
2. The exposed service's iptables rules don't load.

These are the exposed service's rules before restarting the VM:
-A KUBE-NODEPORTS -p tcp -m comment --comment "kubevirt/testnp:" -m tcp --dport 31221 -j KUBE-MARK-MASQ
-A KUBE-NODEPORTS -p tcp -m comment --comment "kubevirt/testnp:" -m tcp --dport 31221 -j KUBE-SVC-4GYGD62LGYM2BRMM
-A KUBE-SEP-HYW5LKWDE7SML36T -s 10.130.0.46/32 -m comment --comment "kubevirt/testnp:" -j KUBE-MARK-MASQ
-A KUBE-SEP-HYW5LKWDE7SML36T -p tcp -m comment --comment "kubevirt/testnp:" -m tcp -j DNAT --to-destination 10.130.0.46:22
-A KUBE-SERVICES ! -s 10.128.0.0/14 -d 172.30.251.47/32 -p tcp -m comment --comment "kubevirt/testnp: cluster IP" -m tcp --dport 27017 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 172.30.251.47/32 -p tcp -m comment --comment "kubevirt/testnp: cluster IP" -m tcp --dport 27017 -j KUBE-SVC-4GYGD62LGYM2BRMM
-A KUBE-SVC-4GYGD62LGYM2BRMM -m comment --comment "kubevirt/testnp:" -j KUBE-SEP-HYW5LKWDE7SML36T

And these are the rules after the restart:
-A KUBE-EXTERNAL-SERVICES -p tcp -m comment --comment "kubevirt/testnp: has no endpoints" -m addrtype --dst-type LOCAL -m tcp --dport 31221 -j REJECT --reject-with icmp-port-unreachable
-A KUBE-SERVICES -d 172.30.251.47/32 -p tcp -m comment --comment "kubevirt/testnp: has no endpoints" -m tcp --dport 27017 -j REJECT --reject-with icmp-port-unreachable
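The "has no endpoints" comments in the post-restart rules can be checked directly. A minimal sketch, assuming access to the reporter's cluster and the service/namespace names above: a service created with "oc expose pod" selects the original launcher pod by its labels, and since the restarted VM gets a fresh virt-launcher pod with a new name, the selector no longer matches any pod:

```shell
# The service's selector still targets the old launcher pod...
oc describe svc testnp | grep -i selector

# ...so after the VM restart the service has no endpoints to forward to:
oc get endpoints testnp
```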
After consulting with Dan and Sebastian - the correct way to expose a service on a VM is via the "virtctl expose" command, rather than "oc expose". However, you still cannot ssh to the VM via the exposed NodePort service after restarting the VM.

Steps to Reproduce:
1. Create a cirros VM:
# oc create -f cluster/example/vm-cirros.yaml
2. Start the VM:
# virtctl start vm-cirros
3. Verify the VMI is running:
# oc get VMI
NAME        AGE   PHASE     IP            NODENAME
vm-cirros   10m   Running   10.130.0.46   cnv-executor-ysegev-node1.example.com
4. Expose a NodePort service, to be used for ssh (i.e. the target port is 22):
# virtctl expose vmi vm-cirros --name=testnp --port=27017 --target-port=22 --type=NodePort
5. View the service and the exposed port:
# oc get svc testnp
NAME     TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)           AGE
testnp   NodePort   172.30.127.7   <none>        27017:31682/TCP   13m
6. Open an ssh connection to the VM via the IP address of the node which hosts the VM, and the exposed port:
# ssh cirros@10.8.243.240 -p 31682
7. Make sure you are logged in to the VM's ssh console.
8. Exit the ssh console.
9. Restart the VM:
# virtctl stop vm-cirros
# virtctl start vm-cirros
10. After the VM is up again, try to ssh to it again via the same node IP and port:
# ssh cirros@10.8.243.240 -p 31682

Actual results:
The ssh request is rejected:
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the ECDSA key sent by the remote host is
SHA256:TvnAFvlwkTUT7zo3p8+bjXc2HTxhM28wjHg0bzAwVm4.
Please contact your system administrator.
Add correct host key in /home/ysegev/.ssh/known_hosts to get rid of this message.
Offending ECDSA key in /home/ysegev/.ssh/known_hosts:35
ECDSA host key for [10.8.243.240]:31682 has changed and you have requested strict checking.
Host key verification failed.
The way to resolve this is to delete the previous entry for this connection in ~/.ssh/known_hosts. The exact entry to delete is referenced in the error message:
Offending ECDSA key in /home/ysegev/.ssh/known_hosts:35
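Instead of editing known_hosts by hand, ssh-keygen -R can remove the stale entry; for a non-default port the entry is matched with the [host]:port form from the error message. The sketch below runs against a scratch known_hosts file (with a throwaway key) so it is self-contained; in practice you would just run the ssh-keygen -R line without -f to operate on the default ~/.ssh/known_hosts:

```shell
# Build a scratch known_hosts containing a (throwaway) key for the NodePort endpoint.
KH=$(mktemp)
ssh-keygen -q -t ed25519 -N '' -f "$KH.key"                # demo key only
printf '[10.8.243.240]:31682 %s\n' "$(cat "$KH.key.pub")" > "$KH"

# Remove the stale entry; non-default ports use the [host]:port form.
ssh-keygen -R '[10.8.243.240]:31682' -f "$KH"

wc -l < "$KH"    # the offending line is gone
```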
Extra info:
The scenario in comment 2 happens when restarting the VM using the virtctl command, i.e.:
# virtctl stop vm-cirros
# virtctl start vm-cirros
When restarting from within the VM, on the other hand, nothing prevents ssh-ing to the VM after startup is done. Try the following scenario:
1.a. Open an ssh connection to the VM:
# ssh cirros@10.8.243.240 -p 31682
or
1.b. Open a console to the VM using virtctl:
# virtctl console vm-cirros
2. Once inside the VM console, reboot the VM using the reboot command:
# sudo reboot
3. After the reboot is done and the VM is up again (it takes ~30 seconds), try to ssh to it again via the NodePort:
# ssh cirros@10.8.243.240 -p 31682
You should be able to connect with no failure or error.
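A plausible way to see why the in-guest reboot behaves differently, assuming access to the cluster from this report: "sudo reboot" restarts only the guest, so the virt-launcher pod (and with it the pod IP backing the service's endpoint) survives, whereas "virtctl stop/start" replaces the pod:

```shell
# Note the launcher pod name and IP, reboot from inside the guest, then run
# this again; with an in-guest reboot both should be unchanged, while a
# virtctl stop/start produces a new pod name (and usually a new IP):
oc get pods -o wide | grep virt-launcher-vm-cirros
```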
Hi Yossi,
This is not a problem. This happens because vm-cirros is using a containerDisk. That means that every time you reboot the VM, it will start from a new disk (hence the ssh host key changes). If you don't want this, you need to deploy a VM with a persistent volume, so all of the VM's configuration will survive a reboot.
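To confirm which volume type backs the VM's disk, the VM spec can be inspected (a sketch, assuming cluster access; the jsonpath follows the KubeVirt VM layout, where the VMI template lives under spec.template):

```shell
# A containerDisk entry here means the disk - and with it the guest's ssh
# host keys - is recreated from the image on every stop/start:
oc get vm vm-cirros -o jsonpath='{.spec.template.spec.volumes}'
```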