Description of problem
======================

When I use oc rsync with an invalid container name to fetch data from the container, the error message complains about the availability of copy strategies instead of the root cause (the container doesn't exist).

This problem is similar to the old, now fixed, OCP 3.x BZ 1314817, but this time it's about a container instead of a pod.

Version-Release number of selected component
============================================

OCP 4.10.0-0.nightly-2022-02-22-093600

How reproducible
================

100%

Steps to Reproduce
==================

1. Run `oc rsync --container foo -n NS pod/POD:/etc/redhat-release /tmp/` so that NS and POD are valid references for a running pod in a namespace, but container foo doesn't exist.
2. Run the same command again, but this time with the additional option `--loglevel=4`

Actual results
==============

The oc rsync fails, complaining that rsync and tar are missing in the container:

```
$ oc rsync --container foo -n openshift-storage pod/rook-ceph-tools-56f88ff6cb-tvpg5:/etc/redhat-release /tmp
WARNING: cannot use rsync: rsync not available in container
WARNING: cannot use tar: tar not available in container
error: No available strategies to copy.
```

The same run with `--loglevel=4` shows that the problem is actually elsewhere:

```
$ oc rsync --loglevel=4 --container foo -n openshift-storage pod/rook-ceph-tools-56f88ff6cb-tvpg5:/etc/redhat-release /tmp
I0223 19:37:10.124688 11013 copy_rsync.go:59] Rsh command: oc rsh --container=foo --loglevel=4 --namespace=openshift-storage
I0223 19:37:10.125136 11013 copy_rsync.go:82] Copying files with rsync
I0223 19:37:10.125380 11013 exec_local.go:19] Local executor running command: rsync --blocking-io --archive --no-owner --no-group --omit-dir-times --numeric-ids -v -e oc rsh --container=foo --loglevel=4 --namespace=openshift-storage rook-ceph-tools-56f88ff6cb-tvpg5:/etc/redhat-release /tmp
I0223 19:37:10.975300 11013 exec_local.go:26] Error from local command execution: exit status 12
I0223 19:37:11.100162 11013 exec_remote.go:29] Remote executor running command: rsync --version
I0223 19:37:11.607211 11013 exec_remote.go:54] Error from remote execution: container foo is not valid for pod rook-ceph-tools-56f88ff6cb-tvpg5
I0223 19:37:11.608690 11013 util.go:25]
I0223 19:37:11.609671 11013 util.go:26] error: container foo is not valid for pod rook-ceph-tools-56f88ff6cb-tvpg5
I0223 19:37:11.610786 11013 copy_multi.go:30] Error output:
WARNING: cannot use rsync: rsync not available in container
I0223 19:37:11.613163 11013 copy_tar.go:119] Copying files with tar
I0223 19:37:11.614579 11013 copy_tar.go:147] Creating local tar file /tmp/rsync4131236598 from remote path /etc/redhat-release
I0223 19:37:11.615580 11013 copy_tar.go:203] Tarring /etc/redhat-release remotely
I0223 19:37:11.616821 11013 copy_tar.go:227] Remote tar command: tar -C /etc -c redhat-release
I0223 19:37:11.618229 11013 exec_remote.go:29] Remote executor running command: tar -C /etc -c redhat-release
I0223 19:37:12.124460 11013 exec_remote.go:54] Error from remote execution: container foo is not valid for pod rook-ceph-tools-56f88ff6cb-tvpg5
I0223 19:37:12.126566 11013 exec_remote.go:29] Remote executor running command: tar --version
I0223 19:37:12.638972 11013 exec_remote.go:54] Error from remote execution: container foo is not valid for pod rook-ceph-tools-56f88ff6cb-tvpg5
I0223 19:37:12.640420 11013 util.go:25]
I0223 19:37:12.641621 11013 util.go:26] error: container foo is not valid for pod rook-ceph-tools-56f88ff6cb-tvpg5
I0223 19:37:12.643289 11013 copy_multi.go:30] Error output:
WARNING: cannot use tar: tar not available in container
error: No available strategies to copy
```

Expected results
================

The oc rsync command complains about the missing/nonexistent container:

```
$ oc rsync --container foo -n openshift-storage pod/rook-ceph-tools-56f88ff6cb-tvpg5:/etc/redhat-release /tmp
Error from server (NotFound): container "foo" not found
```

Additional info
===============

When the problem is with a missing pod or namespace, oc rsync has no problem reporting the root cause:

```
$ oc rsync -n openshift-storage pod/rook-ceph-tools-56f88ff6cb-tvpg6:/etc/redhat-release /tmp/etc
Error from server (NotFound): pods "rook-ceph-tools-56f88ff6cb-tvpg6" not found
$ oc rsync -n openshift-storaga pod/rook-ceph-tools-56f88ff6cb-tvpg6:/etc/redhat-release /tmp/etc
Error from server (NotFound): namespaces "openshift-storaga" not found
```

I hit this problem while debugging a mysterious failure of an automated test case, and after some debugging realized that the error here is a red herring. In my case, the container ceased to exist, which resulted in a misleading error about rsync disappearing from the container.
This could happen when oc rsync tries to fetch data while the target container is being restarted for some reason (liveness probe failure, high memory utilization, ...). So when checked later, the pod, container and tools are all present, while the error message complains about missing tools in a container image.
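
Until a fixed client is available, test automation could guard against the misleading message by checking the container name itself before calling oc rsync. The following is only a sketch of that idea, not the fix for this bug: `container_in_list` and `rsync_from_container` are hypothetical helper names, and the check only looks at the regular containers listed in the pod spec (not init or ephemeral containers).

```shell
#!/bin/sh
# Sketch of a pre-check wrapper around `oc rsync`.
# container_in_list and rsync_from_container are hypothetical helper names.

# Return 0 if the name in $1 is among the whitespace-separated names in $2.
container_in_list() {
    wanted=$1
    # $2 is intentionally unquoted so that it splits on whitespace.
    for name in $2; do
        if [ "$name" = "$wanted" ]; then
            return 0
        fi
    done
    return 1
}

# Usage: rsync_from_container NAMESPACE POD CONTAINER REMOTE_PATH LOCAL_PATH
rsync_from_container() {
    ns=$1
    pod=$2
    container=$3
    src=$4
    dest=$5

    # Names of the regular containers declared in the pod spec.
    names=$(oc get pod "$pod" -n "$ns" \
        -o jsonpath='{.spec.containers[*].name}') || return 1

    if ! container_in_list "$container" "$names"; then
        echo "error: container $container not found in pod $pod (available: $names)" >&2
        return 1
    fi

    oc rsync --container "$container" -n "$ns" "pod/$pod:$src" "$dest"
}
```

For example, `rsync_from_container openshift-storage rook-ceph-tools-56f88ff6cb-tvpg5 foo /etc/redhat-release /tmp` would stop at the name check instead of reaching the copy strategies. Note that this only catches a wrong or stale container name; it does not tell you whether a listed container is currently running, so a container that is mid-restart can still fail inside oc rsync.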
thanks for the finding, posted a PR with a fix
Can't reproduce the issue now:

oc version --client
Client Version: 4.11.0-0.nightly-2022-06-22-015220
Kustomize Version: v4.5.4

oc rsync --container foo pod/thanos-querier-7d869ccc58-mxlf7:/etc/redhat-release /tmp
error: container foo not found in pod thanos-querier-7d869ccc58-mxlf7
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069