Bug 2057633 - oc rsync reports misleading error when container is not found
Summary: oc rsync reports misleading error when container is not found
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: oc
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: 4.11.0
Assignee: Filip Krepinsky
QA Contact: zhou ying
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-02-23 18:44 UTC by Martin Bukatovic
Modified: 2022-08-10 10:51 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Additional validations and error messages were added to oc rsync for the case where the targeted container is not running.
Clone Of:
Environment:
Last Closed: 2022-08-10 10:50:55 UTC
Target Upstream Version:
Embargoed:



Links
| System | ID | Private | Priority | Status | Summary | Last Updated |
|---|---|---|---|---|---|---|
| Github | openshift oc pull 1087 | 0 | None | open | Bug 2057633: add validations for a pod & container to rsync | 2022-03-07 16:21:19 UTC |
| Red Hat Product Errata | RHSA-2022:5069 | 0 | None | None | None | 2022-08-10 10:51:12 UTC |

Description Martin Bukatovic 2022-02-23 18:44:20 UTC
Description of problem
======================

When I use oc rsync with an invalid container name to fetch data from a
container, the error message complains that no copy strategies are available
instead of reporting the root cause (the container doesn't exist).

This problem is similar to the old, now-fixed OCP 3.x BZ 1314817, but this time
it concerns a missing container instead of a missing pod.

Version-Release number of selected component
============================================

OCP 4.10.0-0.nightly-2022-02-22-093600

How reproducible
================

100%

Steps to Reproduce
==================

1. Run `oc rsync --container foo -n NS pod/POD:/etc/redhat-release /tmp/`, where
   NS and POD are valid references to a running pod in a namespace, but the
   container foo doesn't exist.
2. Run the same command again, this time with the additional option
   `--loglevel=4`.

Actual results
==============

The oc rsync command fails, complaining that rsync and tar are missing in the
container:

```
$ oc rsync --container foo -n openshift-storage pod/rook-ceph-tools-56f88ff6cb-tvpg5:/etc/redhat-release /tmp
WARNING: cannot use rsync: rsync not available in container
WARNING: cannot use tar: tar not available in container
error: No available strategies to copy.
```

The same run with `--loglevel=4` shows that the problem is actually elsewhere:

```
$ oc rsync --loglevel=4  --container foo -n openshift-storage pod/rook-ceph-tools-56f88ff6cb-tvpg5:/etc/redhat-release /tmp
I0223 19:37:10.124688   11013 copy_rsync.go:59] Rsh command: oc rsh --container=foo --loglevel=4 --namespace=openshift-storage
I0223 19:37:10.125136   11013 copy_rsync.go:82] Copying files with rsync
I0223 19:37:10.125380   11013 exec_local.go:19] Local executor running command: rsync --blocking-io --archive --no-owner --no-group --omit-dir-times --numeric-ids -v -e oc rsh --container=foo --loglevel=4 --namespace=openshift-storage rook-ceph-tools-56f88ff6cb-tvpg5:/etc/redhat-release /tmp
I0223 19:37:10.975300   11013 exec_local.go:26] Error from local command execution: exit status 12
I0223 19:37:11.100162   11013 exec_remote.go:29] Remote executor running command: rsync --version
I0223 19:37:11.607211   11013 exec_remote.go:54] Error from remote execution: container foo is not valid for pod rook-ceph-tools-56f88ff6cb-tvpg5
I0223 19:37:11.608690   11013 util.go:25] 
I0223 19:37:11.609671   11013 util.go:26] error: container foo is not valid for pod rook-ceph-tools-56f88ff6cb-tvpg5
I0223 19:37:11.610786   11013 copy_multi.go:30] Error output:
WARNING: cannot use rsync: rsync not available in container
I0223 19:37:11.613163   11013 copy_tar.go:119] Copying files with tar
I0223 19:37:11.614579   11013 copy_tar.go:147] Creating local tar file /tmp/rsync4131236598 from remote path /etc/redhat-release
I0223 19:37:11.615580   11013 copy_tar.go:203] Tarring /etc/redhat-release remotely
I0223 19:37:11.616821   11013 copy_tar.go:227] Remote tar command: tar -C /etc -c redhat-release
I0223 19:37:11.618229   11013 exec_remote.go:29] Remote executor running command: tar -C /etc -c redhat-release
I0223 19:37:12.124460   11013 exec_remote.go:54] Error from remote execution: container foo is not valid for pod rook-ceph-tools-56f88ff6cb-tvpg5
I0223 19:37:12.126566   11013 exec_remote.go:29] Remote executor running command: tar --version
I0223 19:37:12.638972   11013 exec_remote.go:54] Error from remote execution: container foo is not valid for pod rook-ceph-tools-56f88ff6cb-tvpg5
I0223 19:37:12.640420   11013 util.go:25] 
I0223 19:37:12.641621   11013 util.go:26] error: container foo is not valid for pod rook-ceph-tools-56f88ff6cb-tvpg5
I0223 19:37:12.643289   11013 copy_multi.go:30] Error output:
WARNING: cannot use tar: tar not available in container
error: No available strategies to copy
```

Expected results
================

The oc rsync command reports that the container is missing:

```
$ oc rsync --container foo -n openshift-storage pod/rook-ceph-tools-56f88ff6cb-tvpg5:/etc/redhat-release /tmp
Error from server (NotFound): container "foo" not found
```
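
On a client that still shows the misleading message, a similar check can be approximated on the caller side before running oc rsync. The following is only a sketch: `container_exists` is a hypothetical helper, not part of oc. It compares the requested name against the pod's `spec.containers` only, so init and ephemeral containers are not considered.

```shell
# Hypothetical helper (not part of oc): succeed only if CONTAINER is listed
# in the pod's spec.containers.
# Usage: container_exists NAMESPACE POD CONTAINER
# Returns 0 if found, 1 if not found, 2 if the pod could not be queried.
container_exists() {
  ns="$1"; pod="$2"; wanted="$3"
  # Prints the regular container names separated by spaces; oc itself
  # reports a missing pod or namespace on stderr.
  names=$(oc get pod "$pod" -n "$ns" \
    -o jsonpath='{.spec.containers[*].name}') || return 2
  for name in $names; do
    [ "$name" = "$wanted" ] && return 0
  done
  echo "error: container $wanted not found in pod $pod" >&2
  return 1
}
```

For example, `container_exists openshift-storage rook-ceph-tools-56f88ff6cb-tvpg5 foo && oc rsync --container foo ...` would skip the copy and print the root cause instead of the warnings about rsync and tar.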

Additional info
===============

When the pod or the namespace is missing, oc rsync reports the root cause
correctly:

```
$ oc rsync -n openshift-storage pod/rook-ceph-tools-56f88ff6cb-tvpg6:/etc/redhat-release /tmp/etc
Error from server (NotFound): pods "rook-ceph-tools-56f88ff6cb-tvpg6" not found
$ oc rsync -n openshift-storaga pod/rook-ceph-tools-56f88ff6cb-tvpg6:/etc/redhat-release /tmp/etc
Error from server (NotFound): namespaces "openshift-storaga" not found
```

I hit this problem while debugging a mysterious failure of an automated test
case, and after some investigation realized that the error here is a red
herring. In my case, the container had ceased to exist, which resulted in a
misleading error about rsync disappearing from the container.

Comment 1 Martin Bukatovic 2022-03-03 18:54:20 UTC
This can happen when oc rsync tries to fetch data while the target container is being restarted for some reason (liveness probe failure, high memory utilization, ...). When checked later, the pod, the container, and the tools are all present, even though the error message complained about tools missing from the container image.
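
To tell this restart case apart from a truly missing tool, a test can check whether the container is running at the moment of the copy. A minimal sketch, assuming a hypothetical helper `container_running` (not part of oc) that reads the container's state from the pod status:

```shell
# Hypothetical helper (not part of oc): succeed only if CONTAINER in POD
# currently reports a running state.
# Usage: container_running NAMESPACE POD CONTAINER
# Returns 0 if running, 1 if not running, 2 if the pod could not be queried.
container_running() {
  ns="$1"; pod="$2"; container="$3"
  # state.running.startedAt is only set while the container is running, so
  # an empty result means waiting, terminated, or no such container.
  started=$(oc get pod "$pod" -n "$ns" \
    -o jsonpath="{.status.containerStatuses[?(@.name==\"$container\")].state.running.startedAt}") || return 2
  [ -n "$started" ]
}
```

The check is inherently racy, because the container can still restart between the check and the copy, so it narrows down the cause rather than preventing the failure.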

Comment 2 Filip Krepinsky 2022-03-07 16:23:42 UTC
Thanks for the finding. I posted a PR with a fix.

Comment 4 zhou ying 2022-06-22 02:58:53 UTC
I can't reproduce the issue now:

oc version --client
Client Version: 4.11.0-0.nightly-2022-06-22-015220
Kustomize Version: v4.5.4

oc rsync --container foo pod/thanos-querier-7d869ccc58-mxlf7:/etc/redhat-release /tmp
error: container foo not found in pod thanos-querier-7d869ccc58-mxlf7

Comment 6 errata-xmlrpc 2022-08-10 10:50:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

