Bug 1299578 - sosreport output in container, not host
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: sos
Version: 7.2
Hardware: x86_64 Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Pavel Moravec
QA Contact: BaseOS QE - Apps
Depends On:
Blocks: 1277223 1299794
Reported: 2016-01-18 11:43 EST by Chris Evich
Modified: 2016-11-11 02:48 EST
CC List: 12 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1277223
Environment:
Last Closed: 2016-11-11 02:48:58 EST
Type: Bug


Comment 2 Bryn M. Reeves 2016-01-19 08:25:30 EST
Do you have an environment where I can test this?

I tested the original (7.2) image and this worked correctly; verbose logs may provide some insight but ideally we'd need somewhere to be able to run tests and possibly instrument the installed packages.
Comment 3 Bryn M. Reeves 2016-01-19 08:26:41 EST
Also what is the version of the sos package in use? The image hashes are not meaningful for the sos maintainers in terms of a specific package NVR.
Comment 4 Chris Evich 2016-01-19 11:37:25 EST
(In reply to Bryn M. Reeves from comment #2)
> Do you have an environment where I can test this?

Bug discovered in CI testing, so "not really".  However, it's easy to reproduce on an atomic-host.  The steps come from the docs: https://access.redhat.com/documentation/en/red-hat-enterprise-linux-atomic-host/version-7/getting-started-with-containers/#using_the_atomic_tools_container_image

# atomic run rhel7/rhel-tools
# sosreport
...cut...
# ls /host/var/tmp
# exit

The rhel-tools container is run with the '--rm' flag (RUN label), so when you exit the container, all data goes *poof*, including the sosreport output.

What the docs indicate (and what seems to be the desirable behavior) is that the report itself should be written to /host/var/tmp.  That way the report will persist after the container is removed (implicitly, by exiting from it).

Further/OTOH, Bug 1299794 suggests that sosreport should be collecting data from /host instead of the container's filesystem.  This is also likely desired behaviour, since the host's state is more useful than the container (which will be short lived anyway).

To be clear though, this bug is more about the output files from sosreport not being stored in a safe/useful place.
Comment 5 Chris Evich 2016-01-19 11:39:04 EST
(In reply to Bryn M. Reeves from comment #3)
> Also what is the version of the sos package in use? The image hashes are not
> meaningful for the sos maintainers in terms of a specific package NVR.

I'll get ya that detail in a sec.  This was originally reproduced against a candidate/staging image.  I'll do it again using the latest os-tree and available rhel-tools image...
Comment 6 Bryn M. Reeves 2016-01-19 11:49:15 EST
> Bug discovered in CI testing, so "not really".  However, it's easy to 
> reproduce on an atomic-host.  The steps come from the docs:

I don't have access to an atomic host environment or a system where I can easily build one (sos is not my 'day job' - I work on the upstream project in my own time and help out with RHEL packaging and product integration when I can).

Since this is entirely dependent on the particular environment of the Atomic host we (the sos maintainers) really need some help to be able to work on this in a timely fashion.

> To be clear though, this bug is more about the output files from sosreport not 
> being stored in a safe/useful place.

They are very unlikely to be separate issues.
Comment 7 Chris Evich 2016-01-19 12:19:29 EST
[root@bz1299578 ~]# atomic host status
  TIMESTAMP (UTC)         VERSION   ID             OSNAME               REFSPEC                                                        
* 2015-12-03 19:40:36     7.2.1     aaf67b91fa     rhel-atomic-host     rhel-atomic-host-ostree:rhel-atomic-host/7/x86_64/standard

[root@bz1299578 ~]# docker images
REPOSITORY                                    TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
registry.access.redhat.com/rhel7/rhel-tools   latest              fd2acbeb2b97        6 weeks ago         1.159 GB

...

[root@bz1299578 ~]# atomic run rhel7/rhel-tools
Using default tag: latest
fd2acbeb2b97: Download complete 
6c3a84d798dc: Download complete 
Status: Downloaded newer image for registry.access.redhat.com/rhel7/rhel-tools:latest

docker run -it --name rhel-tools --privileged --ipc=host --net=host --pid=host -e HOST=/host -e NAME=rhel-tools -e IMAGE=rhel7/rhel-tools -v /run:/run -v /var/log:/var/log -v /etc/localtime:/etc/localtime -v /:/host rhel7/rhel-tools
[root@bz1299578 /]# sosreport


sosreport (version 3.2)

This command will collect diagnostic and configuration information from
this Red Hat Atomic Host system.

An archive containing the collected information will be generated in
/var/tmp and may be provided to a Red Hat support representative.

Any information provided to Red Hat will be treated in accordance with
the published support policies at:

  https://access.redhat.com/support/

The generated archive may contain data considered sensitive and its
content should be reviewed by the originating organization before being
passed to any third party.

Press ENTER to continue, or CTRL-C to quit.

Please enter your first initial and last name [bz1299578.novalocal]: 
Please enter the case id that you are generating this report for []: 

 Setting up archive ...
 Setting up plugins ...
[plugin:pcp] /var/log/pcp/pmlogger/bz1299578.novalocal not found
[plugin:sar] sar: could not list /var/log/sa
 Running plugins. Please wait ...

  Running 70/70: yum...                      
Creating compressed archive...

Your sosreport has been generated and saved in:
  /var/tmp/sosreport-bz1299578.novalocal-20160119171125.tar.xz

The checksum is: 31423813538ae9c973b704573053cc01

Please send this file to your support representative.

[root@bz1299578 /]# ls /var/tmp
sosreport-bz1299578.novalocal-20160119171125.tar.xz
sosreport-bz1299578.novalocal-20160119171125.tar.xz.md5
[root@bz1299578 /]# ls /host/var/tmp
[root@bz1299578 /]# ls /host/tmp
ks-script-BCtjUK  ks-script-CJy00F
[root@bz1299578 /]# exit
exit
[root@bz1299578 ~]# ls /var/tmp
[root@bz1299578 ~]# ls /tmp
ks-script-BCtjUK  ks-script-CJy00F


Good news is, I see we no longer use --rm for the rhel-tools container, so you can still access the data (as long as you don't rm the container).

[root@bz1299578 ~]# docker ps -a
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS                        PORTS               NAMES
a77cc864b0a8        rhel7/rhel-tools    "/usr/bin/bash"     23 minutes ago      Exited (137) 38 seconds ago                       rhel-tools
[root@bz1299578 ~]# docker start -it rhel-tools
flag provided but not defined: -it
See 'docker start --help'.
[root@bz1299578 ~]# docker start rhel-tools
rhel-tools
[root@bz1299578 ~]# docker exec -it rhel-tools bash
[root@bz1299578 /]# ls /var/tmp
sosreport-bz1299578.novalocal-20160119171125.tar.xz
sosreport-bz1299578.novalocal-20160119171125.tar.xz.md5
[root@bz1299578 /]# 

In any case, going by the getting started guide, the expectation is that the output files should appear in the host's /var/tmp, seen as /host/var/tmp from inside the container.
Comment 8 Bryn M. Reeves 2016-01-19 14:02:52 EST
Here's the problem:

    # sosreport -vvv --debug
    policy sysroot is '/host' (in_container=False)
    set sysroot to '/' (default)
    
        sosreport (version 3.2)

    [...]

The in_container() test that we put in place for 7.1 is no longer evaluating true in the rhel-tools container.

For Red Hat distros this is just:

        if ENV_CONTAINER_UUID in os.environ:
            self._in_container = True

ENV_CONTAINER_UUID is the container UUID environment variable we were told was guaranteed to be present when running in a container:

    # Container environment variables on Red Hat systems.
    ENV_CONTAINER_UUID = 'container_uuid'

We added this extra check to prevent confusion when sos is run in a non-container environment and the administrator happens to have an environment variable named 'HOST'.

If there is some other reliable check we can implement then we can easily switch over to that - otherwise we'd probably need to add an additional command line switch as the auto detection needs to be both safe and reliable (this was tested and worked as intended on 7.1).
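
For readers following along, the failing decision can be reproduced outside sos. This is only a sketch, not the sos source: the two variable names come from the policy code quoted above, while the function name policy_sysroot and the '/' default used when HOST is unset are assumptions made for illustration. The sample environment uses the variables passed by the 'docker run' line in comment 7, plus the 'container=docker' setting reported later in this bug.

```python
import os

# Names taken from the sos 3.2 Red Hat policy quoted above.
ENV_CONTAINER_UUID = 'container_uuid'
ENV_HOST_SYSROOT = 'HOST'


def policy_sysroot(environ=os.environ):
    """Return (host_sysroot, in_container) as the 3.2-era check decides it.

    Hypothetical helper: the '/' default for an unset HOST is an assumption.
    """
    in_container = ENV_CONTAINER_UUID in environ
    host_sysroot = environ.get(ENV_HOST_SYSROOT, '/')
    return host_sysroot, in_container


# Environment of the rhel-tools container: HOST is set, container_uuid is not.
rhel_tools_env = {'HOST': '/host', 'NAME': 'rhel-tools',
                  'IMAGE': 'rhel7/rhel-tools', 'container': 'docker'}

host_sysroot, in_container = policy_sysroot(rhel_tools_env)
print("policy sysroot is '%s' (in_container=%s)" % (host_sysroot, in_container))
```

With HOST set but container_uuid absent, the policy knows where the host root is mounted yet declines to use it, which matches the debug line at the top of this comment.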
Comment 9 Bryn M. Reeves 2016-01-19 14:06:39 EST
Debug patch to log the policy sysroot results:

diff -up sos/sosreport.py.orig sos/sosreport.py
--- sos/sosreport.py.orig	2016-01-19 18:43:51.198111934 +0000
+++ sos/sosreport.py	2016-01-19 18:52:45.050111934 +0000
@@ -691,6 +691,10 @@ class SoSReport(object):
 
         msg = "default"
         host_sysroot = self.policy.host_sysroot()
+
+        self.soslog.debug("policy sysroot is '%s' (in_container=%s)"
+                          % (host_sysroot, self.policy.in_container()))
+
         # set alternate system root directory
         if self.opts.sysroot:
             msg = "cmdline"
Comment 10 Chris Evich 2016-01-19 14:13:09 EST
Bryn,

Ya, there's no more UUID exposed automatically by the looks of it.  Checking 'container=docker' may be the preferred way.  That's the way systemd works IIUC.  I'll ask around and find out for sure.
Comment 11 Bryn M. Reeves 2016-01-19 14:15:27 EST
Dan mentions container_uuid in this post:

http://developerblog.redhat.com/2014/11/06/introducing-a-super-privileged-container-concept/

As it's apparently no longer being set we may need to change to just checking for 'container=docker' - I see this set when running the rhel-tools image on Chris' test system.
Comment 12 Chris Evich 2016-01-19 14:24:58 EST
hrmmm, looks like fedora doesn't set 'container=docker', and checking an env. var seems a bit error-prone to me, anyone can set/unset them.  I've got a meeting with Dan and team in an hour or so, I'll ask then.
Comment 13 Bryn M. Reeves 2016-01-19 14:26:38 EST
[policies/redhat] use 'container' variable for in_container() test

Signed-off-by: Bryn M. Reeves <bmr@redhat.com>

diff -up sos/policies/redhat.py.orig sos/policies/redhat.py
--- sos/policies/redhat.py.orig	2016-01-19 19:23:42.283111934 +0000
+++ sos/policies/redhat.py	2016-01-19 19:23:50.709111934 +0000
@@ -83,8 +83,9 @@ class RedHatPolicy(LinuxPolicy):
         """Check if sos is running in a container and perform container
         specific initialisation based on ENV_HOST_SYSROOT.
         """
-        if ENV_CONTAINER_UUID in os.environ:
-            self._in_container = True
+        if ENV_CONTAINER in os.environ:
+            if os.environ[ENV_CONTAINER] == 'docker':
+                self._in_container = True
         if ENV_HOST_SYSROOT in os.environ:
             self._host_sysroot = os.environ[ENV_HOST_SYSROOT]
         use_sysroot = self._in_container and self._host_sysroot != '/'
@@ -124,7 +125,7 @@ class RedHatPolicy(LinuxPolicy):
         return self.host_name()
 
 # Container environment variables on Red Hat systems.
-ENV_CONTAINER_UUID = 'container_uuid'
+ENV_CONTAINER = 'container'
 ENV_HOST_SYSROOT = 'HOST'
Comment 14 Bryn M. Reeves 2016-01-19 14:28:20 EST
# sosreport -vvv --debug
set sysroot to '/host' (policy)

sosreport (version 3.2)

This command will collect diagnostic and configuration information from
this Red Hat Atomic Host system.

An archive containing the collected information will be generated in
/host/var/tmp and may be provided to a Red Hat support representative.

Any information provided to Red Hat will be treated in accordance with
the published support policies at:

  https://access.redhat.com/support/

The generated archive may contain data considered sensitive and its
content should be reviewed by the originating organization before being
passed to any third party.

Press ENTER to continue, or CTRL-C to quit.[...]
[...]
Your sosreport has been generated and saved in:
  /host/var/tmp/sosreport-bz1299578.novalocal-20160119192702.tar.xz

The checksum is: 0eadbb426b100e50d7d731d8550ffc25

Please send this file to your support representative.
Comment 15 Bryn M. Reeves 2016-01-19 14:40:41 EST
> hrmmm, looks like fedora doesn't set 'container=docker', and checking an env. 
> var seems a bit error-prone to me, anyone can set/unset them.

I don't disagree: the original suggestion was that we should just test for an environment variable named 'HOST' containing a path - using 'container_uuid' was agreed on as something that would be reliable for Atomic and RHEL and safe for users in non-container environments who happen to have HOST in their environment (automatic sysroot has only been implemented for Red Hat and Fedora policies so far - users of other distributions need to give a sysroot path on the command line).

It's easy to change and easy for us to test other things but we need whatever we use to be robust - systemd stats '/proc/1/root' and '/' to determine if it is running in a chroot - we can add a similar check in the sos policy and only then try to use the 'HOST' or 'container' variables.
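
A rough sketch of what such a check might look like in Python follows. It is an illustration of the idea only, not systemd's or sos's implementation: the function names are hypothetical, and whether the result is meaningful inside a super-privileged container started with --pid=host would still need testing on a real Atomic host.

```python
import os


def same_root(own_root='/', init_root='/proc/1/root'):
    """Return True if both paths resolve to the same directory.

    Two paths name the same directory when device and inode numbers match.
    """
    own = os.stat(own_root)
    init = os.stat(init_root)
    return (own.st_dev, own.st_ino) == (init.st_dev, init.st_ino)


def probably_chrooted():
    """Best-effort wrapper: None means the check could not be performed."""
    try:
        return not same_root()
    except OSError:
        # /proc may be missing, or /proc/1/root unreadable without privileges.
        return None
```

Returning None rather than guessing keeps the check safe: the policy could fall back to the default sysroot whenever the comparison cannot be made, and only consult 'HOST' or 'container' when the roots are known to differ.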
Comment 16 Chris Evich 2016-01-19 15:33:25 EST
(In reply to Bryn M. Reeves from comment #15)
> whatever we use to be robust - systemd stats '/proc/1/root' and '/' to
> determine if it is running in a chroot - we can add a similar check in the
> sos policy and only then try to use the 'HOST' or 'container' variables.

That sounds better to me.  Unfortunately Dan wasn't in my meeting today, and this might be something we need a wider audience to resolve.  Better to fix it the right way once than to have to come back because it broke again.  I'll reach out internally and cc you...
Comment 17 Daniel Walsh 2016-01-20 08:54:51 EST
We are not setting container_uuid.  ENV container=docker is set in the RHEL base container, and maybe should be set for now by Fedora, but I am not even sure this is correct, since you could use other tools like runc and perhaps rkt in the future to run a RHEL7 image.

For now though this is the best we have.
Comment 18 Bryn M. Reeves 2016-01-20 09:59:17 EST
The particular use-case here is Atomic with docker; at the time we did this work initially we were told that 'container_uuid' was "guaranteed" to be present in this environment - that is no longer the case so we need to find a better test.

For now I am fine if that is more-or-less specific to the Atomic/docker setup - it is relatively easy for us to enable sysroot autodetection for other environments later (assuming there's something reliable to key off).

The main point here and the reason there is some time pressure is that this is functionality we've documented for a year already and that is now broken due to the environment change - this affects Cockpit and other integration pieces so it's something we really need to address, and ideally in a way that's going to be robust for a reasonable number of updates.
Comment 19 Chris Evich 2016-01-20 10:43:05 EST
(In reply to Bryn M. Reeves from comment #18)
> The particular use-case here is Atomic with docker; at the time we did this

Agreed.  Though to be clear, you mean RHEL Atomic, CentOS Atomic, Fedora Atomic etc.  ya?

I'd also suggest targeting specific "images" as well for support status, since sosreport would be dependent on the content (in the case of container=docker).  e.g. Some random slackware docker image may not use/support this env-var.

Dan's also correct in that docker is just the tooling, so depending on its specifics vs other tooling will lead to a similar problem if/when other tooling comes on the scene.  For example, maybe later it could be 'container=rkt' or w/e.

For reporting on the host in the Atomic context, obviously the host details need to be exposed somehow (/host or w/e).  To me that suggests maybe a circular-reference check is possible.  For example, sosreport process from the POV of the host will have some key details that exactly match the sosreport process from the POV of the container.  For example, details from /proc/pid/maps (presuming address space is randomized).

Anyway, it's a tricky problem.  In the interest of speed, and finding an 80% fix, maybe the env. var check is the best we have for now :S
Comment 20 Bryn M. Reeves 2016-01-20 10:52:42 EST
> Though to be clear, you mean RHEL Atomic, CentOS Atomic, Fedora Atomic etc.  ya?

Right; "Atomic with docker on Red Hat-like distros" is probably fair (although the priority in this bug is clearly RHEL/Atomic).

> For example, maybe later it could be 'container=rkt' or w/e.

That's fine and we can add them as they come up - this is the reason that container_uuid was appealing for us in the first place - it is supported by other container systems such as LXC and would have given us automatic enablement of this feature on those platforms.

With all that said however I do have concerns that even with a "portable" check to key off that the requirements for sos in a super-privileged container are sufficiently special that the test portability doesn't matter - specific testing and possibly changes will be required to make sure things work correctly in each environment.
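
To make "add them as they come up" concrete, here is one possible shape for the check, sketched under assumptions: the names KNOWN_RUNTIMES and in_container are illustrative and are not taken from the sos tree, and 'docker' is the only value that has actually been confirmed in this bug.

```python
import os

ENV_CONTAINER = 'container'

# 'docker' is the only value confirmed in this bug; anything added here later
# (for example 'rkt') would be an assumption until verified on that platform.
KNOWN_RUNTIMES = ('docker',)


def in_container(environ=os.environ, known=KNOWN_RUNTIMES):
    """True if the 'container' variable names a runtime we recognise."""
    return environ.get(ENV_CONTAINER) in known
```

Keeping the accepted values in one tuple means enabling another runtime is a one-line change, while each new value still gets the platform-specific testing described above before it is added.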
Comment 21 Chris Evich 2016-01-21 10:25:54 EST
(In reply to Bryn M. Reeves from comment #8)
> used are all manifestations of the same bug - that due to the missing
> "container_uuid" that these versions check for sos thinks it is running in a

That reminds me.  There is a use-case for using sosreport to gather the container's details rather than the hosts.  Maybe that needs to be an RFE, but in any case it might be prudent to not "fix" this problem so thoroughly that it can't be bypassed when needed.
Comment 22 Bryn M. Reeves 2016-01-21 10:44:56 EST
We discussed that use case last year - it should be possible now using '--sysroot=/' although there may be some cases (command execution especially) that do not currently work as desired - right now this isn't a supported feature so we have a bit of time to get it right.
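
The override works because of the order in which the sysroot is chosen. The sketch below is inferred from the "set sysroot to ... (default|policy|cmdline)" log lines and the patch context earlier in this bug; it is not a copy of sosreport.py, and the function name choose_sysroot is hypothetical.

```python
def choose_sysroot(cmdline_sysroot, in_container, host_sysroot):
    """Return (sysroot, source); source is 'cmdline', 'policy' or 'default'."""
    if cmdline_sysroot:
        # An explicit --sysroot always wins, including --sysroot=/.
        return cmdline_sysroot, 'cmdline'
    if in_container and host_sysroot != '/':
        return host_sysroot, 'policy'
    return '/', 'default'


for args in [(None, False, '/host'),   # 7.2 image: container detection fails
             (None, True, '/host'),    # detection works: report on the host
             ('/', True, '/host')]:    # --sysroot=/ : report on the container
    print("set sysroot to '%s' (%s)" % choose_sysroot(*args))
```

Under this ordering, fixing the automatic detection does not remove the ability to point sos back at the container's own filesystem.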
Comment 23 Pavel Moravec 2016-01-25 06:52:44 EST
Hi Chris,
could you please confirm that the change per https://bugzilla.redhat.com/show_bug.cgi?id=1299578#c13 correctly distinguishes between container and host, and whether this check is expected to be stable from now on?

I plan to backport the change to 7.2.z batch 2, where I should hand over the errata to QE within a few days. Feedback within that time frame would be welcome.
Comment 24 Bryn M. Reeves 2016-01-25 07:17:27 EST
> could you please confirm the change per https://bugzilla.redhat.com
> /show_bug.cgi?id=1299578#c13 correctly distinguishes between container and 
> host, and if this check is expected to be stable for now on?

That's not really the important question: using container_uuid correctly distinguished container vs. host until it was abruptly removed from the environment between 7.1 and 7.2.

What we really need to know before committing to this new environment variable is whether we can rely on this not also suddenly going away and breaking HOST auto-detection (at least within the support limits of the current Atomic product).
Comment 25 Chris Evich 2016-01-25 09:55:22 EST
Agreed.  I sent a message last week to atomic-devel, https://lists.projectatomic.io/projectatomic-archives/atomic-devel/2016-January/msg00039.html but there are no replies so far.

If there's nothing better, then I think what Dan said above is (unfortunately) the case: "...this is the best we have.".  It's not set in the Fedora image (verified this last week) but I'm not sure about CentOS.

We probably need separate bugs to ensure those changes are made (at least in fedora images).  I'll put that on my TODO list.
Comment 26 Chris Evich 2016-01-25 12:41:18 EST
(In reply to Pavel Moravec from comment #23)
> I plan to backport the change to 7.2.z batch 2 where I should hand-over


Oh, that reminds me, should this bug be flagged against 7.2 then and added to the tracking bug 1287902?  Right now it's set up for 7.3.
Comment 27 Bryn M. Reeves 2016-01-25 12:48:35 EST
No - it's correct as-is: a bug needs to get accepted for rhel-X.Y (7.3 in this case) before it can be requested for rhel-x.y.z (7.2.z here).
Comment 28 Chris Evich 2016-01-27 10:12:46 EST
Opened Bug 1302354 for Fedora.

Any reason anyone knows we can't make this BZ public?
Comment 29 Chris Evich 2016-02-10 12:08:42 EST
Talked with Bryn, "ok" to make this public.  Waiting on PM and hoping for confirmation 'container' env. var isn't going to vanish like the last env. var.
Comment 30 Pavel Moravec 2016-02-21 07:09:22 EST
(In reply to Chris Evich from comment #29)
> Talked with Bryn, "ok" to make this public.  Waiting on PM and hoping for
> confirmation 'container' env. var isn't going to vanish like the last env.
> var.

Hello,
have you received confirmation that the 'container' env. var name will stay stable?
Comment 31 Chris Evich 2016-02-22 11:17:00 EST
(In reply to Pavel Moravec from comment #30)
> (In reply to Chris Evich from comment #29)
> Hello,
> has you received such confirmation the 'container' env. name will stay
> stable?

I think Dan's indication above in c17 still stands.  This technology is so new and developing, it's really hard to give any guarantees.  For the foreseeable future, this is probably the best standard Red Hat can stick to.  We can be ready for change by implementing sufficient testing (which I have).  As soon as this bug is closed/fixed, I'll add checks to my test to verify sosreport recognizes the host-type (in welcome message) and places the output in the correct location.
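
A check of that kind could be as small as the sketch below. It is only an illustration, not the actual CI test: the helper names are made up, and the parsing relies on the "saved in:" wording seen in the sosreport output pasted earlier in this bug.

```python
import re

# The archive path is printed on the line after "...generated and saved in:".
SAVED_RE = re.compile(r'saved in:\s+(\S+)')


def archive_path(output):
    """Extract the archive path from sosreport's console output, or None."""
    match = SAVED_RE.search(output)
    return match.group(1) if match else None


def saved_on_host(output, host_sysroot='/host'):
    """True if the archive landed under <host_sysroot>/var/tmp/."""
    path = archive_path(output)
    prefix = host_sysroot.rstrip('/') + '/var/tmp/'
    return path is not None and path.startswith(prefix)


broken = ("Your sosreport has been generated and saved in:\n"
          "  /var/tmp/sosreport-bz1299578.novalocal-20160119171125.tar.xz\n")
fixed = ("Your sosreport has been generated and saved in:\n"
         "  /host/var/tmp/sosreport-bz1299578.novalocal-20160119192702.tar.xz\n")

print(saved_on_host(broken), saved_on_host(fixed))
```

The two sample strings are the outputs from comment 7 (unpatched) and comment 14 (patched), so the helper distinguishes exactly the two behaviors this bug is about.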
Comment 32 Daniel Walsh 2016-02-22 17:12:18 EST
container=docker is provided by the base image; I think Fedora/CentOS and RHEL all provide this.

Docker does not do this by default.
Comment 33 Bryn M. Reeves 2016-02-23 05:00:49 EST
According to Chris Fedora does _not_ currently set this.

It would be good to confirm that this is going to be present in Red Hat distributions before we commit to it - we look a bit stupid when our own integration keeps breaking because of silly things like disappearing environment variables.
Comment 34 Chris Evich 2016-02-23 13:08:54 EST
(In reply to Bryn M. Reeves from comment #33)
> According to Chris Fedora does _not_ currently set this.

I opened this one for fedora
https://bugzilla.redhat.com/show_bug.cgi?id=1302354
Comment 35 Chris Evich 2016-05-10 12:10:03 EDT
Verified this problem still exists in sos in Atomic 7.2.4 rhel-tools:7.2-23 image ID 7bd6bdb83046
Comment 36 Bryn M. Reeves 2016-05-11 07:41:22 EDT
The sos version is always more helpful than any Atomic/rhel-tools version for sos bugs. That said we have not updated anything here (bug is still NEW & Pavel has not granted a devel_ack yet) so we would not expect anything to have changed.
Comment 37 Chris Evich 2016-06-21 15:25:17 EDT
Verified problem still exists in Atomic 7.2.5 rhel7/rhel-tools:7.2-28 ID 1df53c6954ce
Comment 38 Pavel Moravec 2016-08-30 10:21:50 EDT
FYI, this has been fixed in sos 3.3, which we rebase to in RHEL 7.3. If you are willing to test it, here is the package:

http://download-node-02.eng.bos.redhat.com/brewroot/////packages/sos/3.3/2.el7/noarch/sos-3.3-2.el7.noarch.rpm
Comment 39 Chris Evich 2016-08-31 12:42:18 EDT
(In reply to Pavel Moravec from comment #38)
> FYI this has been fixed in sos 3.3 we rebase to in RHEL7.3. If willing to
> test it, here is the package:

(on Atomic 7.2.6)
# atomic run registry.access.redhat.com/rhel7/rhel-tools:latest
Trying to pull repository registry.access.redhat.com/rhel7/rhel-tools
...cut...
# yum upgrade http://download-node-02.eng.bos.redhat.com/brewroot/////packages/sos/3.3/2.el7/noarch/sos-3.3-2.el7.noarch.rpm
...cut...
Transaction test succeeded
Running transaction
  Updating   : sos-3.3-2.el7.noarch                                                       1/2 
  Cleanup    : sos-3.2-35.el7_2.3.noarch                                                  2/2 
...cut...
# sosreport
...cut...
Creating compressed archive...

Your sosreport has been generated and saved in:
  /var/tmp/sosreport-$HOSTNAME-20160831163602.tar.xz

# ls -la /var/tmp
(it's there)

# exit
# ls -la /var/tmp
total 12
drwxrwxrwt.  5 root root 4096 Aug 31 16:08 .
drwxr-xr-x. 24 root root 4096 Aug 29 05:49 ..
drwx------.  3 root root 4096 Aug 31 16:35 sos.1CyqMn

# ls -la /var/tmp/sos.1CyqMn/
total 178136
drwx------.  3 root root      4096 Aug 31 16:35 .
drwxrwxrwt.  5 root root      4096 Aug 31 16:08 ..
drwx------. 15 root root      4096 Aug 31 16:35 sosreport-$HOSTNAME-20160831160524
-rw-------.  1 root root 172829696 Aug 31 16:35 sosreport-$HOSTNAME-20160831160524.tar

So the tarball does appear to be getting copied onto the host, but shouldn't it be at the top-level of /var/tmp and not nested beneath whatever 'sos.1CyqMn' is?

Just double-checking.  Otherwise we can mark this as VERIFIED.  Please let me know.
Comment 40 Chris Evich 2016-08-31 12:46:05 EDT
Update: Ahh, I see now that sosreport-$HOSTNAME-20160831163602.tar.xz was in the container's /var/tmp.  That other file *60524 was from a prior attempt that I ctrl-c'd because the host's root filesystem filled up and sosreport "hung".

So, it appears that sosreport is _not_ copying the tarball onto the host as I would have expected.  If it doesn't, and someone accidentally runs 'docker rm rhel-tools', then the sosreport data will be lost!
Comment 41 Bryn M. Reeves 2016-09-01 05:58:14 EDT
Please attach the sos.log from the run that left the tarball inside the container and also the complete output of 'env', run from the same environment as the report (i.e. including HOST and whatever else is set for the container).
Comment 43 Bryn M. Reeves 2016-09-01 09:46:19 EDT
Manually updating the image to sos-3.3-2.el7 gives the expected results:

<bmr> 3.3-2.el7 works good:
<bmr> # sosreport -vvv --batch --debug -o general
<bmr> set sysroot to '/host' (policy)
<bmr> sosreport (version 3.3)
<bmr> [...]
<bmr> [archive:TarFileArchive] initialised empty FileCacheArchive at '/host/var/tmp/sos.Vgrh_o/sosreport-jmaster.usersys.redhat.com-20160901134453'
<bmr> [...]
<bmr> [plugin:general] added copyspec '['/host/etc/sysconfig']'
<bmr> [plugin:general] added copyspec '['/host/proc/stat']'
<bmr> [...]
Comment 44 Chris Evich 2016-09-01 09:50:27 EDT
Yep, looks like this was my goof.  Seems likely I rm'd the rhel-tools container when host ran out of space and forgot to re-update the package.  Thanks Bryn for setting things right.
Comment 45 Pavel Moravec 2016-11-11 02:48:58 EST
This bug has been fixed by the sos rebase to 3.3 [1], which includes the upstream fix. The relevant RHEL 7.3 sos erratum is [2].

Therefore I am closing the bug. Please test whether it addresses the reported problem properly and, if not, reopen the BZ.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1293044
[2] https://rhn.redhat.com/errata/RHBA-2016-2380.html
