Bug 1298186

Summary: Please consider adding a -k timeout as well to commands run by sosreport
Product: Red Hat Enterprise Linux 7
Reporter: Paulo Andrade <pandrade>
Component: sos
Assignee: Pavel Moravec <pmoravec>
Status: CLOSED DUPLICATE
QA Contact: BaseOS QE - Apps <qe-baseos-apps>
Severity: unspecified
Priority: unspecified
Docs Contact:
Version: 7.1
CC: agk, apmukher, bmr, dkutalek, dwojewod, erich, gavin, mhradile, plambri, pmoravec, sbradley, sknauss
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-03-24 16:55:28 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1420851

Description Paulo Andrade 2016-01-13 12:24:41 UTC
A user reported an issue with large journal logs that
also happened to be corrupted. The only way to collect a
sosreport was to clean the journal files and run sosreport
without the logs.

  The suggestion would be to add something like a 600 second
kill timeout, as in this pseudo patch:

sos/utilities.py:
    if timeout and is_executable("timeout"):
-        command = "timeout %ds %s" % (timeout, command)
+        command = "timeout --kill-after=%ds %ds %s" % (timeout * 2, timeout, command)
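
Expanded into a runnable form, the proposal amounts to something like this (a minimal sketch, assuming GNU coreutils 'timeout' on PATH; the real wrapping lives in sos/utilities.py):

import shlex
import subprocess

def wrap_with_timeout(command, timeout):
    # Per the pseudo patch: SIGTERM at `timeout` seconds, escalating to
    # SIGKILL a further `timeout * 2` seconds later if the child survives.
    return "timeout --kill-after=%ds %ds %s" % (timeout * 2, timeout, command)

# 'timeout' exits with status 124 when the time limit is hit:
rc = subprocess.call(shlex.split(wrap_with_timeout("sleep 10", 2)))
print(rc)  # 124, after roughly two seconds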

Comment 2 Bryn M. Reeves 2016-01-13 13:07:38 UTC
https://github.com/sosreport/sos/issues/337

Comment 4 Bryn M. Reeves 2016-04-26 16:30:23 UTC
I'm not sure I see what the patch is supposed to accomplish.

The timeout handling is intended to deal with cases where the command we are calling is blocked (e.g. due to unavailable block devices, network file systems, or kernel failures).

That is not the case for the journalctl commands: they are actually running, but dealing with an unreasonable quantity of data. When the command does complete before the timeout expires, we have instead seen out-of-memory conditions in sos's postprocessing handlers when trying to read the entire data set into memory.

This is a known problem, but addressing it requires very significant change, and we do not yet have an estimate of when that would appear upstream.

Sending a kill to the subordinate process from /usr/bin/timeout does not affect this - to be honest, from a quick test I am not even sure that the -k option does anything unless the -s switch has been used (to deliver a signal after the timeout):

$ time timeout -k 15 10 /usr/bin/sleep 100
real	0m10.002s
user	0m0.001s
sys	0m0.002s
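
The same experiment from Python, for anyone wanting to reproduce it (assuming coreutils 'timeout' and 'sleep' on PATH): with a child that honours SIGTERM, the run ends at the first timeout and -k never comes into play.

import subprocess
import time

start = time.monotonic()
# The default SIGTERM lands after 10s and 'sleep' dies on it, so the
# '-k 15' SIGKILL escalation is never needed for a killable child.
subprocess.run(["timeout", "-k", "15", "10", "sleep", "100"])
print("elapsed: %.1fs" % (time.monotonic() - start))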

It would be easier to understand what was being requested if this bug included the standard bugzilla template sections - I don't really understand from the limited description given what it is trying to achieve or how it is supposed to work.

Comment 9 Bryn M. Reeves 2016-04-27 10:39:34 UTC
> We seem to get timeouts for several plugins during sos report collection on 
> controller nodes:

The requested change will have no effect on this behaviour.

Comment 10 Bryn M. Reeves 2016-04-27 10:47:14 UTC
> If the journalctl command was still running and ignored the signal received, 
> the proposed change would wait for another 300s and send KILL signal.

Which still won't do anything useful if the task is blocked in uninterruptible sleep (the most common cause of apparent command hangs in sos).

I don't see any evidence that this change would improve behaviour for this or any other case. When a process has gone off into the weeds, it is a gamble whether sending a signal to it will achieve anything, and even if it did respond, signalling it has nothing to do with the actual problem reported - that the report does not contain expected content due to command timeouts.
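
Whether a child is wedged this way is visible in /proc before any signalling is attempted; a small illustrative sketch (Linux-only, not part of sos):

import os

def task_state(pid):
    # Field 3 of /proc/<pid>/stat is the state letter; 'D' means
    # uninterruptible sleep. The comm field may contain spaces, so
    # split on the closing parenthesis first.
    with open("/proc/%d/stat" % pid) as f:
        return f.read().rsplit(")", 1)[1].split()[0]

# A 'D'-state task ignores SIGTERM and even SIGKILL until the blocking
# kernel operation returns, so signal escalation cannot unwedge it.
print(task_state(os.getpid()))  # 'R' for the running interpreter itself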

We can try to make the collection of journal data more robust (which is what this bug really seems to be asking for). The mentioned command was directly requested by the systemd folk:

  https://bugzilla.redhat.com/show_bug.cgi?id=879621

Unfortunately, while this is probably fine for most desktop systems (which are rebooted reasonably frequently and use on-disk journals subject to a rotation schedule), it appears problematic on systems with very long uptimes that are using either in-memory (tmpfs) or on-disk (but possibly not rotated) journal files.

I'm not certain of the best answer right now but fiddling with timeout signal parameters is not going to give us any improvement in this area.

Comment 11 GE Scott Knauss 2016-05-09 05:38:08 UTC
Bryan,
    Thanks. Do you think this should be owned by someone else?  If not a sosreport issue, who do you think should be looking at this?


Thanks,
Scott

Comment 12 Pavel Moravec 2016-05-09 06:32:50 UTC
(In reply to Bryn M. Reeves from comment #10)
> I'm not certain of the best answer right now but fiddling with timeout
> signal parameters is not going to give us any improvement in this area.

Collect 

journalctl --all --this-boot --no-pager -o verbose

only when the all_logs option is set, plus newly collect:

journalctl --since="-7d" --all --no-pager -o verbose

?
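
Expressed as plugin code, the idea would look roughly like this (a sketch using the sos 3.x Plugin helpers get_option/add_cmd_output; not the shipped systemd plugin):

from sos.plugins import Plugin, RedHatPlugin  # sos 3.x module layout

class JournalSketch(Plugin, RedHatPlugin):
    """Illustrative sketch only."""
    plugin_name = "journal_sketch"

    def setup(self):
        if self.get_option("all_logs"):  # the global --all-logs switch
            # unbounded collection only on explicit request
            self.add_cmd_output("journalctl --all --this-boot --no-pager -o verbose")
        else:
            # bounded by default: roughly the last seven days
            self.add_cmd_output('journalctl --since="-7d" --all --no-pager -o verbose')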

Comment 13 Bryn M. Reeves 2016-05-09 13:53:24 UTC
> journalctl --all --this-boot --no-pager -o verbose
> 
> only when the all_logs option is set, plus newly collect:

Sure, but if someone runs it against a journal that is going to dump 2GiB of data to stdout, we'll still fail the same way.

It's not really a workaround - it just restricts the problem to a smaller set of users.

Comment 14 Bryn M. Reeves 2016-05-09 13:56:40 UTC
> Thanks. Do you think this should be owned by someone else?  If not a sosreport 
> issue, who do you think should be looking at this?

The proposed patch does not help, but this is very definitely an sos problem; even with changes to journalctl that would make managing these large archives easier or more efficient, we would still have to deal properly with the volume of data produced.

It is not that it is someone else's problem (although some of the behaviours mentioned here definitely indicate problems in other components), rather that there is no small/simple change that will fix it: it needs a wholesale re-write of the plugin IO mechanism, and that is not really something we can get done in time for 7.3.

I did have an idea at the weekend that would specifically help in the journalctl case; however, I've not had time to work on any patches to implement it.

Comment 16 Bryn M. Reeves 2016-05-09 18:36:55 UTC
There is no file: we are reading from journalctl's stdout into memory (that's why you see out-of-memory errors in some cases).

The timeouts are also handled by an external program, /usr/bin/timeout (from coreutils), so we do not have the ability to insert our own handling in this way.

Rewriting the timeout handling to allow that would be as complex and invasive as fixing the bug properly by moving to streamed IO for plugins.

There is no quick/easy way to fix this, sorry.
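
For context, the streamed-IO direction mentioned above would look roughly like this: read the child's stdout in fixed-size chunks and spill it to disk, rather than buffering everything in memory (a sketch, not sos's eventual implementation; the 100MiB cap is an arbitrary placeholder):

import subprocess

def collect_streamed(cmd, path, sizelimit=100 * 1024 * 1024):
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    written = 0
    with open(path, "wb") as out:
        for chunk in iter(lambda: proc.stdout.read(65536), b""):
            if written < sizelimit:
                take = chunk[:sizelimit - written]
                out.write(take)       # spill to disk, never to memory
                written += len(take)
            # keep draining stdout past the limit so the child never
            # blocks on a full pipe
    return proc.wait()

collect_streamed(["journalctl", "--all", "--no-pager"], "journal.log")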

Comment 17 GE Scott Knauss 2016-05-19 14:49:13 UTC
Bryan,
    Thanks. I understand there is no quick, easy fix. Are we working on the long, hard one?

-Scott

Comment 19 GE Scott Knauss 2016-06-14 08:03:46 UTC
Bryan,
       I was looking through BZs today and came across this one: 

https://bugzilla.redhat.com/show_bug.cgi?id=1183244 "sosreport tracebacks with memory error with huge system journall"  

The goal of this BZ seems to be re-writing the command IO paths. Would that also clear up this issue? If so, should we close this one as a dupe, and do you think there is any likelihood of getting this done in RHEL 7.3?

Thanks,
Scott

Comment 20 Bryn M. Reeves 2016-06-14 10:08:50 UTC
> Bryan,

It's still Bryn.

> The goal of this BZ seems to be re-writing the command IO paths.

Yes, but this is still a large (likely several months) piece of work. There is no time to do this before the next release deadline.

I've started pushing some pieces upstream that will help us to avoid these problems specifically for journal commands.

If you are interested in this work the best thing to do is to get a GitHub account and follow along there.

In particular Issue #825 (now closed) sets out the foundation of this work:

  https://github.com/sosreport/sos/issues/825

Comment 25 Pavel Moravec 2017-03-24 16:55:28 UTC
(In reply to Pavel Moravec from comment #24)
> Technically, this BZ shall be a dup of bug 1183244, which we would like to
> have in 7.4. Work is pending upstream, though; no definitive promise we will
> fix it in 7.4 so far :(

I am closing this BZ as a dup of bug 1183244. Regrettably, no fix will be available in the 7.4 time frame :(

*** This bug has been marked as a duplicate of bug 1183244 ***

Comment 26 Bryn M. Reeves 2018-07-20 10:38:28 UTC
It turns out that this change _does_ have a desirable effect, although it is rather non-obvious when looking at the 'timeout' documentation for the switch:

       -k, --kill-after=DURATION
              also send a KILL signal if COMMAND is still running
              this long after the initial signal was sent

What *actually* happens when the child is unkillable is that, when -k is used, the 'timeout' command first sends SIGTERM both to the process group (including itself) and to the monitored PID. The 'timeout' process blocks SIGTERM, and the child is in a TASK_UNINTERRUPTIBLE state, so nothing happens this time.

When the -k timeout fires, it repeats this, except with SIGKILL. This time the 'timeout' process is unable to catch the signal, but it is also not blocked in uninterruptible sleep. It therefore exits, allowing sos to continue.

What this means is that our existing timeout support can only cope with the case of a "busy child": it fails completely for processes that are actually blocked long-term in the kernel.
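
The described escalation can be approximated in a few lines of Python (a sketch of the TERM-then-KILL mechanism, not coreutils' actual implementation; unlike 'timeout' it signals only the child's session, not its own process group):

import os
import signal
import subprocess

def run_with_escalation(cmd, timeout, kill_after):
    proc = subprocess.Popen(cmd, start_new_session=True)
    for sig, limit in ((signal.SIGTERM, timeout), (signal.SIGKILL, kill_after)):
        try:
            return proc.wait(timeout=limit)
        except subprocess.TimeoutExpired:
            # A TASK_UNINTERRUPTIBLE child shrugs off both signals; the
            # final wait() below would then block until the kernel op
            # completes - exactly the failure mode described above.
            os.killpg(proc.pid, sig)
    return proc.wait()

print(run_with_escalation(["sleep", "100"], 2, 5))  # -15: killed by SIGTERM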

"Damn: I guess I didn't know".