| Summary: | sosreport may run out of memory if the journal has a lot of entries | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Evgheni Dereveanchin <ederevea> |
| Component: | sos | Assignee: | Pavel Moravec <pmoravec> |
| Status: | CLOSED DUPLICATE | QA Contact: | BaseOS QE - Apps <qe-baseos-apps> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 7.2 | CC: | agk, bmr, gavin, pkshiras, plambri, sbradley |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-05-08 12:20:53 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
I've also seen cases where the logs module just fills /tmp due to the high volume of messages in verbose output. The proposal here would be to check the journal size and only gather the latest entries if there are too many lines present.

> on slower systems the logs module will just time out
> Running 28/67: logs...
> [plugin:logs] command 'journalctl --all --this-boot --no-pager -o verbose'

This isn't an sos problem per se; journalctl is simply taking too long to write the messages to stdio. We can increase the timeout, but that is really just a workaround.

> I've also seen cases when the logs module just fills /tmp due to the high
> volume of messages in verbose output.

Sos in RHEL 7 does not write to /tmp. Do you mean /var/tmp? If you are actually seeing /tmp fill up, it is unlikely that sos is responsible.

> The proposal here would be to check for journal size and only gather the
> latest info if there's too many lines present.

There is presently no reasonable way to do this with the existing journald tooling: there is no way to request the size (and, as far as I know, journalctl itself cannot know it without reading in all the records). Teaching sos to inspect the raw journal files directly would be a layering violation. It would also mean doing everything twice: once to count lines and a second time to capture the data. And it is racy: a process generating a high rate of messages per second will cause a large discrepancy between the two counts.

Addressing the OOM condition for very large journals is possible, but it involves fairly significant changes to the IO handling in the Plugin class as well as the process IO in sos.utilities. It is on the upstream roadmap, but it has not been implemented or evaluated for suitability for a RHEL update at this time.

*** This bug has been marked as a duplicate of bug 1183244 ***
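The IO-handling change mentioned above would amount to never holding the full command output in memory. A minimal sketch of that direction, not the actual sos implementation (the function name `capture_to_file` is hypothetical): let the child process write its stdout straight to a destination file, so memory use stays constant regardless of journal size, with the existing timeout behaviour preserved.

```python
# Sketch of bounded-memory command capture: the child's stdout goes
# directly to a file descriptor, so Python never buffers the output.
# Illustration only; function name and structure are hypothetical.
import subprocess


def capture_to_file(cmd, dest_path, timeout=300):
    """Run cmd, streaming stdout directly into dest_path.

    Returns the child's exit status, or a negative value if it was
    killed after exceeding the timeout.
    """
    with open(dest_path, "wb") as dest:
        proc = subprocess.Popen(cmd, stdout=dest, stderr=subprocess.DEVNULL)
        try:
            proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
        return proc.returncode
```

Because the kernel copies the data between the two processes, a multi-gigabyte `journalctl -o verbose` dump would no longer grow the sos process's heap; the trade-off is that post-processing of the output would have to happen in a second, file-based pass.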
Description of problem:
Currently the "logs" module of sosreport consumes a high amount of RAM, and if there are enough entries in the journal it will run out of memory and crash.

Version-Release number of selected component (if applicable):
sos-3.2-35.el7_2.3

How reproducible:
This was reproduced on a system with 32 GB RAM and at least 10 GB free, with the following amount of logs in the journal (produced by OpenShift):

# journalctl -b | wc -l
1647020

Steps to Reproduce:
1. Add a million lines to the journal (careful, this may kill the system):
# for i in {1..1000000}; do echo "test$i test$i test$i test$i test$i 12345 test$i" | systemd-cat; done
2. Try to collect a sosreport:
# sosreport

Actual results:
...
Setting up archive ...
Setting up plugins ...
Running plugins. Please wait ...
  Running 36/80: logs...
Killed

Expected results:
sosreport is collected successfully.

Additional info:
On slower systems the logs module will simply time out:
  Running 28/67: logs...
[plugin:logs] command 'journalctl --all --this-boot --no-pager -o verbose' timed out after 300s
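The reporter's "only gather the latest entries" proposal can be approximated without ever knowing the journal's size, sidestepping the racy two-pass count discussed in the comments: keep at most N lines in a bounded buffer while streaming, discarding older ones. This is a sketch of that idea only; the function name `tail_of` is hypothetical and this is not how sos behaves. (journalctl itself can also limit output at the source with `-n`/`--lines`.)

```python
# Sketch of tail-only collection: a deque with maxlen holds at most
# tail_lines entries, so memory is bounded no matter how many lines
# the journal produces. Function name is hypothetical.
from collections import deque


def tail_of(lines, tail_lines=100000):
    """Return the last tail_lines items from an iterable of lines."""
    buf = deque(maxlen=tail_lines)  # older lines are dropped automatically
    for line in lines:
        buf.append(line)
    return list(buf)
```

In practice the iterable would be the stdout of the journalctl process, consumed line by line; peak memory is then proportional to tail_lines rather than to the journal size that triggers the OOM kill above.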