Bug 1405635

Summary: crm_report should not collect /var/log/lastlog (or have some safety measures included)
Product: Red Hat Enterprise Linux 7 Reporter: Ken Gaillot <kgaillot>
Component: pacemaker Assignee: Ken Gaillot <kgaillot>
Status: CLOSED ERRATA QA Contact: cluster-qe <cluster-qe>
Severity: low Docs Contact:
Priority: medium    
Version: 7.3 CC: abeekhof, cluster-maint, cluster-qe, jkortus, mnovacek
Target Milestone: rc   
Target Release: 7.4   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: pacemaker-1.1.16-2.el7 Doc Type: No Doc Update
Doc Text:
This minor issue was not reported by a customer.
Story Points: ---
Clone Of: 1405205 Environment:
Last Closed: 2017-08-01 17:54:39 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1405205    
Bug Blocks:    

Description Ken Gaillot 2016-12-16 21:59:32 UTC
+++ This bug was initially created as a clone of Bug #1405205 +++

Description of problem:
crm_report collects the /var/log/lastlog file (or at least searches it for patterns for some reason).

Usually that is harmless, because the file is small. If it gets large, it causes a problem for the grep used by crm_report, which scans through the binary file and buffers everything up to the next newline character.

It is quite easy to create a very large file that grep will try to buffer in full. grep then runs out of RAM and gets killed.

An easy way to create a large /var/log/lastlog is:
useradd -b /mnt/brawl -m -U -c "quota-sanity user" -u 10000000 quota-user-kPRAaCKm

Even though these circumstances are not common, crm_report crashes on a file that most likely does not contain useful info. Can you please add a check there or just remove the grepping through it completely?

Ideally no part of crm_report should be able to eat up all memory :).

Version-Release number of selected component (if applicable):
pacemaker-cli-1.1.15-3.el6.x86_64

How reproducible:
always

Steps to Reproduce:
1. on any cluster node: useradd -b /mnt/brawl -m -U -c "quota-sanity user" -u 10000000 quota-user-kPRAaCKm
2. run crm_report that collects info from all nodes

Actual results:
grep crashes (signal 6) on the affected node, and some files may be missing from the report.

Expected results:
* ideally skip /var/log/lastlog collection
* crm_report should be more cautious when running grep on binary files (a size check would be neat)
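
The size check suggested above could look roughly like the sketch below. The helper name small_enough and the 100 MiB default limit are assumptions made for illustration; nothing here is taken from crm_report itself.

```shell
# Hypothetical helper (not part of crm_report): succeed only when the
# argument is a readable regular file no larger than max_bytes.
small_enough() {
    file=$1
    max_bytes=${2:-104857600}   # assumed default limit: 100 MiB

    [ -f "$file" ] && [ -r "$file" ] || return 1

    # wc -c reports the size in bytes; the arithmetic expansion strips
    # any padding that some wc implementations add to the number.
    size=$(( $(wc -c < "$file") ))

    [ "$size" -le "$max_bytes" ]
}

# Usage: only grep files that pass the check.
# small_enough /var/log/messages && grep -l "some pattern" /var/log/messages
```

A size limit alone would also skip legitimately large text logs, so it is probably best treated as a safety net rather than the only filter.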

Additional info:

--- Additional comment from Jaroslav Kortus on 2016-12-15 16:06:36 EST ---

QA note: testable using brawl_quick + crm_report after (in ~3hrs) or by just running the quota subtest of it (ping refried), or just manually via reproducer.

--- Additional comment from Ken Gaillot on 2016-12-16 16:58:07 EST ---

The problem is that crm_report dynamically detects what system logs are used for the cluster by grepping for a particular pattern in (up to) all files in /var/log.
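
As an illustration of this kind of detection, the sketch below lists the regular files directly under a directory that contain a given pattern. The function name and the pattern in the usage line are placeholders, and this is not the actual crm_report code. It deliberately performs no check on file type or size, which is what makes a file like lastlog a problem.

```shell
# Hypothetical sketch (not the crm_report implementation): print every
# regular file directly under log_dir that contains pattern.
find_logs_matching() {
    pattern=$1
    log_dir=${2:-/var/log}

    for f in "$log_dir"/*; do
        [ -f "$f" ] || continue
        # grep -l prints the file name if the file contains a match;
        # error messages (e.g. for unreadable files) are discarded.
        grep -l -e "$pattern" -- "$f" 2>/dev/null
    done
    return 0
}

# Usage (placeholder pattern):
# find_logs_matching "example-cluster-marker" /var/log
```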

It's already on the long-term plan to convert crm_report from a shell script to Python, to make the file handling much more efficient.

But for the 6.9 timeframe, I can make sure "file" reports "text" or "compressed" before doing the grep. That will at least skip lastlog, wtmp, etc.
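
A minimal sketch of such a pre-check follows, assuming the file(1) utility is installed. The helper name is hypothetical and this is not the code from the actual fix; it only shows the idea of matching the description printed by "file" before grep is run.

```shell
# Hypothetical helper (not the actual fix): succeed only when file(1)
# describes the content as text or as compressed data.
is_text_or_compressed() {
    [ -f "$1" ] || return 1

    # -b omits the file name from the output, so only the content
    # description is matched against the patterns below.
    case "$(file -b -- "$1" 2>/dev/null)" in
        *text*|*compressed*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage: skip binary files such as lastlog or wtmp before grepping.
# is_text_or_compressed "$f" && grep -l "some pattern" "$f"
```

Matching on the words in the description is loose by design; a stricter variant could compare MIME types instead, at the cost of listing every accepted type.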

Comment 1 Ken Gaillot 2016-12-19 15:37:05 UTC
Fixed by upstream commit 083488ce

Comment 4 michal novacek 2017-05-26 10:52:19 UTC
I have verified that the procedure from bz1405205 #comment10 is valid for pacemaker-1.1.16-9 too.

Comment 5 errata-xmlrpc 2017-08-01 17:54:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1862