Bugzilla will be upgraded to version 5.0. The upgrade date is tentatively scheduled for 2 December 2018, pending final testing and feedback.
Bug 1345854 - non-deterministic calls of check_space_left()
non-deterministic calls of check_space_left()
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: audit (Show other bugs)
7.3
All Linux
high Severity high
: rc
: 7.3
Assigned To: Steve Grubb
Ondrej Moriš
: Regression
Depends On:
Blocks: 1256920
  Show dependency treegraph
 
Reported: 2016-06-13 06:35 EDT by Ondrej Moriš
Modified: 2016-11-04 02:13 EDT (History)
0 users

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-11-04 02:13:32 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2016:2418 normal SHIPPED_LIVE audit bug fix and enhancement update 2016-11-03 09:58:32 EDT

  None (edit)
Description Ondrej Moriš 2016-06-13 06:35:11 EDT
Description of problem:

With audit-2.5 we observed that space_left and admin_space_left actions are _sometimes_ performed only after auditd restart and not immediately when system is (nearly) running low of space. Behaviour was oddly non-deterministic, sometimes it worked just fine for hours and sometimes it was broken for hours. It was a regression from 2.4.

Version-Release number of selected component (if applicable):

audit-2.5.2-1.el7

How reproducible:

Sometimes. Depends on size of audit events.

Steps to Reproduce:

1. Configure admin_space to X and admin_space_action (syslog).

2. Fill partition containing /var/log/audit by audit events so that it has less than X free space.

3. Check /var/log/messages to see that action was triggered.

Actual results:

Action is sometimes not triggered.

Expected results:

Action is always triggered. 

Additional info:

We tracked the issue to auditd-event.c write_to_log() function, its relevant part calling check_space_left() changed between 2.5 and 2.4 as follows:

(2.5)

397	                if (config->daemonize == D_BACKGROUND) {
...
403	                        log_size += rc;
404	                        check_log_file_size();
405	                        // Keep loose tabs on the free space
406	                        if (rc%2 == 0)
407	                                check_space_left();
408	                }

(2.4)

427	                /* check log file size & space left on partition */
428	                if (config->daemonize == D_BACKGROUND) {
...
434	                        log_size += rc;
435	                        check_log_file_size(data);
436	                        check_space_left(data->log_fd, data);
437	                }

Where rc = fprintf(log_file, "%s\n", buf). I am not sure if it is the safe approach to condition check_space_left() calls based either on rc or log_size. It might happen that rc/log_size is odd number during a long sequence of write_to_log() calls. In that case a system might be already well below (admin_)space limit before (admin_)space_action is triggered.
Comment 2 Steve Grubb 2016-06-13 10:03:10 EDT
Is (rc%3 >  2) more deterministic? The idea here is to try to avoid calling fstatfs all the time.
Comment 3 Ondrej Moriš 2016-06-15 07:57:11 EDT
(In reply to Steve Grubb from comment #2)
> Is (rc%3 >  2) more deterministic? The idea here is to try to avoid calling
> fstatfs all the time.

Well, conditioning check_space_left() calls by event length might go wrong when, for instance, a bunch of similar events are generated (with the same length). I understand your point of saving fstatfs calls. The best would be to condition by serial number of an event to make 100% sure that check_space_left() is called at least once during a sequence of events.
Comment 4 Steve Grubb 2016-06-15 12:07:19 EDT
This is the proposed fix:
https://fedorahosted.org/audit/changeset/1266
Comment 5 Ondrej Moriš 2016-06-29 04:35:56 EDT
(In reply to Steve Grubb from comment #4)
> This is the proposed fix:
> https://fedorahosted.org/audit/changeset/1266

Thanks Steve, it looks better now. There is still a very rare case when the function might not be called but a probability it might be hit in the production environment is very close to zero (FTR: I noticed that a typo in the patch was already corrected upstream).
Comment 8 Ondrej Moriš 2016-08-10 06:10:37 EDT
Verified by Common Criteria audit-test on all architectures except ppc64 where results are currently not available, results from x86_64 are as follow:

NEW (audit-2.6.5-2.el7)
=======================
Bucket:                          fail-safe
Started:                         Thu Aug  4 18:19:57 CEST 2016
Kernel:                          3.10.0-483.el7.x86_64
Architecture:                    x86_64
Mode:                            64
Hostname:                        cc-v0b.lab.eng.brq.redhat.com
Profile:                         capp
selinux-policy version:          selinux-policy-3.13.1-92.el7.noarch

SELinux status:                 enabled
SELinuxfs mount:                /sys/fs/selinux
SELinux root directory:         /etc/selinux
Loaded policy name:             targeted
Current mode:                   enforcing
Mode from config file:          enforcing
Policy MLS status:              enabled
Policy deny_unknown status:     allowed
Max kernel policy version:      28

[0] pam_loginuid success PASS
[1] pam_loginuid fail PASS
[2] max_log_file keep_logs PASS
[3] max_log_file rotate PASS
[4] max_log_file suspend PASS
[5] max_log_file syslog PASS
[6] admin_space_left halt PASS
[7] admin_space_left single PASS
[8] admin_space_left suspend PASS
[9] admin_space_left syslog PASS
[10] space_left halt PASS
[11] space_left single PASS
[12] space_left suspend PASS
[13] space_left syslog PASS
[14] disk_full_action halt PASS
[15] disk_full_action single PASS
[16] disk_full_action suspend PASS
[17] disk_full_action syslog PASS
[18] admin_space_left email PASS
[19] space_left email PASS

  20 pass (100%)
   0 fail (0%)
   0 error (0%)
------------------
  20 total (in 26s)
Comment 10 errata-xmlrpc 2016-11-04 02:13:32 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2418.html

Note You need to log in before you can comment on or make changes to this bug.