| Summary: | unlimited memory consumption in audit with small number of cpu cores | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Ondrej Moriš <omoris> |
| Component: | audit | Assignee: | Steve Grubb <sgrubb> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Ondrej Moriš <omoris> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 7.2 | CC: | jjaburek, mvadkert, omoris, pmoore |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-07-11 15:52:09 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
|
Description
Ondrej Moriš
2016-04-07 11:57:22 UTC
Jiri, please feel free to add any technical details. This bug is most likely two-sided: both the kernel and auditd contain a bug.

When this happens, it might be good to get an audit status. Also, does running "auditctl --backlog_wait_time 0" fix the problem? Meaning that if you set the backlog_wait_time to 0 before running the tests, do you still see unlimited memory consumption?

Is backlog_wait_time supported on RHEL kernels?

I forgot to mention that we see this on the ppc64le architecture (I do not know yet whether it is arch-specific or not).

    # auditctl --backlog_wait_time 0
    backlog_wait_time is not supported on your kernel

(In reply to Ondrej Moriš from comment #0)
> We also experienced scheduler issues during the memory growing.

We really haven't; the latencies were well below extreme, mostly at 10ms, going up to 50ms and occasionally around 100ms, even during memory pressure.

My observation was simply (from watching "cat /proc/<auditd>/syscall" in a loop, to not cause too much overhead when tracing) that, when idle, auditd spends most of the time in epoll_wait(2), and when the apparent memory growth is happening, it does mostly futex(2). The CPU (sys) time consumed also seems to grow with higher memory usage, indicating that auditd simply has "more work to do" / queued.

The kernel side of the question would be: is the amount of events disproportional to what should be happening? IOW, is the (huge?) amount of events expected?

Sorry, not much I can add here. I have a feeling that it's related to bz1296189. In it, they mention the problem on RHEL6/7 and some other patches that start to address the issue. I think some of them are scheduled for 7.2 or 7.3.

In that case I will check whether 7.2 is affected or not.

Unfortunately, with the RHEL-7.2 kernel 3.10.0-327.el7 the problem is still reproducible.

Could you try testing against kernel-3.10.0-372.el7 or later? I think Bug #1253123 is related.
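For anyone wanting to repeat the "/proc/<auditd>/syscall in a loop" observation mentioned above, a minimal Python sketch might look like the following. The function names, sample count and interval are illustrative assumptions, not what was actually run in this bug. It only tallies raw syscall numbers (which are architecture-specific, so mapping them to epoll_wait or futex is left to the reader), and it looks at the main thread only; per-thread data lives under /proc/<pid>/task/<tid>/syscall.

```python
import collections
import time


def parse_syscall_line(line):
    """Return the syscall number from one read of /proc/<pid>/syscall.

    Returns None when the task is not blocked (the file says "running")
    or is blocked outside a syscall (the kernel reports -1).
    """
    fields = line.split()
    if not fields or fields[0] == "running":
        return None
    number = int(fields[0])  # first field is the decimal syscall number
    return number if number >= 0 else None


def sample_syscalls(pid, samples=100, interval=0.1):
    """Poll /proc/<pid>/syscall and count how often each number is seen."""
    counts = collections.Counter()
    for _ in range(samples):
        try:
            with open(f"/proc/{pid}/syscall") as f:
                counts[parse_syscall_line(f.read())] += 1
        except OSError:
            break  # process exited, or we lack permission to read the file
        time.sleep(interval)
    return counts


# Hypothetical usage (reading this file usually needs root):
#   for number, seen in sample_syscalls(auditd_pid).most_common():
#       print(number, seen)
```

Polling at a coarse interval keeps the overhead low, in the same spirit as the loop described in the comment, at the cost of missing short-lived syscalls.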
(In reply to Steve Grubb from comment #10)
> Could you try testing against kernel-3.10.0-372.el7 or later? I think Bug
> #1253123 is related.

I tried the reproducer with 3.10.0-383.el7.ppc64le and the problem is still there, no difference between 327 and 383 unfortunately.

With audit-2.6.2-3 I cannot reproduce the bug. I see the following memory consumption during the test procedure from the description:

    RSS
    ===
    ...
    6976
    6912 # Step 1 completed.
    6912
    6912
    6912 # Step 2 started
    6912
    ... < 6912 all the time >
    6912
    6912 # Step 2 finished.
    6720
    6720
    6976
    6976

Tested with kernel-3.10.0-383, downgrading to audit-2.4.1-5 re-introduces the problem:

    RSS
    ===
    ...
    6208
    6208
    222272
    279296
    316608
    348736
    332544
    335808
    337792
    323392
    749504
    778560
    787904
    824704
    787840
    820160
    806016
    788608
    831616
    827712
    774912
    794368
    6208
    6208

Therefore I will CLOSE this bug as CURRENTRELEASE.
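RSS readings like the ones above could be collected with something like the following sketch, which polls the VmRSS field of /proc/<pid>/status (reported by the kernel in kB). How the values in this report were actually gathered, and their units, are not stated; the helper names, sample count and interval here are hypothetical.

```python
import time


def parse_vmrss_kb(status_text):
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status.

    Returns None if the field is absent, e.g. for kernel threads.
    """
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None


def watch_rss(pid, samples=10, interval=1.0):
    """Return a list of VmRSS readings taken `interval` seconds apart."""
    readings = []
    for _ in range(samples):
        with open(f"/proc/{pid}/status") as f:
            readings.append(parse_vmrss_kb(f.read()))
        time.sleep(interval)
    return readings


# Hypothetical usage while the reproducer is running:
#   for value in watch_rss(auditd_pid, samples=60):
#       print(value)
```

Running it once with each audit package version during the same test steps would give two directly comparable series.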