Bug 964219
| Summary: | cgred process dies | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Kenny Woodson <kwoodson> |
| Component: | libcgroup | Assignee: | Peter Schiffer <pschiffe> |
| Status: | CLOSED ERRATA | QA Contact: | Mike Gahagan <mgahagan> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 6.4 | CC: | admiller, agrimm, ccui, jsafrane, kwoodson, mfisher, mmahut, pschiffe, sten, tlavigne, twiest, varekova |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | No documentation needed. | Story Points: | --- |
| Clone Of: | | Environment: | aws ec2 instance |
| : | 1011515 | | |
| Last Closed: | 2013-11-21 22:33:46 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 923851, 947775, 961026, 1011515 | | |
| Attachments: | | | |
Description
Kenny Woodson
2013-05-17 14:57:17 UTC
Kenny,
do you still see this problem (even with libcgroup-0.37-7.2.el6_4)?
Thanks,
peter

Peter,
Yes, we are still seeing these issues. Here are a few of them from today:
Aug 16 05:53:05 ex-std-node96 CGRE[1340]: *** glibc detected *** /sbin/cgrulesengd: double free or corruption (fasttop): 0x0000000001bf3370 ***
Aug 16 04:37:12 ex-std-node38 CGRE[1149]: *** glibc detected *** /sbin/cgrulesengd: double free or corruption (fasttop): 0x0000000000be73f0 ***
Aug 16 10:44:43 ex-std-node4 CGRE[1103]: *** glibc detected *** /sbin/cgrulesengd: double free or corruption (fasttop): 0x00000000025eabf0 ***
These are 3 separate servers.
rpm -qa | grep libcgroup:
libcgroup-pam-0.37-7.2.el6_4.x86_64
libcgroup-0.37-7.2.el6_4.x86_64
Thanks,
kenny

Kenny,
are you able to generate coredump? If yes, could you attach it?
Thanks,
peter

Peter,
We have core file size set to -c for dumps. I have experimented and tried to dump the process by killing it with a 3, 4, 6, 8, and an 11. I haven't had any luck. Is there something else I can try? Any assistance would help.
# cat /proc/sys/kernel/core_pattern
/var/crash/core-%e-%s-%u-%g-%p-%t
Thanks,
kenny

*** Bug 924438 has been marked as a duplicate of this bug. ***

Created attachment 789652 [details]
core dump
Attached a core dump. Working with peter.
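
Editorial sketch, not part of the original thread: the exchange above describes signals 3, 4, 6, 8 and 11 failing to produce a core file even though a core pattern was configured. One thing that might be worth checking in that situation (an assumption, not a diagnosis made anywhere in this report) is the core-size limit of the running daemon itself, which Linux exposes in /proc/<pid>/limits and which can differ from the limit in an interactive shell. The helper names `core_limits` and `read_core_limits` below are hypothetical.

```python
def core_limits(limits_text):
    """Parse the 'Max core file size' row from /proc/<pid>/limits text.

    Returns a (soft, hard) tuple where each value is an int (bytes) or the
    string 'unlimited'. Returns None if the row is not present.
    """
    label = "Max core file size"
    for line in limits_text.splitlines():
        if line.startswith(label):
            # Remaining columns are: soft limit, hard limit, units.
            fields = line[len(label):].split()
            soft, hard = fields[0], fields[1]

            def convert(value):
                return value if value == "unlimited" else int(value)

            return convert(soft), convert(hard)
    return None


def read_core_limits(pid):
    """Read and parse the limits of a live process (Linux only)."""
    with open("/proc/%d/limits" % pid) as handle:
        return core_limits(handle.read())
```

Calling `read_core_limits(pid)` with the PID of cgrulesengd would show whether its soft limit is 0; if it is, the kernel would not write a core file for that process regardless of the core_pattern setting.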
Created attachment 789653 [details]
actual core dump
Binary core dump
Created attachment 789692 [details]
2 more cgred dumps
Peter,
I'm not sure if this is the same issue but I found 2 more core files.
Attached.
This problem was introduced with bug #849757, and it should be fixed as part of the bug #913286.

At Peter's request I installed the latest version for RHEL 6.5 and cgred process immediately dies.
libcgroup-debuginfo-0.40.rc1-2.el6.x86_64
libcgroup-pam-0.40.rc1-2.el6.x86_64
libcgroup-devel-0.40.rc1-2.el6.x86_64
libcgroup-0.40.rc1-2.el6.x86_64
Stopping CGroup Rules Engine Daemon... [ OK ]
Starting CGroup Rules Engine Daemon: /bin/bash: line 1: 2095 Segmentation fault /sbin/cgrulesengd -g cgred
[FAILED]
Attaching a core dump.
Created attachment 796412 [details]
cgred dump
FWIW, 0.40.rc1-4 has a serious memory leak. Every time the cache is reloaded, cgrulesengd is leaking 80 to 100 megabytes (on a system with 1500-2000 users). You can observe this by sending cgrulesengd a SIGUSR2 signal. That is hindering our testing of this fix.

We have progressed much further with the latest fixes and the recently provided memory fixes have proven to be a much more efficient cgred process. I'd say this version is good. We are going to roll the latest version out to a few more of our production servers.

Marking verified based on comment 25

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHBA-2013-1685.html
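
Editorial sketch, not part of the original thread: the memory-leak comment above says the growth can be observed by sending cgrulesengd a SIGUSR2 signal, which reloads its cache. A rough way to put a number on that is to compare the resident set size reported in /proc/<pid>/status before and after the signal. The function names `vm_rss_kb` and `rss_growth_after_reload` and the five-second settle time are assumptions made for illustration; the thread does not say how the 80 to 100 megabytes were measured.

```python
import os
import re
import signal
import time


def vm_rss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def rss_growth_after_reload(pid, settle_seconds=5.0):
    """Send SIGUSR2 to pid and return the change in resident memory (kB).

    settle_seconds is an arbitrary wait to let the reload finish; the right
    value depends on how many users and rules the daemon has to process.
    """
    path = "/proc/%d/status" % pid
    with open(path) as handle:
        before = vm_rss_kb(handle.read())
    os.kill(pid, signal.SIGUSR2)  # the thread says this triggers a cache reload
    time.sleep(settle_seconds)
    with open(path) as handle:
        after = vm_rss_kb(handle.read())
    if before is None or after is None:
        raise ValueError("VmRSS not found in %s" % path)
    return after - before
```

Running `rss_growth_after_reload(pid)` several times in a row against the daemon's PID (as root) and seeing the result stay large and positive would be consistent with the leak described; a value near zero after the first reload would suggest the memory is being reused.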