Bug 1277254 - /usr/libexec/abrt-hook-ccpp infinite loop
Status: CLOSED DUPLICATE of bug 1255762
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: abrt
Version: 6.8
Hardware: x86_64 Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: abrt
QA Contact: BaseOS QE - Apps
Depends On:
Blocks:
Reported: 2015-11-02 14:56 EST by Andy Grimm
Modified: 2016-11-07 22:48 EST
CC List: 3 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-11-03 09:17:58 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Andy Grimm 2015-11-02 14:56:27 EST
Description of problem:

We recently found an abrt-hook-ccpp process stuck on one of our nodes for more than a week.  We found the following processes on the system:

nscd       3453  1.7  0.0 2363160  972 ?        Dsl  Aug26 1739:29 /usr/sbin/nscd
root     317150  0.0  0.0  83936  1532 ?        S    Oct23   2:31 /usr/libexec/abrt-hook-ccpp 11 0 3453 28 28 1445604670 nscd

Tracing the abrt process showed a loop:

# strace -f -p 317150
Process 317150 attached
restart_syscall(<... resuming interrupted call ...>) = 0
symlinkat("317150", 5, ".lock")         = -1 EEXIST (File exists)
readlinkat(5, ".lock", "4017", 15)      = 4
access("/proc/4017", F_OK)              = 0
nanosleep({0, 500000000}, NULL)         = 0
symlinkat("317150", 5, ".lock")         = -1 EEXIST (File exists)
readlinkat(5, ".lock", "4017", 15)      = 4
access("/proc/4017", F_OK)              = 0
nanosleep({0, 500000000}, NULL)         = 0
symlinkat("317150", 5, ".lock")         = -1 EEXIST (File exists)
readlinkat(5, ".lock", "4017", 15)      = 4
access("/proc/4017", F_OK)              = 0
nanosleep({0, 500000000}, NULL)         = 0
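
The trace suggests a lock-acquisition loop along the following lines. This is a minimal Python sketch reconstructed only from the four system calls shown above, not abrt's actual C code. The function names, the `max_attempts` and `alive` parameters, and the stale-lock removal branch (which never appears in the trace) are assumptions made for illustration.

```python
import os
import time


def pid_exists(pid):
    # Mirrors the access("/proc/<pid>", F_OK) call seen in the trace.
    return os.path.exists(f"/proc/{pid}")


def try_lock(dump_dir, my_pid, max_attempts=None, delay=0.5, alive=pid_exists):
    """Try to take <dump_dir>/.lock, a symlink whose target is the owner's PID.

    max_attempts=None retries without bound, which is the behavior the
    trace shows. The bound is a hypothetical addition so the sketch can
    terminate.
    """
    lock_path = os.path.join(dump_dir, ".lock")
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        try:
            # symlinkat("<my_pid>", dirfd, ".lock")
            os.symlink(str(my_pid), lock_path)
            return True
        except FileExistsError:
            pass  # EEXIST: somebody else holds the lock
        try:
            # readlinkat(dirfd, ".lock", ...) returns the owner's PID
            owner = os.readlink(lock_path)
        except FileNotFoundError:
            continue  # lock vanished between the two calls; retry
        if alive(owner):
            # Owner PID exists, so wait 500 ms and retry (the nanosleep call).
            time.sleep(delay)
            continue
        # Assumption: a lock whose owner PID is gone is treated as stale
        # and removed. This branch is not visible in the trace.
        try:
            os.unlink(lock_path)
        except FileNotFoundError:
            pass
    return False
```

As long as some process with the PID stored in the lock exists, as 4017 did here, a loop of this shape with no attempt limit keeps retrying every 500 ms.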

So I listed the process's open file descriptors to resolve descriptor 5 to a path:

# ls -l /proc/317150/fd
total 0
lr-x------. 1 root root 64 Nov  2 14:37 0 -> pipe:[1577878550]
lrwx------. 1 root root 64 Nov  2 14:37 1 -> /dev/null
lrwx------. 1 root root 64 Nov  2 14:37 2 -> /dev/null
lrwx------. 1 root root 64 Nov  2 14:37 3 -> socket:[1577881793]
lr-x------. 1 root root 64 Nov  2 14:37 4 -> /var/spool/abrt
lr-x------. 1 root root 64 Nov  2 14:37 5 -> /var/spool/abrt/ccpp-2015-10-05-15:04:14-262898

Then I found that the .lock file in that directory was a broken symlink pointing to 4017, the PID the hook keeps checking:

# ls -la /var/spool/abrt/ccpp-2015-10-05-15:04:14-262898
total 2024
drwxr-x---.  2 root abrt      41 Oct  5 15:06 .
drwxr-xr-x. 37 abrt abrt    4096 Oct 30 10:24 ..
lrwxrwxrwx.  1 root root       4 Oct  5 15:05 .lock -> 4017
-rw-------.  1 root root 2066788 Oct  5 15:06 sosreport.tar.xz
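
Before deleting such a lock, it can help to see which process currently owns the PID stored in it. The following is a small Python sketch of that check; `inspect_lock` and its `proc_root` parameter are hypothetical names, and it assumes `/proc/<pid>/comm` is readable on the system (if it is not, `comm` is simply reported as `None`). The report does not establish whether PID 4017 was still the original lock holder or a later process that reused the PID.

```python
import os


def inspect_lock(dump_dir, proc_root="/proc"):
    """Describe the .lock symlink in an abrt dump directory.

    Returns None when there is no .lock symlink, otherwise a dict with
    the owner PID (the symlink target), whether a process with that PID
    exists, and that process's command name if it can be read.
    """
    lock_path = os.path.join(dump_dir, ".lock")
    if not os.path.islink(lock_path):
        return None
    owner = os.readlink(lock_path)
    proc_dir = os.path.join(proc_root, owner)
    alive = os.path.isdir(proc_dir)
    comm = None
    if alive:
        try:
            with open(os.path.join(proc_dir, "comm")) as f:
                comm = f.read().strip()
        except OSError:
            pass  # comm unavailable; leave it as None
    return {"owner": owner, "alive": alive, "comm": comm}


if __name__ == "__main__":
    # Directory taken from the report above.
    print(inspect_lock("/var/spool/abrt/ccpp-2015-10-05-15:04:14-262898"))
```

If `alive` is true but `comm` names a process unrelated to abrt, the lock is most likely stale and the PID has been reused.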


I removed the .lock file, and the processes exited, writing event_log and machineid files to the directory:

# ls -la /var/spool/abrt/ccpp-2015-10-05-15:04:14-262898
total 2028
drwxr-x---.  2 root abrt      61 Nov  2 14:53 .
drwxr-xr-x. 17 abrt abrt    4096 Nov  2 14:51 ..
-rw-r--r--.  1 root root       0 Nov  2 14:49 event_log
-rw-r--r--.  1 root root      93 Nov  2 14:49 machineid
-rw-------.  1 root root 2066788 Oct  5 15:06 sosreport.tar.xz
Comment 2 Jakub Filak 2015-11-03 02:47:24 EST
Thank you for the report. This issue looks like a duplicate of bug #1255762. Can you please provide us with the full system log?
Comment 3 Andy Grimm 2015-11-03 09:17:58 EST
It is definitely a duplicate.  Closing this one.

*** This bug has been marked as a duplicate of bug 1255762 ***
