Bug 1277254 - /usr/libexec/abrt-hook-ccpp infinite loop
Status: CLOSED DUPLICATE of bug 1255762
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: abrt
Version: 6.8
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: abrt
QA Contact: BaseOS QE - Apps
Reported: 2015-11-02 19:56 UTC by Andy Grimm
Modified: 2016-11-08 03:48 UTC (History)
Doc Type: Bug Fix
Last Closed: 2015-11-03 14:17:58 UTC

Description Andy Grimm 2015-11-02 19:56:27 UTC
Description of problem:

We recently found an abrt-hook-ccpp process stuck on one of our nodes for more than a week.  We found the following processes on the system:

nscd       3453  1.7  0.0 2363160  972 ?        Dsl  Aug26 1739:29 /usr/sbin/nscd
root     317150  0.0  0.0  83936  1532 ?        S    Oct23   2:31 /usr/libexec/abrt-hook-ccpp 11 0 3453 28 28 1445604670 nscd

Tracing the abrt process showed a loop:

# strace -f -p 317150
Process 317150 attached
restart_syscall(<... resuming interrupted call ...>) = 0
symlinkat("317150", 5, ".lock")         = -1 EEXIST (File exists)
readlinkat(5, ".lock", "4017", 15)      = 4
access("/proc/4017", F_OK)              = 0
nanosleep({0, 500000000}, NULL)         = 0
symlinkat("317150", 5, ".lock")         = -1 EEXIST (File exists)
readlinkat(5, ".lock", "4017", 15)      = 4
access("/proc/4017", F_OK)              = 0
nanosleep({0, 500000000}, NULL)         = 0
symlinkat("317150", 5, ".lock")         = -1 EEXIST (File exists)
readlinkat(5, ".lock", "4017", 15)      = 4
access("/proc/4017", F_OK)              = 0
nanosleep({0, 500000000}, NULL)         = 0
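
The trace above is consistent with a symlink-based lock loop: try to create `.lock` pointing at the caller's PID, read the current holder on failure, check whether `/proc/<holder>` exists, then sleep 0.5 s and retry. In the trace, `access("/proc/4017", F_OK)` returns 0, so the holder is treated as alive and the loop never ends. The following is only a rough Python sketch of that pattern, not abrt's actual code; the function name `try_lock`, the `retries` bound, and the `proc_root` parameter are assumptions added for illustration, and the bound shows one possible way to avoid waiting forever.

```python
import os
import time


def try_lock(dir_path, my_pid, retries=10, delay=0.5, proc_root="/proc"):
    """Try to take dir_path/.lock; return True on success, False on timeout."""
    lock = os.path.join(dir_path, ".lock")
    for _ in range(retries):
        try:
            # Creating a symlink is atomic: it fails if .lock already exists.
            os.symlink(str(my_pid), lock)
            return True
        except FileExistsError:
            pass
        try:
            holder = os.readlink(lock)
        except OSError:
            continue  # lock vanished or is unreadable; try again
        if not os.path.exists(os.path.join(proc_root, holder)):
            # Holder PID has no /proc entry: treat the lock as stale.
            try:
                os.unlink(lock)
            except FileNotFoundError:
                pass
            continue
        # Holder looks alive: wait and retry, as seen in the trace.
        time.sleep(delay)
    return False  # hypothetical bound; the traced process had no such limit
```

With a bounded retry count the caller can log the stuck lock and give up instead of spinning for days.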

So I checked the process's open file descriptors to find the path behind descriptor 5:

# ls -l /proc/317150/fd
total 0
lr-x------. 1 root root 64 Nov  2 14:37 0 -> pipe:[1577878550]
lrwx------. 1 root root 64 Nov  2 14:37 1 -> /dev/null
lrwx------. 1 root root 64 Nov  2 14:37 2 -> /dev/null
lrwx------. 1 root root 64 Nov  2 14:37 3 -> socket:[1577881793]
lr-x------. 1 root root 64 Nov  2 14:37 4 -> /var/spool/abrt
lr-x------. 1 root root 64 Nov  2 14:37 5 -> /var/spool/abrt/ccpp-2015-10-05-15:04:14-262898
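
The same lookup can be scripted when several stuck processes need checking. This is a small sketch that reads the symlinks under `/proc/<pid>/fd`; the helper name `fd_targets` and the `proc_root` parameter are assumptions for illustration, and reading another user's process requires root, as with the `ls` above.

```python
import os


def fd_targets(pid, proc_root="/proc"):
    """Map each open descriptor number of `pid` to its symlink target."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    targets = {}
    for name in os.listdir(fd_dir):
        try:
            targets[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed between listdir and readlink
    return targets
```

For the process in this report, `fd_targets(317150).get(5)` would be expected to give the dump directory that the `symlinkat` calls operate on.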

Then I found that the .lock file in that directory was a broken symlink:

# ls -la /var/spool/abrt/ccpp-2015-10-05-15:04:14-262898
total 2024
drwxr-x---.  2 root abrt      41 Oct  5 15:06 .
drwxr-xr-x. 37 abrt abrt    4096 Oct 30 10:24 ..
lrwxrwxrwx.  1 root root       4 Oct  5 15:05 .lock -> 4017
-rw-------.  1 root root 2066788 Oct  5 15:06 sosreport.tar.xz


I removed the .lock file, and the processes exited, writing event_log and machineid files to the directory:

# ls -la /var/spool/abrt/ccpp-2015-10-05-15:04:14-262898
total 2028
drwxr-x---.  2 root abrt      61 Nov  2 14:53 .
drwxr-xr-x. 17 abrt abrt    4096 Nov  2 14:51 ..
-rw-r--r--.  1 root root       0 Nov  2 14:49 event_log
-rw-r--r--.  1 root root      93 Nov  2 14:49 machineid
-rw-------.  1 root root 2066788 Oct  5 15:06 sosreport.tar.xz

Comment 2 Jakub Filak 2015-11-03 07:47:24 UTC
Thank you for the report. This issue looks like a duplicate of bug #1255762. Can you please provide us with the full system log?

Comment 3 Andy Grimm 2015-11-03 14:17:58 UTC
It is definitely a duplicate.  Closing this one.

*** This bug has been marked as a duplicate of bug 1255762 ***

