In recent rawhide, I regularly see abrtd go into a spastic mode where it starts consuming all my CPU... strace says:

[...]
open("/var/cache/abrt/ccpp-1255924492-2006.lock", O_WRONLY|O_CREAT|O_EXCL, 0640) = -1 EEXIST (File exists)
open("/var/cache/abrt/ccpp-1255924492-2006.lock", O_RDONLY) = 7
read(7, ""..., 15) = 0
close(7) = 0
open("/var/cache/abrt/ccpp-1255924492-2006.lock", O_WRONLY|O_CREAT|O_EXCL, 0640) = -1 EEXIST (File exists)
open("/var/cache/abrt/ccpp-1255924492-2006.lock", O_RDONLY) = 7
read(7, ""..., 15) = 0
close(7) = 0
[...]
open("/var/cache/abrt/ccpp-1255924492-2006.lock", O_WRONLY|O_CREAT|O_EXCL, 0640^C <unfinished ...>
Something left an empty lock file there, and this is exactly what we hoped would never happen:

    while ((fd = open(pLockFile, O_WRONLY | O_CREAT | O_EXCL, 0640)) < 0)
    {
        ...
        int r = read(fd, pid_buf, sizeof(pid_buf) - 1);
        close(fd);
        if (r == 0)
        {
            /* Other process did not write out PID yet.
             * We HOPE it did not crash... */
            continue;
        }
        ...
    }
    write(fd, pPID, len);

Creating the lock file and writing the PID into it are two separate steps, so a process that dies between them leaves an empty file behind, and every later reader then spins on "continue" forever. The question is how that happened here, but it would not be easy to find out now.

I have a patch which uses symlinks instead of short ordinary files. Symlinks can be created atomically, avoiding the possibility of an "empty file". I will apply it to abrt git now.
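For readers following along, here is a minimal sketch of how a symlink-based lock can avoid the empty-file state. It is not the code from the abrt patch: the function name try_symlink_lock, the return convention, and the stale-lock check via kill(pid, 0) are assumptions made for illustration only. The idea is that symlink() creates the name and stores the owner's PID (as the link target) in a single call, so no other process can ever see a lock with no PID in it.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, NOT taken from the abrt patch.
 * Returns 1 if we took the lock, 0 if it is held (or still contended),
 * -1 on an unexpected error. */
static int try_symlink_lock(const char *lock_path)
{
    char pid_str[32];
    snprintf(pid_str, sizeof(pid_str), "%ld", (long)getpid());

    for (int attempt = 0; attempt < 2; attempt++) {
        /* One call creates the lock AND records the owner PID as the target. */
        if (symlink(pid_str, lock_path) == 0)
            return 1;
        if (errno != EEXIST)
            return -1;

        /* Lock exists: read the owner PID back out of the link target. */
        char owner[32];
        ssize_t n = readlink(lock_path, owner, sizeof(owner) - 1);
        if (n < 0) {
            if (errno == ENOENT)
                continue;           /* owner released it meanwhile; retry */
            return -1;              /* e.g. EINVAL: not a symlink at all */
        }
        owner[n] = '\0';

        /* Assumption: a lock is stale if its PID no longer names a process. */
        long owner_pid = strtol(owner, NULL, 10);
        if (owner_pid > 0 &&
            (kill((pid_t)owner_pid, 0) == 0 || errno == EPERM))
            return 0;               /* owner is alive, lock is really held */

        /* Stale lock: remove it and try once more. */
        if (unlink(lock_path) != 0 && errno != ENOENT)
            return -1;
    }
    return 0;
}
```

A caller would release the lock with unlink(lock_path). Note that the stale-lock removal in this sketch is itself racy (two processes may both decide the lock is stale, and one can unlink the other's fresh lock), so it only illustrates the atomic-creation point, not a complete locking scheme.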
For blocker evaluation purposes: Denys, do you think it'd be safe/wise to apply this fix in F12 final?

-- Fedora Bugzappers volunteer triage team https://fedoraproject.org/wiki/BugZappers
For the record, the patch for this is: http://git.fedorahosted.org/git/abrt.git?p=abrt.git;a=commit;h=01057ae36d686d8202547b9ff45bd1635415d13c
(In reply to comment #2)
> for blocker evaluation purposes: denys, do you think it'd be safe/wise to apply
> this fix in F12 final?

I think at this point abrt git should go to the repository almost daily. It is not stable yet. IOW: it has so many bugs that git almost always works better (has fewer bugs) than anything from three days ago. There is almost no "stability" to be gained by holding back abrt releases.
Anyone interested in testing this can try the latest abrt build from my repo. You can find the repo file at: http://jmoskovc.fedorapeople.org/abrt-rawhide.repo

Thanks, Jirka
Comment #4 makes me uncomfortable about shipping abrt by default, but point taken. Matthias, can you test the build from Jiri's repo and confirm it fixes this for you? If Matthias confirms, Jiri + Denys, please send a build to Koji and a tag request for final ASAP. Thanks!
I have installed the build from the abrt-rawhide repo, and have not seen the 100% cpu problem since. Will keep monitoring it over the weekend.
Matthias, were there any SELinux warnings or denials when this happened?

Thanks, Jirka
Sorry, I can't say. And the 100% cpu eating hasn't reoccurred since updating to the abrt-rawhide repo.
I've been running the updated abrt for a few hours and have seen no problems with it.
Given that abrt-0.11 has been tagged, should we close this? I'm at least marking this as modified.
It only just got tagged; that's what I was waiting on before changing it. I think the current feedback is enough to close it, yes. If anyone sees the CPU usage bug come back, they can re-open.