Description of problem:

The latest version of audit-libs has started using capng_have_capability() before sending audit messages. This has in turn exposed a bug in libcap-ng's use of thread-local state.

Specifically, if you call any capng_* function, the static __thread struct cap_ng m gets initialized with the current PID. If you then fork() and call another capng API in the child (for example, because you're trying to drop caps for a child process you're spawning), this thread-local state doesn't get reset. As a result, capng_apply() tries to set capabilities on the parent PID and gets an error.

strace clearly shows the problem:

  [pid 26690] capset({_LINUX_CAPABILITY_VERSION_3, 26689}, {0, 0, 0}) = -1 EPERM (Operation not permitted)

Note that the process is running as pid 26690 but is setting caps on 26689.

AFAICT, this bug has existed forever. The new audit-libs 2.6.1 has just started calling capng_have_capability(), which has in turn broken libvirt, which uses capng when spawning processes after fork.

The following demo shows the problem:

  #include <libaudit.h>
  #include <cap-ng.h>
  #include <stdio.h>
  #include <unistd.h>
  #include <sys/wait.h>

  int main(int argc, char **argv)
  {
      int fd = audit_open();
      if (fd < 0) {
          perror("audit_open");
          return 1;
      }

      /* Sending an audit message makes audit-libs call into capng,
       * which initializes the thread-local state in the parent. */
      audit_log_user_message(fd, AUDIT_VIRT_CONTROL, "test", NULL, NULL, NULL, 1);
      close(fd);

      pid_t child = fork();
      if (child == 0) {
          /* The child inherits the state, including the parent's PID. */
          capng_clear(CAPNG_SELECT_CAPS);
          if (capng_apply(CAPNG_SELECT_CAPS) < 0) {
              perror("capng_apply");
              _exit(1);
          }
          _exit(0);
      }

      int status = 0;
      waitpid(child, &status, 0);
      fprintf(stderr, "Child exited %d\n", status);
      return 0;
  }

  $ gcc -lcap-ng -laudit -o cap cap.c
  [berrange@t530wlan ~]$ ./cap
  capng_apply: Operation not permitted
  Child exited 256

If you downgrade to audit-libs-2.6 the problem goes away, though clearly the bug in cap-ng still exists.

Version-Release number of selected component (if applicable):

audit-libs-2.6.1-1.fc24.x86_64
audit-libs-2.6.1-1.fc24.i686
libcap-ng-0.7.7-4.fc24.x86_64
libcap-ng-0.7.7-4.fc24.i686
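The mechanism described in the report can be reproduced without libcap-ng or libaudit. The following is only a minimal sketch: struct fake_state, cached_pid() and child_sees_stale_pid() are hypothetical names invented for illustration and are not libcap-ng's real internals. It assumes GCC or Clang on Linux (for __thread) and shows a lazily initialized thread-local that records the PID on first use and is then inherited, already initialized, by a forked child.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a library's per-thread state (not the real struct cap_ng). */
struct fake_state {
    int initialized;
    pid_t pid;          /* PID recorded on first use */
};

static __thread struct fake_state m;

/* Record the caller's PID on first use only, then keep returning the cached value. */
static pid_t cached_pid(void)
{
    if (!m.initialized) {
        m.pid = getpid();
        m.initialized = 1;
    }
    return m.pid;
}

/*
 * Returns 1 if a forked child still sees the parent's PID in the cache,
 * 0 if it does not, and -1 on error.
 */
static int child_sees_stale_pid(void)
{
    pid_t parent = cached_pid();        /* first use happens in the parent */
    assert(parent == getpid());

    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        /* The child gets a copy of the already-initialized state. */
        int stale = (cached_pid() == parent && cached_pid() != getpid());
        _exit(stale ? 1 : 0);
    }

    int status = 0;
    if (waitpid(child, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

In this sketch, child_sees_stale_pid() is expected to return 1: nothing resets the cached PID after fork(), which mirrors the capset() call on the parent's PID seen in the strace output.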
As for the libvirt issue, the audit package has been fixed to save and restore state in upstream commit 1304. A new audit package will be released shortly. See bz 1351954.
Created attachment 1174946: Use pthread_atfork to reset global state
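The idea named in the attachment title can be sketched as follows. This is not the attached patch: fake_state, reset_state(), cached_pid() and child_sees_own_pid() are hypothetical names used only for illustration. The sketch assumes a POSIX system where pthread_atfork() is available (older glibc versions may need -pthread at link time) and registers a child-side fork handler that marks the thread-local state as uninitialized, so the child re-reads its own PID on next use.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a library's per-thread state (not the real struct cap_ng). */
struct fake_state {
    int initialized;
    pid_t pid;
};

static __thread struct fake_state m;
static pthread_once_t atfork_once = PTHREAD_ONCE_INIT;

/* Child-side fork handler: runs in the forking thread of the new process. */
static void reset_state(void)
{
    m.initialized = 0;
}

/* Register the handler; called exactly once via pthread_once(). */
static void register_reset(void)
{
    pthread_atfork(NULL, NULL, reset_state);
}

/* Record the caller's PID on first use, or on first use after a fork. */
static pid_t cached_pid(void)
{
    pthread_once(&atfork_once, register_reset);
    if (!m.initialized) {
        m.pid = getpid();
        m.initialized = 1;
    }
    return m.pid;
}

/* Returns 1 if a forked child sees its own PID in the cache, 0 if not, -1 on error. */
static int child_sees_own_pid(void)
{
    assert(cached_pid() == getpid());   /* initialize the state in the parent */

    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0)
        _exit(cached_pid() == getpid() ? 1 : 0);

    int status = 0;
    if (waitpid(child, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Because the handler only clears a flag, it does no allocation or locking in the child. Note that fork handlers run only for fork() itself; a process created by a raw clone() call would not get the reset.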
This bug appears to have been reported against 'rawhide' during the Fedora 25 development cycle. Changing version to '25'.
Fixed in upstream commit 7759e6f. Going to close this as upstream. Daniel, thanks for pointing out the bug and a potential solution. What was committed uses pthread_atfork, referenced as a weak symbol, to fix things for people using pthreads. It also documents the issue in a man page that explains the problem and 2 potential solutions.