Description of problem:
RHCS 2 core dumps are not generated when running as the ceph user on Ubuntu, likely because the setuid call clears the PR_SET_DUMPABLE flag. The issue affects all daemons: MONs, OSDs, RGWs and MDSs.

Version-Release number of selected component (if applicable):
ceph version 10.2.5-26redhat1xenial (d11a516e27459d970ff00b54315f5ba66185f046)

How reproducible:
Always

Steps to Reproduce:
Set DefaultLimitCORE=infinity in /etc/systemd/system.conf

$ sudo su - ceph
$ ps aux | grep ceph
$ strace -p 12638
strace: attach: ptrace(PTRACE_ATTACH, ...): Operation not permitted
Could not attach to process. If your uid matches the uid of the target
process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
again as the root user. For more details, see /etc/sysctl.d/10-ptrace.conf
$ gcore 12638
Could not attach to process. If your uid matches the uid of the target
process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
again as the root user. For more details, see /etc/sysctl.d/10-ptrace.conf
ptrace: Operation not permitted.
You can't do that without a process to debug.
The program is not being run.
gcore: failed to create core.12638

Expected results:
A core should be generated, and strace and gcore should work.

Additional info:
This works as expected on RHEL, i.e. core dumps are generated there. See also bug https://bugzilla.redhat.com/show_bug.cgi?id=1389159.
Can you generate core dumps as root, rather than as the ceph user? If so, that seems like a good enough workaround for 2.2.
strace and gcore still work as root.
OK, let's take a look at this. The following simulates what we do in the ceph daemons.

# groupadd ceph -g 167 -o -r
# useradd ceph -u 167 -o -r -g ceph
# cat << EOF >> sleeper.cpp
#include <sys/types.h>
#include <unistd.h>
#include <sys/prctl.h>
#include <iostream>
#include <cerrno>

int main()
{
    int uid = 167;
    int gid = 167;
    if (setgid(gid) != 0)
        std::cerr << "unable to setgid " << gid << ": " << errno << std::endl;
    if (setuid(uid) != 0)
        std::cerr << "unable to setuid " << uid << ": " << errno << std::endl;
    if (prctl(PR_SET_DUMPABLE, 1) == -1)
        std::cerr << "warning: unable to set dumpable flag: " << errno << std::endl;
    sleep(3600);
    return 0;
}
EOF
# g++ -g -o sleeper sleeper.cpp
# strace -esetuid,prctl ./sleeper
setuid(167) = 0
prctl(PR_SET_DUMPABLE, 1) = 0
^Cstrace: Process 7567 detached
# ./sleeper &
[1] 7568
# ps auwwx|grep 7568
ceph 7568 0.0 0.0 13268 1524 pts/0 S 01:50 0:00 ./sleeper

So that is doing just what we want, it has called setuid to switch to the ceph user and called prctl to set itself dumpable.

# su - ceph
No directory, logging in with HOME=/
$ strace -p 7568
strace: attach: ptrace(PTRACE_ATTACH, ...): Operation not permitted
Could not attach to process. If your uid matches the uid of the target
process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
again as the root user. For more details, see /etc/sysctl.d/10-ptrace.conf

Performing exactly the same procedure on CentOS or RHEL works as expected.

Now, what is it trying to tell us?

$ cat /proc/sys/kernel/yama/ptrace_scope
1
$ cat /etc/sysctl.d/10-ptrace.conf
# The PTRACE system is used for debugging. With it, a single user process
# can attach to any other dumpable process owned by the same user. In the
# case of malicious software, it is possible to use PTRACE to access
# credentials that exist in memory (re-using existing SSH connections,
# extracting GPG agent information, etc).
#
# A PTRACE scope of "0" is the more permissive mode. A scope of "1" limits
# PTRACE only to direct child processes (e.g. "gdb name-of-program" and
# "strace -f name-of-program" work, but gdb's "attach" and "strace -fp $PID"
# do not). The PTRACE scope is ignored when a user has CAP_SYS_PTRACE, so
# "sudo strace -fp $PID" will work as before. For more details see:
# https://wiki.ubuntu.com/SecurityTeam/Roadmap/KernelHardening#ptrace
#
# For applications launching crash handlers that need PTRACE, exceptions can
# be registered by the debugee by declaring in the segfault handler
# specifically which process will be using PTRACE on the debugee:
# prctl(PR_SET_PTRACER, debugger_pid, 0, 0, 0);
#
# In general, PTRACE is not needed for the average running Ubuntu system.
# To that end, the default is to set the PTRACE scope to "1". This value
# may not be appropriate for developers or servers with only admin accounts.
kernel.yama.ptrace_scope = 1

# echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
0
# su - ceph
No directory, logging in with HOME=/
$ strace -p 7568
strace: Process 7568 attached
restart_syscall(<... resuming interrupted nanosleep ...>^Cstrace: Process 7568 detached
 <detached ...>
$ cd /tmp/
$ gcore 7568
/root/sleeper: Permission denied.
warning: Memory read failed for corefile section, 4096 bytes at 0x72c60000.
warning: Memory read failed for corefile section, 4096 bytes at 0x72f68000.
warning: Memory read failed for corefile section, 4096 bytes at 0x72f69000.
warning: Memory read failed for corefile section, 16384 bytes at 0x73329000.
warning: Memory read failed for corefile section, 8192 bytes at 0x7332d000.
warning: Memory read failed for corefile section, 16384 bytes at 0x7332f000.
warning: Memory read failed for corefile section, 40960 bytes at 0x736a5000.
warning: Memory read failed for corefile section, 8192 bytes at 0x736af000.
warning: Memory read failed for corefile section, 16384 bytes at 0x736b1000.
warning: Memory read failed for corefile section, 20480 bytes at 0x738cd000.
warning: Memory read failed for corefile section, 8192 bytes at 0x738d8000.
warning: Memory read failed for corefile section, 4096 bytes at 0x738da000.
warning: Memory read failed for corefile section, 4096 bytes at 0x738db000.
warning: Memory read failed for corefile section, 4096 bytes at 0x738dc000.
warning: Memory read failed for corefile section, 135168 bytes at 0x5a7b2000.
warning: Memory read failed for corefile section, 8192 bytes at 0x5a7f9000.
warning: Memory read failed for corefile section, 4096 bytes at 0xff600000.
Saved corefile core.7568
$ file core.7568
core.7568: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style

This is normal and expected behaviour for an Ubuntu Xenial system AFAICT (although I know very little about Ubuntu) and represents a configuration issue, not a problem with ceph code.

Please test this after issuing "echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope" and, if that works, we can close this NOTABUG.
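
For anyone checking a node before and after that change, here is a minimal shell sketch that reads the current Yama ptrace_scope and prints what it means for attaching as the ceph user. The function name describe_ptrace_scope is made up for illustration. The meanings of values 2 and 3 do not appear in this report; they are taken from the kernel's Yama documentation, so treat them as an assumption if your kernel differs.

```shell
#!/bin/sh
# Hypothetical helper: map a Yama ptrace_scope value to a short description.
# Values 0 and 1 match the 10-ptrace.conf comments quoted in this bug;
# values 2 and 3 follow the kernel's Yama documentation (assumption here).
describe_ptrace_scope() {
    case "$1" in
        0) echo "classic: same-uid attach to dumpable processes allowed" ;;
        1) echo "restricted: attach only to descendants or declared tracers" ;;
        2) echo "admin-only: attach requires CAP_SYS_PTRACE" ;;
        3) echo "disabled: no attach permitted" ;;
        *) echo "unknown value: $1" ;;
    esac
}

scope_file=/proc/sys/kernel/yama/ptrace_scope

# The file only exists on kernels built with the Yama LSM.
if [ -r "$scope_file" ]; then
    scope=$(cat "$scope_file")
    echo "ptrace_scope=$scope ($(describe_ptrace_scope "$scope"))"
else
    echo "Yama ptrace_scope is not available on this kernel"
fi
```

Note that writing 0 to /proc/sys/kernel/yama/ptrace_scope only changes the running kernel. The value that is applied at boot is the kernel.yama.ptrace_scope line in /etc/sysctl.d/10-ptrace.conf, so the script will report 1 again after a reboot unless that file is changed too.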