Description of problem:

Vdsm overrides /proc/sys/kernel/core_pattern during startup, setting it to:

    |/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t %e %i

But /usr/libexec/abrt-hook-ccpp is not installed on Fedora 26 and later, so when a program crashes, the core dump is dropped.

This was added in this commit:

    commit 893ac2a4d610791e26f6debdab8b06f8c36bc18d
    Author: Yeela Kaplan <ykaplan>
    Date:   Mon Jul 6 18:27:47 2015 +0300

        Adding abrt dependency and introduce configurator for it

This is wrong; core_pattern is configured using /usr/lib/sysctl.d/*.conf, and on Fedora it is configured by:

    $ cat /usr/lib/sysctl.d/50-coredump.conf
    ...
    kernel.core_pattern=|/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h %e

Removing the bad configuration restores core dumps.

Version-Release number of selected component (if applicable):
v4.40.0

How reproducible:
Always

Steps to Reproduce:
1. Run a VM
2. Kill the VM:

    kill -ABRT pid

Actual results:
No coredump is generated

Expected results:
A coredump is generated

Setting to urgent since we must have coredumps to debug issues with qemu.
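The failure mode above (a piped core_pattern whose helper binary is not installed) can be detected with a small check. This is only a sketch, not what vdsm does: the function names `split_core_pattern`, `handler_is_usable` and `read_core_pattern` are hypothetical, and it assumes the kernel convention that a pattern starting with `|` names a helper program followed by whitespace-separated arguments.

```python
import os

CORE_PATTERN_PATH = "/proc/sys/kernel/core_pattern"


def split_core_pattern(pattern):
    """Return (handler, args) for a piped core_pattern, else (None, [])."""
    pattern = pattern.strip()
    if not pattern.startswith("|"):
        # A plain pattern is a core file name template, not a helper program.
        return None, []
    parts = pattern[1:].split()
    if not parts:
        return None, []
    return parts[0], parts[1:]


def handler_is_usable(pattern):
    """True if the pattern is not piped, or its helper exists and is executable."""
    handler, _args = split_core_pattern(pattern)
    if handler is None:
        return True
    return os.path.isfile(handler) and os.access(handler, os.X_OK)


def read_core_pattern(path=CORE_PATTERN_PATH):
    """Read the pattern currently configured in the running kernel."""
    with open(path) as f:
        return f.read().strip()
```

On a host, `handler_is_usable(read_core_pattern())` returning False would correspond to the situation described here, where core dumps are silently dropped.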
Testing RHEL 8.1

We depend on abrt-addon-ccpp in the vdsm spec:

    Requires: abrt-addon-ccpp

And abrt-ccpp.service is running:

    # systemctl status abrt-ccpp
    ● abrt-ccpp.service - Install ABRT coredump hook
       Loaded: loaded (/usr/lib/systemd/system/abrt-ccpp.service; enabled; vendor preset: enabled)
       Active: active (exited) since Wed 2020-01-08 18:37:50 IST; 6min ago
      Process: 968 ExecStart=/usr/sbin/abrt-install-ccpp-hook install (code=exited, status=0/SUCCESS)
     Main PID: 968 (code=exited, status=0/SUCCESS)

    Jan 08 18:37:50 host3 systemd[1]: Starting Install ABRT coredump hook...
    Jan 08 18:37:50 host3 systemd[1]: Started Install ABRT coredump hook.

But core_pattern is configured to use coredumpctl:

    $ cat /proc/sys/kernel/core_pattern
    |/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h %e

Restarting abrt-ccpp changes the core pattern:

    # systemctl start abrt-ccpp.service
    # systemctl status abrt-ccpp.service
    ● abrt-ccpp.service - Install ABRT coredump hook
       Loaded: loaded (/usr/lib/systemd/system/abrt-ccpp.service; enabled; vendor preset: enabled)
       Active: active (exited) since Wed 2020-01-08 18:02:32 IST; 2s ago
      Process: 7013 ExecStop=/usr/sbin/abrt-install-ccpp-hook uninstall (code=exited, status=0/SUCCESS)
      Process: 7018 ExecStart=/usr/sbin/abrt-install-ccpp-hook install (code=exited, status=0/SUCCESS)
     Main PID: 7018 (code=exited, status=0/SUCCESS)

    # cat /proc/sys/kernel/core_pattern
    |/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t %P %I %h %e

So it looks like abrt-ccpp.service is broken on RHEL 8.1.

To use abrt-ccpp on RHEL 8.1 we can install a sysctl drop-in configuration (only on RHEL 8.1):

    # cat /usr/lib/sysctl.d/60-vdsmd.conf
    # Install by vdsm
    kernel.core_pattern=|/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t %P %I %h %e

This is kind of ugly, because this is what abrt-ccpp.service should do, or at least it should be handled by abrt-ccpp, but it is a quick fix that lets us continue to use abrt on RHEL 8.1.

But all this trouble tells me that we need to move to systemd-coredump also on RHEL 8.1.
This is probably another RHEL 8 porting task that we missed, because we did not follow upstream changes in Fedora. If we want to keep using abrt-ccpp on RHEL 8.1, we need to file an abrt bug for this.
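A rough sketch of why a 60-vdsmd.conf drop-in would win over 50-coredump.conf at boot: sysctl.d files are applied in lexical order of their file names, so a later file's setting replaces an earlier one. The function name `effective_sysctl` and its dict-of-file-contents input are hypothetical, chosen for illustration; directory precedence, masking, and anything a service writes to /proc afterwards are deliberately ignored.

```python
def effective_sysctl(dropins, key="kernel.core_pattern"):
    """Return (value, source_file) for `key`, or (None, None) if unset.

    `dropins` maps a drop-in file name to its text content. Files are
    processed in lexical order of their names; the last assignment wins.
    """
    value, source = None, None
    for name in sorted(dropins):
        for line in dropins[name].splitlines():
            line = line.strip()
            # Skip blank lines and comments ('#' or ';').
            if not line or line.startswith(("#", ";")):
                continue
            k, sep, v = line.partition("=")
            if sep and k.strip() == key:
                value, source = v.strip(), name
    return value, source


# File contents as quoted in this thread.
DROPINS = {
    "50-coredump.conf": (
        "kernel.core_pattern=|/usr/lib/systemd/systemd-coredump "
        "%P %u %g %s %t %c %h %e\n"
    ),
    "60-vdsmd.conf": (
        "# Install by vdsm\n"
        "kernel.core_pattern=|/usr/libexec/abrt-hook-ccpp "
        "%s %c %p %u %g %t %P %I %h %e\n"
    ),
}
```

With both files present, `effective_sysctl(DROPINS)` reports the value from 60-vdsmd.conf; with only 50-coredump.conf, the systemd-coredump pattern stays in effect.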
> And abrt-ccpp.service is running
> ...
> But core_pattern is configured to use coredumpctl
> ...
> Restarting abrt-ccpp changes the core pattern
> ...
> So looks like abrt-ccpp.service is broken on RHEL 8.1.

Not really - this is simply how abrt-ccpp.service works. A fragment of the 'abrt-ccpp.service' file:

    [Service]
    Type=oneshot
    ExecStart=/usr/sbin/abrt-install-ccpp-hook install
    ExecStop=/usr/sbin/abrt-install-ccpp-hook uninstall
    RemainAfterExit=yes

It's a 'oneshot' service: it installs the hook and says goodbye. Systemd doesn't give up so easily though, and on many occasions it will try to restore its original coredump handler. Therefore, per the sysctl.d manual, we also need to mask systemd's '50-coredump.conf' configuration by creating a symlink '/etc/sysctl.d/50-coredump.conf' pointing to '/dev/null'.

>
> To use abrt-ccpp on RHEL 8.1 we can install a sysctl drop-in configuration
> (only on RHEL 8.1):
> ...

Turns out that core dumps are currently also broken on RHEL/CentOS! This is because the pattern that we used to inject in 'vdsmd_init_common.sh' is wrong (some % fields added by newer abrt versions were missing). This only strengthens my belief that we should definitely not try to define the core pattern ourselves.

I wrote a test module for the OST basic suite that crashes a VM on purpose and tests whether a core dump is generated, so that we have no regressions in this area.

> But all this trouble tells me that we need to move to systemd-coredump also
> on RHEL 8.1. This is probably another RHEL 8 porting task that we missed,
> because we did not follow upstream changes in Fedora.

Given the rather short timeline to 4.4 GA, I think we should stick with the known-and-tried abrt-ccpp for now, but switch to systemd-coredump in 4.5.
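To make the "missing % fields" point concrete, here is a small sketch that compares the specifiers of two core patterns. It assumes that the pattern quoted in the bug description is the one that was injected by 'vdsmd_init_common.sh', and takes the pattern installed by abrt-install-ccpp-hook on RHEL 8.1 from the earlier comment. The constant and function names are hypothetical.

```python
import re

# Assumed to be the pattern vdsm used to inject (quoted in the description).
OLD_VDSM_PATTERN = "|/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t %e %i"
# Pattern installed by abrt-install-ccpp-hook on RHEL 8.1 (quoted above).
ABRT_RHEL81_PATTERN = (
    "|/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t %P %I %h %e"
)


def specifiers(pattern):
    """Return the % specifier letters of a core pattern, in order."""
    return re.findall(r"%(.)", pattern)


def missing_specifiers(ours, reference):
    """Specifiers present in `reference` but absent from `ours`."""
    have = set(specifiers(ours))
    return [s for s in specifiers(reference) if s not in have]


print(missing_specifiers(OLD_VDSM_PATTERN, ABRT_RHEL81_PATTERN))
# prints ['P', 'I', 'h']
print(missing_specifiers(ABRT_RHEL81_PATTERN, OLD_VDSM_PATTERN))
# prints ['i']
```

A hard-coded pattern drifts out of sync like this whenever the hook's argument list changes, which is one more reason to let abrt (or systemd) own the setting.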
There is actually a bug filed for abrt on this issue [1], but since everyone else wants to jump on the 'systemd-coredump' train, it didn't get much attention. After some chat with the 'abrt' maintainers, it looks like it's probably too late to get it fixed on their side before 4.4 GA, but it is a possibility for the future.

In the long term we should also switch to systemd-coredump, but from what I've learned, the feature that avoids core dump duplication is still at the planning stage. It would be nice to influence the 'abrt' team on this so that it is prioritized appropriately.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1657158
Verified in vdsm-4.40.13-1.el8ev.x86_64
This bugzilla is included in oVirt 4.4.0 release, published on May 20th 2020. Since the problem described in this bug report should be resolved in oVirt 4.4.0 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.