Description of problem:
When the gluster-blockd process crashes, the core file is generated in the / directory. However, in the rhgs-container image the / directory is not persistent. We should point all core files to a persistent directory, probably /var/log/glusterfs/gluster-block?

Version-Release number of selected component (if applicable):
gluster-block-0.2.1-18.el7rhgs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Crash gluster-blockd

Actual results:
The core file is generated in the / directory.

Expected results:
The core file should be in a persistent directory, probably /var/log/glusterfs/gluster-block.

Additional info:
The selected approach is that the administrator has to configure kernel.core_pattern on the systems themselves. Because this is a system-wide configuration, it is not appropriate for CNS to configure a default (that may be different from other components).

The CNS containers should now contain a /var/log/core/ directory. Administrators can take the following steps to capture cores:

1. sysctl -w kernel.core_pattern=/var/log/core/%e_%p
   (%e and %p are just examples, see 'man 5 core' for more options)

   Now the admin can copy out the core after a segfault.

2. Configure a bind-mount so that /var/log/core/ is persistent on the host.

   With this, even if the container exits, the core stays available (the core_pattern probably needs to include other %-options, maybe %t?).

Karthick, what were the exact steps that you tried to verify the change?
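For illustration, the two steps above could be sketched as a small shell helper. This is only a sketch under stated assumptions: the function names and the CORE_DIR variable are made up for this example, and adding %t (time of the dump in seconds since the Epoch, per 'man 5 core') is just one possible way to keep file names unique across container restarts.

```shell
#!/bin/sh
# Sketch only: CORE_DIR and both function names are illustrative assumptions,
# not something shipped in the rhgs-server container.
CORE_DIR="${CORE_DIR:-/var/log/core}"

# Print a core_pattern value for the given directory.
# %e = executable name, %p = PID, %t = time of dump (seconds since the Epoch).
core_pattern() {
    printf '%s/%%e_%%p_%%t\n' "$1"
}

# Apply the pattern. This needs root and is system-wide: it affects every
# process on the host, not only gluster-blockd.
apply_core_pattern() {
    sysctl -w "kernel.core_pattern=$(core_pattern "$CORE_DIR")"
}

# Step 2 (bind-mount) depends on how the container is started. With plain
# docker it might look like the following (assumed invocation, adjust as needed):
#   docker run -v /var/log/core:/var/log/core ... <rhgs-server image>
```

On a host, an administrator would run apply_core_pattern as root and then check the result with 'sysctl -n kernel.core_pattern'. In an OpenShift-based deployment the bind-mount would more likely be expressed as a hostPath volume in the pod template rather than a docker option; that part is an assumption and is not shown here.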
(In reply to Niels de Vos from comment #16)
> The selected approach is that the administrator has to configure the
> kernel.core_pattern on the systems themselves. Because this is a system-wide
> configuration, it is not appropriate for CNS to configure a default (that
> may be different from other components).
>
> The CNS containers now should contain a /var/log/core/ directory.
> Administrators can take the following steps to capture cores:
>
> 1. sysctl -w kernel.core_pattern=/var/log/core/%e_%p
> (%e and %p are just examples, see 'man 5 core' for more options)
>
> Now the admin can copy-out the core after a segfault.
>
> 2. configure a bind-mount so that /var/log/core/ is persistent on the host

Is this directory not bind-mounted by default? I don't see a bind-mount for it in the latest rhgs-server-container image.

> With this, even if the container exits, the core stays available (probably
> need to change the core_pattern to include other %-options, maybe %t?).
>
> Karthick, what were the exact steps that you tried to verify the change?

What is the fix that we are providing as part of this bug? Is it just the /var/log/core/ directory being created within the container, with the rest being configuration changes that we are suggesting? I believe we should at least provide a fix that bind-mounts the /var/log/core/ directory automatically.

Finally, the information on how to configure core dumps needs to be documented, or our documentation should point to an existing document.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:2688
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days