Bug 1809179

Summary: [RHEL-7.9] Dump doesn't automatically start; gives error "kdump: error: Dump target is not mounted."
Product: Red Hat Enterprise Linux 7
Component: kexec-tools
Version: 7.7
Hardware: x86_64
OS: Linux
Status: CLOSED WONTFIX
Severity: medium
Priority: unspecified
Reporter: Steve Bonds <ij2fdc402>
Assignee: Pingfan Liu <piliu>
QA Contact: Emma Wu <xiawu>
CC: kdump-bugs, piliu, ruyang, xiawu
Target Milestone: rc
Bug Blocks: 1653509
Type: Bug
Last Closed: 2021-09-02 07:27:04 UTC

Description Steve Bonds 2020-03-02 14:38:56 UTC
Description of problem:

When a crash is triggered, the dump fails to start and gives the error:

         Starting Kdump Vmcore Save Service...
    kdump: dump target is 
    kdump: error: Dump target  is not mounted.
    kdump: saving vmcore failed
    [FAILED] Failed to start Kdump Vmcore Save Service.
    See 'systemctl status kdump-capture.service' for details.

The XFS mount process can be seen to start:

    Found device /dev/mapper/vgroot-crash_lv.
             Starting File System Check on /dev/mapper/vgroot-crash_lv...
    Started File System Check on /dev/mapper/vgroot-crash_lv.
    systemd-fsck[488]: /sbin/fsck.xfs: XFS file system.

    Started dracut initqueue hook.
    Reached target Remote File Systems (Pre).
    Reached target Initrd Root File System.
         Starting Reload Configuration from the Real Root...
         Mounting /kdumproot/var/crash...
    Reached target Remote File Systems.
    Started Reload Configuration from the Real Root.
    Reached target Initrd File Systems.
    Reached target Initrd Default Target.
        SGI XFS with ACLs, security attributes, realtime, no debug enabled
    Starting dracut pre-pivot and cleanup hook...
    XFS (dm-0): Mounting V4 Filesystem
    Started dracut pre-pivot and cleanup hook.
         Starting Kdump Vmcore Save Service...
    kdump: dump target is 
    kdump: error: Dump target  is not mounted.
    kdump: saving vmcore failed
    [FAILED] Failed to start Kdump Vmcore Save Service.

At this point it is unclear why the dump fails to start. Setting `default=shell` and then starting the same service manually from the kdump shell works fine.
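
For reference, the kdump configuration for this scenario might look like the following. This is an illustrative reconstruction from the logs above, not the reporter's actual file; note that the RHEL 7 kdump.conf directive syntax is space-separated, i.e. `default shell`:

    # /etc/kdump.conf (illustrative excerpt)
    # Dump target: the XFS LV seen mounting at /kdumproot/var/crash above.
    xfs /dev/mapper/vgroot-crash_lv
    path /var/crash
    # On capture failure, drop to a shell in the kdump initramfs so the
    # mount and dump can be retried by hand.
    default shell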

Version-Release number of selected component (if applicable):

Name        : kexec-tools
Version     : 2.0.15
Release     : 33.el7

How reproducible:

Not especially. :-)

On certain affected servers it seems to happen regularly; on other, supposedly identical servers it never seems to happen. This is probably due to an upstream issue where mounting the /var/crash area reports success some short time before the mount is actually ready. While the cause may remain unknown, there are ways to make the dump process more resilient when this or a similar issue occurs.

So far this has only been observed on XFS filesystems.

Steps to Reproduce:

1. (on an affected server)
2. initiate the crash process with SysRq or NMI
3. watch crash kernel start
4. watch crash kernel fail to capture dump

Actual results:

   kdump: dump target is 
   kdump: error: Dump target  is not mounted.
   kdump: saving vmcore failed

Expected results:

   kdump: dump target is /dev/mapper/vgroot-crash_lv
   kdump: saving to /kdumproot/var/crash///127.0.0.1-2020-02-29-19:17:56/
   kdump: saving vmcore-dmesg.txt
   kdump: saving vmcore-dmesg.txt complete
   kdump: saving vmcore
   The kernel version is not supported.
   The makedumpfile operation may be incomplete.

   Copying data
   ...

Additional info:

While there may be multiple possible causes for the delay in mounting the crash destination, fixing this doesn't require actually finding the cause.

One fix is the classic fix for all race conditions: add a delay. For example, in the SRPM the "dracut-kdump.sh" file could be modified as follows:

BEFORE:

   #!/bin/sh
   
   # continue here only if we have to save dump.
   if [ -f /etc/fadump.initramfs ] && [ ! -f /proc/device-tree/rtas/ibm,kernel-dump ]; then
           exit 0
   fi
   
   exec &> /dev/console
   . /lib/dracut-lib.sh
   . /lib/kdump-lib-initramfs.sh

   set -o pipefail
   DUMP_RETVAL=0
   
   export PATH=$PATH:$KDUMP_SCRIPT_DIR

AFTER (add a sleep):

   #!/bin/sh
   
   # continue here only if we have to save dump.
   if [ -f /etc/fadump.initramfs ] && [ ! -f /proc/device-tree/rtas/ibm,kernel-dump ]; then
           exit 0
   fi
   
   # Avoid upstream race condition
   echo "kdump wait to avoid race conditions" > /dev/console
   sleep 60

   exec &> /dev/console
   . /lib/dracut-lib.sh
   . /lib/kdump-lib-initramfs.sh

   set -o pipefail
   DUMP_RETVAL=0
   
   export PATH=$PATH:$KDUMP_SCRIPT_DIR

Another possible fix would be to add retries to the service start itself, so that a failed start is retried a few times with a delay between attempts (see the sketch below).
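
One way to express such retries would be a systemd drop-in for the capture service; below is a minimal sketch, assuming RHEL 7 (systemd 219) directive spellings, with the values purely illustrative. Two caveats: kdump-capture.service is a Type=oneshot unit, for which systemd rejects `Restart=`, so the unit type would also need changing, and the unit lives inside the kdump initramfs, so the override would have to be baked in when the initrd is regenerated.

    # retry.conf -- hypothetical drop-in for kdump-capture.service
    [Service]
    # Retry a failed capture every 30 seconds, allowing up to
    # 5 start attempts within a 300-second window.
    Restart=on-failure
    RestartSec=30
    StartLimitInterval=300
    StartLimitBurst=5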

A more targeted retry would be to adjust `kdump-lib-initramfs.sh` to check return codes and retry the following lines a limited number of times:

    local _dev=$(findmnt -k -f -n -r -o SOURCE $1)
    local _mp=$(findmnt -k -f -n -r -o TARGET $1)

    echo "kdump: dump target is $_dev"

    if [ -z "$_mp" ]; then
        echo "kdump: error: Dump target $_dev is not mounted."
        return 1
    fi

One possible example to add a local retry (note that writing `local _dev=$(...)` on one line would mask the `findmnt` exit status, because the shell reports the status of the `local` builtin itself, so the declaration has to be separated from the assignment for the `&& break` to work):

    local _dev _mp
    for try in 1 2 3; do _dev=$(findmnt -k -f -n -r -o SOURCE "$1") && break || sleep 10; done
    for try in 1 2 3; do _mp=$(findmnt -k -f -n -r -o TARGET "$1") && break || sleep 10; done
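
Equivalently, the retry could be factored into a small helper; a minimal sketch (the `retry` function is hypothetical, not part of kexec-tools):

    # Hypothetical helper: run a command up to 3 times, 10 seconds apart,
    # returning success as soon as one attempt succeeds.
    retry() {
        _try=1
        while [ "$_try" -le 3 ]; do
            "$@" && return 0
            echo "kdump: waiting for dump target (attempt $_try)" > /dev/console
            sleep 10
            _try=$((_try + 1))
        done
        return 1
    }

    local _dev _mp
    _dev=$(retry findmnt -k -f -n -r -o SOURCE "$1")
    _mp=$(retry findmnt -k -f -n -r -o TARGET "$1")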

Comment 3 Bhupesh Sharma 2020-03-16 20:34:49 UTC
Hi Steve,

(In reply to Steve Bonds from comment #0)

I remember discussing a similar problem with RHEL-7 kexec-tools upstream (see <http://lists.infradead.org/pipermail/kexec/2020-March/024603.html>), which was reported for a use case saving the vmcore to an iSCSI server.

But I think the root cause is the same:

- Basically, whenever a File System Check (fsck) starts on the underlying dump target (targeted for saving the vmcore) in the kdump kernel, we see kdump failures (either a 'dracut-initqueue timeout' starts, or kdump fails outright, as you shared).

- In all such cases, if we drop to the kdump shell (by specifying 'default=shell' in the kdump.conf configuration file) and manually mount the intended dump target, the mount succeeds and we can then save the vmcore manually (fsck was able to run in the meanwhile and the intended dump target can now be found).

So I think adding some kind of retry/timeout while trying to find the intended dump target might help, and adding it to `kdump-lib-initramfs.sh` probably makes the most sense.

I will do some debugging and get back with a suggestion/possible fix.

Thanks,
Bhupesh

Comment 5 Steve Bonds 2020-03-16 21:12:00 UTC
It seems unlikely that the delays in my specific case are fsck-related, because XFS doesn't actually run an fsck; it just prints a misleading message. (See https://bugzilla.redhat.com/show_bug.cgi?id=1546294) :-)

My suggestion would be to allow systemd to retry the failed crash steps several times. This looked like a good source for how to do that:

https://stackoverflow.com/questions/39284563/how-to-set-up-a-systemd-service-to-retry-5-times-on-a-cycle-of-30-seconds

Systemd retries would cover all possible failures to make the process more resilient.

To cover my specific issue, a targeted retry of the `findmnt` commands would be a nice addition to the general retry above.

I see you marked `needinfo`. What information can I provide?

Comment 6 Bhupesh Sharma 2020-03-17 20:47:43 UTC
(In reply to Steve Bonds from comment #5)
> It seems unlikely that the delays in my specific case are fsck-related
> because XFS doesn't actually do any fsck. Instead it prints a misleading
> error message. (See https://bugzilla.redhat.com/show_bug.cgi?id=1546294) :-)

Right, in this case it might be a misleading fsck-related debug print, but as I shared, we have had similar issues reported with other setups, such as iSCSI, as well.

> My suggestion would be to allow systemd to retry the failed crash steps
> several times. This looked like a good source for how to do that:
> 
> https://stackoverflow.com/questions/39284563/how-to-set-up-a-systemd-service-
> to-retry-5-times-on-a-cycle-of-30-seconds
> 
> Systemd retries would cover all possible failures to make the process more
> resilient.
> 
> To cover my specific issue, a targeted retry of the `findmnt` commands would
> be a nice addition to the general retry above.

I think this would be a much better approach.
 
> I see you marked `needinfo`. What information can I provide?

I think the misleading XFS-related error message was what I was wondering about. Thanks for clarifying.

I will work on a possible solution and share it shortly. I would need your help verifying it, as I don't currently have a setup where this can be reproduced reliably.

Thanks.

Comment 8 Steve Bonds 2020-05-06 19:06:16 UTC
We're seeing this more and more on our Oracle Linux servers. You may hear from them since we're pursuing a solution as part of a service request there.

Our current workaround is the following added to /usr/lib/dracut/modules.d/99kdumpbase/kdump.sh before regenerating the kdump initrd files:

    # Avoid upstream race condition. Provide useful output to the impatient sysadmin
    for timeleft in {15..1}; do
      echo "kdump $timeleft second wait to avoid race conditions" > /dev/console
      sleep 1
    done

Based on testing on our affected servers, the mount typically completes within one or two seconds and USB settles within 7 seconds, so 15 seconds is ample.

Comment 14 RHEL Program Management 2021-09-02 07:27:04 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release. Therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.