Bug 1698069 - please update kdump at least on Fedora 29
Summary: please update kdump at least on Fedora 29
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kexec-tools
Version: 29
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kairui Song
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-04-09 14:38 UTC by Harald Reindl
Modified: 2019-08-06 01:55 UTC (History)

Fixed In Version: kexec-tools-2.0.19-1.fc29
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-08-06 01:55:08 UTC



Description Harald Reindl 2019-04-09 14:38:05 UTC
For months we have been trying to get *any* useful information about why a Fedora-based network device crashes randomly. It is pure luck when kdump.service starts at all instead of ending in http://lkml.iu.edu/hypermail/linux/kernel/1310.2/01470.html (Can't find kernel text map area from kcore).

When, by luck, that doesn't happen, kdump repeats the same completely unrelated 399 "iptables -j LOG" lines from dmesg until the whole disk is full, no matter how large that disk is, and "vmcore-dmesg-incomplete.txt" doesn't contain a single other useful line.

       399 filtered.txt
  12929725 vmcore-dmesg-incomplete.txt

[harry@srv-rhsoft:/data/lounge-daten/firewall-2018/crash-2019-04-09]$ ls
total 2,9G
-rw------- 1 harry verwaltung    0 2019-04-09 03:01 vmcore-incomplete
-rw-r----- 1 harry verwaltung  93K 2019-04-09 03:09 filtered.txt
-rw-r----- 1 harry verwaltung 2,9G 2019-04-09 03:01 vmcore-dmesg-incomplete.txt
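
For triage, a capture like the one above can be reduced to its distinct lines before reading it. The following is a minimal sketch and not necessarily how filtered.txt was produced; the helper name dedup_dmesg is hypothetical, and it assumes each line starts with the usual "[seconds.micros]" dmesg timestamp, which is ignored when comparing lines.

```shell
# Hypothetical helper: print each distinct log line once, in first-seen order.
# Reads the files given as arguments, or stdin when there are none.
dedup_dmesg() {
    awk '{
        key = $0
        # Drop a leading "[12345.678901] " timestamp for the comparison only.
        sub(/^\[ *[0-9]+\.[0-9]+\] /, "", key)
        if (!(key in seen)) { seen[key] = 1; print }
    }' "$@"
}

# Example (not run here):
#   dedup_dmesg vmcore-dmesg-incomplete.txt > distinct.txt
```

Memory use grows with the number of distinct lines, not with the file size, so this should stay small even on a multi-gigabyte file that repeats a few hundred lines.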

Comment 1 Fedora Update System 2019-04-29 08:36:49 UTC
kexec-tools-2.0.19-1.fc29 has been submitted as an update to Fedora 29. https://bodhi.fedoraproject.org/updates/FEDORA-2019-d762a7ad70

Comment 2 Kairui Song 2019-04-29 08:39:16 UTC
Hi, I've rebased to the latest release. Could you give it a try?

Comment 3 Harald Reindl 2019-04-29 12:21:27 UTC
It seems to at least basically work now, on the first try, without the "Can't find kernel text map area from kcore" error.

Why it pulls in "dracut-squash" and "squashfs-tools" but then says "Required modules to build a squashed kdump image is missing!" in the dracut.log is unclear, but it likely doesn't matter. That stuff has way too many dependencies; I wrote some other bug reports that it also has a hard Requires on "dracut-network", which pulls in a lot of DHCP stuff not desired on a firewall device (my workaround is a metapackage with Provides/Obsoletes dracut-network, which is a dirty solution when we have soft dependencies these days).

-----------------------------

anyways:

systemctl start kdump
Apr 29 13:59:39 firewall.esx.vmware.local dracut[10357]: Could not find 'strip'. Not stripping the initramfs.
Apr 29 13:59:39 firewall.esx.vmware.local dracut[10357]: *** Generating early-microcode cpio image ***
Apr 29 13:59:39 firewall.esx.vmware.local dracut[10357]: *** Store current command line parameters ***
Apr 29 13:59:39 firewall.esx.vmware.local dracut[10357]: Stored kernel commandline:
Apr 29 13:59:39 firewall.esx.vmware.local dracut[10357]: No dracut internal kernel commandline stored in the initramfs
Apr 29 13:59:39 firewall.esx.vmware.local dracut[10357]: *** Creating image file '/boot/initramfs-5.0.9-200.fc29.x86_64kdump.img' ***
Apr 29 13:59:41 firewall.esx.vmware.local dracut[10357]: *** Creating initramfs image file '/boot/initramfs-5.0.9-200.fc29.x86_64kdump.img' done ***
Apr 29 13:59:42 firewall.esx.vmware.local kdumpctl[9071]: kexec: loaded kdump kernel
Apr 29 13:59:42 firewall.esx.vmware.local kdumpctl[9071]: Starting kdump: [OK]
Apr 29 13:59:42 firewall.esx.vmware.local systemd[1]: Started Crash recovery kernel arming

-----------------------------

sync; echo 1 > /proc/sys/kernel/sysrq; echo c > /proc/sysrq-trigger

-----------------------------

[root@firewall:/var/crash/127.0.0.1-2019-04-29-14:16:12]$ ls
total 37M
drwxr-xr-x 2 root root 4.0K 2019-04-29 14:16 .
drwxr-xr-x 3 root root 4.0K 2019-04-29 14:16 ..
-rw------- 1 root root  36M 2019-04-29 14:16 vmcore
-rw-r--r-- 1 root root  62K 2019-04-29 14:16 vmcore-dmesg.txt

-----------------------------

"vmcore-dmesg.txt" is complete and doesn't fill the whole disk. But who knows: this basically worked in tests too (with a reboot when the system had been up for a longer time, because of the "Can't find..." error), and I hope the last changes to the firewall rules won't trigger the tragedy in production any longer. At least if it was a bug in kdump, maybe it's solved now.

...........................................
[62258.649408] sysrq: SysRq : Trigger a crash
[62258.655081] Kernel panic - not syncing: sysrq triggered crash
[62258.660570] CPU: 0 PID: 997 Comm: bash Kdump: loaded Not tainted 5.0.9-200.fc29.x86_64 #1
[62258.668663] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/03/2018
[62258.678847] Call Trace:
[62258.682530]  dump_stack+0x5c/0x80
[62258.686097]  panic+0x101/0x2a7
[62258.689303]  ? printk+0x58/0x6f
[62258.693246]  sysrq_handle_crash+0x11/0x11
[62258.697663]  __handle_sysrq.cold.7+0x67/0x10e
[62258.702151]  write_sysrq_trigger+0x30/0x40
[62258.706605]  proc_reg_write+0x39/0x60
[62258.710172]  __vfs_write+0x36/0x1b0
[62258.713558]  ? set_close_on_exec+0x2a/0x70
[62258.717565]  vfs_write+0xa5/0x1a0
[62258.721169]  ksys_write+0x4f/0xb0
[62258.724468]  do_syscall_64+0x5b/0x160
[62258.728028]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[62258.733112] RIP: 0033:0x7ffff7eb9fc8
[62258.736838] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 55 77 0d 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89 d4 55
[62258.754512] RSP: 002b:00007fffffffe668 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[62258.762099] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007ffff7eb9fc8
[62258.770190] RDX: 0000000000000002 RSI: 00005555556a69c0 RDI: 0000000000000001
[62258.776758] RBP: 00005555556a69c0 R08: 000000000000000a R09: 00007ffff7f4be80
[62258.784234] R10: 000000000000000a R11: 0000000000000246 R12: 00007ffff7f8d780
[62258.790932] R13: 0000000000000002 R14: 00007ffff7f88740 R15: 0000000000000002
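
Before forcing a panic as in the test above, it can help to confirm that a crash kernel is actually loaded, since otherwise the machine panics without producing a vmcore. A minimal sketch, assuming the kernel exposes /sys/kernel/kexec_crash_loaded (it reads 1 when a crash kernel is loaded, the same state the "Kdump: loaded" note in the panic header reflects); the helper name kdump_armed and its optional path argument are hypothetical.

```shell
# Hypothetical helper: succeed only if a crash (kdump) kernel is loaded.
# The optional path argument exists so the check can be tried on a test file.
kdump_armed() {
    [ "$(cat "${1:-/sys/kernel/kexec_crash_loaded}" 2>/dev/null)" = "1" ]
}

# Example (not run here, it crashes the machine on purpose):
#   kdump_armed && { sync; echo 1 > /proc/sys/kernel/sysrq; echo c > /proc/sysrq-trigger; }
```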

Comment 4 Fedora Update System 2019-04-30 03:40:12 UTC
kexec-tools-2.0.19-1.fc29 has been pushed to the Fedora 29 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-d762a7ad70

Comment 5 Kairui Song 2019-04-30 05:39:49 UTC
(In reply to Harald Reindl from comment #3)
> It seems to at least basically work now, on the first try, without the
> "Can't find kernel text map area from kcore" error.
> 
> Why it pulls in "dracut-squash" and "squashfs-tools" but then says
> "Required modules to build a squashed kdump image is missing!" in the
> dracut.log is unclear, but it likely doesn't matter. That stuff has way too
> many dependencies; I wrote some other bug reports that it also has a hard
> Requires on "dracut-network", which pulls in a lot of DHCP stuff not desired
> on a firewall device (my workaround is a metapackage with
> Provides/Obsoletes dracut-network, which is a dirty solution when we have
> soft dependencies these days).
> 

Hi, "Required modules to build a squashed kdump image is missing!" possibly appears because you didn't enable the squash module in the kernel or didn't install the module.
kdump now uses the squash module to save memory in the kdump kernel, which is why dracut-squash was introduced: it builds a smaller kdump initramfs and uses less memory after decompressing the initramfs. It's OK if the squash module is not found; kdump will just fall back to building a normal kdump initramfs.

dracut-network is for network kdump targets, e.g. NFS. It has been there for a long time.
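
To see which kernel modules are unavailable on a given machine, something along these lines may help. This is only a sketch: the thread says "the squash module" without naming kernel modules, so the names squashfs, overlay and loop in the example are an assumption, and the helper missing_kmods and the PROBE override variable are hypothetical. It relies on "modprobe --dry-run", which resolves a module without loading it.

```shell
# Hypothetical helper: print every module name from the arguments that the
# probe command rejects. PROBE defaults to "modprobe --dry-run" and can be
# overridden, for example to try the function without modprobe installed.
missing_kmods() {
    for m in "$@"; do
        # Unquoted on purpose so "modprobe --dry-run" splits into command and flag.
        ${PROBE:-modprobe --dry-run} "$m" >/dev/null 2>&1 || echo "$m"
    done
}

# Example (module names are an assumption, see above):
#   missing_kmods squashfs overlay loop
```

An empty result would suggest the modules can be resolved for the running kernel; any printed name is a candidate explanation for the fallback to a normal kdump initramfs.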


Thanks, glad to know it works.

Comment 6 Fedora Update System 2019-08-06 01:55:08 UTC
kexec-tools-2.0.19-1.fc29 has been pushed to the Fedora 29 stable repository. If problems still persist, please make note of it in this bug report.

