Bug 2151842
| Field | Value |
|---|---|
| Summary | [Azure][RHEL9.2] Kdump cannot save vmcore via ssh with enabled accelerated networking NIC |
| Product | Red Hat Enterprise Linux 9 |
| Component | kexec-tools |
| Reporter | xxiong |
| Assignee | Coiby <coxu> |
| QA Contact | xxiong |
| Status | CLOSED ERRATA |
| Severity | high |
| Priority | high |
| Version | 9.2 |
| CC | cavery, coxu, huzhao, litian, ruyang, vkuznets, xuli, yacao, yuxisun |
| Target Milestone | rc |
| Keywords | Regression, Triaged |
| Flags | pm-rhel: mirror+ |
| Hardware | Unspecified |
| OS | Unspecified |
| Fixed In Version | kexec-tools-2.0.25-10.el9 |
| Doc Type | If docs needed, set a value |
| Last Closed | 2023-05-09 08:14:43 UTC |
| Type | Bug |
kexec-tools-2.0.25-7.el9.x86_64 introduced an optimization to include only the needed NIC drivers (bz2120186). Unfortunately, the implementation doesn't work for this case because it doesn't install the driver of the backing physical NIC. As proof, adding "dracut_args --add-drivers mlx5_core" makes kdump work again. I'll send a patch to take care of this case.

Checked with kexec-tools-2.0.25-10.el9.x86_64 on Azure (mlx5) and Hyper-V (Intel); both can save the vmcore file. Set Verified: Tested.

Hi Coiby,
Checked with compose RHEL-9.2.0-20230110.5 and found this issue still occurs; rechecked with the brew build and it works.

It seems the kexec-tools-2.0.25-10.el9.x86_64.rpm in compose RHEL-9.2.0-20230110.5 (http://download.eng.pek2.redhat.com/rhel-9/nightly/RHEL-9/RHEL-9.2.0-20230110.5/compose/BaseOS/x86_64/os/Packages/kexec-tools-2.0.25-10.el9.x86_64.rpm) is not the same as the one in brew (https://download.eng.bos.redhat.com/brewroot/vol/rhel-9/packages/kexec-tools/2.0.25/10.el9/x86_64/kexec-tools-2.0.25-10.el9.x86_64.rpm)

----------------------------------------

[root@LISAv2-TwoVM2Host1Dep-xxq-gen1b-YM13-638091223046-role-0 ~]# kdumpctl status
kdump: Kdump is operational
[root@LISAv2-TwoVM2Host1Dep-xxq-gen1b-YM13-638091223046-role-0 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:15:5d:c4:63:83 brd ff:ff:ff:ff:ff:ff
    inet 10.73.197.153/22 brd 10.73.199.255 scope global dynamic noprefixroute eth0
       valid_lft 42864sec preferred_lft 42864sec
    inet6 2620:52:0:49c4:215:5dff:fec4:6383/64 scope global dynamic noprefixroute
       valid_lft 2591970sec preferred_lft 604770sec
    inet6 fe80::215:5dff:fec4:6383/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:15:5d:c4:63:85 brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.40/24 brd 10.0.2.255 scope global noprefixroute eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::215:5dff:fec4:6385/64 scope link
       valid_lft forever preferred_lft forever
4: enP37881s2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master eth1 state UP group default qlen 1000
    link/ether 00:15:5d:c4:63:85 brd ff:ff:ff:ff:ff:ff
    altname enP37881p0s2
[root@LISAv2-TwoVM2Host1Dep-xxq-gen1b-YM13-638091223046-role-0 ~]# lsinitrd /boot/initramfs-$(uname -r)kdump.img | grep intel
-rw-r--r-- root/root 6172 2023-01-07 05:35 squashfs-root/usr/lib/modules/5.14.0-230.el9.x86_64/kernel/arch/x86/crypto/crc32c-intel.ko.xz
-rw-r--r-- root/root 4500 2023-01-07 05:35 squashfs-root/usr/lib/modules/5.14.0-230.el9.x86_64/kernel/arch/x86/crypto/ghash-clmulni-intel.ko.xz
drwxr-xr-x root/root 30 2023-01-12 12:25 squashfs-root/usr/lib/modules/5.14.0-230.el9.x86_64/kernel/drivers/net/ethernet/intel
drwxr-xr-x root/root 36 2023-01-12 12:25 squashfs-root/usr/lib/modules/5.14.0-230.el9.x86_64/kernel/drivers/net/ethernet/intel/ixgbevf
-rw-r--r-- root/root 45288 2023-01-07 05:35 squashfs-root/usr/lib/modules/5.14.0-230.el9.x86_64/kernel/drivers/net/ethernet/intel/ixgbevf/ixgbevf.ko.xz
[root@LISAv2-TwoVM2Host1Dep-xxq-gen1b-YM13-638091223046-role-0 ~]# echo 'c' > /proc/sysrq-trigger

[root@LISAv2-TwoVM2Host1Dep-xxq-gen1b-YM13-638091223046-role-0 ~]# rpm -qa kexec-tools
kexec-tools-2.0.25-10.el9.x86_64

======================================================================work after changed to brew build=============================

[root@LISAv2-TwoVM2Host1Dep-xxq-gen1b-YM13-638091223046-role-0 ~]# wget https://download.eng.bos.redhat.com/brewroot/vol/rhel-9/packages/kexec-tools/2.0.25/10.el9/x86_64/kexec-tools-2.0.25-10.el9.x86_64.rpm --no-check-certificate
--2023-01-12 15:06:43-- https://download.eng.bos.redhat.com/brewroot/vol/rhel-9/packages/kexec-tools/2.0.25/10.el9/x86_64/kexec-tools-2.0.25-10.el9.x86_64.rpm
Resolving download.eng.bos.redhat.com (download.eng.bos.redhat.com)... 10.19.165.239
Connecting to download.eng.bos.redhat.com (download.eng.bos.redhat.com)|10.19.165.239|:443... connected.
WARNING: The certificate of ‘download.eng.bos.redhat.com’ is not trusted.
WARNING: The certificate of ‘download.eng.bos.redhat.com’ doesn't have a known issuer.
HTTP request sent, awaiting response... 200 OK
Length: 492295 (481K) [application/x-rpm]
Saving to: ‘kexec-tools-2.0.25-10.el9.x86_64.rpm’

kexec-tools-2.0.25-10.el9.x86_64.rpm 100%[=====================================================================================================================>] 480.76K 253KB/s in 1.9s

2023-01-12 15:06:47 (253 KB/s) - ‘kexec-tools-2.0.25-10.el9.x86_64.rpm’ saved [492295/492295]

[root@LISAv2-TwoVM2Host1Dep-xxq-gen1b-YM13-638091223046-role-0 ~]# ls
anaconda-ks.cfg anac-post-log kexec-tools-2.0.25-10.el9.x86_64.rpm original-ks.cfg
[root@LISAv2-TwoVM2Host1Dep-xxq-gen1b-YM13-638091223046-role-0 ~]# rpm -ivh --force kexec-tools-2.0.25-10.el9.x86_64.rpm
Verifying... ################################# [100%]
Preparing... ################################# [100%]
Updating / installing...
1:kexec-tools-2.0.25-10.el9 ################################# [100%]
[root@LISAv2-TwoVM2Host1Dep-xxq-gen1b-YM13-638091223046-role-0 ~]# reboot
[root@LISAv2-TwoVM2Host1Dep-xxq-gen1b-YM13-638091223046-role-0 ~]# Connection to 10.73.197.153 closed by remote host.
Connection to 10.73.197.153 closed.
bash-4.4$ ssh root.197.153
Password:
Activate the web console with: systemctl enable --now cockpit.socket

Register this system with Red Hat Insights: insights-client --register
Create an account or view all your systems at https://red.ht/insights-dashboard
Last login: Thu Jan 12 14:38:36 2023 from 10.72.12.219
[root@LISAv2-TwoVM2Host1Dep-xxq-gen1b-YM13-638091223046-role-0 ~]# lsinitrd /boot/initramfs-$(uname -r)kdump.img | grep intel
-rw-r--r-- root/root 6172 2023-01-07 05:35 squashfs-root/usr/lib/modules/5.14.0-230.el9.x86_64/kernel/arch/x86/crypto/crc32c-intel.ko.xz
-rw-r--r-- root/root 4500 2023-01-07 05:35 squashfs-root/usr/lib/modules/5.14.0-230.el9.x86_64/kernel/arch/x86/crypto/ghash-clmulni-intel.ko.xz
drwxr-xr-x root/root 30 2023-01-12 15:07 squashfs-root/usr/lib/modules/5.14.0-230.el9.x86_64/kernel/drivers/net/ethernet/intel
drwxr-xr-x root/root 36 2023-01-12 15:07 squashfs-root/usr/lib/modules/5.14.0-230.el9.x86_64/kernel/drivers/net/ethernet/intel/ixgbevf
-rw-r--r-- root/root 45288 2023-01-07 05:35 squashfs-root/usr/lib/modules/5.14.0-230.el9.x86_64/kernel/drivers/net/ethernet/intel/ixgbevf/ixgbevf.ko.xz
[root@LISAv2-TwoVM2Host1Dep-xxq-gen1b-YM13-638091223046-role-0 ~]# rpm -qa kexec-tools
kexec-tools-2.0.25-10.el9.x86_64
[root@LISAv2-TwoVM2Host1Dep-xxq-gen1b-YM13-638091223046-role-0 ~]# kdumpctl status
kdump: Kdump is operational
[root@LISAv2-TwoVM2Host1Dep-xxq-gen1b-YM13-638091223046-role-0 ~]# echo 'c' > /proc/sysrq-trigger

(In reply to xxiong from comment #11)
Hi Xiaoqiang,

I downloaded the two rpms and found they have the same content. I wonder if there is a race condition. Does compose RHEL-9.2.0-20230110.5 fail every time a crash is triggered? If so, could you paste the output of "lsinitrd /boot/initramfs-$(uname -r)kdump.img | grep hyperv"?
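The comparison of the two rpms mentioned in this reply could be reproduced with something like the sketch below. It is only an illustration: the thread does not say how the packages were compared, and the function names `same_content` and `report_same` as well as the file paths in the usage line are hypothetical.

```shell
#!/bin/sh
# same_content FILE_A FILE_B
# Succeeds (exit 0) only when both files are byte-for-byte identical.
same_content() {
    cmp -s "$1" "$2"
}

# report_same FILE_A FILE_B
# Prints a one-word verdict that can be pasted into a bug comment.
report_same() {
    if same_content "$1" "$2"; then
        echo "identical"
    else
        echo "different"
    fi
}
```

Usage, assuming the compose and brew packages were saved under the hypothetical directories `compose/` and `brew/`: `report_same compose/kexec-tools-2.0.25-10.el9.x86_64.rpm brew/kexec-tools-2.0.25-10.el9.x86_64.rpm`.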
(In reply to Coiby from comment #12)
Hi Coiby,

I rechecked manually: both mlnx and intel pass on Hyper-V, and Azure (mlnx only) also passes (the failure was caused by another issue, 2156126). The failure in comment 11 came from a scripted run, so the setup may differ somehow; still checking. Thanks.

No difference was found in the output of lsinitrd /boot/initramfs-$(uname -r)kdump.img | grep hyperv between the working and non-working setups:

[root@LISAv2-TwoVM2Host1Dep-xxq-1-FQ72-638091423485-role-0 ~]# lsinitrd /boot/initramfs-$(uname -r)kdump.img | grep hyperv
drwxr-xr-x root/root 39 2023-01-12 19:35 squashfs-root/usr/lib/modules/5.14.0-231.el9.x86_64/kernel/drivers/gpu/drm/hyperv
-rw-r--r-- root/root 24036 2023-01-10 04:32 squashfs-root/usr/lib/modules/5.14.0-231.el9.x86_64/kernel/drivers/gpu/drm/hyperv/hyperv_drm.ko.xz
-rw-r--r-- root/root 9588 2023-01-10 04:32 squashfs-root/usr/lib/modules/5.14.0-231.el9.x86_64/kernel/drivers/hid/hid-hyperv.ko.xz
-rw-r--r-- root/root 8824 2023-01-10 04:32 squashfs-root/usr/lib/modules/5.14.0-231.el9.x86_64/kernel/drivers/input/serio/hyperv-keyboard.ko.xz
drwxr-xr-x root/root 38 2023-01-12 19:35 squashfs-root/usr/lib/modules/5.14.0-231.el9.x86_64/kernel/drivers/net/hyperv
-rw-r--r-- root/root 58564 2023-01-10 04:32 squashfs-root/usr/lib/modules/5.14.0-231.el9.x86_64/kernel/drivers/net/hyperv/hv_netvsc.ko.xz
-rw-r--r-- root/root 2392 2023-01-10 04:32 squashfs-root/usr/lib/modules/5.14.0-231.el9.x86_64/kernel/drivers/pci/controller/pci-hyperv-intf.ko.xz
-rw-r--r-- root/root 23000 2023-01-10 04:32 squashfs-root/usr/lib/modules/5.14.0-231.el9.x86_64/kernel/drivers/pci/controller/pci-hyperv.ko.xz

Compared kexec-tools-2.0.25-9.el9.x86_64 and kexec-tools-2.0.25-10.el9.x86_64.rpm (from compose) with the same manual setup: this issue occurs with kexec-tools-2.0.25-9.el9.x86_64 and passes with kexec-tools-2.0.25-10.el9.x86_64.rpm, so this bug is set to verified.

For the setup difference in comment 11, I will keep checking and will file a new bug to track it if a new issue is found.
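The manual `lsinitrd ... | grep` checks in this thread could be wrapped in a small helper that reports which expected modules are absent from the kdump initramfs. This is a rough sketch, not part of kexec-tools: the function names `initrd_has_module` and `missing_modules` are hypothetical, and the module list in the usage line is an assumption based on the drivers named in this bug (mlx5_core, hv_netvsc, pci-hyperv).

```shell
#!/bin/sh
# initrd_has_module MODULE
# Reads an lsinitrd file listing on stdin and succeeds if it contains
# MODULE as a kernel module file (e.g. mlx5_core.ko or mlx5_core.ko.xz).
initrd_has_module() {
    # Module names use "_" while file names may use "-", so accept both.
    pat=$(printf '%s' "$1" | sed 's/[-_]/[-_]/g')
    grep -Eq "/${pat}\.ko(\.[a-z0-9]+)?[[:space:]]*\$"
}

# missing_modules MODULE...
# Reads the listing once from stdin and prints every MODULE not found in it.
missing_modules() {
    listing=$(cat)
    for mod in "$@"; do
        printf '%s\n' "$listing" | initrd_has_module "$mod" || echo "$mod"
    done
}
```

Usage on an Azure VM with a Mellanox VF might look like `lsinitrd /boot/initramfs-$(uname -r)kdump.img | missing_modules hv_netvsc pci_hyperv mlx5_core`; empty output would mean all three module files are present.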
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (kexec-tools bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2463
Created attachment 1931025 [details]
console output

Description of problem:
Configure a VM (with an accelerated networking NIC enabled) to save vmcore to a remote machine, then trigger a crash. The VM hangs after the log line "kdump[524]: saving to..." and cannot save vmcore to the remote machine.

Version-Release number of selected component (if applicable):
5.14.0-205.el9.x86_64
kexec-tools-2.0.25-7.el9.x86_64
NetworkManager-1.41.6-1.el9.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create a RHEL 9.2 VM on Azure with an accelerated networking NIC enabled
[azureuser@controller-vm ~]$ lspci
f7f2:00:02.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function] (rev 80)
2. Modify /etc/kdump.conf as below:
ssh root.0.5 --(change to your peer vm ip)
sshkey /root/.ssh/kdump_id_rsa
path /var/crash
core_collector makedumpfile -F -l --message-level 1 -d 31
3. Run "kdumpctl propagate" to generate a key pair and add it to the remote VM
4. Run "kdumpctl restart kdump"
5. Trigger kdump: "echo c > /proc/sysrq-trigger"

Actual results:
Check the logs displayed on the console when the crash is triggered: the VM hangs after the log line "kdump[524]: saving to..." and cannot save vmcore to the remote machine.

Expected results:
Save vmcore without error

Additional info:
This issue doesn't occur with kexec-tools-2.0.25-5.el9.x86_64
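With accelerated networking, the virtual function shows up as a separate interface enslaved to the synthetic NIC (in the `ip a` output in this report, enP37881s2 has `master eth1`), and it is that backing device's driver the kdump initramfs was missing. The sketch below is one way to list such pairs and look up the driver; the function names `list_slaves` and `nic_driver` are hypothetical helpers, not part of kexec-tools or iproute2.

```shell
#!/bin/sh
# list_slaves
# Reads `ip -o link` output on stdin and prints "SLAVE MASTER" for every
# interface carrying the SLAVE flag ("-" if no master is shown).
list_slaves() {
    awk '
        $3 ~ /(<|,)SLAVE(,|>)/ {
            name = $2
            sub(/:$/, "", name)
            master = "-"
            for (i = 4; i < NF; i++)
                if ($i == "master") master = $(i + 1)
            print name, master
        }'
}

# nic_driver IFACE
# Prints the kernel driver bound to IFACE, read from sysfs.
nic_driver() {
    basename "$(readlink -f "/sys/class/net/$1/device/driver")"
}
```

Usage: `ip -o link | list_slaves` prints each enslaved interface with its master, and `nic_driver` on the enslaved interface names the driver to look for in the kdump initramfs. On an affected build, the thread reports that adding `dracut_args --add-drivers mlx5_core` to /etc/kdump.conf makes kdump work again.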