Description of problem:

When dumping the vmcore to an SSH/NFS target, the kdump initramfs includes NIC drivers for inactive interfaces. Taking an NFS target as an example, this is how the problem occurs:

1. 'mount "xx.xx.xx.xx:/path /kdumproot nfs defaults"' is passed to dracut by mkdumprd.
2. dracut eventually runs "/usr/lib/dracut/dracut-install -D /tmp/initramfs/ -H --kerneldir /lib/modules/5.10.22-200.fc33.x86_64/ -o -m =drivers/net/ethernet". "-H" means only host drivers are included, and "=drivers/net/ethernet" means all drivers in drivers/net/ethernet are included.

So if the system has multiple NICs, all host drivers in drivers/net/ethernet will be copied to the initramfs.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:

1. Find a system with multiple NICs, e.g. hp-z8-g4-01.khw2.lab.eng.bos.redhat.com

[root@hp-z8-g4-01 net]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp9s0f2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether a0:8c:fd:dd:ca:be brd ff:ff:ff:ff:ff:ff
3: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether a0:8c:fd:dd:ca:bc brd ff:ff:ff:ff:ff:ff
    inet 10.16.210.50/23 brd 10.16.211.255 scope global dynamic noprefixroute eno1
       valid_lft 82797sec preferred_lft 82797sec
    inet6 2620:52:0:10d2:a28c:fdff:fedd:cabc/64 scope global dynamic noprefixroute
       valid_lft 2591952sec preferred_lft 604752sec
    inet6 fe80::a28c:fdff:fedd:cabc/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
4: ens5: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN group default qlen 1000
    link/ether a0:fb:fd:dd:ca:bf brd ff:ff:ff:ff:ff:ff

[root@hp-z8-g4-01 net]# readlink /sys/class/net/{enp9s0f2,eno1,ens5}/device/driver/module
../../../../module/i40e
../../../../module/e1000e
../../../../module/e1000e

2. Configure kdump to dump to an NFS target.

3. Run "kdumpctl rebuild"

[root@hp-z8-g4-01 ~]# kdumpctl rebuild
kdump: Rebuilding /boot/initramfs-4.18.0-287.el8.dt4.x86_64kdump.img

4. In the built initramfs, you will find that all NIC drivers are included. For hp-z8-g4-01.khw2.lab.eng.bos.redhat.com,

[root@hp-z8-g4-01 t]# lsinitrd --unpack /boot/initramfs-4.18.0-287.el8.dt4.x86_64kdump.img
[root@hp-z8-g4-01 t]# unsquashfs squash/root.img
[root@hp-z8-g4-01 t]# cd squashfs-root/usr/lib/modules/4.18.0-287.el8.dt4.x86_64/kernel/drivers/net
[root@hp-z8-g4-01 net]# tree
.
├── ethernet
│   └── intel
│       ├── e1000e
│       │   └── e1000e.ko.xz
│       └── i40e
│           └── i40e.ko.xz

Actual results:

The kdump initramfs includes drivers for inactive NICs.

Expected results:

The initramfs should only include drivers that are needed by kdump.

Additional info:

1. This bug is reproducible on Fedora 33, lenovo-sr630-02.rhts.eng.pek2.redhat.com.
2. It should be reproducible on RHEL 8.5 and RHEL 9.
3. For dell-per740-55.rhts.eng.pek2.redhat.com, although there are two NICs, the other driver is in drivers/staging/qlge instead of drivers/net/ethernet, so it's not copied to the kdump initramfs.
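
The readlink check in step 1 can be wrapped in a small helper to survey every interface at once. This is only a sketch: the function name list_nic_drivers and its optional sysfs-root argument are made up for illustration and are not part of kexec-tools or dracut.

```shell
#!/bin/bash
# Hypothetical helper: print "<interface> <driver module>" for every
# network interface that is backed by a device driver.
# The optional first argument is the sysfs mount point (default: /sys).
list_nic_drivers() {
    local sysfs=${1:-/sys} dev link
    for dev in "$sysfs"/class/net/*; do
        link=$dev/device/driver/module
        # Virtual interfaces such as lo have no device/driver/module link.
        [ -L "$link" ] || continue
        # The link target ends in the module name, e.g. ../../../../module/e1000e
        printf '%s %s\n' "${dev##*/}" "$(basename "$(readlink "$link")")"
    done
}
```

Calling list_nic_drivers with no argument reads the live /sys tree. On the machine above it should report i40e for enp9s0f2 and e1000e for eno1 and ens5, in line with the readlink output.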
Hi Coiby,

I've noticed this issue before, so I opened a PR to dracut to add a `--hostonly-nic` option:
https://github.com/dracutdevs/dracut/pull/957

It's merged, but the kexec-tools part also needs a patch. Basically, `mkdumprd` has to know which NIC is actually needed by kdump and pass it to dracut with this `--hostonly-nic` argument.

We also need to detect network changes that alter which NIC has access to the dump target, and rebuild the initramfs if it doesn't contain the needed NIC driver.

Can you help test whether this argument works as expected? We may put some effort into this if it helps a lot.
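
One possible way for `mkdumprd` to find the NIC that reaches the dump target is to ask the kernel routing table. The sketch below is not the actual kexec-tools code: route_dev and target_nic are hypothetical names, and it assumes `ip route get` prints the outgoing interface after the word "dev".

```shell
#!/bin/bash
# Hypothetical helper: read `ip route get` output on stdin and print
# the interface name, i.e. the word that follows "dev".
route_dev() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Hypothetical helper: print the interface the kernel would use to
# reach the dump target address given as $1.
target_nic() {
    ip route get "$1" | route_dev
}
```

The name printed by target_nic for the NFS or SSH server address could then be handed to dracut through the `--hostonly-nic` option; the exact option spelling and argument format should be checked against the dracut man page. Keeping the parsing in route_dev separate makes it possible to test it on saved `ip route get` output without touching the network.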
(In reply to Kairui Song from comment #1)
> Hi Coiby,
>
> I've noticed this issue before, so I opened a PR to dracut to add a
> `--hostonly-nic` option:
> https://github.com/dracutdevs/dracut/pull/957
>

Thanks for the work!

> It's merged, however kexec-tools part also needs a patch. Basically
> `mkdumprd` have know which NIC is actually needed by kdump, and pass that to
> dracut with this `--hostonly-nic` argument.
>
> And, we also need to detect network changes that will change the NIC that
> have access to the dump target, and rebuild the initramfs if it doesn't
> contain the NIC driver needed.
>
> Can you help try if this argument works as expected? We may put some effort
> into this if it helps a lot.

Sure, I'll finish the kexec-tools part.
Note that when one driver manages multiple NICs, memory resources are still allocated for the inactive NICs, as shown in https://bugzilla.redhat.com/show_bug.cgi?id=1936277. If we could fix this situation, https://bugzilla.redhat.com/show_bug.cgi?id=1962421 could be resolved automatically.