Description of problem:
Customer is running kexec-tools-2.0.0-300.el6.x86_64. Kdump fails to start because the customer uses a tap device with OpenVPN. Case number 01620459.
Version-Release number of selected component (if applicable):
kexec-tools-2.0.0-300.el6.x86_64
How reproducible:
Every time
Steps to reproduce:
Created By: Robb Manes (6/20/2016 3:48 PM)
// Working Notes - these notes are not intended as a meaningful communication
// but rather an indicator of current thought processes and reference.
// Please feel free to comment and ask questions concerning them.
My steps to reproduce, which worked without problems:
Make the tap device manually:
# tunctl -t tap0
Set 'tap0' persistent and owned by uid 0
Set up SSH kdump:
/etc/kdump.conf
ssh kdump.redhat.com
path /share/
Set up kdump keys:
# service kdump propagate
Using existing keys...
kdump.redhat.com's password:
/root/.ssh/kdump_id_rsa has been added to ~kdump/.ssh/authorized_keys on waffle.usersys.redhat.com
Restart kdump:
# service kdump restart
Stopping kdump: [ OK ]
Detected change(s) the following file(s):
/etc/kdump.conf
Rebuilding /boot/initrd-2.6.32-642.el6.x86_64kdump.img
Starting kdump: [ OK ]
Ensure kdump kernel is loaded:
# grep -i crash /proc/iomem
03000000-0b0fffff : Crash kernel
Crash the system:
# echo 'c' > /proc/sysrq-trigger
From the console, I can see:
$ virsh console rhel6-kdump-test
- - - - - - - - - 8< - - - - - - - - -
mapping eth0 to eth0
udhcpc (v1.15.1) started
Sending discover...
Sending select for 10.12.212.85...
Lease of 10.12.212.85 obtained, lease time 43200
deleting routers
adding dns 10.11.5.4
adding dns 10.11.5.3
Saving to remote location kdump.redhat.com
Saving vmcore-dmesg.txt
reverse mapping checking getaddrinfo for unused [10.12.213.189] failed - POSSIBLE BREAK-IN ATTEMPT!
63+1 records in
63+1 records out
32270 bytes (32 kB) copied, 0.000106353 s, 303 MB/s
Saved vmcore-dmesg.txt
Free memory/Total memory (free %): 66724 / 114296 ( 58.3782 )
Excluding unnecessary pages : [100.0 %] |reverse mapping checking getaddrinfo for unused [10.12.213.189] failed - POSSIBLE BREAK-IN ATTEMPT!
Copying data : [100.0 %] \
59550+465 records in
59566+1 records out
30497992 bytes (30 MB) copied, 2.0862 s, 14.6 MB/s
Saving core complete
Restarting system.
From the SSH host, I see the core:
$ file /share/10.12.212.85-2016-06-20-15\:29\:05/vmcore.flat
/share/10.12.212.85-2016-06-20-15:29:05/vmcore.flat: data
So, on my host it works as expected. All I did was create a tunnel without configuration.
I note that in the most recent attempts provided to us, tap0 is missing a 'device' entry under /sys:
# service kdump restart
Stopping kdump: [ OK ]
No kdump initial ramdisk found. [WARNING]
Rebuilding /boot/initrd-2.6.32-573.22.1.el6.x86_64kdump.img
ls: cannot access /sys/class/net/tap0/device: No such file or directory
Starting kdump: [ OK ]
My host does not have this either, but there was no issue or complaint when I rebuilt the ramdisk:
# ls /sys/class/net/tap0/dev*
/sys/class/net/tap0/dev_id /sys/class/net/tap0/dev_port
# mv /boot/initrd-2.6.32-573.22.1.el6.x86_64kdump.img /boot/initrd-2.6.32-573.22.1.el6.x86_64kdump.img.backup
# service kdump restart
Stopping kdump: [ OK ]
No kdump initial ramdisk found. [WARNING]
Rebuilding /boot/initrd-2.6.32-573.22.1.el6.x86_64kdump.img
Starting kdump: [ OK ]
And I am running an identical kernel:
# uname -a
Linux rhel6-kdump-test 2.6.32-573.22.1.el6.x86_64 #1 SMP Thu Mar 17 03:23:39 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
There is one major difference here, as Nitin pointed out. The logic of mkdumprd passes the interface that routes to the dump target as the handlenetdev() argument, which, in this scenario, might be the VPN tunnel:
- - - - - - - - - 8< - - - - - - - - -
#find ethernet device used to route to remote host, ie eth0 <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
netdev=`/sbin/ip route get to $remoteip 2>&1` <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
[ $? != 0 ] && echo "Bad kdump location: $config_val" && cleanup_and_exit 1
DUMP_TARGET=$config_val
#the field in the ip output changes if we go to another subnet
OFF_SUBNET=`echo $netdev | grep via`
if [ -n "$OFF_SUBNET" ]
then
# we are going to a different subnet
netdev=`echo $netdev|awk '{print $5;}'|head -n 1`
else
# we are on the same subnet
netdev=`echo $netdev|awk '{print $3}'|head -n 1`
fi
#add the ethernet device to the list of modules
mkdir -p $MNTIMAGE/etc/network/
handlenetdev $netdev <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
|
.---------'
v
handlenetdev() {
local dev=$1
local ifcfg_file
local vnet_prefix
case " $handlednetdevices " in
*" $dev "*)
return ;;
*) handlednetdevices="$handlednetdevices $dev" ;;
esac
ifcfg_file=`find_ifcfg_by_devicename $dev` --------.
if [ -z "${ifcfg_file}" ]; then |
error "The ifcfg-$dev or ifcfg-xxx which contains DEVICE=$dev field doesn't exist."
cleanup_and_exit 1 |
fi |
.---------------------------------------------'
v
find_ifcfg_by_devicename() {
local dev=$1
- - - - - - - - - 8< - - - - - - - - -
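To make the device selection above concrete, here is a minimal sketch that replays the same awk field logic on canned `ip route get` output. The helper name pick_netdev and both sample route strings are assumptions made up for illustration; they are not taken from mkdumprd or from the customer's system.

```shell
#!/bin/sh
# Hypothetical helper: mimics the field selection in the mkdumprd excerpt.
# $1 is one line of `ip route get to <ip>` output.
pick_netdev() {
    route=$1
    if echo "$route" | grep -q via; then
        # Off-subnet form: "<ip> via <gw> dev <if> ..." -> device is field 5
        echo "$route" | awk '{print $5}' | head -n 1
    else
        # On-subnet form: "<ip> dev <if> ..." -> device is field 3
        echo "$route" | awk '{print $3}' | head -n 1
    fi
}

# Sample inputs (invented for illustration only)
pick_netdev "10.12.213.189 via 10.12.212.254 dev eth0  src 10.12.212.85"
pick_netdev "192.0.2.10 dev tap0  src 192.0.2.1"
```

If the route to the dump target goes through the tunnel, as in the second sample, the name handed to handlenetdev would be tap0, which would match the error shown under "Actual results".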
In my example, the interface that routes to the dump target is not the tap device. Let us see whether tap0 is the routed interface on the customer's system; unfortunately, I can't tell from the sosreport:
$ cat sos_commands/networking/ip_route_show_table_all | grep tap
fe80::/64 dev tap0 proto kernel metric 256 mtu 1500 advmss 1440 hoplimit 4294967295
ff00::/8 dev tap0 table local metric 256 mtu 1500 advmss 1440 hoplimit 4294967295
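On the live host, a check along these lines could show whether any ifcfg file declares a given device. This is only a sketch: the function name has_ifcfg_for is hypothetical, it is not the find_ifcfg_by_devicename implementation from mkdumprd, and the matching rule (a DEVICE= line, optionally quoted) is an assumption about what such a lookup needs.

```shell
#!/bin/sh
# Hypothetical check, NOT the mkdumprd implementation: print the first
# ifcfg-* file in a directory that declares DEVICE=<dev>.
# $1 = device name, $2 = directory (defaults to the RHEL 6 location).
has_ifcfg_for() {
    dev=$1
    dir=${2:-/etc/sysconfig/network-scripts}
    for f in "$dir"/ifcfg-*; do
        # Skip the unexpanded glob when no ifcfg-* file exists
        [ -f "$f" ] || continue
        if grep -Eq "^DEVICE=[\"']?${dev}[\"']?[[:space:]]*\$" "$f"; then
            echo "$f"
            return 0
        fi
    done
    return 1
}

# Example: check the device named in the customer's error message.
has_ifcfg_for tap0 || echo "no ifcfg file declares DEVICE=tap0"
```

Combined with the output of `ip route get to` for the dump target's address, this would indicate whether the routed device is one that lacks an ifcfg file.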
Actual results:
[e723nb@smsslpoc1a ~]$ rpm -qa | grep -i kexec
kexec-tools-2.0.0-300.el6.x86_64
[e723nb@smsslpoc1a ~]$ sudo /sbin/service kdump restart
Stopping kdump: [ OK ]
No kdump initial ramdisk found. [WARNING]
Rebuilding /boot/initrd-2.6.32-573.22.1.el6.x86_64kdump.img
The ifcfg-tap0 or ifcfg-xxx which contains DEVICE=tap0 field doesn't exist.
Failed to run mkdumprd
[e723nb@smsslpoc1a ~]$ sudo /sbin/service kdump restart
Stopping kdump: [ OK ]
No kdump initial ramdisk found. [WARNING]
Rebuilding /boot/initrd-2.6.32-573.22.1.el6.x86_64kdump.img
The ifcfg-tap0 or ifcfg-xxx which contains DEVICE=tap0 field doesn't exist.
Failed to run mkdumprd
Expected results:
For kdump to not look for the tap device
Additional info:
Customer can take down the network (tap device and OpenVPN) and start kdump. The issue is that the customer will have to take down the network every time they upgrade the kernel, which will impact production. They cannot create an ifcfg file for the tap device because it conflicts with OpenVPN during reboot.
Xunlei and Dave,
Ack, I have asked the customer to provide the info. Robb requested the "ip route get to" output a few days ago, and I have re-requested it along with the newly requested info.
Regards,
Billy Woods
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHBA-2017-0584.html