
Bug 1356615

Summary:	Kdump fails to start due to the customer using a tap device on openvpn.
Product:	Red Hat Enterprise Linux 6	Reporter:	Billy Woods <bwoods>
Component:	kexec-tools	Assignee:	Xunlei Pang <xlpang>
Status:	CLOSED ERRATA	QA Contact:	Qiao Zhao <qzhao>
Severity:	high	Docs Contact:
Priority:	urgent
Version:	6.7	CC:	bhe, bwoods, ccui, cye, dyoung, jaeshin, mmilgram, ruyang, xlpang
Target Milestone:	rc	Keywords:	ZStream
Target Release:	---
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:	kexec-tools-2.0.0-301.el6	Doc Type:	If docs needed, set a value
Doc Text:	undefined	Story Points:	---
Clone Of:
Clones:	1375890 (view as bug list)		Environment:
Last Closed:	2017-03-21 09:12:35 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	1269194, 1359574, 1366045, 1375890

Description Billy Woods 2016-07-14 13:18:18 UTC

Description of problem:

The customer is running kexec-tools-2.0.0-300.el6.x86_64. Kdump fails to start because the customer uses a tap device with OpenVPN. Case number 01620459.

Version-Release number of selected component (if applicable):

kexec-tools-2.0.0-300.el6.x86_64

How reproducible:

Every time

Steps to reproduce:

 Created By: Robb Manes  (6/20/2016 3:48 PM)

// Working Notes - these notes are not intended as a meaningful communication
// but rather an indicator of current thought processes and reference.
// Please feel free to comment and ask questions concerning them.  

My steps to reproduce, which worked without problems:

Make the tap device manually:

	# tunctl -t tap0
	Set 'tap0' persistent and owned by uid 0

Set up SSH kdump:

	/etc/kdump.conf
	ssh kdump.redhat.com
	path /share/

Set up kdump keys:

	# service kdump propagate
	Using existing keys...
	kdump.redhat.com's password: 
	/root/.ssh/kdump_id_rsa has been added to ~kdump/.ssh/authorized_keys on waffle.usersys.redhat.com

Restart kdump:

	# service kdump restart
	Stopping kdump:                                            [  OK  ]
	Detected change(s) the following file(s):
	  
	  /etc/kdump.conf
	Rebuilding /boot/initrd-2.6.32-642.el6.x86_64kdump.img
	Starting kdump:                                            [  OK  ]

Ensure kdump kernel is loaded:

	# grep -i crash /proc/iomem 
	  03000000-0b0fffff : Crash kernel

Crash the system:

	# echo 'c' > /proc/sysrq-trigger

From the console, I can see:

	$ virsh console rhel6-kdump-test
	- - - - - - - - - 8< - - - - - - - - - 
	mapping eth0 to eth0
	udhcpc (v1.15.1) started
	Sending discover...
	Sending select for 10.12.212.85...
	Lease of 10.12.212.85 obtained, lease time 43200
	deleting routers
	adding dns 10.11.5.4
	adding dns 10.11.5.3
	Saving to remote location kdump.redhat.com
	Saving vmcore-dmesg.txt
	reverse mapping checking getaddrinfo for unused [10.12.213.189] failed - POSSIBLE BREAK-IN ATTEMPT!
	63+1 records in
	63+1 records out
	32270 bytes (32 kB) copied, 0.000106353 s, 303 MB/s
	Saved vmcore-dmesg.txt
	Free memory/Total memory (free %): 66724 / 114296 ( 58.3782 )
	Excluding unnecessary pages        : [100.0 %] |reverse mapping checking getaddrinfo for unused [10.12.213.189] failed - POSSIBLE BREAK-IN ATTEMPT!
	Copying data                       : [100.0 %] \
	59550+465 records in
	59566+1 records out
	30497992 bytes (30 MB) copied, 2.0862 s, 14.6 MB/s
	Saving core complete
	Restarting system.

From the SSH host, I see the core:

	$ file /share/10.12.212.85-2016-06-20-15\:29\:05/vmcore.flat 
	/share/10.12.212.85-2016-06-20-15:29:05/vmcore.flat: data

So, on my host it works as expected.  All I did was create a tunnel without configuration.

I note that in the last attempts provided to us, tap0 is missing a 'device' parameter in /sys:

	# service kdump restart
	Stopping kdump:                                            [  OK  ]
	No kdump initial ramdisk found.                            [WARNING]
	Rebuilding /boot/initrd-2.6.32-573.22.1.el6.x86_64kdump.img
	ls: cannot access /sys/class/net/tap0/device: No such file or directory
	Starting kdump:                                            [  OK  ]

My host does not have this either, but there was no issue or complaint when I rebuilt the ramdisk:

	# ls /sys/class/net/tap0/dev*
	/sys/class/net/tap0/dev_id  /sys/class/net/tap0/dev_port

	# mv /boot/initrd-2.6.32-573.22.1.el6.x86_64kdump.img /boot/initrd-2.6.32-573.22.1.el6.x86_64kdump.img.backup

	# service kdump restart
	Stopping kdump:                                            [  OK  ]
	No kdump initial ramdisk found.                            [WARNING]
	Rebuilding /boot/initrd-2.6.32-573.22.1.el6.x86_64kdump.img
	Starting kdump:                                            [  OK  ]

And I am running an identical kernel:

	# uname -a
	Linux rhel6-kdump-test 2.6.32-573.22.1.el6.x86_64 #1 SMP Thu Mar 17 03:23:39 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux

There is one major difference here, as Nitin pointed out.  mkdumprd passes the interface that routes to the dump target as the argument to handlenetdev(), and in this scenario that interface might be the VPN tunnel:

	- - - - - - - - - 8< - - - - - - - - - 
		    #find ethernet device used to route to remote host, ie eth0  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
		    netdev=`/sbin/ip route get to $remoteip 2>&1`  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
		    [ $? != 0 ] && echo "Bad kdump location: $config_val" && cleanup_and_exit 1
		    DUMP_TARGET=$config_val
		    #the field in the ip output changes if we go to another subnet
		    OFF_SUBNET=`echo $netdev | grep via`
		    if [ -n "$OFF_SUBNET" ]
		    then
		        # we are going to a different subnet
		        netdev=`echo $netdev|awk '{print $5;}'|head -n 1`
		    else
		        # we are on the same subnet
		        netdev=`echo $netdev|awk '{print $3}'|head -n 1`
		    fi

		    #add the ethernet device to the list of modules 
		    mkdir -p $MNTIMAGE/etc/network/
		    handlenetdev $netdev  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
		        |
	      .---------'
	      v
	handlenetdev() {
	    local dev=$1
	    local ifcfg_file
	    local vnet_prefix

	    case " $handlednetdevices " in
		*" $dev "*)
		    return ;;
		*) handlednetdevices="$handlednetdevices $dev" ;;
	    esac

	    ifcfg_file=`find_ifcfg_by_devicename $dev`  --------.
	    if [ -z "${ifcfg_file}" ]; then                     |
		error "The ifcfg-$dev or ifcfg-xxx which contains DEVICE=$dev field doesn't exist."
		cleanup_and_exit 1                              |
	    fi                                                  |
		  .---------------------------------------------'
		  v
	find_ifcfg_by_devicename() {
	    local dev=$1
	- - - - - - - - - 8< - - - - - - - - - 
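
To make the field selection in the excerpt above easier to follow, here is a minimal sketch of the same parsing step in isolation. The helper name route_dev and the sample route strings are assumptions made for illustration (the addresses are documentation ranges, not the customer's); only the `via` test and the awk field numbers come from the mkdumprd excerpt.

```shell
# Hypothetical helper: extract the outgoing device from one line of
# `ip route get to <ip>` output, using the same field logic as the excerpt.
route_dev() {
    local out=$1
    if echo "$out" | grep -q via; then
        # Off-subnet: "<ip> via <gw> dev <dev> src <src>", device is field 5
        echo "$out" | awk '{print $5}' | head -n 1
    else
        # On-subnet: "<ip> dev <dev> src <src>", device is field 3
        echo "$out" | awk '{print $3}' | head -n 1
    fi
}

# Illustrative inputs only, not taken from the customer's system.
route_dev "198.51.100.7 via 192.0.2.254 dev eth0  src 192.0.2.85"   # prints eth0
route_dev "192.0.2.10 dev tap0  src 192.0.2.1"                      # prints tap0
```

If the second form is what the customer's system returns for the dump target, handlenetdev would receive tap0 and then fail for lack of an ifcfg-tap0 file.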

In my example, my routable interface is not the tap device.  Let us see whether the tap device is the routed interface on the customer's system; unfortunately, I can't tell from an sosreport:

	$ cat sos_commands/networking/ip_route_show_table_all  | grep tap
	fe80::/64 dev tap0  proto kernel  metric 256  mtu 1500 advmss 1440 hoplimit 4294967295
	ff00::/8 dev tap0  table local  metric 256  mtu 1500 advmss 1440 hoplimit 4294967295
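
On a live system the question could be answered directly from the `ip route get` output. The sketch below is one possible check, not something taken from mkdumprd: the function name route_uses_tunnel is hypothetical, and matching on the tap*/tun* name prefix is an assumption, since interfaces can be given arbitrary names.

```shell
# Hypothetical pre-flight check: does the route to the dump target leave
# through a tun/tap interface? $1 is one line of `ip route get` output.
route_uses_tunnel() {
    local dev
    # Pull out the word that follows " dev " in the route line.
    dev=$(printf '%s\n' "$1" | sed -n 's/.* dev \([^ ]*\).*/\1/p' | head -n 1)
    case "$dev" in
        tap*|tun*) return 0 ;;   # name-based guess, see the assumption above
        *) return 1 ;;
    esac
}

# On the affected host one might run (remoteip being the resolved dump target):
#   route_uses_tunnel "$(/sbin/ip route get to "$remoteip")" && echo "routed via tunnel"
```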



Actual results:


[e723nb@smsslpoc1a ~]$ rpm -qa | grep -i kexec
kexec-tools-2.0.0-300.el6.x86_64

[e723nb@smsslpoc1a ~]$ sudo /sbin/service kdump restart
Stopping kdump:                                            [  OK  ]
No kdump initial ramdisk found.                            [WARNING]
Rebuilding /boot/initrd-2.6.32-573.22.1.el6.x86_64kdump.img
The ifcfg-tap0 or ifcfg-xxx which contains DEVICE=tap0 field doesn't exist.
Failed to run mkdumprd



Expected results:

For kdump to not look for the tap device
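
As a rough illustration of how a tap device could be told apart from a physical NIC, the sketch below relies on the observation from the notes above that tap0 has no 'device' entry under /sys/class/net. This is an assumption-laden example and not the change shipped in the erratum: the function name is_virtual_netdev and the overridable sysfs root are hypothetical, and other virtual devices (bonds, bridges, VLANs) also lack a 'device' entry, so this test alone would not be enough to decide which interfaces to skip.

```shell
# Hypothetical check, not the shipped fix: treat a netdev as virtual when it
# exists in sysfs but has no backing bus device.
# $1: device name; $2: sysfs root (defaults to /sys, overridable for testing)
is_virtual_netdev() {
    local dev=$1
    local sysroot=${2:-/sys}
    [ -d "$sysroot/class/net/$dev" ] && [ ! -e "$sysroot/class/net/$dev/device" ]
}

if is_virtual_netdev tap0; then
    echo "tap0 has no backing hardware device"
fi
```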

Additional info:


The customer can take down the network (tap device and OpenVPN) and start kdump. However, they would have to take down the network every time they upgrade the kernel, which impacts production. They cannot create an ifcfg file for the tap device because it conflicts with OpenVPN during reboot.

Comment 2 Xunlei Pang 2016-07-15 01:56:06 UTC

Hi Billy,

Do you have any remote environment for me to have a look?

Thanks!

Comment 3 Billy Woods 2016-07-15 12:30:39 UTC

Hello,

I do not have a remote environment at this time. I can request one from the customer if needed.


Thank you!

Comment 6 Dave Young 2016-07-26 08:23:28 UTC

Please ensure the server for dumping can be accessed without vpn, or kdump will fail.

Comment 7 Billy Woods 2016-07-26 13:50:07 UTC

Xunlei and Dave,

Ack, I have asked the customer to provide us with the info. Robb requested the "ip route get to" output a few days ago, and I have re-requested it along with the newly requested info.


Regards,
Billy Woods

Comment 35 errata-xmlrpc 2017-03-21 09:12:35 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0584.html