Bug 1126656 - mkdumprd can't get non-default static route
Summary: mkdumprd can't get non-default static route
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: kexec-tools
Version: 21
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Baoquan He
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On: 1125182
Blocks: 1126653
Reported: 2014-08-05 02:04 UTC by Qiao Zhao
Modified: 2015-11-25 07:31 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 1125182
Environment:
Last Closed: 2015-11-25 07:31:43 UTC
Type: Bug
Embargoed:


Description Qiao Zhao 2014-08-05 02:04:55 UTC
+++ This bug was initially created as a clone of Bug #1125182 +++

Description of problem:
Unlike bug 806992, my environment is:
+-----------------------+       +------------------------+
| guest1                |       |  guest2                |
| eth0  10.66.xx.xx     |       | eth0 10.66.xx.xx       |
| eth1  192.168.10.x    |       | eth1 192.168.20.x      |
+-----------------------+       +------------------------+
[guest1 ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth1 
DEVICE=eth1
TYPE=Ethernet
ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.10.1
NETMASK=255.255.255.0
HWADDR=02:00:00:00:00:10

[guest2 ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
TYPE=Ethernet
ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.20.1
NETMASK=255.255.255.0
HWADDR=02:00:00:00:00:12

Add route to guest1/guest2
[guest1 ~]# ip route add 192.168.20.0/24 dev eth1
# ip route show
192.168.20.0/24 dev eth1  scope link 
192.168.10.0/24 dev eth1  proto kernel  scope link  src 192.168.10.1 
10.66.86.0/23 dev eth0  proto kernel  scope link  src 10.66.87.250 
169.254.0.0/16 dev eth0  scope link  metric 1002 
169.254.0.0/16 dev eth1  scope link  metric 1003 
default via 10.66.87.254 dev eth0 

[guest2 ~]# ip route add 192.168.10.0/24 dev eth1
# ip route show
192.168.20.0/24 dev eth1  proto kernel  scope link  src 192.168.20.1 
192.168.10.0/24 dev eth1  scope link 
10.66.86.0/23 dev eth0  proto kernel  scope link  src 10.66.86.250 
169.254.0.0/16 dev eth0  scope link  metric 1002 
169.254.0.0/16 dev eth1  scope link  metric 1003 
default via 10.66.87.254 dev eth0 

In guest1:
[guest1 ~]# ping 192.168.20.1
PING 192.168.20.1 (192.168.20.1) 56(84) bytes of data.
64 bytes from 192.168.20.1: icmp_seq=1 ttl=64 time=3.53 ms

start kdump service in guest1
# grep -v ^# /etc/kdump.conf

net 192.168.20.1:/export/tmp
path /var/crash
core_collector makedumpfile -c --message-level 1 -d 31

# service kdump restart

# echo c > /proc/sysrq-trigger

[console log]
 vda: vda1 vda2
mapping eth1 to eth1
8021q: adding VLAN 0 to HW filter on device eth1
ip: RTNETLINK answers: No such process
Saving to remote location 192.168.20.1:/export/tmp
mount: RPC: Remote system error - Network is unreachable
Restarting system.
machine restart
[/console log]

Version-Release number of selected component (if applicable):
kexec-tools-2.0.0-278.el6

How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:
= 1 =  [use `ip route add 192.168.20.0/24 dev eth1`] - dump failed
+ local dev=eth1
++ /sbin/ip route show
++ grep '^[[:digit:]].*via.* eth1 '
+ local routes=
+ '[' -z '' ']'
++ /sbin/ip route show
++ awk '/^default/ {print $3}'
+ GATEWAY=10.66.87.254
+ '[' -n 10.66.87.254 ']'
+ echo '  ' gateway 10.66.87.254
+ '[' -n '' ']'
+ set +x
Starting kdump:                                            [  OK  ]

= 2 =  [use `ip route add 192.168.20.0/24 via 192.168.10.2 dev eth1`] - dump ok (192.168.20.2 is another machine)
+ local dev=eth1
++ /sbin/ip route show
++ grep '^[[:digit:]].*via.* eth1 '
+ local 'routes=192.168.20.0/24 via 192.168.10.2 dev eth1 '
+ '[' -z '' ']'
++ /sbin/ip route show
++ awk '/^default/ {print $3}'
+ GATEWAY=10.66.87.254
+ '[' -n 10.66.87.254 ']'
+ echo '  ' gateway 10.66.87.254
+ '[' -n '192.168.20.0/24 via 192.168.10.2 dev eth1 ' ']'
+ /sbin/ip route show
+ grep '^[[:digit:]].*via.* eth1 '
+ set +x
Starting kdump:                                            [  OK  ]

--- Additional comment from Vivek Goyal on 2014-07-31 09:35:55 EDT ---

Bao, I thought we solved the static route issue in rhel6. Is this some corner-case configuration issue?

--- Additional comment from Baoquan He on 2014-07-31 23:03:18 EDT ---

(In reply to Vivek Goyal from comment #1)
> Bao, I thought we solved static route issue in rhel6. Is it some corner case
> configuration issue.

Yes, it's a very unusual case. When 2 machines are connected directly or through a bridge, they are usually configured in the same subnet,
say 192.168.10.1 <--> 192.168.10.2
In this case nothing extra is needed; they can communicate with each other directly.

However, if they are configured in different subnets,
e.g. 192.168.10.1 <--> 192.168.20.1
then, although they are connected directly, the network admin has to configure a specific route. The nexthop IP address can be omitted, which means both of the routes below work.

192.168.20.0/24 via 192.168.20.1 dev eth0

192.168.20.0/24 dev eth0

The case Qiao is talking about is the 2nd route, the one without "via xxx". In the kdump implementation we grep for "via xxx" to find routes that cross subnets, so this route is skipped.

I can change the kdump script to find all routes that go through a given NIC, but then the direct connection route will be added too. The direct connection route is created by the kernel network stack when a NIC is configured. Say IP address 192.168.10.1 is configured on NIC eth0; then a direct network route "192.168.10.0/24 dev eth0" is added automatically. I can't distinguish them.
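
For illustration, the filter behaviour described above can be reproduced without a live system. This is only a sketch, not the actual mkdumprd code: the routing table is the guest1 output from the report pasted in as sample data, the first grep pattern is the one visible in the debug trace, and the function names and the second pattern are hypothetical.

```shell
# Sample of guest1's routing table as shown in the report. The real script
# reads this from /sbin/ip route show; each printed line ends with a space.
sample_routes() {
    printf '%s \n' \
        '192.168.20.0/24 dev eth1  scope link' \
        '192.168.10.0/24 dev eth1  proto kernel  scope link  src 192.168.10.1' \
        '10.66.86.0/23 dev eth0  proto kernel  scope link  src 10.66.87.250' \
        '169.254.0.0/16 dev eth0  scope link  metric 1002' \
        '169.254.0.0/16 dev eth1  scope link  metric 1003' \
        'default via 10.66.87.254 dev eth0'
}

# Pattern from the debug trace: only routes with a "via" next hop match,
# so the on-link route 192.168.20.0/24 dev eth1 is never collected.
routes_with_via() {
    sample_routes | grep "^[[:digit:]].*via.* $1 "
}

# Hypothetical broader pattern: every non-default route through the device.
# This also picks up the kernel-created direct connection route.
routes_through_dev() {
    sample_routes | grep "^[[:digit:]].* dev $1 "
}

echo "routes matched by the via filter for eth1:"
routes_with_via eth1
echo "routes matched by the dev filter for eth1:"
routes_through_dev eth1
```

With this sample the via filter prints nothing for eth1, while the dev filter prints the static route together with the direct connection and link-local routes.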

--- Additional comment from Baoquan He on 2014-08-01 00:24:26 EDT ---


So for this case there are 3 choices:

1. Keep this bug open and handle it if any customers complain about it.

2. Add a description to the documentation asking customers to add the nexthop explicitly if they use this corner-case configuration. This was suggested by Qiao.

3. Grep all routes that go through a given NIC. The drawback is that the unnecessary direct connection route will be added too.

--- Additional comment from Baoquan He on 2014-08-01 01:23:40 EDT ---

Hi Marc,

What do you think about this issue? From your point of view, or from the customer side, any suggestions or ideas?

Thanks
Baoquan

--- Additional comment from Vivek Goyal on 2014-08-01 09:22:59 EDT ---

(In reply to Baoquan He from comment #2)
> (In reply to Vivek Goyal from comment #1)
> > Bao, I thought we solved static route issue in rhel6. Is it some corner case
> > configuration issue.
> 
> Yes, it's a very weird case. When 2 machines are connected directly or by a
> bridge, usually they are configured in the same subnet, 
> say 192.168.10.1 <--> 192.168.10.2
> In this case, nothing is needed, they can communicate with each other
> directly.
> 
> However, if they are configured in different subnet, 
> e.g 192.168.10.1 <--> 192.168.20.1
> In this case, though they are connected directly, a specified route has to
> be configured by network admin. And you can skip nexthop ipaddr. Means below
> 2 routes works well.
> 
> 192.168.20.0/24 via 192.168.20.1 dev eth0
> 
> 192.168.20.0/24 dev eth0
> 
> The case Qiao are taking about is the 2nd route which is without "via xxx".
> Because in kdump implementation, we grep "via xxx" to find a crossing subnet
> route. Then this route is skipped too. 
> 
> I can change kdump script to find all routes which go through a certain NIC,
> but then the direct connection route will be added too. the direct
> connection route is created by kernel stack when a NIC is configured. Say ip
> addr is configured on NIC eth0, 192.168.10.1, then a direct network
> connection "192.168.10.0/24 dev eth0" is added automatically. I can't
> distinguish them.

Can't we look at the NIC IP and netmask and see whether a route is the subnet route for that IP (which is added automatically) or does not belong to that subnet? If it does not belong to the subnet, then it is a static route.
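
A minimal sketch of the check suggested above, assuming IPv4 only and CIDR-style prefixes as printed by ip route show. The function names (ip_to_int, network_of, is_connected_route) are hypothetical and are not part of kexec-tools.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    local IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Network address (as an integer) of IP under the given prefix length.
network_of() {
    local mask=$(( (0xFFFFFFFF << (32 - $2)) & 0xFFFFFFFF ))
    echo $(( $(ip_to_int "$1") & mask ))
}

# is_connected_route ROUTE_CIDR NIC_IP NIC_PREFIXLEN
# Succeeds when ROUTE_CIDR is simply the NIC's own subnet, i.e. the route
# the kernel adds automatically. Anything else through that NIC would be
# treated as a static route.
is_connected_route() {
    local route_net="${1%/*}" route_len="${1#*/}"
    [ "$route_len" -eq "$3" ] &&
        [ "$(ip_to_int "$route_net")" -eq "$(network_of "$2" "$3")" ]
}

for route in 192.168.10.0/24 192.168.20.0/24; do
    if is_connected_route "$route" 192.168.10.1 24; then
        echo "$route: subnet route of the NIC"
    else
        echo "$route: static route"
    fi
done
```

The sketch does not handle host routes printed without a prefix length; those would need a /32 default.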

--- Additional comment from Marc Milgram on 2014-08-01 10:01:08 EDT ---

Baoquan,

I don't see any customer cases linked to this BZ, and I haven't run into a customer hitting this problem.

On the other hand, some customer will try to do this. It should probably get the proper fix.

Regards,

Marc

--- Additional comment from Xin Long on 2014-08-04 00:25:37 EDT ---

(In reply to Baoquan He from comment #2)
> (In reply to Vivek Goyal from comment #1)
> > Bao, I thought we solved static route issue in rhel6. Is it some corner case
> > configuration issue.
> 
> Yes, it's a very weird case. When 2 machines are connected directly or by a
> bridge, usually they are configured in the same subnet, 
> say 192.168.10.1 <--> 192.168.10.2
> In this case, nothing is needed, they can communicate with each other
> directly.
> 
> However, if they are configured in different subnet, 
> e.g 192.168.10.1 <--> 192.168.20.1
> In this case, though they are connected directly, a specified route has to
> be configured by network admin. And you can skip nexthop ipaddr. Means below
> 2 routes works well.
> 
> 192.168.20.0/24 via 192.168.20.1 dev eth0
> 
> 192.168.20.0/24 dev eth0
> 

Actually, the route 192.168.20.0/24 via 192.168.20.1 dev eth0 cannot work, because 192.168.20.1 is not a directly reachable address. If you want to do it this way, you need to add another route, like:
 
ip route add 192.168.20.1 dev eth0

But when you ping 192.168.20.1, the route that matches first is still the one added by *ip route add 192.168.20.1 dev eth0*, so I think the issue cannot be solved this way; it's not a good workaround.

If a customer runs kdump with this network topology, I suggest we fix this bug.
> The case Qiao are taking about is the 2nd route which is without "via xxx".
> Because in kdump implementation, we grep "via xxx" to find a crossing subnet
> route. Then this route is skipped too. 
> 
> I can change kdump script to find all routes which go through a certain NIC,
> but then the direct connection route will be added too. the direct
> connection route is created by kernel stack when a NIC is configured. Say ip
> addr is configured on NIC eth0, 192.168.10.1, then a direct network
> connection "192.168.10.0/24 dev eth0" is added automatically. I can't
> distinguish them.

--- Additional comment from Baoquan He on 2014-08-04 04:43:03 EDT ---


Hi all,

As Long said, the 2nd choice I wrote doesn't work; it is not a workaround.

Now, after discussion with Long and wpan, who are familiar with networking, 2 ways have come up:

1. Just find all routes through a given NIC and add them to the route table in the kdump kernel. Although direct connection routes are included too, this doesn't break anything; only a notice is printed saying that the route already exists, specifically for the direct connection routes. This way is the simplest and most direct.

2. Use the netmask to find the specific route. Say a NIC is currently configured as
192.168.10.1 and the target is 192.168.20.1.

Then ip route show gives:

192.168.10.0/24 dev eth0 proto kernel  scope link src 192.168.10.1
192.168.20.0/24 dev eth0
or
192.168.10.0/24 via 192.168.10.2 dev eth0

Here we AND the netmask with the target IP address. If the result equals the route address, then this is the route we want, and only this one is added to the kdump kernel.

This way is certainly a little more complicated: the IP address needs to be converted to an integer, and the netmask has to be obtained.
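
The 2nd way can be sketched roughly as follows, assuming IPv4 and routes written in CIDR form. The helper names (ip_to_int, route_matches_target) are hypothetical, and the list of routes is sample data rather than live ip route show output.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    local IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# route_matches_target ROUTE_CIDR TARGET_IP
# AND the target with the route's netmask; the route is the one we want
# when the result equals the route's network address.
route_matches_target() {
    local net="${1%/*}" len="${1#*/}"
    local mask=$(( (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$2") & mask )) -eq "$(ip_to_int "$net")" ]
}

target=192.168.20.1
for route in 192.168.10.0/24 192.168.20.0/24 169.254.0.0/16; do
    if route_matches_target "$route" "$target"; then
        echo "keep $route"
    fi
done
# prints: keep 192.168.20.0/24
```

Only the route whose network contains the dump target is kept, so the NIC's own direct connection route is left out.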


So of the above 2 ways, which one do you prefer? Any drawbacks, suggestions, or better ideas?

Thanks
Baoquan

Comment 1 Fedora End Of Life 2015-11-04 13:44:06 UTC
This message is a reminder that Fedora 21 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 21. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '21'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 21 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.
