Bug 1409655 - iBFT (iSCSI boot): When iSCSI comes in from a secondary interface, after machine boots, it has two default routes (one on real, primary IF, one on IF dedicated to iSCSI).
Keywords:
Status: CLOSED CANTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: dracut
Version: 7.4
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Lukáš Nykrýn
QA Contact: Release Test Team
URL:
Whiteboard:
Depends On:
Blocks: 1298243 1420851 1465901 1466365 1549617 1551061
Reported: 2017-01-02 19:54 UTC by Thomas Gardner
Modified: 2018-06-14 10:53 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-06-14 10:53:26 UTC
Target Upstream Version:


Attachments
ifcfg-enp10s0 (334 bytes, text/plain)
2017-08-18 00:41 UTC, Thomas Gardner
ifcfg-ibft0 (202 bytes, text/plain)
2017-08-18 00:41 UTC, Thomas Gardner
Output of 'ip route show table all' command. (2.11 KB, text/plain)
2017-08-18 00:44 UTC, Thomas Gardner

Description Thomas Gardner 2017-01-02 19:54:14 UTC
Description of problem:

In gross terms (if I understand this correctly), the customer has
a network dedicated to iSCSI traffic.  They want this traffic to go
through a secondary interface on the machine, and leave their primary
interface for everything else.  The trouble is, since they're booting
from iSCSI (iBFT), the initial boot must have this secondary network
up, and it needs to have a gateway configured for that network.
Unfortunately, this becomes the default gateway, so when the machine
finishes booting, bringing up the primary interface (which is supposed
to have the real default gateway on it) in the process, they end up
with two default gateways.  The real one, through the primary network
and interface, and the one pointing to the router on the secondary
(iSCSI) network and interface.

From customer:
------------------------------------------------------------------------
Here is the ibft information we are using to boot:

[root@phylnxtst01-lab ~]# iscsiadm -m fw
# BEGIN RECORD 6.2.0.873-30
iface.initiatorname = iqn.2005-02.com.open-iscsi:phylnxtst01-lab
iface.transport_name = tcp
iface.hwaddress = 00:25:b5:01:01:1e
iface.bootproto = STATIC
iface.ipaddress = 10.42.48.35
iface.subnet_mask = 255.255.252.0
iface.gateway = 10.42.48.1
iface.vlan_id = 0
iface.net_ifacename = ibft0
node.name = iqn.1992-08.com.netapp:sn.c32b9c8476f311e2ba9e123478563412:vs.75
node.conn[0].address = 10.42.237.56
node.conn[0].port = 3260
node.boot_lun = 00000000
# END RECORD
------------------------------------------------------------------------

Also from customer:

------------------------------------------------------------------------
Every time the system boots, the network configuration is discovered
from ibft and the network is set up. This is when the gateway is
created. In addition to setting up the network, it also generates a
network script file. This file is stored in /run/initramfs/state.

[root@phylnxtst01-lab ~]# cat /run/initramfs/state/etc/sysconfig/network-scripts/ifcfg-ibft0
# Generated by dracut initrd
NAME="ibft0"
HWADDR="00:25:b5:01:01:1e"
DEVICE="ibft0"
ONBOOT=yes
NETBOOT=yes
UUID="d1305fc1-468d-46f8-9858-c5a59124249e"
IPV6INIT=yes
BOOTPROTO=ibft
GATEWAY="10.42.48.1"
TYPE=Ethernet

Then, when systemd starts the rhel-import-state service, it
copies over this configuration file.

[root@phylnxtst01-lab ~]# cat /lib/systemd/rhel-import-state
#!/bin/bash
# rhel-import-state: import state files from initramfs (e.g. network config)

# copy state into root
cd /run/initramfs/state
find . -mindepth 1 -maxdepth 1 -exec cp -av -t / {} \;

# run restorecon on the copied files
if [ -e /sys/fs/selinux/enforce -a -x /usr/sbin/restorecon ]; then
    find . -mindepth 1 -print0 | { cd / && xargs --null restorecon -iF; }
fi


The configuration file itself doesn't actually matter. Even if this
service was disabled, the gateway would still be configured during
boot.
------------------------------------------------------------------------

Version-Release number of selected component (if applicable):


How reproducible:

For this customer, every reboot.

Steps to Reproduce:
For this customer:
1.  Reboot.
2.  Note there are two default routes on two different networks
    accessed through two different interfaces.
3.  Cry out "Not again!" while shaking fists in the air.

Actual results:


Expected results:


Additional info:

Comment 1 Thomas Gardner 2017-01-12 21:27:55 UTC
1) I guess I should have included actual and expected results in my initial comment.  Re-reading, they're not quite as obvious now as they seemed to me then.

Actual results:

The _default_ route ends up on the secondary network (which is the only network at that point, of course) at initial boot time (before doing the pivot root to the real root).  This seems reasonable enough for the purpose of getting the system booted far enough to get the real (iSCSI) root mounted from the network, but the condition persists after the pivot.  Bringing up the rest of the network interfaces (including at least the real, primary network, which carries the real default route) then results in two default routes, because systemd finds the iSCSI interface already configured and leaves it alone.  I suppose one could rebuild the routing table after the pivot through some systemd thing, but that sounds far more complicated to implement because systemd treats networks and routes as a unit (I believe).  This secondary interface also can't simply be brought down and back up to fix the routing table, because all the system's filesystems are coming from it.

Expected results:

This secondary network needs to be initialized to have a limited route for its intended subnet through the router intended for that subnet.  The subnet and routing information is provided, but it seems to be getting ignored.
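
For reference, the addressing in the iBFT record from the description can be checked with a few lines of shell. This is only a sketch of the subnet arithmetic, not anything dracut itself runs; the helper names ip_to_int and is_on_link are made up for illustration, and the addresses are copied from the "iscsiadm -m fw" output above.

```shell
#!/bin/bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    local a b c d
    IFS=. read -r a b c d <<< "$1"
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Succeed (exit status 0) if TARGET ($3) lies inside the subnet
# defined by ADDR ($1) and MASK ($2).
is_on_link() {
    local addr mask target
    addr=$(ip_to_int "$1")
    mask=$(ip_to_int "$2")
    target=$(ip_to_int "$3")
    (( (addr & mask) == (target & mask) ))
}

# iface.ipaddress, iface.subnet_mask and node.conn[0].address
# from the iBFT record in the description.
if is_on_link 10.42.48.35 255.255.252.0 10.42.237.56; then
    echo "target is on-link; no gateway route needed"
else
    echo "target is off-link; reachable only via the gateway"
fi
```

With the customer's values this reports the target as off-link: 10.42.48.35 with mask 255.255.252.0 is the network 10.42.48.0/22, and 10.42.237.56 falls outside it. So the gateway 10.42.48.1 is genuinely needed to reach the iSCSI target; what is in question is only the scope of the route that uses it.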

------------------------------------------------------------------------

2) I also really should have pointed out that this works exactly as above for RHEL6 given the same ibft information.

Comment 2 Thomas Gardner 2017-01-13 16:32:41 UTC
(In reply to Thomas Gardner from comment #1)
> 2) I also really should have pointed out that this works exactly as above
> for RHEL6 given the same ibft information.

Well, that was worded poorly.  What I meant is that in RHEL6, it works just as in the "expected results" case above.  The secondary network interface (which must be brought up first because it has access to the iSCSI network, on which the root partition resides) has a route assigned to go through it which is limited in scope to the subnet for that network.  Then a default route is later assigned to go through the primary network interface, which isn't brought up until after the root pivot.

Comment 4 Peter Kotvan 2017-08-15 12:29:36 UTC
Hello Thomas,

I'm trying to reproduce this bug, without success so far. Can you please describe at which step of the installation the primary network is configured, and how? In particular, how is it set as primary?

Thank you in advance.

Comment 5 Thomas Gardner 2017-08-15 20:34:57 UTC
I just wanted to ACK your request.  It's going to take a little digging (much has happened since I worked on this case, and my brain is getting old and crusty).  I started reviewing the case again, but it's going to take me some time.  I can get you ifcfg-whatevs files quickly, if that'll help get you started.  Would that be a decent start while I finish rereading?

Comment 6 Peter Kotvan 2017-08-16 05:30:10 UTC
Hi Thomas,

that would be a good start! I have to figure out how the route configuration was achieved.

Thanks in advance.

Comment 7 Thomas Gardner 2017-08-18 00:37:00 UTC
Hi Peter,

Sorry it has taken me so long.  I've spent much of the day trying to wrap my head around this case again.  I'm not a networking guy, so I'm slow to follow some of it (and of course, I had to try to do some more investigation --- it's kinda who I am).  I was brought in because there's some systemd/dracut funny business going on that seems to be causing this.

I'll attach the config files in a moment, but I wanted to point out that the "GATEWAY=" line that is commented out in the ifcfg-ibft0 file was commented out by the user in an attempt to make it stop doing this.  It didn't work, but it's still there as an artifact.

It appears he was able to collect an SOS report with everything all messed up.  I'll also attach their "ip route show table all" output from the SOS report, but you can see from the first two lines:

default via 10.42.48.1 dev ibft0 
default via 10.42.44.1 dev enp10s0  proto static  metric 100 

it's got two default gateways.  Now that I went fishing in that file, I see it's got a bunch of errors in it I hadn't seen before; maybe those will provide some clues.  Again, I'm not a network guy, I was looking at this from a boot perspective.  I'll call those errors to the attention of the network guy who was contributing on this case, too.
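
A quick way to spot this condition in routing-table text is to count the entries that start with "default". The following is a rough sketch, not a tool from the case; count_default_routes is a hypothetical helper name, and the sample text is the two lines quoted above.

```shell
#!/bin/bash
# Count the "default" entries in routing-table text read from stdin.
count_default_routes() {
    awk '$1 == "default" { n++ } END { print n + 0 }'
}

# The two lines quoted above from the SOS report.
sample='default via 10.42.48.1 dev ibft0
default via 10.42.44.1 dev enp10s0  proto static  metric 100'

count=$(printf '%s\n' "$sample" | count_default_routes)
if [ "$count" -gt 1 ]; then
    echo "WARNING: $count default routes found"
fi
```

On a live system the same function could be fed from "ip route show" instead of the sample text. Output of "ip route show table all" may also list default entries from other tables, so a count taken from that output would need a closer look before drawing conclusions.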

You're trying this on a system with iBFT, right?  That seems to be the crux of the matter:  Using iBFT, as a secondary interface, to be configured from static parameters stored on the card, including what should be a limited scope route, but which becomes the default route.

Anyway, I'll attach those config files, now.

Comment 8 Thomas Gardner 2017-08-18 00:41:01 UTC
Created attachment 1314967 [details]
ifcfg-enp10s0

Comment 9 Thomas Gardner 2017-08-18 00:41:54 UTC
Created attachment 1314968 [details]
ifcfg-ibft0

Comment 10 Thomas Gardner 2017-08-18 00:44:16 UTC
Created attachment 1314969 [details]
Output of 'ip route show table all' command.

Comment 11 Peter Kotvan 2017-09-05 12:54:01 UTC
Hello Tom,

I had a chance to dig into the problem once again, and I found out a few things.

* During a manual installation one cannot set the default route for the ibft device, since it is not shown in anaconda. So it gets DEFROUTE=yes by default. You can also set the other network interface to be used for the default route during configuration, but you'll end up with the situation you described in your comments.

# ip r
default via 192.168.137.1 dev ibft0 proto static 
default via 192.168.122.1 dev eth1 proto static metric 100 
169.254.0.0/16 dev ibft0 scope link metric 1002 
192.168.122.0/24 dev eth1 proto kernel scope link src 192.168.122.43 metric 100 
192.168.137.0/24 dev ibft0 proto kernel scope link src 192.168.137.101

* After the first boot of the newly installed system there was another config file, "ifcfg-ibft0-1". It contained "DEFROUTE=yes". I changed this to "no" and rebooted the system. This seems to have fixed the issue, and the default route is set for the second network interface.

# ip r
default via 192.168.122.1 dev eth1 proto static metric 100 
169.254.0.0/16 dev ibft0 scope link metric 1002 
192.168.122.0/24 dev eth1 proto kernel scope link src 192.168.122.43 metric 100 
192.168.137.0/24 dev ibft0 proto kernel scope link src 192.168.137.101
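
The DEFROUTE change described above can be sketched as a small shell helper. This is an illustration under assumptions: set_defroute_no is a hypothetical name, and the directory in the usage comment is the usual network-scripts location, which this comment does not state explicitly.

```shell
#!/bin/bash
# Set DEFROUTE=no in an ifcfg file: rewrite an existing DEFROUTE line,
# or append one if the file has none.
set_defroute_no() {
    local file=$1
    if grep -q '^DEFROUTE=' "$file"; then
        sed -i 's/^DEFROUTE=.*/DEFROUTE=no/' "$file"
    else
        echo 'DEFROUTE=no' >> "$file"
    fi
}

# Example (path assumed; only the file name "ifcfg-ibft0-1" is given above):
# set_defroute_no /etc/sysconfig/network-scripts/ifcfg-ibft0-1
```

As in the test above, a reboot would still be needed afterwards for the change to show up in the routing table.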

Can you please ask the customer to try to reproduce these steps and check if it does fix the problem?

I'll ask my team-members whether the appearance of another ifcfg-ibft0 config file is expected and ok.

If the answer to both questions is yes, I suggest closing this as NOTABUG.

Comment 12 Peter Kotvan 2017-09-06 06:54:01 UTC
I've done some more research and figured out a few things.

Whether the default route is set through the ibft0 interface depends on whether a gateway is provided by DHCP or by the firmware configuration. If one is provided, it is used as the default route.

Can you ask the customer if this is the case? I configured the DHCP server not to provide a gateway, and the resulting default route went through the other (not ibft0) interface.

To me it seems that the behaviour is correct and the problem is a misconfiguration.

I'll file another bug to resolve the creation of ifcfg-ibft0-1 and I'll mention it here.

Comment 13 Peter Kotvan 2017-09-06 07:49:56 UTC
This is the bug 1488753 mentioned in comment 12.

Comment 15 Lukáš Nykrýn 2017-11-02 12:04:44 UTC
Based on the latest comments, it looks like there is actually nothing to do in dracut. Otherwise, please feel free to reopen this bug.

