
Bug 910348

Summary: empty bridge has IFF_LOWER_UP flag and therefore breaks carrier detection
Product: Red Hat Enterprise Linux 7
Component: kernel
Version: 7.0
Status: CLOSED WONTFIX
Severity: unspecified
Priority: unspecified
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Reporter: Pavel Šimerda (pavlix) <psimerda>
Assignee: Jiri Pirko <jpirko>
QA Contact: Red Hat Kernel QE team <kernel-qe>
CC: dcbw, jpirko, rkhan, tgraf
Doc Type: Bug Fix
Type: Bug
Last Closed: 2013-03-08 10:01:42 UTC

Description Pavel Šimerda (pavlix) 2013-02-12 12:28:11 UTC
Userspace configuration tools like NetworkManager need to know when a bridge is ready for use, which is when at least one slave device has IFF_LOWER_UP. This is almost exactly how the kernel sets the master's IFF_LOWER_UP, with the sole exception of a bridge without slaves.

There is a workaround in userspace to only consider a bridge up if it has slaves. Unfortunately there may be a race condition between the time userspace learns about a new slave and the time it learns about the master's IFF_LOWER_UP.

Expected behavior:

Bridge without slaves doesn't have IFF_LOWER_UP flag.

Actual behavior:

Bridge without slaves has IFF_LOWER_UP flag.

I'm interested both in fixing this in the kernel and in finding the best workaround that lets the userspace daemon work with current kernels.

Comment 2 Thomas Graf 2013-02-13 08:36:14 UTC
One possibility I see is adding a workaround to libnl that removes the IFF_LOWER_UP flag for the bridge if no slave is referring to it or to provide a new API rtnl_link_bridge_is_up() that depends on IFF_LOWER_UP and number of slaves > 0.

Obviously you'd also need to listen for new port bridge notifications.
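
A minimal sketch of the check proposed above, written in Python rather than as a libnl function. `bridge_is_up` is a hypothetical helper (rtnl_link_bridge_is_up() is only a proposed name in this comment, not an existing libnl call), and the caller is assumed to already know the bridge's interface flags and how many slaves refer to it. The value of IFF_LOWER_UP is taken from linux/if.h.

```python
# Value of IFF_LOWER_UP as defined in <linux/if.h>.
IFF_LOWER_UP = 0x10000


def bridge_is_up(flags: int, slave_count: int) -> bool:
    """Hypothetical check: the bridge counts as up only if the kernel
    reports IFF_LOWER_UP and at least one slave refers to the bridge."""
    return bool(flags & IFF_LOWER_UP) and slave_count > 0
```

With this check an empty bridge is reported as down even though the kernel sets IFF_LOWER_UP on it. The check is only as current as the flags and slave count passed in, so it still has to be re-evaluated on every link notification.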

Comment 3 Jiri Pirko 2013-02-13 11:08:21 UTC
openvswitch behaves the same. Upstream does not like to change the behaviour saying that it would break some scripts.

I wonder if using new libnl api would help to avoid the race Pavel described.

Comment 4 Pavel Šimerda (pavlix) 2013-02-14 11:44:43 UTC
(In reply to comment #2)
> One possibility I see is adding a workaround to libnl that removes the
> IFF_LOWER_UP flag for the bridge if no slave is referring to it or to
> provide a new API rtnl_link_bridge_is_up() that depends on IFF_LOWER_UP and
> number of slaves > 0.
> 
> Obviously you'd also need to listen for new port bridge notifications.

This looks rather complicated for doing it in libnl and would probably confuse other libnl users.

(In reply to comment #3)
> openvswitch behaves the same. Upstream does not like to change the behaviour
> saying that it would break some scripts.
> 
> I wonder if using new libnl api would help to avoid the race Pavel described.

Nope. I'm doing this in NetworkManager's NMPlatform already and it's not such a big deal. The problem is in the race condition itself:

(In reply to comment #0)
> Unfortunately there may be a race condition between the time
> userspace learns about a new slave and the time it learns about the master's
> IFF_LOWER_UP.

That means the invariant of 'a bridge either has no slave or follows the slaves' carrier' doesn't hold for a short while, at least from the application's perspective. At least this is how I understand dcbw's information (adding to Cc).
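
The race can be illustrated with a small simulation. This is a sketch under assumptions, not NetworkManager code: the event ordering shown is one ordering in which the problem would appear, the names br0 and eth0 are made up, and `replay` is a hypothetical helper that mimics a userspace cache updated one notification at a time.

```python
# Value of IFF_LOWER_UP as defined in <linux/if.h>.
IFF_LOWER_UP = 0x10000


def replay(events):
    """Apply link events in arrival order. After each event, record whether
    the 'IFF_LOWER_UP and at least one slave' workaround reports the bridge
    as up, based on what userspace has seen so far."""
    master_flags = IFF_LOWER_UP  # empty bridge: the kernel sets the flag
    slaves = {}                  # slave name -> last seen flags
    verdicts = []
    for kind, name, flags in events:
        if kind == "master":
            master_flags = flags
        else:
            slaves[name] = flags
        verdicts.append(bool(master_flags & IFF_LOWER_UP) and len(slaves) > 0)
    return verdicts


# Assumed ordering: the new slave (without carrier) is seen first,
# the master's updated flags only afterwards.
events = [
    ("slave", "eth0", 0),
    ("master", "br0", 0),
]
print(replay(events))  # [True, False]
```

Between the two events the cache holds a bridge with a slave and a stale IFF_LOWER_UP, so the workaround briefly reports the bridge as up although its only slave has no carrier.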

Comment 5 Jiri Pirko 2013-03-08 10:01:42 UTC
Unfortunately, this is a no-go in upstream (http://marc.info/?l=linux-netdev&m=136061292027445&w=2). Closing this as wontfix. Feel free to reopen if the situation changes.

Comment 6 Pavel Šimerda (pavlix) 2013-04-05 12:55:05 UTC
For the record, a bond's IFF_LOWER_UP can be wrong on other occasions too. I didn't test it thoroughly, but I got into the following state with the NetworkManager tests while only working around empty master devices:

iproute output:

1480: nm-test-device: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether ce:61:05:3c:c4:ab brd ff:ff:ff:ff:ff:ff
    inet6 fe80::cc61:5ff:fe3c:c4ab/64 scope link 
       valid_lft forever preferred_lft forever
1481: nm-test-slave: <BROADCAST,NOARP,SLAVE> mtu 1500 qdisc noqueue master nm-test-device state DOWN 
    link/ether ce:61:05:3c:c4:ab brd ff:ff:ff:ff:ff:ff

nm-test-slave is the only slave device of nm-test-device. The slave is down but the master signals IFF_LOWER_UP. This is another example of an inconsistent state.
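
A state like this can be spotted mechanically. The following is a rough sketch that parses `ip addr`-style text such as the output above and lists masters that show LOWER_UP while none of their slaves do. `parse_links` and `inconsistent_masters` are hypothetical helpers, and the regular expression is an assumption about the output layout rather than a stable interface.

```python
import re

# Matches the first line of each link, e.g.
# "1480: nm-test-device: <BROADCAST,...,LOWER_UP> mtu 1500 ..."
LINK_RE = re.compile(r"^\d+:\s+([^:@\s]+)(?:@\S+)?:\s+<([^>]*)>(.*)$")


def parse_links(text):
    """Return {name: {"flags": set of flag names, "master": name or None}}."""
    links = {}
    for line in text.splitlines():
        m = LINK_RE.match(line)
        if not m:
            continue  # address lines and other continuation lines
        name, flags, rest = m.groups()
        master = re.search(r"\bmaster\s+(\S+)", rest)
        links[name] = {
            "flags": set(flags.split(",")),
            "master": master.group(1) if master else None,
        }
    return links


def inconsistent_masters(links):
    """Masters that report LOWER_UP although no slave of theirs does."""
    result = []
    for name, link in links.items():
        if "MASTER" not in link["flags"] or "LOWER_UP" not in link["flags"]:
            continue
        slaves = [l for l in links.values() if l["master"] == name]
        if not any("LOWER_UP" in s["flags"] for s in slaves):
            result.append(name)
    return result


SAMPLE = """\
1480: nm-test-device: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether ce:61:05:3c:c4:ab brd ff:ff:ff:ff:ff:ff
1481: nm-test-slave: <BROADCAST,NOARP,SLAVE> mtu 1500 qdisc noqueue master nm-test-device state DOWN
    link/ether ce:61:05:3c:c4:ab brd ff:ff:ff:ff:ff:ff
"""
print(inconsistent_masters(parse_links(SAMPLE)))  # ['nm-test-device']
```

The same check also flags a master with no slaves at all, which is the case from the original description.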