
Bug 2082536

Summary:

[NMCI] NM should set ethernet layer up before calling wpa_supplicant to perform EAPOL login (8021x_hostapd_freeradius_doc_procedure)

Product:

Red Hat Enterprise Linux 8

Reporter:

David Jaša <djasa>

Component:

NetworkManager

Assignee:

NetworkManager Development Team <nm-team>

Status:

CLOSED MIGRATED

QA Contact:

Desktop QE <desktop-qa-list>

Severity:

low

Docs Contact:

Priority:

unspecified

Version:

8.6

CC:

bgalvani, dcaratti, lrintel, rkhan, sfaye, sukulkar, till, vbenes

Target Milestone:

Keywords:

MigratedToJIRA, Triaged

Target Release:

---

Flags:

pm-rhel: mirror+

Hardware:

Unspecified

OS:

Unspecified

Whiteboard:

Fixed In Version:

Doc Type:

If docs needed, set a value

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2023-08-17 12:18:51 UTC

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Attachments:

Description	Flags
behave report	none

Description David Jaša 2022-05-06 11:17:55 UTC

Description of problem:
This was a mysterious issue in nmci: 8021x_hostapd_freeradius_doc_procedure failed quite consistently on el8. wpa_supplicant called from the shell successfully authenticated against radius most of the time, but NM then failed to bring up the connection because wpa_supplicant the systemd service timed out waiting for any EAPOL reply (and NM then errored out with an unhelpful error about no secrets being available). The network topology is:

                      no NS | vethsetup NS
       +----------------+
       |        br0     |   |
test1 -+- test1b   eth4 +---|--- (uplink)
       +----------------+
  |             |
  |             +-- hostapd listens on br0
  |
  +-- wpa_supplicant connects to test1

and the statuses of the relevant interfaces before calling wpa_supplicant and 'nmcli c up ...' are:
68: test1@test1b: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 1e:fa:8e:06:df:81 brd ff:ff:ff:ff:ff:ff
67: test1b@test1: <NO-CARRIER,BROADCAST,MULTICAST,UP,M-DOWN> mtu 1500 qdisc noqueue master br0 state LOWERLAYERDOWN mode DEFAULT group default qlen 1000
    link/ether 1a:87:1f:1d:5e:e9 brd ff:ff:ff:ff:ff:ff
66: br0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
    link/ether 1a:87:1f:1d:5e:e9 brd ff:ff:ff:ff:ff:ff
38: eth4@if37: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0 state UP mode DEFAULT group default qlen 1000
    link/ether 86:56:b1:74:c0:fc brd ff:ff:ff:ff:ff:ff link-netns vethsetup
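
For readers who want to check the link flags mechanically rather than by eye, here is a minimal sketch that extracts the flag set of each interface from `ip link` text in the format shown above. The function names (`parse_link_flags`, `is_admin_up`) are hypothetical helpers invented for this illustration; they are not part of nmci or NetworkManager.

```python
import re

# Matches the first line of each interface record in `ip link` output, e.g.
# "68: test1@test1b: <BROADCAST,MULTICAST> mtu 1500 ... state DOWN ..."
LINK_RE = re.compile(
    r"^\s*\d+:\s+(?P<name>[^:@\s]+)(?:@[^:\s]+)?:\s+<(?P<flags>[^>]*)>"
)


def parse_link_flags(ip_link_output):
    """Return {interface name: set of flags} parsed from `ip link` text."""
    links = {}
    for line in ip_link_output.splitlines():
        match = LINK_RE.match(line)
        if match:
            flags = match.group("flags")
            links[match.group("name")] = set(flags.split(",")) if flags else set()
    return links


def is_admin_up(flags):
    """True when the interface is administratively up (the UP flag is set)."""
    return "UP" in flags


# Two of the records quoted in this report.
SAMPLE = """\
68: test1@test1b: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 1e:fa:8e:06:df:81 brd ff:ff:ff:ff:ff:ff
67: test1b@test1: <NO-CARRIER,BROADCAST,MULTICAST,UP,M-DOWN> mtu 1500 qdisc noqueue master br0 state LOWERLAYERDOWN mode DEFAULT group default qlen 1000
    link/ether 1a:87:1f:1d:5e:e9 brd ff:ff:ff:ff:ff:ff
"""

links = parse_link_flags(SAMPLE)
print(is_admin_up(links["test1"]))   # False
print(is_admin_up(links["test1b"]))  # True
```

In this listing test1 lacks the UP flag, while its veth peer test1b is up but reports NO-CARRIER because the other end is down.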

When the test1 interface is brought up using 'ip l set test1 up', the test consistently passes. So the likely explanation is that NM instructs wpa_supplicant.service to perform the EAPOL login on an interface whose link is down, and wpa_supplicant the systemd service then fails. IMO NM shouldn't leave bringing up the link layer to wpa_supplicant; it should do so itself before calling wpa_supplicant.
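
The workaround ordering described above could be scripted roughly as follows. This is only a sketch of the manual test-side workaround, not of how NM behaves or should be fixed. The helper names are hypothetical, and the connection name is left as a parameter because the report does not spell out the full 'nmcli c up ...' command.

```python
import subprocess


def workaround_commands(iface, connection):
    """Return the commands, in order: set the link up, then activate via NM."""
    return [
        ["ip", "link", "set", "dev", iface, "up"],
        ["nmcli", "connection", "up", "id", connection],
    ]


def run_workaround(iface, connection):
    """Run the commands in order, stopping at the first failure (needs root)."""
    for cmd in workaround_commands(iface, connection):
        subprocess.run(cmd, check=True)
```

Calling `run_workaround("test1", name)` with the name of the 802.1X profile would set the link up first, so that the EAPOL exchange starts on an interface that is already up.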


Version-Release number of selected component (if applicable):
main, 1.38, 1.36, 1.34 on el8 (el9 seems unaffected)

Comment 1 David Jaša 2022-05-06 11:28:18 UTC

Created attachment 1877548 [details]
behave report

The other point of view may be that test1 is also down before and after running 'wpa_supplicant -c ...', and that wpa_supplicant brings the interface up when logging in and down when exiting, as you can see in the attached Behave report; otherwise it wouldn't be able to authenticate against radius. Is this inconsistency between wpa_supplicant the systemd/dbus service and wpa_supplicant the shell command intentional, Davide?

Comment 2 David Jaša 2022-05-06 11:43:02 UTC

With a somewhat more diverse set of test machines, it seems that this is not the main issue, but I still think it is worth trying to reach some conclusion on this one.

Comment 3 Beniamino Galvani 2022-05-10 07:55:06 UTC

In "Status After Scenario" I see:

 70: test1b@test1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0 state UP group default qlen 1000
     link/ether 62:30:f7:2e:fe:f4 brd ff:ff:ff:ff:ff:ff
 71: test1@test1b: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
     link/ether 0a:e6:e1:f4:d1:7b brd ff:ff:ff:ff:ff:ff

Also in NM logs there is:

 <info>  [1651836008.0367] device (test1): carrier: link connected
 <info>  [1651836008.0398] device (test1): state change: unavailable -> disconnected (reason 'user-requested', sys-iface-state: 'managed')
 <info>  [1651836008.0522] device (test1): Activation: starting connection 'test1-ttls' (83fa4d0b-a806-4e94-8d47-fdf84a80c184)

The "carrier: link connected" indicates that there is carrier on the interface, and that can only happen when the link is set as "up".
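
The same check can be made against a longer log. The sketch below, assuming the NM log line format quoted above, reports whether "carrier: link connected" was logged for a device before its first activation started. The function name `carrier_before_activation` is a hypothetical helper, not an NM or nmci API.

```python
import re

# Matches device lines such as:
# "<info>  [1651836008.0367] device (test1): carrier: link connected"
LOG_RE = re.compile(
    r"<(?P<level>\w+)>\s+\[(?P<ts>\d+\.\d+)\]\s+device \((?P<dev>[^)]+)\): (?P<msg>.*)"
)


def carrier_before_activation(log_text, dev):
    """True if carrier was reported for `dev` before its first activation."""
    carrier_seen = False
    for line in log_text.splitlines():
        match = LOG_RE.search(line)
        if not match or match.group("dev") != dev:
            continue
        msg = match.group("msg")
        if msg.startswith("carrier: link connected"):
            carrier_seen = True
        elif msg.startswith("Activation: starting connection"):
            # First activation reached: report whether carrier came earlier.
            return carrier_seen
    return False


# The three log lines quoted in this comment.
LOG = """\
 <info>  [1651836008.0367] device (test1): carrier: link connected
 <info>  [1651836008.0398] device (test1): state change: unavailable -> disconnected (reason 'user-requested', sys-iface-state: 'managed')
 <info>  [1651836008.0522] device (test1): Activation: starting connection 'test1-ttls' (83fa4d0b-a806-4e94-8d47-fdf84a80c184)
"""

print(carrier_before_activation(LOG, "test1"))  # True
```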

I think the link status is not the problem here. If you can provide an NM log at trace level, that would probably help in understanding the actual problem.