Created attachment 1945170 [details]
syslog

Description of problem:
As shown in the attached screencast, anaconda fails to show the FCoE target after I select the CNA and click "Add fcoe disk".

Version-Release number of selected component (if applicable):
fcoe-utils-1.0.34-3.gitb233050.fc37.x86_64
anaconda-38.21-1.fc38.x86_64.rpm

How reproducible:
always

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
Created attachment 1945171 [details] screencast
I see the following in the log. I tried adding inst.selinux=0, but it doesn't help:

07:10:27,798 NOTICE audit:AVC avc: denied { create } for pid=2670 comm="fcoemon" scontext=system_u:system_r:kernel_t:s0 tcontext=system_u:system_r:kernel_t:s0 tclass=netlink_scsitransport_socket permissive=1
07:10:27,798 NOTICE kernel:audit: type=1400 audit(1676877027.796:401): avc: denied { create } for pid=2670 comm="fcoemon" scontext=system_u:system_r:kernel_t:s0 tcontext=system_u:system_r:kernel_t:s0 tclass=netlink_scsitransport_socket permissive=1
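The permissive=1 field at the end of each AVC record indicates that SELinux logged the denial without enforcing it. A minimal Python sketch for pulling the fields out of such a record is shown here; the helper name parse_avc and the regular expression are illustrative assumptions, not part of any audit tooling.

```python
import re

# Hypothetical helper (not from any audit library): matches the
# "avc: denied { perms } for key=value ..." part of a log line.
AVC_RE = re.compile(
    r"avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}\s+for\s+(?P<rest>.*)"
)


def parse_avc(line):
    """Return a dict of fields from one AVC denial line, or None if no match."""
    m = AVC_RE.search(line)
    if m is None:
        return None
    fields = {"perms": m.group("perms").split()}
    # The remainder is a whitespace-separated list of key=value tokens.
    for token in m.group("rest").split():
        if "=" in token:
            key, value = token.split("=", 1)
            fields[key] = value.strip('"')
    return fields


record = (
    '07:10:27,798 NOTICE audit:AVC avc: denied { create } for pid=2670 '
    'comm="fcoemon" scontext=system_u:system_r:kernel_t:s0 '
    'tcontext=system_u:system_r:kernel_t:s0 '
    'tclass=netlink_scsitransport_socket permissive=1'
)
info = parse_avc(record)
print(info["comm"], info["tclass"], "enforced:", info["permissive"] == "0")
```

For the record quoted in this comment, the "enforced" value comes out false, which suggests the denial on its own would not have blocked fcoemon.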
Created attachment 1945178 [details] syslog with inst.selinux=0
Proposed as a Blocker for 38-final by Fedora user lnie using the blocker tracking app because: This affects: The installer must be able to detect (if possible) and install to supported network-attached storage devices.
The installer runs in permissive mode, so the SELinux warnings are not relevant.

From syslog:

03:56:17,250 WARNING org.fedoraproject.Anaconda.Modules.Storage:INFO:anaconda.threading:Running Thread: AnaTaskThread-FCOEDiscoverTask-2 (139754307974848)
03:56:17,250 WARNING org.fedoraproject.Anaconda.Modules.Storage:INFO:anaconda.modules.common.task.task:Discover a FCoE
03:56:17,250 WARNING org.fedoraproject.Anaconda.Modules.Storage:INFO:blivet:Activating FCoE SAN attached to ens2f1, dcb: True autovlan: True
03:56:17,251 WARNING org.fedoraproject.Anaconda.Modules.Storage:INFO:program:Running... systemctl start lldpad.service
03:56:17,270 INFO systemd:Listening on lldpad.socket - Link Layer Discovery Protocol Agent Socket..
03:56:17,278 INFO systemd:Started lldpad.service - Link Layer Discovery Protocol Agent Daemon..
03:56:17,280 WARNING org.fedoraproject.Anaconda.Modules.Storage:DEBUG:program:Return code: 0
03:56:17,280 WARNING org.fedoraproject.Anaconda.Modules.Storage:INFO:program:Running... lldptool -p
03:56:17,340 WARNING org.fedoraproject.Anaconda.Modules.Storage:INFO:program:stdout:
03:56:17,340 WARNING org.fedoraproject.Anaconda.Modules.Storage:INFO:program:2841
03:56:17,340 WARNING org.fedoraproject.Anaconda.Modules.Storage:DEBUG:program:Return code: 0
03:56:17,340 WARNING org.fedoraproject.Anaconda.Modules.Storage:INFO:program:Running... dcbtool sc ens2f1 dcb on
03:56:17,352 WARNING org.fedoraproject.Anaconda.Modules.Storage:INFO:program:stdout:
03:56:17,352 WARNING org.fedoraproject.Anaconda.Modules.Storage:INFO:program:Command: #011Set Config
03:56:17,352 WARNING org.fedoraproject.Anaconda.Modules.Storage:INFO:program:Feature: #011DCB State
03:56:17,352 WARNING org.fedoraproject.Anaconda.Modules.Storage:INFO:program:Port: #011ens2f1
03:56:17,352 WARNING org.fedoraproject.Anaconda.Modules.Storage:INFO:program:Status: #011Successful
03:56:17,352 WARNING org.fedoraproject.Anaconda.Modules.Storage:DEBUG:program:Return code: 0
03:56:17,352 WARNING org.fedoraproject.Anaconda.Modules.Storage:INFO:program:Running... dcbtool sc ens2f1 pfc e:1 a:1 w:1
03:56:17,358 WARNING org.fedoraproject.Anaconda.Modules.Storage:INFO:program:stdout:
03:56:17,358 WARNING org.fedoraproject.Anaconda.Modules.Storage:INFO:program:Command: #011Set Config
03:56:17,358 WARNING org.fedoraproject.Anaconda.Modules.Storage:INFO:program:Feature: #011Priority Flow Control
03:56:17,358 WARNING org.fedoraproject.Anaconda.Modules.Storage:INFO:program:Port: #011ens2f1
03:56:17,358 WARNING org.fedoraproject.Anaconda.Modules.Storage:INFO:program:Status: #011Successful
03:56:17,358 WARNING org.fedoraproject.Anaconda.Modules.Storage:DEBUG:program:Return code: 0
03:56:17,358 WARNING org.fedoraproject.Anaconda.Modules.Storage:INFO:program:Running... dcbtool sc ens2f1 app:fcoe e:1 a:1 w:1
03:56:17,364 WARNING org.fedoraproject.Anaconda.Modules.Storage:INFO:program:stdout:
03:56:17,364 WARNING org.fedoraproject.Anaconda.Modules.Storage:INFO:program:Command: #011Set Config
03:56:17,364 WARNING org.fedoraproject.Anaconda.Modules.Storage:INFO:program:Feature: #011Application FCoE
03:56:17,364 WARNING org.fedoraproject.Anaconda.Modules.Storage:INFO:program:Port: #011ens2f1
03:56:17,364 WARNING org.fedoraproject.Anaconda.Modules.Storage:INFO:program:Status: #011Successful
03:56:17,364 WARNING org.fedoraproject.Anaconda.Modules.Storage:DEBUG:program:Return code: 0
03:56:17,380 DEBUG NetworkManager:<debug> [1676865377.3807] ndisc-lndp[0x55f1e987cce0,"eno0"]: processing libndp events
03:56:18,366 WARNING org.fedoraproject.Anaconda.Modules.Storage:INFO:program:Running... systemctl restart fcoe.service
03:56:18,381 INFO fcoemon:fcoemon: error 9 Bad file descriptor
03:56:18,381 INFO fcoemon:fcoemon: Failed write req D len 1
03:56:18,381 INFO systemd:Stopping fcoe.service - Open-FCoE initiator daemon...
03:56:18,382 INFO systemd:fcoe.service: Deactivated successfully.
03:56:18,393 INFO systemd:Stopped fcoe.service - Open-FCoE initiator daemon.
03:56:18,405 INFO systemd:Starting fcoe.service - Open-FCoE initiator daemon...
03:56:18,410 INFO systemd:Started fcoe.service - Open-FCoE initiator daemon.
03:56:18,412 WARNING org.fedoraproject.Anaconda.Modules.Storage:DEBUG:program:Return code: 0

The fcoemon tool seems to fail. Reassigning.
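To see at a glance which helper commands Anaconda ran and what each returned, the "Running..." and "Return code:" entries of a syslog excerpt like this one can be paired up. The following Python sketch assumes that each command's return code is the next "Return code:" entry after its "Running..." entry, which matches this excerpt but is not a documented guarantee of the log format; the function name command_results is hypothetical.

```python
import re

# Patterns for the two kinds of entry written by the "program" logger.
RUNNING_RE = re.compile(r":program:Running\.\.\.\s+(?P<cmd>.+?)\s*$")
RETURN_RE = re.compile(r":program:Return code:\s+(?P<rc>-?\d+)")


def command_results(lines):
    """Pair each 'Running... <cmd>' entry with the next 'Return code: N'."""
    results = []
    pending = None
    for line in lines:
        m = RUNNING_RE.search(line)
        if m:
            pending = m.group("cmd")
            continue
        m = RETURN_RE.search(line)
        if m and pending is not None:
            results.append((pending, int(m.group("rc"))))
            pending = None
    return results


PREFIX = "WARNING org.fedoraproject.Anaconda.Modules.Storage"
sample = [
    f"03:56:17,280 {PREFIX}:INFO:program:Running... lldptool -p",
    f"03:56:17,340 {PREFIX}:INFO:program:stdout:",
    f"03:56:17,340 {PREFIX}:INFO:program:2841",
    f"03:56:17,340 {PREFIX}:DEBUG:program:Return code: 0",
    f"03:56:18,366 {PREFIX}:INFO:program:Running... systemctl restart fcoe.service",
    "03:56:18,381 INFO fcoemon:fcoemon: error 9 Bad file descriptor",
    "03:56:18,381 INFO fcoemon:fcoemon: Failed write req D len 1",
    f"03:56:18,412 {PREFIX}:DEBUG:program:Return code: 0",
]

for cmd, rc in command_results(sample):
    print(f"{rc:3d}  {cmd}")
```

Every command in the excerpt reports return code 0, including the fcoe.service restart during which fcoemon logged its errors, so the return codes alone do not reveal the failure.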
+4 in https://pagure.io/fedora-qa/blocker-review/issue/1041 , marking accepted.
Chris, can you please take a look at this? It has been sitting here a long time. It is a Fedora 38 final release blocker, which means we need it fixed in the next month or so.
I'm pretty sure this is a network problem on these interfaces and not specific to FCoE.

F37 for comparison: start anaconda with inst.sshd and ssh in without interacting with anaconda at all.

ens2f0/1 are both connected

# nmcli dev
DEVICE  TYPE      STATE        CONNECTION
eno0    ethernet  connected    Wired Connection
ens2f0  ethernet  connected    ens2f0
ens2f1  ethernet  connected    ens2f1
eno1    ethernet  unavailable  --
lo      loopback  unmanaged    --

The fipvlan diagnostic command finds the fabric gateways

# fipvlan ens2f0 ens2f1
Fibre Channel Forwarders Discovered
interface       | VLAN | FCF MAC
------------------------------------------
ens2f0          | 802  | 00:05:73:b2:7f:00
ens2f1          | 802  | 00:05:73:b2:7f:00

Now let's try that with F38-20230317.n.0

NetworkManager seems to have activated the connections

# nmcli dev
DEVICE  TYPE      STATE                   CONNECTION
eno0    ethernet  connected               Wired Connection
ens2f0  ethernet  connected               ens2f0
ens2f1  ethernet  connected               ens2f1
lo      loopback  connected (externally)  lo
eno1    ethernet  unavailable             --

but now the link state shows NO-CARRIER and DORMANT (we haven't done anything except query network state at this point; on F37 the network links were "state UP mode DEFAULT")

# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eno0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether e8:39:35:2d:e0:b8 brd ff:ff:ff:ff:ff:ff
    altname enp2s0
3: ens2f0: <NO-CARRIER,BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state DORMANT mode DORMANT group default qlen 1000
    link/ether 00:1b:21:59:12:34 brd ff:ff:ff:ff:ff:ff
    altname enp7s0f0
4: eno1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN mode DEFAULT group default qlen 1000
    link/ether e8:39:35:2d:e0:b9 brd ff:ff:ff:ff:ff:ff
    altname enp3s0
5: ens2f1: <NO-CARRIER,BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state DORMANT mode DORMANT group default qlen 1000
    link/ether 00:1b:21:59:12:35 brd ff:ff:ff:ff:ff:ff
    altname enp7s0f1

fipvlan fails, because the interface isn't IFF_RUNNING

# fipvlan -d ens2f0
fipvlan: creating netlink socket
fipvlan: Using libfcoe module parameter interfaces
fipvlan: sending RTM_GETLINK dump request
fipvlan: RTM_NEWLINK: ifindex 1, type 772, flags 10049
fipvlan: RTM_NEWLINK: ifindex 2, type 1, flags 11043
fipvlan: RTM_NEWLINK: ifindex 3, type 1, flags 11003
fipvlan: RTM_NEWLINK: ifindex 4, type 1, flags 1003
fipvlan: RTM_NEWLINK: ifindex 5, type 1, flags 11003
fipvlan: NLMSG_DONE
fipvlan: if 3 not running, starting
fipvlan: sending RTM_SETLINK request
fipvlan: NLMSG_ERROR (0) Success
fipvlan: waiting for IFF_RUNNING [1/20]
fipvlan: return from poll 0
fipvlan: if 3 not running, waiting for link up
...
fipvlan: waiting for IFF_RUNNING [20/20]
fipvlan: return from poll 0
fipvlan: if 3 not running, waiting for link up
fipvlan: return from poll 0
fipvlan: if 2: skipping, FIP not ready
fipvlan: if 3: skipping, FIP not ready
fipvlan: if 4: skipping, FIP not ready
fipvlan: if 5: skipping, FIP not ready
No Fibre Channel Forwarders or VN2VN Responders Found
fipvlan: shutdown if 3
fipvlan: sending RTM_SETLINK request
fipvlan: NLMSG_ERROR (0) Success

and now let's check NetworkManager again

# nmcli dev
DEVICE  TYPE      STATE                   CONNECTION
eno0    ethernet  connected               Wired Connection
ens2f1  ethernet  connected               ens2f1
lo      loopback  connected (externally)  lo
eno1    ethernet  unavailable             --
ens2f0  ethernet  unavailable             --
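The flags values in the fipvlan debug output can be decoded to see which interfaces have IFF_RUNNING set. The Python sketch below uses the flag bit values from the Linux <linux/if.h> header and assumes fipvlan prints the flags in hexadecimal (the loopback value 10049 is consistent with that reading); the helper name decode_flags is hypothetical.

```python
# Interface flag bits as defined in the Linux UAPI header <linux/if.h>.
IFF_FLAGS = {
    0x1: "UP",
    0x2: "BROADCAST",
    0x4: "DEBUG",
    0x8: "LOOPBACK",
    0x10: "POINTOPOINT",
    0x20: "NOTRAILERS",
    0x40: "RUNNING",
    0x80: "NOARP",
    0x100: "PROMISC",
    0x200: "ALLMULTI",
    0x400: "MASTER",
    0x800: "SLAVE",
    0x1000: "MULTICAST",
    0x2000: "PORTSEL",
    0x4000: "AUTOMEDIA",
    0x8000: "DYNAMIC",
    0x10000: "LOWER_UP",
    0x20000: "DORMANT",
    0x40000: "ECHO",
}


def decode_flags(hex_text):
    """Decode a hex flags string (assumed format) into flag names, low bit first."""
    value = int(hex_text, 16)
    return [name for bit, name in sorted(IFF_FLAGS.items()) if value & bit]


# ifindex -> flags, copied from the fipvlan -d output in this comment.
observed = {1: "10049", 2: "11043", 3: "11003", 4: "1003", 5: "11003"}

for ifindex, flags in observed.items():
    names = decode_flags(flags)
    running = "RUNNING" in names
    print(f"if {ifindex}: running={running} {','.join(names)}")
```

Under that assumption, ifindex 2 (eno0) is the only Ethernet interface with RUNNING set, while ifindex 3 and 5 (ens2f0 and ens2f1) have UP and LOWER_UP but not RUNNING, matching the NO-CARRIER shown by ip link.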
If I stop NetworkManager from managing these interfaces, and reload the driver, things seem better.

# nmcli dev set ens2f0 managed no
# nmcli dev set ens2f1 managed no
# rmmod ixgbe
# modprobe ixgbe
# fipvlan ens2f0 ens2f1
Fibre Channel Forwarders Discovered
interface       | VLAN | FCF MAC
------------------------------------------
ens2f0          | 802  | 00:05:73:b2:7f:00
ens2f1          | 802  | 00:05:73:b2:7f:00

But after returning to Anaconda and attempting to add an FCoE SAN, it fails again and the links return to the DORMANT state.
Thanks for looking into it. Could it be a kernel issue?
(In reply to Adam Williamson from comment #10)
> Thanks for looking into it. Could it be a kernel issue?

Could be; I'm not familiar enough with the DORMANT state here. But I could only get the link working by telling NM to stop managing it, and I'm guessing that the Anaconda FCoE connection code might have gone back to requesting NM to activate the connection?
I don't know why the device ends up in "NO-CARRIER state DORMANT". From my understanding, that depends only on the NIC and the kernel driver, not on NetworkManager. I'm reassigning this bz to kernel.