Bug 1378910 - HPESD RHEL7.3-SN4- FCoE multipath BFS fails to boot after installation
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: dracut
Version: 7.3
Hardware: x86_64 Linux
Priority: urgent
Severity: urgent
Target Milestone: alpha
Target Release: 7.5
Assigned To: Lukáš Nykrýn
QA Contact: Release Test Team
Depends On:
Blocks: 1438583 1445812 1465137 1522983 1482185
Reported: 2016-09-23 09:41 EDT by RAVI
Modified: 2018-04-10 14:10 EDT
See Also:
Fixed In Version: dracut-033-520.el7
Clones: 1482185
Last Closed: 2018-04-10 14:07:53 EDT
Type: Bug
Attachments
Boot logs (140.70 KB, text/plain), 2016-09-23 09:41 EDT, RAVI
Installation screen (311.94 KB, image/png), 2018-02-06 10:39 EST, Nagaraj
Python errors on console (612.24 KB, image/png), 2018-02-06 10:40 EST, Nagaraj


External Trackers
Tracker: Red Hat Product Errata
ID: RHBA-2018:0964
Priority: None
Status: None
Summary: None
Last Updated: 2018-04-10 14:10 EDT

Description RAVI 2016-09-23 09:41:49 EDT
Created attachment 1204160
Boot logs

Steps to Reproduce:
-------------------------
1. Install the Banjo (HP) adapter and configure both of its ports for FCoE BFS on RHEL 7.3 SN4.
2. Install RHEL 7.3 on the mapped LUN (multipath installation).
3. Once the installation is done, reboot the server.
4. The server fails to boot into the OS.

Snippet of log:
----------------

dracut-initqueue[837]: fipvlan: fip_recv: error 88 Socket operation on non-socket[   33.137266] 
dracut-initqueue[837]: fipvlan: fip_recv: packet socket recv error[   33.143146] 
dracut-initqueue[837]: fipvlan: fip_recv: error 88 Socket operation on non-socket[   33.143294] 
dracut-initqueue[837]: fipvlan: fip_recv: packet socket recv error[   33.149145] 
dracut-initqueue[837]: fipvlan: fip_recv: error 88 Socket operation on non-socket[   33.149293]
dracut-initqueue[837]: fipvlan: fip_recv: packet socket recv error[   33.155171] 
dracut-initqueue[837]: fipvlan: fip_recv: error 88 Socket operation on non-socket[   33.155479] 
dracut-initqueue[837]: fipvlan: fip_recv: packet socket recv error[   33.161106] 
dracut-initqueue[837]: fipvlan: fip_recv: error 88 Socket operation on non-socket[   33.161294] 
dracut-initqueue[837]: fipvlan: fip_recv: packet socket recv error[   33.167173] 
dracut-initqueue[837]: fipvlan: fip_recv: error 88 Socket operation on non-socket[   33.167401] 
dracut-initqueue[837]: fipvlan: fip_recv: packet socket recv error[   33.173109] 

Additional Information:
-----------------------
NA

Frequency: Always
----------

Expected Results:
-----------------
After installation, the server should boot into the OS.

Setup Details:
-----------------
OS                 : RHEL 7.3 SN4
FCoE Driver Version: bnx2fc – 2.10.3
Adapter       : Banjo - HP (57980)
MFW: 7.13.75

Attachments:
-------------
Boot logs
Comment 1 Chad Dupuis (Cavium) 2016-09-23 09:45:47 EDT
This looks similar to another unresolved BZ for 7.1: https://bugzilla.redhat.com/show_bug.cgi?id=1129574.
Comment 3 Chad Dupuis (Cavium) 2016-09-30 11:33:13 EDT
Chris, any idea why fipvlan would be throwing this error?
Comment 4 Chad Dupuis (Cavium) 2017-06-13 09:02:40 EDT
We observed this again on RHEL 7.4 snap 2. It's possible that this is caused by a small timing window in which fipvlan is run just before the link is fully up, which could prevent the FIP socket from opening and produce the error spew.
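
If the timing-window hypothesis above holds, one way to narrow the window would be to wait for link carrier before running fipvlan. The following is a minimal sketch under that assumption, not code from dracut or fcoe-utils: wait_for_carrier is a hypothetical helper, the 30-second default is arbitrary, and the third argument exists only so the function can be pointed at a directory other than /sys/class/net. Note that link carrier alone does not show that DCB negotiation has finished.

```shell
#!/bin/sh
# Illustrative sketch only: wait_for_carrier is a hypothetical helper,
# not part of dracut or fcoe-utils.
# Usage: wait_for_carrier <netif> [timeout_seconds] [sysfs_net_dir]
wait_for_carrier() {
    netif=$1
    timeout=${2:-30}              # assumed default, chosen arbitrarily
    netdir=${3:-/sys/class/net}   # overridable for testing
    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        # "carrier" reads 1 once the link is up. The read fails while the
        # interface is administratively down; that is treated as "not up".
        if [ "$(cat "$netdir/$netif/carrier" 2>/dev/null)" = "1" ]; then
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}
```

A caller could then run something like: wait_for_carrier "$netif" 30 && fipvlan "$netif" -c -s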
Comment 5 RAVI 2017-07-03 05:30:08 EDT
This is still observed with RHEL 7.4 Snap 5.
Comment 6 Chad Dupuis (Cavium) 2017-07-20 13:45:01 EDT
Looking into this more, this occurs because we don't wait long enough in /usr/lib/dracut/modules.d/95fcoe/fcoe-up.sh. Specifically, this line:

elif [ "$netdriver" = "bnx2x" ]; then
    # If driver is bnx2x, do not use /sys/module/fcoe/parameters/create but fipvlan
    modprobe 8021q
    udevadm settle --timeout=30
    # Sleep for 3 s to allow dcb negotiation
    sleep 3  # <-- this line
    fipvlan "$netif" -c -s
else

We need to increase this to 13 seconds, as was done upstream: https://git.kernel.org/pub/scm/boot/dracut/dracut.git/commit/?id=3966a1e1ee0e3d27197258f446f54b683c415208
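
To check which wait a given copy of fcoe-up.sh uses, the sleep value in the bnx2x branch can be extracted with a short helper. This is a sketch only: dcb_sleep_seconds is a hypothetical name, and it assumes the branch has the same shape as the snippet quoted above (a bnx2x test followed by a single "sleep N" line).

```shell
#!/bin/sh
# Illustrative sketch only: dcb_sleep_seconds is a hypothetical helper.
# Prints the number of seconds passed to the first "sleep" that follows
# the bnx2x driver test in the given copy of fcoe-up.sh.
dcb_sleep_seconds() {
    script=${1:-/usr/lib/dracut/modules.d/95fcoe/fcoe-up.sh}
    awk '/"bnx2x"/ { in_branch = 1 }
         in_branch && $1 == "sleep" { print $2; exit }' "$script"
}
```

Run against the snippet quoted above, this prints 3. A copy that carries the upstream change would be expected to print 13.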
Comment 7 RAVI 2017-07-21 01:48:13 EDT
Additional information:
We are also observing this in the kdump kernel.
Comment 9 Joseph Kachuck 2017-09-12 13:50:53 EDT
Hello Chad,
Is this BZ able to be moved to POSTed state, or is this waiting on upstream?

Thank You
Joe Kachuck
Comment 10 Chad Dupuis (Cavium) 2017-09-12 15:27:01 EDT
(In reply to Joseph Kachuck from comment #9)
> Hello Chad,
> Is this BZ able to be moved to POSTed state, or is this waiting on upstream?
> 
> Thank You
> Joe Kachuck

Change was already upstreamed.
Comment 11 Trinh Dao 2017-10-24 13:11:56 EDT
Any new updates on this bug?
Comment 16 Trinh Dao 2018-01-09 12:03:28 EST
JoeK, since bug is ON_QA, is the fix in RHEL7.5 alpha?
Comment 17 Jan Stodola 2018-01-09 12:07:48 EST
Trinh, this should be fixed in dracut-033-520.el7, which is present in RHEL-7.5 Alpha.
Give it a try to confirm it's fixed for you, please.
Comment 18 Trinh Dao 2018-01-09 12:16:26 EST
Thank you!
trinh
Comment 19 Trinh Dao 2018-01-19 10:09:01 EST
Nagaraj D is verifying with RHEL 7.5 alpha, and I will update the bug once I have the test result.
Comment 20 Nagaraj 2018-02-06 10:38:05 EST
I tried to install RHEL 7.5 with FCoE NX2 cards. The installation hangs at the package installation stage. Attached are screenshots of the hung installation (FCOE-Installation1.PNG) and the Python errors on the console (FCOE-Installation2.PNG).
Comment 21 Nagaraj 2018-02-06 10:39 EST
Created attachment 1392180
Installation screen
Comment 22 Nagaraj 2018-02-06 10:40 EST
Created attachment 1392181
Python errors on console
Comment 23 Jan Stodola 2018-02-06 10:50:06 EST
Nagaraj,
does it happen every time? Could you please attach logs from the installation? They are stored in /tmp during the installation.
Thank you.
Comment 24 Nagaraj 2018-02-12 04:27:47 EST
I reinstalled RHEL 7.5 (Snapshot 1) two more times and did not see the issue.
Comment 25 Trinh Dao 2018-02-13 15:29:15 EST
Marking as HPE verified; the bug is closed on the HPE side.
Comment 26 Jan Stodola 2018-02-14 04:48:54 EST
Thanks for verifying the issue is fixed.

Moving to VERIFIED based on previous comments.
Comment 29 errata-xmlrpc 2018-04-10 14:07:53 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0964
