Bug 1378910
| Summary: | HPESD RHEL7.3-SN4- FCoE multipath BFS fails to boot after installation | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | RAVI <ravi.adabala> | ||||||||
| Component: | dracut | Assignee: | Lukáš Nykrýn <lnykryn> | ||||||||
| Status: | CLOSED ERRATA | QA Contact: | Release Test Team <release-test-team-automation> | ||||||||
| Severity: | urgent | Docs Contact: | |||||||||
| Priority: | urgent | ||||||||||
| Version: | 7.3 | CC: | abeausol, andrew.vasquez, arun.patil, cdupuis, cleech, dinesh.surpur, dracut-maint-list, emilne, jkachuck, jstodola, karen.skweres, lnykryn, mknutson, nagaraj-sangappa.davanakatti, nilesh.bhoi, phinchman, ravi.adabala, revers, shyam.sundar, trinh.dao, vishnu.kumar, vivek.kumar, william.gens, xhe | ||||||||
| Target Milestone: | alpha | ||||||||||
| Target Release: | 7.5 | ||||||||||
| Hardware: | x86_64 | ||||||||||
| OS: | Linux | ||||||||||
| Whiteboard: | |||||||||||
| Fixed In Version: | dracut-033-520.el7 | Doc Type: | If docs needed, set a value | ||||||||
| Doc Text: | Story Points: | --- | |||||||||
| Clone Of: | |||||||||||
| : | 1482185 | Environment: | |||||||||
| Last Closed: | 2018-04-10 18:07:53 UTC | Type: | Bug | ||||||||
| Regression: | --- | Mount Type: | --- | ||||||||
| Documentation: | --- | CRM: | |||||||||
| Verified Versions: | Category: | --- | |||||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||||
| Embargoed: | |||||||||||
| Bug Depends On: | |||||||||||
| Bug Blocks: | 1438583, 1445812, 1465137, 1482185, 1522983 | ||||||||||
| Attachments: |
This looks similar to another unresolved BZ for 7.1: https://bugzilla.redhat.com/show_bug.cgi?id=1129574. Chris, any idea why fipvlan would be throwing this error?

We observed this again on RHEL 7.4 snap 2.

It's possible that this is caused by a small timing window where fipvlan is tried just before the link is fully up, which could prevent the FIP socket from opening and so produce the error spew.

This is still observed with RHEL 7.4 Snap 5.

Looking into this more, this occurs because we don't wait long enough in /usr/lib/dracut/modules.d/95fcoe/fcoe-up.sh. Specifically this line:
elif [ "$netdriver" = "bnx2x" ]; then
    # If driver is bnx2x, do not use /sys/module/fcoe/parameters/create but fipvlan
    modprobe 8021q
    udevadm settle --timeout=30
    # Sleep for 3 s to allow dcb negotiation
    sleep 3    <-- *** This line ***
    fipvlan "$netif" -c -s
else
We need to increase this to 13 seconds, as was done upstream: https://git.kernel.org/pub/scm/boot/dracut/dracut.git/commit/?id=3966a1e1ee0e3d27197258f446f54b683c415208
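For illustration, the bnx2x branch with the longer wait could look roughly like the sketch below. It is based only on the snippet quoted above and has not been checked against the upstream commit. The function name `start_fcoe_bnx2x` and the `DCB_WAIT` variable are hypothetical and exist only to make the fragment self-contained; the script quoted above hard-codes the sleep value inline.

```shell
#!/bin/sh
# Sketch only: mirrors the bnx2x branch of fcoe-up.sh quoted above.
# DCB_WAIT is a hypothetical knob (not part of dracut); it defaults to the
# 13 seconds proposed in this bug.
DCB_WAIT="${DCB_WAIT:-13}"

# start_fcoe_bnx2x NETIF
# Hypothetical wrapper; the commands and their order come from the snippet.
start_fcoe_bnx2x() {
    netif="$1"
    # bnx2x does not use /sys/module/fcoe/parameters/create but fipvlan
    modprobe 8021q
    udevadm settle --timeout=30
    # Allow DCB negotiation to finish before fipvlan opens its FIP socket
    sleep "$DCB_WAIT"
    fipvlan "$netif" -c -s
}
```

A fixed sleep only widens the timing window; it does not check that the link is actually up, so this remains a workaround for the race described above rather than a guarantee.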
Additional information: We are also observing this in the kdump kernel.

Hello Chad,
Is this BZ able to be moved to POSTed state, or is this waiting on upstream?

Thank You
Joe Kachuck

(In reply to Joseph Kachuck from comment #9)
> Hello Chad,
> Is this BZ able to be moved to POSTed state, or is this waiting on up stream?
>
> Thank You
> Joe Kachuck

Change was already upstreamed.

Any new update on this bug?

JoeK, since bug is ON_QA, is the fix in RHEL7.5 alpha?

Trinh, this should be fixed in dracut-033-520.el7, which is present in RHEL-7.5 Alpha. Give it a try to confirm it's fixed for you, please.

Thank you!
trinh

Nagaraj D is verifying this with RHEL 7.5 alpha and I will update the bug once I have the test result.

I tried to install RHEL 7.5 with FCoE NX2 cards. The installation hangs at the package installation stage. Attached are screenshots of the installation hang (FCOE-Installation1.PNG) and of the Python errors on the console (FCOE-Installation2.PNG).

Created attachment 1392180 [details]
Installation screen
Created attachment 1392181 [details]
Python errors on console
Nagaraj, does it happen every time? Could you please attach logs from the installation? They are stored in /tmp during the installation. Thank you.

I reinstalled RHEL 7.5 (Snapshot 1) twice more and didn't see the issue.

Mark HPE verified; bug is closed on HPE side.

Thanks for verifying the issue is fixed. Moving to VERIFIED based on previous comments.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2018:0964
Created attachment 1204160 [details]
Boot logs

Steps to Reproduce:
-------------------------
1. Install and configure both ports of the Banjo (HP) adapter for FCoE BFS on RHEL7.3SN4.
2. Install RHEL 7.3 on the mapped LUN (multipath installation).
3. Once the installation is done, reboot the server.
4. The server now fails to boot into the OS.

Snippet of log:
----------------
dracut-initqueue[837]: fipvlan: fip_recv: error 88 Socket operation on non-socket
[ 33.137266] dracut-initqueue[837]: fipvlan: fip_recv: packet socket recv error
[ 33.143146] dracut-initqueue[837]: fipvlan: fip_recv: error 88 Socket operation on non-socket
[ 33.143294] dracut-initqueue[837]: fipvlan: fip_recv: packet socket recv error
[ 33.149145] dracut-initqueue[837]: fipvlan: fip_recv: error 88 Socket operation on non-socket
[ 33.149293] dracut-initqueue[837]: fipvlan: fip_recv: packet socket recv error
[ 33.155171] dracut-initqueue[837]: fipvlan: fip_recv: error 88 Socket operation on non-socket
[ 33.155479] dracut-initqueue[837]: fipvlan: fip_recv: packet socket recv error
[ 33.161106] dracut-initqueue[837]: fipvlan: fip_recv: error 88 Socket operation on non-socket
[ 33.161294] dracut-initqueue[837]: fipvlan: fip_recv: packet socket recv error
[ 33.167173] dracut-initqueue[837]: fipvlan: fip_recv: error 88 Socket operation on non-socket
[ 33.167401] dracut-initqueue[837]: fipvlan: fip_recv: packet socket recv error
[ 33.173109]

Additional Information:
-----------------------
NA

Frequency: Always
----------

Expected Results:
-----------------
After installation, the server should boot into the OS.

Setup Details:
-----------------
OS : RHEL 7.3 SN4
FCoE Driver Version: bnx2fc – 2.10.3
Adapter : Banjo - HP (57980)
MFW: 7.13.75

Attachments:
-------------
Boot logs