Bug 844304

Summary: [bfa boot] bfa kernel modules not loaded during boot in initramfs.
Product: Red Hat Enterprise Linux 7 Reporter: Gris Ge <fge>
Component: dracut Assignee: dracut-maint
Status: CLOSED DUPLICATE QA Contact: Xiaowei Li <xiaoli>
Severity: medium Docs Contact:
Priority: medium    
Version: 7.0 CC: harald, qcai, revers, xiaoli
Target Milestone: beta Keywords: Regression
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-12-06 07:33:48 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Gris Ge 2012-07-30 09:17:31 UTC
Description of problem:

Cannot boot the OS from bfa FCoE.
The installation completes without problems.
But the first boot fails because the disk is not found, and the system falls back to the dracut shell.

After investigation, the bfa module is not loaded, and its .ko file appears to be empty:
=====
dracut:/lib/modules/3.3.0-0.20.el7.x86_64/kernel/drivers/scsi# ls -lh
total 1.3M
-rwxr--r-- 1 root root  66K Jul  9 20:46 3w-9xxx.ko
-rwxr--r-- 1 root root  49K Jul  9 20:46 3w-sas.ko
-rwxr--r-- 1 root root  49K Jul  9 20:46 3w-xxxx.ko
drwxr-xr-x 2 root root    0 Jul 30 04:57 aacraid
drwxr-xr-x 2 root root    0 Jul 30 04:57 aic7xxx
drwxr-xr-x 2 root root    0 Jul 30 04:57 aic94xx
drwxr-xr-x 2 root root    0 Jul 30 04:57 arcmsr
drwxr-xr-x 2 root root    0 Jul 30 04:57 bfa
drwxr-xr-x 2 root root    0 Jul 30 04:57 bnx2fc
drwxr-xr-x 2 root root    0 Jul 30 04:57 fcoe
drwxr-xr-x 2 root root    0 Jul 30 04:57 fnic
-rwxr--r-- 1 root root  92K Jul  9 20:46 hpsa.ko
-rwxr--r-- 1 root root  32K Jul  9 20:46 hptiop.ko
-rwxr--r-- 1 root root  35K Jul  9 20:46 initio.ko
drwxr-xr-x 2 root root    0 Jul 30 04:57 isci
-rwxr--r-- 1 root root  19K Jul  9 20:46 iscsi_boot_sysfs.ko
drwxr-xr-x 2 root root    0 Jul 30 04:57 libfc
-rwxr--r-- 1 root root  88K Jul  9 20:46 libiscsi.ko
drwxr-xr-x 2 root root    0 Jul 30 04:57 libsas
drwxr-xr-x 2 root root    0 Jul 30 04:57 lpfc
drwxr-xr-x 2 root root    0 Jul 30 04:57 megaraid
drwxr-xr-x 2 root root    0 Jul 30 04:57 mpt2sas
drwxr-xr-x 2 root root    0 Jul 30 04:57 mvsas
-rwxr--r-- 1 root root  47K Jul  9 20:46 mvumi.ko
drwxr-xr-x 2 root root    0 Jul 30 04:57 osd
drwxr-xr-x 2 root root    0 Jul 30 04:57 pm8001
-rwxr--r-- 1 root root 104K Jul  9 20:46 pmcraid.ko
drwxr-xr-x 2 root root    0 Jul 30 04:57 qla2xxx
drwxr-xr-x 2 root root    0 Jul 30 04:57 qla4xxx
-rwxr--r-- 1 root root  14K Jul  9 20:46 raid_class.ko
-rwxr--r-- 1 root root 103K Jul  9 20:46 scsi_debug.ko
-rwxr--r-- 1 root root  28K Jul  9 20:46 scsi_tgt.ko
-rwxr--r-- 1 root root 101K Jul  9 20:46 scsi_transport_fc.ko
-rwxr--r-- 1 root root  97K Jul  9 20:46 scsi_transport_iscsi.ko
-rwxr--r-- 1 root root  68K Jul  9 20:46 scsi_transport_sas.ko
-rwxr--r-- 1 root root  50K Jul  9 20:46 scsi_transport_spi.ko
-rwxr--r-- 1 root root  16K Jul  9 20:46 scsi_transport_srp.ko
-rwxr--r-- 1 root root 3.3K Jul  9 20:46 scsi_wait_scan.ko
-rwxr--r-- 1 root root  75K Jul  9 20:46 sd_mod.ko
-rwxr--r-- 1 root root  35K Jul  9 20:46 sr_mod.ko
-rwxr--r-- 1 root root  36K Jul  9 20:46 stex.ko
-rwxr--r-- 1 root root  40K Jul  9 20:46 vmw_pvscsi.k
=====

So many of the kernel modules are zero-sized.

Version-Release number of selected component (if applicable):
RHEL-7.0-20120711.2

How reproducible:
100%

Steps to Reproduce:
1. Install the OS on a server that boots from bfa FCoE.
  
Actual results:
The system cannot boot.

Expected results:
Installation passes and the system boots without problems.

Additional info:
RHEL 6.3 works well for bfa boot. Hence requesting blocker status and setting the regression flag.

Comment 2 Harald Hoyer 2012-07-30 11:57:03 UTC
> drwxr-xr-x 2 root root    0 Jul 30 04:57 aacraid
> drwxr-xr-x 2 root root    0 Jul 30 04:57 aic7xxx
> drwxr-xr-x 2 root root    0 Jul 30 04:57 aic94xx
> drwxr-xr-x 2 root root    0 Jul 30 04:57 arcmsr
> drwxr-xr-x 2 root root    0 Jul 30 04:57 bfa
> drwxr-xr-x 2 root root    0 Jul 30 04:57 bnx2fc
> drwxr-xr-x 2 root root    0 Jul 30 04:57 fcoe
> drwxr-xr-x 2 root root    0 Jul 30 04:57 fnic
> drwxr-xr-x 2 root root    0 Jul 30 04:57 isci
> drwxr-xr-x 2 root root    0 Jul 30 04:57 libfc
> drwxr-xr-x 2 root root    0 Jul 30 04:57 libsas
> drwxr-xr-x 2 root root    0 Jul 30 04:57 lpfc
> drwxr-xr-x 2 root root    0 Jul 30 04:57 megaraid
> drwxr-xr-x 2 root root    0 Jul 30 04:57 mpt2sas
> drwxr-xr-x 2 root root    0 Jul 30 04:57 mvsas
> drwxr-xr-x 2 root root    0 Jul 30 04:57 osd
> drwxr-xr-x 2 root root    0 Jul 30 04:57 pm8001
> drwxr-xr-x 2 root root    0 Jul 30 04:57 qla2xxx
> drwxr-xr-x 2 root root    0 Jul 30 04:57 qla4xxx
> =====
> 
> So many of the kernel modules are zero-sized.

All of these are directories!!
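
To separate real empty module files from directories in a listing like the one quoted above, a check along these lines could be used. This is only a sketch: `list_empty_modules` is a hypothetical helper name, and it assumes a `find` that supports `-type`, `-name` and `-size` is available in the shell being used.

```shell
#!/bin/sh
# list_empty_modules DIR
# Print every regular *.ko file under DIR whose size is zero bytes.
# Directories (entries starting with "d" in "ls -l") are never reported.
list_empty_modules() {
    find "$1" -type f -name '*.ko' -size 0
}

# Example call (path taken from the listing in the description):
# list_empty_modules /lib/modules/3.3.0-0.20.el7.x86_64/kernel/drivers/scsi
```

No output means no module file is actually empty; the 0 in the size column of the listing then belongs only to directory entries.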

Comment 3 Gris Ge 2012-07-31 02:49:15 UTC
Oh. Sorry.
I will investigate why the bfa module is not loaded.

Will update later.

Comment 4 Gris Ge 2012-07-31 02:50:10 UTC
Downgrading priority to medium/medium, as it turns out that only bfa cannot boot.

Comment 5 Gris Ge 2012-07-31 03:53:02 UTC
Manually loaded the bfa kernel module with these commands:
===
modprobe -r bfa
modprobe -a bfa
===

After that, the FCoE disk and multipath are set up correctly.

When mounting sysroot, I got the following:
===
(Repair filesystem):/# mount -t ext4 -o ro /dev/mapper/mpathbp2 /sysroot
[  522.256805] EXT4-fs (dm-3): bad geometry: block count 10499840 exceeds size of device (2048000 blocks)
(Repair filesystem):/# multipath -l
mpathb (360060e801047103004f2c4b300000021) dm-0 HITACHI,DF600F
size=50G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=0 status=active
| `- 2:0:0:0 sda 8:0  active undef running
`-+- policy='round-robin 0' prio=0 status=enabled
  `- 2:0:1:0 sdc 8:32 active undef running
(Repair filesystem):/# cat /sys/block/dm-3/queue/hw_sector_size
512
(Repair filesystem):/# cat /sys/block/dm-3/queue/logical_block_size
512
(Repair filesystem):/# cat /sys/block/dm-3/size
16384000
===

ext4 reports that dm-3 has 2048000 blocks, which is about 1 GiB if a block is 512 bytes.
sysfs reports the dm-3 size as 16384000 sectors of 512 bytes, which is about 8 GiB.
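
The numbers above can be cross-checked with a little arithmetic. This is a sketch, not a diagnosis: `to_bytes` is a hypothetical helper, and the 4096-byte ext4 block size is an assumption, since the log does not state the filesystem block size. The value in /sys/block/dm-3/size is counted in 512-byte sectors.

```shell
#!/bin/sh
# to_bytes COUNT UNIT_SIZE
# Print COUNT * UNIT_SIZE, i.e. a block or sector count converted to bytes.
to_bytes() {
    echo $(( $1 * $2 ))
}

to_bytes 16384000 512    # dm-3 size from sysfs (512-byte sectors)
to_bytes 2048000 512     # device size from the ext4 message, if a block were 512 bytes
to_bytes 2048000 4096    # same count, ASSUMING 4096-byte ext4 blocks
to_bytes 10499840 4096   # filesystem block count, ASSUMING 4096-byte ext4 blocks
```

Under the 4096-byte assumption, the device size in the ext4 message equals the sysfs size in bytes, and the mismatch would instead be that the filesystem is larger than the device mapped as dm-3.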


This installation does not use LVM; ondisk=/dev/disk/by-id/scsi-<WWID> is set in the Beaker kickstart metadata.

Please investigate.

Comment 9 Harald Hoyer 2012-07-31 09:29:18 UTC
I doubt it's a dracut problem. It might be a kernel/driver/multipath problem.

But to have more information:

1. boot with "rd.debug"
2. attach the "dmesg" output (just mount something to save the file to)
3. is the bfa module loaded?
4. which devices are recognized?
5. is multipath set up correctly?
6. what is the output of "ls -al /dev/disk/*"?
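
The items above could be collected in one pass from the emergency shell, roughly as follows. This is a sketch under assumptions: `collect_boot_debug` and the output file names are hypothetical, the target directory must already be on a writable mounted filesystem, and the tools available inside the initramfs may differ.

```shell
#!/bin/sh
# collect_boot_debug OUTDIR
# Save the requested diagnostics as text files under OUTDIR.
# Each command is allowed to fail so the remaining items are still collected.
collect_boot_debug() {
    out="$1"
    mkdir -p "$out" || return 1

    # 1. kernel command line, to confirm that rd.debug was passed
    cat /proc/cmdline > "$out/cmdline.txt" 2>&1 || true

    # 2. kernel log
    dmesg > "$out/dmesg.txt" 2>&1 || true

    # 3. is the bfa module loaded?
    if grep -q '^bfa ' /proc/modules 2>/dev/null; then
        echo "bfa: loaded" > "$out/bfa.txt"
    else
        echo "bfa: not loaded" > "$out/bfa.txt"
    fi

    # 4. block devices the kernel has recognized
    cat /proc/partitions > "$out/partitions.txt" 2>&1 || true

    # 5. multipath topology, if the tool is present
    if command -v multipath >/dev/null 2>&1; then
        multipath -l > "$out/multipath.txt" 2>&1 || true
    else
        echo "multipath: command not found" > "$out/multipath.txt"
    fi

    # 6. persistent device names
    ls -al /dev/disk/* > "$out/dev-disk.txt" 2>&1 || true
}

# Example call, assuming something writable is mounted at /mnt:
# collect_boot_debug /mnt/bfa-debug
```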

Comment 10 Xiaowei Li 2012-12-06 07:33:48 UTC
This is a bfa issue. I will track it in another bug, BZ 831492.
So I am closing this BZ.

*** This bug has been marked as a duplicate of bug 831492 ***