Bug 709266 - Install UEFI RHEL 6.0 x86_64 on Sandy Bridge IRST RAID : format fail
Summary: Install UEFI RHEL 6.0 x86_64 on Sandy Bridge IRST RAID : format fail
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: dosfstools
Version: 6.0
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: rc
Target Release: 6.2
Assignee: Jaroslav Škarvada
QA Contact: BaseOS QE - Apps
URL:
Whiteboard:
Duplicates: 723841 738094 (view as bug list)
Depends On:
Blocks: 670196 670203
TreeView+ depends on / blocked
 
Reported: 2011-05-31 08:48 UTC by Lin Avator
Modified: 2013-03-19 17:15 UTC (History)
17 users (show)

Fixed In Version: dosfstools-3.0.9-4.el6
Doc Type: Bug Fix
Doc Text:
The mkfs.vfat utility did not correctly detect device partitions on RAID devices. As a consequence, formatting failed with an error message. This was caused by an invalid mask for the statbuf.st_rdev variable. The mask has been fixed to be at least four bytes long and the problem no longer occurs.
Clone Of:
Clones: 710480 714891 (view as bug list)
Environment:
Last Closed: 2011-12-06 09:56:37 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
BIOS: IRST RAID (79.52 KB, image/jpeg)
2011-05-31 08:52 UTC, Lin Avator
no flags Details
HDD partition table of the RHEL 6.0 x86_64 UEFI installation (57.58 KB, image/jpeg)
2011-05-31 08:56 UTC, Lin Avator
no flags Details
snapshot of format failure :1 (52.11 KB, image/jpeg)
2011-05-31 08:59 UTC, Lin Avator
no flags Details
the log file (70.94 KB, application/x-gzip)
2011-05-31 09:05 UTC, Lin Avator
no flags Details
trouble shooting to format EFI partition (77.54 KB, image/jpeg)
2011-05-31 09:11 UTC, Lin Avator
no flags Details
Test code (287 bytes, text/plain)
2011-05-31 20:05 UTC, Jaroslav Škarvada
no flags Details
Proposed fix (944 bytes, patch)
2011-06-02 10:30 UTC, Jaroslav Škarvada
no flags Details | Diff
test results of new dosfstools-3.0.9-4.el6.x86_64.rpm (103.06 KB, image/jpeg)
2011-06-03 07:29 UTC, Lin Avator
no flags Details
Install UEFI RHEL 6.0 x86_64 on Sandy Bridge IRST RAID : error on installing bootloader (75.33 KB, image/jpeg)
2011-06-14 12:04 UTC, Lin Avator
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2011:1552 0 normal SHIPPED_LIVE dosfstools bug fix update 2011-12-06 00:39:24 UTC

Description Lin Avator 2011-05-31 08:48:23 UTC
Description of problem:

   We installed RHEL 6.0 x86_64 in UEFI mode on our Sandy Bridge platform with the Cougar Point chipset, with the HDDs configured as an IRST (Intel Rapid Storage Technology) RAID0 volume. However, the installation fails with "format failed: 1" while formatting device /dev/md127p1.

   We have tried with IRST (Intel Rapid Storage Technology) RAID0, RAID1, and RAID10 configurations. All of them encounter the same failure.

   BTW, installing RHEL 6.0 in Legacy BIOS mode on the same IRST RAID works fine.

Version-Release number of selected component (if applicable):

   RHEL 6.0 x86_64 .

How reproducible:


Steps to Reproduce:
1. Configure BIOS to UEFI mode on the Sandy Bridge platform
2. Configure BIOS to use onboard RAID function.
3. Press Ctrl+I during BIOS POST to configure IRST (Intel Rapid Storage Technology) RAID .
4. Install RHEL 6.0 x86_64 in UEFI mode.
  
Actual results:


Expected results:


Additional info:

Comment 1 Lin Avator 2011-05-31 08:52:03 UTC
Created attachment 501941 [details]
BIOS: IRST RAID

As shown in "BIOS_IRST_RAID.jpg", our Sandy Bridge platform is configured as IRST (Intel Rapid Storage Technology) RAID0 in this test case.

Comment 3 Lin Avator 2011-05-31 08:56:22 UTC
Created attachment 501945 [details]
HDD partition table of the RHEL 6.0 x86_64 UEFI installation

  The HDD partitions of the RHEL 6.0 x86_64 UEFI installation are shown in "partition.jpg". The EFI partition is /dev/md127p1.

Comment 4 Lin Avator 2011-05-31 08:59:20 UTC
Created attachment 501946 [details]
snapshot of format failure :1

 As shown in "failure.jpg", an error was encountered while formatting device /dev/md127p1 .

Comment 5 Lin Avator 2011-05-31 09:05:03 UTC
Created attachment 501948 [details]
the log file

Please check "raid.tgz" for the log files from the installation. In particular, the "anaconda-tb-UQD_I0LjBU2H.xml" file inside was generated by pressing the "File Bug" button.

Comment 6 Lin Avator 2011-05-31 09:11:28 UTC
Created attachment 501950 [details]
trouble shooting to format EFI partition

  As shown in "trouble_shooting.jpg", we tried to format the EFI partition /dev/md127p1 manually. mkfs.vfat cannot recognize the device node /dev/md127p1, while mkfs.ext4 (and also mkfs.ext3 and mkfs.ext2) can format /dev/md127p1 correctly. This seems to be the root cause.

   Please kindly help to clarify this issue.

Comment 7 Jaroslav Škarvada 2011-05-31 20:05:28 UTC
Created attachment 502092 [details]
Test code

This is a "security feature" of sorts, but its detection is not perfect. Does it work with the -I switch? I think anaconda should be changed to use this switch by default.

Please provide the major and minor numbers of the failing device, or compile and run the attached source code and provide its output (preferred; it uses the same syscalls as the dosfstools detection):
$ gcc -o test test.c
# ./test /dev/md127p1
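The attachment itself is not inlined in this report; the following is a hypothetical reconstruction of what such a test program presumably does (an assumption based on the description above, not the actual attached test.c):

```c
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical sketch: stat() the given device node and print st_rdev,
 * the value dosfstools inspects when it decides whether a node is a
 * whole disk or a partition. Returns st_rdev, or -1 on error. */
long long print_rdev(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0) {
        perror("stat");
        return -1;
    }
    printf("rdev: %llx\n", (unsigned long long) st.st_rdev);
    return (long long) st.st_rdev;
}
```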

Comment 8 Lin Avator 2011-06-01 05:32:23 UTC
Hi Jaroslav ,

   "mkfs.vfat -I" works with IRST RAID0 volume /dev/md127p1 . Please check the response below.

# ./test /dev/md127p1
rdev: 10300

# mkfs.vfat -I /dev/md127p1
# mkdir /tmp/x
# mount /dev/md127p1 /tmp/x
# touch /tmp/x/file
# mkdir /tmp/x/dir
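For reference, the reported value 0x10300 decodes under the glibc dev_t layout to major 259 (the blkext range used for md partitions) and minor 0. A sketch of why the pre-fix check misfired on it (the exact dosfstools mask constants here are an assumption, matching the description in the Doc Text):

```c
/* Decode st_rdev using the glibc dev_t bit layout. */
unsigned int rdev_major(unsigned long long rdev)
{
    return (unsigned int)(((rdev >> 8) & 0xfff) | ((rdev >> 32) & ~0xfffULL));
}

unsigned int rdev_minor(unsigned long long rdev)
{
    return (unsigned int)((rdev & 0xff) | ((rdev >> 12) & ~0xffULL));
}

/* Pre-fix style check (assumed constants): a 16-bit mask lets the
 * blkext value 0x10300 collapse to 0x0300, i.e. "whole IDE disk",
 * so mkfs.vfat refused to format the partition. */
int whole_ide_disk_old(unsigned long long rdev)
{
    return (rdev & 0xff3f) == 0x0300;
}

/* Fixed check: a mask of at least four bytes keeps the high bits,
 * so 0x10300 no longer matches. */
int whole_ide_disk_new(unsigned long long rdev)
{
    return (rdev & 0xffffff3fULL) == 0x0300;
}
```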

Comment 9 Jaroslav Škarvada 2011-06-02 10:30:16 UTC
Created attachment 502487 [details]
Proposed fix

Thanks for the info. The attached patch should fix it, but I still suggest that the anaconda team use the '-I' switch.

A scratch build is also available for testing:
http://jskarvad.fedorapeople.org/dosfstools/dosfstools-3.0.9-4.el6.x86_64.rpm

Please test it and report results.

Comment 10 Lin Avator 2011-06-03 07:29:52 UTC
Created attachment 502742 [details]
test results of new dosfstools-3.0.9-4.el6.x86_64.rpm

Hi,
  As shown in "mkdosfs.jpg", dosfstools-3.0.9-4.el6.x86_64.rpm works well
  (where /tmp/mkdosfs comes from dosfstools-3.0.9-4.el6.x86_64.rpm and /usr/sbin/mkdosfs is the original one).

Comment 11 Jaroslav Škarvada 2011-06-03 07:35:34 UTC
Thanks for the info. I will forward the patch upstream.

Comment 12 Jaroslav Škarvada 2011-06-03 14:37:14 UTC
Reported upstream and cloned as fedora bug 710480.

Comment 14 Lin Avator 2011-06-14 12:04:38 UTC
Created attachment 504663 [details]
Install UEFI RHEL 6.0 x86_64 on Sandy Bridge IRST RAID : error on installing bootloader

Hi,

   We re-verified the UEFI RHEL 6.0 x86_64 installation on the Sandy Bridge IRST RAID, but it now fails at the last stage, while installing the bootloader. Please check "i001.jpg".

Comment 15 Jaroslav Škarvada 2011-06-20 07:28:36 UTC
Lin, could you provide logs? If this other error (comment 14) is not related to dosfstools, I suggest creating a new bug for it.

Comment 16 Stuart Hayes 2011-06-20 20:49:43 UTC
I'm seeing the same thing.  The system I'm working with has an Intel RAID1 across /dev/sda and /dev/sdb, which shows up as /dev/md127.  (/proc/partitions does not show any partitions on /dev/sda or /dev/sdb, though parted and fdisk see the partition tables on those devices.)

After I boot to an installed system, when I run "grub-install --grub-shell=/sbin/grub --no-floppy /dev/md127p1", I get this sort of thing:

Probing devices to guess BIOS drives. This may take a long time.
The file /boot/grub/stage1 not read correctly.

/boot/grub/device.map looks like this:

(hd0)   /dev/sda
(hd1)   /dev/sdb

And mdadm --query --detail /dev/md127p1 returns this:

/dev/md127p1:
      Container : /dev/md0, member 0
     Raid Level : raid1
...
    Number   Major   Minor   RaidDevice State
       1       8        0        0      active sync   /dev/sda
       0       8       16        1      active sync   /dev/sdb

It appears that the grub-install script thinks this is a software RAID: it uses an mdadm query to convert /dev/md127p1 into /dev/sda, then /boot/grub/device.map to convert that into (hd0).  I'm not sure how this would ever work with Intel firmware RAID...?



If I change /boot/grub/device.map to:

(hd0)   /dev/md127

And I change /sbin/grub-install so that the function is_raid1_device always returns 0, it will install grub successfully to /dev/md127p1.

Comment 17 Lin Avator 2011-06-21 01:45:11 UTC
Hi Jaroslav ,

   I tried to capture logs of the bootloader installation error, but in vain. Since this is a follow-up problem after patching dosfstools, I am afraid it will be hard to understand if I create a new bug for it.

Hi Stuart,

  Will the modification of is_raid1_device in /sbin/grub-install work for RAID0 and RAID10 too?

Comment 18 Jaroslav Škarvada 2011-06-21 08:30:31 UTC
Lin, Stuart,

thanks for the info. I will use this BZ entry (bug 709266) to track the dosfstools fix. I am cloning this bug to the grub package for further investigation of this follow-up (but independent) problem.

Comment 19 Jaroslav Škarvada 2011-06-21 08:36:06 UTC
Cloned as bug 714891; please continue commenting on the bootloader issue there.

Comment 21 Eliska Slobodova 2011-07-18 14:50:39 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
The mkfs.vfat utility did not correctly detect device partitions on RAID devices. As a consequence, formatting failed with an error message. This was caused by an invalid mask for the statbuf.st_rdev variable. The mask has been fixed to be at least four bytes long and the problem no longer occurs.

Comment 22 Jaroslav Škarvada 2011-07-21 14:34:29 UTC
*** Bug 723841 has been marked as a duplicate of this bug. ***

Comment 23 Matthew Garrett 2011-09-14 15:35:05 UTC
*** Bug 738094 has been marked as a duplicate of this bug. ***

Comment 24 Dan Williams 2011-09-30 18:06:00 UTC
(In reply to comment #21)
> The mkfs.vfat utility did not correctly detect device partitions on RAID
> devices. As a consequence, formatting failed with an error message. This was
> caused by an invalid mask for the statbuf.st_rdev variable. The mask has been
> fixed to be at least four bytes long and the problem no longer occurs.

Any partition detection scheme that relies on major/minor numbers is fragile, i.e., I think this fix would eventually fail testing with CONFIG_DEBUG_BLOCK_EXT_DEVT.  Here is the partition detection routine from mdadm, which has no dependency on major/minor numbers.

#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/blkpg.h>

int test_partition(int fd)
{
        /* Check if fd is a whole-disk or a partition.
         * BLKPG will return EINVAL on a partition, and BLKPG_DEL_PARTITION
         * will return ENXIO on an invalid partition number.
         */
        struct blkpg_ioctl_arg a;
        struct blkpg_partition p;
        a.op = BLKPG_DEL_PARTITION;
        a.data = (void*)&p;
        a.datalen = sizeof(p);
        a.flags = 0;
        memset(a.data, 0, a.datalen);
        p.pno = 1<<30;
        if (ioctl(fd, BLKPG, &a) == 0)
                /* Very unlikely, but not a partition */
                return 0;
        if (errno == ENXIO)
                /* not a partition */
                return 0;
 
        return 1;
}

Do you want a replacement patch?  Maybe I missed how the proposed patch is robust?

Comment 25 Jaroslav Škarvada 2011-10-02 19:42:51 UTC
(In reply to comment #24)
Dan, thanks, I will discuss this with upstream. I don't like that this detection can (however unlikely) remove a partition from the kernel.

Comment 26 Dan Williams 2011-10-03 18:20:15 UTC
(In reply to comment #25)
> (In reply to comment #24)
> Dan, thanks, I will discuss this with upstream. I don't like that this
> detection can remove (very unlikely) one partition from the kernel.

Um, you do realize that successfully deleting partition number 1,073,741,824 would mean that the kernel is holding 1TB of hd_struct data :-)?

I'm not sure why Neil did not just use 1 << 31 for the partition number so we would skip out on a negative number, but it's for all intents and purposes identical.
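The sign distinction mentioned above is easy to see: pno is an int, and 1 << 30 stays positive, while bit 31 is the sign bit, so a 1 << 31 sentinel would come out negative. A quick illustration (not mdadm code):

```c
#include <limits.h>

/* pno is declared int in struct blkpg_partition. 1 << 30 is the
 * largest power-of-two sentinel that stays positive; shifting into
 * bit 31 lands on the sign bit, so (int)(1U << 31) wraps to INT_MIN
 * on two's-complement machines, i.e. a negative partition number. */
int sentinel_1_30(void) { return 1 << 30; }
int sentinel_1_31(void) { return (int)(1U << 31); }
```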

Comment 27 errata-xmlrpc 2011-12-06 09:56:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1552.html

