Bug 709266
| Field | Value |
|---|---|
| Summary | Install UEFI RHEL 6.0 x86_64 on Sandy Bridge IRST RAID: format fail |
| Product | Red Hat Enterprise Linux 6 |
| Reporter | Lin Avator <lavator> |
| Component | dosfstools |
| Assignee | Jaroslav Škarvada <jskarvad> |
| Status | CLOSED ERRATA |
| QA Contact | BaseOS QE - Apps <qe-baseos-apps> |
| Severity | high |
| Docs Contact | |
| Priority | medium |
| Version | 6.0 |
| CC | bnater, dmilburn, ed.ciechanowski, hui.xiao, ignacy.kasperowicz, jane.lv, jose_de_la_rosa, jskarvad, jvillalo, jwilleford, luyu, maciej.patelczyk, michael.j.degon, michalx.sorn, przemyslaw.hawrylewicz.czarnowski, rvokal, stuart_hayes |
| Target Milestone | rc |
| Target Release | 6.2 |
| Hardware | x86_64 |
| OS | Linux |
| Whiteboard | |
| Fixed In Version | dosfstools-3.0.9-4.el6 |
| Doc Type | Bug Fix |
| Doc Text | The mkfs.vfat utility did not correctly detect device partitions on RAID devices. As a consequence, formatting failed with an error message. This was caused by an invalid mask for the statbuf.st_rdev variable. The mask has been fixed to be at least four bytes long and the problem no longer occurs. |
| Story Points | --- |
| Clone Of | |
| Cloned As | 710480, 714891 (view as bug list) |
| Environment | |
| Last Closed | 2011-12-06 09:56:37 UTC |
| Type | --- |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| CRM | |
| Verified Versions | |
| Category | --- |
| oVirt Team | --- |
| RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- |
| Target Upstream Version | |
| Embargoed | |
| Bug Depends On | |
| Bug Blocks | 670196, 670203 |
| Attachments | |
Description
Lin Avator
2011-05-31 08:48:23 UTC
Created attachment 501941 [details]
BIOS: IRST RAID
As shown in "BIOS_IRST_RAID.jpg", our Sandy Bridge platform is configured as an IRST (Intel Rapid Storage Technology) RAID0 volume in this test case.
Created attachment 501945 [details]
HDD partition table of the RHEL 6.0 x86_64 UEFI installation
The HDD partitions of the RHEL 6.0 x86_64 UEFI installation are shown in "partition.jpg". The EFI partition is located at /dev/md127p1.
Created attachment 501946 [details]
snapshot of format failure: 1
As shown in "failure.jpg", an error was encountered while formatting device /dev/md127p1.
Created attachment 501948 [details]
the log file
Please check "raid.tgz" for the log files captured during installation. In particular, the included "anaconda-tb-UQD_I0LjBU2H.xml" was generated by pressing the "File Bug" button.
Created attachment 501950 [details]
troubleshooting the EFI partition format
As shown in "trouble_shooting.jpg", we tried to format the EFI partition /dev/md127p1 manually. mkfs.vfat does not recognize the device node /dev/md127p1, while mkfs.ext4 (and also mkfs.ext3 and mkfs.ext2) formats it correctly. This looks like the root cause.
Please help clarify this issue.
Created attachment 502092 [details]
Test code
This is a "security feature" of sorts, but its detection is not perfect. Does it work with the -I switch? I think anaconda should be changed to use this switch by default.
Please provide the major and minor numbers of the failing device, or compile and run the attached source code and provide its output (preferred, since it uses the same syscalls as the dosfstools detection):
```
$ gcc -o test test.c
# ./test /dev/md127p1
```
Hi Jaroslav, "mkfs.vfat -I" works with the IRST RAID0 volume /dev/md127p1. Please check the session below.

```
# ./test /dev/md127p1
rdev: 10300
# mkfs.vfat -I /dev/md127p1
# mkdir /tmp/x
# mount /dev/md127p1 /tmp/x
# touch /tmp/x/file
# mkdir /tmp/x/dir
```

Created attachment 502487 [details]
Proposed fix

Thanks for the info. The attached patch should fix it, but I am still suggesting that the anaconda team use the '-I' switch. A scratch build is also available for testing:

http://jskarvad.fedorapeople.org/dosfstools/dosfstools-3.0.9-4.el6.x86_64.rpm

Please test it and report the results.

Created attachment 502742 [details]
test results of new dosfstools-3.0.9-4.el6.x86_64.rpm
Hi,
As shown in "mkdosfs.jpg", dosfstools-3.0.9-4.el6.x86_64.rpm works well
(where /tmp/mkdosfs comes from dosfstools-3.0.9-4.el6.x86_64.rpm and /usr/sbin/mkdosfs is the original binary).
Thanks for the info. I will forward the patch upstream.

Reported upstream and cloned as Fedora bug 710480.

Created attachment 504663 [details]
Install UEFI RHEL 6.0 x86_64 on Sandy Bridge IRST RAID : error on installing bootloader
Hi,
We retested the installation of UEFI RHEL 6.0 x86_64 on Sandy Bridge IRST RAID, but it failed with an error at the last stage, while installing the bootloader. Please check "i001.jpg".
Lin, could you provide logs? If this other error (comment 14) is not related to dosfstools, I suggest creating a new bug for it.

I'm seeing the same thing. The system I'm working with has an Intel RAID1 across /dev/sda and /dev/sdb, which shows up as /dev/md127. (/proc/partitions does not show any partitions on /dev/sda or /dev/sdb, though parted and fdisk see the partition tables on those devices.) After I boot the installed system, when I run "grub-install --grub-shell=/sbin/grub --no-floppy /dev/md127p1", I get this sort of thing:

```
Probing devices to guess BIOS drives. This may take a long time.
The file /boot/grub/stage1 not read correctly.
```

/boot/grub/device.map looks like this:

```
(hd0) /dev/sda
(hd1) /dev/sdb
```

And mdadm --query --detail /dev/md127p1 returns this:

```
/dev/md127p1:
      Container : /dev/md0, member 0
     Raid Level : raid1
...
    Number   Major   Minor   RaidDevice State
       1       8        0        0      active sync   /dev/sda
       0       8       16        1      active sync   /dev/sdb
```

It appears that the grub-install script thinks this is a software RAID: it uses an mdadm query to convert /dev/md127p1 into /dev/sda, then /boot/grub/device.map to convert that into (hd0). I'm not sure how this would ever work with Intel firmware RAID.

If I change /boot/grub/device.map to:

```
(hd0) /dev/md127
```

and change /sbin/grub-install so that the function is_raid1_device always returns 0, it installs grub successfully to /dev/md127p1.

Hi Jaroslav, I tried to capture logs of the bootloader installation error, but in vain. Since this is a follow-up problem after patching dosfstools, I am afraid it will be hard to understand if I create a new bug for it.

Hi Stuart, will the modification of is_raid1_device in /sbin/grub-install work for RAID0 and RAID10 too?

Lin, Stuart, thanks for the info. I will use this BZ entry (bug 709266) to fix dosfstools. I am cloning this bug to the grub package for further investigation of this follow-up (but independent) problem. Cloned as bug 714891; please continue to comment on the bootloader issue there.
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents: The mkfs.vfat utility did not correctly detect device partitions on RAID devices. As a consequence, formatting failed with an error message. This was caused by an invalid mask for the statbuf.st_rdev variable. The mask has been fixed to be at least four bytes long and the problem no longer occurs.

*** Bug 723841 has been marked as a duplicate of this bug. ***

*** Bug 738094 has been marked as a duplicate of this bug. ***

(In reply to comment #21)
> The mkfs.vfat utility did not correctly detect device partitions on RAID
> devices. As a consequence, formatting failed with an error message. This was
> caused by an invalid mask for the statbuf.st_rdev variable. The mask has been
> fixed to be at least four bytes long and the problem no longer occurs.

Any partition detection scheme that relies on major/minor numbers is fragile; that is, I think this fix would eventually fail testing with CONFIG_DEBUG_BLOCK_EXT_DEVT. Here is the partition detection routine from mdadm, which has no dependency on major/minor numbers:

```c
int test_partition(int fd)
{
	/* Check if fd is a whole-disk or a partition.
	 * BLKPG will return EINVAL on a partition, and BLKPG_DEL_PARTITION
	 * will return ENXIO on an invalid partition number.
	 */
	struct blkpg_ioctl_arg a;
	struct blkpg_partition p;

	a.op = BLKPG_DEL_PARTITION;
	a.data = (void *)&p;
	a.datalen = sizeof(p);
	a.flags = 0;
	memset(a.data, 0, a.datalen);
	p.pno = 1 << 30;
	if (ioctl(fd, BLKPG, &a) == 0)
		/* Very unlikely, but not a partition */
		return 0;
	if (errno == ENXIO)
		/* not a partition */
		return 0;
	return 1;
}
```

Do you want a replacement patch? Or maybe I missed how the proposed patch is robust?

(In reply to comment #24) Dan, thanks, I will discuss this with upstream. I don't like that this detection can remove (very unlikely) one partition from the kernel.
(In reply to comment #25)
> (In reply to comment #24)
> Dan, thanks, I will discuss this with upstream. I don't like that this
> detection can remove (very unlikely) one partition from the kernel.

Um, you do realize that successfully deleting partition number 1,073,741,824 would mean that the kernel is holding 1 TB of hd_struct data :-)? I'm not sure why Neil did not just use 1 << 31 for the partition number so we would skip out on a negative number, but it's for all intents and purposes identical.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1552.html