Description of problem:
Anaconda crashed right after start. This is probably caused by the disks being set up as RAID1 (firmware RAID).

Version-Release number of selected component:
anaconda-25.20.4-1

The following was filed automatically by anaconda:
anaconda 25.20.4-1 exception report
Traceback (most recent call first):
  File "/usr/lib64/python3.5/site-packages/gi/overrides/BlockDev.py", line 441, in wrapped
    raise transform[1](msg)
  File "/usr/lib/python3.5/site-packages/blivet/populator/helpers/dmraid.py", line 56, in run
    rs_names = blockdev.dm.get_member_raid_sets(name, uuid, major, minor)
  File "/usr/lib/python3.5/site-packages/blivet/populator/populator.py", line 345, in handle_format
    helper_class(self, info, device).run()
  File "/usr/lib/python3.5/site-packages/blivet/threads.py", line 45, in run_with_lock
    return m(*args, **kwargs)
  File "/usr/lib/python3.5/site-packages/blivet/populator/populator.py", line 318, in handle_device
    self.handle_format(info, device)
  File "/usr/lib/python3.5/site-packages/blivet/threads.py", line 45, in run_with_lock
    return m(*args, **kwargs)
  File "/usr/lib/python3.5/site-packages/blivet/populator/populator.py", line 518, in _populate
    self.handle_device(dev)
  File "/usr/lib/python3.5/site-packages/blivet/threads.py", line 45, in run_with_lock
    return m(*args, **kwargs)
  File "/usr/lib/python3.5/site-packages/blivet/populator/populator.py", line 451, in populate
    self._populate()
  File "/usr/lib/python3.5/site-packages/blivet/threads.py", line 45, in run_with_lock
    return m(*args, **kwargs)
  File "/usr/lib/python3.5/site-packages/blivet/blivet.py", line 271, in reset
    self.devicetree.populate(cleanup_only=cleanup_only)
  File "/usr/lib/python3.5/site-packages/blivet/threads.py", line 45, in run_with_lock
    return m(*args, **kwargs)
  File "/usr/lib/python3.5/site-packages/blivet/osinstall.py", line 1175, in storage_initialize
    storage.reset()
  File "/usr/lib64/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib64/python3.5/site-packages/pyanaconda/threads.py", line 251, in run
    threading.Thread.run(self, *args, **kwargs)
gi.overrides.BlockDev.DMError: Failed to group_set

Additional info:
addons:         com_redhat_kdump, com_redhat_docker
cmdline:        /usr/bin/python3 /sbin/anaconda
cmdline_file:   BOOT_IMAGE=/images/pxeboot/vmlinuz inst.stage2=hd:LABEL=Fedora-S-dvd-x86_64-25 rd.live.check quiet
executable:     /sbin/anaconda
hashmarkername: anaconda
kernel:         4.8.0-0.rc7.git0.1.fc25.x86_64
product:        Fedora
release:        Cannot get release name.
type:           anaconda
version:        25
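For anyone trying to reproduce this outside anaconda: the traceback dies in blivet's dmraid helper, which asks libblockdev/libdmraid to group the member disks into a RAID set ("Failed to group_set"). A minimal sketch, assuming a root shell on the affected machine (e.g. tty2 of the installer), is to run the discovery and grouping through the dmraid tool directly, which should exercise the same libdmraid code paths:

# list raw RAID member devices that dmraid finds on disk
dmraid -r -v
# group and display the RAID set(s); a failure here should correspond to the
# "Failed to group_set" error in the traceback above
dmraid -s -v
# try to activate the set(s), with extra debug output
dmraid -ay -v -d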
Created attachment 1207841 [details] File: anaconda-tb
Created attachment 1207842 [details] File: anaconda.log
Created attachment 1207843 [details] File: dnf.librepo.log
Created attachment 1207844 [details] File: environ
Created attachment 1207845 [details] File: hawkey.log
Created attachment 1207846 [details] File: lsblk_output
Created attachment 1207847 [details] File: nmcli_dev_list
Created attachment 1207848 [details] File: os_info
Created attachment 1207849 [details] File: program.log
Created attachment 1207850 [details] File: storage.log
Created attachment 1207851 [details] File: syslog
Created attachment 1207852 [details] File: ifcfg.log
Created attachment 1207853 [details] File: packaging.log
This bug appears whenever I want to use firmware RAID on this computer. I tried to erase both disks (there are just two) and it didn't help. The strange thing is that when I run lsblk, those two disks aren't listed as part of any md* device (there is no such device at all).
I tested this with Fedora 24 and this bug appears there too, so it's not new. I propose this as a blocker even though I think it could be a hardware problem (as I think we tested F24 on this computer before, but I'm not sure). Non-functional RAID is a violation of the Beta criterion: "The installer must be able to detect and install to hardware or firmware RAID storage devices."
This is NVIDIA firmware RAID?
"Strange thing is that when I run lsblk, those two disks aren't listed as part of md* device (there is no such device at all)." This looks like a firmware RAID case where dmraid rather than mdraid is used, so yes, you won't get any md devices :)
Looks to me like dmraid doesn't like your array. Here's the list of block devices reported by udev when anaconda probes storage:

08:26:10,995 INFO blivet: devices to scan: ['sdc', 'sdc1', 'sdc2', 'sda', 'sdb', 'sr0', 'loop0', 'loop1', 'loop2', 'live-rw', 'live-base']

And here's the lsblk output:

08:26:32,553 INFO program: Running... lsblk --perms --fs --bytes
08:26:32,577 INFO program: NAME SIZE OWNER GROUP MODE NAME FSTYPE LABEL UUID MOUNTPOINT
08:26:32,578 INFO program: loop1 2147483648 root disk brw-rw---- loop1 ext4 Anaconda 9674df00-c062-44b6-b861-21eab39708ab
08:26:32,578 INFO program: |-live-base 2147483648 root disk brw-rw---- |-live-base ext4 Anaconda 9674df00-c062-44b6-b861-21eab39708ab
08:26:32,578 INFO program: `-live-rw 2147483648 root disk brw-rw---- `-live-rw ext4 Anaconda 9674df00-c062-44b6-b861-21eab39708ab /
08:26:32,579 INFO program: sdb 500107862016 root disk brw-rw---- sdb promise_fasttrack_raid_member
08:26:32,579 INFO program: sr0 1073741312 root cdrom brw-rw---- sr0
08:26:32,579 INFO program: loop2 536870912 root disk brw-rw---- loop2 DM_snapshot_cow
08:26:32,579 INFO program: `-live-rw 2147483648 root disk brw-rw---- `-live-rw ext4 Anaconda 9674df00-c062-44b6-b861-21eab39708ab /
08:26:32,579 INFO program: loop0 409735168 root disk brw-rw---- loop0 squashfs
08:26:32,579 INFO program: sdc 15552479232 root disk brw-rw---- sdc iso9660 Fedora-S-dvd-x86_64-25 2016-10-05-05-31-59-00 /run/install/repo
08:26:32,580 INFO program: |-sdc2 5447680 root disk brw-rw---- |-sdc2 vfat ANACONDA 61C8-BEAF
08:26:32,580 INFO program: `-sdc1 2042626048 root disk brw-rw---- `-sdc1 iso9660 Fedora-S-dvd-x86_64-25 2016-10-05-05-31-59-00
08:26:32,580 INFO program: sda 80026361856 root disk brw-rw---- sda promise_fasttrack_raid_member
08:26:32,580 DEBUG program: Return code: 0

So it looks like sda and sdb are members of a promise fasttrack array that dmraid for some reason did not choose to activate during system startup. Reassigning to dmraid for further investigation...
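A side note that may help with debugging: the promise_fasttrack_raid_member tag in the lsblk output comes from libblkid's signature scan, so probing the disks directly should show which superblock was matched and where it sits on disk. A small sketch, assuming a root shell on the affected machine:

# low-level superblock probe; prints the detected signature type for each disk
blkid -p /dev/sda /dev/sdb
# with no options, wipefs only lists the signatures it finds, together with
# their on-disk offsets, and does not erase anything
wipefs /dev/sda
wipefs /dev/sdb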
Discussed at the 2016-10-06 Fedora 25 Beta Go/No-Go meeting, acting as a blocker review meeting: https://meetbot-raw.fedoraproject.org/teams/f25-beta-go_no_go-meeting/f25-beta-go_no_go-meeting.2016-10-06-17.00.html

Rejected as a blocker: this was reported very late so it's hard to make a definitive call, but given the lateness of the report and the feeling we have that this may be an issue with the RAID set rather than a genuine bug, we decided to reject it as a Beta blocker. If we investigate further and determine that it's a genuine bug that may affect many dmraid cases, it may become a Final blocker.

pschindl, dlehman said there's a few things you can get that might help us to figure out what's going on:

<dlehman> heinz knows better than I do, but he might start w/ 'dmraid -rv ; dmraid -sv'
<dlehman> or try to run whatever part of systemd should have activated the array during bootup
<dlehman> and see what happens
<dlehman> dmraid -l might be of interest (to see if 'pdc' is in the output)
Yes, the output of "dmraid -s"/"dmraid -b" would be useful to get. If this is pdc, you may want to retrieve the first MiB of both disks for further analysis and attach them. If you just want to get rid of any RAID metadata, wipe the first MiB of each component disk unless "wipefs --all /dev/sd[ab]" does it for you.
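A sketch of how that can be done, assuming /dev/sda and /dev/sdb are the two member disks (double-check the device names first; the output file names below are just examples):

# dump the first MiB of each member disk for analysis
dd if=/dev/sda of=sda-first-mib.img bs=1M count=1
dd if=/dev/sdb of=sdb-first-mib.img bs=1M count=1

# destructive: drop any RAID/filesystem signatures libblkid knows about
wipefs --all /dev/sda /dev/sdb
# or zero the first MiB by hand if wipefs leaves something behind
dd if=/dev/zero of=/dev/sda bs=1M count=1
dd if=/dev/zero of=/dev/sdb bs=1M count=1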
Created attachment 1211750 [details] output of dmraid -b
Created attachment 1211751 [details] output of dmraid -l
Created attachment 1211752 [details] output of dmraid -s
Created attachment 1211753 [details] output of lsblk --fs
Created attachment 1211754 [details] first MiB of /dev/sda
Created attachment 1211755 [details] first MiB of /dev/sdb
(In reply to Petr Schindler from comment #26)
> Created attachment 1211755 [details]
> first MiB of /dev/sdb

Petr, both dumps of sda and sdb contain zeroes? Please check/recreate.

Just to confirm: do you have a Promise FakeRAID BIOS on the machine which discovers the RAID set ok?
(In reply to Heinz Mauelshagen from comment #27)
> Petr, both dumps of sda and sdb contain zeroes?
> Please check/recreate.

I checked that, that's really the contents of sda and sdb, all zeroes. I don't know how firmware RAID works, but shouldn't there be some kind of RAID signature? How does lsblk know the disk is "promise_fasttrack_raid_member"?

> Just to confirm:
> do you have a Promise FakeRAID BIOS on the machine which discovers the RAID
> set ok?

Sorry, I don't understand. The board is an Asus M5A97 PRO, and when I set RAID as the disk controller and set up RAID 1 in the integrated tool (I tried both fast and full initialization), everything looks in order in that integrated tool. I should also mention this worked for us in the past, but when we tried it now with an older release (F24), it didn't work either. It is possible this might be a hardware failure (the RAID controller fried or something), but I don't know how to distinguish that.
(In reply to Kamil Páral from comment #28)
> (In reply to Heinz Mauelshagen from comment #27)
> > Petr, both dumps of sda and sdb contain zeroes?
> > Please check/recreate.
>
> I checked that, that's really the contents of sda and sdb, all zeroes. I
> don't know how firmware RAID works, but shouldn't there be some kind of RAID
> signature? How does lsblk know the disk is "promise_fasttrack_raid_member"?

I was mistaken about the metadata location, sorry. We need the last MiB of each component device attached. That's where lsblk found the "Promise Technology, Inc." identifier that causes "promise_fasttrack_raid_member" to be displayed (see libblkid/src/superblocks/promise_raid.c in the util-linux package).

> > Just to confirm:
> > do you have a Promise FakeRAID BIOS on the machine which discovers the RAID
> > set ok?
>
> Sorry, I don't understand. The board is an Asus M5A97 PRO, and when I set RAID
> as the disk controller and set up RAID 1 in the integrated tool (I tried both
> fast and full initialization), everything looks in order in that integrated
> tool.

With "integrated tool", you're likely referring to the BIOS RAID support/utility on the motherboard (or a Promise FakeRAID controller plugged in), selected by some hot key (combination) during POST, which allows booting off such software RAID and managing it (i.e. setting it up/displaying information on it).

> I should also mention this worked for us in the past, but when we
> tried it now with an older release (F24), it didn't work either. It is possible
> this might be a hardware failure (the RAID controller fried or something), but I
> don't know how to distinguish that.

The controller doesn't seem to be the reason for it, presuming you get reliable access to the 2 disks from the BIOS RAID utility and from Linux. You aren't noticing any disk SMART errors, are you? Use "for d in /dev/sd[ab];do smartctl -l error $d;done" to see their error logs.

Once I have the metadata, I can analyse whether this is a bug in dmraid and tell more.

BTW: are you able to cause this failure on a different system, or is it singular?
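A sketch for grabbing the last MiB of each disk (the disk sizes may not be an exact multiple of 1 MiB, so the offset is computed in 512-byte sectors via blockdev --getsz; the output file names are just examples):

for d in sda sdb; do
    secs=$(blockdev --getsz /dev/$d)     # device size in 512-byte sectors
    dd if=/dev/$d of=$d-last-mib.img bs=512 skip=$((secs - 2048)) count=2048
done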
Created attachment 1212170 [details] last MiB of sda and sdb
(In reply to Heinz Mauelshagen from comment #29)
> We need the last MiB of each component device attached.

Attached.

> With "integrated tool", you're likely referring to the BIOS RAID
> support/utility on the motherboard

Yes.

> (or a Promise FakeRAID controller plugged in)

Nope, no external controller.

> selected by some hot key (combination) during POST, which allows booting
> off such software RAID and managing it (i.e. setting it up/displaying
> information on it).

Exactly.

> You aren't noticing any disk SMART errors, are you?
> Use "for d in /dev/sd[ab];do smartctl -l error $d;done" to see their error
> logs.

SMART was disabled, I had to enable it with "-s on". This is the output:

$ smartctl -s on -l error /dev/sda
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.8.1-1.fc25.x86_64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF ENABLE/DISABLE COMMANDS SECTION ===
SMART Enabled.

=== START OF READ SMART DATA SECTION ===
SMART Error Log not supported

$ smartctl -s on -l error /dev/sdb
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.8.1-1.fc25.x86_64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF ENABLE/DISABLE COMMANDS SECTION ===
SMART Enabled.

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
No Errors Logged

> BTW: are you able to cause this failure on a different system, or is it
> singular?

We don't have any other non-Intel firmware RAID available in any other system, so I can't really tell. We haven't seen this error with Intel firmware RAID.
Kamil, I am able to analyse further with your metadata as of comment #30. Your sda doesn't support error logging, whereas sdb does, so we can't be sure about sda's sanity.
sda is Intel SSD SC2CT08. I'm surprised it doesn't support SMART. But I've done numerous installations to it (in non-RAID mode) recently and had no issues with it. However, if you can't find any other error or have a strong suspicion of a disk failure, I'll replace it with a different drive and try again.
Interestingly, I can show the SMART values for sda in gnome-disks; it displays them without problems, and all values are marked as OK. Even the short self-test passed. So the drive is probably OK and it's just some smartctl issue.
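That would fit: gnome-disks reads SMART through udisks2/libatasmart and looks at the attribute table and self-test results, while the commands above only queried the (optional) SMART error log, which some drives simply don't implement. A sketch of commands that should show the health status and attributes from smartctl as well, for cross-checking:

smartctl -H -A /dev/sda   # overall health self-assessment plus the attribute table
smartctl -x /dev/sda      # everything the drive reports (attributes, logs, statistics)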
This message is a reminder that Fedora 25 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 25. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '25'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 25 reaches end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora 25 changed to end-of-life (EOL) status on 2017-12-12. Fedora 25 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result, we are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora, please feel free to reopen it against that version. If you are unable to reopen this bug, please file a new report against the current release.

If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.