Bug 1160424 - MDRaidError: name_from_md_node(md126p1) failed
Summary: MDRaidError: name_from_md_node(md126p1) failed
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: python-blivet
Version: 22
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Blivet Maintenance Team
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: RejectedBlocker abrt_hash:ec7bc45e7de...
Duplicates: 1146620 1170755 1209635 1210057 1219430 (view as bug list)
Depends On:
Blocks:
 
Reported: 2014-11-04 18:48 UTC by Jeremy Rimpo
Modified: 2016-07-19 12:21 UTC (History)
20 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-07-19 12:21:53 UTC
Type: ---
Embargoed:


Attachments
File: anaconda-tb (1.09 MB, text/plain), 2014-11-04 18:48 UTC, Jeremy Rimpo
File: anaconda.log (18.36 KB, text/plain), 2014-11-04 18:48 UTC, Jeremy Rimpo
File: environ (615 bytes, text/plain), 2014-11-04 18:48 UTC, Jeremy Rimpo
File: journalctl (568.88 KB, text/plain), 2014-11-04 18:48 UTC, Jeremy Rimpo
File: lsblk_output (4.48 KB, text/plain), 2014-11-04 18:48 UTC, Jeremy Rimpo
File: nmcli_dev_list (2.07 KB, text/plain), 2014-11-04 18:48 UTC, Jeremy Rimpo
File: os_info (377 bytes, text/plain), 2014-11-04 18:48 UTC, Jeremy Rimpo
File: program.log (81.85 KB, text/plain), 2014-11-04 18:48 UTC, Jeremy Rimpo
File: storage.log (444.97 KB, text/plain), 2014-11-04 18:48 UTC, Jeremy Rimpo
File: ifcfg.log (10.53 KB, text/plain), 2014-11-04 18:48 UTC, Jeremy Rimpo
File: anaconda-tb (398.07 KB, text/plain), 2015-04-17 23:01 UTC, Ian Pilcher
File: anaconda.log (4.80 KB, text/plain), 2015-04-17 23:01 UTC, Ian Pilcher
File: journalctl (208.48 KB, text/plain), 2015-04-17 23:02 UTC, Ian Pilcher
File: lsblk_output (6.34 KB, text/plain), 2015-04-17 23:03 UTC, Ian Pilcher
File: nmcli_dev_list (1.83 KB, text/plain), 2015-04-17 23:03 UTC, Ian Pilcher
File: program.log (33.56 KB, text/plain), 2015-04-17 23:04 UTC, Ian Pilcher
File: storage.log (207.43 KB, text/plain), 2015-04-17 23:05 UTC, Ian Pilcher
File: ifcfg.log (7.08 KB, text/plain), 2015-04-17 23:06 UTC, Ian Pilcher
File: anaconda-tb-4jCXBs (417.07 KB, text/plain), 2015-04-21 15:30 UTC, Ian Pilcher
File: anaconda-tb-TMjejy (420.04 KB, text/plain), 2015-04-21 15:30 UTC, Ian Pilcher
File: anaconda.log (6.60 KB, text/plain), 2015-04-21 15:31 UTC, Ian Pilcher
File: ifcfg.log (7.23 KB, text/plain), 2015-04-21 15:32 UTC, Ian Pilcher
File: journalctl (206.54 KB, text/plain), 2015-04-21 15:33 UTC, Ian Pilcher
File: lsblk_output (6.79 KB, text/plain), 2015-04-21 15:33 UTC, Ian Pilcher
File: nmcli_dev_list (2.23 KB, text/plain), 2015-04-21 15:34 UTC, Ian Pilcher
File: program.log (33.03 KB, text/plain), 2015-04-21 15:34 UTC, Ian Pilcher
File: storage.log (209.02 KB, text/plain), 2015-04-21 15:35 UTC, Ian Pilcher
File: anaconda.log (6.60 KB, text/plain), 2015-04-21 23:08 UTC, Ian Pilcher
File: anaconda-tb-HrlsJE (426.29 KB, text/plain), 2015-04-21 23:09 UTC, Ian Pilcher
File: anaconda-tb-sM7Zw6 (429.26 KB, text/plain), 2015-04-21 23:10 UTC, Ian Pilcher
File: ifcfg.log (7.23 KB, text/plain), 2015-04-21 23:10 UTC, Ian Pilcher
File: journalctl (206.34 KB, text/plain), 2015-04-21 23:11 UTC, Ian Pilcher
File: lsblk_output (6.34 KB, text/plain), 2015-04-21 23:11 UTC, Ian Pilcher
File: nmcli_dev_list (2.23 KB, text/plain), 2015-04-21 23:12 UTC, Ian Pilcher
File: program.log (33.03 KB, text/plain), 2015-04-21 23:13 UTC, Ian Pilcher
File: storage.log (217.09 KB, text/plain), 2015-04-21 23:13 UTC, Ian Pilcher


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1157657 0 unspecified CLOSED DeviceTreeError: failed to scan disk sdb 2021-02-22 00:41:40 UTC
Red Hat Bugzilla 1157770 0 unspecified CLOSED SELinux is preventing cat from 'getattr' accesses on the file /proc/mdstat. 2021-02-22 00:41:40 UTC
Red Hat Bugzilla 1157774 0 unspecified CLOSED [abrt] mdadm: strcpy(): mdadm killed by SIGSEGV 2021-02-22 00:41:40 UTC
Red Hat Bugzilla 1210057 0 unspecified CLOSED Error: g-bd-md-error-quark: No name found for the node 'md126p1' (2) 2021-02-22 00:41:40 UTC
Red Hat Bugzilla 1226305 0 unspecified CLOSED Can not install "Fedora 22" into existing LVM-on-RAID partitions 2021-02-22 00:41:40 UTC

Internal Links: 1210057 1226305

Description Jeremy Rimpo 2014-11-04 18:48:18 UTC
Description of problem:
Have an Intel Firmware RAID setup, run anaconda, wait a few seconds.

Version-Release number of selected component:
anaconda-core-21.48.13-1.fc21.x86_64

The following was filed automatically by anaconda:
anaconda 21.48.13-1 exception report
Traceback (most recent call first):
  File "/usr/lib/python2.7/site-packages/blivet/devicelibs/mdraid.py", line 359, in name_from_md_node
    raise MDRaidError("name_from_md_node(%s) failed" % node)
  File "/usr/lib/python2.7/site-packages/blivet/devicetree.py", line 909, in addUdevPartitionDevice
    name = mdraid.name_from_md_node(name)
  File "/usr/lib/python2.7/site-packages/blivet/devicetree.py", line 1201, in addUdevDevice
    device = self.addUdevPartitionDevice(info)
  File "/usr/lib/python2.7/site-packages/blivet/devicetree.py", line 2170, in _populate
    self.addUdevDevice(dev)
  File "/usr/lib/python2.7/site-packages/blivet/devicetree.py", line 2105, in populate
    self._populate()
  File "/usr/lib/python2.7/site-packages/blivet/__init__.py", line 479, in reset
    self.devicetree.populate(cleanupOnly=cleanupOnly)
  File "/usr/lib/python2.7/site-packages/blivet/__init__.py", line 183, in storageInitialize
    storage.reset()
  File "/usr/lib64/python2.7/threading.py", line 766, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib64/python2.7/site-packages/pyanaconda/threads.py", line 227, in run
    threading.Thread.run(self, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/pyanaconda/threads.py", line 112, in wait
    self.raise_if_error(name)
  File "/usr/lib64/python2.7/site-packages/pyanaconda/timezone.py", line 75, in time_initialize
    threadMgr.wait(THREAD_STORAGE)
  File "/usr/lib64/python2.7/threading.py", line 766, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib64/python2.7/site-packages/pyanaconda/threads.py", line 227, in run
    threading.Thread.run(self, *args, **kwargs)
MDRaidError: name_from_md_node(md126p1) failed

Additional info:
cmdline:        /usr/bin/python  /sbin/anaconda --liveinst --method=livecd:///dev/mapper/live-base
cmdline_file:   BOOT_IMAGE=vmlinuz0 initrd=initrd0.img root=live:CDLABEL=Fedora-Live-Workstation-x86_64-2 rootfstype=auto ro rd.live.image quiet  rhgb rd.luks=0 rd.md=0 rd.dm=0 
executable:     /sbin/anaconda
hashmarkername: anaconda
kernel:         3.17.1-302.fc21.x86_64
other involved packages: python-blivet-0.61.8-1.fc21.noarch, python-libs-2.7.8-4.1.fc21.x86_64
product:        Fedora
release:        Fedora release 21 (Twenty One)
type:           anaconda
version:        Fedora

Comment 1 Jeremy Rimpo 2014-11-04 18:48:20 UTC
Created attachment 953767 [details]
File: anaconda-tb

Comment 2 Jeremy Rimpo 2014-11-04 18:48:21 UTC
Created attachment 953768 [details]
File: anaconda.log

Comment 3 Jeremy Rimpo 2014-11-04 18:48:22 UTC
Created attachment 953769 [details]
File: environ

Comment 4 Jeremy Rimpo 2014-11-04 18:48:23 UTC
Created attachment 953770 [details]
File: journalctl

Comment 5 Jeremy Rimpo 2014-11-04 18:48:24 UTC
Created attachment 953771 [details]
File: lsblk_output

Comment 6 Jeremy Rimpo 2014-11-04 18:48:24 UTC
Created attachment 953772 [details]
File: nmcli_dev_list

Comment 7 Jeremy Rimpo 2014-11-04 18:48:25 UTC
Created attachment 953773 [details]
File: os_info

Comment 8 Jeremy Rimpo 2014-11-04 18:48:26 UTC
Created attachment 953774 [details]
File: program.log

Comment 9 Jeremy Rimpo 2014-11-04 18:48:27 UTC
Created attachment 953775 [details]
File: storage.log

Comment 10 Jeremy Rimpo 2014-11-04 18:48:28 UTC
Created attachment 953776 [details]
File: ifcfg.log

Comment 11 Jeremy Rimpo 2014-11-04 19:06:12 UTC
I also experienced the mdadm crash and selinux alerts associated with these old bugs. So I'm assuming these are related issues. However, in my case the RAID setup should still be intact and functional. So I'm guessing the root cause may be a selinux issue.

Comment 12 David Lehman 2014-11-05 17:52:39 UTC
If possible, reproduce the failure and then provide the output of the following command:

 ls -l /dev/md/

Thanks.

Comment 13 Jeremy Rimpo 2014-11-05 19:26:02 UTC
total 0
lrwxrwxrwx. 1 root root  8 Nov  5 19:17 imsm0 -> ../md127
lrwxrwxrwx. 1 root root  8 Nov  5 19:17 Volume1_0 -> ../md126
lrwxrwxrwx. 1 root root 10 Nov  5 19:17 Volume1_0p1 -> ../md126p1
lrwxrwxrwx. 1 root root 10 Nov  5 19:17 Volume1_0p2 -> ../md126p2
lrwxrwxrwx. 1 root root 10 Nov  5 19:17 Volume1_0p3 -> ../md126p3

Comment 14 Jeremy Rimpo 2014-11-05 19:52:41 UTC
I'll note again, these partitions should be
1: an NTFS partition
2: an EXT boot partition for Fedora 20
3: an LVM partition for Fedora 20 root, home, and swap

I do recall seeing some issues concerning NTFS partitions on the same drive that were or are being addressed - not sure if that's related or not.

I'm not currently installing to this drive, but it is part of the system.

Comment 15 Brian Lane 2014-11-05 22:01:49 UTC
This looks like it may be related to bug 1156614

Comment 16 David Lehman 2014-11-07 00:09:00 UTC
Here's where we try to resolve md126p1 to a name from /dev/md/:

13:45:28,796 INFO blivet: md126p1 is a partition
13:45:28,797 DEBUG blivet:           DeviceTree.addUdevPartitionDevice: name: md126p1 ;
13:45:28,797 DEBUG blivet: link: Volume1_0 -> ../md126
13:45:28,797 DEBUG blivet: link: imsm0 -> ../md127

This is followed by the exception MDRaidError("name_from_md_node(md126p1) failed"), which indicates that the only two symlinks in /dev/md at that time were those mentioned (Volume1_0, imsm0). However, we see from your output that there is indeed a Volume1_0p1 symlink that points to md126p1. I don't see how this is possible unless selinux is preventing a full directory listing (never heard of this) or some kind of timing issue.
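The lookup that fails here amounts to a scan of the /dev/md symlinks. A minimal sketch of that logic (illustrative only, not the actual blivet code; the `md_dir` parameter is added here for clarity):

```python
import os

def name_from_md_node(node, md_dir="/dev/md"):
    """Map a kernel node name such as 'md126p1' to its friendly name
    (e.g. 'Volume1_0p1') by finding the /dev/md symlink that points at
    that node. Raises LookupError when no symlink matches, which is
    the failure mode reported in this bug."""
    for name in os.listdir(md_dir):
        path = os.path.join(md_dir, name)
        if os.path.islink(path) and \
                os.path.basename(os.readlink(path)) == node:
            return name
    raise LookupError("name_from_md_node(%s) failed" % node)
```

If udev has not yet created the Volume1_0p1 link when this scan runs, the loop sees only Volume1_0 and imsm0 and the lookup fails, matching the debug output above.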

Comment 17 Jeremy Rimpo 2014-11-07 19:36:14 UTC
This wasn't happening a few composes back - I had other issues related to firmware drives, but this is new. It's not just a one off though; I've tried three boots now and I get the same thing each time.

Comment 18 David Shea 2014-12-08 22:28:58 UTC
*** Bug 1146620 has been marked as a duplicate of this bug. ***

Comment 19 David Shea 2014-12-08 22:29:01 UTC
*** Bug 1170755 has been marked as a duplicate of this bug. ***

Comment 20 Ian Pilcher 2014-12-09 16:16:07 UTC
(In reply to David Lehman from comment #16)
> Here's where we try to resolve md126p1 to a name from /dev/md/:
> 
> 13:45:28,796 INFO blivet: md126p1 is a partition
> 13:45:28,797 DEBUG blivet:           DeviceTree.addUdevPartitionDevice:
> name: md126p1 ;
> 13:45:28,797 DEBUG blivet: link: Volume1_0 -> ../md126
> 13:45:28,797 DEBUG blivet: link: imsm0 -> ../md127
> 
> This is followed by the exception MDRaidError("name_from_md_node(md126p1)
> failed"), which indicates that the only two symlinks in /dev/md at that time
> were those mentioned (Volume1_0, imsm0). However, we see from your output
> that there is indeed a Volume1_0p1 symlink that points to md126p1. I don't
> see how this is possible unless selinux is preventing a full directory
> listing (never heard of this) or some kind of timing issue.

As I mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1170755#c13, my guess is that anaconda is racing with udev.  AFAIK, udev doesn't create all of the nodes and symlinks for a device atomically, and partitions on top of MD RAID devices seem to take a particularly long time to process.
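If the race hypothesis is right, the textbook mitigation is to retry the lookup after draining udev's event queue. A sketch under that assumption (`resolve_with_settle` and its parameters are hypothetical, and this is not what anaconda actually shipped):

```python
import shutil
import subprocess
import time

def resolve_with_settle(lookup, node, tries=5, delay=0.5):
    """Call lookup(node), retrying after `udevadm settle` when it
    raises LookupError. This tolerates the window in which udev has
    created the md partition node but not yet its /dev/md symlink."""
    for attempt in range(tries):
        try:
            return lookup(node)
        except LookupError:
            if attempt == tries - 1:
                raise
            # Wait for udev to finish processing queued events before
            # retrying (skipped when udevadm is not available).
            if shutil.which("udevadm"):
                subprocess.call(["udevadm", "settle", "--timeout=10"])
            time.sleep(delay)
```

The settle call blocks until udev's event queue is empty, so a symlink that is merely late (rather than missing) becomes visible before the next scan.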

Comment 21 James D. Howard 2014-12-31 02:24:56 UTC
Another user experienced a similar problem:

Booted x64 PC from Fedora 21 Xfce Live media, and attempted to use [Install to Hard Disk]. Pre-existing environment had a pair of 64 GBy SSDs established (already) as a full-disk RAID-1 - /dev/sda and /dev/sdb together == /dev/md127.  There was also a DOS disk label on md127 showing 1 DOS partition from sector 2048 to end-of-disk.

The (apparent) Python trace shown indicated some problem in deciphering the RAID configuration.

Because nothing on these 2 disks is expected to be saved, I'll try a work-around of zeroing the first part of the disks, and re-booting to re-try the install.  Maybe anaconda (et al) will be happier if it sees an "empty" pair of disks and does the RAID-1 creation under its own control?

cmdline:        /usr/bin/python  /sbin/anaconda --liveinst --method=livecd:///dev/mapper/live-base
cmdline_file:   BOOT_IMAGE=vmlinuz0 initrd=initrd0.img root=live:CDLABEL=Fedora-Live-Xfce-x86_64-21-5 rootfstype=auto ro rd.live.image quiet  rhgb rd.luks=0 rd.md=0 rd.dm=0 
hashmarkername: anaconda
kernel:         3.17.4-301.fc21.x86_64
other involved packages: python-blivet-0.61.13-1.fc21.noarch, python-libs-2.7.8-7.fc21.x86_64
package:        anaconda-core-21.48.21-1.fc21.x86_64
packaging.log:  
product:        Fedora
reason:         MDRaidError: name_from_md_node(md127p1) failed
release:        Fedora release 21 (Twenty One)
version:        Fedora

Comment 22 James D. Howard 2014-12-31 04:14:08 UTC
(In reply to James D. Howard from comment #21)
> Another user experienced a similar problem:
> 
> Booted x64 PC from Fedora 21 Xfce Live media, and attempted to use [Install
> to Hard Disk]. Pre-existing environment had a pair of 64 GBy SSDs
> established (already) as a full-disk RAID-1 - /dev/sda and /dev/sdb together
> == /dev/md127.  There was also a DOS disk label on md127 showing 1 DOS
> partition from sector 2048 to end-of-disk.
> 
> The (apparent) Python trace shown indicated some problem in deciphering the
> RAID configuration.
> 
> Because nothing on these 2 disks is expected to be saved, I'll try a
> work-around of zeroing the first part of the disks, and re-booting to re-try
> the install.  Maybe anaconda (et al) will be happier if it sees an "empty"
> pair of disks and does the RAID-1 creation under its own control?
> 
> cmdline:        /usr/bin/python  /sbin/anaconda --liveinst
> --method=livecd:///dev/mapper/live-base
> cmdline_file:   BOOT_IMAGE=vmlinuz0 initrd=initrd0.img
> root=live:CDLABEL=Fedora-Live-Xfce-x86_64-21-5 rootfstype=auto ro
> rd.live.image quiet  rhgb rd.luks=0 rd.md=0 rd.dm=0 
> hashmarkername: anaconda
> kernel:         3.17.4-301.fc21.x86_64
> other involved packages: python-blivet-0.61.13-1.fc21.noarch,
> python-libs-2.7.8-7.fc21.x86_64
> package:        anaconda-core-21.48.21-1.fc21.x86_64
> packaging.log:  
> product:        Fedora
> reason:         MDRaidError: name_from_md_node(md127p1) failed
> release:        Fedora release 21 (Twenty One)
> version:        Fedora

Please note: the work-around mentioned above worked fine: anaconda (et al) saw blank/empty sda and sdb, was able to use them as "Standard Partitioning" and RAID-1 with manual configuration, and complete the install.

Comment 23 James D. Howard 2014-12-31 04:23:05 UTC
Please note: Similar to other earlier comments, my system ALSO had an Intel hardware RAID card RS2BL080 (actually mfg. by LSI Logic, their part # MegaRAID SAS 2108) managing 18 TBy of disk.  During BOTH the failed and successful install, the RAID array was attached and recognized by the booted LIVE-CD, and was listed as being handled by the "megaraid_sas" driver without problem.  The device showed up in the booted LIVE OS as /dev/sdd... in either case.

Comment 24 Noel Duffy 2015-01-23 10:05:54 UTC
Another user experienced a similar problem:

Boot Fedora 21 from USB Flash drive. Select Install. 

cmdline:        /usr/bin/python  /sbin/anaconda --liveinst --method=livecd:///dev/mapper/live-base
cmdline_file:   BOOT_IMAGE=vmlinuz0 initrd=initrd0.img root=live:CDLABEL=Fedora-Live-WS-x86_64-21-5 rootfstype=auto ro rd.live.image quiet  rhgb rd.luks=0 rd.md=0 rd.dm=0 
hashmarkername: anaconda
kernel:         3.17.4-301.fc21.x86_64
other involved packages: python-blivet-0.61.13-1.fc21.noarch, python-libs-2.7.8-7.fc21.x86_64
package:        anaconda-core-21.48.21-1.fc21.x86_64
packaging.log:  
product:        Fedora
reason:         MDRaidError: name_from_md_node(md126p1) failed
release:        Fedora release 21 (Twenty One)
version:        Fedora

Comment 25 Ian Pilcher 2015-02-27 23:17:19 UTC
I just tried Fedora 22 Alpha TC5 on my system with Intel BIOS RAID, and I got a very similar crash:

  https://bugzilla.redhat.com/show_bug.cgi?id=1197257

Comment 26 poul7777777 2015-03-03 16:29:09 UTC
Another user experienced a similar problem:

Intel SRT
or may be KMS Radeon R9 270x
UEFI Dualboot

addons:         com_redhat_kdump
cmdline:        /usr/bin/python  /sbin/anaconda
cmdline_file:   BOOT_IMAGE=/images/pxeboot/vmlinuz inst.stage2=hd:LABEL=LIVE quiet
hashmarkername: anaconda
kernel:         3.17.4-301.fc21.x86_64
package:        anaconda-21.48.21-1
packaging.log:  
product:        Fedora
reason:         MDRaidError: name_from_md_node(md125p1) failed
release:        Cannot get release name.
version:        Fedora

Comment 27 mulhern 2015-04-08 14:12:16 UTC
*** Bug 1209635 has been marked as a duplicate of this bug. ***

Comment 28 Fedora Blocker Bugs Application 2015-04-08 20:31:31 UTC
Proposed as a Blocker for 22-final by Fedora user ipilcher using the blocker tracking app because:

 As far as I can tell, it is impossible to install Fedora 22 (via Anaconda) on a system with Intel software (IMSM) RAID.

Comment 29 mulhern 2015-04-09 12:51:22 UTC
*** Bug 1210057 has been marked as a duplicate of this bug. ***

Comment 30 mulhern 2015-04-13 16:47:53 UTC
Suggest that the bug should not be considered a blocker unless it can be consistently reproduced.

Comment 31 James D. Howard 2015-04-13 18:42:29 UTC
Prior to applying the workaround quoted above (zeroing the start of another RAID pair to be used in installation), the installation problem occurred 3 times for me.  That's not "general" reproducibility - but at least suggestive :-)

Comment 32 Adam Williamson 2015-04-13 18:44:18 UTC
It's not a case of whether it can be reproduced in a configuration that's known to hit the bug, but whether there's a clear way to reproduce an affected configuration. See my dupe - I got my test system into a similar state *once*, but could not reproduce it after that.

Comment 33 David Lehman 2015-04-13 19:10:25 UTC
See https://bugzilla.redhat.com/show_bug.cgi?id=1210057#c19

Comment 34 David Lehman 2015-04-13 19:12:51 UTC
Someone needs to upload a recent set of logs from this so we can see what it looks like with the current codebase. If you hit it, don't just let it add the useless "Another user experienced this" message. Save the logs and attach them. Thanks in advance.

Comment 35 Ian Pilcher 2015-04-16 22:48:55 UTC
(In reply to David Lehman from comment #34)
> Someone needs to upload a recent set of logs from this so we can see what it
> looks like with the current codebase. If you hit it, don't just let it add
> the useless "Another user experienced this" message. Save the logs and
> attach them. Thanks in advance.

Can you provide a link to an ISO with a codebase that you consider to be current?  (I can't find anything on dl.fedoraproject.org.)  Also, what logs do you need, beyond what the automatic reporting tool uploads (and why doesn't it upload those)?

Comment 36 Adam Williamson 2015-04-16 22:57:22 UTC
It doesn't upload a new set of logs when it considers a new report to be a dupe of an existing bug (otherwise BZ would be spammed with dozens or hundreds of attachments for commonly-encountered bugs).

The current F22 build is Beta RC3: https://dl.fedoraproject.org/pub/alt/stage/22_Beta_RC3/

Comment 37 Ian Pilcher 2015-04-17 18:58:52 UTC
(In reply to awilliam from comment #36)
> It doesn't upload a new set of logs when it considers a new report to be a
> dupe of an existing bug (otherwise BZ would be spammed with dozens or
> hundreds of attachments for commonly-encountered bugs).

Seems like there should be some sort of flag/keyword/etc. that can be set in a bug to tell the tool that new logs are needed.

> The current F22 build is Beta RC3:
> https://dl.fedoraproject.org/pub/alt/stage/22_Beta_RC3/

Downloading now, but I'm not sure how I'm supposed to get some of the stuff (the anaconda traceback, for example) into a file to upload.

Comment 38 Adam Williamson 2015-04-17 19:15:31 UTC
The anaconda env has 'fpaste', so you can just fpaste the file out (look for /tmp/anaconda-tb-randomstring ). If it's too large, you can cut it up a bit, or you can mount a USB stick and copy it onto that.

Comment 39 David Lehman 2015-04-17 19:50:42 UTC
(In reply to Ian Pilcher from comment #37)
> Downloading now, but I'm not sure how I'm supposed to get some of the stuff
> (the anaconda traceback, for example) into a file to upload.

The easiest way is to scp the files onto your workstation, provided the system you're installing isn't your workstation.

Comment 40 Ian Pilcher 2015-04-17 23:01:16 UTC
Created attachment 1015781 [details]
File: anaconda-tb

Comment 41 Ian Pilcher 2015-04-17 23:01:54 UTC
Created attachment 1015782 [details]
File: anaconda.log

Comment 42 Ian Pilcher 2015-04-17 23:02:36 UTC
Created attachment 1015783 [details]
File: journalctl

Comment 43 Ian Pilcher 2015-04-17 23:03:16 UTC
Created attachment 1015784 [details]
File: lsblk_output

Comment 44 Ian Pilcher 2015-04-17 23:03:58 UTC
Created attachment 1015785 [details]
File: nmcli_dev_list

Comment 45 Ian Pilcher 2015-04-17 23:04:34 UTC
Created attachment 1015786 [details]
File: program.log

Comment 46 Ian Pilcher 2015-04-17 23:05:13 UTC
Created attachment 1015787 [details]
File: storage.log

Comment 47 Ian Pilcher 2015-04-17 23:06:05 UTC
Created attachment 1015788 [details]
File: ifcfg.log

Comment 48 Ian Pilcher 2015-04-19 00:26:15 UTC
(In reply to David Lehman from comment #33)
> See https://bugzilla.redhat.com/show_bug.cgi?id=1210057#c19

I can duplicate that message (kernel:  sdb: sdb1 sdb2 sdb3 sdb4 < sdb5 >) by running partprobe, so I'd guess that Anaconda is doing that (or something equivalent).

Comment 49 David Lehman 2015-04-20 17:00:17 UTC
My guess is that when we use pyparted to tell us about the block device it eventually closes an r/w fd, which triggers udev to call partprobe. It seems fairly likely that same "change" uevent is what triggers the stopping of the array.
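That mechanism is easy to demonstrate: udev places an inotify watch on block devices, and closing a descriptor that was opened read-write fires IN_CLOSE_WRITE, which udev turns into a "change" uevent and a partition rescan, the same effect partprobe provokes. A minimal illustration (the helper name is hypothetical, and on a regular file the close is harmless):

```python
import os

def open_close_rw(path):
    """Open `path` read-write and close it without writing anything.

    When `path` is a block device, udev's inotify watch treats the
    close of a write-capable descriptor as a potential modification
    and emits a "change" uevent, triggering a partition table rescan.
    Run `udevadm monitor --udev` in another terminal to observe it.
    """
    fd = os.open(path, os.O_RDWR)
    os.close(fd)
    return True
```

This is why merely probing a disk with a read-write handle can have side effects such as the array's partitions being rescanned mid-install.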

Comment 50 David Lehman 2015-04-20 17:20:53 UTC
Try adding this to the boot command line to grab a bit of debug logging:

inst.updates=https://dlehman.fedorapeople.org/updates/updates-1160424.0.img

Comment 51 Petr Schindler 2015-04-20 18:36:53 UTC
Discussed at today's blocker review meeting [1].

It was decided to delay the decision by one week - this is at least potentially a blocker, but it depends on how commonly it occurs (it does not affect *all* firmware RAID installs). New information seems to be arriving quite often, so we will check in next week and see if this is clearer then.

[1] http://meetbot.fedoraproject.org/fedora-blocker-review/2015-04-20/

Comment 52 Ian Pilcher 2015-04-21 15:30:07 UTC
Created attachment 1016954 [details]
File: anaconda-tb-4jCXBs

Comment 53 Ian Pilcher 2015-04-21 15:30:46 UTC
Created attachment 1016955 [details]
File: anaconda-tb-TMjejy

Comment 54 Ian Pilcher 2015-04-21 15:31:14 UTC
Created attachment 1016956 [details]
File: anaconda.log

Comment 55 Ian Pilcher 2015-04-21 15:32:30 UTC
Created attachment 1016957 [details]
File: ifcfg.log

Comment 56 Ian Pilcher 2015-04-21 15:33:03 UTC
Created attachment 1016958 [details]
File: journalctl

Comment 57 Ian Pilcher 2015-04-21 15:33:37 UTC
Created attachment 1016959 [details]
File: lsblk_output

Comment 58 Ian Pilcher 2015-04-21 15:34:22 UTC
Created attachment 1016960 [details]
File: nmcli_dev_list

Comment 59 Ian Pilcher 2015-04-21 15:34:54 UTC
Created attachment 1016961 [details]
File: program.log

Comment 60 Ian Pilcher 2015-04-21 15:35:34 UTC
Created attachment 1016962 [details]
File: storage.log

Comment 61 Ian Pilcher 2015-04-21 15:37:35 UTC
(In reply to David Lehman from comment #50)
> Try adding this to the boot command line to grab a bit of debug logging:
> 
> inst.updates=https://dlehman.fedorapeople.org/updates/updates-1160424.0.img

New logs uploaded.  (Wow, does Bugzilla need a batch upload function!)  Note that I got two anaconda-tb files.

Comment 62 David Lehman 2015-04-21 20:27:42 UTC
This is a bit of a fishing expedition. I didn't get what I hoped for last time, so I have another updates image:

inst.updates=https://dlehman.fedorapeople.org/updates/updates-1160424.1.img

(note incremented digit just before the ".img" extension)

Comment 63 Ian Pilcher 2015-04-21 20:43:25 UTC
(In reply to David Lehman from comment #62)
> This is a bit of a fishing expedition. I didn't get what I hoped for last
> time, so I have another updates image:
> 
> inst.updates=https://dlehman.fedorapeople.org/updates/updates-1160424.1.img
> 
> (note incremented digit just before the ".img" extension)

I'll try it when I get home.  Do you need all of the logs, or only particular ones?

Comment 64 Ian Pilcher 2015-04-21 23:08:33 UTC
Created attachment 1017166 [details]
File: anaconda.log

Comment 65 Ian Pilcher 2015-04-21 23:09:35 UTC
Created attachment 1017167 [details]
File: anaconda-tb-HrlsJE

Comment 66 Ian Pilcher 2015-04-21 23:10:12 UTC
Created attachment 1017168 [details]
File: anaconda-tb-sM7Zw6

Comment 67 Ian Pilcher 2015-04-21 23:10:47 UTC
Created attachment 1017169 [details]
File: ifcfg.log

Comment 68 Ian Pilcher 2015-04-21 23:11:16 UTC
Created attachment 1017170 [details]
File: journalctl

Comment 69 Ian Pilcher 2015-04-21 23:11:56 UTC
Created attachment 1017171 [details]
File: lsblk_output

Comment 70 Ian Pilcher 2015-04-21 23:12:26 UTC
Created attachment 1017172 [details]
File: nmcli_dev_list

Comment 71 Ian Pilcher 2015-04-21 23:13:02 UTC
Created attachment 1017173 [details]
File: program.log

Comment 72 Ian Pilcher 2015-04-21 23:13:44 UTC
Created attachment 1017174 [details]
File: storage.log

Comment 73 David Lehman 2015-04-22 14:46:57 UTC
Here is what I see happening:

mpatha has 4096 byte sectors and a total size of 264.3GiB. We create partition mpatha3 with length 69156608 sectors, which should be a size of 263.8GiB. For some reason, the system interprets the size of this partition as though the sector size were 512 bytes, so both lsblk and lvm see mpatha3 as having size 32.97GiB.

In fact, you'll notice that all of the partitions on mpatha show up with oddly small sizes in the lsblk output. This indicates that parted sees mpatha as having a 4KiB sector size while the rest of the system sees it as having a 512B sector size.

sdc and sdk, which comprise mpatha, both show in syslog kernel messages as having 4096-byte sectors.

Maybe you can check /proc/partitions to see if the kernel is confused as well.

Comment 74 David Lehman 2015-04-22 14:48:14 UTC
Please ignore comment 73 -- it was intended for a different bug report.

Comment 75 David Gay 2015-04-28 21:51:07 UTC
Discussed at the 2015-04-28 blocker review meeting.[1] Voted as RejectedBlocker.

This is a worrying bug, but so far it has been difficult to pin down. It appears to only affect RAID sets in specific states. If more details emerge that indicate it is very commonly encountered by those attempting to install on firmware RAID, we will reconsider it.

[1]: http://meetbot.fedoraproject.org/fedora-blocker-review/2015-04-28/

Comment 76 Ian Pilcher 2015-04-30 14:34:15 UTC
(In reply to David Gay from comment #75)
> This is a worrying bug, but so far it has been difficult to pin down. It
> appears to only affect RAID sets in specific states. If more details emerge
> that indicate it is very commonly encountered by those attempting to install
> on firmware RAID, we will reconsider it.

Would it be too much to ask that someone actually look at the logs that I provided back on April 21?

> [1]: http://meetbot.fedoraproject.org/fedora-blocker-review/2015-04-28/

So reading these (and previous) minutes, it seems that the main reasons that this is not considered a blocker are:

1.  It's hard

2.  No progress is being made (see above)

3.  Adam was able to workaround the issue by deleting all of the partitions on
    his array.

Did I miss something?

Comment 77 DO NOT USE account not monitored (old adamwill) 2015-04-30 21:07:42 UTC
"Would it be too much to ask that someone actually look at the logs that I provided back on April 21?"

dlehman has, he just didn't update the bug - he says they still don't provide enough data, and it's not easy to get enough data. He is still working on this.

On the reasons: you missed probably the most important one, 'this doesn't always happen'. It's not like *all* existing firmware RAID sets run into this bug, they don't. I've done multiple installs over existing sets and it's worked fine - I only hit this bug *one* time.

The other point is that 2) is pretty significant. Fedora has a hybrid release process, not *completely* time-based, feature-based or quality-based. That means we do sometimes have to reject bugs as blockers if they're sufficiently difficult to address that we can't commit to fixing them in a reasonable time span; our policy isn't as strict as 'if it takes a year to fix this blocker, we'll just have no releases for a year!'

Comment 78 David Lehman 2015-05-01 18:23:13 UTC
(In reply to Ian Pilcher from comment #76)
> Did I miss something?

You may have missed the fact that I have a great many things to do other than work on non-blocker Fedora bugs.


If you want to find me in IRC we can probably cover ground a bit faster. I'm generally in #anaconda on freenode during US/Central business hours.

Comment 79 Joseph Bowman 2015-05-10 14:45:24 UTC
Another user experienced a similar problem:

This occurred when opening "install to hard drive"; it happens about 60% of the time in my experience.

cmdline:        /usr/bin/python2  /sbin/anaconda --liveinst --method=livecd:///dev/mapper/live-base
cmdline_file:   BOOT_IMAGE=/syslinux/vmlinuz0 root=live:LABEL=LIVE ro rd.live.image quiet rhgb rd.live.check
hashmarkername: anaconda
kernel:         4.0.0-0.rc5.git4.1.fc22.x86_64
other involved packages: python-blivet-1.0.7-1.fc22.noarch, python-libs-2.7.9-5.fc22.x86_64
package:        anaconda-core-22.20.9-1.fc22.x86_64
packaging.log:  
product:        Fedora
reason:         Error: g-bd-md-error-quark: No name found for the node 'md126p1' (2)
release:        Fedora release 22 (Twenty Two)
version:        22

Comment 80 Adam Williamson 2015-05-11 16:31:53 UTC
*** Bug 1219430 has been marked as a duplicate of this bug. ***

Comment 81 Adam Williamson 2015-05-11 18:36:05 UTC
Discussed again (via newly-filed dupe 1219430) at 2015-05-11 blocker review meeting: https://meetbot.fedoraproject.org/fedora-blocker-review/2015-05-11/f22-blocker-review.2015-05-11-16.01.log.txt . Remains RejectedBlocker for now, but we're concerned about how many reports this is getting and its severity. Somehow we didn't accept it as a freeze exception before; it is now accepted. Obviously, if we can fix this, we should.

Comment 82 Bart Ratgers 2015-05-28 18:56:39 UTC
Another user experienced a similar problem:

Start Fedora 22 installer

cmdline:        /usr/bin/python2  /sbin/anaconda --liveinst --method=livecd:///dev/mapper/live-base
cmdline_file:   BOOT_IMAGE=vmlinuz0 initrd=initrd0.img root=live:CDLABEL=Fedora-Live-WS-x86_64-22-3 rootfstype=auto ro rd.live.image quiet  rhgb rd.luks=0 rd.md=0 rd.dm=0 
hashmarkername: anaconda
kernel:         4.0.4-301.fc22.x86_64
other involved packages: libblockdev-0.13-2.fc22.x86_64, python-libs-2.7.9-6.fc22.x86_64, python-blivet-1.0.9-1.fc22.noarch
package:        anaconda-core-22.20.13-1.fc22.x86_64
packaging.log:  
product:        Fedora
reason:         MDRaidError: No name found for the node 'md126p1'
release:        Fedora release 22 (Twenty Two)
version:        22

Comment 83 David Lehman 2015-05-29 20:39:36 UTC
If anyone is willing to run some tests, I will prepare another debug updates image to narrow the problem down a bit.

Comment 84 David Lehman 2015-05-29 21:08:14 UTC
I see a sequence like this repeating in the syslog over and over:

Apr 21 17:57:17 localhost systemd[1]: Starting Software RAID monitoring and management...
<snip>
Apr 21 17:57:17 localhost mdadm[2025]: mdadm: No mail address or alert command - not monitoring.
<snip>
Apr 21 17:57:17 localhost systemd[1]: mdmonitor.service: control process exited, code=exited status=1
Apr 21 17:57:17 localhost systemd[1]: Failed to start Software RAID monitoring and management.
Apr 21 17:57:17 localhost systemd[1]: Unit mdmonitor.service entered failed state.
Apr 21 17:57:17 localhost unknown[1]: <audit-1130> pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=mdmonitor comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Apr 21 17:57:17 localhost systemd[1]: mdmonitor.service failed.
Apr 21 17:57:17 localhost kernel: audit: type=1130 audit(1429639037.278:91): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=mdmonitor comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'

That may not matter since I also see a successful start of mdmon@md127 or something along those lines.


Then I see this:

Apr 21 22:57:36 localhost kernel:  sdb: sdb1 sdb2 sdb3 sdb4 < sdb5 >
Apr 21 22:57:36 localhost kernel:  sdc: sdc1 sdc2 sdc3 sdc4 < sdc5 >

More importantly, I see this:

Apr 21 22:57:39 localhost systemd-udevd[1731]: error: /dev/md126p1: No such file or directory
Apr 21 22:57:39 localhost systemd-udevd[2386]: inotify_add_watch(7, /dev/md126p2, 10) failed: No such file or directory
Apr 21 22:57:39 localhost systemd-udevd[2346]: inotify_add_watch(7, /dev/md126p1, 10) failed: No such file or directory
Apr 21 22:57:39 localhost systemd-udevd[2409]: inotify_add_watch(7, /dev/md126p3, 10) failed: No such file or directory
Apr 21 22:57:39 localhost systemd-udevd[2410]: inotify_add_watch(7, /dev/md126p4, 10) failed: No such file or directory
Apr 21 22:57:39 localhost systemd-udevd[1731]: error: /dev/md126p3: No such file or directory
Apr 21 22:57:39 localhost systemd-udevd[1731]: error: /dev/md126p2: No such file or directory
Apr 21 22:57:39 localhost systemd-udevd[1731]: error: /dev/md126p4: No such file or directory
Apr 21 22:57:39 localhost kernel:  md126: p1 p2 p3 p4 < p5 >

This last bit happens at right around the same time (within the same second) as the transition from active to inactive array(s).

I'm starting to think that our instantiating a parted.Device for sdb and sdc, which triggers a "change" uevent for each, is triggering udev rules that briefly deactivate the isw/imsm arrays. I say "briefly" because the last syslog snippet suggests they get reactivated. lsblk does not show them, though, so it may be that they get deactivated and stay down.
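The udev side of this theory rests on a concrete mechanism: systemd-udevd places an inotify IN_CLOSE_WRITE watch on block device nodes and synthesizes a "change" uevent whenever a writer closes one, which is exactly what libparted's probing does. A small self-contained sketch (using an ordinary file rather than a real disk, so it only demonstrates the inotify half of the mechanism, not udev itself):

```python
import ctypes
import os
import struct
import tempfile

IN_CLOSE_WRITE = 0x00000008  # from <sys/inotify.h>

libc = ctypes.CDLL(None, use_errno=True)

def close_write_fires_event(path):
    """Watch path with inotify, open it for write and close it (as a
    partition probe does to a disk node), and report whether the
    IN_CLOSE_WRITE event -- udev's trigger for a "change" uevent --
    was delivered."""
    fd = libc.inotify_init()
    try:
        libc.inotify_add_watch(fd, path.encode(), IN_CLOSE_WRITE)
        os.close(os.open(path, os.O_WRONLY))  # open for write, close
        # struct inotify_event: int wd; uint32 mask, cookie, len
        mask = struct.unpack_from("iIII", os.read(fd, 4096))[1]
        return bool(mask & IN_CLOSE_WRITE)
    finally:
        os.close(fd)
```

On a real system, running `udevadm monitor --udev --subsystem-match=block` while reproducing should show the corresponding burst of "change" events on sdb/sdc followed by remove/add on the md126p* nodes, if this theory is right.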

Comment 85 Ian Pilcher 2015-06-01 19:59:42 UTC
(In reply to David Lehman from comment #78)
> You may have missed the fact that I have a great many things to do other
> than work on non-blocker Fedora bugs.

Actually that's one thing that I definitely didn't miss -- thus my attempt to get this bug accepted as a blocker and my frustration when it 

> If you want to find me in IRC we can probably cover ground a bit faster. I'm
> generally in #anaconda on freenode during US/Central business hours.

Unfortunately, I am generally only able to do Fedora-related stuff on nights & weekends, plus I often travel during the week.  That said, if you ever decide that you really, really want some "live time" with my home system we could schedule something.

(In reply to David Lehman from comment #83)
> If anyone is willing to run some tests I will prepare another debug updates
> image to narrow the problem down a bit.

Absolutely.

(In reply to David Lehman from comment #84)
> I'm starting to think that us instantiating a parted.Device for sdb and sdc,
> which triggers a "change" uevent for each, is triggering udev rules that
> briefly deactivate the isw/imsm arrays. I say briefly because the last
> syslog snippet suggests they get reactivated. lsblk does not show them,
> though, so it may be that they get deactivated and stay down.

It just occurred to me that the best way to handle this may be to add a call to udevadm settle (or its library equivalent), followed by a second attempt, in the relevant error path(s). My reasoning is that trying to prevent/control uevents is probably a fool's errand (and unlikely to garner much sympathy from the udev developers).
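A hedged sketch of that idea, assuming a hypothetical retry_after_settle() helper (this is not blivet's actual API; the failing lookup is passed in as fn):

```python
import subprocess

def retry_after_settle(fn, attempts=2,
                       settle=lambda: subprocess.call(["udevadm", "settle"])):
    """Call fn(); if it raises, wait for pending uevents to be
    processed (udevadm settle) and try again, re-raising only
    after the final attempt."""
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise  # still failing after udev settled
            settle()
```

In the scenario described in comment 84, the /dev/md symlinks should reappear once udev finishes reprocessing the "change" events, so a second attempt after settling would succeed without anaconda having to prevent the uevents in the first place.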

Comment 86 Adam Williamson 2015-06-10 22:55:21 UTC
Clearing F22 accepted / nominated freeze exception status as F22 has shipped, per https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Trackers . You may nominate as an Alpha, Beta or Final freeze exception for F23 if desired using the web application - https://qa.fedoraproject.org/blockerbugs/propose_bug (though it is not currently set up for F23) - or by marking the bug as blocking AlphaFreezeException, BetaFreezeException, or FinalFreezeException.

Comment 87 mulhern 2015-06-19 16:43:51 UTC
Reassigning...

Comment 88 Fedora Admin XMLRPC Client 2015-09-28 20:26:40 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 89 julbrehin 2015-10-06 21:11:16 UTC
Another user experienced a similar problem:

Booted from a Fedora 22 live USB on an Asus Zenbook UX301LA with an Intel firmware RAID0 (2 x 128GB SSDs).
Clicked on Install to Hard Drive and got this error.

Seems like it was a freeze exception before the 22 release. It looks similar to bug #1160424, but this one was reassigned (to what/where?) when it came to 23.

It has been reproduced 3 times (out of 3 attempts), even after an fsck on the 3 partitions.

Result of the command: ls -l /dev/md
total 0
lrwxrwxrwx. 1 root root  8 Oct  6 15:31 ASUS_OS_0 -> ../md126
lrwxrwxrwx. 1 root root 10 Oct  6 15:31 ASUS_OS_0p1 -> ../md126p1
lrwxrwxrwx. 1 root root 10 Oct  6 15:31 ASUS_OS_0p2 -> ../md126p2
lrwxrwxrwx. 1 root root 10 Oct  6 15:31 ASUS_OS_0p3 -> ../md126p3
lrwxrwxrwx. 1 root root  8 Oct  6 15:31 imsm0 -> ../md127

cmdline:        /usr/bin/python2  /sbin/anaconda --liveinst --method=livecd:///dev/mapper/live-base
cmdline_file:   BOOT_IMAGE=/isolinux/vmlinuz0 root=live:LABEL=Fedora ro rd.live.image quiet rhgb rd.live.check
hashmarkername: anaconda
kernel:         4.0.4-301.fc22.x86_64
other involved packages: libblockdev-0.13-2.fc22.x86_64, python-libs-2.7.9-6.fc22.x86_64, python-blivet-1.0.9-1.fc22.noarch
package:        anaconda-core-22.20.13-1.fc22.x86_64
packaging.log:  
product:        Fedora
reason:         MDRaidError: No name found for the node 'md126p1'
release:        Fedora release 22 (Twenty Two)
version:        22
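For context, the lookup that fails here resolves a partition node such as md126p1 to the array's friendly name by scanning the /dev/md symlinks shown in the listing above. A minimal sketch of that resolution (a hypothetical re-implementation, not blivet's actual code) shows why a briefly missing symlink yields exactly this error:

```python
import os

def name_from_md_node(node, md_dir="/dev/md"):
    """Return the friendly name whose symlink in md_dir points at node,
    e.g. md126p1 -> ASUS_OS_0p1 in the listing above."""
    for name in os.listdir(md_dir):
        link = os.path.join(md_dir, name)
        if os.path.islink(link) and \
                os.path.basename(os.readlink(link)) == node:
            return name
    # If udev has momentarily removed the symlinks (comment 84),
    # this is the "No name found for the node 'md126p1'" failure.
    raise RuntimeError("No name found for the node %r" % node)
```

Once udev re-creates the symlinks, the same call succeeds, which is consistent with the settle-and-retry suggestion in comment 85.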

Comment 90 Fedora End Of Life 2016-07-19 12:21:53 UTC
Fedora 22 changed to end-of-life (EOL) status on 2016-07-19. Fedora 22 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

