Bug 1210057

Summary: Error: g-bd-md-error-quark: No name found for the node 'md126p1' (2)

Product: Fedora
Version: 22
Component: anaconda
Hardware: x86_64
OS: Unspecified
Status: CLOSED DUPLICATE
Severity: unspecified
Priority: unspecified
Reporter: Adam Williamson <awilliam>
Assignee: Anaconda Maintenance Team <anaconda-maint-list>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: amulhern, anaconda-maint-list, awilliam, g.kaviyarasu, ipilcher, jonathan, kubek-93, vanmeeuwen+fedora
Whiteboard: abrt_hash:f18e12cafc68353c8822501744b6b98043ee7ba8500decbf81fc1ebd505d2c3f
Doc Type: Bug Fix
Last Closed: 2015-04-09 12:51:22 UTC
Attachments (no flags set):
anaconda-tb
anaconda.log
dnf.log
environ
lsblk_output
nmcli_dev_list
os_info
program.log
storage.log
syslog
ifcfg.log
packaging.log

Description Adam Williamson 2015-04-08 19:32:29 UTC
Description of problem:
Booted the Fedora 22 Beta RC1 Server DVD (dd'ed to USB) on a system with an existing Intel firmware RAID-0 set that had a Fedora install on it.

Version-Release number of selected component:
anaconda-22.20.9-1

The following was filed automatically by anaconda:
anaconda 22.20.9-1 exception report
Traceback (most recent call first):
  File "/usr/lib/python2.7/site-packages/blivet/devicetree.py", line 909, in addUdevPartitionDevice
    name = blockdev.md_name_from_node(name)
  File "/usr/lib/python2.7/site-packages/blivet/devicetree.py", line 1225, in addUdevDevice
    device = self.addUdevPartitionDevice(info)
  File "/usr/lib/python2.7/site-packages/blivet/devicetree.py", line 2194, in _populate
    self.addUdevDevice(dev)
  File "/usr/lib/python2.7/site-packages/blivet/devicetree.py", line 2128, in populate
    self._populate()
  File "/usr/lib/python2.7/site-packages/blivet/blivet.py", line 277, in reset
    self.devicetree.populate(cleanupOnly=cleanupOnly)
  File "/usr/lib/python2.7/site-packages/blivet/osinstall.py", line 1117, in storageInitialize
    storage.reset()
  File "/usr/lib64/python2.7/threading.py", line 766, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib64/python2.7/site-packages/pyanaconda/threads.py", line 244, in run
    threading.Thread.run(self, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/pyanaconda/threads.py", line 115, in wait
    self.raise_if_error(name)
  File "/usr/lib64/python2.7/site-packages/pyanaconda/timezone.py", line 75, in time_initialize
    threadMgr.wait(THREAD_STORAGE)
  File "/usr/lib64/python2.7/threading.py", line 766, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib64/python2.7/site-packages/pyanaconda/threads.py", line 244, in run
    threading.Thread.run(self, *args, **kwargs)
Error: g-bd-md-error-quark: No name found for the node 'md126p1' (2)
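
For context, the failing call is libblockdev's md_name_from_node(), which blivet uses to map a kernel node name like 'md126p1' back to its /dev/md/ symlink name. A minimal Python sketch of that lookup as I understand it (illustrative only; the real implementation is C code inside libblockdev):

    import os

    def md_name_from_node(node):
        """Return the /dev/md/ name whose symlink resolves to the given
        kernel node (e.g. 'md126p1'), or None if no symlink points at it,
        which is where libblockdev raises 'No name found for the node'."""
        md_dir = "/dev/md"
        if not os.path.isdir(md_dir):
            return None
        for name in os.listdir(md_dir):
            target = os.path.realpath(os.path.join(md_dir, name))
            if os.path.basename(target) == node:
                return name
        return None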

Additional info:
addons:         com_redhat_kdump
cmdline:        /usr/bin/python2  /sbin/anaconda
cmdline_file:   BOOT_IMAGE=vmlinuz initrd=initrd.img inst.stage2=hd:LABEL=Fedora-22_B-x86_64 quiet
dnf.rpm.log:    Apr 08 20:41:04 INFO --- logging initialized ---
executable:     /sbin/anaconda
hashmarkername: anaconda
kernel:         4.0.0-0.rc5.git4.1.fc22.x86_64
product:        Fedora
release:        Cannot get release name.
type:           anaconda
version:        22

Comment 1 Adam Williamson 2015-04-08 19:32:31 UTC
Created attachment 1012387 [details]
File: anaconda-tb

Comment 2 Adam Williamson 2015-04-08 19:32:32 UTC
Created attachment 1012388 [details]
File: anaconda.log

Comment 3 Adam Williamson 2015-04-08 19:32:33 UTC
Created attachment 1012389 [details]
File: dnf.log

Comment 4 Adam Williamson 2015-04-08 19:32:34 UTC
Created attachment 1012390 [details]
File: environ

Comment 5 Adam Williamson 2015-04-08 19:32:35 UTC
Created attachment 1012391 [details]
File: lsblk_output

Comment 6 Adam Williamson 2015-04-08 19:32:36 UTC
Created attachment 1012392 [details]
File: nmcli_dev_list

Comment 7 Adam Williamson 2015-04-08 19:32:37 UTC
Created attachment 1012393 [details]
File: os_info

Comment 8 Adam Williamson 2015-04-08 19:32:38 UTC
Created attachment 1012394 [details]
File: program.log

Comment 9 Adam Williamson 2015-04-08 19:32:39 UTC
Created attachment 1012395 [details]
File: storage.log

Comment 10 Adam Williamson 2015-04-08 19:32:40 UTC
Created attachment 1012396 [details]
File: syslog

Comment 11 Adam Williamson 2015-04-08 19:32:41 UTC
Created attachment 1012397 [details]
File: ifcfg.log

Comment 12 Adam Williamson 2015-04-08 19:32:42 UTC
Created attachment 1012398 [details]
File: packaging.log

Comment 13 Adam Williamson 2015-04-08 20:19:06 UTC
This seemed to be reproducible with the set in the exact state in which I first hit it (I booted twice and hit the crash both times), but after I manually wiped the set I can't reproduce it any more: I ran three installs in succession without hitting it on any of them, and after the third I can still boot the installer without hitting the crash.

So I'm not quite sure what special state I got the set into, but this at least doesn't seem to be a clear showstopper.

Comment 14 Ian Pilcher 2015-04-08 22:40:53 UTC
(In reply to awilliam from comment #13)
> This seemed to be reproducible with the set in the exact state in
> which I first hit it (I booted twice and hit the crash both times),
> but after I manually wiped the set I can't reproduce it any more: I
> ran three installs in succession without hitting it on any of them,
> and after the third I can still boot the installer without hitting
> the crash.

When you say "I manually wiped the set", what exactly did you do -- wipe filesystems, LVs, etc.; remove partitions; delete RAID volumes/sets; or something else?

Comment 15 Adam Williamson 2015-04-08 23:20:51 UTC
Wiped the partitions with fdisk and created a new disk label.
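
(For reference, a hedged sketch of a roughly equivalent non-interactive cleanup; the wipefs/parted commands and the GPT label type are assumptions, not necessarily what was actually run:)

    import subprocess

    def wipe_disk_label(disk):
        """Remove all filesystem/RAID signatures from 'disk', then write
        a fresh empty label. Destructive; a scripted stand-in for the
        interactive fdisk session. The 'gpt' label type is an assumption."""
        subprocess.check_call(["wipefs", "--all", disk])
        subprocess.check_call(["parted", "-s", disk, "mklabel", "gpt"])

    # e.g. wipe_disk_label("/dev/md126")  # device path is illustrative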

Comment 16 Ian Pilcher 2015-04-08 23:31:54 UTC
(In reply to awilliam from comment #15)
> Wiped the partitions with fdisk and created a new disk label.

What do you want to bet that recreating the partitions brings back the crash?

(I still believe that the root cause is that Anaconda is looking for symlinks in /dev/md/ before udev is done creating them.  Partitions on top of MD RAID devices seem to take a particularly long time for udev to process for some reason.)
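
If that theory is right, one hypothetical workaround (not anaconda's actual code; this reuses the md_name_from_node() sketch shown after the traceback above) would be to flush udev's event queue and retry the lookup before giving up:

    import subprocess
    import time

    def md_name_when_settled(node, timeout=10.0, poll=0.5):
        """Give udev a chance to finish creating the /dev/md/ symlinks,
        then retry the name lookup until it succeeds or times out."""
        subprocess.call(["udevadm", "settle", "--timeout=%d" % int(timeout)])
        deadline = time.time() + timeout
        while time.time() < deadline:
            name = md_name_from_node(node)  # sketch from the traceback above
            if name is not None:
                return name
            time.sleep(poll)  # symlink may still be on its way
        raise RuntimeError("no /dev/md/ symlink appeared for %s" % node)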

Comment 17 Adam Williamson 2015-04-09 00:21:48 UTC
Well, that's what I was doing in #c13. So far as I can recall, all that was on the disk was a previous F22 install, which is why I was installing over and over; but that doesn't seem to reproduce the bug again.

Comment 18 mulhern 2015-04-09 12:51:22 UTC

*** This bug has been marked as a duplicate of bug 1160424 ***

Comment 19 David Lehman 2015-04-13 19:07:58 UTC
Looking at Adam's logs here, what I see is that something outside of anaconda/blivet is stopping the fwraid.

Here you can see (from syslog) where the system activates the fwraid:

20:40:52,900 INFO kernel:[   18.720320] md: bind<sda>
20:40:52,900 INFO kernel:[   18.729999] md: bind<sdb>
20:40:52,900 INFO kernel:[   18.732877] md: bind<sdb>
20:40:52,900 INFO kernel:[   18.733078] md: bind<sda>
20:40:52,900 INFO kernel:[   18.736946] md/raid0:md126: md_size is 1953536000 sectors.
20:40:52,900 INFO kernel:[   18.736950] md: RAID0 configuration for md126 - 1 zone
20:40:52,900 INFO kernel:[   18.736951] md: zone0=[sda/sdb]
20:40:52,900 INFO kernel:[   18.736955]       zone-offset=         0KB, device-offset=         0KB, size= 976768256KB
20:40:52,900 INFO kernel:[   18.736956] 
20:40:52,900 INFO kernel:[   18.736974] md126: detected capacity change from 0 to 1000210432000
20:40:52,900 INFO kernel:[   18.762846]  md126: p1 p2 p3
<snip>
20:40:52,900 INFO kernel:[   18.878739] md: export_rdev(sdb)
20:40:52,900 INFO kernel:[   18.878781] md: export_rdev(sda)
20:40:52,900 INFO kernel:[   18.973909] md: export_rdev(sdb)
20:40:52,900 INFO kernel:[   18.973942] md: export_rdev(sda)


And then, inexplicably, there's this:

20:41:05,023 INFO kernel:[   41.519132]  sda: sda1 sda2 sda3
20:41:05,023 WARNING kernel:[   41.519135] sda: partition table partially beyond EOD, truncated
20:41:05,023 WARNING kernel:[   41.519426] sda: p2 size 1948653568 extends beyond EOD, truncated
20:41:05,023 WARNING kernel:[   41.519495] sda: p3 start 1949630464 is beyond EOD, truncated
20:41:05,096 WARNING kernel:[   41.592055] Alternate GPT is invalid, using primary GPT.
20:41:05,096 INFO kernel:[   41.592069]  sdb: sdb1 sdb2 sdb3
<snip>
20:41:06,611 INFO kernel:[   43.109073]  md126: p1 p2 p3


To me, this looks like sda is disappearing and then reappearing. I can't comment on whether this should be handled transparently by the fwraid, but I can say that a disappearing array is going to be difficult to install to. Notice that there are no active md devices in the lsblk output at the bottom of program.log. I see nothing in blivet's logs to indicate that blivet deactivated the array.
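
(A quick way to double-check the "no active md devices" observation on the affected system is to read /proc/mdstat; a minimal Python sketch that parses only the common 'active' line format:)

    def active_md_arrays(mdstat_path="/proc/mdstat"):
        """List the md arrays the kernel currently reports as active.
        Array lines look like: 'md126 : active raid0 sdb[1] sda[0]'.
        An array stopped from outside anaconda/blivet won't appear."""
        arrays = []
        with open(mdstat_path) as f:
            for line in f:
                if line.startswith("md") and " : " in line:
                    name, state = line.split(" : ", 1)
                    if state.startswith("active"):
                        arrays.append(name.strip())
        return arrays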

Comment 20 David Shea 2015-05-05 15:39:27 UTC
*** Bug 1217666 has been marked as a duplicate of this bug. ***