Bug 1210057

Summary: Error: g-bd-md-error-quark: No name found for the node 'md126p1' (2)

Product: Fedora
Version: 22
Component: anaconda
Hardware: x86_64
OS: Unspecified
Status: CLOSED DUPLICATE
Severity: unspecified
Priority: unspecified
Reporter: Adam Williamson <awilliam>
Assignee: Anaconda Maintenance Team <anaconda-maint-list>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: amulhern, anaconda-maint-list, awilliam, g.kaviyarasu, ipilcher, jonathan, kubek-93, vanmeeuwen+fedora
Whiteboard: abrt_hash:f18e12cafc68353c8822501744b6b98043ee7ba8500decbf81fc1ebd505d2c3f
Doc Type: Bug Fix
Last Closed: 2015-04-09 12:51:22 UTC
Attachments (no flags set):
anaconda-tb
anaconda.log
dnf.log
environ
lsblk_output
nmcli_dev_list
os_info
program.log
storage.log
syslog
ifcfg.log
packaging.log

Description Adam Williamson 2015-04-08 19:32:29 UTC
Description of problem:
Booted the Fedora 22 Beta RC1 Server DVD (dd'ed to USB) on a system with an existing Intel firmware RAID-0 set that had a Fedora install on it.

Version-Release number of selected component:
anaconda-22.20.9-1

The following was filed automatically by anaconda:
anaconda 22.20.9-1 exception report
Traceback (most recent call first):
  File "/usr/lib/python2.7/site-packages/blivet/devicetree.py", line 909, in addUdevPartitionDevice
    name = blockdev.md_name_from_node(name)
  File "/usr/lib/python2.7/site-packages/blivet/devicetree.py", line 1225, in addUdevDevice
    device = self.addUdevPartitionDevice(info)
  File "/usr/lib/python2.7/site-packages/blivet/devicetree.py", line 2194, in _populate
    self.addUdevDevice(dev)
  File "/usr/lib/python2.7/site-packages/blivet/devicetree.py", line 2128, in populate
    self._populate()
  File "/usr/lib/python2.7/site-packages/blivet/blivet.py", line 277, in reset
    self.devicetree.populate(cleanupOnly=cleanupOnly)
  File "/usr/lib/python2.7/site-packages/blivet/osinstall.py", line 1117, in storageInitialize
    storage.reset()
  File "/usr/lib64/python2.7/threading.py", line 766, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib64/python2.7/site-packages/pyanaconda/threads.py", line 244, in run
    threading.Thread.run(self, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/pyanaconda/threads.py", line 115, in wait
    self.raise_if_error(name)
  File "/usr/lib64/python2.7/site-packages/pyanaconda/timezone.py", line 75, in time_initialize
    threadMgr.wait(THREAD_STORAGE)
  File "/usr/lib64/python2.7/threading.py", line 766, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib64/python2.7/site-packages/pyanaconda/threads.py", line 244, in run
    threading.Thread.run(self, *args, **kwargs)
Error: g-bd-md-error-quark: No name found for the node 'md126p1' (2)
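
For context, the failing call is libblockdev's md_name_from_node(), which blivet uses to map a kernel node name like 'md126p1' back to its /dev/md/ symlink name. A minimal Python sketch of that lookup as I understand it (illustrative only; the real implementation is C code inside libblockdev):

    import os

    def md_name_from_node(node):
        """Return the /dev/md/ name whose symlink resolves to the given
        kernel node (e.g. 'md126p1'), or None if no symlink points at it,
        which is where libblockdev raises 'No name found for the node'."""
        md_dir = "/dev/md"
        if not os.path.isdir(md_dir):
            return None
        for name in os.listdir(md_dir):
            target = os.path.realpath(os.path.join(md_dir, name))
            if os.path.basename(target) == node:
                return name
        return None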

Additional info:
addons:         com_redhat_kdump
cmdline:        /usr/bin/python2  /sbin/anaconda
cmdline_file:   BOOT_IMAGE=vmlinuz initrd=initrd.img inst.stage2=hd:LABEL=Fedora-22_B-x86_64 quiet
dnf.rpm.log:    Apr 08 20:41:04 INFO --- logging initialized ---
executable:     /sbin/anaconda
hashmarkername: anaconda
kernel:         4.0.0-0.rc5.git4.1.fc22.x86_64
product:        Fedora
release:        Cannot get release name.
type:           anaconda
version:        22

Comment 1 Adam Williamson 2015-04-08 19:32:31 UTC
Created attachment 1012387 [details]
File: anaconda-tb

Comment 2 Adam Williamson 2015-04-08 19:32:32 UTC
Created attachment 1012388 [details]
File: anaconda.log

Comment 3 Adam Williamson 2015-04-08 19:32:33 UTC
Created attachment 1012389 [details]
File: dnf.log

Comment 4 Adam Williamson 2015-04-08 19:32:34 UTC
Created attachment 1012390 [details]
File: environ

Comment 5 Adam Williamson 2015-04-08 19:32:35 UTC
Created attachment 1012391 [details]
File: lsblk_output

Comment 6 Adam Williamson 2015-04-08 19:32:36 UTC
Created attachment 1012392 [details]
File: nmcli_dev_list

Comment 7 Adam Williamson 2015-04-08 19:32:37 UTC
Created attachment 1012393 [details]
File: os_info

Comment 8 Adam Williamson 2015-04-08 19:32:38 UTC
Created attachment 1012394 [details]
File: program.log

Comment 9 Adam Williamson 2015-04-08 19:32:39 UTC
Created attachment 1012395 [details]
File: storage.log

Comment 10 Adam Williamson 2015-04-08 19:32:40 UTC
Created attachment 1012396 [details]
File: syslog

Comment 11 Adam Williamson 2015-04-08 19:32:41 UTC
Created attachment 1012397 [details]
File: ifcfg.log

Comment 12 Adam Williamson 2015-04-08 19:32:42 UTC
Created attachment 1012398 [details]
File: packaging.log

Comment 13 Adam Williamson 2015-04-08 20:19:06 UTC
This seemed to be reproducible with the set in the exact state in which I first hit it (I booted twice and hit the crash both times), but after I manually wiped the set I can't reproduce it any more: I ran three installs in succession without hitting it on any of them, and after the third I can still boot the installer without hitting the crash.

So I'm not quite sure what special state I got the set into, but this at least doesn't seem to be a clear showstopper.

Comment 14 Ian Pilcher 2015-04-08 22:40:53 UTC
(In reply to awilliam from comment #13)
> This seemed to be reproducible with the set in the exact state in
> which I first hit it (I booted twice and hit the crash both times),
> but after I manually wiped the set I can't reproduce it any more: I
> ran three installs in succession without hitting it on any of them,
> and after the third I can still boot the installer without hitting
> the crash.

When you say "I manually wiped the set", what exactly did you do -- wipe filesystems, LVs, etc.; remove partitions; delete RAID volumes/sets; or something else?

Comment 15 Adam Williamson 2015-04-08 23:20:51 UTC
Wiped the partitions with fdisk and created a new disk label.
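
(For reference, a hedged sketch of a roughly equivalent non-interactive cleanup; the wipefs/parted commands and the GPT label type are assumptions, not necessarily what was actually run:)

    import subprocess

    def wipe_disk_label(disk):
        """Remove all filesystem/RAID signatures from 'disk', then write
        a fresh empty label. Destructive; a scripted stand-in for the
        interactive fdisk session. The 'gpt' label type is an assumption."""
        subprocess.check_call(["wipefs", "--all", disk])
        subprocess.check_call(["parted", "-s", disk, "mklabel", "gpt"])

    # e.g. wipe_disk_label("/dev/md126")  # device path is illustrative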

Comment 16 Ian Pilcher 2015-04-08 23:31:54 UTC
(In reply to awilliam from comment #15)
> Wiped the partitions with fdisk and created a new disk label.

What do you want to bet that recreating the partitions brings back the crash?

(I still believe that the root cause is that Anaconda is looking for symlinks in /dev/md/ before udev is done creating them.  Partitions on top of MD RAID devices seem to take a particularly long time for udev to process for some reason.)
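
If that theory is right, one hypothetical workaround (not anaconda's actual code; this reuses the md_name_from_node() sketch shown after the traceback above) would be to flush udev's event queue and retry the lookup before giving up:

    import subprocess
    import time

    def md_name_when_settled(node, timeout=10.0, poll=0.5):
        """Give udev a chance to finish creating the /dev/md/ symlinks,
        then retry the name lookup until it succeeds or times out."""
        subprocess.call(["udevadm", "settle", "--timeout=%d" % int(timeout)])
        deadline = time.time() + timeout
        while time.time() < deadline:
            name = md_name_from_node(node)  # sketch from the traceback above
            if name is not None:
                return name
            time.sleep(poll)  # symlink may still be on its way
        raise RuntimeError("no /dev/md/ symlink appeared for %s" % node)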

Comment 17 Adam Williamson 2015-04-09 00:21:48 UTC
Well, that's what I was doing in #c13. So far as I can recall, all that was on the disk was a previous F22 install, which is why I was installing over and over; but that doesn't seem to reproduce the bug again.

Comment 18 mulhern 2015-04-09 12:51:22 UTC

*** This bug has been marked as a duplicate of bug 1160424 ***

Comment 19 David Lehman 2015-04-13 19:07:58 UTC
Looking at Adam's logs here, what I see is that something outside of anaconda/blivet is stopping the fwraid.

Here you can see (from syslog) where the system activates the fwraid:

20:40:52,900 INFO kernel:[   18.720320] md: bind<sda>
20:40:52,900 INFO kernel:[   18.729999] md: bind<sdb>
20:40:52,900 INFO kernel:[   18.732877] md: bind<sdb>
20:40:52,900 INFO kernel:[   18.733078] md: bind<sda>
20:40:52,900 INFO kernel:[   18.736946] md/raid0:md126: md_size is 1953536000 sectors.
20:40:52,900 INFO kernel:[   18.736950] md: RAID0 configuration for md126 - 1 zone
20:40:52,900 INFO kernel:[   18.736951] md: zone0=[sda/sdb]
20:40:52,900 INFO kernel:[   18.736955]       zone-offset=         0KB, device-offset=         0KB, size= 976768256KB
20:40:52,900 INFO kernel:[   18.736956] 
20:40:52,900 INFO kernel:[   18.736974] md126: detected capacity change from 0 to 1000210432000
20:40:52,900 INFO kernel:[   18.762846]  md126: p1 p2 p3
<snip>
20:40:52,900 INFO kernel:[   18.878739] md: export_rdev(sdb)
20:40:52,900 INFO kernel:[   18.878781] md: export_rdev(sda)
20:40:52,900 INFO kernel:[   18.973909] md: export_rdev(sdb)
20:40:52,900 INFO kernel:[   18.973942] md: export_rdev(sda)


And then, inexplicably, there's this:

20:41:05,023 INFO kernel:[   41.519132]  sda: sda1 sda2 sda3
20:41:05,023 WARNING kernel:[   41.519135] sda: partition table partially beyond EOD, truncated
20:41:05,023 WARNING kernel:[   41.519426] sda: p2 size 1948653568 extends beyond EOD, truncated
20:41:05,023 WARNING kernel:[   41.519495] sda: p3 start 1949630464 is beyond EOD, truncated
20:41:05,096 WARNING kernel:[   41.592055] Alternate GPT is invalid, using primary GPT.
20:41:05,096 INFO kernel:[   41.592069]  sdb: sdb1 sdb2 sdb3
<snip>
20:41:06,611 INFO kernel:[   43.109073]  md126: p1 p2 p3


To me, this looks like sda is disappearing and then reappearing. I can't comment on whether this should be handled transparently by the fwraid, but I can say that a disappearing array is going to be difficult to install to. Notice that there are no active md devices in the lsblk output at the bottom of program.log. I see nothing in blivet's logs to indicate that blivet deactivated the array.
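
(A quick way to double-check the "no active md devices" observation on the affected system is to read /proc/mdstat; a minimal Python sketch that parses only the common 'active' line format:)

    def active_md_arrays(mdstat_path="/proc/mdstat"):
        """List the md arrays the kernel currently reports as active.
        Array lines look like: 'md126 : active raid0 sdb[1] sda[0]'.
        An array stopped from outside anaconda/blivet won't appear."""
        arrays = []
        with open(mdstat_path) as f:
            for line in f:
                if line.startswith("md") and " : " in line:
                    name, state = line.split(" : ", 1)
                    if state.startswith("active"):
                        arrays.append(name.strip())
        return arrays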

Comment 20 David Shea 2015-05-05 15:39:27 UTC
*** Bug 1217666 has been marked as a duplicate of this bug. ***