Bug 723144 - lvcreate not creating device nodes as needed
Summary: lvcreate not creating device nodes as needed
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: lvm2
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: ---
Assignee: Peter Rajnoha
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: AcceptedBlocker
Duplicates: 723314 723886 727488 728420 728886
Depends On:
Blocks: F16Alpha, F16AlphaBlocker
 
Reported: 2011-07-19 08:20 UTC by Tao Wu
Modified: 2014-10-28 23:45 UTC (History)
26 users (show)

Fixed In Version: lvm2-2.02.86-5.fc16
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-08-06 04:24:42 UTC
Type: ---
Embargoed:


Attachments
error log of anaconda 16.12 (320.57 KB, text/plain)
2011-07-19 08:20 UTC, Tao Wu
program log with -vv added to lvm commands (165.38 KB, text/plain)
2011-07-29 15:34 UTC, David Lehman
program log with -vvvv and activation config to verify udev (194.96 KB, text/plain)
2011-08-01 15:00 UTC, David Lehman

Description Tao Wu 2011-07-19 08:20:53 UTC
Created attachment 513733 [details]
error log of anaconda 16.12

Description of problem:
Anaconda 16.12 crashes after the partitioning step; both the i386 and x86_64 versions have this problem.

Version-Release number of selected component (if applicable):
anaconda 16.12

How reproducible:
always

Steps to Reproduce:
1. Install the Fedora-16-test-i386.iso or Fedora-16-test-x86_64.iso in KVM
2. Select the default options and follow the installation process
3. Anaconda runs correctly until the partitioning step, then crashes before the next step.
  
Actual results:
anaconda crashed.

Expected results:
anaconda installs F16 successfully.

Additional info:

Comment 1 Tao Wu 2011-07-19 09:20:57 UTC
There are 3 issues I want to draw attention to:

1. It is strange that the ISO files seem smaller than those of other versions (F14, F15, ...), for example:
      Fedora-16-test-i386-netinst.iso 15-Jul-2011 02:35  145M  
      Fedora-16-test-i386.iso         15-Jul-2011 02:35  310M  
The data above comes from
   http://serverbeach1.fedoraproject.org/pub/alt/stage/20110714/i386/iso/
and I am not sure what Fedora-16-test-i386.iso is, since it is much smaller than the DVD image.

2. Anaconda could not start up correctly unless the kernel argument "quiet" was removed before boot: when the first blue screen appears, we need to press the Tab key and then delete the "quiet" string.

3. After anaconda crashed, it seems it cannot configure the network automatically, so we cannot save the bug to Bugzilla directly.

Comment 2 Martin Gracik 2011-07-19 10:24:59 UTC
Didn't the crash mention ntfsresize?

Comment 3 James Laska 2011-07-19 13:04:20 UTC
Ugh ... XML attachments from report :(   To aid bugzilla searching, I'm including the expanded traceback below ...

anaconda 16.12 exception report
Traceback (most recent call first):
  File "/usr/lib/python2.7/site-packages/pyanaconda/storage/devices.py", line 790, in create
    raise DeviceCreateError(str(e), self.name)
  File "/usr/lib/python2.7/site-packages/pyanaconda/storage/deviceaction.py", line 241, in execute
    self.device.create(intf=intf)
  File "/usr/lib/python2.7/site-packages/pyanaconda/storage/devicetree.py", line 316, in processActions
    action.execute(intf=self.intf)
  File "/usr/lib/python2.7/site-packages/pyanaconda/storage/__init__.py", line 383, in doIt
    self.devicetree.processActions()
  File "/usr/lib/python2.7/site-packages/pyanaconda/packages.py", line 122, in turnOnFilesystems
    anaconda.storage.doIt()
  File "/usr/lib/python2.7/site-packages/pyanaconda/dispatch.py", line 338, in dispatch
    self.dir = self.steps[self.step].target(self.anaconda)
  File "/usr/lib/python2.7/site-packages/pyanaconda/dispatch.py", line 235, in go_forward
    self.dispatch()
  File "/usr/lib/python2.7/site-packages/pyanaconda/gui.py", line 1200, in nextClicked
    self.anaconda.dispatch.go_forward()
DeviceCreateError: ('lvcreate failed for VolGroup/lv_swap:   /dev/VolGroup/lv_swap: not found: device not cleared\n  Aborting. Failed to wipe start of new LV.\n', 'VolGroup-lv_swap')

I'm also seeing this problem using a default partition install ([X] Use LVM) on a baremetal system.

Comment 4 James Laska 2011-07-19 13:07:03 UTC
(In reply to comment #1)
> 3. After anaconda crashed, it seems it cannot configure the network
> automatically, so we cannot save the bug to Bugzilla directly.

Can you file a separate bug for the networking problem?  A good test reproducer for this will be the test case https://fedoraproject.org/wiki/QA:Testcase_Anaconda_save_traceback_to_bugzilla

Comment 5 Radek Vykydal 2011-07-19 14:44:51 UTC
(In reply to comment #4)
> (In reply to comment #1)
> > 3. After anaconda crashed, it seems it cannot configure the network
> > automatically, so we cannot save the bug to Bugzilla directly.
> 
> Can you file a separate bug for the networking problem?  A good test reproducer
> for this will be the test case
> https://fedoraproject.org/wiki/QA:Testcase_Anaconda_save_traceback_to_bugzilla

I am seeing it in any network enablement in stage 2 (GUI). Workarounds are enabling network in loader (e.g. using asknetwork) or checking [x] Connect Automatically when enabling network in GUI.

Comment 6 James Laska 2011-07-19 15:33:35 UTC
(In reply to comment #5)
> I am seeing it in any network enablement in stage 2 (GUI). Workarounds are
> enabling network in loader (e.g. using asknetwork) or checking [x] Connect
> Automatically when enabling network in GUI.

Correct.  Sorry I wasn't clear.  I wasn't offering a workaround, only a reproducer to help file a new bug to track this problem.  

The workaround of activating networking during loader would work wonderfully to avoid anaconda (old stage2) network activation.  Thanks!

Comment 7 Adam Williamson 2011-07-19 16:41:44 UTC
Proposing as Alpha blocker.

Comment 8 James Laska 2011-07-19 18:22:22 UTC
*** Bug 723314 has been marked as a duplicate of this bug. ***

Comment 9 Tao Wu 2011-07-20 10:20:43 UTC
(In reply to comment #4)
> (In reply to comment #1)
> > 3. After anaconda crashed, it seems it cannot configure the network
> > automatically, so we cannot save the bug to Bugzilla directly.
> 
> Can you file a separate bug for the networking problem?  A good test reproducer
> for this will be the test case
> https://fedoraproject.org/wiki/QA:Testcase_Anaconda_save_traceback_to_bugzilla

Thanks for the suggestion. I have submitted a new bug for this issue, bug 723475.

Comment 10 David Lehman 2011-07-20 18:50:26 UTC
This is an lvm issue and is likely related to something about the installer environment -- specifically udev.

Comment 11 Tim Flink 2011-07-21 16:09:52 UTC
The closest alpha release criterion that I'm seeing is:

The installer must be able to complete an installation using the entire disk, existing free space, or existing Linux partitions methods, with or without encryption enabled.

If this is happening every time, I'm +1 on alpha blocker.

Does this also affect non-KVM installs?

Comment 12 James Laska 2011-07-21 18:32:36 UTC
(In reply to comment #11)
> If this is happening every time, I'm +1 on alpha blocker.

Indeed it is.  

> Does this also affect non-KVM installs?

Yes, I confirmed bare-metal installs are also affected.  The trigger is whether or not LVM is used during installation.  

The partitioning Alpha release criteria were originally intended to capture partitioning scenarios that you could *only* hit at the first partitioning screen (excluding customize storage).  Prior to F16, all partitioning scenarios included LVM.  As of F16, there is now a toggle to enable/disable the use of LVM in any of the partitioning scenarios.  I believe enabling/disabling the use of LVM in F16 still meets the spirit of the criteria you noted.  I would recommend we adjust the criteria to include whether LVM is enabled/disabled.

</long_story> So I agree, I'm +1 Alpha blocker :)

Comment 13 Chris Lumens 2011-07-21 18:49:54 UTC
*** Bug 723886 has been marked as a duplicate of this bug. ***

Comment 14 Adam Williamson 2011-07-21 23:13:36 UTC
what's the default state of the toggle?

Comment 15 James Laska 2011-07-22 12:13:42 UTC
(In reply to comment #14)
> what's the default state of the toggle?

It defaults to enabled.  

[X] Use LVM?

Comment 16 Tim Flink 2011-07-22 20:50:05 UTC
Discussed at the 2011-07-22 Blocker Bug Review Meeting. We agreed that it hits the following alpha release criterion and accepted it as a Fedora 16 alpha blocker.

The installer must be able to complete an installation using the entire disk,
existing free space, or existing Linux partitions methods, with or without
encryption enabled.

Comment 17 Tao Wu 2011-07-25 09:48:00 UTC
According to testing of the latest version (Fedora-16-test-20110721-3-i386.iso), anaconda still crashes if "Use LVM" is enabled.

Comment 18 David Lehman 2011-07-29 14:54:01 UTC
04:16:58,448 INFO program: Running... lvm lvcreate -L 2976m -n lv_swap --config  devices { filter=["r|/loop3$|","r|/loop4$|","r|/loop5$|","r|/loop6$|","r|/loop7$|"] }  VolGroup
04:16:59,017 ERR program:   /dev/VolGroup/lv_swap: not found: device not cleared
04:16:59,018 ERR program:   Aborting. Failed to wipe start of new LV.

After I hit this in kvm I switched to a shell and found VolGroup-lv_swap active and usable.

Peter, we have started using systemd in the installer environment. We include most, if not all, of the stock udev rules. We do not use any lvm.conf file, nor do we run any lvm-specific daemons that may exist. Can you offer any insight into what is happening here?

Comment 19 David Lehman 2011-07-29 15:34:04 UTC
Created attachment 515902 [details]
program log with -vv added to lvm commands

The failing command is the last one in the log.

Comment 20 Peter Rajnoha 2011-08-01 11:21:01 UTC
(In reply to comment #18)
> 04:16:59,017 ERR program:   /dev/VolGroup/lv_swap: not found: device not
> cleared
...
> 
> Peter, we have started using systemd in the installer environment. We include
> most, if not all, of the stock udev rules. We do not use any lvm.conf file, nor
> do we run any lvm-specific daemons that may exist. Can you offer any insight
> into what is happening here?

So are all udev rules in place? (10-dm.rules, 11-dm-lvm.rules, 13-dm-disk.rules, 95-dm-notify.rules)

The recent version of LVM2 (v 2.02.86) relies on udev completely without doing any additional checks. There's still an option to bring back the checks with the activation/verify_udev_operations=1 option. Could you please try setting this one in the config line you're using and see the output for any warnings then? (like "Udev should have... but it was not found. Falling back to direct link creation")

Also, run the command with even more debugging with -vvvv (so we can see how udev_sync works in lvm).

So that would be:

  lvm lvcreate -vvvv ... --config "activation { verify_udev_operations=1 } ..."

Then we should see more... Thanks.
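
For anyone reproducing this outside the installer, here is a minimal sketch of how the full command line could be assembled. It is only an illustration: the helper name build_lvcreate_argv and its parameters are hypothetical (they are not anaconda's code), and it assumes the devices filter and the activation settings can be passed together in one --config string, as in the command above. The command itself, the -vvvv flag and the verify_udev_operations setting are taken from this report.

```python
import shlex


def build_lvcreate_argv(vg, lv, size, filters, activation=None, verbosity=0):
    """Build an argv list for `lvm lvcreate` with an inline --config string.

    Hypothetical helper for illustration only; not anaconda's actual code.
    """
    # devices section: the reject filters the installer passes on the command line
    filter_items = ",".join('"%s"' % f for f in filters)
    sections = ["devices { filter=[%s] }" % filter_items]

    # optional activation section, e.g. {"verify_udev_operations": 1}
    if activation:
        settings = " ".join(
            "%s=%d" % (key, value) for key, value in sorted(activation.items())
        )
        sections.append("activation { %s }" % settings)

    argv = ["lvm", "lvcreate"]
    if verbosity:
        argv.append("-" + "v" * verbosity)  # verbosity=4 gives -vvvv
    argv += ["-L", size, "-n", lv, "--config", " ".join(sections), vg]
    return argv


argv = build_lvcreate_argv(
    "VolGroup",
    "lv_swap",
    "2976m",
    filters=["r|/loop3$|", "r|/loop4$|"],
    activation={"verify_udev_operations": 1},
    verbosity=4,
)
# Print a shell-quoted form that can be pasted into a terminal.
print(" ".join(shlex.quote(arg) for arg in argv))
```

Passing the arguments as a list (for example to subprocess) avoids the shell quoting problems that the braces and pipes in the config string would otherwise cause.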

Comment 21 David Lehman 2011-08-01 15:00:49 UTC
Created attachment 516164 [details]
program log with -vvvv and activation config to verify udev

I added -vvvv and the suggested activation config and now the lvcreate command succeeds. Is this because the activation config will fix things or because the additional logging affected the timing?

Comment 22 David Lehman 2011-08-01 15:51:07 UTC
Adding -vvvv but not adding the activation config causes the command to hang completely.

Comment 23 Peter Rajnoha 2011-08-02 10:35:59 UTC
(In reply to comment #21)
> Created attachment 516164 [details]
> program log with -vvvv and activation config to verify udev

Well, looking at the log, I can't see any udev synchronization, just a plain old static node creation with mknod. That happens only if udev is not running OR you have udev sync disabled (so we use original old code to create the nodes).

I think the lvcreate will complete successfully if you use:

  lvm lvcreate --config "activation { udev_sync=1 } ..."

The problem seems to be that internally there's still DEFAULT_UDEV_SYNC=0. When you use *only* '--config' on the command line, without a 'udev_sync=1' setting in it to override the default (the default lvm.conf file provided in the lvm2 package does set it), we end up without any udev synchronization, BUT still fully relying on udev, and so we end up with a race condition.

Can you try adding the udev_sync=1 activation option to your config line (don't set the verify_udev_operations option now)? It should then run fine, I hope.

(I'll fix the DEFAULT_UDEV_SYNC to be 1 as well in the upstream immediately.)
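
To make the fallback described above easier to follow, here is a toy model of it. This is not lvm2's real configuration code: the function name effective_udev_sync and the simple precedence (command-line string first, then lvm.conf, then the compiled-in default) are assumptions made for illustration. Only the values come from this report: the internal default is 0, and the lvm.conf shipped in the lvm2 package sets udev_sync=1.

```python
# Toy model only: NOT lvm2's real configuration code.
COMPILED_DEFAULT_UDEV_SYNC = 0  # internal DEFAULT_UDEV_SYNC as described above


def effective_udev_sync(cmdline_config, lvm_conf=None,
                        compiled_default=COMPILED_DEFAULT_UDEV_SYNC):
    """Return the udev_sync value that would apply (hypothetical helper).

    Assumed precedence: --config string, then lvm.conf, then compiled default.
    """
    for source in (cmdline_config, lvm_conf or {}):
        if "udev_sync" in source:
            return source["udev_sync"]
    return compiled_default


# Installed system: the shipped lvm.conf sets udev_sync=1.
installed = effective_udev_sync({}, lvm_conf={"udev_sync": 1})

# Installer environment: no lvm.conf, and --config carries only a devices filter,
# so the lookup falls through to the compiled-in default.
installer = effective_udev_sync({"filter": ["r|/loop3$|"]})

# Same environment with udev_sync=1 added to the --config string.
workaround = effective_udev_sync({"filter": ["r|/loop3$|"], "udev_sync": 1})

print(installed, installer, workaround)
```

In this model the installer case is the only one that ends up with synchronization off, which matches the situation where lvcreate looks for /dev/VolGroup/lv_swap before udev has created it.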

Comment 24 Chris Lumens 2011-08-02 12:52:47 UTC
*** Bug 727488 has been marked as a duplicate of this bug. ***

Comment 25 David Lehman 2011-08-02 23:05:54 UTC
Adding udev_sync fixes the problem for me. Peter, it would be great if you could do a build for F16-Alpha. It makes little sense to add this code to anaconda if you plan to upstream it in lvm2.

Comment 26 Peter Rajnoha 2011-08-03 09:20:01 UTC
(In reply to comment #25)
> Adding udev_sync fixes the problem for me. Peter, it would be great if you
> could do a build for F16-Alpha. It makes little sense to add this code to
> anaconda if you plan to upstream it in lvm2.

Sure, build done (lvm2 v2.02.86-5, http://koji.fedoraproject.org/koji/buildinfo?buildID=256942). I've sent an email to releng and asked them to include this in the alpha.

Comment 28 Adam Williamson 2011-08-04 00:47:15 UTC
Peter, can you please submit the build as an update via http://bodhi.fedoraproject.org/ ? This is required now that F16 is branched. Thanks!

Comment 29 Fedora Update System 2011-08-04 07:41:04 UTC
lvm2-2.02.86-5.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/lvm2-2.02.86-5.fc16

Comment 30 Fedora Update System 2011-08-04 21:13:09 UTC
Package lvm2-2.02.86-5.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing lvm2-2.02.86-5.fc16'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/lvm2-2.02.86-5.fc16
then log in and leave karma (feedback).

Comment 31 Hongqing Yang 2011-08-05 05:10:54 UTC
*** Bug 728420 has been marked as a duplicate of this bug. ***

Comment 32 Fedora Update System 2011-08-06 04:24:36 UTC
lvm2-2.02.86-5.fc16 has been pushed to the Fedora 16 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 33 Chris Lumens 2011-08-08 14:02:46 UTC
*** Bug 728886 has been marked as a duplicate of this bug. ***

