Bug 1015234 - F20 Beta TC1 ARM disk images unable to find root filesystem
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: dracut
Version: 20
Hardware: arm Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assigned To: dracut-maint
QA Contact: Fedora Extras Quality Assurance
Whiteboard: AcceptedBlocker
Keywords: Reopened
Depends On:
Blocks: ARMTracker F20BetaBlocker
Reported: 2013-10-03 13:19 EDT by Paul Whalen
Modified: 2013-11-04 21:15 EST
CC List: 11 users

See Also:
Fixed In Version: dracut-034-8.git20131008.fc20
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-11-04 21:15:10 EST
Type: Bug


Attachments
Log of regenerating initramfs with '--debug' (5.70 MB, text/x-log)
2013-10-03 13:27 EDT, Paul Whalen
Fedora 20 Beta TC4 Boot Log - resize success (51.42 KB, text/plain)
2013-10-17 14:46 EDT, Paul Whalen
Fedora 20 Beta TC5 Boot Log - resize failure (29.37 KB, text/plain)
2013-10-17 14:47 EDT, Paul Whalen

Description Paul Whalen 2013-10-03 13:19:32 EDT
Description of problem:
When booting F20 Beta TC1 ARM disk images, system is unable to find the root filesystem.


Version-Release number of selected component (if applicable):

dracut-033-3.git20130913.fc20

How reproducible:
Every time.

Steps to Reproduce:
1. Boot an F20 Beta TC1 disk image on any device, or in QEMU.



Actual results:
The system drops to the dracut emergency shell.

Expected results:
Booted system

Additional info:
When regenerating the initramfs using 'dracut -f --add-drivers mmc_block' the system is able to find the root filesystem.
Comment 1 Paul Whalen 2013-10-03 13:27:16 EDT
Created attachment 807191 [details]
Log of regenerating initramfs with '--debug'
Comment 2 Paul Whalen 2013-10-03 13:35:10 EDT
Proposed Blocker:
"Release-blocking images must boot: All release-blocking images must boot in their supported configurations."
Comment 3 Kyle McMartin 2013-10-03 15:53:35 EDT
I suspect it's a combination of:
commit 36b2e5e2c27f9e72e9aa40e580b6d9e60799ceae
Author: Colin Walters <walters@verbum.org>
Date:   Wed Sep 11 09:04:45 2013 -0400

    dracut.sh: Fixup previous commit to only read /sys and /proc in hostonly mode
    
    The gnome-ostree build system generates dracut initramfs images on the
    build server, therefore not in hostonly mode.  The build system at the
    moment doesn't mount /sys, and the previous commit caused a hard
    failure due to lack of /sys/devices.
    
    Because we only want /sys/devices in hostonly mode, just move those
    bits inside the hostonly conditional above.

commit 3c4315fa1368f1ee12dfa8abb5dd7c3556da79f8
Author: Harald Hoyer <harald@redhat.com>
Date:   Wed Sep 11 09:57:52 2013 +0200

    dracut-functions.sh: extend module_is_host_only()
    
    If the currently running kernel is not present in the installer root,
    fall back to modalias checking.

since we build the initramfs for the ARM images on one host in hostonly mode, as each image is (more or less) board specific.

I think I'll hardcode the modules we need for ARM boards entirely in 90kernel-modules.
Comment 4 Harald Hoyer 2013-10-07 05:11:20 EDT
(In reply to Paul Whalen from comment #1)
> Created attachment 807191 [details]
> Log of regenerating initramfs with '--debug'

Well according to the debug log, "mmc_block" is actually installed.

dracut-install: Handle '/lib/modules/3.11.2-301.fc20.armv7hl/kernel/drivers/mmc/card/mmc_block.ko'
dracut-install: dracut_install('/lib/modules/3.11.2-301.fc20.armv7hl/kernel/drivers/mmc/card/mmc_block.ko', '/lib/modules/3.11.2-301.fc20.armv7hl/kernel/drivers/mmc/card/mmc_block.ko')
dracut-install: dest dir '/var/tmp/initramfs.MXy0ux/lib/modules/3.11.2-301.fc20.armv7hl/kernel/drivers/mmc/card' does not exist
dracut-install: dracut_install('/lib/modules/3.11.2-301.fc20.armv7hl/kernel/drivers/mmc/card', '/lib/modules/3.11.2-301.fc20.armv7hl/kernel/drivers/mmc/card')
dracut-install: dest dir '/var/tmp/initramfs.MXy0ux/lib/modules/3.11.2-301.fc20.armv7hl/kernel/drivers/mmc' does not exist
dracut-install: dracut_install('/lib/modules/3.11.2-301.fc20.armv7hl/kernel/drivers/mmc', '/lib/modules/3.11.2-301.fc20.armv7hl/kernel/drivers/mmc')
dracut-install: mkdir '/var/tmp/initramfs.MXy0ux/lib/modules/3.11.2-301.fc20.armv7hl/kernel/drivers/mmc'
dracut-install: mkdir '/var/tmp/initramfs.MXy0ux/lib/modules/3.11.2-301.fc20.armv7hl/kernel/drivers/mmc/card'
dracut-install: dracut_install ret = 0
dracut-install: cp '/lib/modules/3.11.2-301.fc20.armv7hl/kernel/drivers/mmc/card/mmc_block.ko' '/var/tmp/initramfs.MXy0ux/lib/modules/3.11.2-301.fc20.armv7hl/kernel/drivers/mmc/card/mmc_block.ko'
Comment 5 Mike Ruckman 2013-10-09 17:12:24 EDT
Discussed at the 2013-10-09 Blocker Review Meeting [1]. Voted AcceptedBlocker, as it violates the following F20 Alpha release criterion for ARM images in virt and on bare metal: "A system installed with a release-blocking desktop must boot to a log in screen where it is possible to log in to a working desktop using a user account created during installation or a 'first boot' utility." [2]

[1] http://meetbot.fedoraproject.org/fedora-blocker-review/2013-10-09/ 
[2] https://fedoraproject.org/wiki/Fedora_20_Alpha_Release_Criteria#Expected_installed_system_boot_behavior
Comment 6 Harald Hoyer 2013-10-10 06:27:50 EDT
Downloaded:

http://dl.fedoraproject.org/pub/alt/stage/20-Beta-TC2/Images/armhfp/Fedora-Minimal-armhfp-20-Beta-TC2-sda.raw.xz

# unxz ...
# losetup -P -f Fedora-Minimal-armhfp-20-Beta-TC2-sda.raw 
# mount /dev/loop0p1 /mnt/tt
# for i in /mnt/tt/initramfs-3.11.3-301.fc20.armv7hl*; do \
  lsinitrd $i | fgrep mmc_block;done
-rwxr--r--   1 root     root        36164 Oct  3 06:02 usr/lib/modules/3.11.3-301.fc20.armv7hl/kernel/drivers/mmc/card/mmc_block.ko
-rwxr--r--   1 root     root        36172 Oct  3 05:46 usr/lib/modules/3.11.3-301.fc20.armv7hl+lpae/kernel/drivers/mmc/card/mmc_block.ko


Seems to be there..

# for i in /mnt/tt/initramfs-3.11.3-301.fc20.armv7hl*; do lsinitrd $i | fgrep 'with dracut';done
dracut-033-3.git20130913.fc20 with dracut modules:
dracut-033-3.git20130913.fc20 with dracut modules:
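
The module check above can be wrapped in a small helper. This is only a sketch: the function name `has_module` is hypothetical, it reads an `lsinitrd` listing on stdin instead of opening an image itself, and the sample line is copied from the output above.

```shell
#!/bin/sh
# Hypothetical helper: read an lsinitrd listing on stdin and succeed if it
# contains <name>.ko, optionally with a compression suffix such as .ko.xz.
has_module() {
    grep -Eq "/$1\.ko(\.[a-z0-9]+)?\$"
}

# Sample line copied from the lsinitrd output above.
sample='-rwxr--r--   1 root     root        36164 Oct  3 06:02 usr/lib/modules/3.11.3-301.fc20.armv7hl/kernel/drivers/mmc/card/mmc_block.ko'

# Report on two module names against the sample listing.
for mod in mmc_block sdhci_tegra; do
    if printf '%s\n' "$sample" | has_module "$mod"; then
        echo "$mod: present"
    else
        echo "$mod: missing"
    fi
done
```

Against a real image the listing would come from `lsinitrd <image> | has_module mmc_block`. Note that this only shows whether a file is packed into the initramfs, not whether the driver gets loaded at boot.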
Comment 7 Harald Hoyer 2013-10-10 06:36:02 EDT
Btw, how do you boot this raw image in qemu?
Comment 8 Kyle McMartin 2013-10-10 07:55:54 EDT
The image creation was missing dracut-config-generic, so we were getting hostonly dracut images in TC1... that should be fixed for TC2. However, we're now also seeing issues resizing the root partition that we weren't before that we need to debug.

regards, Kyle
Comment 9 Adam Williamson 2013-10-11 05:27:43 EDT
Hmm, this seems a bit messy now.

The change in TC2 to use dracut-config-generic: is this a fix, or a workaround? Is further work required on that specific issue, or not?

The 'issues resizing the root partition' that Kyle talks about: are those at all related to the initial problem here? If not, they should be filed and tracked separately. If they may block the Beta release, the newly-filed bug should be proposed as a Beta blocker.
Comment 10 Fedora Update System 2013-10-12 12:34:33 EDT
dracut-034-8.git20131008.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/dracut-034-8.git20131008.fc20
Comment 11 Fedora Update System 2013-10-13 15:52:30 EDT
dracut-034-8.git20131008.fc20 has been pushed to the Fedora 20 stable repository.  If problems still persist, please make note of it in this bug report.
Comment 12 Dennis Gilmore 2013-10-14 00:01:38 EDT
Pulling dracut-config-generic into TC2 was a workaround, though maybe it should be the fix, with ARM images simply using generic initrds. Since we do not know the target device at creation time, we need a generic image, but it would be nice for kernel updates to switch to hostonly mode and produce a smaller initramfs, especially considering that most ARM devices use SD cards for their root storage. The dracut change in dracut-034-8.git20131008.fc20, however, is neither right nor sufficient.
Comment 13 Harald Hoyer 2013-10-14 02:19:27 EDT
(In reply to Dennis Gilmore from comment #12)
> As we do not know the target device at creation time we need generic…

So why not call dracut with "-N" at generation time? Then you will have a generic image.
Comment 14 Harald Hoyer 2013-10-14 02:29:40 EDT
So, now we have:

instmods sdhci_esdhc_imx mmci sdhci_tegra mvsdio omap omapdrm \
         omap_hsmmc panel-tfp410 sdhci_dove ahci_platform pata_imx sata_mv \
         ehci-tegra mmc_block usb_storage


If you need those modules in _every_ arm initramfs, why don't you compile them in the kernel?

If you only need those modules in the shipped install image initramfs, why don't you
a) use the "--no-hostonly" option to build it
or
b) use the --add-drivers "sdhci_esdhc_imx mmci sdhci_tegra mvsdio omap omapdrm \
         omap_hsmmc panel-tfp410 sdhci_dove ahci_platform pata_imx sata_mv \
         ehci-tegra mmc_block usb_storage" to build it?
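
Options (a) and (b) above differ only in the arguments handed to dracut. A minimal sketch, assuming a hypothetical helper named `dracut_args` that prints the arguments rather than running dracut (which would need root and a matching kernel tree). The driver list is copied from the `instmods` line quoted in this comment, and `--no-hostonly` and `--add-drivers` are the options named above.

```shell
#!/bin/sh
# Driver list copied from the instmods line quoted in this comment.
ARM_DRIVERS="sdhci_esdhc_imx mmci sdhci_tegra mvsdio omap omapdrm \
omap_hsmmc panel-tfp410 sdhci_dove ahci_platform pata_imx sata_mv \
ehci-tegra mmc_block usb_storage"

# Hypothetical helper: print the dracut arguments for option a or b.
dracut_args() {
    case $1 in
        a) printf '%s\n' "--no-hostonly" ;;
        b) printf '%s "%s"\n' "--add-drivers" "$ARM_DRIVERS" ;;
        *) echo "usage: dracut_args a|b" >&2; return 1 ;;
    esac
}

# Dry run: show what each option would pass to dracut.
echo "dracut $(dracut_args a)"
echo "dracut $(dracut_args b)"
```

Option (a) yields a generic image regardless of the build host, while option (b) keeps hostonly mode and only forces the listed drivers in, so which one fits depends on whether the build host resembles the target board at all.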
Comment 15 Harald Hoyer 2013-10-14 02:32:35 EDT
In any case, why don't you talk to me first before modifying my package?
Comment 16 Dennis Gilmore 2013-10-14 23:51:17 EDT
We do not actually call dracut directly; it is called via the kernel install process in appliance-creator. I guess I could patch appliance-creator to regenerate the initramfs. We don't need all of those modules in every ARM initramfs, but we do need to make sure they are in the ones that dracut makes. I guess another option is to remove dracut-config-generic in %post in the kickstarts. Up until recently dracut ran checks that resulted in dracut determining that it needed to make a generic image; those checks no longer seem to be run. As I said, we do not run dracut directly, and I'd really rather not re-run it manually, as I would then also need to re-create the u-boot wrapped versions. There is no good way to pass any flags to dracut.
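
On the point about passing flags when dracut is invoked indirectly: dracut also reads settings from drop-in files under /etc/dracut.conf.d/, which is presumably how dracut-config-generic takes effect. The following is only a sketch of that mechanism, not something proposed in this bug. The helper name `write_generic_conf` and the file name `50-arm-generic.conf` are made up, and the driver names are examples taken from this report.

```shell
#!/bin/sh
# Hypothetical helper: write a dracut drop-in that requests a generic
# (non-hostonly) image. The file name 50-arm-generic.conf is made up.
write_generic_conf() {
    dir=$1
    mkdir -p "$dir" || return 1
    cat > "$dir/50-arm-generic.conf" <<'EOF'
# Build a generic initramfs instead of a host-specific one.
hostonly="no"
# Always include these drivers (example names from this bug; adjust per board).
add_drivers+=" mmc_block usb_storage "
EOF
}

# Demo against a scratch directory; on a real image the target would be
# /etc/dracut.conf.d, which requires root.
scratch=$(mktemp -d)
write_generic_conf "$scratch/dracut.conf.d"
cat "$scratch/dracut.conf.d/50-arm-generic.conf"
rm -rf "$scratch"
```

Because the settings live in a file, they would apply however dracut ends up being called, and removing the file later (for example in %post) would return the system to the default behaviour.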


https://fedoraproject.org/wiki/Changes/Virt_ARM_on_x86 gives you an overview on how to run the image on a x86 box.
Comment 17 Harald Hoyer 2013-10-15 04:46:14 EDT
(In reply to Dennis Gilmore from comment #16)
> up until recently dracut had checks that were run, which resulted in dracut
> determining that it needed to make a generic image. those checks no longer
> seem to be run.

Well, these checks only verify that /sys, /proc, /run and /dev are mounted properly, and the checks are still present.

Anyway, it's not wise to generate a host-only image in appliance-creator if these images are used on different hardware.

Removing dracut-config-generic in %post in the kickstarts would probably be the best solution here.
Comment 18 Adam Williamson 2013-10-16 14:41:07 EDT
Discussed at 2013-10-16 blocker review meeting: http://meetbot.fedoraproject.org/fedora-blocker-review/2013-10-16/f20beta-blocker-review-4.2013-10-16-16.02.log.txt . This appears to be working, one way or another, in TC4, but we are not sure if the current state is the intended long-term fix, or still a workaround.

Is there a definite plan for how this should be handled in the long term? If so, is it already implemented? Generally, where do we stand on this?
Comment 19 Paul Whalen 2013-10-17 14:46:20 EDT
Created attachment 813502 [details]
Fedora 20 Beta TC4 Boot Log - resize success
Comment 20 Paul Whalen 2013-10-17 14:47:57 EDT
Created attachment 813506 [details]
Fedora 20 Beta TC5 Boot Log - resize failure
Comment 21 Paul Whalen 2013-10-17 14:49:25 EDT
TC4 used dracut-034-8.git20131008.fc20 and did not include dracut-config-generic. With the same version of dracut *and* dracut-config-generic included in TC5, when root is on sda, the resize of the root filesystem fails and destroys the root partition. Resizing mmc does not work (BZ#1009172); when it does, I would expect the same behaviour.

I have attached boot logs for TC4 and TC5 with debug enabled, and am unsure which package I should file the bug against. Is this a problem in dracut-modules-growroot (which logs an error on both success and failure), in cloud-utils-growpart (as the entire partition is destroyed), or a by-product of using a generic config?
Comment 22 Peter Robinson 2013-10-19 07:42:42 EDT
(In reply to Paul Whalen from comment #21)
> The TC4 used dracut-034-8.git20131008.fc20 and did not include
> dracut-config-generic. When including the same version of dracut *and*
> dracut-config-generic in TC5, when root is on sda, the resize of the root
> filesystem fails and destroys the root partition. Resizing mmc does not work
> (BZ#1009172), when it does work I would expect the same behaviour. 

So just to confirm: the corruption seen when resizing sda has been confirmed to be due to hardware bugginess in the Trimslice's USB-attached SSD. This has been a problem with the device forever. When tested on highbank with a standard HDD, it works fine. I believe kylem has a fix for the resize bug on MMC.
Comment 23 Paul Whalen 2013-10-22 10:06:18 EDT
Resize is also causing issues in QEMU (BZ#1016648).
Comment 24 Adam Williamson 2013-10-24 14:40:48 EDT
If the actual showstopper bug here is fixed by something that's in stable, then we can close it. Shouldn't the resize issue be filed separately?
Comment 25 Adam Williamson 2013-11-04 21:15:10 EST
I'm just going to go ahead and close this now.
