Bug 1396217 - ppc64/ppc64le kickstart installation fails with dracut-initqueue[241]: Warning: dracut-initqueue timeout - starting timeout scripts
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: dracut
Version: 7.4
Hardware: ppc64
OS: Unspecified
Target Milestone: rc
Assignee: Lukáš Nykrýn
QA Contact: qe-baseos-daemons
Blocks: 1380361
Reported: 2016-11-17 18:04 UTC by Richard W.M. Jones
Modified: 2021-01-15 07:28 UTC (History)

Last Closed: 2021-01-15 07:28:40 UTC

Description Richard W.M. Jones 2016-11-17 18:04:11 UTC
Description of problem:

A kickstart installation inside an emulated POWER/POWERLE virtual
machine (running on x86-64 host, so this is emulated using TCG and
also quite slow) fails on the first boot with:

         Mounting Configuration File System...
[  OK  ] Mounted Configuration File System.
[   16.006006] 8139cp: 8139cp: 10/100 PCI Ethernet driver v1.3 (Mar 22, 2004)
[   16.007020] 8139cp 0000:00:01.0: enabling device (0100 -> 0103)
[   16.014546] 8139cp 0000:00:01.0 eth0: RTL-8139C+ at 0xd000080080040100, 52:54:00:87:03:bf, IRQ 19
[   16.086101] virtio-pci 0000:00:03.0: enabling device (0100 -> 0101)
[   16.086924] virtio-pci 0000:00:03.0: virtio_pci: leaving for legacy driver
[   16.227799] 8139too: 8139too Fast Ethernet driver 0.9.28
[  OK  ] Started udev Coldplug all Devices.
[  OK  ] Reached target System Initialization.
         Starting dracut initqueue hook...
         Starting Show Plymouth Boot Screen...
[  OK  ] Started Show Plymouth Boot Screen.
[  OK  ] Reached target Paths.
[  OK  ] Reached target Basic System.
[  390.791362] dracut-initqueue[241]: Warning: dracut-initqueue timeout - starting timeout scripts
[  393.732354] dracut-initqueue[241]: Warning: dracut-initqueue timeout - starting timeout scripts

  [ this message is repeated many times, then ... ]

[  OK  ] Started dracut initqueue hook.
[  OK  ] Reached target Remote File Systems (Pre).
[  OK  ] Reached target Remote File Systems.
[  *** ] A start job is running for dev-mapp...ot.device (54min 37s / no limit)

At this point it just stays on the "A start job ..." message forever.
I'm guessing whatever timeout is failing needs to be much longer.  However
this did not fail on RHEL 7.2, so it seems to be a regression.

Version-Release number of selected component (if applicable):

RHEL 7.3 ppc64 and ppc64le

This is in an emulated (TCG) virtual machine on Fedora 24 host.

How reproducible:

100%

Steps to Reproduce:
1. Install RHEL 7.3 ppc64 or ppc64le using kickstart into a virtual machine.
2. Boot it.

Filesystems inside the guest:

Name            Type        VFS      Label  MBR  Size  Parent
/dev/sda1       filesystem  unknown  -      -    4.0M  -
/dev/sda2       filesystem  xfs      -      -    1.0G  -
/dev/rhel/root  filesystem  xfs      -      -    28G   -
/dev/rhel/swap  filesystem  swap     -      -    616M  -
/dev/rhel/root  lv          -        -      -    28G   /dev/rhel
/dev/rhel/swap  lv          -        -      -    616M  /dev/rhel
/dev/rhel       vg          -        -      -    29G   /dev/sda3
/dev/sda3       pv          -        -      -    29G   -
/dev/sda1       partition   -        -      41   4.0M  /dev/sda
/dev/sda2       partition   -        -      83   1.0G  /dev/sda
/dev/sda3       partition   -        -      8e   29G   /dev/sda
/dev/sda        device      -        -      -    30G   -

Comment 2 Harald Hoyer 2016-11-22 13:36:16 UTC
seems like a kernel module is missing.. just don't know which

Comment 3 Richard W.M. Jones 2016-11-22 13:51:46 UTC
Any way to find out?  Any way to get better debugging?

Comment 4 Harald Hoyer 2016-11-22 16:25:34 UTC
boot from installation medium:

# lsmod

boot the non-working installation with "rd.break=initqueue"

# lsmod
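
Note: a rough sketch of how the two lsmod listings could be compared,
assuming each one was saved to a file.  The function name and the file
names are hypothetical.  It prints the modules that are loaded in the
working environment but not in the broken one.

```shell
# Print modules present in the first lsmod listing but absent from the second.
# Both arguments are files holding raw `lsmod` output (file names are hypothetical).
missing_modules() {
    good_list=$(mktemp)
    bad_list=$(mktemp)
    # Skip the "Module  Size  Used by" header and keep only the module name column.
    awk 'NR > 1 { print $1 }' "$1" | sort -u > "$good_list"
    awk 'NR > 1 { print $1 }' "$2" | sort -u > "$bad_list"
    # comm -23 keeps the lines that appear only in the first sorted input.
    comm -23 "$good_list" "$bad_list"
    rm -f "$good_list" "$bad_list"
}
```

Example use: missing_modules lsmod-installer.txt lsmod-initqueue.txt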

Comment 5 Richard W.M. Jones 2016-11-22 16:32:13 UTC
There's no installation medium AFAIK, it's using kickstart, and the error
occurs after installation during the first boot.

I will try adding the second suggestion later.

Comment 6 Harald Hoyer 2016-11-22 17:03:35 UTC
(In reply to Richard W.M. Jones from comment #5)
> There's no installation medium AFAIK, it's using kickstart, and the error
> occurs after installation during the first boot.
> 
> I will try adding the second suggestion later.

then do lsmod in %post and parse the logs

Comment 7 Richard W.M. Jones 2016-11-24 15:13:30 UTC
Happens on Fedora 25 (ppc64 & ppc64le) too.  I'll see if I can collect
the lsmod data ...

Comment 8 Richard W.M. Jones 2016-11-24 15:21:44 UTC
This is for Fedora 25 ppc64le:

It's somewhat painful to get lsmod output from %post.

lsmod in the initramfs:

Module                  Size  Used by
virtio_pci             23047  0
virtio_ring            20275  1 virtio_pci
crc32c_vpmsum          10227  0
virtio                 11218  1 virtio_pci

virtio-scsi is required to mount the root filesystem.

Interestingly the kickstart was done using a virtio-blk device
so I suppose it's including that driver instead of virtio-scsi
in the initramfs.

I suppose we can rerun dracut in %post and add virtio-scsi drivers,
since ideally we'd want to make a disk image which could be booted
with either virtio-blk or virtio-scsi.

This works fine on non-POWER architectures.

Comment 9 Richard W.M. Jones 2016-11-24 18:43:21 UTC
Annoyingly, using dracut --add-drivers virtio-scsi in the %post script
did not help, so I guess it's some other thing.

Comment 10 Richard W.M. Jones 2016-11-28 16:46:08 UTC
FINALLY I managed to work around this by doing:

%post

pushd /etc/dracut.conf.d
echo 'add_drivers+="virtio-blk virtio-scsi"' > virt-builder-virtio-scsi.conf
popd

KERNEL_VERSION=$(rpm -q kernel --qf '%{version}-%{release}.%{arch}\n' | sort -V | tail -1)
dracut -f /boot/initramfs-$KERNEL_VERSION.img $KERNEL_VERSION

%end

However this behaviour still differs from both RHEL 7.2 on ppc64
and RHEL 7 on all other architectures.
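
Note on the KERNEL_VERSION line above: it relies on GNU sort's version
ordering to pick the newest installed kernel.  A minimal sketch of that
selection step on its own follows; the function name is hypothetical.
sort -V only approximates RPM's own version comparison, so treat the
result as a best guess when several kernels are installed.

```shell
# Pick the highest version string from stdin using GNU sort's version ordering.
# Plain lexical sorting would rank "99" above "327"; sort -V compares numerically.
newest_version() {
    sort -V | tail -n 1
}

# On an RPM-based system the input would come from something like:
#   rpm -q kernel --qf '%{version}-%{release}.%{arch}\n' | newest_version
```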

Comment 11 Richard W.M. Jones 2016-11-29 20:02:02 UTC
device-mapper.ko is also missing.  It should be included because
the guest was built with LVM support using `autopart --type=lvm'.

Something is very broken with dracut, RHEL 7.3 & POWER.

The kickstart I'm using now is:

install
text
reboot
lang en_US.UTF-8
keyboard us
network --bootproto dhcp
rootpw builder
firewall --enabled --ssh
timezone --utc America/New_York
selinux --enforcing

bootloader --location=mbr --append="console=tty0 console=ttyS0,115200 rd_NO_PLYMOUTH"

zerombr
clearpart --all --initlabel
autopart --type=lvm

# Halt the system once configuration has finished.
poweroff

%packages
@core
%end

%post
# Enable virtio-scsi support.
pushd /etc/dracut.conf.d
echo 'add_drivers+="virtio-blk virtio-scsi"' > virt-builder-virtio-scsi.conf
popd
# To make dracut config changes permanent, we need to rerun dracut.
# Rerun dracut for the installed kernel (not the running kernel).
KERNEL_VERSION="$(rpm -q kernel --qf '%{version}-%{release}.%{arch}\n' | sort -V | tail -1)"
dracut -f /boot/initramfs-$KERNEL_VERSION.img $KERNEL_VERSION
%end

Comment 12 Richard W.M. Jones 2016-11-30 10:55:56 UTC
Sorry, ignore the previous comment.  dm_*.ko modules are present.
The boot problem is something else.

Comment 13 Harald Hoyer 2016-11-30 13:53:57 UTC
please add spaces:

echo 'add_drivers+=" virtio-blk virtio-scsi "' > virt-builder-virtio-scsi.conf

Comment 14 Richard W.M. Jones 2016-11-30 13:54:10 UTC
OK I finally got this to boot.  I had to add virtio-scsi modules
to dracut (on ppc64/ppc64le only), so the situation is as in comment 10.
Ignore comment 11.

Comment 15 Richard W.M. Jones 2016-11-30 13:55:05 UTC
(In reply to Harald Hoyer from comment #13)
> please add spaces:
> 
> echo 'add_drivers+=" virtio-blk virtio-scsi "' > virt-builder-virtio-scsi.conf

I've updated the KS.  However it worked even without the spaces.
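
Note: the likely reason for the spaces is that dracut's configuration
files are read as shell fragments, so += simply appends text to whatever
add_drivers already holds.  The sketch below simulates that under this
assumption; the function name is hypothetical and it is not dracut's own
code.  It sources the given fragments in order with bash and prints the
combined value.

```shell
# Source dracut-style config fragments in order and print the resulting add_drivers.
# Arguments are paths to fragment files; use absolute paths so "." does not search PATH.
combined_add_drivers() {
    bash -c '
        add_drivers=""
        for fragment in "$@"; do
            . "$fragment"
        done
        printf "%s\n" "$add_drivers"
    ' combined_add_drivers "$@"
}
```

With two fragments containing add_drivers+="virtio-blk" and
add_drivers+="virtio-scsi", the names run together into a single word.
With the padded form from comment 13 they stay separate.  A single
fragment with no earlier value works either way, which would be
consistent with it having worked here without the spaces.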

Comment 16 Harald Hoyer 2016-11-30 13:57:20 UTC
hmm, /usr/lib/modules.d/90kernel-modules/module-setup.sh has

        # install virtual machine support
        instmods virtio virtio_blk virtio_ring virtio_pci virtio_scsi \
            "=drivers/pcmcia" =ide "=drivers/usb/storage"


So, if the drivers are loaded on kickstart/installation time, they should be included in the final initramfs.

Comment 17 Harald Hoyer 2016-11-30 14:00:01 UTC
If you don't want the "hostonly" feature:


%packages
@core
dracut-config-generic
%end

which then should generate an initramfs with _all_ the kernel modules (and not just those, which were loaded on installation time)

Comment 18 Richard W.M. Jones 2016-11-30 15:20:43 UTC
I think what's needed first is a very clear and simple reproducer of
the original bug.  This is the most minimal reproducer I could create:

(1) Create a file 'rhel.ks' containing:

install
text
reboot
lang en_US.UTF-8
keyboard us
network --bootproto dhcp
rootpw builder
firewall --enabled --ssh
timezone --utc America/New_York
selinux --enforcing

bootloader --location=mbr --append="console=tty0 console=ttyS0,115200 rd_NO_PLYMOUTH"

zerombr
clearpart --all --initlabel
autopart --type=lvm

# Halt the system once configuration has finished.
poweroff

%packages
@core
%end


(2) Run the following virt-install command (as non-root if you wish):

virt-install --name=tmp \
    --ram=4096 \
    --arch=ppc64le \
    --machine=pseries \
    --cpu=POWER8 \
    --vcpus=1 \
    --os-type=linux \
    --os-variant=rhel7 \
    --initrd-inject=rhel.ks \
    --extra-args="ks=file:/rhel.ks console=tty0 console=ttyS0,115200 rd_NO_PLYMOUTH" \
    --disk=rhel.img,size=6,format=raw \
    --serial=pty \
    --nographics \
    --noreboot \
    --location=http://download.devel.redhat.com/released/RHEL-7/7.3/Server/ppc64le/os


(3) Boot the disk image using virtio-scsi:

qemu-system-ppc64 \
    -machine pseries \
    -cpu POWER8 -m 2048 \
    -device virtio-scsi-pci,id=scsi \
    -drive file=rhel.img,snapshot=on,format=raw,if=none,id=hd0 \
    -device scsi-hd,drive=hd0 -serial stdio


(4) You should now observe the error during the first boot when the guest tries to
find its root disks.


To clean up after the above steps, you will need to do:

virsh destroy tmp
virsh undefine tmp
rm rhel.img

----

(In reply to Harald Hoyer from comment #16)
> hmm, /usr/lib/modules.d/90kernel-modules/module-setup.sh has
> 
>         # install virtual machine support
>         instmods virtio virtio_blk virtio_ring virtio_pci virtio_scsi \
>             "=drivers/pcmcia" =ide "=drivers/usb/storage"
> 
> 
> So, if the drivers are loaded on kickstart/installation time, they should be
> included in the final initramfs.

(File name is in fact /usr/lib/dracut/modules.d/90kernel-modules/module-setup.sh)

Yes, we have that file in the guest I built above, and it contains that
fragment too.  It seems to be conditional on `[[ -z $drivers ]]' and
nothing else.

It can't be running correctly because I pulled the initramfs from the guest above
and it does not contain virtio_scsi.ko

However the initramfs does contain virtio_blk.ko, virtio.ko, virtio_pci.ko and virtio_ring.ko.
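
Note: a sketch of how this check could be scripted, assuming a file
listing of the initramfs is available on stdin (for example from
lsinitrd, which ships with dracut).  The function name is hypothetical.
It prints each requested module that has no matching .ko file in the
listing.

```shell
# Report which of the named kernel modules are absent from an initramfs listing.
# The listing is read from stdin; module names are given as arguments and
# should use underscores, matching the .ko file names (e.g. virtio_scsi).
missing_from_listing() {
    listing=$(cat)
    for module in "$@"; do
        # Module files may be compressed, so accept NAME.ko plus an optional suffix.
        if ! printf '%s\n' "$listing" | grep -Eq "/${module}\.ko(\.[a-z0-9]+)?\$"; then
            printf '%s\n' "$module"
        fi
    done
}
```

Example use:
lsinitrd /boot/initramfs-$KERNEL_VERSION.img | missing_from_listing virtio_blk virtio_scsi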

Comment 19 Richard W.M. Jones 2016-11-30 15:22:58 UTC
(In reply to Harald Hoyer from comment #17)
> If you don't want the "hostonly" feature:
> 
> 
> %packages
> @core
> dracut-config-generic
> %end
> 
> which then should generate an initramfs with _all_ the kernel modules (and
> not just those, which were loaded on installation time)

Good tip, virt-builder should switch to doing this.

Comment 20 Andrea Bolognani 2016-12-06 15:26:15 UTC
(In reply to Richard W.M. Jones from comment #18)
> I think what's needed first is a very clear and simple reproducer of
> the original bug.  This is the most minimal reproducer I could create:
> 
> (1) Create a file 'rhel.ks' containing:
> 
> install
> text
> reboot
> lang en_US.UTF-8
> keyboard us
> network --bootproto dhcp
> rootpw builder
> firewall --enabled --ssh
> timezone --utc America/New_York
> selinux --enforcing
> 
> bootloader --location=mbr --append="console=tty0 console=ttyS0,115200 rd_NO_PLYMOUTH"
> 
> zerombr
> clearpart --all --initlabel
> autopart --type=lvm
> 
> # Halt the system once configuration has finished.
> poweroff
> 
> %packages
> @core
> %end
> 
> 
> (2) Run the following virt-install command (as non-root if you wish):
> 
> virt-install --name=tmp \
>     --ram=4096 \
>     --arch=ppc64le \
>     --machine=pseries \
>     --cpu=POWER8 \
>     --vcpus=1 \
>     --os-type=linux \
>     --os-variant=rhel7 \
>     --initrd-inject=rhel.ks \
>     --extra-args="ks=file:/rhel.ks console=tty0 console=ttyS0,115200 rd_NO_PLYMOUTH" \
>     --disk=rhel.img,size=6,format=raw \
>     --serial=pty \
>     --nographics \
>     --noreboot \
>     --location=http://download.devel.redhat.com/released/RHEL-7/7.3/Server/ppc64le/os
> 
> 
> (3) Boot the disk image using virtio-scsi:
> 
> qemu-system-ppc64 \
>     -machine pseries \
>     -cpu POWER8 -m 2048 \
>     -device virtio-scsi-pci,id=scsi \
>     -drive file=rhel.img,snapshot=on,format=raw,if=none,id=hd0 \
>     -device scsi-hd,drive=hd0 -serial stdio
> 
> 
> (4) You should now observe the error during the first boot when the guest tries to
> find its root disks.

It works if you replace

  --disk=rhel.img,size=6,format=raw \

with

  --disk=rhel.img,size=6,format=raw,bus=scsi \
  --controller scsi,model=virtio-scsi \

in your virt-install command line: by default, the guest
will have a virtio-blk disk and dracut will only pick up
that one module, skipping virtio-scsi.

Comment 22 Lin Li 2018-08-15 03:50:23 UTC
I also hit the issue on an x86_64 server.
Here is the Beaker job: https://beaker.engineering.redhat.com/jobs/2690432
Here is console log: http://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2018/08/26904/2690432/5519832/console.log

Comment 25 RHEL Program Management 2021-01-15 07:28:40 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

