Bug 1714828 - F31 rawhide net install image fails using mbr on gpt disk with pre-existing ext4 partitions
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: anaconda
Version: 31
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Anaconda Maintenance Team
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-05-29 01:53 UTC by stan
Modified: 2020-11-24 15:19 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-11-24 15:19:59 UTC
Type: Bug
Embargoed:


Attachments
storage.log (454.36 KB, text/plain)
2019-05-30 20:15 UTC, stan
sensitive-info.log (107 bytes, text/plain)
2019-05-30 20:16 UTC, stan
program.log (164.84 KB, text/plain)
2019-05-30 20:16 UTC, stan
packaging.log (166.28 KB, text/plain)
2019-05-30 20:17 UTC, stan
hawkey.log (24.05 KB, text/plain)
2019-05-30 20:17 UTC, stan
dnf.librepo.log (585.15 KB, text/plain)
2019-05-30 20:18 UTC, stan
dbus.log (2.96 KB, text/plain)
2019-05-30 20:18 UTC, stan
anaconda.log (64.78 KB, text/plain)
2019-05-30 20:19 UTC, stan
X.log (35.37 KB, text/plain)
2019-05-30 20:20 UTC, stan
grub2-mount_strace.txt (17.50 KB, text/plain)
2019-05-30 23:17 UTC, stan
anaconda.log (25.64 KB, text/plain)
2019-06-16 13:50 UTC, stan
dbus.log (2.92 KB, text/plain)
2019-06-16 13:51 UTC, stan
dnf.librepo.log (183.27 KB, text/plain)
2019-06-16 13:52 UTC, stan
hawkey.log (11.36 KB, text/plain)
2019-06-16 13:52 UTC, stan
packaging.log (9.33 KB, text/plain)
2019-06-16 13:53 UTC, stan
program.log (118.05 KB, text/plain)
2019-06-16 13:53 UTC, stan
storage.log (434.66 KB, text/plain)
2019-06-16 13:54 UTC, stan
X.log (35.37 KB, text/plain)
2019-06-16 13:56 UTC, stan

Description stan 2019-05-29 01:53:04 UTC
Description of problem:
Using a burned image of the F31 netinstall iso, the install fails when trying to create an mbr record.


Version-Release number of selected component (if applicable):
Fedora-Server-netinst-x86_64-Rawhide-20190524.n.1.iso


How reproducible:
every time


Steps to Reproduce:
1.  GPT-formatted hard drive with an ext4 1 GiB /boot partition and a 250 GiB / partition
2.  disk contains an existing MBR-booted Fedora 28 on 2 identical partitions
3.  boot CD and install.

Actual results:
Everything works fine until it tries to create the mbr record.  Then it fails when running os-prober on the existing Fedora28 / partition.  It ran for 20 mins at ~100% CPU before I killed it.


Expected results:
Install successfully moves to completion.


Additional info:
After the failed install attempt, the existing Fedora28 no longer booted but dropped directly to a grub prompt.

# gdisk -l /dev/sda
GPT fdisk (gdisk) version 1.0.4

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/sda: 5860533168 sectors, 2.7 TiB
Model: TOSHIBA DT01ACA3
Sector size (logical/physical): 512/4096 bytes
Disk identifier (GUID): 363C9B1D-DBA5-41AE-A74B-6DD5F974ED6D
Partition table holds up to 128 entries
Main partition table begins at sector 2 and ends at sector 33
First usable sector is 34, last usable sector is 5860533134
Partitions will be aligned on 2048-sector boundaries
Total free space is 2093933 sectors (1022.4 MiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1         2097152         4194303   1024.0 MiB  8300  
   2         4194304         6291455   1024.0 MiB  0700  
   3         6291456        48234495   20.0 GiB    8200  
   4        48234496       572522495   250.0 GiB   8300  
   5       572522496      1096810495   250.0 GiB   0700  
   6      1096810496      5860532223   2.2 TiB     0700  
   7            2048            6143   2.0 MiB     EF02
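
For reference, the sizes in the table follow from the sector ranges: each partition spans (end - start + 1) logical sectors of 512 bytes. The sketch below recomputes them from the rows above and labels the gdisk type codes. It is only an illustration: the `GDISK_TYPES` table is a hand-written subset covering the codes seen in this report, and `parse_rows` is a hypothetical helper, not part of gdisk or anaconda.

```python
SECTOR_BYTES = 512  # logical sector size reported by gdisk above

# Subset of gdisk type codes, limited to those that appear in this report.
GDISK_TYPES = {
    "8300": "Linux filesystem",
    "8200": "Linux swap",
    "0700": "Microsoft basic data",
    "EF02": "BIOS boot partition",
}

# Partition rows copied from the gdisk listing in this report.
GDISK_OUTPUT = """\
Number  Start (sector)    End (sector)  Size       Code  Name
   1         2097152         4194303   1024.0 MiB  8300
   2         4194304         6291455   1024.0 MiB  0700
   3         6291456        48234495   20.0 GiB    8200
   4        48234496       572522495   250.0 GiB   8300
   5       572522496      1096810495   250.0 GiB   0700
   6      1096810496      5860532223   2.2 TiB     0700
   7            2048            6143   2.0 MiB     EF02
"""


def parse_rows(text):
    """Return (number, size_in_bytes, type_code) for each partition row."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and anything that is not a numbered partition row.
        if len(fields) < 6 or not fields[0].isdigit():
            continue
        number, start, end = int(fields[0]), int(fields[1]), int(fields[2])
        size_bytes = (end - start + 1) * SECTOR_BYTES  # range is inclusive
        rows.append((number, size_bytes, fields[5]))
    return rows


for number, size_bytes, code in parse_rows(GDISK_OUTPUT):
    label = GDISK_TYPES.get(code, "unknown")
    print(f"sda{number}: {size_bytes / 2**30:10.3f} GiB  {code}  {label}")
```

Note that partitions 2, 5 and 6 carry code 0700 rather than 8300, which is the non-standard typing discussed later in this report.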

os-prober as run from grub2-mkconfig from an old F25 install was able to successfully probe and create boot stanzas for the existing Fedora28.  I was then able to boot it successfully from that grub.cfg.

Comment 1 stan 2019-05-29 01:54:46 UTC
Forgot to mention that I used custom configuration for the hard drive configuration, and told it to reformat the partitions.

Comment 2 stan 2019-05-29 02:14:51 UTC
I just looked on the root partition to see if there were any logs from the install, but they aren't there.  So, unless I run this again, I only have the memory of what I saw on the screen before I bailed out.

Comment 3 stan 2019-05-29 18:32:14 UTC
After another failed try, I wrote down some of the errors.

On F1,
"""
Exception ignored when trying to write to signal wakeup fd
BlockingIOError: [Errno 11] Resource temporarily unavailable
"""
fill the page,
except the last 5 entries are
"""
[rsvg_internals/src/svg.rs:84] &self.loadoptions.base_vol = None
"""

On F4,
"""
Notice root: 50mounted-tests: debug: running subtest /usr/libexec/os-probes/mounted/90linux-distro
"""
This is after it has been probing /dev/sda5, the existing Fedora installation on the drive.  I notice that os-prober discovers early on that the partition is ext2 (actually ext4), but continues probing it even after discovering that.  Not very efficient.

Comment 4 Vendula Poncova 2019-05-30 09:40:24 UTC
Please attach the logs from the installation. During the installation, you can find them in /tmp/*log.

Comment 5 stan 2019-05-30 15:27:08 UTC
Developments:

I deleted the mbr from the disk using gdisk, and reset all the types to 8300 (linux filesystem), and the install succeeded (a minimal install).  It appears to have installed as UEFI using BLS since there are entries in /boot/loader/entries, but the reboot failed.  I'll continue looking into that until I can boot successfully.

# fdisk -l /dev/sda
Disk /dev/sda: 2.7 TiB, 3000592982016 bytes, 5860533168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 363C9B1D-DBA5-41AE-A74B-6DD5F974ED6D

Device          Start        End    Sectors  Size Type
/dev/sda1     2097152    4194303    2097152    1G Linux filesystem
/dev/sda2     4194304    6291455    2097152    1G Linux filesystem
/dev/sda3     6291456   48234495   41943040   20G Linux swap
/dev/sda4    48234496  572522495  524288000  250G Linux filesystem
/dev/sda5   572522496 1096810495  524288000  250G Linux filesystem
/dev/sda6  1096810496 5860532223 4763721728  2.2T Linux filesystem
/dev/sda8        6144    2097151    2091008 1021M EFI System

Partition table entries are not in disk order.

So, I have no way of getting logs at this point.  For future reference, how would I save the logs?  I have no system at that point, except that provided by the install environment.  Is it possible to access hard drives from there?  Would that be mounting them to /mnt/sysimage as in a rescue disk?  

Anyway, you might as well close this as notabug.  It seems the installer became confused by the non-standard type on the existing Fedora partitions, likely to be a rare situation.

Comment 6 stan 2019-05-30 20:15:57 UTC
Created attachment 1575380 [details]
storage.log

Comment 7 stan 2019-05-30 20:16:22 UTC
Created attachment 1575381 [details]
sensitive-info.log

Comment 8 stan 2019-05-30 20:16:49 UTC
Created attachment 1575382 [details]
program.log

Comment 9 stan 2019-05-30 20:17:27 UTC
Created attachment 1575383 [details]
packaging.log

Comment 10 stan 2019-05-30 20:17:51 UTC
Created attachment 1575384 [details]
hawkey.log

Comment 11 stan 2019-05-30 20:18:22 UTC
Created attachment 1575385 [details]
dnf.librepo.log

Comment 12 stan 2019-05-30 20:18:48 UTC
Created attachment 1575386 [details]
dbus.log

Comment 13 stan 2019-05-30 20:19:45 UTC
Created attachment 1575387 [details]
anaconda.log

Comment 14 stan 2019-05-30 20:20:08 UTC
Created attachment 1575388 [details]
X.log

Comment 15 stan 2019-05-30 20:21:12 UTC
Did another install, and this one failed.  So I am providing logs from the failure.

Comment 16 stan 2019-05-30 23:17:13 UTC
Created attachment 1575440 [details]
grub2-mount_strace.txt

The install is hanging in grub2-mount.  This is an strace of the program when it is stuck in a loop and consuming ~100% of cpu.
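
A hang like this can be bounded when reproducing it by hand. The following is a minimal sketch, assuming the probing command is launched manually from a shell in the install environment; `run_with_timeout` is a hypothetical helper and is not something anaconda or os-prober provides.

```python
import subprocess
import sys


def run_with_timeout(argv, seconds):
    """Run argv and return its exit code, or None if it exceeded the limit.

    subprocess.run() kills the child when the timeout expires, so a command
    stuck in a busy loop does not keep consuming CPU afterwards.
    """
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=seconds,
        )
    except subprocess.TimeoutExpired:
        return None
    return proc.returncode


# Example call (commented out; the command and the limit are assumptions):
# result = run_with_timeout(["os-prober"], 120)
# print("hung" if result is None else f"exit code {result}")
```

A return value of None would indicate that the command was still running when the limit was reached, which matches the symptom described here.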

Comment 17 stan 2019-06-07 15:53:55 UTC
An update on this.  Neither a UEFI nor an MBR install with the Server netinstall (base) image completes successfully.  They both hang at the "installing boot record" stage.  I tried the latest available, 20190604, and it also has this problem.  I then downloaded the Fedora30 server netinstall base version, and it successfully installed the minimal version as UEFI.

That means that I could just flip the repositories to rawhide and no longer need to use the rawhide version, but I think I will continue testing candidates from rawhide to see if they install successfully.

Comment 18 Chris Murphy 2019-06-08 03:34:30 UTC
1. The program.log ends abruptly at
12:59:07,844 INF program: Running in chroot '/mnt/sysimage'... grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg

2. efibootmgr succeeds so this is definitely UEFI boot

3. storage.log says /dev/sda has GPT; /dev/sdb has MBR; /dev/sdc has GPT.

4. anaconda.log says

12:59:07,138 INF bootloader.installation: boot loader stage1 target device is sda7
12:59:07,138 INF bootloader.installation: boot loader stage2 target device is sda1

I don't understand that. On UEFI there is no distinction between stage1 and stage2, so shouldn't they be the same?

sda7 is EFI System partition
sda1 is ext4 boot mounted at /boot

Comment 19 stan 2019-06-16 13:49:19 UTC
Made another attempt with the 20190609 image.  Fails with modular repo errors about tokei and libgit2.  I'll attach the logs.  I'm using a different disk, since I bumped an F30 netinstall to F31 on the other disk, and it is up and functioning well, except that pulseaudio doesn't work, or build, there.

Comment 20 stan 2019-06-16 13:50:47 UTC
Created attachment 1581125 [details]
anaconda.log

Comment 21 stan 2019-06-16 13:51:40 UTC
Created attachment 1581126 [details]
dbus.log

Comment 22 stan 2019-06-16 13:52:11 UTC
Created attachment 1581127 [details]
dnf.librepo.log

Comment 23 stan 2019-06-16 13:52:45 UTC
Created attachment 1581128 [details]
hawkey.log

Comment 24 stan 2019-06-16 13:53:16 UTC
Created attachment 1581129 [details]
packaging.log

Comment 25 stan 2019-06-16 13:53:42 UTC
Created attachment 1581130 [details]
program.log

Comment 26 stan 2019-06-16 13:54:40 UTC
Created attachment 1581131 [details]
storage.log

Comment 27 stan 2019-06-16 13:56:43 UTC
Created attachment 1581132 [details]
X.log

Comment 28 stan 2019-06-16 13:59:09 UTC
A request: would it be possible to have gpm installed on the netinstall image so the mouse could be used to cut and paste things in the terminals during install?

Comment 29 Samuel Sieb 2019-06-17 05:38:46 UTC
If you add "inst.sshd" to the boot command line, you will be able to ssh to the system.  The default is root with no password, but there is an option to set a password as well if you want.

Comment 30 stan 2019-06-17 15:55:38 UTC
I think that requires another system to ssh from, and I don't have that.  I was able to use the existing terminals to extract the information in the logs from /mnt/sysimage, and put it somewhere safe by mounting an existing filesystem on /mnt.  gpm would just make that easier, since double click, middle click pastes at the cursor.  Dealing with qwerty is annoying, since it is hunt and peck for me at this point, so I want to avoid as much typing as possible.
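
The log-saving step described above can be sketched as follows. This is a hedged illustration only: it assumes a writable filesystem has already been mounted, takes the log location from the `/tmp/*log` path given in Comment 4, and uses a hypothetical helper name and destination directory.

```python
import glob
import os
import shutil


def save_install_logs(dest_dir, log_glob="/tmp/*log"):
    """Copy installer logs matching log_glob into dest_dir.

    Returns the sorted list of file names that were copied.
    """
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for path in sorted(glob.glob(log_glob)):
        if os.path.isfile(path):
            shutil.copy2(path, dest_dir)  # copy2 also preserves timestamps
            copied.append(os.path.basename(path))
    return copied


# Example call (commented out; the destination path is an assumption):
# print(save_install_logs("/mnt/saved-logs"))
```

The files have to be copied before rebooting, since /tmp in the install environment does not survive a restart.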

Comment 31 stan 2019-06-27 19:51:35 UTC
I just tried Fedora-Server-netinst-x86_64-Rawhide-20190625.n.0.iso.  Still failing at installing boot loader step.  I'm going to try altering my procedure to see if I can get it to install, so I'm not attaching logs yet.

Comment 32 stan 2019-06-28 23:03:16 UTC
I just did a comparison of installation.py between F30 that works, and F31 that fails.

$ diff anaconda-30.25.6/pyanaconda/bootloader/installation.py anaconda-31.18/pyanaconda/bootloader/installation.py
24a25
> from pyanaconda.core.util import decode_bytes
155c156
<     kernel = h.name.decode()
---
>     kernel = decode_bytes(h.name)

The difference seems to be the use of decode_bytes in F31.

When I look at that function in anaconda-31.18/pyanaconda/core/util.py,
I find that it is interpreting strings as utf-8.  But, EFI uses iso-8859, and when I tried to use utf-8, it failed with the error.  Could that be the reason it is failing to install the bootloader?  I'm trying to understand how F30 could function flawlessly, and F31 fails / hangs when trying to install the bootloader.

def decode_bytes(data):
    """Decode the given bytes.

    Return the given string or a string decoded from the given bytes.

    :param data: bytes or a string
    :return: a string
    """
    if isinstance(data, str):
        return data

    if isinstance(data, bytes):
        return data.decode('utf-8')

    raise ValueError("Unsupported type '{}'.".format(type(data).__name__))

There are quite a lot of other differences in pyanaconda/core/util.py between F30 and F31, but this seems to be the only difference between F30 and F31 that matters to installation.
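
One way to test this hypothesis is to compare the two lines from the diff directly. In Python 3, `bytes.decode()` called with no arguments also defaults to UTF-8, so for bytes input the F30 and F31 versions would be expected to behave the same; the visible difference is that `decode_bytes` additionally accepts a `str`. The sketch below checks this with made-up kernel names (the sample values and the `f30_decode` and `try_decode` helpers are assumptions for illustration, not anaconda code).

```python
def decode_bytes(data):
    """Copy of the F31 helper quoted above."""
    if isinstance(data, str):
        return data

    if isinstance(data, bytes):
        return data.decode('utf-8')

    raise ValueError("Unsupported type '{}'.".format(type(data).__name__))


def f30_decode(name):
    """Equivalent of the F30 line `h.name.decode()`."""
    return name.decode()  # the default encoding is 'utf-8'


def try_decode(func, value):
    """Return the decoded string, or the name of the exception raised."""
    try:
        return func(value)
    except (UnicodeDecodeError, AttributeError, ValueError) as exc:
        return type(exc).__name__


# Hypothetical inputs: plain ASCII bytes, a str, and bytes that are valid
# ISO-8859-1 but not valid UTF-8.
samples = [b"vmlinuz-5.2.0", "vmlinuz-5.2.0", b"caf\xe9"]

for value in samples:
    print(repr(value), try_decode(f30_decode, value), try_decode(decode_bytes, value))
```

If both functions fail in the same way on non-UTF-8 bytes, the `decode_bytes` change alone would be an unlikely cause of the hang, and the strace of grub2-mount remains the stronger lead.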

Comment 33 Chris Murphy 2019-06-29 00:01:50 UTC
I think it very well could explain the failure, but it's still speculation as we don't see an actual error in any of the logs. Make sure to provide either the full dmesg or, possibly better, the output of 'journalctl -b -o short-monotonic > journalfull.log' run as root, which will interleave anaconda messages with kernel messages. I'm not certain it'll contain anything new, but I'd like to think any firmware problem will cause a kernel message.

Comment 34 Ben Cotton 2019-08-13 16:47:15 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 31 development cycle.
Changing version to '31'.

Comment 35 Ben Cotton 2020-11-03 15:14:46 UTC
This message is a reminder that Fedora 31 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 31 on 2020-11-24.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '31'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 31 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 36 Ben Cotton 2020-11-24 15:19:59 UTC
Fedora 31 changed to end-of-life (EOL) status on 2020-11-24. Fedora 31 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

