Bug 2210300 - [leapp] rhel-upgrade plugin returned a non-zero exit code during the real transaction
Keywords:
Status: ASSIGNED
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: leapp-repository
Version: 7.9
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Petr Stodulka
QA Contact: upgrades-and-conversions
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-05-26 13:56 UTC by Christophe Besson
Modified: 2023-08-16 11:34 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:



Links
System                               ID               Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker                OAMG-9169        0        None      None    None     2023-05-26 13:59:19 UTC
Red Hat Issue Tracker                RHELPLAN-158308  0        None      None    None     2023-05-26 13:59:06 UTC
Red Hat Knowledge Base (Solution)    5057391          0        None      None    None     2023-05-26 17:06:09 UTC
Red Hat Knowledge Base (Solution)    7016238          0        None      None    None     2023-05-31 13:57:40 UTC

Description Christophe Besson 2023-05-26 13:56:22 UTC
Description of problem:
The customer completed the upgrade step successfully. However, after rebooting into the RHEL-UpgradeInitramfs entry to perform the real upgrade, the DNF transaction failed and the system dropped to the emergency shell.

The system has not been altered.

Version-Release number of selected component (if applicable):
leapp-upgrade-el7toel8-0.18.0-1.el7_9.noarch

How reproducible:
Always

Steps to Reproduce:
1. Use a layout similar to the customer's: enough space in /, /var and /var/lib/leapp, but a small /usr.
2. leapp upgrade
3. reboot

Actual results:

[    3.066611] localhost kernel: EXT4-fs (dm-4): mounted filesystem with ordered data mode. Opts: (null)
 :
[    3.285717] localhost kernel: EXT4-fs (dm-17): mounted filesystem with ordered data mode. Opts: (null)
 :
[  143.424238] localhost upgrade[2074]: Total size: 1.5 G
 :
[  173.061055] localhost upgrade[1032]: 2023-05-25 12:11:25.369592 [ERROR] Actor: dnf_upgrade_transaction
[  173.061055] localhost upgrade[1032]: Message: There is not enough space on the file system hosting /var/lib/leapp directory to extract the packages.

Additional info:
* dm-4 is /var and dm-17 is /var/lib/leapp.

* df excerpt
Filesystem                               1K-blocks       Used  Available Use% Mounted on
/dev/mapper/vg00-usr                       7093752    3921376    2820460  59% /usr
/dev/mapper/vg_satellite-lv_leapp         51474912    3880820   44956268   8% /var/lib/leapp

* The issue was not detected during the preupgrade or upgrade steps.

* Leapp incorrectly reports that there is not enough space on the file system hosting /var/lib/leapp, whereas that file system is about 50G in size and has roughly 43G available according to the df excerpt. The error message is the same as the one in userspacegen.py, but the context is different: here it comes from system_upgrade/common/libraries/dnfplugin.py, executed from the initramfs.

* From my understanding, the DNF transaction pre-checks either do not estimate the installed size of the RPMs (after extraction) or fail to do so correctly.

* Maybe add an actor that checks the installed size (rpm -qi) of all downloaded RPMs from /var/lib/el?userspace.

* Attaching the leapp.db from the sosreport (taken before the reboot) and the rdsosreport.
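
The installed-size check suggested in the list above could be sketched roughly as follows. This is not leapp code and not an existing actor: the function names, the sample file list and the free space given for / are hypothetical, and only the /usr and /var/lib/leapp figures come from the df excerpt. The point is that a single total (as reported by rpm -qi) is not enough; the sizes have to be attributed to the mount point each file will land on. The (path, size) pairs could be collected with, for example, rpm -qp --queryformat '[%{FILENAMES} %{FILESIZES}\n]'.

```python
def mount_point_for(path, mount_points):
    """Return the longest mount point that contains `path` (falls back to "/")."""
    best = "/"
    for mp in mount_points:
        if mp == "/":
            continue
        inside = path == mp or path.startswith(mp.rstrip("/") + "/")
        if inside and len(mp) > len(best):
            best = mp
    return best


def required_kib_per_mount(files, mount_points):
    """Sum installed file sizes (bytes) per mount point, rounded up to KiB."""
    totals = {}
    for path, size in files:
        mp = mount_point_for(path, mount_points)
        totals[mp] = totals.get(mp, 0) + size
    return {mp: -(-total // 1024) for mp, total in totals.items()}


def shortfalls(required_kib, available_kib):
    """Return {mount point: missing KiB} for every mount that is too small."""
    return {
        mp: need - available_kib.get(mp, 0)
        for mp, need in required_kib.items()
        if need > available_kib.get(mp, 0)
    }


# Hypothetical input: three files from the packages to be installed.
MOUNTS = ["/", "/usr", "/var", "/var/lib/leapp"]
FILES = [
    ("/usr/lib64/libexample.so", 3 * 1024 ** 3),  # 3 GiB landing on /usr
    ("/var/lib/leapp/example.db", 1024),
    ("/etc/example.conf", 100),
]
# /usr and /var/lib/leapp are taken from the df excerpt; "/" is made up.
AVAILABLE_KIB = {"/": 1000000, "/usr": 2820460, "/var/lib/leapp": 44956268}

required = required_kib_per_mount(FILES, MOUNTS)
print(shortfalls(required, AVAILABLE_KIB))
```

With these made-up sizes only /usr is flagged, even though /var/lib/leapp has plenty of room.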

Comment 5 Petr Stodulka 2023-05-26 14:39:33 UTC
Hi Chris, thank you for the report. I see the issue. It is caused by the solution we currently use to "emulate" some changes on existing partitions, which relies on overlayfs. Because the overlayfs upper and work directories are located in /var/lib/leapp, the calculation of the available space on all partitions is based on the size of the /var/lib/leapp partition. The problem is tricky because RPM does not provide information about the calculated required space unless it discovers that more space is actually needed.
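
A minimal sketch of the effect described in this comment, assuming (as overlayfs is generally understood to behave) that statvfs on an overlay mount reports the free space of the file system holding the upper directory. The two stand-in statvfs functions and the 3 GiB requirement are hypothetical; the free-space figures are the "Available" values from the df excerpt in the description.

```python
import os
from collections import namedtuple

# Only the two statvfs fields this sketch reads.
FakeStat = namedtuple("FakeStat", ["f_bavail", "f_frsize"])

# "Available" column (1K-blocks) from the df excerpt in this report.
DF_AVAILABLE_KIB = {"/usr": 2820460, "/var/lib/leapp": 44956268}


def available_kib(path, statvfs=os.statvfs):
    """Space available to unprivileged users on the file system of `path`."""
    st = statvfs(path)
    return st.f_bavail * st.f_frsize // 1024


def has_room(path, required_kib, statvfs=os.statvfs):
    """True if `required_kib` fits on the file system that `statvfs` reports."""
    return available_kib(path, statvfs) >= required_kib


def statvfs_real(path):
    """Stand-in for statvfs on the real mount point."""
    return FakeStat(DF_AVAILABLE_KIB[path], 1024)


def statvfs_through_overlay(path):
    """Stand-in for statvfs on an overlay whose upper dir is in /var/lib/leapp.

    Assumption: every overlaid path reports the /var/lib/leapp file system.
    """
    return FakeStat(DF_AVAILABLE_KIB["/var/lib/leapp"], 1024)


REQUIRED_KIB = 3 * 1024 * 1024  # hypothetical: 3 GiB of files destined for /usr

print(has_room("/usr", REQUIRED_KIB, statvfs_through_overlay))
print(has_room("/usr", REQUIRED_KIB, statvfs_real))
```

Under these assumptions the check made through the overlay passes, while the same check against the real /usr fails, which matches a problem that only shows up during the real transaction.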

This is related to https://bugzilla.redhat.com/show_bug.cgi?id=2134213: although that bug is different, it is also affected by this problem in some ways. The new solution we are currently investigating may cover this problem as well, so I am adding this bug to the current plans. We will also add a new known issue (KI) to the upgrade documentation.

