In Rawhide, toolbox is now complaining:
liveuser@localhost-live:~$ toolbox enter
Error: failed to get the Podman version
liveuser@localhost-live:~$ rpm -q podman
podman-5.0.3-1.fc41.x86_64
liveuser@localhost-live:~$ podman --version
podman version 5.0.3
liveuser@localhost-live:~$ toolbox create
Error: failed to get the Podman version
liveuser@localhost-live:~$ rpm -q toolbox conmon
toolbox-0.0.99.5-11.fc41.x86_64
conmon-2.1.10-1.fc41.x86_64
liveuser@localhost-live:~$ toolbox
Error: failed to get the Podman version
I think it worked for me last month or so.
Reproducible: Always
Steps to Reproduce:
1. install rawhide (or boot Live)
2. start terminal
3. run toolbox
Actual Results: Error: failed to get the Podman version
Expected Results: No error
Was working until recently
rishi is away so let's reassign this to podman first. Any idea what could have broken toolbox?
I don't see this on the podman-next copr, which uses podman built from main. Could you retry this with podman 5.2.0-rc2, which should soon enter rawhide? https://bodhi.fedoraproject.org/updates/FEDORA-2024-7eb94815d9
Hmm, I don't see it with podman 5.0.3 on my local rawhide either. I'll try on a vm.
Don't see it on a fresh rawhide vm either:
[lsm5@fedora ~]$ toolbox enter
⬢[lsm5@toolbox ~]$ exit
logout
[lsm5@fedora ~]$ rpm -q podman toolbox containers-common conmon
podman-5.0.3-1.fc41.x86_64
toolbox-0.0.99.5-14.fc41.x86_64
containers-common-0.59.2-2.fc41.noarch
conmon-2.1.12-2.fc41.x86_64
Wonder if your conmon being slightly older has anything to do with it.
Thank you for the quick responses, Lokesh.
Okay, it is certainly reproducible in a Live instance, though I am not sure that qualifies as a Blocker.
It used to work fine in Live, so something has definitely changed.
Same versions as you quoted:
liveuser@localhost-live:~$ rpm -q podman toolbox containers-common conmon
podman-5.0.3-1.fc41.x86_64
toolbox-0.0.99.5-14.fc41.x86_64
containers-common-0.59.2-2.fc41.noarch
conmon-2.1.12-2.fc41.x86_64
You are right, though: the issue does not happen in a Rawhide Workstation installation. Moving to Freeze Exception then.
(In reply to Jens Petersen from comment #5)
> Okay, it is certainly reproducible in a Live instance, though that might not
> qualify as a Blocker perhaps, not sure.
> It used to work fine in Live, so something has definitely changed.
> Same versions as you quoted:
>
Isn't this usually caught in the openqa tests on bodhi?
It usually is, and it was. The test was skipped due to deps in the rawhide compose for the 22nd only: https://openqa.fedoraproject.org/tests/2740533#dependencies
The openQA run for the 23rd looks fine: https://openqa.fedoraproject.org/tests/2743101
The 25th is good too: https://openqa.fedoraproject.org/tests/2747595
I will test this on today's rawhide and post my findings here.
We don't test toolbox in a live session in openQA. We test it after install from live images.
Discussed during the 2024-08-19 blocker review meeting: [1]
The decision to classify this bug as an AcceptedFreezeException (Beta) was made:
"This is an annoying problem that cannot be fixed with an update, as it affects live sessions."
[1] https://meetbot.fedoraproject.org/blocker-review_matrix_fedoraproject-org/2024-08-19/f41-blocker-review.2024-08-19-15.59.log.html
Now getting this error too in my F40 host via toolbox ;-(
$ rpm -q podman conmon containers-common toolbox
podman-5.2.2-1.fc40.x86_64
conmon-2.1.12-2.fc40.x86_64
containers-common-0.60.1-1.fc40.noarch
toolbox-0.0.99.5-11.fc40.x86_64
$ toolbox list
Error: failed to get the Podman version
(In reply to Jens Petersen from comment #10)
> Now getting this error too in my F40 host via toolbox ;-(
Sorry, this is incorrect, or rather it was due to user error on my part (I had a faulty podman wrapper script).
Nevertheless, this error seems quite vague and confusing: having a more specific error message would be great.
(In reply to Jens Petersen from comment #0)
> In Rawhide, toolbox is now complaining:
>
> liveuser@localhost-live:~$ toolbox enter
> Error: failed to get the Podman version
Are you still seeing this error?
It's coming from running 'podman version --format json' and trying to parse the Podman version out of that JSON. I can't immediately imagine a reason for this to happen, unless the structure of the JSON has changed in a way that breaks the parsing. You can try to use --verbose with the toolbox(1) invocations to see more.
The Podman version is read before most Toolbx commands to decide whether it's necessary to invoke 'podman system migrate'. So, it's quite fundamental and should affect any Toolbx testing. However, the Toolbx test suite seems to be humming along without any hiccups.
Here are the periodic runs of the test suite on Fedora nodes: https://softwarefactory-project.io/zuul/t/local/builds?project=containers%2Ftoolbox&pipeline=periodic
Here is a recent pull request: https://github.com/containers/toolbox/pull/1560
Here is the Bodhi update for the latest Rawhide build that ran the same test suite: https://bodhi.fedoraproject.org/updates/FEDORA-2024-0dbfa3767f
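For anyone who wants to check the parsing step by hand, here is a minimal sketch of it in Python. It is not Toolbx's actual code (Toolbx is written in Go). It assumes that the JSON printed by 'podman version --format json' carries the version under Client.Version, with a top-level Version key as a fallback; parse_podman_version is a hypothetical helper name.

```python
import json


def parse_podman_version(json_text: str) -> str:
    """Extract the Podman version from `podman version --format json` output.

    Assumption: the version lives at Client.Version, or at a top-level
    Version key when there is no Client object.
    """
    try:
        data = json.loads(json_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"output is not valid JSON: {exc}") from exc

    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    client = data.get("Client")
    source = client if isinstance(client, dict) else data
    version = source.get("Version")
    if not isinstance(version, str) or not version:
        raise ValueError("no Version field found in the JSON output")
    return version
```

Note that empty output, which is what a caller gets on stdout when podman(1) fails before printing anything, is rejected as invalid JSON rather than being mistaken for a format change.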
(In reply to Debarshi Ray from comment #12)
> (In reply to Jens Petersen from comment #0)
> > In Rawhide, toolbox is now complaining:
> >
> > liveuser@localhost-live:~$ toolbox enter
> > Error: failed to get the Podman version
>
> Are you still seeing this error?
Yes, I do in Live.
It is very easy to reproduce: boot live and run any toolbox command.
> It's coming from running 'podman version --format json' and trying to parse
> the Podman version out of that JSON. I can't immediately imagine a reason
> for this to happen, unless the structure of the JSON has changed in a way
> that it breaks the parsing. You can try to use --verbose with the
> toolbox(1) invocations to see more.
liveuser@localhost-live:~$ toolbox --verbose
DEBU Running as real user ID 1000
DEBU Resolved absolute path to the executable as /usr/bin/toolbox
DEBU Running on a cgroups v2 host
DEBU Looking up sub-GID and sub-UID ranges for user liveuser
DEBU TOOLBX_DELAY_ENTRY_POINT is
DEBU TOOLBX_FAIL_ENTRY_POINT is
DEBU TOOLBOX_PATH is /usr/bin/toolbox
DEBU Migrating to newer Podman
DEBU Toolbx config directory is /home/liveuser/.config/toolbox
Error: configure storage: 'overlay' is not supported over overlayfs, a mount_program is required: backing file system is unsupported for this graph driver
DEBU Migrating to newer Podman: failed to get the Podman version: failed to invoke podman(1)
Error: failed to get the Podman version
Anyway thanks, I see now that this appears to be a podman issue:
liveuser@localhost-live:~$ podman version
Error: configure storage: 'overlay' is not supported over overlayfs, a mount_program is required: backing file system is unsupported for this graph driver
(Could toolbox print a better error message with the unparsed version string to help diagnose this kind of issue?
I could gladly open an RFE issue for that.)
> However, the Toolbx test suite seems to be humming along without any hiccups.
I think one of the blindspots/weaknesses of the CI is that it only tests newly created toolboxes? Though that is not the original problem here.
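To illustrate the kind of error message requested above, here is a rough sketch of a command wrapper that keeps the child's stderr and puts it into the error it raises. It is only an illustration of the idea, not how Toolbx invokes podman(1); run_and_report and CommandError are hypothetical names.

```python
import subprocess
from typing import Sequence


class CommandError(RuntimeError):
    """Raised when a wrapped command exits with a non-zero status."""


def run_and_report(argv: Sequence[str]) -> str:
    """Run argv and return its stdout.

    On failure, raise CommandError with the exit status and whatever the
    command wrote to stderr, so the underlying cause is not hidden.
    """
    result = subprocess.run(list(argv), capture_output=True, text=True)
    if result.returncode != 0:
        detail = result.stderr.strip() or "(no output on stderr)"
        raise CommandError(
            f"{argv[0]} exited with status {result.returncode}: {detail}"
        )
    return result.stdout
```

Calling run_and_report(["podman", "version", "--format", "json"]) in the live session would then raise an error whose text includes the "configure storage" message from podman(1), instead of only reporting that the version could not be obtained.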
(In reply to Jens Petersen from comment #13)
> (In reply to Debarshi Ray from comment #12)
> > (In reply to Jens Petersen from comment #0)
> > > In Rawhide, toolbox is now complaining:
> > >
> > > liveuser@localhost-live:~$ toolbox enter
> > > Error: failed to get the Podman version
> >
> > Are you still seeing this error?
>
> Yes, I do in Live
>
> It is very easy to reproduce: boot live and run any toolbox command.
>
> > It's coming from running 'podman version --format json' and trying to parse
> > the Podman version out of that JSON. I can't immediately imagine a reason
> > for this to happen, unless the structure of the JSON has changed in a way
> > that it breaks the parsing. You can try to use --verbose with the
> > toolbox(1) invocations to see more.
>
> liveuser@localhost-live:~$ toolbox --verbose
> DEBU Running as real user ID 1000
> DEBU Resolved absolute path to the executable as /usr/bin/toolbox
> DEBU Running on a cgroups v2 host
> DEBU Looking up sub-GID and sub-UID ranges for user liveuser
> DEBU TOOLBX_DELAY_ENTRY_POINT is
> DEBU TOOLBX_FAIL_ENTRY_POINT is
> DEBU TOOLBOX_PATH is /usr/bin/toolbox
> DEBU Migrating to newer Podman
> DEBU Toolbx config directory is /home/liveuser/.config/toolbox
> Error: configure storage: 'overlay' is not supported over overlayfs, a
> mount_program is required: backing file system is unsupported for this graph
> driver
>
> [...]
>
> liveuser@localhost-live:~$ podman version
> Error: configure storage: 'overlay' is not supported over overlayfs, a
> mount_program is required: backing file system is unsupported for this graph
> driver
The cause for this is the same as the ones in:
https://discussion.fedoraproject.org/t/rpm-ostree-update-breaks-toolbox-fedora-40
https://github.com/containers/common/pull/2203
https://github.com/containers/toolbox/issues/1512
... but the solution might have to be different.
In those other bugs, old Toolbx containers that were created with fuse-overlayfs(1) don't start when it's no longer available. In this case with the Fedora live media, fuse-overlayfs(1) is not there, and the Linux kernel's overlay file system can't use another overlay file system to back it up.
I don't know if we can pull fuse-overlayfs(1) into the live media without installing it onto the target system. The other option would be to reinstate the fuse-overlayfs dependency in containers-common. Related:
https://src.fedoraproject.org/rpms/containers-common/pull-request/38
https://src.fedoraproject.org/rpms/containers-common/pull-request/39
https://src.fedoraproject.org/rpms/containers-common/pull-request/40
> (Could toolbox print a better error message with the unparsed version string
> to help diagnose this kind of issue?
> I could open a RFE issue for that gladly.)
If we are to deprecate fuse-overlayfs(1), then we need to actually detect the situation in Toolbx and offer users a migration path: https://github.com/containers/toolbox/issues/1512
> > However, the Toolbx test suite seems to be humming along without any hiccups.
>
> I think one of the blindspots/weaknesses of the CI is that it only tests
> newly created toolboxes?
Yes, exactly.
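As a rough illustration of what detecting the situation could look like, the sketch below checks whether a storage directory sits on an overlay mount and whether a fuse-overlayfs binary is on PATH. This is an assumption-laden sketch, not what Toolbx or Podman do: the helper names are hypothetical, the storage path is passed in by the caller, and mount points with escaped characters in /proc/mounts are not handled.

```python
import os
import shutil
from typing import Optional


def filesystem_type(path: str, mounts_text: str) -> Optional[str]:
    """Return the file system type of the mount that contains path.

    mounts_text is the content of /proc/mounts. The longest matching mount
    point wins; for equal mount points the later entry wins, because it
    shadows the earlier one.
    """
    path = os.path.normpath(path)
    best_len, best_type = -1, None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix) and len(mount_point) >= best_len:
            best_len, best_type = len(mount_point), fs_type
    return best_type


def backing_is_overlay(storage_dir: str, mounts_text: str) -> bool:
    """True if storage_dir lives on a kernel overlay mount."""
    return filesystem_type(storage_dir, mounts_text) == "overlay"


def explain(storage_dir: str, mounts_text: str) -> str:
    """Describe, in one sentence, whether a mount program seems to be needed."""
    if not backing_is_overlay(storage_dir, mounts_text):
        return "backing file system is not overlay"
    if shutil.which("fuse-overlayfs") is None:
        return "backing file system is overlay and fuse-overlayfs is not installed"
    return "backing file system is overlay; fuse-overlayfs is available"
```

On a real system the second argument would be the text read from /proc/mounts, and the first would be the user's container storage directory.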
Proposed as a Freeze Exception for 41-final by Fedora user rishi using the blocker tracking app because: Both `podman(1)` and `toolbox(1)` are broken on the Fedora 41 live media.
(In reply to Debarshi Ray from comment #14)
> The cause for this is the same as the ones in:
> https://discussion.fedoraproject.org/t/rpm-ostree-update-breaks-toolbox-fedora-40
> https://github.com/containers/common/pull/2203
> https://github.com/containers/toolbox/issues/1512
>
> ... but the solution might have to be different.
>
> In those other bugs, old Toolbx containers that were created with
> fuse-overlayfs(1) don't start when it's no longer available.
I filed bug 2319121 for that, so as not to confuse the discussion this late in the Fedora 41 cycle. I am working on builds that add 'Recommends: fuse-overlayfs' to toolbox to address it in the short term.
That might also address this bug, because toolbox is installed by default on Fedora Workstation and is present on the live media, and Fedora Silverblue doesn't have live media. I don't know about the other Fedora variants.
Wouldn't it be good to use this Exception here, to make sure the Recommends ends up in the GA release?
(In reply to Debarshi Ray from comment #14)
> The other option would be reinstate the fuse-overlayfs dependency in
> containers-common. Related:
> https://src.fedoraproject.org/rpms/containers-common/pull-request/38
> https://src.fedoraproject.org/rpms/containers-common/pull-request/39
> https://src.fedoraproject.org/rpms/containers-common/pull-request/40
Okay, I see your PRs were already merged, but not built yet.
(In reply to Jens Petersen from comment #18)
> (In reply to Debarshi Ray from comment #14)
> > The other option would be reinstate the fuse-overlayfs dependency in
> > containers-common. Related:
> > https://src.fedoraproject.org/rpms/containers-common/pull-request/38
> > https://src.fedoraproject.org/rpms/containers-common/pull-request/39
> > https://src.fedoraproject.org/rpms/containers-common/pull-request/40
>
> Okay I see you PR's were already merged, but not built yet.
Note that these pull requests are for Fedora 40 and older. The RPM spec file is maintained upstream and synchronized downstream. That's why there are pull requests for all supported Fedora branches. The change itself is conditional on Fedora 40 and older.
Here is the upstream pull request: https://github.com/containers/common/pull/2203
This is the Fedora 40 update - karma welcome: https://bodhi.fedoraproject.org/updates/FEDORA-2024-d45f40439b
(In reply to Jens Petersen from comment #17)
> Wouldn't it be good to use this Exception here, to make sure the Recommends
> ends up in the GA release?
There are multiple overlapping and related issues arising from dropping fuse-overlayfs:
(a) podman(1) and toolbox(1) not working on the Fedora 41 live media
(b) backwards compatibility with old containers on Fedora 41 onwards
(c) it's even been pointed out in the context of (b) that fuse-overlayfs can be better for rootless containers under some conditions: https://github.com/coreos/fedora-coreos-tracker/issues/1749#issuecomment-2417831141
(d) backwards compatibility with old containers on Fedora Silverblue 40, which Timothée Ravier worked around in https://pagure.io/workstation-ostree-config/pull-request/526 and is the topic of the pull requests in comment 18 and comment 19
So, I didn't want to have one big messy issue with three different problems this late in the Fedora 41 cycle, when we are already in Final Freeze. It seems a lot easier to vote on the severity of the problem if it's narrowly defined.
(d) is solved now.
(b) will be solved for the short term by bug 2319121 and the zero-day updates. I don't know how many such old containers are out there. There are some, because we did get a few reports, but that's probably not the vast majority of users. That's why zero-day updates seemed good enough and I didn't ask for an exception.
(a) isn't solved and it will need the live media to be re-spun. I don't know how seriously our release criteria view a broken podman(1) on the live media, when the bug will be fixed on the installed system by zero-day updates.
(a) can be solved either by restoring fuse-overlayfs, or maybe Podman can be taught to use the vfs storage driver in the overlayfs on overlayfs case?
Debarshi: the fact that this is AcceptedFreezeException means we can fix (a) on the F41 live media. What exactly is the necessary change to fix it?
(In reply to Adam Williamson from comment #21)
> Debarshi: the fact that this is AcceptedFreezeException means we can fix (a)
> on the F41 live media. What exactly is the necessary change to fix it?
I think the easiest option is to pull in the fix for bug 2319121: https://bodhi.fedoraproject.org/updates/FEDORA-2024-b7ceba50a1
... because it adds 'Recommends: fuse-overlayfs' to a package in the default set for Fedora Workstation.
Meanwhile, the discussion in https://github.com/coreos/fedora-coreos-tracker/issues/1749 seems to have concluded that fuse-overlayfs really is needed in enough edge cases that it can't be deprecated so easily.
The pedantic way to fix this would be to restore the fuse-overlayfs dependency as it used to be in containers-common. I filed some pull requests based on the discussion in https://github.com/coreos/fedora-coreos-tracker/issues/1749: https://github.com/containers/common/pull/2206 https://src.fedoraproject.org/rpms/containers-common/pull-request/41 https://src.fedoraproject.org/rpms/containers-common/pull-request/42
FEDORA-2024-b7ceba50a1 (toolbox-0.0.99.6-6.fc41) has been submitted as an update to Fedora 41. https://bodhi.fedoraproject.org/updates/FEDORA-2024-b7ceba50a1
OK, I have marked https://bodhi.fedoraproject.org/updates/FEDORA-2024-b7ceba50a1 as fixing this bug. That means we'll pull it in for composes, and push it stable if it gets queued for stable, as part of the blocker/FE process.
(In reply to Adam Williamson from comment #25) > OK, I have marked > https://bodhi.fedoraproject.org/updates/FEDORA-2024-b7ceba50a1 as fixing > this bug. That means we'll pull it in for composes, and push it stable if it > gets queued for stable, as part of the blocker/FE process. Okay! Thanks, Adam.
Thanks, that's good.
Though I still think it would be better to also get the fix into F41+ podman (I mean containers-common). Well, at least if toolbox fixes Workstation (Live), that is an improvement.
I didn't have time to dig into all the linked tickets and discussion... iiuc containers-common.spec had previously dropped the dependency on fuse-overlayfs, and then it was incorrectly reverted by changing it to a Suggests rather than a Recommends? Then it seems it would make sense to make the change effective for F41+ as well.
Moving this bug to containers-common.
This bug, AIUI, covers the problem "running toolbox in a live environment does not work". The toolbox update https://bodhi.fedoraproject.org/updates/FEDORA-2024-b7ceba50a1 , again AIUI, will fix that problem. At that point, IMO, this bug should be closed, because the problem reported in this bug will be fixed. Other considerations should be tracked in other bugs. If there is another change that you think it is important to get into F41 *before release*, please file a separate bug and propose it as a blocker or FE.
Fix LGTM. `toolbox enter` on a 20241019.n.0 nightly fails as described, on the Final RC-1.2 ISO (the compose failed so it wasn't announced, but WS live ISO built fine) it does not. It *does* ultimately fail, but just because the VM I tested in ran out of "disk space" (i.e. RAM, as that's what backs 'disk' writes in the live env). With enough RAM it'd probably work.
huh, actually, one odd quirk: fairly reproducibly, it failed to pull the container image if I ran `toolbox enter`, but worked if I ran `toolbox --verbose enter`.
(In reply to Jens Petersen from comment #27)
> Though I still think it would be better to get the fix also into F41+ podman
> (I mean containers-common).
>
> [...]
>
> Moving this bug to containers-common.
This is also fixed in containers-common because the pull requests in comment 23 made it to this Fedora 41 update that's already been pulled into stable: https://bodhi.fedoraproject.org/updates/FEDORA-2024-5a61a2fa45
(In reply to Adam Williamson from comment #28)
> This bug, AIUI, covers the problem "running toolbox in a live environment
> does not work". The toolbox update
> https://bodhi.fedoraproject.org/updates/FEDORA-2024-b7ceba50a1 , again AIUI,
> will fix that problem. At that point, IMO, this bug should be closed,
> because the problem reported in this bug will be fixed.
Yes, agreed. This bug should be fixed through both containers-common and toolbox.
> Other considerations should be tracked in other bugs. If there is another
> change that you think it is important to get into F41 *before release*,
> please file a separate bug and propose it as a blocker or FE.
I think we are all set here.
(In reply to Adam Williamson from comment #30)
> huh, actually, one odd quirk: fairly reproducibly, it failed to pull the
> container image if I ran `toolbox enter`, but worked if I ran `toolbox
> --verbose enter`.
That's weird. In both cases it's a `podman pull`. We hide the spew without --verbose, and expose it when the flag is used.
I just tried on my Fedora 40 Workstation and it worked:
rishi@topinka:~$ /usr/bin/toolbox enter
No Toolbx containers found. Create now? [y/N] y
Image required to create Toolbx container.
Download registry.fedoraproject.org/fedora-toolbox:40 (374.8MB)? [y/N]: y
⬢ [rishi@toolbx ~]$ logout
rishi@topinka:~
Please do file a bug with the command line output if you manage to keep reproducing it.
note that was only testing *in a live image environment*. I'll fiddle around with it a bit more later.
Ahh, I think I know what I was seeing now: it was actually behaving the same in both cases, it was just more obvious what was happening in the verbose case. The verbose case made it clear the image pull had worked but we'd run out of 'disk space' trying to actually run the retrieved image; the non-verbose case didn't. I just tested on bare metal with more RAM available, and it worked fine without --verbose.
FEDORA-2024-b7ceba50a1 (toolbox-0.0.99.6-6.fc41) has been pushed to the Fedora 41 stable repository. If the problem still persists, please make note of it in this bug report.
(In reply to Adam Williamson from comment #34)
> Ahh, I think I know what I was seeing now: it was actually behaving the same
> in both cases, it was just more obvious what was happening in the verbose
> case. The verbose case made it clear the image pull had worked but we'd run
> out of 'disk space' trying to actually run the retrieved image; the
> non-verbose case didn't.
>
> I just tested on bare metal with more RAM available, and it worked fine
> without --verbose.
Oops, I see. Yes, we need to handle the "no disk space" scenarios separately, but it's a bit tricky because we are wrapping Podman commands.
Thanks for confirming.
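One possible way to approach the "no disk space" case when wrapping another command is sketched below. It rests on two assumptions that would need checking against real Podman output: that the wrapped command's stderr contains the standard ENOSPC message text, and that a free-space check on the storage directory is meaningful in the live environment. Both helper names are hypothetical.

```python
import errno
import os
import shutil

# Standard message for ENOSPC as reported by the C library; compared
# case-insensitively because other runtimes may capitalize it differently.
ENOSPC_TEXT = os.strerror(errno.ENOSPC).lower()


def looks_like_out_of_space(stderr_text: str) -> bool:
    """Guess from a wrapped command's stderr whether it ran out of space."""
    return ENOSPC_TEXT in stderr_text.lower()


def has_room_for(path: str, needed_bytes: int) -> bool:
    """Check up front whether the file system holding path has enough free space."""
    return shutil.disk_usage(path).free >= needed_bytes
```

The first helper classifies a failure after the fact, so that a dedicated message can be shown even without --verbose; the second could be used before starting a download whose size is already known.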