Red Hat Bugzilla – Bug 1382224
RFE: dnf transactions should run in a transient systemd service
Last modified: 2017-12-07 19:08:00 EST
When possible (which should be pretty much always), I think that dnf should run its RPM transactions in a transient systemd service a la systemd-run.
Pros:
- The caller's environment won't matter any more. (Minor.)
- Loss of the terminal in which dnf is running will no longer hose your system.
- Outright nuking of the dnf process by systemd if KillUserProcesses=yes will no longer happen.
Cons: No significant downsides I can think of.
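To make the proposal concrete, here is a minimal sketch of wrapping a dnf invocation in systemd-run. It is not how dnf does or would do it: the unit name and the helper names are hypothetical, and only the `--unit=` and `--description=` options of systemd-run are assumed. Without further options, systemd-run returns once the unit has been started, and the service's output goes to the journal rather than to the calling terminal.

```python
import shutil
import subprocess


def build_transient_command(dnf_args, unit_name="dnf-transaction"):
    """Return an argv list that runs dnf as a transient systemd service.

    unit_name is a hypothetical name chosen for this sketch.
    """
    return [
        "systemd-run",
        "--unit=" + unit_name,
        "--description=dnf transaction (sketch)",
        "dnf",
    ] + list(dnf_args)


def run_in_transient_service(dnf_args):
    """Run dnf via systemd-run if it is available, else run dnf directly."""
    if shutil.which("systemd-run") is None:
        # No systemd-run on this system: fall back to a plain child process.
        return subprocess.call(["dnf"] + list(dnf_args))
    return subprocess.call(build_transient_command(dnf_args))
```

Because the service is started by the system manager, it lives outside the caller's session and does not inherit the caller's environment, which is the point of the first and last bullets above.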
Rather than mandating a specific solution, perhaps we should just ask that dnf should be designed - somehow - to avoid catastrophic failure in the case of the controlling terminal going away? That's the real request here, right?
FWIW, I've been told today (by someone who really likes the F-word) that this is how apt-get works.
(In reply to Adam Williamson from comment #1)
> Rather than mandating a specific solution, perhaps we should just ask that
> dnf should be designed - somehow - to avoid catastrophic failure in the case
> of the controlling terminal going away? That's the real request here, right?
> FWIW, I've been told today (by someone who really likes the F-word) that
> this is how apt-get works.
This == systemd service, as proposed here? Or just "not effing crash when the effing terminal effing crashes"? :)
How did we get to a place where systemd is required in order to update RPMs reliably? That's not a healthy solution.
For the cases that currently matter, surviving the loss of the controlling terminal should be sufficient. For future cases (KillUserProcesses), I think that systemd may SIGKILL the dnf process itself, in which case dnf should IMO be able to survive being SIGKILLed without causing problems. The only decent solution I can think of would be to ask systemd to run the meat outside as something like a service so that it's outside the scope/session/whatever that is at risk of being SIGKILLed.
"systemd-run -t dnf ..." apparently is not enough, dnf gets killed as soon as the terminal it is running from is closed. But it certainly could be made to work.
(In reply to Zbigniew Jędrzejewski-Szmek from comment #5)
> "systemd-run -t dnf ..." apparently is not enough, dnf gets killed as soon
> as the terminal it is running from is closed. But it certainly could be made
> to work.
I assume that's because dnf can't survive the loss of its controlling terminal. IMO it should die when the ctty dies *except when doing critical things*.
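One way to express "die on hangup except while doing critical things" is to ignore SIGHUP only around the critical section. The sketch below is an illustration, not dnf code, and `ignore_hangup` is a hypothetical name. It covers only the hangup signal; writes to a terminal that has gone away can still fail and need separate handling.

```python
import contextlib
import signal


@contextlib.contextmanager
def ignore_hangup():
    """Ignore SIGHUP for the duration of a critical section."""
    previous = signal.signal(signal.SIGHUP, signal.SIG_IGN)
    if previous is None:
        # The old handler was not installed from Python; restoring the
        # default disposition is the closest available approximation.
        previous = signal.SIG_DFL
    try:
        yield
    finally:
        # Outside the critical section, hangup behaves as before.
        signal.signal(signal.SIGHUP, previous)
```

Usage would look like `with ignore_hangup(): run_the_transaction()`, where `run_the_transaction` stands in for whatever must not be interrupted.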
Here's my attempt at figuring out what's going on.
TransactionDisplay.callback calls CliTransactionDisplay.progress, which calls CliTransactionDisplay._out_progress, which calls sys.stdout.write(), which can throw IOError.
TransactionDisplay.callback seems to be passed directly to rpm.TransactionSet.run.
If I were the one fixing this, I would seriously consider invoking rpm.TransactionSet.run() from an entirely separate process and piping the callbacks back to the main cli process so that even if the cli process crashed outright we'd be okay. (And I'd then add the ability for that subprocess to actually be a transient systemd service if systemd were running.)
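A rough sketch of the separate-process idea, assuming a POSIX system: the child runs the transaction in its own session and reports progress over a pipe, and it keeps going if the reader disappears. `fake_transaction` is a hypothetical stand-in for the real rpm.TransactionSet.run call, and `run_isolated` is an invented name; neither is part of dnf or rpm.

```python
import json
import os


def fake_transaction(callback):
    """Stand-in for the real transaction; hypothetical, for illustration."""
    for step in ("install foo", "erase bar"):
        callback(step)
    return 0


def run_isolated(transaction):
    """Run transaction in a child process; return (events, exit code)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: does the real work and must not depend on the parent.
        os.close(read_fd)
        os.setsid()  # leave the parent's session and controlling terminal

        def report(event):
            try:
                os.write(write_fd, (json.dumps(event) + "\n").encode())
            except OSError:
                pass  # the reader is gone; keep the transaction running

        status = 1
        try:
            status = transaction(report)
        finally:
            os._exit(status)
    # Parent: only collects progress; it can crash without harming the child.
    os.close(write_fd)
    events = []
    with os.fdopen(read_fd) as reader:
        for line in reader:
            events.append(json.loads(line))
    _, wait_status = os.waitpid(pid, 0)
    code = os.WEXITSTATUS(wait_status) if os.WIFEXITED(wait_status) else -1
    return events, code
```

Calling `run_isolated(fake_transaction)` returns the two progress events and the child's exit code. Replacing the fork with a transient systemd service would be the second step described above.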
An alternative would be to modify Base._run_transaction to swallow exceptions from the callback and spit them out after the transaction is done.
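The alternative could look roughly like the following. This is a sketch under assumptions, not the actual Base._run_transaction: `DeferredErrors` and `run_with_deferred_errors` are hypothetical names, and `run` stands for any callable that takes the display callback, as rpm.TransactionSet.run is described as doing above.

```python
class DeferredErrors:
    """Wrap a callback so its exceptions are collected instead of raised."""

    def __init__(self, callback):
        self._callback = callback
        self.errors = []

    def __call__(self, *args, **kwargs):
        try:
            return self._callback(*args, **kwargs)
        except Exception as exc:
            # Remember the failure, but never let it abort the transaction.
            self.errors.append(exc)
            return None


def run_with_deferred_errors(run, display_callback):
    """Run the transaction, then hand back any display errors."""
    safe = DeferredErrors(display_callback)
    result = run(safe)
    return result, safe.errors
```

The caller can then decide how to report the collected errors once the transaction has finished, for example by logging them instead of printing to a terminal that may no longer exist.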
I have been following this discussion on the ML for a few days now and came across this bug report through a comment made there, so allow me to comment here.
About two days ago I installed Fedora 25 Beta on one of my notebooks (from Netinstall: Xfce). The installation went fine and the system works properly.
I archived the installation as a tar.bz2 backup, which I usually use within a chroot environment for further dnf updates, removals, installations and *normal* system administration.
This is the same as I have always done since I started using Fedora.
The backup is usually untarred into a directory called /.cdrom (a name chosen for historical reasons).
I then chrooted into this directory from my running desktop and wanted to remove some *rudimentary* (non-critical) packages from this chroot backup.
During the dnf delete process (inside the chroot), the entire chroot *AND* the running Xfce desktop (host) were shut down, and I was dropped back to the Linux console.
This never happened with any Fedora version before.
One of my main tasks is dealing with Fedora installations, and I usually make changes in a chroot environment so that the *new* backup doesn't affect the running system.
It would be nice to have this issue sorted out before Fedora 25 Final hits the road.
We will probably implement better exception handling in the Base class as a temporary solution and will focus on systemd in the long term.
*** Bug 1383490 has been marked as a duplicate of this bug. ***
*** Bug 1389867 has been marked as a duplicate of this bug. ***
There is a pull-request that should improve behavior: https://github.com/rpm-software-management/dnf/pull/638
*** Bug 1256943 has been marked as a duplicate of this bug. ***
This bug is planned to be fixed in the next 1-2 months.
Is something like https://github.com/timlau/dnf-daemon/ considered?
That might be also an option.
We have reconsidered the priority of this bug, as this would require bigger DNF design changes for little benefit.