Bug 1627694 - dnf crashes when two concurrent dnf commands are invoked
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: dnf
Version: 29
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: rpm-software-management
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-09-11 09:27 UTC by Zbigniew Jędrzejewski-Szmek
Modified: 2019-07-03 17:33 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2019-07-03 17:33:51 UTC
Type: Bug
Embargoed:



Description Zbigniew Jędrzejewski-Szmek 2018-09-11 09:27:44 UTC
Description of problem:
Before, when 'dnf install' was started twice from different terminal windows, the second one would just say "... locked by PID NNN. Waiting." (approximately) and then resume once the first one was done.

This had two great advantages:
- rpms could be downloaded in parallel while the other installation was running
- it was safe to do 'dnf install' from scripts, without worrying about somebody else running a transaction at the same time.
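
The wait-for-lock behaviour described above can be sketched roughly as follows. This is not dnf's implementation, only a minimal illustration built on an advisory flock() lock; `acquire_lock`, `release_lock` and the lock path are hypothetical names chosen for the example.

```python
import fcntl
import os
import time


def acquire_lock(lock_path, poll_interval=0.5, timeout=None):
    """Wait until an exclusive lock on lock_path is held; return the open file."""
    lock_file = open(lock_path, "a+")
    deadline = None if timeout is None else time.monotonic() + timeout
    announced = False
    while True:
        try:
            # Non-blocking attempt, so we can print a message and keep polling.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
            break
        except BlockingIOError:
            if deadline is not None and time.monotonic() >= deadline:
                lock_file.close()
                raise TimeoutError("could not lock %s" % lock_path)
            if not announced:
                print("%s is locked by another process. Waiting." % lock_path)
                announced = True
            time.sleep(poll_interval)
    # Record our PID so that a waiting process could report who holds the lock.
    lock_file.seek(0)
    lock_file.truncate()
    lock_file.write(str(os.getpid()))
    lock_file.flush()
    return lock_file


def release_lock(lock_file):
    """Drop the lock and close the underlying file."""
    fcntl.flock(lock_file, fcntl.LOCK_UN)
    lock_file.close()
```

A script would call `acquire_lock()` before starting its transaction and `release_lock()` afterwards, so a second invocation blocks instead of failing.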

Now I get a "crash":
Traceback (most recent call last):
  File "/usr/bin/dnf", line 58, in <module>
    main.user_main(sys.argv[1:], exit_code=True)
  File "/usr/lib/python3.7/site-packages/dnf/cli/main.py", line 179, in user_main
    errcode = main(args)
  File "/usr/lib/python3.7/site-packages/dnf/cli/main.py", line 64, in main
    return _main(base, args, cli_class, option_parser_class)
  File "/usr/lib/python3.7/site-packages/dnf/cli/main.py", line 99, in _main
    return cli_run(cli, base)
  File "/usr/lib/python3.7/site-packages/dnf/cli/main.py", line 123, in cli_run
    ret = resolving(cli, base)
  File "/usr/lib/python3.7/site-packages/dnf/cli/main.py", line 146, in resolving
    base.resolve(cli.demands.allow_erasing)
  File "/usr/lib/python3.7/site-packages/dnf/base.py", line 810, in resolve
    self._transaction = self._goal2transaction(goal)
  File "/usr/lib/python3.7/site-packages/dnf/base.py", line 707, in _goal2transaction
    ts.add_install(pkg, obs, reason)
  File "/usr/lib/python3.7/site-packages/dnf/db/group.py", line 256, in add_install
    ti_new = self.new(new, libdnf.transaction.TransactionItemAction_INSTALL, reason)
  File "/usr/lib/python3.7/site-packages/dnf/db/group.py", line 219, in new
    rpm_item = self._pkg_to_swdb_rpm_item(pkg)
  File "/usr/lib/python3.7/site-packages/dnf/db/group.py", line 210, in _pkg_to_swdb_rpm_item
    rpm_item = self.history.swdb.createRPMItem()
  File "/usr/lib/python3.7/site-packages/dnf/db/history.py", line 290, in swdb
    self._swdb = libdnf.transaction.Swdb(self.dbpath)
  File "/usr/lib64/python3.7/site-packages/libdnf/transaction.py", line 713, in __init__
    this = _transaction.new_Swdb(*args)
RuntimeError: C++ std::exception: Exec failed: database is locked
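
The final message matches the text SQLite reports when a second writer cannot obtain the database lock. Assuming the history database opened by `Swdb` is an SQLite file, the failure mode can be illustrated with Python's own sqlite3 module. This is only a sketch of the general mechanism, not libdnf code; `demo_locked` and the `trans` table are made up for the example.

```python
import os
import sqlite3
import tempfile


def demo_locked(db_path, busy_timeout_s):
    """Return the error a second writer sees, or None if it got the lock."""
    # isolation_level=None: no implicit transactions, we issue BEGIN ourselves.
    holder = sqlite3.connect(db_path, isolation_level=None)
    holder.execute("CREATE TABLE IF NOT EXISTS trans (id INTEGER)")
    holder.execute("BEGIN IMMEDIATE")  # first writer takes the write lock
    holder.execute("INSERT INTO trans VALUES (1)")

    # Second connection; timeout is how long SQLite retries before giving up.
    contender = sqlite3.connect(
        db_path, timeout=busy_timeout_s, isolation_level=None
    )
    try:
        contender.execute("BEGIN IMMEDIATE")
        contender.execute("ROLLBACK")
        return None
    except sqlite3.OperationalError as exc:
        return str(exc)
    finally:
        contender.close()
        holder.execute("ROLLBACK")
        holder.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "history.sqlite")
    # With a zero timeout the second writer fails at once instead of waiting.
    print(demo_locked(path, busy_timeout_s=0))
```

With a non-zero timeout the second writer retries for that long before giving up, which is closer to the old "Waiting." behaviour.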

Version-Release number of selected component (if applicable):
$ rpm -q dnf
dnf-3.2.0-2.fc29.noarch
dnf-3.3.0-2.fc29.noarch

How reproducible:
seems repeatable


Steps to Reproduce:
1. start 'dnf upgrade' in one window
2. start 'dnf install foo' in another window

Actual results:
See above.

Expected results:
...
Is this ok [y/N]: y
Downloading Packages:
Waiting for process with pid 4938 to finish.
(1/5): ...

Comment 1 Matthew Miller 2018-09-12 20:20:28 UTC
This is a fairly common use case. Also, consider the situation where dnf-automatic is running in the background.

Comment 2 Chris Murphy 2018-09-12 20:27:55 UTC
Is this reproducible? I just tried to and can't reproduce it.

Comment 3 Zbigniew Jędrzejewski-Szmek 2018-09-12 20:49:44 UTC
I can't reproduce it now. It was pretty easy yesterday (I did it a few times to produce the backtrace), but today it never seems to happen.

Comment 4 Jan Pokorný [poki] 2018-09-14 10:24:50 UTC
I have reproduced this (with 90% confidence, see below) on Rawhide with

  for i in {1..2}; do dnf update -y & done

that I ran merely out of curiosity since I've seen issues like this
announced at fedora-devel ML (and that's how I reached this bug after
the fact).

It was with dnf-3.2.0-2.fc29.noarch and for completeness, the winner
of this race was updating it to dnf-3.5.1-1.fc30.noarch
(similarly, python3-hawkey-0.17.0-2.fc29.x86_64 going to
python3-hawkey-0.19.1-1.fc30.x86_64 in that transaction).

Now the problems.  I usually update in a plain virtual terminal,
meaning that I have only very limited options for taking note of
exceptions and other anomalies.  Ok, I am using gpm so its limited
copy-paste is sufficient for the task.  But... I have the trackpoint
on this laptop allegedly broken on HW level, meaning I need to employ
a kernel-side workaround so that I can still use the trackpad at all.
So I switched to the other VT, applied said workaround, returned to
original VT, only to discover I no longer can scroll back to the
exception in question :-(

... long story short, I find it totally unacceptable that tracebacks
from such a crucial process such as an update of the packages slips
any sort of logging - neither /var/log/dnf*.log nor journal nor any
other file in /var/log contains a mention of this issue.
I guess I am filing a new bug for that.

But that's an explanation why I cannot be 100% sure the exception
I glimpsed (without possibility to recheck) was the same,
just guessing it was.

All I know for sure is that likely the loser in said race returned
with exit status of 1 (i.e., failure) as reported by shell.
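
One way to keep the loser's output for later inspection is to start both commands from a small wrapper that writes each one's stdout and stderr to its own file. This is a rough sketch; `run_concurrently` and the log file naming are hypothetical, not part of dnf.

```python
import subprocess
import sys


def run_concurrently(commands, log_prefix):
    """Start all commands at once; save each one's output to its own log."""
    started = []
    for index, argv in enumerate(commands):
        log_path = "%s.%d.log" % (log_prefix, index)
        log_file = open(log_path, "wb")
        # stderr is merged into stdout so that tracebacks land in the log too.
        proc = subprocess.Popen(
            argv, stdout=log_file, stderr=subprocess.STDOUT
        )
        started.append((proc, log_file, log_path))

    results = []
    for proc, log_file, log_path in started:
        status = proc.wait()
        log_file.close()
        results.append((status, log_path))
    return results


def report(results, stream=sys.stdout):
    """Print the exit status and log location of every command."""
    for status, log_path in results:
        stream.write("exit status %d, output in %s\n" % (status, log_path))
```

For the reproducer above this would be called (as root) as `report(run_concurrently([["dnf", "update", "-y"]] * 2, "/tmp/dnf-race"))`, with the log prefix being an arbitrary choice.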

Comment 5 Jan Pokorný [poki] 2018-09-14 10:26:46 UTC
s/slips any sort of logging/slip any sort of persistent logging/

Comment 6 Zbigniew Jędrzejewski-Szmek 2018-09-14 10:32:14 UTC
There's abrt, which captures at least some python exceptions. I'm not sure if it can/should work in this case. There's also https://src.fedoraproject.org/rpms/systemd-coredump-python/ which is rather simpler and I'm pretty sure would work if enabled.

Comment 7 Jan Pokorný [poki] 2018-09-14 11:52:51 UTC
re [comment 6]:

Since I want full control over any sort of reports and coredumpctl
+ respective core files fetching works seamlessly, I am actively
avoiding abrt.  I didn't know about the latter package, though,
thanks for the tip.

Anyway, point is, if dnf apparently logs other sorts of tracebacks
and if it's meant to be sort of self-contained (which I believe is
the case, e.g. not relying on any of those packages) and given
it's a distro-mission-critical component, it is really questionable
that such a failure wasn't tracked anywhere for later review
and troubleshooting.
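
For illustration, a Python program can record otherwise unhandled tracebacks persistently by installing its own `sys.excepthook`. The sketch below shows the general technique only; it does not describe how dnf's logging works, and `install_traceback_logger` and the logger name are hypothetical.

```python
import logging
import sys


def install_traceback_logger(log_path):
    """Write uncaught exceptions to log_path in addition to stderr."""
    logger = logging.getLogger("uncaught")
    logger.setLevel(logging.ERROR)
    logger.propagate = False  # keep the record out of other handlers
    handler = logging.FileHandler(log_path)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)

    previous_hook = sys.excepthook

    def hook(exc_type, exc_value, exc_tb):
        # exc_info makes the logging module append the full traceback text.
        logger.error(
            "Unhandled exception", exc_info=(exc_type, exc_value, exc_tb)
        )
        # Still show the traceback on the terminal as before.
        previous_hook(exc_type, exc_value, exc_tb)

    sys.excepthook = hook
    return hook
```

Calling `install_traceback_logger()` early in a program's entry point means a traceback like the one in the description would also end up in the chosen log file.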

Comment 8 Jaroslav Mracek 2019-07-03 17:33:51 UTC
According to my investigation the issue is difficult to reproduce with dnf-4.2.7, therefore I believe that it is fixed.

