Hi, I've tried running dnf and it said:

  metadata already locked by 6259
  The application with PID 6259 is: dnf
    Memory : 68 M RSS (528 MB VSZ)
    Started: Tue Apr 14 22:48:43 2015 - 1 day(s) 10:15:16 ago
    State  : Sleeping

The process is:

  root 6259 1 0 kwi14 ? 00:00:03 /usr/bin/python -OO /usr/bin/dnf -v makecache timer

Let's see what it does:

  # strace -f -p 6259
  Process 6259 attached
  select(27, [26], [], [], {0, 435164})   = 0 (Timeout)
  poll([{fd=26, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 0) = 0 (Timeout)
  select(27, [26], [], [], {1, 0})        = 0 (Timeout)
  poll([{fd=26, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 0) = 0 (Timeout)
  select(27, [26], [], [], {1, 0})        = 0 (Timeout)
  poll([{fd=26, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 1, 0) = 0 (Timeout)

So what's fd 26?

  # lsof -n -p 6259
  (...)
  dnf  6259  root  26u  IPv4  8448464  0t0  TCP 83.9.186.222:44556->209.132.181.16:https (ESTABLISHED)
  (...)

209.132.181.16 is wildcard.fedoraproject.org. 83.9.186.222 is an IP my home computer had until 6:00 the next day; it has changed twice by now.

I think it's worth noting that dnf not only failed to notice that the connection doesn't work, it even failed to notice that the local network interface (ppp1) went down a few hours later (in fact I believe it gets removed and recreated on every pppd reconnect).

This is dnf-0.6.4-1.fc21.noarch. I'm letting this process live in case you want me to get some more info from it.
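The strace loop above is what a poll/select with a short timeout looks like on a dead connection: the fd never becomes readable, so each call simply times out and the loop repeats forever. A minimal Python sketch of the same pattern, using a socketpair with a silent peer as a stand-in for the stalled TCP connection to the mirror (an illustration of the mechanism, not dnf's actual code):

```python
import select
import socket

# A connected pair of sockets; the "silent" end never sends anything,
# mimicking a mirror connection that has quietly gone dead.
client, silent_peer = socket.socketpair()

for _ in range(3):
    # Wait up to 1 second for data, like the select(27, [26], ...) {1, 0}
    # calls in the strace output above.
    readable, _, _ = select.select([client], [], [], 1.0)
    print(readable)  # [] every time: a timeout, not an error

client.close()
silent_peer.close()
```

Crucially, select() cannot distinguish "no data yet" from "the peer is gone"; without a low-speed or overall timeout layered on top of this loop, the caller spins indefinitely, which is what the stuck dnf process had been doing for a day and a half.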
It's still around.

  metadata already locked by 6259
  The application with PID 6259 is: dnf
    Memory : 62 M RSS (528 MB VSZ)
    Started: Tue Apr 14 22:48:43 2015 - 4 day(s) 15:24:31 ago
    State  : Sleeping
Thanks for the report. Did you configure the `timeout` option in /etc/dnf/dnf.conf? If you kill the process, does it happen again?
Hi, are you able to attach gdb to the running process and check where dnf got stuck?
Created attachment 1016465 [details] gdb's bt

Michal, I've attached the output of bt; please let me know if you need me to dive deeper, I'll keep the process around :) I've only installed python-debuginfo and imagine it's enough - we can see the .py files and lines, and even the repo it was trying to fetch metadata for.

Jan, no, "timeout" doesn't appear anywhere in /etc/dnf. I don't want to kill it yet (in case Michal needs more info), but it probably wouldn't be trivial to reproduce losing the connection to the Fedora servers again. I'd prefer not to build reproduction cases with iptables for something this simple :)
Would it be possible to install librepo debug package and upload the bt again?
Created attachment 1016854 [details] better bt In the meantime I have foolishly upgraded python with yum, so I had to downgrade that and accompanying debuginfo - hope this didn't break anything! (Hopefully) better backtrace attached.
It looks like librepo does not set a proper timeout value for the select() operation => reassigning to librepo.
Librepo always uses at most 1 sec for the select(). The problem is not inside the select; rather, it is a problem with the timeout configuration.

By default, librepo uses:

  LRO_LOWSPEEDTIME  = 120 seconds
  LRO_LOWSPEEDLIMIT = 1000 bytes per second

These configuration options ensure that if the transfer speed stays below 1000 B/s for 120 seconds, librepo considers the transfer too slow and aborts it.

I've tried to reproduce the reported issue with librepo: I opened a connection and started downloading an ISO, then disabled my network adapter and waited. After 120 seconds I got this (expected) error:

  Timeout was reached for http://mirror.karneval.cz/pub/linux/fedora/linux/releases/21/Workstation/i386/iso/Fedora-Live-Workstation-i686-21-5.iso [Operation too slow. Less than 1000 bytes/sec transferred the last 120 seconds]

Michal, are you sure that DNF doesn't set these options to different values, which could lead to the reported issue?
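For clarity, the LRO_LOWSPEEDTIME / LRO_LOWSPEEDLIMIT pair implements a simple watchdog: abort once the transfer rate has stayed below the limit for the whole time window (librepo is built on libcurl, where the equivalent knobs are CURLOPT_LOW_SPEED_TIME / CURLOPT_LOW_SPEED_LIMIT). Here is a rough, self-contained sketch of that logic over simulated per-second byte counts - an illustration of the semantics, not the actual librepo code:

```python
LOW_SPEED_TIME = 120    # seconds (LRO_LOWSPEEDTIME default)
LOW_SPEED_LIMIT = 1000  # bytes/sec (LRO_LOWSPEEDLIMIT default)

def transfer_aborts(samples, limit=LOW_SPEED_LIMIT, window=LOW_SPEED_TIME):
    """samples: bytes received in each successive second.

    Return True if at some point the rate stayed below `limit` for
    `window` consecutive seconds (the "Operation too slow" abort)."""
    slow_for = 0
    for bytes_this_second in samples:
        if bytes_this_second < limit:
            slow_for += 1
            if slow_for >= window:
                return True   # too slow for too long -> abort
        else:
            slow_for = 0      # any second above the limit resets the clock
    return False

# A healthy download never trips the watchdog:
print(transfer_aborts([50000] * 300))             # False
# A download that stalls completely (network yanked) aborts after 120 s:
print(transfer_aborts([50000] * 10 + [0] * 120))  # True
# With the limit set to 0, nothing is ever "too slow":
print(transfer_aborts([0] * 1000, limit=0))       # False
```

The last case is exactly the failure mode in this bug: with the low-speed limit at 0, the "too slow" condition can never trigger, so a dead connection never times out.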
Thank you Tomas. It looks like dnf sets LRO_LOWSPEEDLIMIT to zero. Is there any reasonable default value for LRO_LOWSPEEDLIMIT?
The default values should work, but feel free to tweak them to better fit your requirements - you might find a shorter time period useful (e.g. 60 sec).
Related pull request (thanks to paragan on #yum, for pointing me to this) and the associated patch list: https://github.com/rpm-software-management/dnf/pull/291 https://github.com/mluscon/dnf/commit/1b17488b2b31b5116ccff18051aa71b9c23b65e8
dnf-1.0.1-2.fc22 has been submitted as an update for Fedora 22. https://admin.fedoraproject.org/updates/dnf-1.0.1-2.fc22
*** Bug 1206878 has been marked as a duplicate of this bug. ***
*** Bug 1214538 has been marked as a duplicate of this bug. ***
Package dnf-1.0.1-2.fc22:
* should fix your issue,
* was pushed to the Fedora 22 testing repository,
* should be available at your local mirror within two days.

Update it with:

  # su -c 'yum update --enablerepo=updates-testing dnf-1.0.1-2.fc22'

as soon as you are able to. Please go to the following url:

https://admin.fedoraproject.org/updates/FEDORA-2015-10004/dnf-1.0.1-2.fc22

then log in and leave karma (feedback).
dnf-1.0.1-2.fc22 has been pushed to the Fedora 22 stable repository. If problems still persist, please make note of it in this bug report.
dnf-1.0.2-2.fc22,hawkey-0.5.9-2.fc22 has been submitted as an update for Fedora 22. https://admin.fedoraproject.org/updates/dnf-1.0.2-2.fc22,hawkey-0.5.9-2.fc22
dnf-1.0.2-3.fc22, hawkey-0.5.9-3.fc22 has been pushed to the Fedora 22 stable repository. If problems still persist, please make note of it in this bug report.
I seem to be experiencing this bug too. I was doing a "dnf update" today and the process hung after the downloads were complete, about three packages into the "[DRPM]... done" sequence. After about 10 minutes I gave up and killed the process with Ctrl-C. I then tried a "dnf clean all", but that hung as well and could not be killed by Ctrl-C; I had to open a second terminal, su to root, and do a kill -9 on it. Subsequent attempts to clean also hang. How does one recover from this?
Update: after an hour of playing with dnf, it was still hanging immediately on every attempt to execute it. I finally gave up, shut down the system, power cycled it, and restarted. After it came back up I was able to execute "dnf -v clean all" successfully on the first try, and after that the update with "dnf -v update" also succeeded.

Towards the end of the update it looked like dnf had hung again, but running top in another terminal revealed it had just stopped producing output while a lot of "dracut" processes were running in the background. It probably needs some user feedback, like "not dead, please wait while x happens" or "x% done with y", so users won't wonder what's up - this was a *long* delay, 4 or 5 minutes with no output from dnf. Yum always let us know what it was doing in these sorts of cases. Yum also seemed a lot faster than dnf on large updates like this (or maybe it just seemed that way because I wasn't worried that it had crashed for lack of user feedback?)

It's kind of scary that dnf got itself so messed up that the entire system had to be restarted before it would run again!

Anyway, for the record, here's the version involved: dnf-1.1.1-2.fc2
dnf-0.6.4-7.fc21 has been submitted as an update to Fedora 21. https://bodhi.fedoraproject.org/updates/FEDORA-2015-9bcdcdc46b
dnf-0.6.4-7.fc21 has been pushed to the Fedora 21 testing repository. If problems still persist, please make note of it in this bug report. If you want to test the update, you can install it with $ su -c 'dnf --enablerepo=updates-testing update dnf' You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2015-9bcdcdc46b
I am encountering the same "freeze" with dnf on my F22 laptop and F23 desktop. When I added timeout=1 to /etc/dnf/dnf.conf the freeze immediately went away. When I removed that line, the "freeze" started happening again; however, it was timing out at 30 seconds rather than being indefinite. As a side note, ftp://ftp.uci.edu/mirrors/fedora seems to be having an issue and not working when my dnf attempts to download from them.
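For anyone hitting this before the fixed packages land, the workaround above can be made explicit in /etc/dnf/dnf.conf. A sketch of the relevant [main] section follows; `timeout` is dnf's documented per-connection timeout in seconds, and `minrate` (in newer dnf releases) is the dnf.conf spelling of the low-speed limit discussed earlier in this bug - check `man 5 dnf.conf` for the options available in your version:

```ini
[main]
# Give up on an unresponsive mirror after 30 seconds instead of hanging.
timeout=30
# Abort any transfer that stays below 1000 bytes/sec; this is the knob
# that ends up as librepo's LRO_LOWSPEEDLIMIT. Newer dnf releases only.
minrate=1000
```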
dnf-0.6.4-7.fc21 has been pushed to the Fedora 21 stable repository. If problems still persist, please make note of it in this bug report.
*** Bug 1268364 has been marked as a duplicate of this bug. ***