I don't know how to describe it exactly. We've been running a script that basically does this:

dnf install <list of packages> --nogpgcheck --installroot=<somedir>

The output is piped into a file. What happens - quite often - is this: we see that dnf downloads metadata into an empty <somedir>/var/lib/dnf/22 directory, but dnf doesn't output anything to xterm, the console or anywhere else. Only by watching that directory do we see that dnf is indeed doing something. Then, from one second to the next, it simply halts and does nothing further. Not all repository data has been downloaded. Right now - as I type this - we have been waiting over 30 minutes for the process to go on. Nothing. Even top doesn't show any process eating CPU, and ps -aux doesn't show a defunct process. Two things we don't understand:

a) Why does dnf often not output anything, when it should at least show that it's downloading metadata from some location? This happens even if we start from a clean (metadata) directory.

b) Why does dnf often halt in the middle of the process, stopping the entire workflow from continuing? It looks like the process has died, but it hasn't.

We've been seeing this quite often lately and don't know the cause.
Correction: the path is <somedir>/var/cache/dnf/i386/22/<the files there>, not the /var/lib/dnf path I wrote above. We use &> to redirect the output into a file and follow it with tail -f. Sadly the file stays at 0 bytes.
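For reference, the whole pattern boils down to something like this (a minimal sketch; the package list, target directory and log path are placeholders, and -y is added here only so the backgrounded run does not wait at a confirmation prompt):

dnf install -y pkg-a pkg-b pkg-c --nogpgcheck --installroot=/srv/buildroot &> /tmp/dnf-install.log &
tail -f /tmp/dnf-install.log
# metadata appears under /srv/buildroot/var/cache/dnf/i386/22/, but the log stays at 0 bytes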
We just aborted a couple of times and restarted the script. Now it looks like it has processed all the metadata and continues, and output is showing again. Still, there is no reason for dnf to go idle in the middle of an operation. There is no background dnf running, nothing that would in any way hinder dnf from continuing. Nothing. This only happens during the *get metadata from server* phase. There is also no reason for a 100 Mbit line not to get the data, and I don't think the mirrors are stalled. At least in all the years with yum we never had such issues.
I wonder whether this can be related to bug 1212320. And I agree that we should log something.
Well, I only quickly peeked over that bug... But let's make things a bit clearer. I believe the problem I describe here is related to something else entirely.

Background: during the conversion of our infrastructure from yum to dnf, we noticed a massive slowdown of operations. While we were using Fedora 20, we got a feeling for how long certain processes take. For example, compiling the entire XFCE 4.12, including yum builddep, yum installation etc., takes about 50 minutes; in the worst case, when builddeps are required, it goes up to 80-90 minutes. With dnf, what used to be 50 minutes now takes 2.5-3.0 hours, and what used to be 80-90 minutes takes 4.0-5.0 hours. The test machine isn't the fastest, we know. We ran performance tests on the underlying filesystem (xfs), but its performance is fine - nothing unusual compared to Fedora 20. Then we saw that other operations, like installing and removing packages, also take much longer with dnf than with yum (at least comparing dnf on Fedora 22 with yum on Fedora 20).

Last night I started writing a small script to measure the time dnf needs to install/remove/reinstall/update versus yum. To keep it simple, it's nothing special and only measures the relative execution time on our test system:

time dnf reinstall somepackage.rpm
sync
sleep 5
sync
time yum reinstall somepackage.rpm

The test script is not finished yet. But while running it I figured out that output from dnf as well as yum on Fedora 22 is being eaten, i.e. output is entirely missing. We also noticed that xfce4-terminal (which uses vte) eats characters and output from other tools as well, especially when things are piped or combined like the "time dnf" construction. I filed a report about this: https://bugzilla.redhat.com/show_bug.cgi?id=1230021

You can see from the output above how linefeeds are being eaten. We tried the same thing in xterm - same issue. So the problem might not be vte, xfce4-terminal or xterm; it might be a lower component of X or even the kernel itself. Maybe some Python module preventing output? Who knows?

I hacked up a small test script (see attachment) and ran it with:

sudo sh test.sh &> log.txt

The output is missing half of the expected information (see attachment). I guarantee that the file (log.txt) has not been touched, modified or altered in any way. That is exactly the issue we are talking about in this bug.
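To make the timing construction above explicit, the measurement boils down to a script of roughly this shape (a minimal sketch, not the attached test.sh; the package name is a placeholder and -y is an addition so the runs do not wait at a confirmation prompt):

#!/bin/sh
# compare the relative execution time of the same operation under dnf and yum
time dnf -y reinstall somepackage.rpm
sync
sleep 5
sync
time yum -y reinstall somepackage.rpm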
Created attachment 1037155 [details] test.sh script for output testing.
Created attachment 1037156 [details] log.txt output missing half of the stuff we'd expected to see
Something has changed if I look at the log above again, because we pulled in some updates just before. Nonetheless, the above log should at least contain output showing that the repositories are being processed. From what I remember, yum does this on Fedora 20. And just a side note: look at the real time (this is not yet a valid bug report, we need to do more speed tests here). Do you see how dnf uses 3 minutes to pull in all the repo data while yum only needs 1 minute? Re-running the script:

dnf:
real 2m42.438s
user 1m44.702s
sys 0m8.960s

yum:
real 1m33.239s
user 0m18.528s
sys 0m2.813s
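If anyone wants to reproduce just the metadata-refresh part of this comparison independently of our script, something along these lines should do (a sketch, not the exact commands behind the numbers above):

# force a fresh metadata download, then time the refresh for each tool (run as root)
time sh -c 'dnf clean metadata > /dev/null; dnf makecache'
time sh -c 'yum clean metadata > /dev/null; yum makecache'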
The changes pulled in were related to mesa, perl and some other parts. I logged out of XFCE 4.12, logged back in and re-ran the above script. The thing is hard to describe; it might be related to logging out and in again. When we ran the script a couple of times before, we didn't even get output. We saw something similar to this with tail -f log.txt:

------------------------------------------
dnf clean
... the cleaning message ...
<nothing>
------------------------------------------

That was the end. All echoes, further timings etc. were missing. Loading the file in vim showed the same as tail -f did. Either the mesa update fixed something (which I don't expect, because xfce4 barely uses it), or logging out and in reset some undefined state that builds up during long-running Xorg sessions. Once this happens again I will report back.
Oh, I nearly forgot to explain what we mean by "does nothing in the middle of a process" and the background of how we got to this at all. After we hit the issue described above (output being eaten and no further response), we dived into the /var/cache/dnf/i386/22 directory and ran ls -l inside it repeatedly (every couple of seconds). We saw that there was no progress in pulling down the repository metadata. This has nothing to do with two dnf processes interfering with each other, as in the bug referenced above; it is simply that nothing happens anymore in the dnf metadata directory. We would have expected something like this:

-bash-4.3$ ls -l
total 71040
drwxr-xr-x 3 root root 21 Jun 6 21:54 adobe-linux-i386
-rw-r--r-- 1 root root 607 Jun 6 21:54 adobe-linux-i386-filenames.solvx
-rw-r--r-- 1 root root 6105 Jun 6 21:55 adobe-linux-i386.solv
-rw-r--r-- 1 root root 2 Jun 10 10:53 expired_repos.json
drwxr-xr-x 4 root root 55 Jun 10 10:50 fedora
-rw-r--r-- 1 root root 30445939 Jun 10 10:51 fedora-filenames.solvx
-rw-r--r-- 1 root root 13632525 Jun 10 10:53 fedora.solv
drwxr-xr-x 3 root root 21 Jun 10 10:49 google
drwxr-xr-x 3 root root 21 Jun 10 10:52 google-chrome
-rw-r--r-- 1 root root 1764 Jun 10 10:52 google-chrome-filenames.solvx
-rw-r--r-- 1 root root 7366 Jun 10 10:53 google-chrome.solv
drwxr-xr-x 3 root root 21 Jun 10 10:50 google-earth
-rw-r--r-- 1 root root 7553 Jun 10 10:50 google-earth-filenames.solvx
-rw-r--r-- 1 root root 5904 Jun 10 10:53 google-earth.solv
-rw-r--r-- 1 root root 1764 Jun 10 10:49 google-filenames.solvx
-rw-r--r-- 1 root root 7366 Jun 10 10:53 google.solv
drwxr-xr-x 3 root root 38 Jun 10 10:49 kde
-rw-r--r-- 1 root root 82 Jun 10 10:49 kde-filenames.solvx
-rw-r--r-- 1 root root 5509 Jun 10 10:53 kde.solv
drwxr-xr-x 3 root root 38 Jun 10 10:49 kde-testing
-rw-r--r-- 1 root root 82 Jun 10 10:49 kde-testing-filenames.solvx
-rw-r--r-- 1 root root 5509 Jun 10 10:53 kde-testing.solv
drwxr-xr-x 3 root root 38 Jun 10 10:50 kde-unstable
-rw-r--r-- 1 root root 82 Jun 10 10:50 kde-unstable-filenames.solvx
-rw-r--r-- 1 root root 5509 Jun 10 10:53 kde-unstable.solv
-rw-r--r-- 1 root root 0 Jun 10 09:39 last_makecache
drwxr-xr-x 4 root root 53 Jun 10 10:52 rpmfusion-free
-rw-r--r-- 1 root root 289347 Jun 10 10:52 rpmfusion-free-filenames.solvx
-rw-r--r-- 1 root root 198183 Jun 10 10:53 rpmfusion-free.solv
drwxr-xr-x 3 root root 38 Jun 10 10:52 rpmfusion-free-updates
-rw-r--r-- 1 root root 82 Jun 10 10:52 rpmfusion-free-updates-filenames.solvx
-rw-r--r-- 1 root root 4782 Jun 10 10:53 rpmfusion-free-updates.solv
drwxr-xr-x 3 root root 38 Jun 10 10:52 rpmfusion-free-updates-testing
-rw-r--r-- 1 root root 82 Jun 10 10:52 rpmfusion-free-updates-testing-filenames.solvx
-rw-r--r-- 1 root root 4782 Jun 10 10:53 rpmfusion-free-updates-testing.solv
drwxr-xr-x 3 root root 38 Jun 10 10:52 rpmfusion-nonfree
-rw-r--r-- 1 root root 86576 Jun 10 10:52 rpmfusion-nonfree-filenames.solvx
-rw-r--r-- 1 root root 71325 Jun 10 10:53 rpmfusion-nonfree.solv
drwxr-xr-x 3 root root 38 Jun 10 10:52 rpmfusion-nonfree-updates
-rw-r--r-- 1 root root 82 Jun 10 10:52 rpmfusion-nonfree-updates-filenames.solvx
-rw-r--r-- 1 root root 4782 Jun 10 10:53 rpmfusion-nonfree-updates.solv
drwxr-xr-x 3 root root 38 Jun 10 10:49 rpmfusion-nonfree-updates-testing
-rw-r--r-- 1 root root 82 Jun 10 10:49 rpmfusion-nonfree-updates-testing-filenames.solvx
-rw-r--r-- 1 root root 4782 Jun 10 10:53 rpmfusion-nonfree-updates-testing.solv
-rw-r--r-- 1 root root 1039500 Jun 10 10:53 @System.solv
drwxr-xr-x 4 root root 55 Jun 10 10:53 updates
-rw-r--r-- 1 root root 4870515 Jun 10 10:53 updates-filenames.solvx
-rw-r--r-- 1 root root 1106996 Jun 10 10:53 updates.solv
drwxr-xr-x 4 root root 55 Jun 10 10:50 updates-testing
-rw-r--r-- 1 root root 11916983 Jun 10 10:50 updates-testing-filenames.solvx
-rw-r--r-- 1 root root 5348036 Jun 10 10:53 updates-testing.solv
-rw-r--r-- 1 root root 2882834 Jun 10 10:50 updates-testing-updateinfo.solvx
-rw-r--r-- 1 root root 680182 Jun 10 10:53 updates-updateinfo.solvx
-bash-4.3$

But received something like this:

-bash-4.3$ ls -l
total 71040
drwxr-xr-x 3 root root 21 Jun 6 21:54 adobe-linux-i386
-rw-r--r-- 1 root root 607 Jun 6 21:54 adobe-linux-i386-filenames.solvx
-rw-r--r-- 1 root root 6105 Jun 6 21:55 adobe-linux-i386.solv
-rw-r--r-- 1 root root 2 Jun 10 10:53 expired_repos.json
drwxr-xr-x 4 root root 55 Jun 10 10:50 fedora
-rw-r--r-- 1 root root 30445939 Jun 10 10:51 fedora-filenames.solvx
-rw-r--r-- 1 root root 13632525 Jun 10 10:53 fedora.solv
-rw-r--r-- 1 root root 1039500 Jun 10 10:53 @System.solv
drwxr-xr-x 4 root root 55 Jun 10 10:53 updates
-rw-r--r-- 1 root root 4870515 Jun 10 10:53 updates-filenames.solvx
-rw-r--r-- 1 root root 1106996 Jun 10 10:53 updates.solv
drwxr-xr-x 4 root root 55 Jun 10 10:50 updates-testing
-rw-r--r-- 1 root root 11916983 Jun 10 10:50 updates-testing-filenames.solvx
-rw-r--r-- 1 root root 5348036 Jun 10 10:53 updates-testing.solv
-rw-r--r-- 1 root root 2882834 Jun 10 10:50 updates-testing-updateinfo.solvx
-rw-r--r-- 1 root root 680182 Jun 10 10:53 updates-updateinfo.solvx
-bash-4.3$

The second listing above is something I reconstructed from memory. By looking inside that directory with ls -l we saw that dnf was doing nothing anymore and idled there for over 30 minutes (not one byte changed in that time). We have seen this several times since we switched to Fedora 22 (27.05.2015). This is also what led us to the point of "no textual output, we can't see what's happening". It's like the entire dnf process came to a dead end. Neither top, ps nor other tools showed any activity. Only aborting dnf and starting it over again (a couple of times) made it continue working.

We wonder whether the character eating (in xterm, vte, xfce4-terminal) might be related to dnf becoming idle, because most of what happens within Linux is reading and writing of block and character devices. Some *condition* that builds up during a workday but disappears after a reboot or a login/logout (from X in my case above)... I cannot describe it more precisely because Fedora 22 is quite new and there may be a ton of *misbehaviours* being tracked by various people, packages going hand in hand with other packages etc.
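For the record, the "ls -l every couple of seconds" monitoring described above can be put into a small loop (just a sketch of what we did by hand; the path is the corrected cache location on the host and the interval is an example):

# watch the dnf metadata/cache directory for changes every few seconds
while true; do
    date
    ls -l /var/cache/dnf/i386/22
    sleep 5
done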
Ali, do you need to wait even if you don't set the `--installroot` switch? How long was the installed package list?
The entire installed package list... hmmm? Actually this happened while the metadata was being downloaded, so I am not sure whether dnf was processing any package list at that time. The package list was a normal Fedora 22 Workstation install, about 1800 packages, which is supposed to be stripped down to about 700 packages.

But I am still getting "stalls" in the middle of either downloading metadata or package updates with the current dnf. The last time was this morning, on a 100 Mbit line. I am not sure, but I believe it has something to do with bandwidth throttling, because it downloaded 10 updates or so and stalled at one. While all other packages came in at around 1-2 MB/s, this particular one was at around 100 kB/s or even less, and then it hung. I had to Ctrl+C and restart dnf a couple of times until it rushed through. No installroot was used this morning.

-bash-4.3$ dnf --version
1.1.0
  Installed: dnf-0:1.1.0-2.fc22.noarch at 2015-08-15 08:01
  Built    : Fedora Project at 2015-08-12 16:58

  Installed: rpm-0:4.12.0.1-12.fc22.i686 at 2015-06-30 15:25
  Built    : Fedora Project at 2015-06-29 10:34
-bash-4.3$
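Next time it stalls I will try to capture the process state as well; something like this should be enough (just a sketch of generic commands I would run, nothing dnf-specific):

# snapshot of a stalled dnf: process state, kernel wait channel, and its TCP connections
ps -o pid,stat,wchan:30,cmd -C dnf
ss -tnp | grep -i dnf    # run as root so ss can show process names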
Reporting back. This issue is still happening quite often.

-bash-4.3$ rpm -qa | grep -i "dnf" | sort
dnf-1.1.7-2.fc22.noarch
dnf-conf-1.1.7-2.fc22.noarch
dnf-plugins-core-0.1.17-1.fc22.noarch
dnf-yum-1.1.7-2.fc22.noarch
python-dnf-1.1.7-2.fc22.noarch
python-dnf-plugins-core-0.1.17-1.fc22.noarch
-bash-4.3$
This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component.
Fedora 22 changed to end-of-life (EOL) status on 2016-07-19. Fedora 22 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.