Bug 1471466

Summary: dnf-makecache.service hangs and blocks dnf/yumex-dnf
Product: [Fedora] Fedora
Reporter: Lars S. Jensen <lars.s.jensen>
Component: dnf
Assignee: rpm-software-management
Status: CLOSED DUPLICATE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Docs Contact:
Priority: unspecified
Version: 26
CC: dmach, mhatina, packaging-team-maint, pascal.ernster+bugzilla.redhat.com, rpm-software-management, vmukhame
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-07-19 11:17:34 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Lars S. Jensen 2017-07-16 06:22:47 UTC
Description of problem:
dnf/yumex-dnf hangs because the programs wait for dnf-makecache.service to finish and release the dnf lock.

Version-Release number of selected component (if applicable):
dnf-2.5.1-1.fc26.noarch

How reproducible:

Steps to Reproduce:
1. Run
dnf update
It hangs with:
Waiting for process with pid XXXXX to finish.
2. systemctl status dnf-makecache
  Loaded: loaded (/usr/lib/systemd/system/dnf-makecache.service; static; vendor
  Active: activating (start) since Sun 2017-07-16 06:49:08 CEST; 56min ago
Main PID: 17850 (dnf)
   Tasks: 1 (limit: 4915)
   CGroup: /system.slice/dnf-makecache.service
           └─17850 /usr/libexec/system-python /usr/bin/dnf makecache timer
3. Restart the service:
systemctl restart dnf-makecache
and now systemctl restart hangs too :-(
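
The state shown in step 2 can also be checked from a script. The following is only a minimal sketch, assuming the "Active:" line format shown in the status output above; the function name is_stuck_activating is hypothetical and not part of dnf or systemd. It reads systemctl status output on stdin and succeeds only while the unit is still in "activating (start)".

```shell
#!/bin/sh
# Hypothetical helper (not part of dnf or systemd): succeeds if the status
# text on stdin has an "Active:" line reporting "activating (start)".
is_stuck_activating() {
    grep -Eq '^[[:space:]]*Active:[[:space:]]+activating \(start\)'
}

# Possible usage (needs a running systemd):
#   systemctl status dnf-makecache | is_stuck_activating && echo "still starting"
```

This only reports the state. It does not say how long the unit has been in it, so a caller would still have to compare the "since" timestamp against some threshold of its own.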

Additional info:
       dnf [options] makecache --timer

Change line 14 in /usr/lib/systemd/system/dnf-makecache.service:
-ExecStart=/usr/bin/dnf makecache timer
+ExecStart=/usr/bin/dnf makecache --timer

Then
systemctl restart dnf-makecache
works, and
systemctl status dnf-makecache
shows:
   Active: inactive (dead) since Sun 2017-07-16 08:01:10 CEST; 9min ago
  Process: 18950 ExecStart=/usr/bin/dnf makecache --timer (code=exited, status=0
 Main PID: 18950 (code=exited, status=0/SUCCESS)

Comment 1 Lars S. Jensen 2017-07-16 14:48:36 UTC
Support for the old syntax was restored in dnf-2.5.1-1.fc26 or earlier, so the change above was not the solution.

It still hangs. I have seen this on 2 machines, and it does not reproduce every time the service runs:

systemctl status  dnf-makecache
● dnf-makecache.service - dnf makecache
   Loaded: loaded (/usr/lib/systemd/system/dnf-makecache.service; static; vendor
   Active: activating (start) since Sun 2017-07-16 15:01:13 CEST; 1h 4min ago
 Main PID: 27522 (dnf)
    Tasks: 1 (limit: 4915)
   CGroup: /system.slice/dnf-makecache.service
           └─27522 /usr/libexec/system-python /usr/bin/dnf makecache --timer

It should have a timeout so it doesn't block yumex-dnf in case of an issue.
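
One way such a timeout could be tried locally is a systemd drop-in that sets TimeoutStartSec for the unit. This is a sketch under assumptions, not the fix adopted for this bug: the helper name write_timeout_dropin, the file name timeout.conf and the 600 second value are all hypothetical. It bounds how long the service can hold the dnf lock but does not address why dnf hangs.

```shell
#!/bin/sh
# Hypothetical helper: write a drop-in that sets TimeoutStartSec for
# dnf-makecache.service. The directory and the timeout are arguments so the
# sketch can be tried in a scratch directory before touching /etc.
write_timeout_dropin() {
    dropin_dir=$1      # e.g. /etc/systemd/system/dnf-makecache.service.d
    timeout_sec=$2     # assumed value, e.g. 600; not taken from this report
    mkdir -p "$dropin_dir" || return 1
    printf '[Service]\nTimeoutStartSec=%s\n' "$timeout_sec" \
        > "$dropin_dir/timeout.conf"
}

# Possible usage as root, followed by a reload so systemd sees the drop-in:
#   write_timeout_dropin /etc/systemd/system/dnf-makecache.service.d 600
#   systemctl daemon-reload
```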

Comment 2 Pascal Ernster 2017-07-18 15:57:29 UTC
I see this as well on several Fedora 25 machines.

According to "strace -f -p $PID" (with $PID being the pid that "systemctl status dnf-makecache" gives me), it hangs with FUTEX_WAIT_PRIVATE.
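
For anyone who wants to repeat the strace step, the PID can be pulled out of the status output instead of being copied by hand. A minimal sketch, assuming the "Main PID:" line format shown in the status output earlier in this report; main_pid_from_status is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the number from the first "Main PID:" line of
# `systemctl status` output read on stdin.
main_pid_from_status() {
    sed -n 's/^[[:space:]]*Main PID:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Possible usage (needs root and a running dnf-makecache.service):
#   strace -f -p "$(systemctl status dnf-makecache | main_pid_from_status)"
```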

A first "systemctl restart dnf-makecache", cancelled by Ctrl+C after a few seconds and then followed by a second "systemctl restart dnf-makecache", temporarily fixes the problem until the corresponding systemd timer triggers the next time.

Comment 3 Pascal Ernster 2017-07-18 16:05:06 UTC
Just to be complete: the affected package version on my Fedora 25 machines is dnf 0:1.1.10-6.fc25. I installed these machines quite recently, though, so I don't know whether previous versions of the dnf package are affected as well.

Comment 4 Igor Gnatenko 2017-07-19 11:17:34 UTC

*** This bug has been marked as a duplicate of bug 1470352 ***