Bug 1581824 - Malformed lock file found: /var/cache/dnf/metadata_lock.pid.
Summary: Malformed lock file found: /var/cache/dnf/metadata_lock.pid.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: dnf
Version: rawhide
Hardware: All
OS: Linux
unspecified
urgent
Target Milestone: ---
Assignee: Jaroslav Rohel
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2018-05-23 17:11 UTC by Brian J. Murrell
Modified: 2019-12-02 10:05 UTC (History)
11 users (show)

Fixed In Version: dnf-4.2.7-2.fc30
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-07-30 01:14:46 UTC
Type: Bug


Attachments (Terms of Use)

Description Brian J. Murrell 2018-05-23 17:11:15 UTC
Description of problem:
We seem to be hitting occurrences of:

# dnf install --allowerasing -y <package list>

Malformed lock file found: /var/cache/dnf/metadata_lock.pid.
Ensure no other dnf process is running and remove the lock file manually or run systemd-tmpfiles --remove dnf.conf.

This seems to happen most often when a machine loses power unexpectedly.  This might seem like it should be a rare event but it happens more frequently in HA configurations where nodes need to be STONITHed for the integrity of the cluster.

I suspect that what happens is that the lock is taken by a running dnf process but before the data (i.e. the pid) can be flushed to the file, the machine is powered off and on again.  Once booted up, the filesystem journal is replayed creating the file but without it's contents.

What results is an empty lock file which DNF simply refuses to handle (https://github.com/rpm-software-management/dnf/blob/e4f3e91480647146b3bce542cccd2f675a5bc8e2/dnf/lock.py#L86-L104):

    def _try_read_lock(self):
        try:
            with open(self.target, 'r') as f:
                return int(f.readline())
        except IOError:
            return -1
        except ValueError:
            time.sleep(2)
            try:
                with open(self.target, 'r') as f:
                    return int(f.readline())
            except IOError:
                return -1
            except ValueError:
                msg = _('Malformed lock file found: %s.\n'
                        'Ensure no other dnf process is running and '
                        'remove the lock file manually or run '
                        'systemd-tmpfiles --remove dnf.conf.' % (self.target))
                raise LockError(msg)

I wonder if this is the best we can do.  Perhaps a system boot job to remove any stale locks is needed?

Any other ideas?

Comment 1 Jan Beran 2019-05-07 11:09:59 UTC
PR: https://github.com/rpm-software-management/dnf/pull/1393

The patch ensures that the empty lock file is not considered as the malformed lock file, and handles the situation as the process is not locked, i.e. writes the actual pid in the lock file.

Comment 2 Jan Beran 2019-05-16 09:28:36 UTC
Jaroslav continues to make a locking improvement, assigned to him.

Comment 3 Jaroslav Rohel 2019-05-29 11:51:36 UTC
Fixed in upstream. PR: https://github.com/rpm-software-management/dnf/pull/1398

Comment 4 Fedora Update System 2019-07-04 13:50:24 UTC
FEDORA-2019-58c2d3f1aa has been submitted as an update to Fedora 30. https://bodhi.fedoraproject.org/updates/FEDORA-2019-58c2d3f1aa

Comment 5 Fedora Update System 2019-07-05 00:45:57 UTC
dnf-4.2.7-1.fc30, libdnf-0.35.1-1.fc30 has been pushed to the Fedora 30 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-58c2d3f1aa

Comment 6 Fedora Update System 2019-07-23 07:21:20 UTC
FEDORA-2019-672a74d688 has been submitted as an update to Fedora 30. https://bodhi.fedoraproject.org/updates/FEDORA-2019-672a74d688

Comment 7 Fedora Update System 2019-07-24 01:44:15 UTC
dnf-4.2.7-2.fc30, libdnf-0.35.1-2.fc30 has been pushed to the Fedora 30 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-672a74d688

Comment 8 Fedora Update System 2019-07-30 01:14:46 UTC
dnf-4.2.7-2.fc30, libdnf-0.35.1-2.fc30 has been pushed to the Fedora 30 stable repository. If problems still persist, please make note of it in this bug report.


Note You need to log in before you can comment on or make changes to this bug.