Bug 1337977 - When /var is a separate filesystem, File-based locking initialization fails due to inability to create /var/lock/lvm
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: lvm2
Version: 6.8
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Peter Rajnoha
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks: 1269194
Reported: 2016-05-20 15:10 UTC by John Pittman
Modified: 2020-03-11 15:07 UTC (History)
CC List: 18 users

Fixed In Version: lvm2-2.02.143-9.el6
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-03-21 12:02:59 UTC
Target Upstream Version:
Embargoed:


Attachments
vgchange -vvvv command showing failure (84.00 KB, text/plain)
2016-06-20 20:05 UTC, John Pittman
'lvmdump -a' from all systems. (125.92 KB, application/x-gzip)
2016-06-21 14:55 UTC, John Pittman


Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 2333821 0 None None None 2016-05-23 13:55:05 UTC
Red Hat Product Errata RHBA-2017:0798 0 normal SHIPPED_LIVE lvm2 bug fix update 2017-03-21 12:51:51 UTC

Description John Pittman 2016-05-20 15:10:07 UTC
Description of problem:

When /var is a separate filesystem, File-based locking initialization fails due to inability to create /var/lock/lvm.  Issue can be worked around by keeping /var in the same filesystem as /, or downgrading lvm and dependents from levels mentioned below.

Version-Release number of selected component (if applicable):

lvm2-2.02.143-7.el6.x86_64
lvm2-libs-2.02.143-7.el6.x86_64
#Additional packages provided just in case
device-mapper-multipath-libs-0.4.9-93.el6.x86_64
device-mapper-libs-1.02.117-7.el6.x86_64
device-mapper-persistent-data-0.6.2-0.1.rc7.el6.x86_64
device-mapper-1.02.117-7.el6.x86_64
device-mapper-event-1.02.117-7.el6.x86_64
device-mapper-event-libs-1.02.117-7.el6.x86_64
kernel-2.6.32-642.el6.x86_64

How reproducible:

1. Create RHEL 6.7 system with /var as separate filesystem
2. Upgrade to RHEL 6.8
3. Reboot
4. Issue will be shown in /var/log/boot.log

Actual results:

From /var/log/boot.log with lvm verbosity at 1:

Setting up Logical Volume Management:     Logging initialised at Fri May 20 10:54:58 2016
    Set umask from 0022 to 0077
    Creating directory "/var/lock/lvm"
  Failed to create directory /var/lock/lvm.
    File-based locking initialisation failed.
    Locking disabled - only read operations permitted.
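
Editorial note: for checking several systems for this symptom, here is a minimal Python sketch that scans boot-log text for the three messages shown above. The function name `find_lvm_locking_failures` and the approach of matching on plain substrings are assumptions made for illustration and are not part of lvm2; only the message strings come from the log excerpt in this report.

```python
# Substrings copied from the boot.log excerpt in this report.
FAILURE_MARKERS = (
    "Failed to create directory",
    "File-based locking initialisation failed",
    "Locking disabled - only read operations permitted",
)


def find_lvm_locking_failures(log_text):
    """Return (line_number, stripped_line) for every line containing a marker."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(marker in line for marker in FAILURE_MARKERS):
            hits.append((number, line.strip()))
    return hits


# Inline sample so the sketch runs without touching a real log file.
SAMPLE_LOG = (
    "Setting up Logical Volume Management:     Logging initialised\n"
    '    Creating directory "/var/lock/lvm"\n'
    "  Failed to create directory /var/lock/lvm.\n"
    "    File-based locking initialisation failed.\n"
)

for number, line in find_lvm_locking_failures(SAMPLE_LOG):
    print(f"{number}: {line}")
```

On a real system the text would presumably be read from /var/log/boot.log, the file named in this report, and passed to the same function.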

Expected results:

Initialization should succeed

Additional info:

I have a test system recreation and will be glad to provide any information needed.  Please let me know.

[root@localhost ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup-LogVol00
                      3.8G  2.0G  1.7G  55% /
tmpfs                 495M     0  495M   0% /dev/shm
/dev/sda1             477M   67M  385M  15% /boot
/dev/mapper/VolGroup-LogVol01
                      969M   93M  826M  11% /home
/dev/mapper/VolGroup-LogVol02
                      673M  716K  638M   1% /tmp
/dev/mapper/VolGroup-LogVol03
                      969M  586M  333M  64% /var
/dev/mapper/VolGroup-LogVol04
                      283M  2.2M  266M   1% /var/log/audit

Comment 1 Zdenek Kabelac 2016-05-20 15:25:27 UTC
You can point the locking dir anywhere you want in /etc/lvm/lvm.conf:

global { locking_dir = "/run/lock" }


It's worth noting that recent releases of lvm2 already use this location, and if you install the new 'lvm.conf', this location should be there by default.
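
Editorial note: to see which locking dir a given lvm.conf actually sets, a rough Python sketch follows. It is not the real LVM config parser. The function name `read_locking_dir`, the regular expression, and the choice to let the last uncommented assignment win are all assumptions made for illustration; only the `locking_dir` key and the `global { ... }` form come from the comment above.

```python
import re

# Matches an uncommented `locking_dir = "..."` assignment at the start of a
# line (leading whitespace allowed), optionally preceded by `global {` on the
# same line. Lines starting with `#` do not match. NOT the real LVM parser.
_LOCKING_DIR = re.compile(r'^\s*(?:global\s*\{\s*)?locking_dir\s*=\s*"([^"]*)"')


def read_locking_dir(conf_text):
    """Return the last uncommented locking_dir value, or None if none is set."""
    value = None
    for line in conf_text.splitlines():
        match = _LOCKING_DIR.match(line)
        if match:
            value = match.group(1)  # assumption: the last assignment wins
    return value


# Inline sample so the sketch runs without reading /etc/lvm/lvm.conf.
SAMPLE_CONF = (
    "global {\n"
    '    # locking_dir = "/var/lock/lvm"\n'
    '    locking_dir = "/run/lock"\n'
    "}\n"
)

print(read_locking_dir(SAMPLE_CONF))
```

A result of None only means this sketch found no explicit assignment; it says nothing about which built-in default lvm2 would then use.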

Comment 2 Alasdair Kergon 2016-05-20 21:19:33 UTC
What command failed and in what context and what is the locking protecting?

/var/lock/lvm is installed by rpm so how is it missing?  (/var/lock is installed by 'filesystem' rpm.)

Is the error from running lvm before /var is mounted, or was the rpm installed without /var mounted?

Comment 3 John Pittman 2016-05-25 19:22:30 UTC
Thanks for the workaround, Zdenek. One of the guys wrote a public article detailing the steps to switch the locking_dir in case customers see the message.

To add to my original 'How reproducible': the issue can be reproduced on a fresh install of 6.8. The only criterion is that /var is a separate filesystem; upgrading from 6.7 to 6.8 is not a requirement.

Searching showed that the 'Failed to create directory' comes from the function dm_create_dir in libdm/libdm-file.c.  It calls _create_dir_recursive which runs 3 mkdir commands in this case.  Showing the return and errno of each mkdir below with log_verbose messages marked as RH_DEBUG.

    Creating directory "/var/lock/lvm"
    RH_DEBUG: First rc is -1 and errno is 17
    RH_DEBUG: First rc is -1 and errno is 30
    RH_DEBUG: Second rc is -1 and errno is 2

All return codes from mkdir indicate failure.  For /var we get 17 which is EEXIST, for /var/lock we get 30 which is EROFS, and for /var/lock/lvm we get 2 which is ENOENT.

Also, for completeness, the return code of _create_dir_recursive was indeed 0 which satisfied the condition to give us the failure message.

    RH_DEBUG: Return of _create_dir_recursive is 0.

So as far as I can tell, it does look as if the trouble is due to /var not being mounted rw at the time we try to create the directories.
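
Editorial note: the errno sequence above can be replayed with a small Python sketch. This is not libdm's `_create_dir_recursive`; it is a simplified stand-in that assumes one mkdir per path component, tolerates EEXIST and EROFS on intermediate components, and reports success only if the final component was created or already existed. The names `create_dir_recursive` and `fake_mkdir_from_report` are hypothetical, and only absolute paths are handled.

```python
import errno
import os


def create_dir_recursive(path, mkdir=os.mkdir):
    """Try mkdir on each component of an absolute path and record the result.

    Returns (ok, trace), where trace is a list of (component, errno name or
    None). ok is True only if the final component was created or existed.
    """
    parts = [p for p in path.split("/") if p]
    trace = []
    ok = False
    current = ""
    for index, part in enumerate(parts):
        current += "/" + part
        is_last = index == len(parts) - 1
        try:
            mkdir(current)
            code = None
        except OSError as exc:
            code = exc.errno
        name = errno.errorcode.get(code) if code is not None else None
        trace.append((current, name))
        if is_last:
            ok = code in (None, errno.EEXIST)
        elif code not in (None, errno.EEXIST, errno.EROFS):
            break  # an intermediate failure we do not tolerate
    return ok, trace


def fake_mkdir_from_report(path):
    """Stand-in for mkdir that replays the errno values seen in this report."""
    replay = {
        "/var": errno.EEXIST,
        "/var/lock": errno.EROFS,
        "/var/lock/lvm": errno.ENOENT,
    }
    raise OSError(replay[path], os.strerror(replay[path]), path)


ok, trace = create_dir_recursive("/var/lock/lvm", mkdir=fake_mkdir_from_report)
print(ok, trace)
```

Injecting the mkdir function keeps the sketch runnable without root or a read-only mount. With the replayed values the overall result is a failure, which matches the 0 return that triggers the error message.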

That's as far as I've gotten for now, will post more if I'm able.

John

Comment 4 Zdenek Kabelac 2016-06-03 10:20:47 UTC
(In reply to John Pittman from comment #3)
> Thanks for the workaround Zdenek; One of the guys made a public article
> detailing the steps to switch the locking_dir in case customers see the
> message.
> 

Note: it's not a 'workaround'. The locking dir is a configurable setting, so when a user does something 'unusual' like using a separately mounted /var, he likely needs to 'reconfigure' other parts of his system as well.

/var/lock used to be regular content of the root filesystem; the only 'mountable' part used to be the /usr dir.

It further changes with RHEL7 and usrmove.

Recent versions of lvm2 should automatically pick /run/lock.

Closing as not a bug - it's just a configuration issue.

Comment 7 Alasdair Kergon 2016-06-03 15:02:41 UTC
(In reply to Alasdair Kergon from comment #2)
> Is the error from running lvm before /var is mounted

So the answer was 'yes'.

But this key question remains unanswered:

> What command failed and in what context and what is the locking protecting?

In other words, is the lock *necessary* at that exact point during booting or is it harmless if it gets skipped silently?

Also, exactly which change between 6.7 and 6.8 caused this?

Comment 8 John Pittman 2016-06-15 16:25:59 UTC
Hi Alasdair, sorry for the wait.  It looks as if the addition of logging an error based on the return of _create_dir_recursive is the only reason we're seeing this now.  

A snip from the relevant diff between 143 and 118:

diff LVM2.2.02.143/libdm/libdm-file.c ../temp4/LVM2.2.02.118/libdm/libdm-file.c
---
> 	if (stat(dir, &info) < 0)
> 		return _create_dir_recursive(dir);
100,103c72,73
< 	if (!_create_dir_recursive(dir)) {
< 		log_error("Failed to create directory %s.", dir);
< 		return 0;
< 	}
---

From LVM2.2.02.143/WHATS_NEW_DM

Version 1.02.109 - 22nd September 2015
======================================
 .... snip
  Check dir path components are valid if using dm_create_dir, error out if not.
 .... snip

https://www.redhat.com/archives/lvm-devel/2015-September/msg00120.html

The locking protection selected is read-only locking, so we do seem to have protection at the time.

dracut: Setting global/locking_type to 4
........
dracut: Logging initialised at Wed Jun 15 14:00:21 2016
dracut: Set umask from 0022 to 0077
dracut: Read-only locking selected. Only read operations permitted.

Stated in lvm.conf:

#   4
#     LVM uses read-only locking which forbids any operations that 
#     might change metadata

If I understand you correctly, the actual command that is failing is the mkdir within _create_dir_recursive.  The final mkdir fails because the second-to-last one failed, and that one failed because the /var filesystem is not mounted read-write.

Code path for reference:

init_locking --> init_file_locking --> dm_create_dir --> _create_dir_recursive --> mkdir(dir, 0777)

So the file-based lock does not seem to be necessary at that point; I would think it's OK to skip it silently.

There were points in my research where I was unsure, like trying to find when we actually do end up setting locking type to 1 for the first time and creating the lock dir.

I hope that helps.

John

Comment 9 Zdenek Kabelac 2016-06-16 09:33:07 UTC
locking_type=4 should not be creating any locking dir.

Could you please attach a full -vvvv trace of the failing command?

Comment 10 Zdenek Kabelac 2016-06-16 09:49:24 UTC
Note: when locking_type is 1 and creation of the locking dir fails, it should fall back to read-only locking mode when --ignorelockingfailure is specified.

Moreover, failing on lock-dir creation means /var would have to be sitting on a read-only filesystem?
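
Editorial note: the fallback rule discussed in this thread can be written down as a small decision function. This Python sketch only models what the comments here state (locking_type 4 is read-only and needs no lock dir; locking_type 1 is file-based and should fall back to read-only when the lock dir cannot be created and --ignorelockingfailure is given). The function name, parameter names, and return strings are hypothetical, and this is not lvm2's actual init_locking logic.

```python
def select_locking_mode(locking_type, lock_dir_ok, ignore_locking_failure=False):
    """Return "file", "read-only", or "failed" for the cases in this thread.

    Only locking_type 1 (file-based) and 4 (read-only) are modelled; other
    values raise ValueError because this thread does not describe them.
    """
    if locking_type == 4:
        # Read-only locking never needs the lock directory.
        return "read-only"
    if locking_type == 1:
        if lock_dir_ok:
            return "file"
        # Lock dir could not be created: fall back only when asked to.
        return "read-only" if ignore_locking_failure else "failed"
    raise ValueError(f"locking_type {locking_type} is not covered by this sketch")


# The situation in this report: file-based locking requested, /var read-only.
print(select_locking_mode(1, lock_dir_ok=False, ignore_locking_failure=True))
print(select_locking_mode(1, lock_dir_ok=False, ignore_locking_failure=False))
```

Keeping the decision separate from the directory creation makes it easy to see that, in the fallback case, there is no remaining reason to report the directory failure as an error.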

Comment 11 John Pittman 2016-06-20 20:05:42 UTC
Created attachment 1169982 [details]
vgchange -vvvv command showing failure

Comment 12 John Pittman 2016-06-20 20:06:29 UTC
Command showing the failure was '/sbin/lvm vgchange -a ay --sysinit --ignoreskippedcluster' from /etc/rc.d/rc.sysinit.  File vgchange.out attached with -vvvv output.

Comment 13 Zdenek Kabelac 2016-06-21 08:50:59 UTC
(In reply to John Pittman from comment #12)
> Command showing the failure was '/sbin/lvm vgchange -a ay --sysinit
> --ignoreskippedcluster' from /etc/rc.d/rc.sysinit.  File vgchange.out
> attached with -vvvv output.

Yep, it's being used with 'locking_type=1'.
So if the user wants to activate devices on a not yet fully initialized system, he may use locking_type=4 for this initialization, or --ignorelockingfailure, whichever fits better.

RHEL7.X is using /run  path.

RHEL6.X is reconfigurable via lvm.conf when the user needs it.

It's unclear how that could ever have worked in RHEL6.7; AFAIK this behavior is consistent and has not changed.


So is the user suggesting the system worked without any changes in 6.7?

We would need to see 'lvmdump -a' from both systems.

Also note  'vgchange -ay'  !=  'vgchange -aay'.

Comment 14 John Pittman 2016-06-21 14:54:48 UTC
Thanks, Zdenek. In 6.7, from what I understand of the issue, it worked the same; we just didn't know about the lock dir creation failure because we didn't log it.

Attached:  lvmdump.tgz

Comment 15 John Pittman 2016-06-21 14:55:35 UTC
Created attachment 1170314 [details]
'lvmdump -a' from all systems.

Comment 16 John Pittman 2016-06-29 13:05:14 UTC
Just checking in here.  Any change after I provided the latest info?  Are we going to suppress the message?

Comment 17 Alasdair Kergon 2016-06-29 13:17:04 UTC
I'm too busy to look at this at the moment.  I'm not sure my questions have been answered yet.

It sounds like you identified this commit?

> commit 6c0b4a2769067048fa144814e298a3272564c475
> Author: Peter Rajnoha <prajnoha>
> Date:   Thu Sep 17 14:29:51 2015 +0200

If the filesystem is mounted readonly and the code has fallen back to a readonly locking mode there should certainly be no error messages appearing as this is a fully-supported configuration.

Comment 18 Peter Rajnoha 2016-06-29 14:25:05 UTC
I've removed the "Failed to create directory" message from dm_create_dir function - there are detailed messages printed inside _create_dir_recursive which dm_create_dir calls (and which handles EROFS case like anywhere else in the code):

https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=eac0706761e628532ffcd27f3e4d7fa559a5f818

Comment 24 Roman Bednář 2016-11-24 14:39:25 UTC
Marking verified. The error is no longer shown during boot while having /var as a separate file system and locking dir set to "/var/lock/lvm". 

Changing locking_dir to "/run/lock" also removes the error message on affected systems,
as described in this article: https://access.redhat.com/solutions/2333821

# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_virt137-lv_root
                      7.0G  2.1G  4.6G  32% /
tmpfs                 499M     0  499M   0% /dev/shm
/dev/vda1             477M   37M  416M   8% /boot
/dev/mapper/vg_virt137-lv_var
                      847M  461M  343M  58% /var

# egrep "locking_type|locking_dir" /etc/lvm/lvm.conf | egrep -v "^\s*#"
    locking_type = 3
    locking_dir = "/var/lock/lvm"
------------------------------------------------------------------------------------------------

RHEL6.7, no errors in boot log while having /var as separate file system:

lvm2-2.02.118-3.el6_7.4   

# grep -i "logical volume management" /var/log/boot.log
Setting up Logical Volume Management:   2 logical volume(s) in volume group "vg_virt137" now active
------------------------------------------------------------------------------------------------

After update to RHEL6.8 or equivalent lvm2 package 
the error message is present in boot.log:

lvm2-2.02.143-7.el6    

# grep -i "logical volume management" /var/log/boot.log
Setting up Logical Volume Management:   Failed to create directory /var/lock/lvm.
------------------------------------------------------------------------------------------------

After fix:

# grep -i "logical volume management" /var/log/boot.log
Setting up Logical Volume Management:   2 logical volume(s) in volume group "vg_virt137" now active



Tested with:
2.6.32-573.35.2.el6.x86_64

lvm2-2.02.143-9.el6    BUILT: Thu Nov 10 10:21:10 CET 2016
lvm2-libs-2.02.143-9.el6    BUILT: Thu Nov 10 10:21:10 CET 2016
lvm2-cluster-2.02.143-9.el6    BUILT: Thu Nov 10 10:21:10 CET 2016
udev-147-2.63.el6_7.1    BUILT: Thu Nov 12 17:11:28 CET 2015
device-mapper-1.02.117-9.el6    BUILT: Thu Nov 10 10:21:10 CET 2016
device-mapper-libs-1.02.117-9.el6    BUILT: Thu Nov 10 10:21:10 CET 2016
device-mapper-event-1.02.117-9.el6    BUILT: Thu Nov 10 10:21:10 CET 2016
device-mapper-event-libs-1.02.117-9.el6    BUILT: Thu Nov 10 10:21:10 CET 2016
device-mapper-persistent-data-0.6.2-0.1.rc7.el6    BUILT: Tue Mar 22 14:58:09 CET 2016
cmirror-2.02.143-9.el6    BUILT: Thu Nov 10 10:21:10 CET 2016

Comment 26 errata-xmlrpc 2017-03-21 12:02:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0798.html

