Bug 1369741 - pvs crashed on rhel 6.8
Summary: pvs crashed on rhel 6.8
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: lvm2
Version: 6.8
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: LVM and device-mapper development team
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-08-24 09:45 UTC by nikhil kshirsagar
Modified: 2019-12-16 06:28 UTC (History)
11 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-06-07 22:10:11 UTC
Target Upstream Version:
Embargoed:


Attachments
core (15.62 MB, application/x-core)
2016-08-24 09:51 UTC, nikhil kshirsagar

Description nikhil kshirsagar 2016-08-24 09:45:12 UTC
Description of problem:
When the user executed the "pvs" command, it failed and created a core dump. The user says:

We are not getting a core dump every time the "pvs" command is executed; it has only occurred once. I can execute the "pvs" command on the server now and it works fine. But we are concerned about why it failed, as our automation scripts depend heavily on this command's output.


Version-Release number of selected component (if applicable):

lvm2-2.02.143-7.el6.x86_64                                  
                             
How reproducible:
Does not reproduce


Additional info:

Valgrind shows:

[root@GBLDNSRV9TL4002 ~]# valgrind pvs >> /tmp/pvs_valgrind.out
==11132== Memcheck, a memory error detector
==11132== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==11132== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==11132== Command: pvs
==11132==
==11132== Warning: noted but unhandled ioctl 0x127b with no size/direction hints
==11132==    This could cause spurious value errors to appear.
==11132==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==11132== Warning: noted but unhandled ioctl 0x127b with no size/direction hints
==11132==    This could cause spurious value errors to appear.
==11132==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==11132== Warning: noted but unhandled ioctl 0x127b with no size/direction hints
==11132==    This could cause spurious value errors to appear.
==11132==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==11132==
==11132== HEAP SUMMARY:
==11132==     in use at exit: 2,410 bytes in 18 blocks
==11132==   total heap usage: 188,909 allocs, 188,891 frees, 56,102,138 bytes allocated
==11132==
==11132== LEAK SUMMARY:
==11132==    definitely lost: 1,146 bytes in 12 blocks
==11132==    indirectly lost: 0 bytes in 0 blocks
==11132==      possibly lost: 0 bytes in 0 blocks
==11132==    still reachable: 1,264 bytes in 6 blocks
==11132==         suppressed: 0 bytes in 0 blocks
==11132== Rerun with --leak-check=full to see details of leaked memory
==11132==
==11132== For counts of detected and suppressed errors, rerun with: -v
==11132== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 10 from 6)
[root@GBLDNSRV9TL4002 ~]#
[root@GBLDNSRV9TL4002 ~]# valgrind
valgrind           valgrind-listener
[root@GBLDNSRV9TL4002 ~]# less /tmp/pvs_valgrind.out
[root@GBLDNSRV9TL4002 ~]#
[root@GBLDNSRV9TL4002 ~]# pvs -v >> /tmp/pvs_vv.out
    Using physical volume(s) on command line.
    Wiping cache of LVM-capable devices
    Wiping internal VG cache
[root@GBLDNSRV9TL4002 ~]#

Comment 3 nikhil kshirsagar 2016-08-24 09:51:48 UTC
Created attachment 1193560 [details]
core

Comment 7 David Teigland 2016-08-24 16:22:55 UTC
Thank you Nikhil. The dev_name issue is because the entire info struct is garbage.

In _vg_read_orphans(), here is the vginfo:

(gdb) p *vginfo
$17 = {list = {n = 0x7ffff8207970, p = 0x7ffff82da8e0}, infos = {
    n = 0x7ffff82da9e0, p = 0x7ffff82da9e0}, fmt = 0x7ffff8253a80, 
  vgname = 0x7ffff82daaa0 "#orphans_lvm2", status = 0, 
  vgid = "#orphans_lvm2", '\000' <repeats 19 times>, 
  _padding = "\000\000\000\000\000\000", next = 0x0, creation_host = 0x0, 
  system_id = 0x0, lock_type = 0x0, mda_checksum = 0, mda_size = 0, 
  vgmetadata_size = 0, vgmetadata = 0x0, cft = 0x0, cached_vg = 0x0, 
  holders = 0, vg_use_count = 0, precommitted = 0, cached_vg_invalidated = 0, 
  preferred_duplicates = 0}

It has an empty vginfo->infos list.

However, lvmcache_foreach_pv() is in the midst of working through the vginfo->infos list (inconsistent with an empty list).  At the time of the core dump, it has called _vg_read_orphan_pv(), passing it an info struct containing garbage.  (The dev_name segfault just happens to be the first place that uses the garbage info.)

Perhaps the prior call to _vg_read_orphan_pv() has freed and reinitialized vginfo->infos.  This would explain the empty vginfo->infos list and the garbage info struct.  It's not clear why this would happen.
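
To illustrate the suspected failure mode, here is a minimal, self-contained C sketch (not lvm2 code; all names are hypothetical) of an intrusive list being freed and re-initialized while a caller is still walking it, which leaves the iterator on a stale node full of garbage:

/* sketch.c -- minimal model of the suspected bug, NOT lvm2 code.
 * An intrusive doubly linked list (shaped like dm_list) is torn down
 * and re-initialized while an iteration over it is still in progress,
 * so the iterator keeps a pointer into freed memory. */
#include <stdlib.h>

struct node_list { struct node_list *n, *p; };

struct info {
        struct node_list list;  /* embedded list node, like lvmcache_info */
        void *dev;              /* garbage once the node is freed */
};

static void list_init(struct node_list *head) { head->n = head->p = head; }

int main(void)
{
        struct node_list head;
        struct info *a = calloc(1, sizeof(*a));

        list_init(&head);
        a->list.n = &head;      /* insert the single entry */
        a->list.p = &head;
        head.n = &a->list;
        head.p = &a->list;

        struct node_list *cur = head.n;   /* iteration grabs the entry */

        /* a callback run during the iteration drops everything and
         * re-creates the list: the head now looks empty ... */
        free(a);
        list_init(&head);

        /* ... but the iterator still dereferences the freed node:
         * undefined behaviour, matching the "empty vginfo->infos yet a
         * garbage info struct" picture described above */
        struct info *stale = (struct info *)cur;  /* node is first member */
        return stale->dev != NULL;
}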

Had they just upgraded lvm on this system?  Since this is not very repeatable, one theory is that it's caused by _check_or_repair_orphan_pv_ext(), which is new to the code, and would run some repair code the first time it saw a PV.  I wonder if they happened to notice the warning from that code (lvm really needs to log these unusual events in syslog):

WARNING: Repairing Physical Volume %s that is in Volume Group %s but not marked as used.

Comment 8 nikhil kshirsagar 2016-08-24 16:34:35 UTC
I can't see the "Repairing Physical Volume" message in the sosreport; let me check with them whether they saw anything like that. Are the ioctl-related warnings in the valgrind report benign, then?

I will also check if they had just upgraded lvm.

Comment 9 David Teigland 2016-08-24 16:36:08 UTC
Sorry, the theory above (upgraded lvm fixing PVs) doesn't work; I was mixing up _check_or_repair_orphan_pv_ext() and _check_or_repair_pv_ext().

Comment 10 nikhil kshirsagar 2016-08-24 16:38:04 UTC
All right, let me know if you need anything else to understand why this happened and what the correct fix is. If I may make a naive suggestion, we could technically check whether the p and n pointers are the same in structures like the info or the dm_list.

Comment 11 David Teigland 2016-08-24 18:51:33 UTC
I'm not very good at this, and maybe I'm reading the gdb info wrong, but I'm looking at this vginfo->infos value, and thinking that the list is empty?

infos = {
    n = 0x7ffff82da9e0, p = 0x7ffff82da9e0}

I'm trying to figure out if there are supposed to be any orphan PVs on this system.  If things haven't changed, collecting an ordinary 'pvs -vvvv' from this system might be a good reference.

I'm not sure what vg->pvs is supposed to hold in the case of the orphans VG, i.e. is it supposed to match vginfo->infos?  Looking at *vg, we see pv_count is 1, which seems to mean there's one orphan PV, but then the pvs list looks empty, if I'm reading this right?
pvs = { n = 0x7ffff8261260, p = 0x7ffff8261260}
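
For reference, an empty dm_list is one whose n and p pointers both point back at the list head itself, so one way to settle this from the same core (assuming the symbols are still loadable) is to compare the stored pointers with the address of the head:

(gdb) p &vginfo->infos
(gdb) p vginfo->infos.n
(gdb) p &vg->pvs
(gdb) p vg->pvs.n

If the printed n value equals the address of the list head, that list really is empty; if n and p merely equal each other but differ from the head's address, the list holds exactly one entry.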

Comment 15 David Teigland 2016-08-26 16:10:40 UTC
The code path in question is:

_vg_read_orphans()

lvmcache_foreach_pv(vginfo, _vg_read_orphan_pv, &baton)

dm_list_iterate_items(info, &vginfo->infos)
  _vg_read_orphan_pv(info)

pv = _pv_read(b->vg->cmd, b->vg->vgmem, dev_name(lvmcache_device(info)), b->vg->fid, b->warn_flags, 0);

And the segfault is in that dev_name().

lvmcache_device(info) is just info->dev.

info seems to be a valid pointer, but it contains garbage, including the dev value.

> So, is (dev && dev->aliases.n)  the crashing statement, or is it
> dm_list_item(dev->aliases.n, struct dm_str_list)->str

I'm not sure it really matters, since the dev struct is junk, but I would think it's segfaulting on dev->aliases.
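
For context, the function in question looks roughly like this (a paraphrase built from the fragments quoted above, with just enough types to read in isolation; not the exact upstream source of device/dev-cache.c):

/* Paraphrase of the dev_name() logic under discussion -- NOT verbatim lvm2 code. */
struct dm_list { struct dm_list *n, *p; };
struct dm_str_list { struct dm_list list; const char *str; };
struct device { struct dm_list aliases; /* list of dm_str_list */ };

/* simplified: assumes the list node is the first member of the item */
#define dm_list_item(ptr, type) ((type *)(ptr))

static const char *dev_name(const struct device *dev)
{
        /* With a garbage dev pointer, either the dev->aliases.n load or
         * the dereference of the first alias entry can fault. */
        return (dev && dev->aliases.n)
                ? dm_list_item(dev->aliases.n, struct dm_str_list)->str
                : "unknown device";
}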

Comment 16 David Teigland 2016-08-26 16:30:48 UTC
For reference, I should have asked for the output of a normal 'pvs -vvvv'. I don't have an immediate use for it, but it could be useful later. Also, anything in /var/log/messages from around the time of the core dump would be useful to have for future reference.

Comment 17 Alasdair Kergon 2016-08-26 20:43:11 UTC
Without the ability to reproduce this, there's probably little we can do.
The exact setup prior to running the command that fails would need to be replicated based on all the available information - configuration, logs, metadata change history etc. - and after all that it might turn out to be a problem we already fixed upstream (and therefore in 6.9).  (I can see some candidates.)

Comment 23 nikhil kshirsagar 2016-08-31 08:15:17 UTC
Hello,

In the core file, we see a reference to vg_ase_tempest_prod. This VG can be seen in /etc/lvm/archive but not in /etc/lvm/backup, so a reference to that VG still existed in the metadata captured in the core.

In fact, I can see this VG had been renamed:

etc/lvm/archive/vg_ase_tempest_prod_00082-1449691450.vg:description = "Created *before* executing 'vgchange --uuid --config global{activation=0} vg_ase_tempest_prod'"
etc/lvm/archive/vg_ase_tempest_prod_00082-1449691450.vg:vg_ase_tempest_prod {
etc/lvm/archive/vg_ase_tempest_prod_00154-198026714.vg:description = "Created *before* executing 'vgchange --uuid --config global{activation=0} vg_ase_tempest_prod'"
etc/lvm/archive/vg_ase_tempest_prod_00154-198026714.vg:vg_ase_tempest_prod {
etc/lvm/archive/vg_ase_tempest_prod_00164-473146371.vg:vg_ase_tempest_prod {
etc/lvm/archive/vg_ase_tempest_prod_00114-661737966.vg:description = "Created *before* executing 'vgchange --uuid --config global{activation=0} vg_ase_tempest_prod'"
etc/lvm/archive/vg_ase_tempest_prod_00114-661737966.vg:vg_ase_tempest_prod {
etc/lvm/archive/vg_ase_tempest_prod_00042-1142268183.vg:description = "Created *before* executing 'vgchange --uuid --config global{activation=0} vg_ase_tempest_prod'"
etc/lvm/archive/vg_ase_tempest_prod_00042-1142268183.vg:vg_ase_tempest_prod {
etc/lvm/archive/vg_ase_tempest_prod_00117-1850396622.vg:description = "Created *before* executing 'pvchange --uuid --config global{activation=0} --select vg_name=vg_ase_tempest_prod'"
etc/lvm/archive/vg_ase_tempest_prod_00117-1850396622.vg:vg_ase_tempest_prod {
etc/lvm/archive/vg_ase_tempest_prod_00128-1875962741.vg:vg_ase_tempest_prod {
etc/lvm/archive/vg_ase_tempest_prod_00190-600814812.vg:description = "Created *before* executing 'vgchange --uuid --config global{activation=0} vg_ase_tempest_prod'"
etc/lvm/archive/vg_ase_tempest_prod_00190-600814812.vg:vg_ase_tempest_prod {
etc/lvm/archive/vg_ase_tempest_prod_00118-1862483559.vg:description = "Created *before* executing 'vgchange --uuid --config global{activation=0} vg_ase_tempest_prod'"
etc/lvm/archive/vg_ase_tempest_prod_00118-1862483559.vg:vg_ase_tempest_prod {
etc/lvm/archive/vg_ase_tempest_prod_00097-1242122254.vg:description = "Created *before* executing 'pvchange --uuid --config global{activation=0} --select vg_name=vg_ase_tempest_prod'"
etc/lvm/archive/vg_ase_tempest_prod_00097-1242122254.vg:vg_ase_tempest_prod {
etc/lvm/archive/vg_ase_tempest_prod_00014-817231081.vg:description = "Created *before* executing 'vgchange --uuid --config global{activation=0} vg_ase_tempest_prod'"
etc/lvm/archive/vg_ase_tempest_prod_00014-817231081.vg:vg_ase_tempest_prod {
etc/lvm/archive/vg_ase_tempest_prod_00147-306145930.vg:description = "Created *before* executing 'vgrename vg_ase_tempest_prod NEW_VG20082016130112'"
etc/lvm/archive/vg_ase_tempest_prod_00147-306145930.vg:vg_ase_tempest_prod {
etc/lvm/archive/vg_ase_tempest_prod_00067-657324530.vg:description = "Created *before* executing 'vgrename vg_ase_tempest_prod NEW_VG19082016130252'"
etc/lvm/archive/vg_ase_tempest_prod_00067-657324530.vg:vg_ase_tempest_prod {
etc/lvm/archive/vg_ase_tempest_prod_00141-1388365472.vg:description = "Created *before* executing 'pvchange --uuid --config global{activation=0} --select vg_name=vg_ase_tempest_prod'"
etc/lvm/archive/vg_ase_tempest_prod_00141-1388365472.vg:vg_ase_tempest_prod {
etc/lvm/archive/vg_ase_tempest_prod_00048-690172417.vg:vg_ase_tempest_prod {
etc/lvm/archive/vg_ase_tempest_prod_00099-399885773.vg:description = "Created *before* executing 'vgrename vg_ase_tempest_prod NEW_VG19082016162701'"
etc/lvm/archive/vg_ase_tempest_prod_00099-399885773.vg:vg_ase_tempest_prod {
etc/lvm/archive/vg_ase_tempest_prod_00059-41882538.vg:description = "Created *before* executing 'vgrename vg_ase_tempest_prod NEW_VG19082016124207'"
etc/lvm/archive/vg_ase_tempest_prod_00059-41882538.vg:vg_ase_tempest_prod {
etc/lvm/archive/vg_ase_tempest_prod_00107-217238218.vg:description = "Created *before* executing 'vgrename vg_ase_tempest_prod NEW_VG19082016163308'"
etc/lvm/archive/vg_ase_tempest_prod_00107-217238218.vg:vg_ase_tempest_prod {
etc/lvm/archive/vg_ase_tempest_prod_00069-502888113.vg:description = "Created *before* executing 'pvchange --uuid --config global{activation=0} --select vg_name=vg_ase_tempest_prod'"
etc/lvm/archive/vg_ase_tempest_prod_00069-502888113.vg:vg_ase_tempest_prod {
etc/lvm/archive/vg_ase_tempest_prod_00012-1241477481.vg:vg_ase_tempest_prod {
etc/lvm/archive/vg_ase_tempest_prod_00039-429080104.vg:description = "Created *before* executing 'vgrename vg_ase_tempest_prod NEW_VG19082016093346'"
etc/lvm/archive/vg_ase_tempest_prod_00039-429080104.vg:vg_ase_tempest_prod {
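
(The listing above is grep output over the archived metadata; a command along these lines, run from the sosreport root, is a hypothetical reconstruction of how it can be produced:

grep vg_ase_tempest_prod etc/lvm/archive/vg_ase_tempest_prod_*.vg

i.e. every archived copy of the VG, showing its description line and the VG header line.)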

By any chance, could the pvs command have been run while a rename of this VG was in progress? The system time is (unfortunately) not stored in a core file by default, but we can check the timestamp through the messages file, which shows:


Aug 23 10:25:14 GBLDNSRV9TL4002 rsyslogd-2177: imuxsock begins to drop messages from pid 6733 due to rate-limiting
Aug 23 10:29:06 GBLDNSRV9TL4002 kernel: pvs[7421] general protection ip:7ffff7f0f98d sp:7fffffffd978 error:0 in lvm[7ffff7e9c000+163000]
Aug 23 10:29:07 GBLDNSRV9TL4002 abrtd: Directory 'ccpp-2016-08-23-10:29:06-7421' creation detected
Aug 23 10:29:07 GBLDNSRV9TL4002 abrt[7469]: Saved core dump of pid 7421 (/sbin/lvm) to /var/spool/abrt/ccpp-2016-08-23-10:29:06-7421 (16379904 bytes)

So the crash happened at Aug 23 10:29:07.

Now see this, from the file etc/lvm/archive/vg_ase_tempest_prod_00215-1181003826.vg:

# Generated by LVM2 version 2.02.143(2)-RHEL6 (2016-04-01): Tue Aug 23 10:29:08 2016

contents = "Text Format Volume Group"
version = 1

description = "Created *before* executing 'vgrename vg_ase_tempest_prod NEW_VG23082016102906'"

creation_host = "GBLDNSRV9TL4002.anyaccess.net" # Linux GBLDNSRV9TL4002.anyaccess.net 2.6.32-642.el6.x86_64 #1 SMP Wed Apr 13 00:51:26 EDT 2016 x86_64
creation_time = 1471944548      # Tue Aug 23 10:29:08 2016 <-------------

vg_ase_tempest_prod {
        id = "Tnv0nH-Uu3P-cj1l-zOwT-Dqpd-pd1r-NdsYjp"
        seqno = 26
        format = "lvm2"                 # informational
        status = ["RESIZEABLE", "READ", "WRITE"]
        flags = []
        extent_size = 8192              # 4 Megabytes
        max_lv = 0
        max_pv = 0
        metadata_copies = 0

        physical_volumes {

It looks like there was a rename operation of the volume group at the same time as pvs was run.

-Nikhil.

Comment 24 nikhil kshirsagar 2016-08-31 08:17:35 UTC
Just before that, at the time of the crash:

# Generated by LVM2 version 2.02.143(2)-RHEL6 (2016-04-01): Tue Aug 23 10:29:07 2016

contents = "Text Format Volume Group"
version = 1

description = "Created *before* executing 'pvchange --uuid --config global{activation=0} --select vg_name=vg_ase_tempest_prod'"

creation_host = "GBLDNSRV9TL4002.anyaccess.net" # Linux GBLDNSRV9TL4002.anyaccess.net 2.6.32-642.el6.x86_64 #1 SMP Wed Apr 13 00:51:26 EDT 2016 x86_64
creation_time = 1471944547      # Tue Aug 23 10:29:07 2016 <-------

vg_ase_tempest_prod {
        id = "BGR0a2-43vp-LBMr-I6oc-C2a9-jpbo-PcuBqK"
        seqno = 24
        format = "lvm2"                 # informational
        status = ["RESIZEABLE", "READ", "WRITE"]
        flags = []
        extent_size = 8192              # 4 Megabytes
        max_lv = 0
        max_pv = 0
        metadata_copies = 0

        physical_volumes {


-Nikhil.

Comment 25 David Teigland 2016-08-31 19:30:38 UTC
Thanks for figuring that out; I think the concurrent commands very likely caused the segfault. We'll look for missing synchronization between these commands.

Comment 27 David Teigland 2016-09-06 15:41:49 UTC
On my test machine I'm running a loop of these two commands concurrently:
pvs
pvchange --uuid --config global{activation=0} --select vg_name=NAME
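
A shell reconstruction of that reproducer might look like the following (hypothetical; the exact script was not posted, and NAME stands for a scratch VG on a test machine):

# run both loops in parallel; with locking disabled, pvs eventually segfaults
while true; do pvs >/dev/null 2>&1; done &
while true; do
        pvchange --uuid --config 'global{activation=0}' --select vg_name=NAME >/dev/null 2>&1
done &
wait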

I started by setting locking_type=0 in lvm.conf to make the issue easier to hit, and eventually reproduced the same segfault from pvs:

(gdb) bt
#0  0x00007f4ef8989462 in dev_name (dev=0x7f4ef8f00a70) at device/dev-cache.c:1539
#1  0x00007f4ef89dfdba in _vg_read_orphan_pv (info=0x7f4ef8f00980, baton=0x7ffc4557e090) at metadata/metadata.c:3771
#2  0x00007f4ef8976560 in lvmcache_foreach_pv (vginfo=0x7f4ef8ec91e0, fun=0x7f4ef89dfd6f <_vg_read_orphan_pv>, 
    baton=0x7ffc4557e090) at cache/lvmcache.c:2556
#3  0x00007f4ef89e0194 in _vg_read_orphans (cmd=0x7f4ef8e8f020, warn_flags=1, orphan_vgname=0x7f4ef8edcb90 "#orphans_lvm2", 
    consistent=0x7ffc4557e288) at metadata/metadata.c:3842
#4  0x00007f4ef89e0f48 in _vg_read (cmd=0x7f4ef8e8f020, vgname=0x7f4ef8edcb90 "#orphans_lvm2", 
    vgid=0x7f4ef8edcb80 "#orphans_lvm2", warn_flags=1, consistent=0x7ffc4557e288, precommitted=0) at metadata/metadata.c:4163
#5  0x00007f4ef89e2d22 in vg_read_internal (cmd=0x7f4ef8e8f020, vgname=0x7f4ef8edcb90 "#orphans_lvm2", 
    vgid=0x7f4ef8edcb80 "#orphans_lvm2", warn_flags=1, consistent=0x7ffc4557e288) at metadata/metadata.c:4792
#6  0x00007f4ef89e5275 in _vg_lock_and_read (cmd=0x7f4ef8e8f020, vg_name=0x7f4ef8edcb90 "#orphans_lvm2", 
    vgid=0x7f4ef8edcb80 "#orphans_lvm2", lock_flags=33, status_flags=0, read_flags=262144, lockd_state=0)
    at metadata/metadata.c:5815
#7  0x00007f4ef89e568a in vg_read (cmd=0x7f4ef8e8f020, vg_name=0x7f4ef8edcb90 "#orphans_lvm2", 
    vgid=0x7f4ef8edcb80 "#orphans_lvm2", read_flags=262144, lockd_state=0) at metadata/metadata.c:5918
#8  0x00007f4ef8956095 in _process_pvs_in_vgs (cmd=0x7f4ef8e8f020, read_flags=262144, all_vgnameids=0x7ffc4557e4b0, 
    all_devices=0x7ffc4557e4a0, arg_devices=0x7ffc4557e4d0, arg_tags=0x7ffc4557e4f0, process_all_pvs=0, process_all_devices=0, 
    handle=0x7f4ef8edb1d8, process_single_pv=0x7f4ef89489dc <_pvs_single>) at toollib.c:3487
#9  0x00007f4ef895685f in process_each_pv (cmd=0x7f4ef8e8f020, argc=1, argv=0x7ffc4557eb68, only_this_vgname=0x0, all_is_set=0, 
    read_flags=262144, handle=0x7f4ef8edb1d8, process_single_pv=0x7f4ef89489dc <_pvs_single>) at toollib.c:3644
#10 0x00007f4ef894ac16 in _do_report (cmd=0x7f4ef8e8f020, handle=0x7f4ef8edb1d8, args=0x7ffc4557e650, single_args=0x7ffc4557e698)
    at reporter.c:1112
#11 0x00007f4ef894c027 in _report (cmd=0x7f4ef8e8f020, argc=1, argv=0x7ffc4557eb68, report_type=LABEL) at reporter.c:1388
#12 0x00007f4ef894c162 in pvs (cmd=0x7f4ef8e8f020, argc=1, argv=0x7ffc4557eb68) at reporter.c:1425
#13 0x00007f4ef893a6e6 in lvm_run_command (cmd=0x7f4ef8e8f020, argc=1, argv=0x7ffc4557eb68) at lvmcmdline.c:1723
#14 0x00007f4ef893bfe9 in lvm2_main (argc=3, argv=0x7ffc4557eb58) at lvmcmdline.c:2249
#15 0x00007f4ef8968a34 in main (argc=3, argv=0x7ffc4557eb58) at lvm.c:22


Next I'm setting locking_type=1 (the default) and attempting to reproduce this again.

Comment 28 David Teigland 2016-09-08 20:21:51 UTC
I've been running the same commands in a loop with locking_type=1 for a couple of days and have not hit the segfault.

I wonder whether this message during boot could cause subsequent commands to skip locking:
> Setting up Logical Volume Management:   Failed to create directory /var/lock/lvm.

Comment 31 David Teigland 2016-10-18 19:40:50 UTC
No, we have not looked at this any further. The best theory is that a missing locking directory prevented locks from being taken, which allowed commands to run concurrently, so data that one command was using could be changed by another.
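
One way to check that theory on the affected system (assuming the default RHEL 6 paths; locking_type and locking_dir are standard lvm.conf settings):

# does the locking directory exist, and what locking is configured?
ls -ld /var/lock/lvm
grep -E '^[[:space:]]*locking_(type|dir)' /etc/lvm/lvm.conf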

Comment 32 Sridhar 2016-10-18 20:43:32 UTC
Hello David,

Is there a specific reason this is not being looked into further? The customer is asking for an update on this bug in the case they have opened, so if there are challenges, or if the issue is not reproducible in a test environment, I could ask whether the customer can reproduce it and share the steps to do so.

I cannot simply tell the customer that this is not being investigated; we need some justification for why it cannot be pursued further, for example that the issue is not reproducible in our environment.

Kindly let me know if there is anything required from my end.


Regards,
Sridhar S

Comment 33 David Teigland 2016-10-18 20:54:43 UTC
We cannot reproduce it.  If they can reproduce the problem, have them run the command with -vvvv and send the core file and -vvvv output together.

Comment 35 Chris Williams 2017-06-07 22:10:11 UTC
Red Hat Enterprise Linux 6 transitioned to the Production 3 Phase on May 10, 2017.  During the Production 3 Phase, Critical impact Security Advisories (RHSAs) and selected Urgent Priority Bug Fix Advisories (RHBAs) may be released as they become available.

The official life cycle policy can be reviewed here:

http://redhat.com/rhel/lifecycle

This issue does not appear to meet the inclusion criteria for the Production 3 Phase and will be marked as CLOSED/WONTFIX. If this remains a critical requirement, please contact Red Hat Customer Support to request a re-evaluation of the issue, citing a clear business justification. Red Hat Customer Support can be contacted via the Red Hat Customer Portal at the following URL:

https://access.redhat.com

