Bug 889465 - lvcreate core dumps if attempted on a VG with missing PV while lvmetad is running
Summary: lvcreate core dumps if attempted on a VG with missing PV while lvmetad is running
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: lvm2
Version: 6.4
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: 6.4
Assignee: Petr Rockai
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks: 892991
 
Reported: 2012-12-21 10:09 UTC by Nenad Peric
Modified: 2014-07-22 12:02 UTC
CC List: 10 users

Fixed In Version: lvm2-2.02.100-1.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-07-22 12:02:31 UTC
Target Upstream Version:
Embargoed:



Description Nenad Peric 2012-12-21 10:09:26 UTC
Description of problem:

lvcreate core dumps when attempting to create an LV inside a VG with a missing PV.


Version-Release number of selected component (if applicable):

lvm2-2.02.98-6.el6.x86_64
lvm2-libs-2.02.98-6.el6.x86_64
device-mapper-1.02.77-6.el6.x86_64
device-mapper-libs-1.02.77-6.el6.x86_64


How reproducible:
Every time

Steps to Reproduce:

Make sure lvmetad is enabled in lvm.conf and lvm2-lvmetad is running. 

 use_lvmetad = 1

service lvm2-lvmetad status

1. vgcreate vg /dev/sda1 /dev/sdb1
2. echo "offline" >/sys/block/sdb/device/state 
3. lvcreate vg -n test_lv -L1G

  
Actual results:

(04:02:04) [root@r6-node01:~]$ /etc/init.d/lvm2-lvmetad status
lvmetad (pid  6001) is running...
(04:02:14) [root@r6-node01:~]$ vgcreate vg /dev/sda1 /dev/sdb1
  No physical volume found in lvmetad cache for /dev/sdb1
  Physical volume "/dev/sdb1" successfully created
  Volume group "vg" successfully created
(04:02:22) [root@r6-node01:~]$ vgs
  VG       #PV #LV #SN Attr   VSize  VFree 
  VolGroup   1   2   0 wz--n-  9.51g     0 
  vg         2   0   0 wz--n- 19.98g 19.98g
(04:02:24) [root@r6-node01:~]$ echo "offline" >/sys/block/sdb/device/state 
(04:02:28) [root@r6-node01:~]$ vgs
  No device found for PV FFcDQr-vaB3-WNau-382i-ivo3-kKvI-dv2XGj.
  No device found for PV FFcDQr-vaB3-WNau-382i-ivo3-kKvI-dv2XGj.
  VG       #PV #LV #SN Attr   VSize  VFree 
  VolGroup   1   2   0 wz--n-  9.51g     0 
  vg         2   0   0 wz--n- 19.98g 19.98g
(04:02:30) [root@r6-node01:~]$ lvcreate vg -n test_lv -L1G
  No device found for PV FFcDQr-vaB3-WNau-382i-ivo3-kKvI-dv2XGj.
lvcreate: metadata/pv_map.c:140: _create_maps: Assertion `pvl->pv->dev' failed.
Aborted (core dumped)


Expected results:

The command should either:
- succeed and create the LV on the remaining physical volume if it fits,
or
- refuse to execute entirely, saying that the VG is missing a PV and needs to be repaired first.

Additional info:

Comment 2 RHEL Program Management 2012-12-25 06:48:07 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 3 Peter Rajnoha 2013-01-08 12:35:53 UTC
Well, this bug is triggered by using "echo offline > /sys...". The thing is that this generates no uevent, so the udev rules that would update lvmetad never fire for this change (the device going running/offline).

It's questionable what the proper behaviour is here: with "echo offline > /sys", the device is still accessible via /sys/<device>, since it is not removed from the kernel! Also, the kernel does not generate any event when the state changes from running to offline and vice versa (which IMHO it should!). This should not happen in a normal hotplug environment, where events are always generated; the running/offline device state is a special case.

To fix this for now, you can use:

echo offline > /sys/<dev>/device/state
echo remove > /sys/<dev>/uevent

...

echo running > /sys/<dev>/device/state
echo add > /sys/<dev>/uevent
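
An equivalent sketch using udevadm to fire the same synthetic uevents instead of writing to the uevent files (an assumption, not from the original workaround: it presumes the udev shipped with RHEL 6 supports "udevadm trigger --sysname-match"):

  # take the device offline, then synthesize a "remove" uevent so the
  # udev rules update lvmetad
  echo offline > /sys/block/sdb/device/state
  udevadm trigger --action=remove --sysname-match=sdb

  # bring the device back and synthesize an "add" uevent
  echo running > /sys/block/sdb/device/state
  udevadm trigger --action=add --sysname-match=sdb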

Comment 4 Peter Rajnoha 2013-01-08 12:36:50 UTC
(In reply to comment #0)
>   No device found for PV FFcDQr-vaB3-WNau-382i-ivo3-kKvI-dv2XGj.
> lvcreate: metadata/pv_map.c:140: _create_maps: Assertion `pvl->pv->dev'
> failed.
> Aborted (core dumped)

...anyway, we should fix this as well - it should not fail this way!

Comment 5 Peter Rajnoha 2013-01-08 12:54:14 UTC
There's also another echo that removes the dev altogether (including proper uevents):

  echo 1 > /sys/block/<dev>/device/delete

...and then this call to bring it back:

  echo "- - -" > /sys/class/scsi_host/hostN/scan

However, this way you cannot be sure that the dev is assigned the same name again (but that is OK and expected in a hotplug system).
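
Since the name can change, here is a sketch of the whole cycle that relocates the PV by UUID afterwards (assuming the standard pvs reporting fields; "hostN" stays a placeholder for the real SCSI host):

  # remove the device entirely (proper uevents are generated)
  echo 1 > /sys/block/sdb/device/delete

  # rescan the SCSI host so the device reappears, possibly under a new name
  echo "- - -" > /sys/class/scsi_host/hostN/scan

  # identify the PV by UUID, whatever device name it came back under
  pvs -o pv_name,pv_uuid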

Comment 6 Marian Csontos 2013-01-08 14:29:25 UTC
(In reply to comment #5)
> There's also another echo that removes the dev altogether (including proper
> uevents):
> 
>   echo 1 > /sys/block/<dev>/device/delete

Verified that with this method the lvmetad content is refreshed, and the issue is likely to go away.

Comment 7 Peter Rajnoha 2013-01-10 14:04:16 UTC
(In reply to comment #0)
> (04:02:30) [root@r6-node01:~]$ lvcreate vg -n test_lv -L1G
>   No device found for PV FFcDQr-vaB3-WNau-382i-ivo3-kKvI-dv2XGj.
> lvcreate: metadata/pv_map.c:140: _create_maps: Assertion `pvl->pv->dev'
> failed.

The problem here lies in using two sources of information while working with the VG.


*Without lvmetad*, the vg_read call returns with an error state ("missing PVs"), which is properly recognized via the vg_read_error call:

int lvcreate(struct cmd_context *cmd, int argc, char **argv)
...
  vg = vg_read_for_update(cmd, lp.vg_name, NULL, 0);
...
  if (vg_read_error(vg)) {
    release_vg(vg);
    stack;
    return ECMD_FAILED;
  }

And this causes the lvcreate operation to properly error out.


However, *with lvmetad* in play, we first get the VG metadata from lvmetad, and that metadata is stale because lvmetad was not updated properly, due to the missing event mentioned in comment #3. We have this sequence in the code:

...
_vg_read
lvmcache_get_vg
lvmetad_vg_lookup

The lvmetad_vg_lookup then calls:

if ((pvcn = dm_config_find_node(top, "metadata/physical_volumes")))
        for (pvcn = pvcn->child; pvcn; pvcn = pvcn->sib)
                _pv_populate_lvmcache(cmd, pvcn, 0);

So it takes the metadata from lvmetad and tries to update the lvmcache with PV info. However, one of the PVs is missing!

The problem is that we don't check the return value of _pv_populate_lvmcache; we silently continue, and so we end up with the lvmetad info and the lvmcache info drifting apart.

We should probably error out here directly, mark the VG as "unreliable", and exit. Or, alternatively, we should fix the information coming from lvmetad in situ and additionally mark the VG with a "this VG has missing PVs" flag.

Mornfall, any opinion?

Comment 8 Peter Rajnoha 2013-01-10 14:20:40 UTC
(In reply to comment #7)
>   vg = vg_read_for_update(cmd, lp.vg_name, NULL, 0);
> ...
>   if (vg_read_error(vg)) {
>     release_vg(vg);
>     stack;
>     return ECMD_FAILED;
>   }

(...with lvmetad, this check passes, as the VG is considered to be OK and complete...)

Comment 9 Petr Rockai 2013-03-18 14:18:52 UTC
Well, I suggest just checking the validity of the dev pointers and complaining earlier. A write would fail anyway, so we basically check whether we have all the dev pointers, and if not, set the PV_MISSING flag or such, ensuring that the usual policies apply (no lvcreate, among others).
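
A minimal sketch of that idea, in the style of the snippets above (illustrative only, not the committed fix; MISSING_PV, dm_list_iterate_items and id_write_format are existing lvm2 identifiers, but the placement and exact check are assumptions):

  struct pv_list *pvl;
  char uuid[64];

  /* After populating lvmcache from the lvmetad metadata, verify that
   * every PV still has a dev pointer; if one does not, mark the PV as
   * missing so the usual missing-PV policies apply (no lvcreate,
   * among others). */
  dm_list_iterate_items(pvl, &vg->pvs)
    if (!pvl->pv->dev) {
      if (!id_write_format(&pvl->pv->id, uuid, sizeof(uuid)))
        stack;
      log_warn("WARNING: Device for PV %s not found, marking PV as missing.", uuid);
      pvl->pv->status |= MISSING_PV;
    }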

Comment 10 Petr Rockai 2013-04-03 10:59:23 UTC
This should now work correctly upstream (devices not visible to the lvm command but visible to lvmetad are now treated as MISSING by the command).

Comment 11 Nenad Peric 2013-07-12 07:02:54 UTC
Tested with lvm2-2.02.99-0.13.el6 and marking verified.
The command now refuses to execute while a PV is missing from the VG, rather than just core dumping.


[root@tardis-01 ~]# echo offline >/sys/block/sdc/device/state 
[root@tardis-01 ~]# lvcreate vg -n lv_missing_pv -L 5G 
  No device found for PV aAyY75-bLyv-gFgJ-D8Ri-8Cuc-xgdH-eqrZyC.
  Cannot change VG vg while PVs are missing.
  Consider vgreduce --removemissing.
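
For completeness, the recovery path that the error message points to (a usage note; vgreduce --removemissing is the standard lvm2 command for this situation):

  # remove the missing PV from the VG metadata so the VG is consistent again
  vgreduce --removemissing vg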

