Bug 1320836 - when vol-create-as a volume with invalid name in a disk pool, libvirt produced error, but parted still created a partition and multipathd didn't generate symbolic link in /dev/mapper
Summary: when vol-create-as a volume with invalid name in a disk pool, libvirt produce...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.3
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: John Ferlan
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-03-24 06:25 UTC by yisun
Modified: 2016-11-03 18:40 UTC
CC List: 6 users

Fixed In Version: libvirt-1.3.5-1.el7
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-11-03 18:40:38 UTC
Target Upstream Version:
Embargoed:




Links
System: Red Hat Product Errata
ID: RHSA-2016:2577
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: libvirt security, bug fix, and enhancement update
Last Updated: 2016-11-03 12:07:06 UTC

Description yisun 2016-03-24 06:25:29 UTC
description:
When vol-create-as is used to create a volume with an invalid name in a disk pool, libvirt produces an error, but parted still creates a partition and multipathd does not generate a symbolic link in /dev/mapper.

versions:
libvirt-1.3.2-1.el7.x86_64
kernel-3.10.0-363.el7.x86_64
parted-3.1-24.el7.x86_64
device-mapper-multipath-0.4.9-87.el7.x86_64

Reporting this against libvirt for now: in steps 1-5, pure parted does not reproduce the issue, so those steps serve as a comparison.
Steps 6-13 show the issue. During the test, step 12 also shows another vol-list issue; it is minor enough that I did not split it into a new bug for now.

0. Create and build a disk pool as follows, using a multipath device as the source:
# cat pool.xml
<pool type='disk'>
<name>bugtest</name>
<source>
<device path='/dev/mapper/mpathc' part_separator='no'/>
<format type='gpt'/>
</source>
<target>
<path>/dev/disk/by-path/</path>
</target>
</pool>


# virsh pool-define pool.xml
Pool bugtest defined from pool.xml

# virsh pool-build bugtest --overwrite
Pool bugtest built

# virsh pool-start bugtest
Pool bugtest started



1. Check the mpathc disk:
# parted /dev/mapper/mpathc print
Model: Linux device-mapper (multipath) (dm)
Disk /dev/mapper/mpathc: 10.7GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start  End  Size  File system  Name  Flags
<===== no partition now as expected

2. Check the multipath device mapper dir:
# ll /dev/mapper/ | grep mpathc
lrwxrwxrwx. 1 root root       7 Mar 16 22:54 mpathc -> ../dm-1
<==== only the symlink for mpathc is generated, as expected.

3. Create a partition with parted as follows (this is the exact command used by libvirt, as found in libvirtd.log):
# /usr/sbin/parted /dev/mapper/mpathc mkpart --script primary  30988927B 40988927B
Warning: The resulting partition is not properly aligned for best performance.

4. Now, in the mapper dir, we can see the new partition mpathc1:
# ll /dev/mapper/ | grep mpathc
lrwxrwxrwx. 1 root root       7 Mar 16 22:57 mpathc -> ../dm-1
lrwxrwxrwx. 1 root root       7 Mar 16 22:57 mpathc1 -> ../dm-2

5. And parted print shows the newly created partition as follows:
# parted /dev/mapper/mpathc print
Model: Linux device-mapper (multipath) (dm)
Disk /dev/mapper/mpathc: 10.7GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start   End     Size    File system  Name     Flags
 1      31.0MB  41.0MB  10.0MB               primary





6. Use the virsh command to create a new partition, with an invalid name:
# virsh vol-create-as --pool bugtest --name badName --capacity 50M
error: Failed to create vol badName
error: invalid argument: invalid partition name 'badName', expected 'mpathc2'
<=== an error is produced here, but libvirt still proceeds to call parted, and the behavior differs from steps 1-5, as follows

7. With parted print, we can see the partition was created as number 2, even though virsh failed:
# parted /dev/mapper/mpathc print
Model: Linux device-mapper (multipath) (dm)
Disk /dev/mapper/mpathc: 10.7GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start   End     Size    File system  Name     Flags
 1      31.0MB  41.0MB  10.0MB               primary
 2      41.0MB  93.4MB  52.4MB               primary  <==== here the new partition exists


8. But in /dev/mapper, the new partition is not listed yet:
# ll /dev/mapper/ | grep mpathc
lrwxrwxrwx. 1 root root       7 Mar 16 23:03 mpathc -> ../dm-1
lrwxrwxrwx. 1 root root       7 Mar 16 23:03 mpathc1 -> ../dm-2
<===== mpathc2 is missing here, and /dev/dm-3 was not generated either.

9. This leaves libvirt unable to refresh or start the disk pool, because vol '/dev/mapper/mpathc2' cannot be found:
# virsh pool-refresh bugtest
error: Failed to refresh pool bugtest
error: Storage volume not found: no storage vol with matching path '/dev/mapper/mpathc2'


# virsh pool-start bugtest
error: Failed to start pool bugtest
error: Storage volume not found: no storage vol with matching path '/dev/mapper/mpathc2'


10. After restarting multipathd, '/dev/mapper/mpathc2' is generated:
# service multipathd restart
Restarting multipathd (via systemctl):                     [  OK  ]

# ll /dev/mapper/ | grep mpathc
lrwxrwxrwx. 1 root root       7 Mar 16 23:04 mpathc -> ../dm-1
lrwxrwxrwx. 1 root root       7 Mar 16 23:04 mpathc1 -> ../dm-2
lrwxrwxrwx. 1 root root       7 Mar 16 23:04 mpathc2 -> ../dm-3

11. Now libvirt can deal with this pool again.
# virsh pool-start bugtest
Pool bugtest started

12. There is another tiny problem here: vol-list shows duplicated vols, as follows:
# virsh vol-list bugtest
 Name                 Path                                    
------------------------------------------------------------------------------
 mpathc1              /dev/mapper/mpathc1                    
 mpathc1              /dev/mapper/mpathc1                    
 mpathc1              /dev/mapper/mpathc1                    
 mpathc2              /dev/mapper/mpathc2                    
 mpathc2              /dev/mapper/mpathc2                    
 mpathc2              /dev/mapper/mpathc2  
<== this is the tiny vol-list problem mentioned above.



13. After refreshing the pool, everything is fine:
# virsh pool-refresh bugtest; virsh vol-list bugtest
Pool bugtest refreshed

 Name                 Path                                    
------------------------------------------------------------------------------
 mpathc1              /dev/mapper/mpathc1                    
 mpathc2              /dev/mapper/mpathc2        


actual result:
as above

expected result:
In step 6, if the virsh command fails, libvirt should not proceed to call parted and create a new partition.
Something is also wrong around step 10: after a "failed" vol-create-as, multipathd has to be restarted before the symbolic link appears in /dev/mapper. This behavior is strange and is not reproducible with the pure parted commands of steps 1-5.

Comment 1 John Ferlan 2016-04-26 13:00:42 UTC
The failure path for this code is a "best case" type scenario...  As part of the fix for commit id '290ffcf' (bug 1138516), I did add the call to virStorageBackendDiskDeleteVol when the virStorageBackendDiskCreateVol fails. That DeleteVol code has separate paths for devmapper device and "regular" disk backend. That logic was introduced by commit id 'df1011ca8'.  I'd have to do some "on system debugging" in order to dig any deeper as I don't have a multipath device mapper setup on my laptop/system.  So if you can provide me access to one, I can continue to look deeper.  My "assumption" is that the dmsetup command being called is failing, but we're not checking that in the caller - the caller assumes that the command succeeds if it's run. Since 'badName' is probably not expected by dmsetup, I'm assuming it fails as well, but would have to "see it" in action and then generate a fix. I have some ideas already coded up so it shouldn't take too long.
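For reference, the device-mapper deletion path in question boils down to a command along the lines of the following (a sketch only; commit id 'df1011ca8' made virStorageBackendDiskDeleteVol use "dmsetup remove --force", and 'mpathc2' is simply the partition name from the reproducer above):

# dmsetup remove --force mpathc2

My assumption above is that if this command fails, the caller does not notice, because its exit status isn't checked.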

As for the duplicate entries in the pool, I see what the issue is... There are a few places that miss clearing all vols in the pool prior to calling refreshPool. I can send a separate patch for those.

Comment 3 John Ferlan 2016-05-08 12:51:30 UTC
The new test disk couldn't handle 50M, but 10M worked...

I also found that a "parted /dev/mapper/mpathc rm --script 2" after step 6 would allow for the vol-list, pool-refresh, etc. to work.
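In other words, the manual cleanup after the failed create amounts to roughly the following (a sketch; partition number 2 comes from step 7 of the reproducer):

# parted /dev/mapper/mpathc rm --script 2
# virsh pool-refresh bugtest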

So, my initial assumption wasn't true... the "dmsetup remove --force" is working, but it seems it's (still) the wrong solution, especially since the partition isn't deleted by dmsetup remove, and that's what is causing the issue in the long run. It's too bad that kind of detail wasn't found in the patches that led to commit id 'df1011ca8':

http://www.redhat.com/archives/libvir-list/2011-February/msg00386.html (v3)
http://www.redhat.com/archives/libvir-list/2011-February/msg00184.html (v2)
http://www.redhat.com/archives/libvir-list/2011-January/msg01270.html (v1)

What you'll note in the examples provided with those solutions, though, is that the "p#" (partition #) appended onto the 'base' name is what is really causing the issue. So rather than "shed" the "p" from the "p1", the solution was to use dmsetup remove, which, while working in one sense, I'm still not convinced ever worked "properly". It certainly would remove the volume from the libvirt pool, but the dm device would still have that partition, as far as I understand the code at least.

Of course, I think this only comes to light now since the "part_separator='no'" support was added (e.g. commit id '020135dc' for bz 1265694).  It might be interesting to alter the setup to remove the part_separator from the pool def and to not have user_friendly_names set. Of interest would be what happens, using the same process, after the failure to create the volume. I suspect the mpathcp2 volume would successfully have the 'dmsetup remove' run; however, running parted afterwards would show partition 2 still exists.

In any case, that's history and now it's time to find a solution for "both" instances. I now understand what needs to be done and will work on some patches. I do have a "fix" for the current situation (part_separator='no'), but would like to "really" fix the other situation where the "p#" is used.  So if you could configure perhaps another volume that doesn't use user_friendly_names, I can make sure I have a more functionally complete solution.
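For that second configuration, one way to avoid user_friendly_names would be a defaults section roughly like the following in /etc/multipath.conf (a sketch; any setup whose device name ends in a digit would exercise the same "p" separator path):

defaults {
        user_friendly_names no
}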

Comment 4 yisun 2016-05-09 12:26:37 UTC
Hi John, 
Your assumption is correct. If a volume's multipath name ends with a digit, which requires a 'p' to be appended when creating a new partition, then dmsetup remove works well when vol-create-as fails.

Now I have a new vhba.

# virsh nodedev-dumpxml scsi_host6
<device>
  <name>scsi_host6</name>
  <path>/sys/devices/pci0000:00/0000:00:09.0/0000:18:00.1/host4/vport-4:0-1/host6</path>
  <parent>scsi_host4</parent>
  <capability type='scsi_host'>
    <host>6</host>
    <unique_id>6</unique_id>
    <capability type='fc_host'>
      <wwnn>21000024ff370145</wwnn>
      <wwpn>2101001b32a90001</wwpn>
      <fabric_wwn>2001000dec9877c1</fabric_wwn>
    </capability>
  </capability>
</device>

# lsscsi | grep 6:
[6:0:0:0]    disk    IBM      1726-4xx  FAStT  0617  /dev/sdj 
[6:0:0:1]    disk    IBM      1726-4xx  FAStT  0617  /dev/sdk 
[6:0:1:0]    disk    IBM      1726-4xx  FAStT  0617  /dev/sdl 
[6:0:1:1]    disk    IBM      1726-4xx  FAStT  0617  /dev/sdm 

# cat /etc/multipath/bindings 
...
mpathd 3600a0b80005b10ca00005e115729093f  
<== since the wwid ends with 'f', which is not a number, a 'p' will not be required by multipathd. So I didn't set "user_friendly_names no" in /etc/multipath.conf, but instead used an alias ending in a number, as follows:
...

# vim /etc/multipath.conf
multipaths {
        multipath {
                wwid                    3600a0b80005b10ca00005e115729093f
                alias                   5554344432
                }
}




# service multipathd restart
Redirecting to /bin/systemctl restart  multipathd.service

# cat pool.xml
<pool type='disk'>
<name>bugtest</name>
<source>
<device path='/dev/mapper/5554344432'/>
<format type='gpt'/>
</source>
<target>
<path>/dev/disk/by-path/</path>
</target>
</pool>

# virsh pool-define pool.xml; virsh pool-build bugtest --overwrite; virsh pool-start bugtest
Pool bugtest defined from pool.xml
Pool bugtest built
Pool bugtest started

# virsh vol-create-as --pool bugtest --name badName --capacity 50M
error: Failed to create vol badName
error: invalid argument: invalid partition name 'badName', expected '5554344432p1'

# parted /dev/mapper/5554344432 print
Model: Linux device-mapper (multipath) (dm)
Disk /dev/mapper/5554344432: 10.7GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 
Number  Start  End  Size  File system  Name  Flags
<=== no new partition here

# service multipathd restart; ll /dev/mapper/ | grep 555
Redirecting to /bin/systemctl restart  multipathd.service
lrwxrwxrwx. 1 root root       7 May  9 20:23 5554344432 -> ../dm-6
<=== no new partition here

# virsh pool-refresh bugtest
Pool bugtest refreshed
<=== no error here


I left the machine turned on; if the vhba gets deleted by someone else, you can use the nodedev XMLs in this comment and comment 2 to recreate them yourself. Thanks.

Comment 5 John Ferlan 2016-05-09 19:18:19 UTC
I think we were working at about the same time on the same machine on two different things. I believe you were running my code when you tried this since "the right thing" happened.  If you were running the 1.3.2 (or upstream) version, then the deletion would have failed. Additionally, if you had a successful creation using the right name, the "virsh vol-delete $volname $poolname" would have failed, e.g.:

# virsh vol-list bugtest
 Name                 Path                                    
------------------------------------------------------------------------------
 5554344432p1         /dev/mapper/5554344432p1                

# virsh vol-delete 5554344432p1 bugtest
error: Failed to delete vol 5554344432p1
error: Storage volume not found: no storage vol with matching path '/dev/mapper/5554344432p1'

# ll /dev/mapper/ | grep 555
lrwxrwxrwx. 1 root root       7 May  9 22:57 5554344432 -> ../dm-6
# parted /dev/mapper/5554344432 print
Model: Linux device-mapper (multipath) (dm)
Disk /dev/mapper/5554344432: 10.7GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 

Number  Start   End     Size    File system  Name     Flags
 1      31.0MB  41.0MB  10.0MB               primary

# 

The reason is the deletion code shouldn't be doing the 'dmsetup remove' at all. Rather the fix should have handled the partition separator "p" properly.

In any case, the good news is my changes fix both issues ;-).  Thanks for setting up the second device so I can test with both configs at the same time (I used pool name bug1test for the "original" issue).  I'm going to add some more context to bz 1265694 as well, since after actually using things for a while it seems the code there isn't quite right.

Comment 6 John Ferlan 2016-05-11 13:51:54 UTC
Patch is pushed to resolve this:

commit 8cdff0b93f732256df8d4a5c4eb418f37c707291
Author: John Ferlan <jferlan>
Date:   Tue Apr 26 08:53:57 2016 -0400

    storage: Fix virStorageBackendDiskDeleteVol for device mapper
    
    Commit id 'df1011ca8' modified virStorageBackendDiskDeleteVol to use
    "dmsetup remove --force" to remove the volume, but left things in an
    inconsistent state since the partition still existed on the disk and
    only the device mapper device (/dev/dm-#) was removed.
    
    Prior to commit '1895b421' (or '1ffd82bb' and '471e1c4e'), this could
    go unnoticed since virStorageBackendDiskRefreshPool wasn't called.
    However, the pool would be unusable since the /dev/dm-# device would
    be removed even though the partition was not removed unless a multipathd
    restart reset the link. That would of course make the volume appear again
    in the pool after a refresh or pool start after libvirt reload.
    
    This patch removes the 'dmsetup' logic and re-implements the partition
    deletion logic for device mapper devices. The removal of the partition
    via 'parted rm --script #' will cause udev device change logic to allow
    multipathd to handle removing the dm-* device associated with the partition.

$ git describe 8cdff0b93f732256df8d4a5c4eb418f37c707291
v1.3.4-211-g8cdff0b
$
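In other words, based on the commit message, the device mapper deletion path now removes the partition itself and lets udev/multipathd clean up the corresponding dm-* device, roughly along the lines of (a sketch; partition number 1 is only an example):

# parted /dev/mapper/mpathc rm --script 1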

Comment 8 Pei Zhang 2016-09-06 05:53:38 UTC
Verified versions :
libvirt-2.0.0-6.el7.x86_64
qemu-kvm-rhev-2.6.0-22.el7.x86_64
device-mapper-multipath-0.4.9-96.el7.x86_64

Steps : 

Scenario 1 : source device ends with a letter
1. Create and build a multipath device pool as follows:

# virsh pool-list 
 Name                 State      Autostart 
-------------------------------------------
 test1                active     no        

# virsh pool-dumpxml test1
<pool type='disk'>
  <name>test1</name>
  <uuid>a508da43-5fa2-46f3-b9b9-0c2193ff26b4</uuid>
  <capacity unit='bytes'>10737401344</capacity>
  <allocation unit='bytes'>0</allocation>
  <available unit='bytes'>10737383936</available>
  <source>
    <device path='/dev/mapper/mpathc' part_separator='no'>
      <freeExtent start='17408' end='10737401344'/>
    </device>
    <format type='gpt'/>
  </source>
  <target>
    <path>/dev/disk/by-path</path>
  </target>
</pool>

2. Create a volume with an invalid name; it should report an error

# virsh vol-create-as --pool test1 --name vol1 --capacity 50M 
error: Failed to create vol vol1
error: invalid argument: invalid partition name 'vol1', expected 'mpathc1'

3. Check /dev/mapper; there is no extra partition

# ll /dev/mapper/ | grep mpathc
lrwxrwxrwx. 1 root root       7 Sep  6 13:25 mpathc -> ../dm-6

# parted /dev/mapper/mpathc print 
Model: Linux device-mapper (multipath) (dm)
Disk /dev/mapper/mpathc: 10.7GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 

Number  Start  End  Size  File system  Name  Flags

4. Create a volume with a valid name, and check that it can be deleted

# virsh vol-create-as --pool test1 --name mpathc1 --capacity 50M 
Vol mpathc1 created

# virsh vol-list --pool test1
 Name                 Path                                    
------------------------------------------------------------------------------
 mpathc1              /dev/mapper/mpathc1                     

# virsh vol-delete mpathc1 --pool test1
Vol mpathc1 deleted


Scenario 2 : source device ends with a number
1. Check the current multipath devices and change the alias
 
# cat /etc/multipath/bindings 
......
mpathc 3600a0b80005b10ca00005e115729093f
mpathd 3600a0b80005b0acc00004f875728fe8e

# vim /etc/multipath.conf
multipaths {
        multipath {
                wwid                    3600a0b80005b10ca00005e115729093f
                alias                   5554344432
                }
}

# systemctl restart multipathd

2. Define and build a multipath pool like the following, using the new alias

# virsh pool-dumpxml test
<pool type='disk'>
  <name>test</name>
  <uuid>8511548b-3959-4ebd-b293-5eb8975169d5</uuid>
  <capacity unit='bytes'>10737401344</capacity>
  <allocation unit='bytes'>0</allocation>
  <available unit='bytes'>10737383936</available>
  <source>
    <device path='/dev/mapper/5554344432'>
      <freeExtent start='17408' end='10737401344'/>
    </device>
    <format type='gpt'/>
  </source>
  <target>
    <path>/dev/disk/by-path</path>
  </target>
</pool>

3. Create a volume with an invalid name; it should report an error

# virsh vol-create-as --pool test --name vol1 --capacity 50M
error: Failed to create vol vol1
error: invalid argument: invalid partition name 'vol1', expected '5554344432p1'

# virsh vol-list test
 Name                 Path                                    
------------------------------------------------------------------------------

4. Check /dev/mapper; there is no extra partition

# ls /dev/mapper/ -al | grep 555
lrwxrwxrwx.  1 root root       7 Sep  6 12:49 5554344432 -> ../dm-6

# parted /dev/mapper/5554344432 print
Model: Linux device-mapper (multipath) (dm)
Disk /dev/mapper/5554344432: 10.7GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 

Number  Start  End  Size  File system  Name  Flags

5. Create a volume using a valid name and delete it

# virsh vol-create-as --pool test --name 5554344432p1 --capacity 50M
Vol 5554344432p1 created

# virsh vol-list test
 Name                 Path                                    
------------------------------------------------------------------------------
 5554344432p1         /dev/mapper/5554344432p1                

# virsh vol-delete 5554344432p1 test
Vol 5554344432p1 deleted

# virsh vol-list test
 Name                 Path                                    
------------------------------------------------------------------------------

As above, moving this bug to VERIFIED.

Comment 10 errata-xmlrpc 2016-11-03 18:40:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2577.html

