Bug 1413867 - Error creating a storage pool when PV contains ':'
Summary: Error creating a storage pool when PV contains ':'
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: General
Version: 4.18.15.2
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ovirt-4.1.1
Target Release: ---
Assignee: Fred Rolland
QA Contact: Avihai
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-01-17 08:03 UTC by Sergei
Modified: 2017-02-15 09:47 UTC
CC List: 7 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-02-15 09:47:13 UTC
oVirt Team: Storage
rule-engine: ovirt-4.1+


Attachments
Logs from engine and hypervisor node (72.00 KB, application/x-tar) - 2017-01-17 08:03 UTC, Sergei
lvm pvs output (5.20 KB, application/zip) - 2017-01-17 09:59 UTC, Fred Rolland
engine & vdsm logs (57.22 KB, application/x-gzip) - 2017-02-12 12:51 UTC, Avihai
lvm pvs -vvvv --config output (81.31 KB, text/plain) - 2017-02-14 12:35 UTC, Avihai
Print screen of the engine GUI, for clarity (202.77 KB, image/png) - 2017-02-15 07:20 UTC, Avihai
Another print screen with the GUI error (238.84 KB, image/png) - 2017-02-15 07:23 UTC, Avihai


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 70636 0 master MERGED storage: Support colon in PV name 2017-01-18 21:31:45 UTC
oVirt gerrit 70664 0 master MERGED storage: add tests for decodePVInfo 2017-01-18 21:21:12 UTC
oVirt gerrit 70741 0 ovirt-4.1 MERGED testValidation: Add @xfail decorator 2017-01-23 13:04:06 UTC
oVirt gerrit 70742 0 ovirt-4.1 MERGED storage: add tests for decodePVInfo 2017-01-23 13:49:25 UTC
oVirt gerrit 70743 0 ovirt-4.1 MERGED storage: Support colon in PV name 2017-01-23 13:49:21 UTC
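
The merged patches above touch decodePVInfo, vdsm's parsing of PV information. As a minimal sketch only (the "devname" field and the key:value layout below are assumptions for illustration, not the actual vdsm code), the failure class looks like this: splitting a key:value pair on every ':' breaks as soon as the device name itself contains a colon, while splitting only on the first ':' keeps the value intact.

-----------------------------------------------------------
# Minimal sketch, not the actual vdsm code. The field name "devname" and
# the key:value layout are assumptions for illustration only.

def decode_pv_info_naive(pair):
    # Splits on every ':'; a device name such as
    # "/dev/mapper/SioFABRICVicinity_iqn.2015-03.com.iofabric:ovirt-master-00"
    # produces three parts and the unpacking fails.
    key, value = pair.split(":")
    return key, value

def decode_pv_info_fixed(pair):
    # Split only on the first ':' so colons inside the value are preserved.
    key, value = pair.split(":", 1)
    return key, value

pair = "devname:/dev/mapper/SioFABRICVicinity_iqn.2015-03.com.iofabric:ovirt-master-00"

print(decode_pv_info_fixed(pair))

try:
    decode_pv_info_naive(pair)
except ValueError as exc:
    print("naive parsing fails:", exc)
-----------------------------------------------------------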

Description Sergei 2017-01-17 08:03:51 UTC
Created attachment 1241568 [details]
Logs from engine and hypervisor node

Description of problem:

When trying to attach an iSCSI device from my storage system, oVirt complains
that it cannot create a storage pool.

In the logs I can see only this:
----------------------------------------------------------
RuntimeError(u'Error creating a storage pool:
(u"spUUID=3fcace71-cd2b-4ab4-9d64-8ef926ead05a,
poolName=hosted_datacenter, masterDom=71b0d510-c24a-49fa-86ec-6b5ebf530140,
domList=[u\'71b0d510-c24a-49fa-86ec-6b5ebf530140\',
u\'793985b7-9d11-44a4-bf68-4b509a967a52\'], masterVersion=1, clusterlock
params: ({\'LEASETIMESEC\': 60, \'LOCKPOLICY\': \'ON\', \'IOOPTIMEOUTSEC\':
10, \'LEASERETRIES\': 3, \'LOCKRENEWALINTERVALSEC\': 5})",)',), <traceback
object at 0x35935a8>)
-----------------------------------------------------------

I have verified that this block device can be successfully connected and
partitioned on the same server using standard tools (iscsiadm, parted,
mount), so the storage system on its own is working fine.

Version-Release number of selected component (if applicable):

ovirt 4.0.5, vdsm 4.18.15.3
Name        : vdsm
Arch        : x86_64
Version     : 4.18.15.3
Release     : 1.el7.centos


How reproducible:

Steps to Reproduce:

1. Deploy a fresh oVirt setup (manager, plus at least one registered host)
2. Add storage through the web GUI
3. The system reports an error, and the log contains "Error creating a storage pool"

Actual results:

Storage domain is not activated

Expected results:

Storage domain becomes active

Additional info:

I got a response on the oVirt mailing list that there is a bug in vdsm when parsing the PV. Link to the discussion: http://lists.ovirt.org/pipermail/users/2017-January/078891.html

I'm also attaching logs from the KVM node and the manager node.

Comment 1 Fred Rolland 2017-01-17 09:59:44 UTC
Created attachment 1241658 [details]
lvm pvs output

Comment 2 Nir Soffer 2017-01-18 09:38:21 UTC
Sergei, we would like to understand why the multipath device name does not use
the standard format (which does not include ":").

Can you share the output of these commands:

lsblk
multipath -ll
cat /etc/multipath.conf

Comment 3 Sergei 2017-01-18 09:51:56 UTC
Hi, Nir.
Here are the requested outputs:

[root@kvm1 ~]# lsblk
NAME                                                         MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
fd0                                                            2:0    1    4K  0 disk  
sda                                                            8:0    0   80G  0 disk  
├─sda1                                                         8:1    0  500M  0 part  /boot
└─sda2                                                         8:2    0 79.5G  0 part  
  ├─centos-root                                              253:0    0   50G  0 lvm   /
  ├─centos-swap                                              253:1    0  3.9G  0 lvm   [SWAP]
  └─centos-home                                              253:2    0 25.6G  0 lvm   /home
sdb                                                            8:16   0   10G  0 disk  
└─SioFABRICVicinity_iqn.2015-03.com.iofabric:ovirt-master-00 253:3    0   10G  0 mpath 
  ├─70e64136--e537--4d27--ac3b--e067a127a1d7-metadata        253:4    0  512M  0 lvm   
  ├─70e64136--e537--4d27--ac3b--e067a127a1d7-outbox          253:5    0  128M  0 lvm   
  ├─70e64136--e537--4d27--ac3b--e067a127a1d7-leases          253:6    0    2G  0 lvm   
  ├─70e64136--e537--4d27--ac3b--e067a127a1d7-ids             253:7    0  128M  0 lvm   
  ├─70e64136--e537--4d27--ac3b--e067a127a1d7-inbox           253:8    0  128M  0 lvm   
  └─70e64136--e537--4d27--ac3b--e067a127a1d7-master          253:9    0    1G  0 lvm   /rhev/data-center/mnt/blockSD/70e64136-e537-4d27-ac3b-e067a127a1d7/master
sr0                                                           11:0    1  603M  0 rom   
14f504e46494c4552686e4962646a2d5951514e2d3747716b            253:11   0 11.7G  0 mpath 
├─a41bca58--e9ad--4c35--bc46--95dcd160415c-metadata          253:12   0  512M  0 lvm   
├─a41bca58--e9ad--4c35--bc46--95dcd160415c-outbox            253:13   0  128M  0 lvm   
├─a41bca58--e9ad--4c35--bc46--95dcd160415c-leases            253:14   0    2G  0 lvm   
├─a41bca58--e9ad--4c35--bc46--95dcd160415c-ids               253:15   0  128M  0 lvm   
├─a41bca58--e9ad--4c35--bc46--95dcd160415c-inbox             253:16   0  128M  0 lvm   
└─a41bca58--e9ad--4c35--bc46--95dcd160415c-master            253:17   0    1G  0 lvm   
[root@kvm1 ~]# 


[root@kvm1 ~]# multipath -ll
SioFABRICVicinity_iqn.2015-03.com.iofabric:ovirt-master-00 dm-3 ioFABRIC,Vicinity        
size=10G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  `- 9:0:0:0 sdb 8:16 active ready running
14f504e46494c4552686e4962646a2d5951514e2d3747716b dm-11 
size=12G features='0' hwhandler='0' wp=rw


[root@kvm1 ~]# cat /etc/multipath.conf
# VDSM REVISION 1.3

defaults {
    polling_interval            5
    no_path_retry               fail
    user_friendly_names         no
    flush_on_last_del           yes
    fast_io_fail_tmo            5
    dev_loss_tmo                30
    max_fds                     4096
}

# Remove devices entries when overrides section is available.
devices {
    device {
        # These settings overrides built-in devices settings. It does not apply
        # to devices without built-in settings (these use the settings in the
        # "defaults" section), or to devices defined in the "devices" section.
        # Note: This is not available yet on Fedora 21. For more info see
        # https://bugzilla.redhat.com/1253799
        all_devs                yes
        no_path_retry           fail
    }
}

# Enable when this section is available on all supported platforms.
# Options defined here override device specific options embedded into
# multipathd.
#
# overrides {
#      no_path_retry           fail
# }

[root@kvm1 ~]#

Comment 4 Nir Soffer 2017-01-18 21:38:39 UTC
(In reply to Sergei from comment #3)
Thanks Sergei, everything seems normal except the strange LUN name. I guess this
is related to the storage server configuration.

Comment 5 Sergei 2017-01-19 07:22:29 UTC
I guess so. I will try to contact them directly.
Thanks for the help anyway.

Sergei.

Comment 6 Sandro Bonazzola 2017-01-23 14:18:32 UTC
Moving to 4.1.1 since the 4.1 RC is out and this bug is not marked as a blocker.

Comment 7 Avihai 2017-02-12 08:39:38 UTC
Hi Nir,

I see the customer had the following target that included ":", so to verify this I tried to do the same.

Customer target from the bug:
SioFABRICVicinity_iqn.2015-03.com.iofabric:ovirt-master-00

I created a volume with this exact name (iofabric:ovirt-master-00) and mapped it to the VDSM host.

But the host still sees it via multipath -ll as:
"3514f0c5a51600543 dm-40 XtremIO ,XtremApp"

and not as its volume name (iofabric:ovirt-master-00).

Is there any way we can change the LUN ID on the server (we're using XtremIO) or on the host itself?

Or can we change the VDSM code to trick VDSM into seeing a target name with ":" in it?

Comment 8 Avihai 2017-02-12 12:46:34 UTC
I also tried with the targetcli server we have (10.35.88.157), and it does not look good.

I created a target named "iqn.2015-03.com.iofabric:ovirt-master-00" and added an 11G LUN to it.

When I tried to add a new iSCSI storage domain, it failed.

Event:
Feb 12, 2017 1:17:43 PM  Failed to add Storage Domain test22. (User: admin@internal-authz)

Many errors are seen in the vdsm log:
2017-02-12 13:17:03,208 INFO  (jsonrpc/5) [storage.TaskManager.Task] (Task='baf14138-3e79-421e-90e0-e27920017dca') aborting: Task is aborted: u'Failed reload: /dev/mapper/360014055ce39b844c384d6dbe13f9544' - code 100 (task:1175)
2017-02-12 13:17:03,209 ERROR (jsonrpc/5) [storage.Dispatcher] Failed reload: /dev/mapper/360014055ce39b844c384d6dbe13f9544 (dispatcher:80)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/dispatcher.py", line 72, in wrapper
    result = ctask.prepare(func, *args, **kwargs)
  File "/usr/share/vdsm/storage/task.py", line 105, in wrapper
    return m(self, *a, **kw)
  File "/usr/share/vdsm/storage/task.py", line 1183, in prepare
    raise self.error

Engine log:
2017-02-12 13:17:04,262+02 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetDeviceListVDSCommand] (default task-25) [4f3a17ac-5746-41da-a2b4-206d17be8fb5] Failed in 'GetDeviceListVDS' method

See the attached logs.

More info:
lsblk output:
sdad                                                                                   65:208  0   11G  0 disk  
└─360014055ce39b844c384d6dbe13f9544 

multipath output:
360014055ce39b844c384d6dbe13f9544 dm-42 LIO-ORG ,FILEIO          
size=11G features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
  `- 81:0:0:0 sdad 65:208 active ready running

Comment 9 Avihai 2017-02-12 12:50:11 UTC
Engine: ovirt-engine-4.1.0.4-0.1.el7.noarch
VDSM: 4.19.2-2

Comment 10 Avihai 2017-02-12 12:51:27 UTC
Created attachment 1249529 [details]
engine & vdsm logs

Comment 11 Fred Rolland 2017-02-14 12:27:42 UTC
Hi,

Can you provide the pvs output?

Thanks

Comment 12 Avihai 2017-02-14 12:33:59 UTC
(In reply to Fred Rolland from comment #11)
> Hi,
> 
> Can you provide the pvs output?
> 
> Thanks

Luckily, I saved it last time around.

I used:
lvm pvs -vvvv --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [ '\''a|/dev/mapper/SioFABRICVicinity_iqn.2015-03.com.iofabric:ovirt-master-00|'\'', '\''r|.*|'\'' ] }  global {  locking_type=1  prioritise_write_locks=1  wait_for_locks=1  use_lvmetad=0 }  backup {  retain_min = 50  retain_days = 0 } '

I saved the output in a file, "lvs_pvs_output.txt", which I attached.

Comment 13 Avihai 2017-02-14 12:35:54 UTC
Created attachment 1250215 [details]
lvm pvs -vvvv --config output

Comment 14 Fred Rolland 2017-02-14 14:40:59 UTC
(In reply to Avihai from comment #12)
> (In reply to Fred Rolland from comment #11)
> > Hi,
> > 
> > Can you provide the pvs output?
> > 
> > Thanks
> 
> Luckily, I saved it last time around.
> 
> I used:
> lvm pvs -vvvv --config ' devices { preferred_names = ["^/dev/mapper/"]
> ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3
> filter = [
> '\''a|/dev/mapper/SioFABRICVicinity_iqn.2015-03.com.iofabric:ovirt-master-
> 00|'\'', '\''r|.*|'\'' ] }  global {  locking_type=1 
> prioritise_write_locks=1  wait_for_locks=1  use_lvmetad=0 }  backup { 
> retain_min = 50  retain_days = 0 } '
> 
> I saved the output in a file, "lvs_pvs_output.txt", which I attached.

Why did you use this command?
Do you have a device named /dev/mapper/SioFABRICVicinity_iqn.2015-03.com.iofabric:ovirt-master-00?

How did you reproduce this?
How did you get a PV with a colon in its name?

I could not see anything related in the log.

Comment 15 Avihai 2017-02-15 07:16:10 UTC
I ran this command because it was used in the original bug, to provide as much info as possible (at least, that was the intent).

My device/target name is "iqn.2015-03.com.iofabric:ovirt-master-00".

In lvs_pvs_output.txt, see the line with my current target name;
is this what we expect?

#device/dev-cache.c:352         /dev/disk/by-path/ip-10.35.88.157:3260-iscsi-iqn.2015-03.com.iofabric:ovirt-master-00-lun-0: Aliased to /dev/sdad in device cache (65:208)

Just to clarify, these are the steps I took:

1) I used a targetcli server that we have (IP 10.35.88.157).

2) On this server I created a target name with a colon (":"), similar to the bug:
   "iqn.2015-03.com.iofabric:ovirt-master-00", and added an 11G LUN to this target.

Is this not enough to ensure that the PV name contains ":"?
If not, please suggest a way to do it.

3) From the engine GUI I logged in to this server successfully and saw the expected LUN.

4) I chose the LUN, tried to create a new storage domain with it, and it failed.

Comment 16 Avihai 2017-02-15 07:20:16 UTC
Created attachment 1250511 [details]
Print screen of the engine GUI, for clarity

Comment 17 Avihai 2017-02-15 07:23:00 UTC
Created attachment 1250512 [details]
Another print screen with the GUI error

Comment 18 Nir Soffer 2017-02-15 07:34:59 UTC
(In reply to Avihai from comment #15)
> 2) On this server I created a target name with a colon (":"), similar to the bug:
>    "iqn.2015-03.com.iofabric:ovirt-master-00", and added an 11G LUN to this target.
> 
> Is this not enough to ensure that the PV name contains ":"?

No.

> If not, please suggest a way to do it.

We don't know how to reproduce a device name including a ":".

The fix was verified by the reporter. I don't think there is anything to verify
here except checking for regressions.

If you are having trouble with a target name that includes ":", it may be another bug.
Please check that you can log in to this target from the shell using iscsiadm. If it
works with iscsiadm but not in vdsm, please open another bug for it.
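
A hedged sketch of that manual check, wrapping standard iscsiadm invocations from Python; the portal and target values are the ones mentioned in comments 8 and 15 and may need adjusting for another environment.

-----------------------------------------------------------
# Sketch of the manual check suggested in comment 18: log in to the target
# from the shell using iscsiadm. Portal and target are taken from comments
# 8 and 15; adjust as needed. Only standard iscsiadm modes are used.
import subprocess

portal = "10.35.88.157:3260"
target = "iqn.2015-03.com.iofabric:ovirt-master-00"

# Discover the targets exposed by the portal.
subprocess.run(["iscsiadm", "-m", "discovery", "-t", "sendtargets", "-p", portal],
               check=True)

# Log in to the specific target. If this succeeds but the vdsm flow fails,
# a separate bug should be opened, as noted above.
subprocess.run(["iscsiadm", "-m", "node", "-T", target, "-p", portal, "--login"],
               check=True)
-----------------------------------------------------------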

Comment 19 Avihai 2017-02-15 08:56:54 UTC
@Nir, is there a way to simulate this special LUN ID by hard-coding it in the VDSM code somehow?

@Raz, currently there is no way to verify this bug; how do you want to proceed here?
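
One way to exercise the colon handling without colon-named hardware is at the unit-test level, in the spirit of the merged "storage: add tests for decodePVInfo" patches linked above. The sketch below is only an assumption about the shape of such a test; decode_pv_info, the field names, and the comma-separated key:value metadata layout are stand-ins, not the actual vdsm test code.

-----------------------------------------------------------
# Hedged sketch only: a unit-level regression check in the spirit of the
# merged "add tests for decodePVInfo" patches. decode_pv_info, the field
# names and the metadata layout below are stand-ins, not the real vdsm code.

def decode_pv_info(value):
    # Assumed encoding: comma-separated "key:value" pairs whose values may
    # themselves contain ':' (for example an iSCSI-derived device name).
    return dict(item.split(":", 1) for item in value.split(","))

def test_pv_name_with_colon():
    meta = ("pecount:77,"
            "devname:/dev/mapper/SioFABRICVicinity_iqn.2015-03.com.iofabric:ovirt-master-00")
    info = decode_pv_info(meta)
    assert info["pecount"] == "77"
    assert info["devname"].endswith("iofabric:ovirt-master-00")

test_pv_name_with_colon()
print("ok")
-----------------------------------------------------------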

Comment 20 Raz Tamir 2017-02-15 09:47:13 UTC
According to Nir's comment (comment #18), closing this bug.

