Bug 1884223 - [GSS][RHHI 1.7][Error message '<device-path> is not a valid name for this device' showing up every two hours]
Summary: [GSS][RHHI 1.7][Error message '<device-path> is not a valid name for this device' showing up every two hours]
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: cockpit-ovirt
Classification: oVirt
Component: gluster-ansible
Version: ---
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: ovirt-4.4.5
Target Release: 0.14.18
Assignee: Parth Dhanjal
QA Contact: SATHEESARAN
URL:
Whiteboard:
Depends On:
Blocks: 1851114
 
Reported: 2020-10-01 11:54 UTC by Gobinda Das
Modified: 2021-03-22 12:57 UTC
CC List: 11 users

Fixed In Version: cockpit-ovirt-0.14.18
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1851114
Environment:
Last Closed: 2021-03-22 10:22:25 UTC
oVirt Team: Gluster
Embargoed:
pm-rhel: ovirt-4.4+
godas: devel_ack+




Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 111614 0 master MERGED VGName and Thinpool name should be less than 55 characters 2021-01-19 08:29:51 UTC

Description Gobinda Das 2020-10-01 11:54:20 UTC
Description of problem:

Issue description:

* RHHI 1.7 installed. The customer is receiving the following error message every two hours:

Jun 17 20:33:53 ddin-kar-rhhi-2 vdsm[36674]: ERROR Internal server error#012Traceback (most recent call last):#012  File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 345, in _handle_request#012    
res = method(**params)#012  File "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 194, in _dynamicMethod#012    result = fn(*methodArgs)#012  File "/usr/lib/python2.7/site-packages/vdsm/gluster/apiwrapper.py", 
line 84, in storageDevicesList#012    return self._gluster.storageDevicesList()#012  File "/usr/lib/python2.7/site-packages/vdsm/gluster/api.py", line 93, in wrapper#012    rv = func(*args, **kwargs)#012  
File "/usr/lib/python2.7/site-packages/vdsm/gluster/api.py", line 519, in storageDevicesList#012    status = self.svdsmProxy.glusterStorageDevicesList()#012  File "/usr/lib/python2.7/site-packages/vdsm/common/supervdsm.py", 
line 56, in __call__#012    return callMethod()#012  File "/usr/lib/python2.7/site-packages/vdsm/common/supervdsm.py", line 54, in <lambda>#012    **kwargs)#012  File "<string>", line 2, in glusterStorageDevicesList#012  
File "/usr/lib64/python2.7/multiprocessing/managers.py", line 773, in _callmethod#012    raise convert_to_error(kind, result)#012ValueError: gluster_thinpool_gluster_vg_3600508b1001c84fdb22ae627d7a840e2p3 is not a valid name for this device

* This is the device reported by the storageDevicesList command:

3600508b1001c0ccc07ddd1090e3147f4p3                                                                                      253:3    0 832.2G  0 part
    ├─gluster_vg_3600508b1001c0ccc07ddd1090e3147f4p3-gluster_thinpool_gluster_vg_3600508b1001c0ccc07ddd1090e3147f4p3_tmeta   253:9    0     3G  0 lvm
    │ └─gluster_vg_3600508b1001c0ccc07ddd1090e3147f4p3-gluster_thinpool_gluster_vg_3600508b1001c0ccc07ddd1090e3147f4p3-tpool 253:11   0 826.2G  0 lvm
    │   ├─gluster_vg_3600508b1001c0ccc07ddd1090e3147f4p3-gluster_thinpool_gluster_vg_3600508b1001c0ccc07ddd1090e3147f4p3


* This is the error observed in the supervdsm logs:

MainProcess|jsonrpc/7::ERROR::2020-06-24 16:47:25,490::supervdsm_server::103::SuperVdsm.ServerCallback::(wrapper) Error in storageDevicesList
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/supervdsm_server.py", line 101, in wrapper
    res = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/gluster/storagedev.py", line 152, in storageDevicesList
    _reset_blivet(blivetEnv)
  File "/usr/lib/python2.7/site-packages/vdsm/gluster/storagedev.py", line 143, in _reset_blivet
    blivetEnv.reset()
  File "/usr/lib/python2.7/site-packages/blivet/__init__.py", line 511, in reset
    self.devicetree.populate(cleanupOnly=cleanupOnly)
  File "/usr/lib/python2.7/site-packages/blivet/devicetree.py", line 2256, in populate
    self._populate()
  File "/usr/lib/python2.7/site-packages/blivet/devicetree.py", line 2323, in _populate
    self.addUdevDevice(dev)
  File "/usr/lib/python2.7/site-packages/blivet/devicetree.py", line 1235, in addUdevDevice
    device = self.addUdevLVDevice(info)
  File "/usr/lib/python2.7/site-packages/blivet/devicetree.py", line 713, in addUdevLVDevice
    self.addUdevDevice(pv_info)
  File "/usr/lib/python2.7/site-packages/blivet/devicetree.py", line 1293, in addUdevDevice
    self.handleUdevDeviceFormat(info, device)
  File "/usr/lib/python2.7/site-packages/blivet/devicetree.py", line 2009, in handleUdevDeviceFormat
    self.handleUdevLVMPVFormat(info, device)
  File "/usr/lib/python2.7/site-packages/blivet/devicetree.py", line 1651, in handleUdevLVMPVFormat
    self.handleVgLvs(vg_device)
  File "/usr/lib/python2.7/site-packages/blivet/devicetree.py", line 1588, in handleVgLvs
    addLV(lv)
  File "/usr/lib/python2.7/site-packages/blivet/devicetree.py", line 1558, in addLV
    exists=True, **lv_kwargs)
  File "/usr/lib/python2.7/site-packages/blivet/devices/lvm.py", line 1190, in __init__
    segType=segType)
  File "/usr/lib/python2.7/site-packages/blivet/devices/lvm.py", line 554, in __init__
    exists=exists)
  File "/usr/lib/python2.7/site-packages/blivet/devices/dm.py", line 73, in __init__
    parents=parents, sysfsPath=sysfsPath)
  File "/usr/lib/python2.7/site-packages/blivet/devices/storage.py", line 131, in __init__
    Device.__init__(self, name, parents=parents)
  File "/usr/lib/python2.7/site-packages/blivet/devices/device.py", line 84, in __init__
    raise ValueError("%s is not a valid name for this device" % name)
ValueError: gluster_thinpool_gluster_vg_3600508b1001c84fdb22ae627d7a840e2p3 is not a valid name for this device

* This looks to be a blivet issue, although the python-blivet package seems up to date:

python-blivet-0.61.15.75-1.el7.noarch                       Wed May 13 15:59:27 2020

Could you please assist in resolving these errors?
Is this caused by the partition naming of the device?

Thank you,

Natalia

--- Additional comment from RHEL Program Management on 2020-06-25 15:55:44 UTC ---

This bug is automatically being proposed for the RHHI-V 1.8 release of the Red Hat Hyperconverged Infrastructure for Virtualization product, by setting the release flag 'rhiv-1.8' to '?'.

If this bug should be proposed for a different release, please manually change the proposed release flag.

--- Additional comment from  on 2020-06-25 15:57:06 UTC ---

The sosreport of the affected node is available at /cases/02684773/0010-sosreport-ddin-kar-rhhi-2-02684773-2020-06-24-opuiuqy.tar.xz/sosreport-ddin-kar-rhhi-2-02684773-2020-06-24-opuiuqy in support-shell.

Case number: 02684773
Customer: Db Systems Gmbh
Severity: 3

--- Additional comment from Yaniv Kaul on 2020-06-28 09:33:58 UTC ---

If it's a blivet issue, did you consult with blivet developers?

--- Additional comment from nravinas on 2020-06-29 06:13:44 UTC ---

I suspect this might be a blivet problem, but I don't have enough evidence. That's why I submitted this BZ to the RHHI team. If there's an alternate method or procedure to contact the blivet developers working on RHHI, please let me know.

Thank you,

Natalia

--- Additional comment from nravinas on 2020-06-30 12:14:50 UTC ---

Good afternoon.

I'd appreciate an initial evaluation of this BZ, please.

Thank you,

Natalia

--- Additional comment from Gobinda Das on 2020-07-01 12:51:17 UTC ---

Hi Vojtech,
 I was just referring to a similar BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1649364#c17
But here the customer only used "_". Is there anything specific we need to look at?

--- Additional comment from Vojtech Trefny on 2020-07-01 13:37:31 UTC ---

The problem here is the length of the name. The limit for LVM names in blivet is currently 55 characters, and "gluster_thinpool_gluster_vg_3600508b1001c84fdb22ae627d7a840e2p3" is 63.
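
The arithmetic is easy to verify with a couple of lines of plain Python (illustrative only; blivet's actual check is the one raising the ValueError in blivet/devices/device.py in the traceback above):

# Illustrative only: the failing name against the 55-character limit
# quoted above for LVM names in blivet.
BLIVET_LVM_NAME_LIMIT = 55

name = "gluster_thinpool_gluster_vg_3600508b1001c84fdb22ae627d7a840e2p3"
print(len(name))                           # 63
assert len(name) > BLIVET_LVM_NAME_LIMIT   # hence the ValueError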

--- Additional comment from nravinas on 2020-07-01 13:49:42 UTC ---

Thank you very much. Will an lvrename of the thin-pool to a shorter name solve this issue?
Anything to be taken into account? In my lab setup it worked just fine, but I don't know if changing the thin-pool name in RHHI requires any further modifications.

Thank you,

Natalia

--- Additional comment from SATHEESARAN on 2020-07-11 08:00:59 UTC ---

(In reply to nravinas from comment #8)
> Thank you very much. Will an lvrename of the thin-pool to a shorter name
> solve this issue?
> Anything to be taken into account? In my lab setup it worked just fine, but
> I don't know if changing the thin-pool name in RHHI requires any further
> modifications.
> 
> Thank you,
> 
> Natalia

Hello Natalia,

I am the Quality Engineer validating RHHI-V.
I too have tested renaming the VG as well as the thinpool, and this works
fine in an RHHI environment. There is no effect on the LVs.

Go ahead and recommend this to the customer.

--- Additional comment from SATHEESARAN on 2020-07-11 08:02:21 UTC ---

Similar to this bug, we also have another bug on the same topic, but there the deployment itself fails
when using multipath device names and also enabling LV cache.

1. https://bugzilla.redhat.com/show_bug.cgi?id=1851114 ( This bug )
This bug is hit with the multipath configuration enabled (no need for LV cache).
In this case, deployment succeeds, but due to the python-blivet restriction of LV names to 55 chars,
error messages are thrown in the vdsm.log and supervdsm.log files.

2. https://bugzilla.redhat.com/show_bug.cgi?id=1855945
On the other hand, there the deployment itself fails, with the multipath configuration enabled
and LV cache also configured, using device names of the form /dev/mapper/<WWID>.

--- Additional comment from SATHEESARAN on 2020-07-11 08:03:28 UTC ---

Marking this bug as a known_issue for RHHI-V 1.8.

--- Additional comment from Gobinda Das on 2020-07-15 06:56:11 UTC ---

(In reply to nravinas from comment #8)
> Thank you very much. Will an lvrename of the thin-pool to a shorter name
> solve this issue?
> Anything to be taken into account? In my lab setup it worked just fine, but
> I don't know if changing the thin-pool name in RHHI requires any further
> modifications.
> 
> Thank you,
> 
> Natalia

Hi Natalia,
 As discussed over Google Chat, those steps alone are enough.

--- Additional comment from nravinas on 2020-07-20 09:04:23 UTC ---

> 
> Hello Natalia,
> 
> I am the Quality Engineer validating RHHI-V.
> I too have tested renaming the VG as well as the thinpool, and this works
> fine in an RHHI environment. There is no effect on the LVs.
> 
> Go ahead and recommend this to the customer.

Hello, Satheesaran

These were the recommended steps (a scripted sketch of these steps follows below):

1. vgdisplay
   Get the VG UUID: <VG UUID>
2. vgrename <VG UUID> <New VG name>
3. lvrename <New VG name> <Old LV> <New LV>
4. From the engine, click on each host -> Storage Devices -> Sync
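
For illustration only, here is a minimal Python sketch of steps 1-3; the VG UUID and the new names are placeholders rather than values from this case, and step 4 (Storage Devices -> Sync) still has to be done from the engine UI:

#!/usr/bin/env python3
# Hypothetical sketch of steps 1-3 above; run only after getting the
# VG UUID from `vgdisplay` and checking the setup is healthy.
import subprocess

OLD_VG_UUID = "<VG UUID>"                # placeholder, from vgdisplay
NEW_VG_NAME = "gluster_vg_disk1"         # placeholder, keep under 55 chars
OLD_LV_NAME = "gluster_thinpool_gluster_vg_3600508b1001c84fdb22ae627d7a840e2p3"
NEW_LV_NAME = "gluster_thinpool_disk1"   # placeholder, keep under 55 chars

subprocess.run(["vgrename", OLD_VG_UUID, NEW_VG_NAME], check=True)               # step 2
subprocess.run(["lvrename", NEW_VG_NAME, OLD_LV_NAME, NEW_LV_NAME], check=True)  # step 3
# Step 4: in the engine UI, open each host -> Storage Devices -> Sync.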

Is this safe to execute on a production system? In my test setup everything went fine, but I just want to double-check, since the customer is concerned about this.

Thank you,

Natalia

--- Additional comment from SATHEESARAN on 2020-07-21 09:07:25 UTC ---

(In reply to nravinas from comment #13)
> > 
> > Hello Natalia,
> > 
> > I am the Quality Engineer validating RHHI-V.
> > I too have tested renaming the VG as well as the thinpool, and this works
> > fine in an RHHI environment. There is no effect on the LVs.
> > 
> > Go ahead and recommend this to the customer.
> 
> Hello, Satheesaran
> 
> These were the recommended steps:
> 
> 1. vgdisplay
> 
> Get VG UUID: <VG UUID>
> 
> 2. vgrename <VG UUID> <New VG name>
> 3. lvrename <New VG name> <Old LV> <New LV>
> 4.  From engine click on each host -> Storage Devices-> Sync
> 
> Is this safe to be executed on a production system? In my test setup
> everything went fine, but just to double-check, since the customer is
> concerned about this.
> 
> Thank you,
> 
> Natalia

Yes, this should be fine, as I tested it on a QE setup with RHHI-V 1.7.

--- Additional comment from Disha Walvekar on 2020-07-22 06:27:57 UTC ---

Hi Gobinda,

Please review the updated doc text.
Thank you.

--- Additional comment from Gobinda Das on 2020-07-22 06:40:19 UTC ---

(In reply to Disha Walvekar from comment #15)
> Hi Gobinda,
> 
> Please review the updated doc text.
> Thank you.

Looks good to me.

--- Additional comment from Marina Kalinin on 2020-09-17 17:53:53 UTC ---

Hi Natalia,

Can you please put this workaround in the KCS until Engineering comes up with a solution?

--- Additional comment from Marina Kalinin on 2020-09-17 17:56:46 UTC ---

Hi Parth / Natalia,

Is this applicable to ALL RHHI-V deployments, or is there something unique about this one?

--- Additional comment from nravinas on 2020-09-18 11:03:59 UTC ---

Hello, Marina.

KCS created:

https://access.redhat.com/solutions/5416311

It's not yet published. Please let me know of any corrections you'd like to add.

This issue is related to the length of the LVM thin pool name: if it's longer than 55 characters, we hit this problem. It's a limitation in the python-blivet implementation, so I don't think this is customer specific. Any other customer with a long LVM thin pool name will face this problem.

Thank you,

Natalia

--- Additional comment from SATHEESARAN on 2020-09-21 07:46:35 UTC ---

(In reply to nravinas from comment #19)
> Hello, Marina.
> 
> KCS created:
> 
> https://access.redhat.com/solutions/5416311
> 
> It's not yet published. Please, let me know any corrections you'd like to
> add.
> 
> This issue is related to the length of the LVM thin pool name: if it's
> longer than 55 characters, we hit this problem. It's a limitation in the
> python-blivet implementation, so I don't think this is customer specific.
> Any other customer with a long LVM thin pool name will face this problem.
> 
> Thank you,
> 
> Natalia

One piece of information here:

The only reason the LVM thinpool name goes beyond 55 chars is when the customer uses the multipath WWID for the disks.
In this case, each disk is addressed as /dev/mapper/<WWID>, and the subsequent VG creation and LVM thinpool use this WWID (which is 33 chars long) to build
their names.

So this issue should be evident for any customer that has multipath enabled and uses multipath WWID names for the RHHI-V deployment.
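
To illustrate the arithmetic, here is a small sketch that composes the names the same way they appear in this report. The prefixes are taken from the device names above; the composition logic itself is an assumption for illustration, not the actual gluster-ansible code:

# Illustrative composition of generated LVM names from a multipath WWID.
# Prefixes match the names seen in this bug; gluster-ansible's exact
# construction may differ.
wwid = "3600508b1001c84fdb22ae627d7a840e2"    # 33 chars, from /dev/mapper/<WWID>
partition = "p3"

vg_name = "gluster_vg_" + wwid + partition    # 11 + 33 + 2 = 46 chars -> OK
thinpool = "gluster_thinpool_" + vg_name      # 17 + 46 = 63 chars -> over 55
print(len(vg_name), len(thinpool))            # 46 63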

--- Additional comment from nravinas on 2020-09-24 11:01:19 UTC ---

Hello, Sas.

Thanks for your feedback. I've added that note to the KCS document. Please let me know if any other modifications are needed, or if it's OK to publish it.

Thank you,

Natalia

Comment 1 Gobinda Das 2020-10-28 13:27:43 UTC
Removing this from 4.4.3 as it's a limitation in python-blivet, and not everyone agrees with fixing it from the cockpit-ovirt side.
So will create a new bug against python-blivet to fix this.
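
The fix that eventually landed (gerrit 111614, "VGName and Thinpool name should be less than 55 characters", shipped in cockpit-ovirt-0.14.18) constrains the generated names on the cockpit-ovirt side instead. As a rough sketch only, not the actual patch, a guard of that kind could truncate the device-derived part of a name; capped_name and the truncation strategy are assumptions:

# Hypothetical sketch of capping a generated LVM name at blivet's limit.
# Truncation is one possible strategy; the merged patch may differ.
BLIVET_LVM_NAME_LIMIT = 55

def capped_name(prefix: str, device_id: str) -> str:
    # Build "<prefix><device_id>" truncated to fit within the limit.
    room = BLIVET_LVM_NAME_LIMIT - len(prefix)
    return prefix + device_id[:room]

print(capped_name("gluster_thinpool_gluster_vg_",
                  "3600508b1001c84fdb22ae627d7a840e2p3"))   # 55 chars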

Comment 2 Sandro Bonazzola 2020-12-17 08:50:36 UTC
This bug missed the development freeze on Dec 11th and is not included in cockpit-ovirt-0.14.17.
If this is a blocker for 4.4.4, please mark it as a blocker. If not, please re-target to 4.4.5.

Comment 3 Gobinda Das 2020-12-18 06:18:21 UTC
This is not a blocker, and it's also a low-priority bug, so retargeting this to 4.4.5.

Comment 4 Parth Dhanjal 2020-12-22 04:22:26 UTC
This is not a blocker, and it's also a low-priority bug, so retargeting this to 4.4.5.

Comment 6 Sandro Bonazzola 2021-03-22 12:57:22 UTC
This bugzilla is included in the oVirt 4.4.5 release, published on March 18th 2021.

Since the problem described in this bug report should be resolved in the oVirt 4.4.5 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

