Bug 835920 - 3.1 - vdsm - beta1 PosixFS: after reconstruct, data-center is UP and storage is unknown (stuck)
Summary: 3.1 - vdsm - beta1 PosixFS: after reconstruct, data-center is UP and storage is unknown (stuck)
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: vdsm
Version: 6.3
Hardware: x86_64
OS: Linux
Priority: high
Severity: urgent
Target Milestone: beta
Target Release: ---
Assignee: Laszlo Hornyak
QA Contact: Daniel Paikov
URL:
Whiteboard: storage
Keywords:
Duplicates: 814331, 835949
Depends On:
Blocks:
 
Reported: 2012-06-27 14:34 UTC by Haim
Modified: 2013-09-30 23:25 UTC (History)
CC List: 20 users

Doc Text:
In an earlier version of Red Hat Enterprise Virtualization, when working with PosixFS (Gluster) and migrating data domains, reconstruction of the data domains would sometimes fail: when reconstruct commands were sent to VDSM, the storage domain acquired an "unknown" status while the status of the data center remained "UP", even though reconstruct and spmStart both succeeded on VDSM. This was because VDSM was reporting the storage type as "SHAREDFS" instead of "POSIXFS". VDSM has now been updated, and storage migration now works as expected.
Clone Of:
Last Closed: 2012-12-04 19:01:29 UTC


Attachments
engine.log (120.12 KB, application/x-gzip)
2012-06-27 14:37 UTC, Haim
no flags


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2012:1508 normal SHIPPED_LIVE Important: rhev-3.1.0 vdsm security, bug fix, and enhancement update 2012-12-04 23:48:05 UTC

Description Haim 2012-06-27 14:34:07 UTC
Description of problem:

On PosixFS (using GlusterFS), after a reconstruct command is sent to vdsm, the storage domain goes to "unknown" while the data-center status stays UP.
Both reconstruct and spmStart succeeded on vdsm.

No errors on the vdsm side (host is SPM) - I think the problem lies in the fact that the engine failed to change the storage status in the DB...

The following error repeats in the engine logs:

2012-06-27 20:34:11,595 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-86) [3fb85499] IrsBroker::Failed::GetStoragePoolInfoVDS due to: IRSErrorException: IRSErrorException: 
2012-06-27 20:34:21,680 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.GetStoragePoolInfoVDSCommand] (QuartzScheduler_Worker-47) [7cc11628] irsBroker::BuildStorageDynamicFromXmlRpcStruct::Failed building Storage dynamic, xmlRpcStruct = org.ovirt.engine.core.vdsbroker.xmlrpc.XmlRpcStruct@1b4351d6
2012-06-27 20:34:21,680 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.GetStoragePoolInfoVDSCommand] (QuartzScheduler_Worker-47) [7cc11628] org.ovirt.engine.core.vdsbroker.irsbroker.IRSErrorException: IRSErrorException:
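
As a rough illustration of the suspicion above (that once this call fails the engine never updates the storage status in the DB), here is a minimal Python sketch. The class and function names are made up for illustration and do not come from the actual engine code, which is Java:

# Minimal sketch: the domain status row in the DB is only refreshed when the
# pool info from vdsm can be parsed; a parse failure leaves it untouched.

class FakeDb:
    """Stand-in for the engine's DB layer (hypothetical)."""
    def __init__(self):
        self.domain_status = {"e8c3bc49-3d28-433d-a215-63ff96fcbc97": "unknown"}

    def update_storage_domain_status(self, domain_id, status):
        self.domain_status[domain_id] = status

def refresh_pool_status(db, parse_pool_info, raw_info):
    try:
        info = parse_pool_info(raw_info)
    except Exception:
        # Corresponds to the IRSErrorException above: the updates below never
        # run, so the domain stays "unknown" while the data-center stays UP.
        return
    for domain_id, status in info["domains"].items():
        db.update_storage_domain_status(domain_id, status)

def failing_parse(raw_info):
    raise ValueError("cannot build storage dynamic from XML-RPC struct")

db = FakeDb()
refresh_pool_status(db, failing_parse, raw_info={})
print(db.domain_status)  # the domain is still 'unknown'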


Database capture of all related tables:

engine=# SELECT * from storage_domain_static;
                  id                  |               storage                |   storage_name   | storage_domain_type | storage_type | storage_domain_format_type |         _create_date          |         _update_date          | recoverable 
--------------------------------------+--------------------------------------+------------------+---------------------+--------------+----------------------------+-------------------------------+-------------------------------+-------------
 fdbeb420-5422-4047-b241-2254cb131e34 | 879b3711-765e-45ac-b8aa-c71612822522 | myglusterDomin2  |                   0 |            1 | 0                          | 2012-06-26 21:34:49.734133+03 | 2012-06-26 21:35:43.316215+03 | t
 faa99bc4-ecc5-4945-b0c4-3a8530709e04 | e5a46a66-f0e2-49a6-8699-e0c845465d8a | myglusterDomin1  |                   1 |            1 | 0                          | 2012-06-26 21:26:39.106975+03 | 2012-06-26 21:35:43.316215+03 | t
 748039c1-9e96-459c-809f-6590fe11a37b | 0dabafbc-fd45-47c4-b8ff-ec833880226e | myDom            |                   0 |            6 | 0                          | 2012-06-26 21:55:53.281407+03 | 2012-06-26 21:56:04.709448+03 | t
 e8c3bc49-3d28-433d-a215-63ff96fcbc97 | 2bd3367b-53f0-4fdb-9c3d-b472df8e64c7 | gluster2-volumes |                   0 |            1 | 0                          | 2012-06-27 16:15:47.882953+03 | 2012-06-27 16:15:51.892524+03 | t
(4 rows)

engine=# SELECT * from storage_domain_dynamic;
                  id                  | available_disk_size | used_disk_size 
--------------------------------------+---------------------+----------------
 e8c3bc49-3d28-433d-a215-63ff96fcbc97 |                  13 |              4
 fdbeb420-5422-4047-b241-2254cb131e34 |                  14 |              3
 faa99bc4-ecc5-4945-b0c4-3a8530709e04 |                  14 |              3
 748039c1-9e96-459c-809f-6590fe11a37b |                  14 |              3
(4 rows)

engine=# SELECT * from storage_pool;
                  id                  |   name   |       description       | storage_pool_type | storage_pool_format_type | status | master_domain_version |              spm_vds_id              | compatibility_version |         _create_date          |         _update_date          | quota_enforcement_type 
--------------------------------------+----------+-------------------------+-------------------+--------------------------+--------+-----------------------+--------------------------------------+-----------------------+-------------------------------+-------------------------------+------------------------
 75659836-bedc-11e1-ad25-001a4a16970e | Default  | The default Data Center |                 3 |                          |      0 |                     0 |                                      | 3.1                   | 2012-06-25 18:43:06.947764+03 | 2012-06-25 19:22:34.435044+03 |                      0
 880465a5-2db7-42c7-b567-16c2b1a074e0 | gluster  |                         |                 1 | 0                        |      4 |                     2 |                                      | 3.1                   | 2012-06-26 21:25:06.494617+03 | 2012-06-26 21:42:52.410258+03 |                      2
 b66cb5e6-1e47-4644-bcd2-fdd8d6b5f394 | kaka2    |                         |                 4 |                          |      0 |                     0 | 00000000-0000-0000-0000-000000000000 | 3.1                   | 2012-06-25 20:41:47.429586+03 |                               |                      0
 e7c3db96-290e-413a-b06f-78628230b4f1 | Gluster2 |                         |                 1 | 0                        |      1 |                     1 | 1d1d51f4-bee2-11e1-a36f-001a4a16970e | 3.1                   | 2012-06-27 16:15:08.456696+03 | 2012-06-27 16:16:38.055742+03 |                      0
 def8e8d2-9711-4adb-b86d-309051c7027a | PosixFS  |                         |                 6 | 0                        |      1 |                     1 | 0fc63d04-c072-11e1-9011-001a4a16970e | 3.1                   | 2012-06-26 21:43:09.057281+03 | 2012-06-27 19:35:54.730954+03 |                      0
(5 rows)

Comment 1 Haim 2012-06-27 14:37:36 UTC
Created attachment 594803 [details]
engine.log

Comment 2 mkublin 2012-06-28 08:58:17 UTC
After investigating with Haim, it looks like the problem is not in reconstruct; the problem is that vdsm reports storage type SHAREDFS instead of POSIXFS in getStoragePoolInfo. Moving to Ayal.
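
For context, a minimal sketch of that mismatch (illustrative only: the tables, names, and the constant 6, which matches the storage_pool_type of the PosixFS pool in the DB capture above, are assumptions, not the actual vdsm or engine code):

# Hypothetical vdsm-side table mapping the internal domain-type constant to
# the string reported in getStoragePoolInfo.
VDSM_TYPE_NAMES = {6: "SHAREDFS"}    # before the fix; "POSIXFS" afterwards

# Hypothetical set of type names the engine-side parser accepts.
ENGINE_ACCEPTED_NAMES = {"UNKNOWN", "NFS", "FCP", "ISCSI", "LOCALFS", "CIFS", "POSIXFS"}

def build_storage_dynamic(reported_type):
    # Mimics the BuildStorageDynamicFromXmlRpcStruct step that fails in the log.
    if reported_type not in ENGINE_ACCEPTED_NAMES:
        raise ValueError("unrecognized storage type: %s" % reported_type)
    return {"type": reported_type}

try:
    build_storage_dynamic(VDSM_TYPE_NAMES[6])
except ValueError as err:
    print(err)    # unrecognized storage type: SHAREDFS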

Comment 4 mkublin 2012-07-10 05:48:16 UTC
*** Bug 814331 has been marked as a duplicate of this bug. ***

Comment 5 mkublin 2012-07-10 05:49:30 UTC
*** Bug 835949 has been marked as a duplicate of this bug. ***

Comment 6 Laszlo Hornyak 2012-07-10 11:55:45 UTC
http://gerrit.ovirt.org/6103

Comment 7 Laszlo Hornyak 2012-07-13 06:04:34 UTC
I73b0d29cf39a45589d90335e88ae84c5744796e1

Comment 11 Haim 2012-08-12 15:58:47 UTC
Verified on si13.2 with vdsm 4.9-27. Managed to create a 2-host setup with 2 PosixFS data domains and migrate the master between the two domains.

Comment 13 Laszlo Hornyak 2012-10-24 07:05:58 UTC
I think it is ok to include it in release notes.

Comment 14 Jacob Wyatt 2012-11-02 20:59:26 UTC
2 nodes, 1 engine

Glusterfs 3.3.1
vdsm 4.10  
ovirt-engine 3.1
Fedora 17

Successfully created a "brick" on each of the 2 cluster nodes and then created a volume via the oVirt web interface. I added that volume as the Data (Master) domain to the cluster; it initializes and starts, but then the nodes constantly contend for SPM. If I put one of the nodes in maintenance mode, everything is fine.

I hope this is the right bug. I linked over from a duplicate that appeared to match my issue.  Thanks.

Comment 16 errata-xmlrpc 2012-12-04 19:01:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-1508.html

