Bug 1061211 - migration ok from webadmin gui but errors in vdsm.log with KeyError: 'path'
Summary: migration ok from webadmin gui but errors in vdsm.log with KeyError: 'path'
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: oVirt
Classification: Retired
Component: vdsm
Version: 3.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.5.0
Assignee: Bala.FA
QA Contact: Aharon Canan
URL:
Whiteboard: gluster
Depends On: 1046020 1117241
Blocks:
 
Reported: 2014-02-04 14:54 UTC by Gianluca Cecchi
Modified: 2016-02-10 19:29 UTC
CC List: 11 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2014-10-17 12:43:02 UTC
oVirt Team: Gluster
Embargoed:


Attachments
vnc console of problematic VM (251.93 KB, image/png), 2014-02-04 14:54 UTC, Gianluca Cecchi
vdsm log on source host (119.60 KB, application/gzip), 2014-02-04 15:03 UTC, Gianluca Cecchi
dest vdsm log (28.04 KB, application/gzip), 2014-02-04 15:03 UTC, Gianluca Cecchi
source supervdsm log (352.90 KB, application/gzip), 2014-02-04 15:04 UTC, Gianluca Cecchi
dest supervdsm log (361.04 KB, application/gzip), 2014-02-04 15:04 UTC, Gianluca Cecchi
output of gluster volume status on f18ovn01 (3.55 KB, text/xml), 2014-02-05 07:06 UTC, Gianluca Cecchi
output of gluster volume status on f18ovn03 (3.55 KB, text/xml), 2014-02-05 07:07 UTC, Gianluca Cecchi

Description Gianluca Cecchi 2014-02-04 14:54:50 UTC
Created attachment 859180 [details]
vnc console of problematic VM

Description of problem:
Migration reports OK from the webadmin GUI, but there are errors in vdsm.log.

Version-Release number of selected component (if applicable):
vdsm-4.13.3-1.fc19.x86_64 and vdsm-4.13.3-3.fc19.x86_64

How reproducible:
It has happened only once so far.

Steps to Reproduce:
1. Pass from 3.3.3rc to 3.3.3 final on Fedora 19 with the oVirt repo
(two nodes with a Gluster DC)

2. Migrate 3 VMs

3. From the web GUI all appear OK, but one VM (named c6s, running CentOS 6.5), which had been stuck at boot for the previous 5 days (see screenshot), gets errors in vdsm.log

Actual results:
No problem from the webadmin GUI.

Expected results:
Some feedback. Currently no IP is shown in the webadmin GUI;
in fact, the VM is stopped at boot with a problem on the /boot filesystem because the superblock last write time is in the future.

Additional info:

Comment 1 Gianluca Cecchi 2014-02-04 14:56:12 UTC
During the migration itself the two nodes were not aligned.
In this case on source host I had 3.3.3rc packages such as
vdsm-4.13.3-1.fc19.x86_64
and
kernel-3.12.8-200.fc19.x86_64

On the destination I had already updated the packages (the Fedora ones included), such as:
vdsm-4.13.3-3.fc19.x86_64
and
kernel-3.12.9-201.fc19.x86_64

libvirt and qemu-kvm, instead, were the same on both:
libvirt-1.0.5.9-1.fc19.x86_64
qemu-kvm-1.4.2-15.fc19.x86_64

Comment 2 Gianluca Cecchi 2014-02-04 15:03:09 UTC
Created attachment 859185 [details]
vdsm log on source host

Comment 3 Gianluca Cecchi 2014-02-04 15:03:45 UTC
Created attachment 859186 [details]
dest vdsm log

Comment 4 Gianluca Cecchi 2014-02-04 15:04:14 UTC
Created attachment 859189 [details]
source supervdsm log

Comment 5 Gianluca Cecchi 2014-02-04 15:04:50 UTC
Created attachment 859195 [details]
dest supervdsm log

Comment 6 Dan Kenigsberg 2014-02-04 16:11:27 UTC
MainProcess|Thread-786::DEBUG::2013-12-17 12:19:21,410::utils::489::root::(execCmd) '/usr/sbin/gluster --mode=script volume info --xml' (cwd None)
MainProcess|Thread-786::DEBUG::2013-12-17 12:19:21,433::utils::509::root::(execCmd) SUCCESS: <err> = ''; <rc> = 0
MainProcess|Thread-786::DEBUG::2013-12-17 12:19:21,434::supervdsmServer::102::SuperVdsm.ServerCallback::(wrapper) return wrapper with {'gviso': {'transportType': ['TCP'], 'uuid': 'c8cbcac7-1d40-4cee-837d-bb97467fb2bd', 'bricks': ['f18ovn01.mydomain:/gluster/ISO_GLUSTER/brick1', 'f18ovn03.mydomain:/gluster/ISO_GLUSTER/brick1'], 'volumeName': 'gviso', 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2', 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1', 'options': {'storage.owner-gid': '36', 'storage.owner-uid': '36', 'server.allow-insecure': 'on'}}, 'gvdata': {'transportType': ['TCP'], 'uuid': 'ed71a4c2-6205-4aad-9aab-85da086d5ba3', 'bricks': ['f18ovn01.mydomain:/gluster/DATA_GLUSTER/brick1', 'f18ovn03.mydomain:/gluster/DATA_GLUSTER/brick1'], 'volumeName': 'gvdata', 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2', 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1', 'options': {'storage.owner-gid': '36', 'storage.owner-uid': '36', 'server.allow-insecure': 'on'}}}
MainProcess|Thread-107::ERROR::2013-12-17 12:47:30,511::supervdsmServer::99::SuperVdsm.ServerCallback::(wrapper) Error in wrapper
Traceback (most recent call last):
  File "/usr/share/vdsm/supervdsmServer", line 97, in wrapper
    res = func(*args, **kwargs)
  File "/usr/share/vdsm/supervdsmServer", line 367, in wrapper
    return func(*args, **kwargs)
  File "/usr/share/vdsm/gluster/__init__.py", line 31, in wrapper
    return func(*args, **kwargs)
  File "/usr/share/vdsm/gluster/cli.py", line 309, in volumeStatus
    return _parseVolumeStatus(xmltree)
  File "/usr/share/vdsm/gluster/cli.py", line 135, in _parseVolumeStatus
    if value['path'] == 'localhost':
KeyError: 'path'
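
For illustration, a minimal sketch of how the check at gluster/cli.py line 135 could tolerate entries that lack a 'path' key. This is hypothetical, not the actual vdsm fix; the function name is made up:

# Hypothetical sketch, not the actual vdsm patch: guard the lookup that
# raised the KeyError above by using dict.get() instead of value['path'].
def _is_localhost_entry(value):
    # 'value' stands in for one parsed <node> element as a dict; a down
    # replica node can yield an entry without a 'path' key (see comment 14).
    return value.get('path') == 'localhost'

# The second entry below would raise KeyError: 'path' with value['path'];
# with .get() it simply compares None == 'localhost' and returns False.
entries = [{'path': 'localhost', 'status': '1'}, {'status': '0'}]
print([_is_localhost_entry(v) for v in entries])  # [True, False]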

Comment 7 Bala.FA 2014-02-05 06:31:28 UTC
It looks like 'path' is not set in the glusterfs output. Could you share the glusterfs version details?

Comment 8 Gianluca Cecchi 2014-02-05 06:50:23 UTC
It is the standard version now provided in Fedora 19:

$ rpm -qa|grep gluster
glusterfs-3.4.2-1.fc19.x86_64
glusterfs-server-3.4.2-1.fc19.x86_64
glusterfs-api-3.4.2-1.fc19.x86_64
glusterfs-libs-3.4.2-1.fc19.x86_64
glusterfs-rdma-3.4.2-1.fc19.x86_64
glusterfs-cli-3.4.2-1.fc19.x86_64
glusterfs-fuse-3.4.2-1.fc19.x86_64
glusterfs-api-devel-3.4.2-1.fc19.x86_64
vdsm-gluster-4.13.3-3.fc19.noarch
glusterfs-devel-3.4.2-1.fc19.x86_64


Is there any setting I have to put in?

Gianluca

Comment 9 Bala.FA 2014-02-05 06:58:13 UTC
Could you share the 'gluster volume status --xml' output?

Comment 10 Gianluca Cecchi 2014-02-05 07:06:33 UTC
Created attachment 859495 [details]
output of gluster volume status on f18ovn01

Comment 11 Gianluca Cecchi 2014-02-05 07:07:01 UTC
Created attachment 859497 [details]
output of gluster volume status on f18ovn03

Comment 12 Gianluca Cecchi 2014-02-05 07:08:27 UTC
Please note that gviso at this moment is defined in Gluster but not used by oVirt.
It was created to serve as an ISO domain but has not been configured for that yet.

Gianluca

Comment 13 Bala.FA 2014-02-05 08:50:48 UTC
I remember a similar kind of issue was found in glusterfs; the respective fix was http://review.gluster.org/6571.

I see this fix is available in master but not in the release-3.x branches.

However, the attached 'gluster volume status --xml' output does not exhibit this. I see that vdsm uses the 'gluster volume status <volumeName> --xml' form instead. This could be the problem.
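
For reference, a rough sketch of that per-volume invocation, illustrative only: vdsm's real execCmd wrapper, error handling, and parsing differ, and the function name below is made up. The command shape matches the supervdsm log earlier in this bug plus the per-volume form noted above:

import subprocess
import xml.etree.ElementTree as ET

# Rough sketch of the per-volume status call; not vdsm's actual code.
def volume_status_xml(volume_name):
    out = subprocess.check_output(
        ['gluster', '--mode=script', 'volume', 'status', volume_name, '--xml'])
    return ET.fromstring(out)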

Kaushal, any thoughts?

Comment 14 Bala.FA 2014-02-05 10:36:32 UTC
My findings:

1. If the volume is distributed-replicate and one node of a replica set is down, this bug appears due to a nested <node> tag, as per bz#1046020 (see the sketch below).

2. The fix http://review.gluster.org/6571 is required in the release-3.x branches.

Gianluca, can you confirm your volume matches point 1?
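
A simplified, illustrative reproduction of point 1 follows; the XML is made up to show the nested shape, and real 'gluster volume status --xml' output carries more fields per node:

import xml.etree.ElementTree as ET

# Made-up XML mimicking the nested <node> shape described in point 1.
SAMPLE = """
<volStatus><volumes><volume>
  <node><path>/gluster/DATA_GLUSTER/brick1</path><status>1</status></node>
  <node>
    <node><path>/gluster/DATA_GLUSTER/brick1</path><status>0</status></node>
  </node>
</volume></volumes></volStatus>
"""

for el in ET.fromstring(SAMPLE).iter('node'):
    value = {child.tag: child.text for child in el}
    # The wrapper <node> has a <node> child instead of <path>, so a plain
    # value['path'] lookup raises KeyError: 'path' there; .get() shows None.
    print(value.get('path'))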

Comment 15 Gianluca Cecchi 2014-02-05 10:38:44 UTC
I confirm a distributed replicated configuration between the two hosts:
# gluster volume info gvdata
 
Volume Name: gvdata
Type: Replicate
Volume ID: ed71a4c2-6205-4aad-9aab-85da086d5ba3
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: f18ovn01.mydomain:/gluster/DATA_GLUSTER/brick1
Brick2: f18ovn03.mydomain:/gluster/DATA_GLUSTER/brick1
Options Reconfigured:
server.allow-insecure: on
storage.owner-uid: 36
storage.owner-gid: 36
cluster.quorum-type: none

Comment 16 Itamar Heim 2014-02-09 08:52:43 UTC
Setting target release to current version for consideration and review. Please
do not push non-RFE bugs to an undefined target release, to make sure bugs are
reviewed for relevancy, fix, closure, etc.

Comment 17 Sandro Bonazzola 2014-03-04 09:28:50 UTC
This is an automated message.
Re-targeting all non-blocker bugs still open on 3.4.0 to 3.4.1.

Comment 18 Sandro Bonazzola 2014-05-08 13:56:04 UTC
This is an automated message.

oVirt 3.4.1 has been released.
This issue has been retargeted to 3.5.0 since it was not marked as a high-priority or high-severity issue; please retarget if needed.

Comment 19 Bala.FA 2014-07-08 08:54:44 UTC
This patch http://review.gluster.org/6571 is still not available in the release-3.5 and release-3.4 branches. Could you make it available in the next upstream glusterfs release?

Comment 20 Sahina Bose 2014-09-23 06:13:55 UTC
glusterfs-3.5 contains this fix now. Hence, marking this bug ON_QA.

Comment 21 Sandro Bonazzola 2014-10-17 12:43:02 UTC
oVirt 3.5 has been released and should include the fix for this issue.

