Created attachment 859180 [details]
vnc console of problematic VM

Description of problem:
Migration is reported as OK from the webadmin GUI, but there are errors in vdsm.log.

Version-Release number of selected component (if applicable):
vdsm-4.13.3-1.fc19.x86_64 and vdsm-4.13.3-3.fc19.x86_64

How reproducible:
Happened only once so far.

Steps to Reproduce:
1. Upgrade from 3.3.3rc to 3.3.3 final on F19 with the oVirt repo, two nodes with a Gluster DC
2. Migrate 3 VMs
3. From the web GUI all are OK, but one (named c6s, running CentOS 6.5) that had been stuck with a boot problem for the previous 5 days (see screenshot) gets errors in vdsm.log

Actual results:
No problem visible from the webadmin GUI.

Expected results:
Some feedback. Actually I can see that no IP is shown in the webadmin GUI. In fact the VM is stopped at boot with a problem on the /boot fs, because the superblock last write time is in the future.

Additional info:
During the migration itself the two nodes were not aligned. On the source host I had 3.3.3rc packages such as:
vdsm-4.13.3-1.fc19.x86_64
kernel-3.12.8-200.fc19.x86_64

On the destination I had already updated packages (also the Fedora ones), such as:
vdsm-4.13.3-3.fc19.x86_64
kernel-3.12.9-201.fc19.x86_64

Instead libvirt and qemu-kvm were the same on both:
libvirt-1.0.5.9-1.fc19.x86_64
qemu-kvm-1.4.2-15.fc19.x86_64
Created attachment 859185 [details] vdsm log on source host
Created attachment 859186 [details] dest vdsm log
Created attachment 859189 [details] source supervdsm log
Created attachment 859195 [details] dest supervdsm log
MainProcess|Thread-786::DEBUG::2013-12-17 12:19:21,410::utils::489::root::(execCmd) '/usr/sbin/gluster --mode=script volume info --xml' (cwd None)
MainProcess|Thread-786::DEBUG::2013-12-17 12:19:21,433::utils::509::root::(execCmd) SUCCESS: <err> = ''; <rc> = 0
MainProcess|Thread-786::DEBUG::2013-12-17 12:19:21,434::supervdsmServer::102::SuperVdsm.ServerCallback::(wrapper) return wrapper with {'gviso': {'transportType': ['TCP'], 'uuid': 'c8cbcac7-1d40-4cee-837d-bb97467fb2bd', 'bricks': ['f18ovn01.mydomain:/gluster/ISO_GLUSTER/brick1', 'f18ovn03.mydomain:/gluster/ISO_GLUSTER/brick1'], 'volumeName': 'gviso', 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2', 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1', 'options': {'storage.owner-gid': '36', 'storage.owner-uid': '36', 'server.allow-insecure': 'on'}}, 'gvdata': {'transportType': ['TCP'], 'uuid': 'ed71a4c2-6205-4aad-9aab-85da086d5ba3', 'bricks': ['f18ovn01.mydomain:/gluster/DATA_GLUSTER/brick1', 'f18ovn03.mydomain:/gluster/DATA_GLUSTER/brick1'], 'volumeName': 'gvdata', 'volumeType': 'REPLICATE', 'replicaCount': '2', 'brickCount': '2', 'distCount': '2', 'volumeStatus': 'ONLINE', 'stripeCount': '1', 'options': {'storage.owner-gid': '36', 'storage.owner-uid': '36', 'server.allow-insecure': 'on'}}}
MainProcess|Thread-107::ERROR::2013-12-17 12:47:30,511::supervdsmServer::99::SuperVdsm.ServerCallback::(wrapper) Error in wrapper
Traceback (most recent call last):
  File "/usr/share/vdsm/supervdsmServer", line 97, in wrapper
    res = func(*args, **kwargs)
  File "/usr/share/vdsm/supervdsmServer", line 367, in wrapper
    return func(*args, **kwargs)
  File "/usr/share/vdsm/gluster/__init__.py", line 31, in wrapper
    return func(*args, **kwargs)
  File "/usr/share/vdsm/gluster/cli.py", line 309, in volumeStatus
    return _parseVolumeStatus(xmltree)
  File "/usr/share/vdsm/gluster/cli.py", line 135, in _parseVolumeStatus
    if value['path'] == 'localhost':
KeyError: 'path'
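The traceback ends in vdsm's gluster/cli.py because a parsed <node> entry has no 'path' key. Below is a minimal defensive sketch, illustrative only and not the actual vdsm code (the function and variable names are made up), of how such entries could be skipped instead of raising:

# Illustrative sketch only, not vdsm's gluster/cli.py. 'nodes' stands for
# the list of dicts built from the <node> elements of
# 'gluster volume status <volume> --xml'; some entries may lack 'path'.
def filter_brick_nodes(nodes):
    bricks = []
    for value in nodes:
        path = value.get('path')   # .get() avoids the KeyError from the log
        if path is None:
            continue               # not a brick-like entry; skip it
        bricks.append(value)
    return bricks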
It looks like 'path' is not set by glusterfs. Could you share the glusterfs version details?
It is the standard version now provided in Fedora 19:

$ rpm -qa|grep gluster
glusterfs-3.4.2-1.fc19.x86_64
glusterfs-server-3.4.2-1.fc19.x86_64
glusterfs-api-3.4.2-1.fc19.x86_64
glusterfs-libs-3.4.2-1.fc19.x86_64
glusterfs-rdma-3.4.2-1.fc19.x86_64
glusterfs-cli-3.4.2-1.fc19.x86_64
glusterfs-fuse-3.4.2-1.fc19.x86_64
glusterfs-api-devel-3.4.2-1.fc19.x86_64
vdsm-gluster-4.13.3-3.fc19.noarch
glusterfs-devel-3.4.2-1.fc19.x86_64

Is there any setting I have to put in?

Gianluca
Could you share the 'gluster volume status --xml' output?
Created attachment 859495 [details] output of gluster volume status on f18ovn01
Created attachment 859497 [details] output of gluster volume status on f18ovn03
Please note that gviso is at the moment defined in Gluster but not used by oVirt. It was created to serve as an ISO domain but is not configured for that yet.

Gianluca
I remember a similar kind of issue was found in glusterfs; the respective fix is at http://review.gluster.org/6571. I see this fix is available in master but not in the release-3.x branches. However, the attached 'gluster volume status --xml' output does not exhibit this. I see vdsm uses the 'gluster volume status <volumeName> --xml' form; this could be the problem.

Kaushal, any thoughts?
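For reference, a minimal inspection sketch to dump which <node> entries of the per-volume status XML carry a <path> child. The command form is the one quoted above; the helper itself is only an ad-hoc script, not vdsm's code, and assumes the gluster CLI is installed locally:

# Ad-hoc inspection helper, not part of vdsm. Assumes the gluster CLI is
# available in PATH and that the given volume name (e.g. 'gvdata') exists.
import subprocess
import xml.etree.ElementTree as ET

def dump_status_nodes(volume):
    out = subprocess.check_output(
        ['gluster', '--mode=script', 'volume', 'status', volume, '--xml'])
    root = ET.fromstring(out)
    # Walk every <node> element; brick entries normally carry a <path>
    # child, while entries missing it are the ones that trip vdsm.
    for node in root.iter('node'):
        print(node.findtext('hostname'), node.findtext('path'))

# Example: dump_status_nodes('gvdata')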
My findings:

1. If the volume is distributed-replicate and one node of a replica set is down, this bug appears due to a nested <node> tag, as per bz#1046020 (see the illustrative sketch after this comment).
2. The fix http://review.gluster.org/6571 is required for the release-3.x branches.

Gianluca, can you confirm your volume matches point 1?
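To illustrate point 1, here is a hand-written, simplified XML fragment (not captured from gluster; real output has more fields) showing how a nested <node> element without a <path> child ends up in a flat iteration over <node> tags, which matches the KeyError: 'path' seen in the vdsm log:

# Hand-written XML meant only to illustrate the nested <node> problem
# described in point 1 / bz#1046020; it is not real gluster output.
import xml.etree.ElementTree as ET

SAMPLE = """
<volStatus>
  <node>
    <hostname>f18ovn01.mydomain</hostname>
    <path>/gluster/DATA_GLUSTER/brick1</path>
    <node>
      <hostname>f18ovn03.mydomain</hostname>
      <status>0</status>
    </node>
  </node>
</volStatus>
"""

root = ET.fromstring(SAMPLE)
for node in root.iter('node'):        # iter() also yields the nested <node>
    # The inner <node> has no <path>; code indexing value['path'] on such
    # an entry raises exactly the KeyError shown in the vdsm traceback.
    print(node.findtext('hostname'), node.findtext('path'))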
I confirm the distributed replicated configuration between the two hosts:

# gluster volume info gvdata

Volume Name: gvdata
Type: Replicate
Volume ID: ed71a4c2-6205-4aad-9aab-85da086d5ba3
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: f18ovn01.mydomain:/gluster/DATA_GLUSTER/brick1
Brick2: f18ovn03.mydomain:/gluster/DATA_GLUSTER/brick1
Options Reconfigured:
server.allow-insecure: on
storage.owner-uid: 36
storage.owner-gid: 36
cluster.quorum-type: none
Setting target release to the current version for consideration and review. Please do not push non-RFE bugs to an undefined target release, to make sure bugs are reviewed for relevancy, fix, closure, etc.
This is an automated message. Re-targeting all non-blocker bugs still open on 3.4.0 to 3.4.1.
This is an automated message. oVirt 3.4.1 has been released. This issue has been retargeted to 3.5.0 since it has not been marked as a high priority or severity issue; please retarget if needed.
The patch http://review.gluster.org/6571 is still not available in the release-3.5 and release-3.4 branches. Could you make it available in the next upstream glusterfs release?
glusterfs-3.5 now contains this fix, hence marking this bug ON_QA.
oVirt 3.5 has been released and should include the fix for this issue.