Bug 1162283

Summary: Cannot register ovirt-node image to ovirt-engine
Product: [Retired] oVirt
Reporter: Raul Laansoo <raul.laansoo>
Component: vdsm
Assignee: Nir Soffer <nsoffer>
Status: CLOSED DUPLICATE
QA Contact: Gil Klein <gklein>
Severity: high
Docs Contact:
Priority: unspecified
Version: 3.5
CC: amureini, bazulay, bugs, danken, dfediuck, ecohen, fdeutsch, gklein, iheim, lsurette, mgoldboi, raul.laansoo, rbalakri, yeylon
Target Milestone: ---
Target Release: 3.5.1
Hardware: Unspecified
OS: Unspecified
Whiteboard: storage
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-11-18 06:03:18 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1161021
Bug Blocks: 1193195
Attachments:
Description Flags
node logs none
node logs none

Description Raul Laansoo 2014-11-10 18:07:58 UTC
Created attachment 955920
node logs

Description of problem:
Every time I configure a node with the latest vdsm packages (either installing them on bare CentOS or using the oVirt node image), all my storage devices go offline after I add the node to the engine: all multipath devices fail, or the disks become corrupted after a vdsm service restart. I have attached the dmesg output from one such case.

Version-Release number of selected component (if applicable):
oVirt Engine 3.5
http://resources.ovirt.org/pub/ovirt-3.5-pre/iso/ovirt-node-iso-3.5.0.ovirt35.20140912.el6.iso

How reproducible:
Always.

Steps to Reproduce:
1. Install Engine
2. Install Node from ovirt-node-iso-3.5.0.ovirt35.20140912.el6.iso, or install vdsm packages on CentOS 6.6.
3. Register node from node TUI or via engine.

Actual results:
During node registration or a vdsm service restart, all devices are marked as failed (no I/O possible) and the node logging volume becomes read-only. After a subsequent node reboot and vdsm service start, all devices become read-only.

Expected results:


Additional info:
With the node ISO ovirt-node-iso-3.5.0.ovirt35.20140707.el6.iso I do not have this issue.
Attached logs from failed node.

Comment 1 Doron Fediuck 2014-11-11 09:42:52 UTC
This could be related to iptables settings being changed when you
add the host / register it.

Can you verify if there were any configuration changes in iptables
before and after?

Comment 2 Fabian Deutsch 2014-11-12 18:18:13 UTC
It could be related to bug 1149655.
That bug is about registering a 3.4 host with a 3.5 engine, but it might apply here because the Node used is very old and might not have the relevant jsonrpc patches.

Comment 3 Dan Kenigsberg 2014-11-12 21:00:17 UTC
The fact that the host loses all connectivity makes me think that you are experiencing bug 1144639, and not the one suggested by Fabian.

Your attached logs state that you are running vdsm 4.16.4-0.el6, which predates both the ovirt-3.5.0 release and the resolution of that bug.

Could you retry installation using a post-3.5.0 release (vdsm >= 4.16.7)?
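
For anyone checking a host against the version threshold mentioned above, here is a minimal Python sketch of the comparison. The function names are hypothetical and not part of vdsm; it only assumes the version appears as three dot-separated numbers somewhere in the package string, as in the strings quoted in this bug.

```python
import re


def parse_vdsm_version(package_string):
    """Extract (major, minor, patch) from a string such as 'vdsm-4.16.7-1.el6'.

    Hypothetical helper: takes the first 'N.N.N' group found in the string.
    """
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", package_string)
    if match is None:
        raise ValueError("no version found in %r" % package_string)
    return tuple(int(part) for part in match.groups())


def is_at_least(package_string, minimum=(4, 16, 7)):
    """Return True if the parsed version is >= minimum (tuple comparison)."""
    return parse_vdsm_version(package_string) >= minimum


if __name__ == "__main__":
    for candidate in ("4.16.4-0.el6", "vdsm-4.16.7-1.gitdb83943.el6.src.rpm"):
        print(candidate, is_at_least(candidate))
```

Comparing integer tuples avoids the pitfall of comparing version strings lexically, where "4.16.10" would sort before "4.16.7".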

Comment 4 Raul Laansoo 2014-11-12 22:34:04 UTC
I have attached logs from the latest install on CentOS 6.6,
vdsm-4.16.7-1.gitdb83943.el6.src.rpm

Comment 5 Raul Laansoo 2014-11-12 22:34:33 UTC
Created attachment 956884
node logs

Comment 6 Raul Laansoo 2014-11-14 08:15:02 UTC
The symptoms are exactly the same when I run echo "1" > /sys/class/fc_host/host/issue_lip on the node host. Does vdsm rescan storage interconnects when starting?
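
To see which Fibre Channel hosts expose an issue_lip control file without triggering a LIP, a read-only listing such as the following Python sketch may help. The function name and the host* glob pattern are assumptions made for illustration; the sketch never writes to the files it finds.

```python
import glob
import os


def find_issue_lip_paths(sysfs_root="/sys/class/fc_host"):
    """Return the sorted issue_lip control files found under sysfs_root.

    Read-only: this only lists paths and does not issue a LIP.
    The 'host*' pattern is an assumption about the sysfs layout.
    """
    pattern = os.path.join(sysfs_root, "host*", "issue_lip")
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    for path in find_issue_lip_paths():
        print(path)
```

On a machine without Fibre Channel HBAs the function simply returns an empty list.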

Comment 7 Dan Kenigsberg 2014-11-17 15:35:45 UTC
Raul, supervdsm.log confirms your suggestion:

MainProcess|storageRefresh::DEBUG::2014-11-13 00:20:37,752::supervdsmServer::101::SuperVdsm.ServerCallback::(wrapper) call hbaRescan with () {}
MainProcess|storageRefresh::INFO::2014-11-13 00:20:37,752::hba::54::Storage.HBA::(rescan) Rescanning HBAs
MainProcess|storageRefresh::DEBUG::2014-11-13 00:20:37,753::hba::56::Storage.HBA::(rescan) Issuing lip /sys/class/fc_host/host0/issue_lip
MainProcess|storageRefresh::DEBUG::2014-11-13 00:20:38,061::hba::56::Storage.HBA::(rescan) Issuing lip /sys/class/fc_host/host1/issue_lip
MainProcess|storageRefresh::DEBUG::2014-11-13 00:20:38,408::supervdsmServer::108::SuperVdsm.ServerCallback::(wrapper) return hbaRescan with None
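
To check another host's supervdsm.log for the same behaviour, a small Python sketch like the one below can pull out the LIP events. The regular expression is an assumption based only on the five log lines quoted above, and the function name is hypothetical; other vdsm versions may format these lines differently.

```python
import re

# Matches lines like:
#   ...::DEBUG::2014-11-13 00:20:37,753::hba::56::Storage.HBA::(rescan) Issuing lip /sys/...
# The pattern is inferred from the log excerpt in this bug, not from vdsm itself.
LIP_PATTERN = re.compile(
    r"::[A-Z]+::(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::"
    r".*Issuing lip (?P<path>\S+)"
)


def find_lip_events(lines):
    """Return a list of (timestamp, sysfs_path) for each 'Issuing lip' line."""
    events = []
    for line in lines:
        match = LIP_PATTERN.search(line)
        if match:
            events.append((match.group("timestamp"), match.group("path")))
    return events


if __name__ == "__main__":
    import sys

    # Usage: python find_lip.py /path/to/supervdsm.log
    with open(sys.argv[1]) as log_file:
        for timestamp, path in find_lip_events(log_file):
            print(timestamp, path)
```

Any output means the HBA rescan issued a LIP on that host at the listed times.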

The LIP has been disabled by default in http://gerrit.ovirt.org/#/c/34215/ which will be part of ovirt-3.5.1. I'd appreciate it if you could verify that this is indeed your issue by applying the patch.

Comment 8 Raul Laansoo 2014-11-17 21:43:24 UTC
I can verify that this patch solves the issue I reported.

Comment 9 Nir Soffer 2014-11-18 06:03:18 UTC

*** This bug has been marked as a duplicate of bug 1152587 ***