Created attachment 612713 [details]
ovirt-engine logfile

Description of problem:
Attempting to activate or add a host with InfiniBand cards present fails. I am filing this bug against ovirt-engine-core, but it may involve VDSM as well. The length of the "hwaddr" value returned for the InfiniBand card seems to cause the engine DB SQL insert statement to fail:
(ERROR: value too long for type character varying(20))

Version-Release number of selected component (if applicable):
oVirt Engine 3.1+
VDSM 4.10+

How reproducible:
Always

Steps to Reproduce:
1. Install an InfiniBand card in an EL6 or Fedora based oVirt VM host server
2. Install and start the rdma service
3. Attempt to add the host via vdsm-bootstrap
OR
3.a Add the host with vdsm-reg, then attempt to approve/activate the host

Actual results:
Host activation/add fails.

Expected results:
Host activation/add succeeds.

Additional info:
Disabling the rdma service (so that devices ib0, ib1, etc. go away) on the affected system restores normal function. This bug prevents the use of InfiniBand networking (IPoIB) or NFS-RDMA for storage communications.
Created attachment 612714 [details] vdsm log file
To fix/work around this, I ended up stopping the engine service, dumping the database, and altering the table vds_interface --> column "mac_addr", increasing the character varying length from 20 to 60. I then restored the altered database and went about business as usual. The only side effect I have noted is that VDSM does not seem to be able to read/report TX/RX stats properly. Also, the MAC Address fields in the admin/user portals seem to be of fixed size, and the IB card HW address overflows the field. - DHC
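For context on the numbers in the error and the workaround: an IPoIB hardware address is 20 bytes long (RFC 4391), compared with 6 bytes for Ethernet. The sketch below only counts characters in the usual colon-separated hex form. `formatted_length` and `fits` are illustrative helpers, not engine or VDSM code.

```python
def formatted_length(num_bytes: int) -> int:
    """Length of a colon-separated hex address such as 'aa:bb:cc'."""
    # Two hex digits per byte, plus one colon between adjacent bytes.
    return num_bytes * 2 + (num_bytes - 1)


def fits(num_bytes: int, column_width: int) -> bool:
    """True if an address of num_bytes fits in a varchar(column_width)."""
    return formatted_length(num_bytes) <= column_width


ETHERNET_BYTES = 6
IPOIB_BYTES = 20  # per RFC 4391

for width in (20, 60):
    print(width, fits(ETHERNET_BYTES, width), fits(IPOIB_BYTES, width))
```

A 20-byte address needs 59 characters in this form, which overflows a 20-character column but fits in the 60 characters used in the workaround.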
Hi, There is an open bug: https://bugzilla.redhat.com/show_bug.cgi?id=823397 Itzik
*** This bug has been marked as a duplicate of bug 823397 ***
I just built and tested ovirt-engine from master commit: http://gerrit.ovirt.org/gitweb?p=ovirt-engine.git;a=commit;h=8c3f5e5ba95ca46009b70143daa5aae4513943a5 I can confirm that this resolves the issue. The only remaining things to note, which are minor at best, are that the MAC field in the UI does not expand to show the full HCA IB HW address, and that interface statistics for the IB boards do not seem to be displayed. - DHC
DHC, does the kernel update /sys/class/net/ib0/statistics/rx_bytes ? Does vdsm report your IB board in vdsClient 0 getVdsStats ? What are the values reported there?
Created attachment 624202 [details] vdsClient 0 getVdsStats output Attached output for vdsClient 0 getVdsStats
I have now tested with newer versions of Firefox on Windows and Fedora/EL, and the hover-over on the IB address field does display the whole address. IE9 also seems to work, but IE8 for whatever reason does not always work (very random).
Thanks.

'ib0': {'macAddr': '', 'name': 'ib0', 'txDropped': '12', 'rxErrors': '0', 'txRate': '0.1', 'rxRate': '0.1', 'txErrors': '0', 'state': 'up', 'speed': '1000', 'rxDropped': '0'}

Could it be that 0.1 * 1000 Mbps is the true value? Could you push some more traffic via ib0? How does /sys/class/net/ib0/statistics/rx_bytes grow?
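The quoted values are all strings, and the thread does not state which unit rxRate uses. The sketch below parses the quoted entry and shows what it would mean under two hypothetical readings (a fraction of the link speed, or a percentage of it). Neither reading is confirmed here as what VDSM actually reports.

```python
import ast

# The ib0 entry as quoted in the comment above; every value is a string.
quoted = (
    "{'macAddr': '', 'name': 'ib0', 'txDropped': '12', 'rxErrors': '0', "
    "'txRate': '0.1', 'rxRate': '0.1', 'txErrors': '0', 'state': 'up', "
    "'speed': '1000', 'rxDropped': '0'}"
)

stats = ast.literal_eval(quoted)
speed_mbps = float(stats["speed"])
rx_rate = float(stats["rxRate"])

# Reading 1 (hypothetical): rxRate is a fraction of the link speed.
as_fraction_mbps = rx_rate * speed_mbps
# Reading 2 (hypothetical): rxRate is a percentage of the link speed.
as_percent_mbps = rx_rate * speed_mbps / 100

print(f"fraction reading: {as_fraction_mbps:g} Mbps")
print(f"percent reading:  {as_percent_mbps:g} Mbps")
```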
The max data rate on the IB board per port in that server is 10GB so 1000 MB/s, since the board is actually an older DDR 4x board. On that system the datastores are mounted via NFS via ib0.

/sys/class/net/ib0/statistics/rx_bytes looks like:

[root@kezan ~]# cat /sys/class/net/ib0/statistics/rx_bytes
212110936952
Would you stress your link and run

# cat /sys/class/net/ib0/statistics/rx_bytes; sleep 10; cat /sys/class/net/ib0/statistics/rx_bytes; vdsClient 0 getVdsStats

This would give us a better comparison of what Vdsm reads and what it reports.
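The same two-sample measurement can be sketched in Python, which also does the subtraction and unit conversion. This is a minimal sketch and not VDSM's own sampling code. The function names are illustrative, and it assumes the counter does not wrap between the two reads.

```python
import time
from pathlib import Path


def read_rx_bytes(iface: str, sysfs_root: str = "/sys/class/net") -> int:
    """Read the kernel's cumulative received-byte counter for an interface."""
    path = Path(sysfs_root) / iface / "statistics" / "rx_bytes"
    return int(path.read_text().strip())


def rate_mbps(bytes_before: int, bytes_after: int, seconds: float) -> float:
    """Average receive rate in megabits per second between two samples."""
    return (bytes_after - bytes_before) * 8 / seconds / 1_000_000


def sample_rate(iface: str, seconds: float = 10.0) -> float:
    """Take two counter readings `seconds` apart and return the rate in Mbps."""
    before = read_rx_bytes(iface)
    time.sleep(seconds)
    after = read_rx_bytes(iface)
    return rate_mbps(before, after, seconds)
```

Calling `sample_rate("ib0")` on the host while the link is under load gives a figure that can be set next to the getVdsStats output taken at the same time.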
Created attachment 626036 [details] ib0 rx_bytes stats 5 second samples ib0 rx_bytes stats sampled every 5 seconds
Created attachment 626037 [details] vdsClient 0 getVdsStats output 5 second samples vdsClient 0 getVdsStats output sampled every 5 seconds
DHC, are you sure there is a problem? I see that in 110 seconds, 226690 bytes have been received on ib0. That's 1.8 mbs, which is in the neighborhood of what vdsm reports. Am I missing something?
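As a quick check of the arithmetic, taking the two figures literally as quoted (the attached samples are not reproduced here), the conversion can be sketched as follows. The variable names are illustrative.

```python
received_bytes = 226690  # byte count quoted in the comment above
elapsed_seconds = 110    # sampling window quoted in the comment above

# Total volume in megabits, then the average rate over the window.
total_megabits = received_bytes * 8 / 1_000_000
average_mbps = total_megabits / elapsed_seconds

print(f"total:   {total_megabits:.2f} Mbit")
print(f"average: {average_mbps:.4f} Mbps")
```

Read this way, roughly 1.8 is the total in megabits over the whole window, while the per-second average is well below 1 Mbps.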
Dan, I too note that vdsm is reporting correctly; however, what I see in the UI is:

Name  Address      MAC                Speed(Mbps)  Rx (Mbps)  Tx (Mbps)  Drops(Pkts)
ib0   192.168.1.1  80:00:04:04:fe...  0            < 1        < 1        21

Both RX/TX never seem to report anything but "< 1", even when I am pushing a large amount of data through the link. - DHC