Description of problem:
When trying to add a host to an engine running vdsm version 4.13.0-x, the host add fails.

Version-Release number of selected component (if applicable):

How reproducible:
always

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
The proposed fix seems to be working - http://gerrit.ovirt.org/19913 - but it might hide a bigger problem below.
Created attachment 809178 [details]
engine.log

The error can be found on 2013-10-06, around 18:30 - 18:42.
Let's handle the current issue and open a bug for other issues.
(In reply to Yair Zaslavsky from comment #3)
> Let's handle the current issue and open a bug for other issues.

This bug was opened in order to track the "other issues", which were supposedly triggered by http://gerrit.ovirt.org/#/c/17719/ (backend: Fixing log print when vdsm version is not supported in cluster).

Engine must honor vdsm's supportedENGINEs list. If Engine sees its own version there, it must accept this vdsm's version.
Even if the Vdsm version is not one of the SupportedVDSMVersions? Why so? And why have both?
We have Engine-3.0.0 deployed in the field for eons. We are about to deploy vdsm-4.13.0. Vdsm's supportedENGINEs is the only way to tell that old Engine that it should accept the new version. We should maintain this future-compatibility feature and add a unit test so it does not break again.
So why keep SupportedVDSMVersions? If we verify that the cluster is supported and that the engine's version is supported by the host's vdsm process, what is SupportedVDSMVersions supposed to mean?
It's useful for the opposite use case, where a newly-deployed Engine meets an ancient Vdsm.
But in such a case the newly-deployed engine's version won't be in vdsm's supportedENGINEs either, so you still don't use SupportedVDSMVersions. It might be useful for avoiding old versions, though.
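The two complementary checks discussed above can be sketched as follows. This is an illustrative model, not the actual engine code: the function name and parameters are hypothetical; only the two lists (vdsm's supportedENGINEs and the engine's SupportedVDSMVersions) come from the discussion.

```python
def is_host_version_accepted(engine_version, supported_vdsm_versions,
                             vdsm_version, vdsm_supported_engines):
    """Hypothetical sketch: a host is accepted if either side vouches
    for the other.

    - Future compatibility (old engine, new vdsm): the engine's own
      version appears in the host's supportedENGINEs list.
    - Past compatibility (new engine, old vdsm): the host's vdsm
      version appears in the engine's SupportedVDSMVersions option.
    """
    if engine_version in vdsm_supported_engines:
        return True
    if vdsm_version in supported_vdsm_versions:
        return True
    return False

# An old 3.0 engine meeting a newer vdsm-4.13 that advertises 3.0
# in supportedENGINEs should accept it, even though 4.13 is absent
# from that engine's SupportedVDSMVersions:
print(is_host_version_accepted("3.0", ["4.5", "4.9"],
                               "4.13", ["3.0", "3.1", "3.2", "3.3"]))
```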
Closing - RHEV 3.3 Released
Please verify in the 3.4.0 bug with a 3.5 host, and in the 3.3.z bug with both 3.4 and 3.5 hosts.
Created attachment 897120 [details]
engine-log

Failed QA.

Couldn't add a host with vdsm-4.13.2-0.16.el6ev.x86_64 to an engine with rhevm-3.4.0-0.20.el6ev.noarch.

Error msg:
Host rose04 is compatible with versions (3.0,3.1,3.2,3.3) and cannot join Cluster m3 which is set to version 3.4.
(In reply to Tareq Alayan from comment #16)
> Created attachment 897120 [details]
> engine-log
>
> failed qa.
>
> couldn't add host with vdsm-4.13.2-0.16.el6ev.x86_64 to engine with
> rhevm-3.4.0-0.20.el6ev.noarch
>
> error msg:
> Host rose04 is compatible with versions (3.0,3.1,3.2,3.3) and cannot join
> Cluster m3 which is set to version 3.4.

The use-case of this bug is adding a new host to an old cluster. Here you're adding a host that doesn't support cluster level 3.4 to such a cluster. As written in comment #13, you need to verify adding a new (3.5) host to an old (3.4) engine, and not vice versa. Moving to ON_QA.
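The rejection quoted above is a cluster-level check, distinct from the version-list handshake: the host reports which cluster compatibility levels it supports, and the engine refuses to place it in a cluster set to a higher level. A minimal illustrative sketch (names are hypothetical, not the real engine code):

```python
def can_join_cluster(host_cluster_levels, cluster_compat_version):
    """Hypothetical sketch of the cluster-level check behind the
    'cannot join Cluster' error: the host may join only if the
    cluster's compatibility version is among the levels it reports
    (vdsm's clusterLevels capability)."""
    return cluster_compat_version in host_cluster_levels

# vdsm-4.13 reports cluster levels 3.0-3.3, so a 3.4 cluster rejects it,
# while a 3.3 cluster accepts it:
levels_4_13 = ["3.0", "3.1", "3.2", "3.3"]
print(can_join_cluster(levels_4_13, "3.4"))  # False
print(can_join_cluster(levels_4_13, "3.3"))  # True
```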
I added vdsm-4.13.2-0.16.el6ev.x86_64 to rhevm-3.3.3-0.51.el6ev.noarch, in a cluster with compatibility version 3.2.
I added vdsm-4.13.2-0.16.el6ev.x86_64 to rhevm-3.4.0-0.20.el6ev.noarch, in a cluster with 3.3 compatibility mode.

Verified.
Please review comment 13. What you checked in comment 18 does not exercise the mechanism for a VDSM which does *not* appear in SupportedVDSMVersions. Please detail the vdsm version and the SupportedVDSMVersions value in your verification.
case1: 3.4 engine with 3.5 vdsm
================================
vdsm-4.14.1-339.gitedb07b8.el6.x86_64
rhevm-3.4.0-0.20.el6ev.noarch
cluster compatibility version 3.3 - host is UP
cluster compatibility version 3.4 - host is UP

select option_name,option_value,version from vdc_options where option_name='SupportedVDSMVersions';
      option_name      |         option_value         | version
-----------------------+------------------------------+---------
 SupportedVDSMVersions | 4.9,4.10,4.11,4.12,4.13,4.14 | general

case2: 3.4 engine with vdsm-4.13 (3.3)
=======================================
vdsm-4.13.2-0.16.el6ev.x86_64
rhevm-3.4.0-0.20.el6ev.noarch
cluster compatibility version 3.3 - host is UP
cluster compatibility version 3.4 - host is non-operational

select option_name,option_value,version from vdc_options where option_name='SupportedVDSMVersions';
      option_name      |         option_value         | version
-----------------------+------------------------------+---------
 SupportedVDSMVersions | 4.9,4.10,4.11,4.12,4.13,4.14 | general

case3: 3.3.3 engine with 3.5 vdsm
===================================
vdsm-4.14.1-339.gitedb07b8.el6.x86_64
rhevm-3.3.3-0.51.el6ev.noarch
cluster compatibility version 3.2 - host is non-operational
cluster compatibility version 3.3 - host is non-operational

select option_name,option_value,version from vdc_options where option_name='SupportedVDSMVersions';
      option_name      |      option_value       | version
-----------------------+-------------------------+---------
 SupportedVDSMVersions | 4.9,4.10,4.11,4.12,4.13 | general

case4: 3.3 engine with vdsm-4.13 (3.3)
======================================
vdsm-4.13.2-0.16.el6ev.x86_64
rhevm-3.3.3-0.51.el6ev.noarch
cluster compatibility version 3.2 - host is UP
cluster compatibility version 3.3 - host is UP

select option_name,option_value,version from vdc_options where option_name='SupportedVDSMVersions';
      option_name      |      option_value       | version
-----------------------+-------------------------+---------
 SupportedVDSMVersions | 4.9,4.10,4.11,4.12,4.13 | general
Is this OK?
Additional info:
===============
When the host is connected to a 3.4 engine and vdsm is 4.13.2:
vdsClient -s 0 getVdsCaps
supportedENGINEs = ['3.0', '3.1', '3.2', '3.3']

When the host is connected to a 3.4 engine and vdsm is vdsm-4.14.1-339.gitedb07b8.el6.x86_64:
supportedENGINEs = ['3.0', '3.1', '3.2', '3.3', '3.4']
Thanks, but there is still something suspicious about the non-operational state. A host with vdsm >= 4.13 is compatible with cluster levels 3.0-3.3, so the reason for the non-operational state is not clear to me (referring to cases 2 and 3). What does the audit log say about those hosts?
Host rose04 is compatible with versions (3.0,3.1,3.2,3.3) and cannot join Cluster m3
(In reply to Tareq Alayan from comment #23)
> Host rose04 is compatible with versions (3.0,3.1,3.2,3.3) and cannot join
> Cluster m3

I guess that's the error in this case:

case2: 3.4 engine with vdsm-4.13 (3.3)
=======================================
vdsm-4.13.2-0.16.el6ev.x86_64
rhevm-3.4.0-0.20.el6ev.noarch
cluster compatibility version 3.3 - host is UP
cluster compatibility version 3.4 - host is non-operational

Which is okay. But what's the error in the two cases below?

case3: 3.3.3 engine with 3.5 vdsm
===================================
vdsm-4.14.1-339.gitedb07b8.el6.x86_64
rhevm-3.3.3-0.51.el6ev.noarch
cluster compatibility version 3.2 - host is non-operational
cluster compatibility version 3.3 - host is non-operational
Retest with updated vdsm:
=========================
rhevm-3.3.3-0.52.el6ev.noarch
vdsm-4.15.0-22.gitdcd07f4.el6.x86_64
cluster compatibility versions 3.3 and 3.2
host state = up

select option_name,option_value,version from vdc_options where option_name='SupportedVDSMVersions';
      option_name      |      option_value       | version
-----------------------+-------------------------+---------
 SupportedVDSMVersions | 4.9,4.10,4.11,4.12,4.13 | general

vdsClient -s 0 getVdsCaps
	HBAInventory = {'FC': [], 'iSCSI': [{'InitiatorName': 'iqn.1994-05.com.redhat:3865ad2788b0'}]}
	ISCSIInitiatorName = 'iqn.1994-05.com.redhat:3865ad2788b0'
	autoNumaBalancing = 2
	bondings = {'bond0': {'addr': '', 'cfg': {}, 'hwaddr': '00:00:00:00:00:00', 'ipv6addrs': [], 'mtu': '1500', 'netmask': '', 'slaves': []}, 'bond1': {'addr': '', 'cfg': {}, 'hwaddr': '00:00:00:00:00:00', 'ipv6addrs': [], 'mtu': '1500', 'netmask': '', 'slaves': []}, 'bond2': {'addr': '', 'cfg': {}, 'hwaddr': '00:00:00:00:00:00', 'ipv6addrs': [], 'mtu': '1500', 'netmask': '', 'slaves': []}, 'bond3': {'addr': '', 'cfg': {}, 'hwaddr': '00:00:00:00:00:00', 'ipv6addrs': [], 'mtu': '1500', 'netmask': '', 'slaves': []}, 'bond4': {'addr': '', 'cfg': {}, 'hwaddr': '00:00:00:00:00:00', 'ipv6addrs': [], 'mtu': '1500', 'netmask': '', 'slaves': []}}
	bridges = {'rhevm': {'addr': '10.35.97.27', 'cfg': {'BOOTPROTO': 'dhcp', 'DEFROUTE': 'yes', 'DELAY': '0', 'DEVICE': 'rhevm', 'MTU': '1500', 'NM_CONTROLLED': 'no', 'ONBOOT': 'no', 'STP': 'off', 'TYPE': 'Bridge'}, 'gateway': '10.35.97.254', 'ipv6addrs': ['fe80::d6ae:52ff:fec6:1a0e/64'], 'ipv6gateway': '::', 'mtu': '1500', 'netmask': '255.255.255.0', 'opts': {'ageing_time': '29995', 'bridge_id': '8000.d4ae52c61a0e', 'forward_delay': '0', 'gc_timer': '36', 'group_addr': '1:80:c2:0:0:0', 'hash_elasticity': '4', 'hash_max': '512', 'hello_time': '199', 'hello_timer': '36', 'max_age': '1999', 'multicast_last_member_count': '2', 'multicast_last_member_interval': '99', 'multicast_membership_interval': '25996', 'multicast_querier': '0', 'multicast_querier_interval': '25496', 'multicast_query_interval': '12498', 'multicast_query_response_interval': '999', 'multicast_router': '1', 'multicast_snooping': '1', 'multicast_startup_query_count': '2', 'multicast_startup_query_interval': '3124', 'priority': '32768', 'root_id': '8000.d4ae52c61a0e', 'root_path_cost': '0', 'root_port': '0', 'stp_state': '0', 'tcn_timer': '0', 'topology_change': '0', 'topology_change_detected': '0', 'topology_change_timer': '0'}, 'ports': ['em1'], 'stp': 'off'}}
	clusterLevels = ['3.0', '3.1', '3.2', '3.3', '3.4']
	cpuCores = '4'
	cpuFlags = 'fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,sep,mtrr,pge,mca,cmov,pat,pse36,clflush,dts,acpi,mmx,fxsr,sse,sse2,ss,ht,tm,pbe,syscall,nx,rdtscp,lm,constant_tsc,arch_perfmon,pebs,bts,rep_good,xtopology,nonstop_tsc,aperfmperf,pni,pclmulqdq,dtes64,monitor,ds_cpl,vmx,smx,est,tm2,ssse3,cx16,xtpr,pdcm,pcid,sse4_1,sse4_2,x2apic,popcnt,tsc_deadline_timer,aes,xsave,avx,f16c,rdrand,lahf_lm,ida,arat,epb,xsaveopt,pln,pts,dts,tpr_shadow,vnmi,flexpriority,ept,vpid,fsgsbase,smep,erms,model_Nehalem,model_Conroe,model_coreduo,model_core2duo,model_Penryn,model_Westmere,model_n270,model_SandyBridge'
	cpuModel = 'Intel(R) Xeon(R) CPU E3-1280 V2 @ 3.60GHz'
	cpuSockets = '1'
	cpuSpeed = '3601.000'
	cpuThreads = '8'
	emulatedMachines = ['rhel6.5.0', 'pc', 'rhel6.4.0', 'rhel6.3.0', 'rhel6.2.0', 'rhel6.1.0', 'rhel6.0.0', 'rhel5.5.0', 'rhel5.4.4', 'rhel5.4.0']
	guestOverhead = '65'
	hooks = {'after_disk_hotplug': {'aaa.jpg': {'md5': 'd41d8cd98f00b204e9800998ecf8427e'}}, 'after_nic_hotplug': {'aaa.jpg': {'md5': 'd41d8cd98f00b204e9800998ecf8427e'}}}
	kvmEnabled = 'true'
	lastClient = '10.35.97.27'
	lastClientIface = 'rhevm'
	management_ip = '0.0.0.0'
	memSize = '15921'
	netConfigDirty = 'False'
	networks = {'rhevm': {'addr': '10.35.97.27', 'bootproto4': 'dhcp', 'bridged': True, 'cfg': {'BOOTPROTO': 'dhcp', 'DEFROUTE': 'yes', 'DELAY': '0', 'DEVICE': 'rhevm', 'MTU': '1500', 'NM_CONTROLLED': 'no', 'ONBOOT': 'no', 'STP': 'off', 'TYPE': 'Bridge'}, 'gateway': '10.35.97.254', 'iface': 'rhevm', 'ipv6addrs': ['fe80::d6ae:52ff:fec6:1a0e/64'], 'ipv6gateway': '::', 'mtu': '1500', 'netmask': '255.255.255.0', 'ports': ['em1'], 'stp': 'off'}}
	nics = {'em1': {'addr': '', 'cfg': {'BRIDGE': 'rhevm', 'DEVICE': 'em1', 'HWADDR': 'd4:ae:52:c6:1a:0e', 'MTU': '1500', 'NM_CONTROLLED': 'no', 'ONBOOT': 'no'}, 'hwaddr': 'd4:ae:52:c6:1a:0e', 'ipv6addrs': [], 'mtu': '1500', 'netmask': '', 'speed': 1000}, 'em2': {'addr': '', 'cfg': {'BOOTPROTO': 'dhcp', 'DEVICE': 'em2', 'HWADDR': 'D4:AE:52:C6:1A:0F', 'NM_CONTROLLED': 'yes', 'ONBOOT': 'no', 'TYPE': 'Ethernet', 'UUID': '7053e28c-50c7-4dbc-8e4d-b7ee85072871'}, 'hwaddr': 'd4:ae:52:c6:1a:0f', 'ipv6addrs': [], 'mtu': '1500', 'netmask': '', 'speed': 0}}
	numaNodeDistance = {'0': [10]}
	numaNodes = {'0': {'cpus': [0, 1, 2, 3, 4, 5, 6, 7], 'totalMemory': '15921'}}
	operatingSystem = {'name': 'RHEL', 'release': '6.5.0.1.el6', 'version': '6Server'}
	packages2 = {'kernel': {'buildtime': 1393846365.0, 'release': '431.11.2.el6.x86_64', 'version': '2.6.32'}, 'libvirt': {'buildtime': 1395830257, 'release': '29.el6_5.7', 'version': '0.10.2'}, 'mom': {'buildtime': 1391960594, 'release': '0.0.master.20140209.gitd79b9d6.el6', 'version': '0.4.0'}, 'qemu-img': {'buildtime': 1398757031, 'release': '2.415.el6_5.9', 'version': '0.12.1.2'}, 'qemu-kvm': {'buildtime': 1398757031, 'release': '2.415.el6_5.9', 'version': '0.12.1.2'}, 'spice-server': {'buildtime': 1385990636, 'release': '6.el6_5.1', 'version': '0.12.4'}, 'vdsm': {'buildtime': 1400688219, 'release': '22.gitdcd07f4.el6', 'version': '4.15.0'}}
	reservedMem = '321'
	rngSources = ['random']
	selinux = {'mode': '0'}
	software_revision = '22'
	software_version = '4.15'
	supportedENGINEs = ['3.0', '3.1', '3.2', '3.3', '3.4']
	supportedProtocols = ['2.2', '2.3']
	uuid = '4C4C4544-0058-5610-8056-B6C04F4D5731'
	version_name = 'Snow Man'
	vlans = {}
	vmTypes = ['kvm']
Great. Thanks for re-testing it. Please move it to VERIFIED.
Closing as part of 3.4.0
Hi Guys,

I have a query. I have RHS 3.0.3 running vdsm-4.14.7.3-1.el6rhs.x86_64. This node couldn't be added to RHEV 3.3 in 3.3 cluster compatibility. Is this expected?
Hi Satheesaran, vdsm-4.14 is compatible with 3.4 compatibility mode and above.