## Description of problem:

An invalid timezone setting on a single VM can cause a cluster compatibility upgrade to fail. The logs do not clearly indicate which VM is the problem.

## Version-Release number of selected component (if applicable):

rhevm-4.0.4.4-0.1.el7ev.noarch

## How reproducible:

always

## Steps to Reproduce:

1. Create a 3.6 DC/cluster and some VMs.
2. Change one of the VMs' timezone to '' (vm_static, time_zone):

```
engine=# select vm_name, vm_guid, os, time_zone from vm_static where cluster_id = 'f0f30779-6e8b-46e8-8689-9fd46cea220b' order by vm_name;
  vm_name   |               vm_guid                | os |     time_zone
------------+--------------------------------------+----+-------------------
 linux-test | 0dbf06c6-2734-4d72-84ee-30d7dc230c56 |  5 | Etc/GMT
 rhel6-test | 5d5fe9fe-3e70-4a22-9cdc-d6c8c8f9694f | 19 | Etc/GMT
 rhel7-test | 12c43e79-648a-4c7f-a0a7-54a7a6be9e7f | 24 |
 win-test   | b57eb0c7-f933-4a23-aff8-1ed6424ff0ed | 25 | GMT Standard Time
```

3. Attempt to edit the cluster and raise its compatibility version.

## Actual results:

From the GUI, the action fails with the error: "Error while executing action Edit Cluster properties: Internal Engine Error"

## Expected results:

The GUI (or logs) should report specifically which VM is in error.

## Additional info:

I don't have a reproducer for creating a VM with an invalid timezone, and I am not sure how the customer managed to achieve it, but we spent several hours messing around with the wrong VMs while trying to isolate the problem. In larger environments (with a mix of Linux and other OSes), it may be difficult to see which VMs are 'invalid'.
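For triage, the faulty row in the output above can be spotted mechanically. A minimal Python sketch, using the vm_static rows copied from the query output; the empty-string check is a simplification of what the engine actually validates (it checks the timezone against a per-OS-type list):

```python
# Triage sketch: find VMs whose time_zone column is empty/blank.
# Rows copied from the psql output above; this is NOT the engine's
# validation logic, just a quick way to narrow down suspects.
rows = [
    # (vm_name, os, time_zone)
    ("linux-test", 5,  "Etc/GMT"),
    ("rhel6-test", 19, "Etc/GMT"),
    ("rhel7-test", 24, ""),                  # empty -> invalid
    ("win-test",   25, "GMT Standard Time"),
]

suspects = [name for name, _os, tz in rows if not tz.strip()]
print(suspects)  # -> ['rhel7-test']
```

In a real environment the same idea can be run as a single SQL query against vm_static, filtering on empty time_zone values, instead of eyeballing hundreds of rows.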
So far I have had a look at this feature in the nightly build. There is some more information now, but I am not sure it is enough to satisfy the request for pinpointing the problematic VM. For example, after forcing an invalid string into the DB for a VM's time_zone and trying to upgrade the cluster, I get:

```
Error while executing action: Cannot edit Cluster. Invalid time zone for given OS type. Attribute: vmStatic
```

While this does add the real reason to the message, if I had 200 VMs in my environment I would have a hard time figuring out which one caused the problem. I can work that out from this line in engine.log, but I don't think it is enough; I am open to a discussion about it:

```
2016-11-22 14:10:27,854 INFO [org.ovirt.engine.core.bll.UpdateClusterCommand] (default task-12) [28807589] Lock freed to object 'EngineLock:{exclusiveLocks='null', sharedLocks='[{faulty_vm's_id}=<VM, ACTION_TYPE_FAILED_CLUSTER_IS_BEING_UPDATED$clusterName another-clust>]'}'
```

As this RFE doesn't cover many cases, I am adding one case to check that an error during cluster upgrade does not produce an "Internal Error" message. Please review this case once I upload the Polarion link here, and tell me if you think I need to add more cases.
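For what it's worth, the VM guid can be scraped out of that "Lock freed" line. A hypothetical Python sketch; the guid in the sample line below is the rhel7-test vm_guid from the description, standing in for the redacted id in the log excerpt above:

```python
import re

# Hypothetical helper: pull shared-lock VM ids out of an engine.log
# "Lock freed" line. The guid here is the rhel7-test vm_guid from the
# description, used as a stand-in for the redacted value.
line = ("Lock freed to object 'EngineLock:{exclusiveLocks='null', "
        "sharedLocks='[{12c43e79-648a-4c7f-a0a7-54a7a6be9e7f}=<VM, "
        "ACTION_TYPE_FAILED_CLUSTER_IS_BEING_UPDATED$clusterName "
        "another-clust>]'}'")

# Match a {uuid} immediately followed by "=<VM" (the shared VM lock).
vm_ids = re.findall(
    r"\{([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})\}=<VM", line)
print(vm_ids)  # -> ['12c43e79-648a-4c7f-a0a7-54a7a6be9e7f']
```

The extracted guid can then be looked up in vm_static (or via the REST API) to get the VM name, which is roughly the manual process we had to go through.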
Verifying based on my comment #2 and the attached test case. I am opening a new RFE, https://bugzilla.redhat.com/show_bug.cgi?id=1418641, for more specific information in the logging, as I understand that my request from comment #2 will not be easily implemented within the scope of this RFE.