Description of problem:
Had 2 storage domains in the DC: 1 Gluster as master and another data domain. I deactivated the Gluster domain and the operation ended with a failure in vdsm.

Version-Release number of selected component (if applicable):
vdsm-cli-4.17.0-1201.git7ba0684.el7.noarch
vdsm-yajsonrpc-4.17.0-1201.git7ba0684.el7.noarch
vdsm-infra-4.17.0-1201.git7ba0684.el7.noarch
vdsm-4.17.0-1201.git7ba0684.el7.noarch
vdsm-xmlrpc-4.17.0-1201.git7ba0684.el7.noarch
vdsm-jsonrpc-4.17.0-1201.git7ba0684.el7.noarch
vdsm-python-4.17.0-1201.git7ba0684.el7.noarch
glusterfs-libs-3.7.2-3.el7.x86_64
glusterfs-api-3.7.2-3.el7.x86_64
glusterfs-server-3.7.2-3.el7.x86_64
glusterfs-client-xlators-3.7.2-3.el7.x86_64
glusterfs-fuse-3.7.2-3.el7.x86_64
glusterfs-cli-3.7.2-3.el7.x86_64
glusterfs-geo-replication-3.7.2-3.el7.x86_64
glusterfs-3.7.2-3.el7.x86_64
libvirt-daemon-1.2.8-16.el7_1.3.x86_64
qemu-kvm-ev-2.1.2-23.el7_1.3.1.x86_64
selinux-policy-3.13.1-23.el7_1.8.noarch
ovirt-engine-3.6.0-0.0.master.20150627185750.git6f063c1.el6.noarch

How reproducible:
Always

Steps to Reproduce:
1. Have a DC with 2 storage domains: 1 Gluster as master and another data domain
2. Deactivate the Gluster domain

Actual results:
Gluster master domain deactivation failure in vdsm:

Thread-1547::ERROR::2015-07-30 08:31:11,883::task::866::Storage.TaskManager.Task::(_setError) Task=`a5505f7f-1eea-49e7-a02b-644d8e754b04`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 49, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 1237, in deactivateStorageDomain
    pool.deactivateSD(sdUUID, msdUUID, masterVersion)
  File "/usr/share/vdsm/storage/securable.py", line 77, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1091, in deactivateSD
    self.masterMigrate(sdUUID, newMsdUUID, masterVersion)
  File "/usr/share/vdsm/storage/securable.py", line 77, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 810, in masterMigrate
    exclude=('./lost+found',))
  File "/usr/share/vdsm/storage/fileUtils.py", line 68, in tarCopy
    raise TarCopyFailed(tsrc.returncode, tdst.returncode, out, err)
TarCopyFailed: (1, 0, '', '/usr/bin/tar: ./vms: time stamp 2015-07-30 14:59:44 is 23312.863365733 s in the future\n/usr/bin/tar: ./tasks/31002a70-459d-418a-9b03-8aabb36b15c5/31002a70-459d-418a-9b03-8aabb36b15c5.task: time stamp 2015-07-30 11:43:41 is 11549.857334236 s in the future\n/usr/bin/tar: ./tasks/31002a70-459d-418a-9b03-8aabb36b15c5/31002a70-459d-418a-9b03-8aabb36b15c5.job.0: time stamp 2015-07-30 11:43:41 is 11549.854094166 s in the future\n/usr/bin/tar: ./tasks/31002a70-459d-418a-9b03-8aabb36b15c5/31002a70-459d-418a-9b03-8aabb36b15c5.result: time stamp 2015-07-30 11:43:41 is 11549.850335046 s in the future\n/usr/bin/tar: ./tasks/31002a70-459d-418a-9b03-8aabb36b15c5: time stamp 2015-07-30 11:43:41 is 11549.849210573 s in the future\n/usr/bin/tar: ./tasks/5a514e3a-2c8b-4aee-98a4-00192f879d21/5a514e3a-2c8b-4aee-98a4-00192f879d21.task: time stamp 2015-07-30 11:43:42 is 11550.844568364 s in the future\n/usr/bin/tar: ./tasks/5a514e3a-2c8b-4aee-98a4-00192f879d21/5a514e3a-2c8b-4aee-98a4-00192f879d21.result: time stamp 2015-07-30 11:43:42 is 11550.840811022 s in the future\n/usr/bin/tar: ./tasks/5a514e3a-2c8b-4aee-98a4-00192f879d21/5a514e3a-2c8b-4aee-98a4-00192f879d21.job.0: time stamp 2015-07-30 11:43:42 is 11550.834173155 s in the future\n/usr/bin/tar: ./tasks/5a514e3a-2c8b-4aee-98a4-00192f879d21: time stamp 2015-07-30 11:43:42 is 11550.833540006 s in the future\n/usr/bin/tar: ./tasks: time stamp 2015-07-30 11:43:42 is 11550.832915777 s in the future\n/usr/bin/tar: .: time stamp 2015-07-30 14:59:44 is 23312.832168412 s in the future\n')

Engine.log:

2015-07-30 08:31:12,577 ERROR [org.ovirt.engine.core.bll.storage.DeactivateStorageDomainCommand] (org.ovirt.thread.pool-8-thread-40) [12325ce] Command 'org.ovirt.engine.core.bll.storage.DeactivateStorageDomainCommand' failed: VdcBLLException: org.ovirt.engine.core.vdsbroker.irsbroker.IRSErrorException: IRSGenericException: IRSErrorException: Failed to DeactivateStorageDomainVDS, error = Error in storage domain action: (u'sdUUID=95fe6a28-a292-4753-8b84-fd268141a79a, spUUID=10843014-2739-477b-b446-2383aec4730e, msdUUID=3551f61a-3476-4698-bafa-fc8fa2e25661, masterVersion=5',), code = 350 (Failed with error StorageDomainActionError and code 350)

Expected results:
Deactivating the Gluster domain while it is the master should succeed, and reconstruct master should work.

Additional info:
sosreport from the engine: http://file.tlv.redhat.com/ebenahar/sosreport-RHEL6.7Server-20150730083705.tar.xz
sosreport from the hypervisor: http://file.tlv.redhat.com/ebenahar/sosreport-green-vdsb.qa.lab.tlv.redhat.com-20150730083706.tar.xz
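For context on where this blows up: per the traceback, tarCopy in vdsm's fileUtils.py pipes a tar create process into a tar extract process and raises TarCopyFailed when a return code is non-zero. A minimal sketch of that pipe-copy pattern (the function and exception names come from the traceback above; the command lines and plumbing details are illustrative, not vdsm's exact code):

    import subprocess

    class TarCopyFailed(RuntimeError):
        pass

    def tar_copy(src_dir, dst_dir, exclude=()):
        # Source tar archives src_dir to stdout ...
        src_cmd = ["/usr/bin/tar", "cf", "-", "-C", src_dir]
        src_cmd += ["--exclude=%s" % ex for ex in exclude]
        src_cmd.append(".")
        # ... and the destination tar extracts the stream under dst_dir.
        dst_cmd = ["/usr/bin/tar", "xf", "-", "-C", dst_dir]

        tsrc = subprocess.Popen(src_cmd, stdout=subprocess.PIPE)
        tdst = subprocess.Popen(dst_cmd, stdin=tsrc.stdout,
                                stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
        tsrc.stdout.close()  # let tsrc get SIGPIPE if tdst dies early
        out, err = tdst.communicate()
        tsrc.wait()
        # Any non-zero exit aborts the caller (here: the master migration),
        # and whatever landed in stderr ends up in the exception text --
        # including mere warnings, as seen in the log above.
        if tsrc.returncode != 0 or tdst.returncode != 0:
            raise TarCopyFailed(tsrc.returncode, tdst.returncode, out, err)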
Target release should be set once a package build is known to fix an issue. Since this bug is not in MODIFIED, the target version has been reset. Please use the target milestone to plan a fix for an oVirt release.
This is an automated message. oVirt 3.6.0 RC3 has been released and GA is targeted for next week, Nov 4th 2015. Please review this bug and, if it is not a blocker, please postpone it to a later release. All bugs not postponed by the GA release will be automatically re-targeted to
- 3.6.1 if severity >= high
- 4.0 if severity < high
Both Elad and I failed to reproduce.
This bug happens because of unsynced clocks; we can use -m on tar to solve that issue. We shouldn't fail to migrate the master because of that.

Allon/Nir - what's your take on that?
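To make the suggestion concrete: -m (--touch) on the extracting tar tells GNU tar not to restore the archived modification times, so extracted files get the current time and clock skew between hosts can't show up as future timestamps at all. In terms of the sketch in the description (illustrative only, not a tested patch):

    # -m / --touch: don't restore archived mtimes on extraction; extracted
    # files get the current time, so clock skew can't look like a
    # "time stamp ... in the future" condition.
    dst_cmd = ["/usr/bin/tar", "xf", "-", "-m", "-C", dst_dir]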
(In reply to Liron Aravot from comment #4)
> This bug happens because of unsynced clocks; we can use -m on tar to solve
> that issue. We shouldn't fail to migrate the master because of that.
>
> Allon/Nir - what's your take on that?

Sounds reasonable to me. Nir?
(In reply to Liron Aravot from comment #4)
> This bug happens because of unsynced clocks; we can use -m on tar to solve
> that issue. We shouldn't fail to migrate the master because of that.
>
> Allon/Nir - what's your take on that?

Did tar fail because the timestamps are in the future? Maybe we don't handle tar's exit code properly? Anyway, I don't think we should care about the timestamps, so if using -m avoids this issue, why not.
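On the exit-code question: GNU tar exits 0 on success, 1 when "some files differ" (e.g. a file changed while being archived), and 2 on a fatal error; the timestamp messages are warnings and do not by themselves change the exit status. If we wanted to tolerate the non-fatal class, the check in the earlier sketch could be relaxed along these lines (purely hypothetical, not what vdsm does today):

    # GNU tar: 0 = ok, 1 = "some files differ" (non-fatal), 2 = fatal error.
    # Hypothetical: treat only fatal source errors, and any destination
    # error, as a copy failure.
    if tsrc.returncode > 1 or tdst.returncode != 0:
        raise TarCopyFailed(tsrc.returncode, tdst.returncode, out, err)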
After checking: tar isn't failing because of the timestamps; it just logs those warnings to stderr, which is misleading (and already confused people while this bug was being investigated). We should avoid those timestamp warnings when performing the operation.
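If we keep restoring timestamps, GNU tar (>= 1.23) can suppress just this warning class on the extract side with --warning=no-timestamp; combined with Liron's -m it would keep stderr clean of this noise entirely. Continuing the illustrative sketch (flag placement here is an assumption, not a tested vdsm patch):

    # Suppress the "timestamp" warning class ("time stamp ... in the future",
    # "implausibly old time stamp") so stderr only carries real errors.
    # Requires GNU tar >= 1.23, which introduced --warning=KEYWORD.
    dst_cmd = ["/usr/bin/tar", "xf", "-", "--warning=no-timestamp",
               "-C", dst_dir]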