Description of problem:
Environment: Vdsm 4.2 installed in 4.1 cluster, and running VMs created by 4.1 Engine.
When a disk is hotplugged, Vdsm may fail to store enough data. The hotplug operation per se is expected to work, but management operations, like live merge (and perhaps snapshot) will fail on that drive.
Version-Release number of selected component (if applicable):
Not sure; the problem seems likely from reading the code, but it has not been encountered in the wild yet.
Steps to Reproduce:
1. Having oVirt 4.1 with at least a 4.1 cluster, upgrade Vdsm to 4.2
2. Run a pre-existing VM using Engine 4.1, cluster 4.1 but with Vdsm 4.2
3. Hotplug a disk on the aforementioned VM. Should work fine.
4. Perform snapshot + live merge on the hotplugged disk. Live merge should fail.
Actual results:
Live merge on the hotplugged disk is expected to fail. Live merge should work fine on cold plugged drives.
Expected results:
Live merge works fine on both cold plugged and hot plugged disks.
no doc_text required; this should Just Work (tm).
tentatively scheduled for 4.2.4.
The patch is ready and simple; it just needs a backport, but also a bit more investigation to confirm the scenario.
The scenario is simpler but more worrisome.
Vdsm will not add the disk to vm.conf['devices'], so the newly hotplugged disk will not be reported in the output of getVMFullList(), and the hotplug operation will therefore be (wrongly) reported as failed.
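The failure mode described above can be modelled with a small, self-contained sketch. This is not Vdsm code: the helper names hotplug() and disk_is_reported() are hypothetical, and the device dictionary layout (the 'type' and 'imageID' keys) is an assumption made for illustration. It only shows why a disk that is attached but never added to vm.conf['devices'] would look like a failed hotplug to a caller that inspects the reported device list.

```python
# Illustrative model only. The dict layout ('devices', 'type', 'imageID')
# and both helper names are assumptions, not the verified Vdsm schema or API.

def disk_is_reported(vm_conf, image_id):
    """Return True if a disk with this image ID is listed in vm_conf['devices']."""
    return any(
        dev.get('type') == 'disk' and dev.get('imageID') == image_id
        for dev in vm_conf.get('devices', [])
    )


def hotplug(vm_conf, disk, update_conf=True):
    """Model a hotplug. update_conf=False mimics the faulty flow, where the
    attach itself succeeds but the device list is never updated."""
    if update_conf:
        vm_conf.setdefault('devices', []).append(dict(disk, type='disk'))
    return vm_conf


conf = {'devices': [{'type': 'disk', 'imageID': 'cold-disk'}]}
hotplug(conf, {'imageID': 'hot-disk'}, update_conf=False)

print(disk_is_reported(conf, 'cold-disk'))  # True: cold plugged disk is listed
print(disk_is_reported(conf, 'hot-disk'))   # False: hotplug looks like a failure
```

In this model a caller that decides success by looking for the new disk in the reported devices sees the hotplug as failed, even though the attach step itself went through.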
Steps to reproduce:
1. set up 4.1 environment: 4.1 cluster, 4.1 Engine
2. upgrade vdsm in 4.1 cluster to 4.2
3. run VM in 4.1 cluster managed by 4.2 Vdsm using 4.1 Engine
4. hotplug Disk (any format, any Storage Domain)
5. run snapshot after successful hotplug disk - should succeed as usual
6. live merge of the snapshot created - should succeed as usual
(In reply to Francesco Romani from comment #2)
> The scenario is simpler but more worrisome.
> Vdsm will not add the disk to vm.conf['devices'], so the newly hotplugged
> disk will not be reported in the output of getVMFullList(), and the hotplug
> operation will therefore be (wrongly) reported as failed.
...But this can happen only when using a test build that was never released to the public.
So we can't really trigger this faulty flow with released software.