Description of problem:
Upgrading to vdsm-4.14.13-2 sometimes causes storage instability: the hosts report latency errors and FC link flapping events. supervdsm is also seen sending FC LIP events around the same time as these errors occur. Sending a LIP is advertised as a last-resort action in our documentation:
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Storage_Administration_Guide/scanning-storage-interconnects.html
Why are we doing this on a regular basis in vdsm? This seems to negatively affect storage performance.

Version-Release number of selected component (if applicable):
vdsm-4.14.13-2.el6ev

How reproducible:
Always

Steps to Reproduce:
1. install RHEV-H with vdsm-4.14.13-2.el6ev
2. connect FibreChannel Storage
3. activate host/assign a LUN/activate a storage domain

Actual results:
High Latency errors reported by storage
Link Up events registered for FC HBAs
sanlock warnings/errors reported

Expected results:
All operates without issues

Additional info:
this may have been implemented as a fix to BZ#1121998
In 'vdsm-4.14.13-2' a LIP is now issued to all FibreChannel hosts in 'hba.py' in certain circumstances. This was introduced via BZ 1123637.

In some circumstances, e.g. (dis)connectStoragePool when going into or coming out of maintenance mode, this may be OK, but it also occurs when the following are performed:

- activating/deactivating an NFS Export or ISO domain
- clicking on 'Edit' in the Admin Portal to edit an FC data domain (results in a 'getDeviceList' on the host)

The problem here is that all the active FC storage domains will be affected by this. Even for the fix to BZ 1123637, which I believe is for being able to see a newly presented FC LUN on a host in order to either create a new storage domain or extend an existing one, all of the other FC storage domains will be affected.
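For reference, a LIP is issued from userspace by writing to the 'issue_lip' attribute of an FC host in sysfs, as described in the Storage Administration Guide linked in the description. The following is a minimal Python sketch of what "issue a LIP to all FC hosts" amounts to. It is not the code in 'hba.py'; the function names, the 'sysfs_root' parameter and the 'dry_run' flag are assumptions made for illustration.

```python
import os

SYSFS_ROOT = "/sys"  # assumption: overridable so the sketch can run on a fake tree


def fc_hosts(sysfs_root=SYSFS_ROOT):
    """Return the sorted names (e.g. 'host3') of all FC hosts under sysfs."""
    fc_dir = os.path.join(sysfs_root, "class", "fc_host")
    if not os.path.isdir(fc_dir):
        return []  # no FC HBAs on this machine
    return sorted(os.listdir(fc_dir))


def issue_lip_all(sysfs_root=SYSFS_ROOT, dry_run=True):
    """Issue a LIP on every FC host and return the sysfs paths involved.

    With dry_run=True (the default) nothing is written; the paths are only
    reported. A real write resets the link of every HBA, so every active FC
    storage domain on the host sees the disruption, not just the one being
    edited.
    """
    paths = []
    for host in fc_hosts(sysfs_root):
        path = os.path.join(sysfs_root, "class", "fc_host", host, "issue_lip")
        if not dry_run:
            with open(path, "w") as f:
                f.write("1")
        paths.append(path)
    return paths
```

Calling issue_lip_all() with the default arguments only lists the paths that would be written, which is enough to see that the operation is per-host and not per-LUN or per-domain.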
Gordon, the attached patch disables HBA rescanning by default. Can you test this patch and confirm that it resolves the FC connection issues?
The new patch (34245) is a more correct fix. It would be helpful if you could test this patch on relevant sites.

The interesting test is:
1. While host is up, add new LUN on the storage server
2. Edit FC storage domain or create new one

Expected results:
- The new LUN should appear in the list of devices
- Existing FC connection should not be disrupted.
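To illustrate the kind of rescan that can pick up a new LUN without resetting the link, here is a minimal Python sketch that writes the wildcard "- - -" (all channels, targets and LUNs) to the 'scan' attribute of each SCSI host that is also an FC host. This is only an assumption about what a non-disruptive rescan could look like, not a description of what patch 34245 does; 'scan_fc_hosts', 'sysfs_root' and 'dry_run' are hypothetical names.

```python
import os


def scan_fc_hosts(sysfs_root="/sys", dry_run=True):
    """Trigger a SCSI scan on FC hosts only and return the 'scan' paths used.

    With dry_run=True (the default) nothing is written. Unlike a LIP, a scan
    asks the kernel to look for new devices and does not reset the FC link.
    """
    fc_dir = os.path.join(sysfs_root, "class", "fc_host")
    if not os.path.isdir(fc_dir):
        return []  # no FC HBAs present
    scanned = []
    for host in sorted(os.listdir(fc_dir)):
        host_dir = os.path.join(sysfs_root, "class", "scsi_host", host)
        if not os.path.isdir(host_dir):
            continue  # FC host without a matching scsi_host entry; skip it
        path = os.path.join(host_dir, "scan")
        if not dry_run:
            with open(path, "w") as f:
                f.write("- - -")  # wildcard: channel, target, LUN
        scanned.append(path)
    return scanned
```

Non-FC SCSI hosts (local disks, iSCSI) are left alone because only the names found under 'fc_host' are scanned.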
Red Hat Customer Portal 01123741
*** Bug 1162283 has been marked as a duplicate of this bug. ***
FC LIP events are not issued as part of storage domain creation/edit.

Checked the following:
Installed an OS on a guest whose attached disk resides on an FC domain:
- Mapped a new LUN to the host by FC, then clicked on 'new' domain.
- Mapped a new LUN to the host by FC, then clicked on 'edit' domain.
Rescanned the bus using the 'rescan-scsi-bus.sh' tool, then performed those steps again.

During those actions, the OS installation on the guest wasn't affected; it finished successfully.

Didn't encounter any of:
High Latency errors reported by storage
Link Up events registered for FC HBAs
sanlock warnings/errors reported

Checked using XtremIO storage server
Used rhev 3.5 vt11
doctext copied from the zstream clone, bug 1157681.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2015-0159.html