Description of problem:
When the target is illegal (anything other than /dev/disk/by-path or /dev/disk/by-id) in pool-define-as and pool-create-as:
For pool-create-as, the command fails with an unknown error.
For pool-define-as, the pool can sometimes be created and started successfully with an illegal target, but no volumes can be listed with `virsh vol-list npiv`.

Version-Release number of selected component (if applicable):
libvirt-4.5.0-6.el7.x86_64
qemu-kvm-rhev-2.12.0-9.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. pool-create-as
# virsh pool-create-as npiv scsi --adapter-wwnn 20000024ff370144 --adapter-wwpn 2101001b32a90000 --target aaa
error: Failed to create pool npiv
error: An error occurred, but the cause is unknown

2. pool-define-as
# virsh pool-define-as npiv scsi --adapter-wwnn 20000024ff370144 --adapter-wwpn 2101001b32a90000 --target aaa
Pool npiv defined

# virsh pool-start npiv
error: Failed to refresh pool npiv
error: An error occurred, but the cause is unknown

Actual results:
As above

Expected results:
Either refuse to define/create a pool with an illegal target, or give a clear error.

Additional info:
Sometimes this error message appears in pool-refresh instead, when pool-create-as manages to create the pool with an illegal target. For example:
# virsh pool-create-as npiv scsi --adapter-wwnn 20000024ff370144 --adapter-wwpn 2101001b32a90000 --target /dev/disk/aaa
Pool npiv created

# virsh pool-refresh npiv
error: Failed to refresh pool npiv
error: An error occurred, but the cause is unknown
This borders on notabug territory since target is documented as:

    [--target path] is the path for the mapping of the storage pool into the host file system.

Still, we shouldn't get "error: An error occurred, but the cause is unknown".

Unfortunately my NPIV environment is not available at the moment, so it's a bit hard to reproduce and chase. Prefixing your command with LIBVIRT_DEBUG=1 and then providing the output around the failure may give me some hints. I have a feeling this has to do with a failure somewhere in virStorageBackendSCSIFindLUs since 'aaa' or '/dev/disk/aaa' are not real paths.

Still, this is not something that would be a blocker/exception, so I'm moving to rhel7.7.

NB: A similar command done for scsi on my non-NPIV-capable host:

# virsh pool-create-as npiv scsi --adapter-name scsi_host7 --target aaa
error: Failed to create pool npiv
error: invalid argument: unable to use target path 'aaa' for dev 'sdk'
#
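
In the meantime, a caller could sanity-check the target before handing it to virsh. The following is only a minimal sketch of such a client-side check, under the assumption that a usable target is an absolute path to an existing directory (for example /dev/disk/by-path). The function name `target_problem` is hypothetical, and this is not something libvirt itself does.

```python
import os


def target_problem(target):
    """Return a description of what looks wrong with a pool target path,
    or None if it looks usable.

    Hypothetical helper for illustration only. Assumption: a usable
    target is an absolute path naming an existing directory.
    """
    if not os.path.isabs(target):
        return f"target '{target}' is not an absolute path"
    if not os.path.isdir(target):
        return f"target '{target}' is not an existing directory"
    return None


if __name__ == "__main__":
    # The two targets used in this report, plus one commonly valid one.
    for candidate in ("aaa", "/dev/disk/aaa", "/dev/disk/by-path"):
        print(candidate, "->", target_problem(candidate) or "looks usable")
```

A check like this only catches the obviously bad paths; whether a given directory actually contains links to the pool's LUNs can still only be determined once the LUNs have been discovered.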
I finally revived my test system and figured out what was happening.

The "bug" of displaying the generic message "An error occurred, but the cause is unknown" instead of the actual error is a result of commit decaeb288, which altered the stop code processing to add a call to an external libvirt API, virGetConnectNodeDev, which calls virConnectOpen, which in turn calls virResetLastError. So the "fix" for that will be to save the error across that call, so that the failure is reported as, for example:

# virsh pool-refresh npiv
error: Failed to refresh pool npiv
error: cannot read dir '/dev/disk/aaa': No such file or directory
#

With respect to "why" "virsh pool-create-as npiv" displayed the error one time but not another: that's a factor of timing and the creation of the vport. It's possible, but usually unlikely, that by the time the refreshPool call runs in storagePoolCreateXML the scsi_host for the NPIV/vHBA has been created and the LUNs can be found/searched. But generally what happens is that the createVport processing creates a background thread which handles the refresh of the pool to list its volumes. Failures in that thread don't destroy the pool, but wait for the next refreshPool to provide the "bad news".

Again, the error/bug is not that the definition/creation is allowed, it's that the error message is wrong. The define/create code doesn't validate the target path due to how the LUNs are 'discovered' for the SCSI (and iSCSI) pools.

In any case, patches have been generated and posted upstream:

https://www.redhat.com/archives/libvir-list/2018-September/msg00579.html
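
For anyone following along, the save/restore idea can be shown with a toy model. This is a minimal sketch in Python, not the libvirt code: the single "last error" slot and every function name below (`refresh_pool`, `stop_pool`, `refresh_buggy`, `refresh_fixed`, and so on) are hypothetical stand-ins, and `stop_pool` simply assumes that cleanup ends up resetting the last error, as described above.

```python
# Toy model of a "last error" slot that a cleanup path may clear.
# All names are hypothetical; none of this is libvirt API.

GENERIC = "An error occurred, but the cause is unknown"

_last_error = None


def set_last_error(msg):
    global _last_error
    _last_error = msg


def reset_last_error():
    global _last_error
    _last_error = None


def get_last_error():
    # With nothing recorded, only the generic message can be shown.
    return _last_error or GENERIC


def refresh_pool(target):
    # Stand-in for a refresh that fails and records the real cause.
    set_last_error(f"cannot read dir '{target}': No such file or directory")
    return False


def stop_pool():
    # Stand-in for stop processing that opens a connection during
    # cleanup, which resets the last error as a side effect.
    reset_last_error()


def refresh_buggy(target):
    if not refresh_pool(target):
        stop_pool()               # the real cause is lost here
        return get_last_error()
    return None


def refresh_fixed(target):
    if not refresh_pool(target):
        saved = _last_error       # save the error before cleanup
        stop_pool()
        set_last_error(saved)     # restore it afterwards
        return get_last_error()
    return None


if __name__ == "__main__":
    print("buggy:", refresh_buggy("/dev/disk/aaa"))
    print("fixed:", refresh_fixed("/dev/disk/aaa"))
```

The only difference between the two variants is that the fixed one holds on to the error across the cleanup call and puts it back before reporting.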
Patch is pushed upstream:

commit 5309b6cb64a7b92f6b75eb6221c2e9e8889f3d7c
Author: John Ferlan <jferlan>
Date:   Wed Sep 12 11:25:37 2018 -0400

    storage: Save error during refresh failure processing

    ...

    Save the error from the refresh failure because the stopPool
    processing may overwrite the error or even worse clear it due
    to calling an external libvirt API that resets the last error
    such as is the case with the SCSI pool which may call
    virGetConnectNodeDev (see commit decaeb288) in order to process
    deleting an NPIV vport.

$ git describe 5309b6cb64a7b92f6b75eb6221c2e9e8889f3d7c
v4.7.0-178-g5309b6cb64
$
Verified at:
libvirt-5.0.0-10.virtcov.el8.x86_64
qemu-kvm-2.12.0-64.module+el8.0.0.z+3418+a72cf898.2.x86_64

Steps:
1. Create npiv pool on host with illegal target
# virsh pool-create-as npiv scsi --adapter-wwnn 20000000c99e2b81 --adapter-wwpn 1000000000000002 --target aaa
Pool npiv created

2. Refresh the pool
error: Failed to refresh pool npiv
error: invalid argument: unable to use target path 'aaa' for dev 'sde'

A clear error is reported and the pool is destroyed.
Created attachment 1588308
Code coverage 100%
This was verified and shipped long ago. Closing the bug report.