Bug 2188314
| Summary: | [OSP16.1] Instance root disk uses multipath but extra volume use single path ? | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | ggrimaux |
| Component: | python-os-brick | Assignee: | Cinder Bugs List <cinder-bugs> |
| Status: | NEW --- | QA Contact: | Evelina Shames <eshames> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 16.1 (Train) | CC: | apanagio, apevec, dasmith, eglynn, geguileo, jhakimra, jschluet, jveiraca, kchamart, lhh, ltamagno, ltoscano, mwitt, sbauza, sgordon, vromanso |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | Type: | Bug | |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
os-brick's behavior on a `connect_volume` call is to try to return a device, even if it is a single-pathed one.
This means that Nova won't be using a multipath device if at the time of the attachment one or more of these happen at the same time:
- There's only 1 path that can reach the storage array
- The connection to the storage array is very, very, very slow
- The CPU usage during the connection is very high
On the other hand, there are cases where we will get a multipath device even if only one path is up. For example, if multipathd on a host has already created a multipath device for that same volume in the past, it will create the multipath device as soon as the first device is connected.
In request req-0113c00f-7dc6-4fdd-aa47-5ca68125c91e we can see that there are no warnings from os-brick about not finding the WWN, instead it says that it did find the WWNs:
2023-05-09 14:32:26.933 7 INFO os_brick.initiator.linuxscsi [req-0113c00f-7dc6-4fdd-aa47-5ca68125c91e 66d3f3d465eb4567b0dfdd38ca0a02fd 781a8e8abc624a058b854ad114fdecc3 - default default] Find Multipath device file for volume WWN 360002ac0000000000000cc95000255e5
2023-05-09 14:32:31.096 7 INFO os_brick.initiator.linuxscsi [req-0113c00f-7dc6-4fdd-aa47-5ca68125c91e 66d3f3d465eb4567b0dfdd38ca0a02fd 781a8e8abc624a058b854ad114fdecc3 - default default] Find Multipath device file for volume WWN 360002ac0000000000001068b00024c40
2023-05-09 14:32:35.252 7 INFO os_brick.initiator.linuxscsi [req-0113c00f-7dc6-4fdd-aa47-5ca68125c91e 66d3f3d465eb4567b0dfdd38ca0a02fd 781a8e8abc624a058b854ad114fdecc3 - default default] Find Multipath device file for volume WWN 360002ac0000000000001068c00024c40
2023-05-09 14:32:39.420 7 INFO os_brick.initiator.linuxscsi [req-0113c00f-7dc6-4fdd-aa47-5ca68125c91e 66d3f3d465eb4567b0dfdd38ca0a02fd 781a8e8abc624a058b854ad114fdecc3 - default default] Find Multipath device file for volume WWN 360002ac0000000000001068d00024c40
2023-05-09 14:32:43.577 7 INFO os_brick.initiator.linuxscsi [req-0113c00f-7dc6-4fdd-aa47-5ca68125c91e 66d3f3d465eb4567b0dfdd38ca0a02fd 781a8e8abc624a058b854ad114fdecc3 - default default] Find Multipath device file for volume WWN 360002ac0000000000001068e00024c40
2023-05-09 14:32:47.746 7 INFO os_brick.initiator.linuxscsi [req-0113c00f-7dc6-4fdd-aa47-5ca68125c91e 66d3f3d465eb4567b0dfdd38ca0a02fd 781a8e8abc624a058b854ad114fdecc3 - default default] Find Multipath device file for volume WWN 360002ac0000000000001068f00024c40
2023-05-09 14:32:51.918 7 INFO os_brick.initiator.linuxscsi [req-0113c00f-7dc6-4fdd-aa47-5ca68125c91e 66d3f3d465eb4567b0dfdd38ca0a02fd 781a8e8abc624a058b854ad114fdecc3 - default default] Find Multipath device file for volume WWN 360002ac0000000000001069600024c40
2023-05-09 14:32:56.095 7 INFO os_brick.initiator.linuxscsi [req-0113c00f-7dc6-4fdd-aa47-5ca68125c91e 66d3f3d465eb4567b0dfdd38ca0a02fd 781a8e8abc624a058b854ad114fdecc3 - default default] Find Multipath device file for volume WWN 360002ac0000000000001069000024c40
2023-05-09 14:33:00.260 7 INFO os_brick.initiator.linuxscsi [req-0113c00f-7dc6-4fdd-aa47-5ca68125c91e 66d3f3d465eb4567b0dfdd38ca0a02fd 781a8e8abc624a058b854ad114fdecc3 - default default] Find Multipath device file for volume WWN 360002ac0000000000001069200024c40
2023-05-09 14:33:04.433 7 INFO os_brick.initiator.linuxscsi [req-0113c00f-7dc6-4fdd-aa47-5ca68125c91e 66d3f3d465eb4567b0dfdd38ca0a02fd 781a8e8abc624a058b854ad114fdecc3 - default default] Find Multipath device file for volume WWN 360002ac0000000000000003200024c6b
2023-05-09 14:33:08.601 7 INFO os_brick.initiator.linuxscsi [req-0113c00f-7dc6-4fdd-aa47-5ca68125c91e 66d3f3d465eb4567b0dfdd38ca0a02fd 781a8e8abc624a058b854ad114fdecc3 - default default] Find Multipath device file for volume WWN 360002ac0000000000000003500024c6b
And we can see in the instance XML in etc/libvirt/qemu/instance-00004eef.xml that all the devices are using multipath devices:
<source dev='/dev/disk/by-id/dm-uuid-mpath-360002ac0000000000000cc95000255e5'/>
<source dev='/dev/disk/by-id/dm-uuid-mpath-360002ac0000000000001068b00024c40'/>
<source dev='/dev/disk/by-id/dm-uuid-mpath-360002ac0000000000001068c00024c40'/>
<source dev='/dev/disk/by-id/dm-uuid-mpath-360002ac0000000000001068d00024c40'/>
<source dev='/dev/disk/by-id/dm-uuid-mpath-360002ac0000000000001068e00024c40'/>
<source dev='/dev/disk/by-id/dm-uuid-mpath-360002ac0000000000001068f00024c40'/>
<source dev='/dev/disk/by-id/dm-uuid-mpath-360002ac0000000000001069600024c40'/>
<source dev='/dev/disk/by-id/dm-uuid-mpath-360002ac0000000000001069000024c40'/>
<source dev='/dev/disk/by-id/dm-uuid-mpath-360002ac0000000000001069200024c40'/>
<source dev='/dev/disk/by-id/dm-uuid-mpath-360002ac0000000000000003200024c6b'/>
<source dev='/dev/disk/by-id/dm-uuid-mpath-360002ac0000000000000003500024c6b'/>
And checking sos_commands/multipath/multipath_-ll we see that all of those device maps have 4 paths.
So this is not really an unexpected problem with the behavior of the code: it is behaving as currently intended.
Another matter is whether improvements are possible in the code.
For example I can think of 2 possible changes:
- Make the FC os-brick connector force the creation of a multipath device as soon as the first device connects. This is not a great improvement, but under some circumstances it could help: after recovering fallen paths, we could do a manual scan on the host and the paths would be added to the device-mapper device in use by the instance, without having to stop or reboot the instance.
- Make the connect_volume method understand the `enforce_multipath` parameter and then reject all `connect_volume` calls that would yield as a result a single path or a multipath with a single path.
For this specific case we can simulate the first of those 2 changes with a configuration change to /etc/multipath.conf
We just need to change in `/etc/multipath.conf` the option `find_multipaths` to `no` and restart the `multipathd` service.
If we are using containers, we would need to add this to the puppet-generated configuration file and restart the `multipathd` container. That would be the "quick" fix, but if we are configuring `multipathd` with Director, we need to make sure we change how we deploy it, or the next redeploy may undo that change.
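As a minimal sketch of the "quick" fix, the change could be applied like this. The helper function is illustrative; the `/etc/multipath.conf` path and the restart commands in the comments are assumptions for a typical OSP 16.1 host, so adjust them to your deployment.

```shell
# Sketch: flip the find_multipaths option to "no" in a
# multipath.conf-style file. Takes the file path as its argument so
# it can be exercised safely against a copy first.
set_find_multipaths_no() {
    conf="$1"
    # Replace the value of any existing find_multipaths line in place.
    sed -i 's/^\([[:space:]]*find_multipaths[[:space:]]\{1,\}\).*/\1no/' "$conf"
}

# On a real host it would typically be applied as (not run here):
#   set_find_multipaths_no /etc/multipath.conf
#   systemctl restart multipathd      # bare-metal service
#   podman restart multipathd         # containerized deployment
```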
The defaults would look like:
defaults {
skip_kpartx yes
user_friendly_names no
find_multipaths no
polling_interval 5
dev_loss_tmo 60
uxsock_timeout 30000
}
What that does is tell `multipathd` that any device that appears on the system is going to be multipathed, so it shouldn't wait to see a second device before creating the multipath device-mapper device.
That should help with the "manual recover" of paths.
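The manual recovery scan mentioned above can be sketched as follows. Writing "- - -" to a SCSI host's `scan` file asks the kernel to re-discover LUNs on that host; the function is parameterized on the sysfs base directory purely so it can be exercised safely, and on a real host it would run as root against `/sys/class/scsi_host`.

```shell
# Sketch of a manual FC path rescan: write "- - -" (channel, target,
# LUN wildcards) to every SCSI host's scan file so the kernel
# re-discovers LUNs on recovered paths.
rescan_scsi_hosts() {
    base="${1:-/sys/class/scsi_host}"
    for host in "$base"/host*; do
        # Only touch entries that actually expose a scan file.
        [ -e "$host/scan" ] && echo "- - -" > "$host/scan"
    done
}

# On a real host (not run here):
#   rescan_scsi_hosts            # uses /sys/class/scsi_host
```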
For the record, the SOS reports are missing almost all of the os-brick logs due to a change in how Nova works. When having trouble with os-brick in the nova-compute service, please enable all the os-brick logs; I give a brief explanation in this KCS article: https://access.redhat.com/articles/5906971
After my colleague Alexandros pointed out that there were some SCSI kernel messages indicating that an attached device was forcefully unmapped/unexported from the array, I have a possible hypothesis about what could be happening.
With the OSP version in use, Nova instance deletion can leave devices behind if there is an error in the local disconnect operation. These leftover devices may cause issues when attaching future volumes to that same host; one of the possible issues is failure to form a multipath.
The solution is to update to the latest OSP version, where Nova passes "force=True" to the "disconnect_volume" os-brick method and where os-brick has support for force disconnect on the FC connector.
In the meantime I would recommend changing the multipathd configuration on all compute and controller nodes to include not only "find_multipaths no" but also "recheck_wwid yes":
defaults {
skip_kpartx yes
user_friendly_names no
recheck_wwid yes
find_multipaths no
polling_interval 5
dev_loss_tmo 60
uxsock_timeout 30000
}
And also periodically check that all multipath devices in a host have their 4 paths.
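That periodic check could be sketched like this. The awk parsing assumes the usual `multipath -ll` layout (a map header starting at column 0 that names the dm device, followed by indented path lines containing an H:C:T:L tuple); the expected path count of 4 matches this environment and would differ elsewhere.

```shell
# Sketch: print "<wwid> <path count>" for each multipath map found in
# `multipath -ll` output read on stdin.
count_mpath_paths() {
    awk '
        /^[^[:space:]].* dm-/ {                  # new map header line
            if (map != "") print map, paths
            map = $1; paths = 0
        }
        /[0-9]+:[0-9]+:[0-9]+:[0-9]+ sd/ { paths++ }  # one path line
        END { if (map != "") print map, paths }
    '
}

# Typical use on a host (not run here): list maps with fewer than
# the expected 4 paths.
#   multipath -ll | count_mpath_paths | awk '$2 != 4'
```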
Description of problem:
Client is using SAN storage backends. Client opened a case because some disk errors were seen inside the guest OS and on the hypervisor as well.
Hypervisor:
blk_update_request: I/O error, dev sdcg, sector 380321704 op 0x0:(READ) flags 0x4200 phys_seg 44 prio class 0
Guest:
print_req_error, dev vdb, 299419432
We found the root cause of this: a defective SFP module on the SAN switch side. What was discovered, though, is that the root disk uses multipath:
-blockdev '{"driver":"host_device","filename":"/dev/disk/by-id/dm-uuid-mpath-3600000000000000000aaaaaaaaaaaaaa","aio":"native","node-name":"libvirt-8-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-8-format","read-only":false,"cache":{"direct":true,"no-flush":false},"driver":"raw","file":"libvirt-8-storage"}' \
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=libvirt-8-format,id=virtio-disk0,bootindex=1,write-cache=on,serial=7c78cf3d-4958-47af-916a-e956570c09ac \
But extra volumes on this same instance use a single path:
-blockdev '{"driver":"host_device","filename":"/dev/disk/by-path/pci-0000:af:00.1-fc-0x20320002ac024c6b-lun-6","aio":"native","node-name":"libvirt-7-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-7-format","read-only":false,"cache":{"direct":true,"no-flush":false},"driver":"raw","file":"libvirt-7-storage"}' \
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=libvirt-7-format,id=virtio-disk1,write-cache=on,serial=6a34f92a-8168-423f-8c14-b68810a423e3 \
And what happened here is that the single path did in fact go down:
Mar 09 17:26:27 compute kernel: lpfc 0000:af:00.1: 3:1305 Link Down Event x4 received Data: x4 x20 x800110 x0 x0
For the record, all the volumes of the instance appear in multipath -ll output, so it is not a matter of them not existing somehow. So my question to you is: is this a bug?
It knows how to use multipath with the root disk (vda). If you need anything else from us, let me know! Thank you.
Version-Release number of selected component (if applicable): OSP16.1.7
How reproducible: 100%
Steps to Reproduce:
1. Create an instance
2. Add volumes to it.
Actual results: non-root disk volumes are not using multipath, creating a single point of failure.
Expected results: use multipath for those volumes so that if a single path goes down they are not impacted.
Additional info: we have a sosreport from the compute node.