Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1961730

Summary: [rhosp13 z14] unable to start instances due to multipath failures after minor update from rhosp13 z6 to z14
Product: Red Hat OpenStack
Reporter: Ketan Mehta <kmehta>
Component: openstack-nova
Assignee: OSP DFG:Compute <osp-dfg-compute>
Status: CLOSED DUPLICATE
QA Contact: OSP DFG:Compute <osp-dfg-compute>
Severity: urgent
Priority: urgent
Docs Contact:
Version: 13.0 (Queens)
CC: apevec, dasmith, eglynn, jhakimra, jschluet, kchamart, lhh, lyarwood, sbauza, sgordon, vromanso
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-07-08 11:35:09 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Ketan Mehta 2021-05-18 15:14:50 UTC
Description of problem:

Unable to start instances on one compute node (compute-5) after performing a minor update from z6 to z14 and enabling TLS.

Upon a request to start/hard reboot a guest, the request fails (as seen in nova-compute.log) while trying to remove a multipath device via "multipath -f <device>" that is associated with another running instance.

The multipath_ids in the domain XML (virsh dumpxml) and in connection_info look correct; however, Nova still tries to remove an mpath device associated with another VM running on the same host.

Earlier it appeared that the cause could be the missing /etc/multipath.conf mounts in the nova-compute and cinder-volume containers, but manually adding the mounts to the nova-compute container did not fix the issue. Note that we only added them manually for the nova-compute container on compute-5, not for the cinder-volume containers.

These were the mounts:

  "/etc/multipath.conf:/etc/multipath.conf:ro",
  "/etc/multipath/:/etc/multipath/:rw",
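As a sanity check (a sketch, assuming RHOSP13's docker runtime and the usual container name "nova_compute", which may differ per deployment), the bind mounts that actually reached the running container can be inspected from the host:

```shell
#!/bin/sh
# Sketch: confirm the multipath bind mounts are present in the running
# nova_compute container. Assumes the docker runtime used by RHOSP13 and
# the default container name "nova_compute" (both are assumptions here).
CONTAINER=nova_compute

if command -v docker >/dev/null 2>&1; then
    # .HostConfig.Binds lists the host:container:mode bind mounts as JSON.
    docker inspect --format '{{json .HostConfig.Binds}}' "$CONTAINER" \
        | tr ',' '\n' | grep multipath
else
    echo "docker not available on this host"
fi
```

If the two multipath entries above do not appear in the output, the mounts were not applied to the running container.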

The request fails upon starting:

* VM1: cd42c047-fea5-4b0b-b4bf-8bb1f17b7148

with errors:

~~~
2021-05-18 15:49:43.637 9 ERROR oslo_messaging.rpc.server ProcessExecutionError: Unexpected error while running command.
2021-05-18 15:49:43.637 9 ERROR oslo_messaging.rpc.server Command: multipath -f /dev/disk/by-id/dm-uuid-mpath-360002ac00000000000019561000199cd
2021-05-18 15:49:43.637 9 ERROR oslo_messaging.rpc.server Exit code: 1
2021-05-18 15:49:43.637 9 ERROR oslo_messaging.rpc.server Stdout: u'May 18 15:49:03 | /dev/disk/by-id/dm-uuid-mpath-360002ac00000000000019561000199cd: map in use\nMay 18 15:49:03 | failed to remove multipath map /dev/disk/by-id/dm-uuid-mpath-360002ac00000000000019561000199cd\n'
2021-05-18 15:49:43.637 9 ERROR oslo_messaging.rpc.server Stderr: u''
2021-05-18 15:49:43.637 9 ERROR oslo_messaging.rpc.server
~~~

Upon checking, the map 'mpath-360002ac00000000000019561000199cd' is associated with:

* VM2: 75c2fc08-f9f2-4352-a4a6-7ee48cc95912

which is in running state on the same host.
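For reference, the "map in use" failure from "multipath -f" means the dm device still has kernel holders or open file descriptors. A rough way to see what is holding the map (a sketch, assuming the standard RHEL 7 sysfs layout and reusing the WWID from the error log above):

```shell
#!/bin/sh
# Sketch: find what is still holding a multipath map that "multipath -f"
# refuses to flush. The WWID is the one from the error log above; the
# sysfs layout assumed is the standard RHEL 7 one.
WWID=360002ac00000000000019561000199cd
LINK=/dev/disk/by-id/dm-uuid-mpath-$WWID

if [ -e "$LINK" ]; then
    # Resolve the underlying dm-N device behind the persistent symlink.
    DM=$(basename "$(readlink -f "$LINK")")
    # Any holder listed here (e.g. another dm layer) blocks the flush.
    ls "/sys/block/$DM/holders/"
    # Processes with the device open (e.g. a qemu-kvm guest) also block it.
    fuser -v "/dev/$DM" || true
else
    echo "map dm-uuid-mpath-$WWID not present on this host"
fi
```

In this case a holder belonging to the running VM2's qemu-kvm process would be consistent with the error, since the map is still in use by that guest.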

Adding connection_info from nova.bdm via a comment below; DB dumps will be available soon as well.

The same behaviour is seen for multiple VMs on the same host; I'll try to list them. All of them are in stopped state, so we need to attempt bringing them up.

Version-Release number of selected component (if applicable):

RHOSP13 z14 + HPE 3PAR [Please note Hitachi storage is also used, but that is for host LVMs; 3PAR is for instances]

+++

rhosp13/openstack-nova-compute                13.0-165             dd337e44bcfa        7 weeks ago         1.78 GB

rhosp13/openstack-nova-libvirt                13.0-175             fbcbbcc3ed24        7 weeks ago         1.67 GB

+++

python2-os-brick-2.3.9-9.el7ost.noarch                      Sat May 15 10:11:27 2021


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:
Providing in comments below.