Bug 1697994
| Field | Value |
|---|---|
| Summary | [RHOSP14] dataontap volume driver only provides single target_portal address when multiple are configured |
| Product | Red Hat OpenStack |
| Component | openstack-cinder |
| Version | 14.0 (Rocky) |
| Target Release | 14.0 (Rocky) |
| Target Milestone | z3 |
| Hardware | x86_64 |
| OS | Linux |
| Status | CLOSED ERRATA |
| Severity | high |
| Priority | high |
| Keywords | Triaged, ZStream |
| Reporter | Pablo Caruana <pcaruana> |
| Assignee | Pablo Caruana <pcaruana> |
| QA Contact | Tzach Shefi <tshefi> |
| Docs Contact | Tana <tberry> |
| CC | aavraham, abishop, apevec, dasmith, dhill, eglynn, geguileo, gkumar, igallagh, jhakimra, jschluet, kchamart, knylande, lhh, lyarwood, msufiyan, pcaruana, pgrist, rheslop, sbauza, sgordon, shdunne, tenobreg, vromanso |
| Fixed In Version | openstack-cinder-13.0.3-3.el7ost |
| Doc Type | Bug Fix |
| Doc Text | Previously, NetApp drivers would fail to attach a volume if the IP provided for discovery was not accessible from the host. The NetApp iSCSI drivers have been updated to return `target_iqns`, `target_portals`, and `target_luns` parameters when these options are available. |
| Clone Of | 1653051 |
| Clones | 1697996 (view as bug list) |
| Bug Blocks | 1653051, 1697996 |
| Last Closed | 2019-07-02 19:43:59 UTC |
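For context on the doc text above: the plural keys let the connector try more than one portal instead of being tied to a single discovery address. Below is a minimal sketch of the shape of such multipath-capable connection data; the key names are the ones called out in the doc text, while the portal, IQN, and LUN values are invented placeholders, not taken from this bug.

```python
# Illustrative only: the shape of multipath-capable iSCSI connection data.
# Key names (target_portal/target_iqn/target_lun and their plural forms)
# come from the doc text above; all values are made-up placeholders.
connection_info = {
    "driver_volume_type": "iscsi",
    "data": {
        # Single-path fields, as returned before the fix:
        "target_portal": "X.Y.16.11:3260",
        "target_iqn": "iqn.1992-08.com.netapp:sn.example:vs.1",
        "target_lun": 0,
        # Multipath fields, returned when multiple portals are configured:
        "target_portals": ["X.Y.16.11:3260", "X.Y.16.12:3260"],
        "target_iqns": [
            "iqn.1992-08.com.netapp:sn.example:vs.1",
            "iqn.1992-08.com.netapp:sn.example:vs.1",
        ],
        "target_luns": [0, 0],
    },
}
```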
Description (Pablo Caruana, 2019-04-09 12:11:20 UTC)

Tzach Shefi:

Pablo, while testing this on openstack-cinder-13.0.5-2.el7ost.noarch, I hit a problem; see the bottom of this comment.
On a NetApp iSCSI backend I created a volume and attached it to an instance.
On the compute node we see the basic two-path flow:
3600a098038304479363f4c4870453456 dm-0 NETAPP ,LUN C-Mode
size=2.0G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| `- 6:0:0:0 sda 8:0 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
`- 7:0:0:0 sdb 8:16 active ready running
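As a side note for reproducing this check, the two usable paths above can also be counted without eyeballing the output. A minimal sketch of such a QA helper (assumed, not part of any product tooling; it simply counts 'active ready running' lines, which is enough here because only one multipath device exists on this compute node):

```python
#!/usr/bin/env python3
"""Count the paths that multipath currently reports as usable."""
import subprocess

def count_ready_paths():
    # Parse `multipath -ll` and count path lines in the 'active ready running' state.
    out = subprocess.run(
        ["multipath", "-ll"], capture_output=True, text=True, check=True
    ).stdout
    return sum(1 for line in out.splitlines() if "active ready running" in line)

if __name__ == "__main__":
    # Expect 2 in the healthy state shown above, 1 after a portal is blocked.
    print("usable paths:", count_ready_paths())
```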
[root@compute-0 ~]# iscsiadm -m session -P 3 | grep X.Y.    (X.Y. represents redacted internal IPs)
Current Portal: X.Y.16.12:3260,1042
Persistent Portal: X.Y.16.12:3260,1042
Current Portal: X.Y.16.11:3260,1041
Persistent Portal: X.Y.16.11:3260,1041
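The same goes for the portal list: a small, illustrative helper (assumed, not part of the product) that pulls only the 'Current Portal:' entries out of `iscsiadm -m session -P 3`, which makes the before/after comparison in the steps below easier:

```python
#!/usr/bin/env python3
"""List the portals of the currently logged-in iSCSI sessions."""
import re
import subprocess

def current_portals():
    proc = subprocess.run(
        ["iscsiadm", "-m", "session", "-P", "3"],
        capture_output=True, text=True,
    )
    # iscsiadm exits non-zero and prints "No active sessions." when nothing is logged in.
    if proc.returncode != 0:
        return []
    return re.findall(r"Current Portal:\s*(\S+)", proc.stdout)

if __name__ == "__main__":
    # e.g. ['X.Y.16.12:3260,1042', 'X.Y.16.11:3260,1041'] in the healthy state above.
    print(current_portals())
```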
Now let's detach the volume and fail one of the paths via a firewall block rule.
(overcloud) [stack@undercloud-0 ~]$ nova volume-detach 02244b34-bc4e-4fcb-8325-058d5b84bcb2 7f21a2b4-2313-496f-9e3b-8c29963eeba0
We see no active connections at this point:
[root@compute-0 ~]# iscsiadm -m session -P 3 | grep 16
iscsiadm: No active sessions.
Now let's add an iptables drop rule on the compute node:
sudo iptables -s X.Y.16.11 -p tcp --sport 3260 -I INPUT -m statistic --mode random --probability 1 -j DROP
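Before retrying the attach, it is worth confirming the drop rule really took effect. A minimal sketch of a reachability probe (illustrative only; it assumes the two redacted X.Y portal addresses from above and a plain TCP connect to the standard iSCSI port 3260):

```python
#!/usr/bin/env python3
"""Check which iSCSI portals are reachable from the compute node."""
import socket

# Redacted internal portal addresses from the comment above (placeholders).
PORTALS = ["X.Y.16.11", "X.Y.16.12"]

def reachable(ip, port=3260, timeout=3):
    # A blocked portal should fail or time out on a plain TCP connect.
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False

for ip in PORTALS:
    print(ip, "reachable" if reachable(ip) else "blocked/unreachable")
```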
Now retry attaching the volume:
(overcloud) [stack@undercloud-0 ~]$ nova volume-attach 02244b34-bc4e-4fcb-8325-058d5b84bcb2 7f21a2b4-2313-496f-9e3b-8c29963eeba0 auto
+----------+--------------------------------------+
| Property | Value |
+----------+--------------------------------------+
| device | /dev/vdb |
| id | 7f21a2b4-2313-496f-9e3b-8c29963eeba0 |
| serverId | 02244b34-bc4e-4fcb-8325-058d5b84bcb2 |
| volumeId | 7f21a2b4-2313-496f-9e3b-8c29963eeba0 |
+----------+--------------------------------------+
And we see only one connection, via IP x.y.16.12:
[root@compute-0 ~]# iscsiadm -m session -P 3 | grep 16
Current Portal: 10.46.16.12:3260,1042
Persistent Portal: 10.46.16.12:3260,1042
[root@compute-0 ~]# multipath -ll -v2
3600a098038304479363f4c4870453456 dm-0 NETAPP ,LUN C-Mode
size=2.0G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 alua' wp=rw
`-+- policy='service-time 0' prio=50 status=active
`- 8:0:0:0 sda 8:0 active ready running
I detached the volume again, now with x.y.16.11 blocked, but this time I fail to attach the volume.

First of all, are my verification steps suitable for verifying this BZ?
If not, what else should I verify, and how?
What do we do about the fact that I fail to attach the volume once I block x.y.16.11?
I'm not sure about the terminology, but wouldn't this match the error this BZ fixes?
How do I know whether x.y.16.11 is the discovery-provided IP?
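As an aside on the last question: with the cinder volume service running in debug mode (as suggested in the reply below), the connection properties returned for the attach, including the portal handed out for discovery, should be visible in the cinder-volume log. A minimal sketch of pulling them out; the log path is an assumption based on the containerized OSP 14 layout, not taken from this bug:

```python
#!/usr/bin/env python3
"""Grep a cinder-volume debug log for the iSCSI portal fields."""

# Assumed path (containerized OSP 14 controller); adjust for your deployment.
LOG = "/var/log/containers/cinder/cinder-volume.log"

with open(LOG, errors="replace") as fh:
    for line in fh:
        # Plain substring match; also catches the plural 'target_portals' key.
        if "target_portal" in line:
            print(line.rstrip())
```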
(In reply to Tzach Shefi from comment #10)

The test looks valid, assuming no other setup leftovers; the most important element is confirming that the address/port pairs are passed correctly. In any case, the best way to understand what is going on is to reproduce it with both the nova and cinder components in debug mode, so the volume request can be tracked on both the compute and controller sides, ideally with any associated trace call.

Verified on: openstack-cinder-13.0.5-2.el7ost.noarch

Created an iSCSI volume and attached it to an instance; we see both paths.

[root@compute-0 ~]# iscsiadm -m session -P 3 | grep X.Y.    (X.Y. represents redacted internal IPs)
Current Portal: X.Y.16.12:3260,1042
Persistent Portal: X.Y.16.12:3260,1042
Current Portal: X.Y.16.11:3260,1041
Persistent Portal: X.Y.16.11:3260,1041

Per the second issue I hit at the end of comment 11: it is a known nova multipath bug, and for this BZ's verification we can ignore it.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:1678