Bug 1411963 - NetApp NFS Cmode: Fix NotFound exception - backport upstream fix
Summary: NetApp NFS Cmode: Fix NotFound exception - backport upstream fix
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-cinder
Version: 8.0 (Liberty)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 8.0 (Liberty)
Assignee: Eric Harney
QA Contact: Tzach Shefi
URL:
Whiteboard:
Depends On: 1411967
Blocks:
 
Reported: 2017-01-10 20:56 UTC by Andreas Karis
Modified: 2020-03-11 15:35 UTC (History)

Fixed In Version: openstack-cinder-7.0.3-4.el7ost
Doc Text:
Clone Of:
: 1411967 (view as bug list)
Environment:
Last Closed: 2017-03-09 17:53:52 UTC
Target Upstream Version:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2017:0479 0 normal SHIPPED_LIVE openstack-cinder bug fix advisory 2017-03-09 22:53:38 UTC

Description Andreas Karis 2017-01-10 20:56:05 UTC
Description of problem:
Please backport https://github.com/openstack/cinder/commit/0125df9c25c156142d73356e305e3a31513a1fe8 into OSP 8

Version-Release number of selected component (if applicable):
OSP 8, python-cinder-7.0.3-1.el7ost.noarch

Additional info:
When Python tries to catch an exception in NetApp's nfs_cmode.py, the following log entry is generated:
~~~
2017-01-05 14:12:00.520 24342 ERROR cinder.volume.drivers.netapp.dataontap.nfs_cmode [req-72c563cf-dde9-49ad-8b31-473e9753e535 - - - - -] Copy offload workflow unsuccessful. type object 'exceptions.Exception' has no attribute 'NotFound'
2017-01-05 14:12:00.520 24342 ERROR cinder.volume.drivers.netapp.dataontap.nfs_cmode Traceback (most recent call last):
2017-01-05 14:12:00.520 24342 ERROR cinder.volume.drivers.netapp.dataontap.nfs_cmode   File "/usr/lib/python2.7/site-packages/cinder/volume/drivers/netapp/dataontap/nfs_cmode.py", line 425, in copy_image_to_volume
2017-01-05 14:12:00.520 24342 ERROR cinder.volume.drivers.netapp.dataontap.nfs_cmode     image_id)
2017-01-05 14:12:00.520 24342 ERROR cinder.volume.drivers.netapp.dataontap.nfs_cmode   File "/usr/lib/python2.7/site-packages/cinder/utils.py", line 872, in trace_method_logging_wrapper
2017-01-05 14:12:00.520 24342 ERROR cinder.volume.drivers.netapp.dataontap.nfs_cmode     return f(*args, **kwargs)
2017-01-05 14:12:00.520 24342 ERROR cinder.volume.drivers.netapp.dataontap.nfs_cmode   File "/usr/lib/python2.7/site-packages/cinder/utils.py", line 872, in trace_method_logging_wrapper
2017-01-05 14:12:00.520 24342 ERROR cinder.volume.drivers.netapp.dataontap.nfs_cmode     return f(*args, **kwargs)
2017-01-05 14:12:00.520 24342 ERROR cinder.volume.drivers.netapp.dataontap.nfs_cmode   File "/usr/lib/python2.7/site-packages/cinder/volume/drivers/netapp/dataontap/nfs_cmode.py", line 512, in _copy_from_img_service
2017-01-05 14:12:00.520 24342 ERROR cinder.volume.drivers.netapp.dataontap.nfs_cmode     except Exception.NotFound:
2017-01-05 14:12:00.520 24342 ERROR cinder.volume.drivers.netapp.dataontap.nfs_cmode AttributeError: type object 'exceptions.Exception' has no attribute 'NotFound'
2017-01-05 14:12:00.520 24342 ERROR cinder.volume.drivers.netapp.dataontap.nfs_cmode 
~~~

Here's the code. As you can see, it leads to the unhelpful message "has no attribute 'NotFound'"
~~~
less -N /usr/lib/python2.7/site-packages/cinder/volume/drivers/netapp/dataontap/nfs_cmode.py
(...)
    496     def _copy_from_img_service(self, context, volume, image_service,
    497                                image_id):
    498         """Copies from the image service using copy offload."""
    499         LOG.debug("Trying copy from image service using copy offload.")
    500         image_loc = image_service.get_location(context, image_id)
    501         locations = self._construct_image_nfs_url(image_loc)
    502         src_ip = None
    503         selected_loc = None
    504         # this will match the first location that has a valid IP on cluster
    505         for location in locations:
    506             conn, dr = self._check_get_nfs_path_segs(location)
    507             if conn:
    508                 try:
    509                     src_ip = self._get_ip_verify_on_cluster(conn.split(':')[0])
    510                     selected_loc = location
    511                     break
    512                 except Exception.NotFound:
    513                     pass
    514         if src_ip is None:
    515             raise exception.NotFound(_("Source host details not found."))
(...)
    440     def _get_ip_verify_on_cluster(self, host):
    441         """Verifies if host on same cluster and returns ip."""
    442         ip = na_utils.resolve_hostname(host)
    443         vserver = self._get_vserver_for_ip(ip)
    444         if not vserver:
    445             raise exception.NotFound(_("Unable to locate an SVM that is "
    446                                        "managing the IP address '%s'") % ip)
    447         return ip
(...)
~~~
The above is from an OSP 8 lab. Upstream, the relevant code differs in one minor but important detail; you can compare the above to https://github.com/openstack/cinder/blob/master/cinder/volume/drivers/netapp/dataontap/nfs_cmode.py.

I first verified this with a simple script:
~~~
try:
    raise Exception.NotFound("not found")
except Exception as e:
    print e
~~~

Looks as if Exception.NotFound isn't valid Python: the lookup fails at runtime because the built-in Exception class has no NotFound attribute.
~~~
[akaris@wks-akaris ~]$ ./test.py 
type object 'exceptions.Exception' has no attribute 'NotFound'
~~~
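One detail that may explain why this typo went unnoticed: Python evaluates the expression in an except clause only when an exception actually reaches that clause. The following is a minimal sketch of that behavior, not the driver code; the function names `lookup` and `outcome` are made up for illustration.

```python
def lookup(fail):
    """Hypothetical helper: succeeds unless asked to fail."""
    try:
        if fail:
            raise ValueError("lookup failed")
        return "ok"
    # The expression below is evaluated only if the try body raises.
    # The built-in Exception class has no NotFound attribute, so the
    # evaluation itself raises AttributeError at that point.
    except Exception.NotFound:
        return "handled"


def outcome(fail):
    """Report what happens on the success path and on the failure path."""
    try:
        return lookup(fail)
    except AttributeError as err:
        return "AttributeError: %s" % err


print(outcome(False))
print(outcome(True))
```

As long as the try body succeeds, the broken except clause is never evaluated, which is consistent with the error only showing up when the copy offload lookup fails.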


Now, let's compare our code in OSP 8 (first block) with upstream (second block):
~~~
    512                 except Exception.NotFound:
    513                     pass
~~~

~~~
    def _copy_from_img_service(self, context, volume, image_service,
                               image_id):
        """Copies from the image service using copy offload."""
        LOG.debug("Trying copy from image service using copy offload.")
        image_loc = image_service.get_location(context, image_id)
        locations = self._construct_image_nfs_url(image_loc)
        src_ip = None
        selected_loc = None
        # this will match the first location that has a valid IP on cluster
        for location in locations:
            conn, dr = self._check_get_nfs_path_segs(location)
            if conn:
                try:
                    src_ip = self._get_ip_verify_on_cluster(conn.split(':')[0])
                    selected_loc = location
                    break
                except exception.NotFound:
                    pass
~~~
https://github.com/openstack/cinder/blob/master/cinder/volume/drivers/netapp/dataontap/nfs_cmode.py

Note the difference between Exception.NotFound and exception.NotFound! This is why we get "has no attribute 'NotFound'"!
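To illustrate the intended behavior with the lowercase name, here is a small self-contained sketch. It is not the driver code: the `exception` class below is only a stand-in for the cinder.exception module, and `verify_on_cluster`, `first_valid_ip`, and `known_hosts` are hypothetical names that loosely mirror the loop in _copy_from_img_service.

```python
class exception(object):
    """Stand-in for the cinder.exception module (assumption, illustration only)."""

    class NotFound(Exception):
        pass


def verify_on_cluster(host, known_hosts):
    """Hypothetical lookup: return the IP for a host, or raise NotFound."""
    if host not in known_hosts:
        raise exception.NotFound("Unable to locate an SVM for '%s'" % host)
    return known_hosts[host]


def first_valid_ip(hosts, known_hosts):
    """Return the IP of the first host that resolves, skipping failures."""
    for host in hosts:
        try:
            return verify_on_cluster(host, known_hosts)
        except exception.NotFound:
            # Lowercase 'exception' resolves to the namespace that defines
            # NotFound, so the failed lookup is skipped as intended.
            pass
    raise exception.NotFound("Source host details not found.")


print(first_valid_ip(["host-a", "host-b"], {"host-b": "192.0.2.20"}))
```

With the lowercase name, a failed lookup for one location is skipped and the loop moves on to the next one; only when every location fails does the final NotFound get raised.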

Comment 2 Tzach Shefi 2017-03-07 13:38:22 UTC
Andreas, 

How was this system installed, OSPD or packstack?
If OSPD, do you happen to have the storage YAML files used, as a template for me to reuse?

Can you explain how to reproduce/test this?
I noticed the below, but I'm not sure I understand it.

~~~
try:
    raise Exception.NotFound("not found")
except Exception as e:
    print e
~~~
What script is this? Is it Python?

~~~
[akaris@wks-akaris ~]$ ./test.py 
type object 'exceptions.Exception' has no attribute 'NotFound'
~~~

What does test.py include, the code snippet from before?

Comment 3 Andreas Karis 2017-03-07 14:59:13 UTC
Hi,

It was an OSPD production system at a customer site. 

The test.py script was only raising this exception:
 Exception.NotFound 

Check upper vs lower case!

The code in OSP 8:
~~~
    512                 except Exception.NotFound:
    513                     pass
~~~

The code in upstream:
~~~
                except exception.NotFound:
                    pass
~~~

The built-in class 'Exception' has no NotFound attribute.
OpenStack's custom 'exception' module, however, does define it.

Hence, in the OSP 8 code, there is a typo.


Please see here for what needs to be backported, it's only one line:
https://github.com/openstack/cinder/commit/0125df9c25c156142d73356e305e3a31513a1fe8


All clear now? :)

Comment 4 Andreas Karis 2017-03-07 15:11:14 UTC
In order to verify this, you'd need an environment with NetApp backend storage and the CopyOffload feature configured. 

http://netapp.github.io/openstack-deploy-ops-guide/juno/content/figures/5/a/images/rapid_cloning_flowchart.png
http://www.netapp.com/us/media/tr-4506.pdf

Then, you'd run this with the NetApp copy offload feature disabled on the NetApp backend (I don't know how to do this; it was disabled or not functioning in the customer environment).

You will need to somehow hit this exception, so this method here has to fail:
self._get_ip_verify_on_cluster(conn.split(':')[0])
~~~
less -N /usr/lib/python2.7/site-packages/cinder/volume/drivers/netapp/dataontap/nfs_cmode.py
(...)
    496     def _copy_from_img_service(self, context, volume, image_service,
    497                                image_id):
    498         """Copies from the image service using copy offload."""
    499         LOG.debug("Trying copy from image service using copy offload.")
    500         image_loc = image_service.get_location(context, image_id)
    501         locations = self._construct_image_nfs_url(image_loc)
    502         src_ip = None
    503         selected_loc = None
    504         # this will match the first location that has a valid IP on cluster
    505         for location in locations:
    506             conn, dr = self._check_get_nfs_path_segs(location)
    507             if conn:
    508                 try:
    509                     src_ip = self._get_ip_verify_on_cluster(conn.split(':')[0])
    510                     selected_loc = location
    511                     break
    512                 except Exception.NotFound:
    513                     pass
~~~
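Without a NetApp backend, one way to hit this code path might be to force the lookup to fail with a mock. The sketch below is only an assumption about how such a check could look: `FakeDriver`, `pick_source_ip`, and the local `NotFound` class are hypothetical stand-ins, not the real driver or cinder.exception. It uses unittest.mock from Python 3; on Python 2.7 the separate `mock` package would be needed instead.

```python
from unittest import mock


class NotFound(Exception):
    """Stand-in for cinder.exception.NotFound (assumption)."""


class FakeDriver(object):
    """Hypothetical miniature of the driver; only the lookup loop is modeled."""

    def _get_ip_verify_on_cluster(self, host):
        return "192.0.2.10"

    def pick_source_ip(self, hosts):
        for host in hosts:
            try:
                return self._get_ip_verify_on_cluster(host)
            except NotFound:
                pass
        raise NotFound("Source host details not found.")


def run_with_failing_lookup():
    """Force every lookup to fail and report how the loop reacts."""
    driver = FakeDriver()
    with mock.patch.object(driver, "_get_ip_verify_on_cluster",
                           side_effect=NotFound("no SVM")) as lookup:
        try:
            driver.pick_source_ip(["host-a", "host-b"])
        except NotFound as err:
            # Both hosts were tried, then the final NotFound was raised.
            return lookup.call_count, str(err)


print(run_with_failing_lookup())
```

With the corrected except clause, every location is tried and the final "Source host details not found." error is raised; with the typo, the first failing lookup would surface as an AttributeError instead.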

Long story short, this was an obvious typo, which was already fixed and waived upstream, so I don't know how much work (if at all) there is to do for QA.

Comment 7 errata-xmlrpc 2017-03-09 17:53:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0479.html

