Description of problem:
cinder-volume with multi-backend fails with:
AttributeError: 'NetAppCmodeNfsDriver' object has no attribute 'shares'
where 'NetAppCmodeNfsDriver' is the driver block name in cinder.conf.

Version-Release number of selected component (if applicable):
cinder-volume --version 2013.1.2

How reproducible:
Unknown.

Steps to Reproduce:
1. Unknown. This issue does not occur on clean systems with no Cinder volumes; it has been triggered on one system where packstack was run to add a Nova compute node.

Actual results:
2013-09-27 11:11:21 ERROR [cinder.service] Unhandled exception
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/cinder/service.py", line 227, in _start_child
    self._child_process(wrap.server)
  File "/usr/lib/python2.6/site-packages/cinder/service.py", line 204, in _child_process
    launcher.run_server(server)
  File "/usr/lib/python2.6/site-packages/cinder/service.py", line 95, in run_server
    server.start()
  File "/usr/lib/python2.6/site-packages/cinder/service.py", line 355, in start
    self.manager.init_host()
  File "/usr/lib/python2.6/site-packages/cinder/volume/manager.py", line 149, in init_host
    self.driver.ensure_export(ctxt, volume)
  File "/usr/lib/python2.6/site-packages/cinder/volume/drivers/nfs.py", line 255, in ensure_export
    self._ensure_share_mounted(volume['provider_location'])
  File "/usr/lib/python2.6/site-packages/cinder/volume/drivers/nfs.py", line 315, in _ensure_share_mounted
    self._mount_nfs(nfs_share, mount_path, ensure=True)
  File "/usr/lib/python2.6/site-packages/cinder/volume/drivers/nfs.py", line 380, in _mount_nfs
    if self.shares.get(nfs_share) is not None:
AttributeError: 'NetAppCmodeNfsDriver' object has no attribute 'shares'

Expected results:
cinder-volume starts properly.

Additional info:
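For reference, the failure occurs under a multi-backend configuration of roughly the following shape. This cinder.conf fragment is an illustrative sketch, not taken from the affected system; only the option names shown are real Grizzly Cinder options, and the group names are assumptions mirroring the error message:

```ini
[DEFAULT]
# Each name listed here must match a config group below; the group name
# is the driver "block name" that appears in the error message.
enabled_backends = NetAppCmodeNfsDriver,lvm

[NetAppCmodeNfsDriver]
volume_driver = cinder.volume.drivers.netapp.nfs.NetAppCmodeNfsDriver
volume_backend_name = netapp_nfs
nfs_shares_config = /etc/cinder/nfs_shares.conf

[lvm]
volume_driver = cinder.volume.drivers.lvm.LVMISCSIDriver
volume_backend_name = lvm
```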
This bug renders production environments down, because cinder-volume cannot be started when it hits.
I have serious concerns about the process by which Red Hat releases OpenStack packages and additional fixes for stable branches. As I understand it, distributions always take the stable/<release> branches of the code and package them. If a bug fix is applied in the main branch, it needs to be backported to the stable/<release> branch first, and the distributions can then take the latest from that branch to release additional patches. This practice keeps the dev community on the same page, gives a consistent view of the whole code, and also helps with running regressions and identifying bugs on stable branches. Unfortunately, this practice was not followed by Red Hat for the Grizzly release on the stable/grizzly branch. They apparently took selected files such as "cinder/volume/drivers/nfs.py" from the main branch, which have still not been backported to stable/grizzly, and created an additional patch. There are other files, like ours, that depend on the selected file; we needed to be informed and given the chance to do some rounds of testing with the changes before an ad hoc patch release was made without backporting the code to the stable/grizzly branch. The change to "cinder/volume/drivers/nfs.py" also required a change to the NetApp driver file "cinder/volume/drivers/netapp/nfs.py", which was not made and tested at the time the files were picked for package creation.
Hi Navneet,

I'll need to defer to our storage/cinder experts on the specifics of this bug, but I wanted to address your more general comments below.

(In reply to Navneet from comment #3)
> I have serious concerns over the process how Red Hat releases OpenStack
> packages and additional fixes for stable branches. As per my understanding
> the distributions always take the stable/<release> branches of the code and
> package it. If any bug fix is applied in the main branch it needs to be back
> ported first to the stable/<release> branch and then the distributions can
> take the latest from the branch to release additional patches. This practice
> keeps the dev community on the same page, gives consistent view of the whole
> code and also helps to do regressions and identify bugs on stable branches.

Red Hat OpenStack releases do follow the stable branch updates. So, when a stable branch update for Grizzly is released, we rebase our RHOS 3.0 packages to that stable branch tarball and release new packages to RHN. This allows us to ensure that, at a minimum, our RHOS distribution always contains at least the fixes provided by the stable branches.

However, RHOS does backport patches from the upstream master branch for very targeted reasons. Here are some of the cases when we would do that:

* Not all bug fixes upstream end up being candidates for stable branch backports. A bug might not be deemed critical enough for the stable branch, but our customers might consider it more important than upstream would. In this case, we would apply a selected patch set to RHOS on top of the latest stable branch release. When a new stable branch tarball is released, we will rebase the patch on top of the new tarball.

* Stable branch releases are done at a cadence of 1-2 months between releases. For our customers, it might be critical to apply a fix sooner: for example, a data corruption issue, a security issue, or another bug that is critical to a customer's business operations.
In this case, we would push for inclusion of the patch in the stable branch. But, until the stable branch release is cut, we would carry the patch set in RHOS as a one-off, until we can absorb the next stable branch release.

* Targeted feature backports that do not break API or core functionality. Examples here might be backporting a plugin for Neutron or a driver for Cinder, since these can cleanly layer on top of the core components.

So, as you can see, there are valid reasons for not making Red Hat OpenStack follow the stable branch releases strictly. But we do follow a few core principles:

* If a patch should be in the stable branch, we get it in there first before backporting it to RHOS. We may carry it in RHOS short term until the next stable branch update is released.
* Patches need, at a minimum, to be in the upstream master branch, to ensure that we are not forking from upstream OpenStack.
* Backported features should not disrupt existing functionality or APIs.
* We always take the latest stable branch releases and rebase one-off patches on top of these stable branch tarballs.

Hopefully that explanation helps.

> Unfortunately this practice has not been followed for grizzly release on
> stable/grizzly branch by Red Hat. They apparently took some of the selected
> files like the "cinder/volume/drivers/nfs.py" from the main branch, which
> are still not back ported to stable/grizzly , and created additional patch.
> There are other files like ours which are dependent on the selected file and
> it needed to be tested/us informed to do some rounds of testing with the
> changes before doing an adhoc patch release without back porting the code to
> stable/grizzly branch before creating a package.

Again, I will have to defer to Eric and other Cinder experts on the specifics of why this file was backported. It could be that the changes to nfs.py were not candidates for stable branch inclusion.
I think it might be necessary to provide pre-release RPMs for new builds like this to vendors (like NetApp) to validate any changes and find issues like this before new RPMs are released officially. @hrivero, can you assist with this aspect?
Hi Perry,

Thanks for the explanation. In this particular case, code from the main branch appears to have been applied on top of the stable branch by Red Hat without being backported. I now understand the reasons why that might have been done, but the fact of the matter is that it broke the NetApp driver and caused a stability problem with it. It took us some time to identify the issue, as there was no GitHub repo/branch that we knew to be the base for the patch, so we had to get hold of the installed version of RHOS before we could figure out the issue and subsequently suggest a fix to the customer. There seems to be an obvious need for Red Hat to share their repository with NetApp, or to make the packages available to NetApp for pre-release certification of the full release or the patch, before pushing it out to the public. I expect some of my colleagues will be reaching out to you to reach an understanding on this.

Thanks,
Navneet
> I think that it might be necessary to provide pre-release RPMs for new builds
> like this to vendors (like Netapp) to do validation of any changes to find
> issues like this before new RPMs are released officially.
>
> @hrivero, can you assist with this aspect?

Yes, I can coordinate with NetApp (Jeff?) to determine a suitable process and identify the best test model and frequency.
This is broken in RHOS 3.0 due to changes made for the backport for BZ 957902. The functionality backported there was a new feature for Havana and therefore wasn't/isn't eligible for upstream stable/grizzly. (https://review.openstack.org/#/c/29323/)

That patch unfortunately assumes that the driver's self.shares will be initialized before the RemoteFsDriver's _ensure_share_mounted() is called. This is true for NFS and GlusterFS, but not for NetAppDirectCmodeNfsDriver.

Havana / RHOS 4.0 does not appear to have this problem, since NetAppNFSDriver's do_setup() calls super().do_setup(), which initializes self.shares via the (base) NFS driver's method.

We should fix this for RHOS 3.0 by adding initialization to the RemoteFsDriver class to set self.shares = {}, rather than relying on a certain code path to initialize it. This should also go upstream, as it is correct Python practice and may help other drivers. But this would be only for Icehouse at this point, since things (as far as I can tell) are not broken in upstream Grizzly nor Havana, and this is likely not a significant enough issue to qualify as a Havana RC-blocker.
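The proposed fix can be sketched as follows. This is a minimal, self-contained illustration, not the actual cinder source: the class bodies are hypothetical stand-ins, and only the attribute-initialization pattern reflects the proposal above.

```python
# Sketch of the proposed fix: initialize `shares` in the base class
# constructor, so any subclass whose do_setup() skips the base-class
# setup path still has the attribute defined.


class RemoteFsDriver(object):
    """Simplified stand-in for cinder's RemoteFsDriver base class."""

    def __init__(self, *args, **kwargs):
        super(RemoteFsDriver, self).__init__(*args, **kwargs)
        # The fix: guarantee the attribute exists regardless of which
        # subclass code path runs first.
        self.shares = {}


class NetAppCmodeNfsDriver(RemoteFsDriver):
    """Stand-in for a subclass that, as in RHOS 3.0, never touched
    self.shares during setup."""

    def do_setup(self):
        # Deliberately does not initialize self.shares, mirroring the bug.
        pass


driver = NetAppCmodeNfsDriver()
driver.do_setup()
# With the base-class initialization in place, this lookup no longer
# raises AttributeError; it simply returns None for an unknown share.
print(driver.shares.get('192.168.1.10:/export'))  # -> None
```

Without the `self.shares = {}` line in `__init__`, the final lookup reproduces the AttributeError from the traceback above.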
Created attachment 806344 [details]
Patched NetApp NFS file.

This is the NetApp driver file to be placed at cinder/volume/drivers/netapp to fix the issue.
We issued the above patch for the NetApp NFS driver to the customer to fix the issue, in case it helps.
Configured multi-backend for Cinder and played with it; nothing happened.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2013-1510.html