Bug 1059778
| Summary: | [engine] Engine sends DisconnectStorageServer with empty connectionList | ||||||
|---|---|---|---|---|---|---|---|
| Product: | [oVirt] ovirt-engine | Reporter: | Gadi Ickowicz <gickowic> | ||||
| Component: | General | Assignee: | Daniel Erez <derez> | ||||
| Status: | CLOSED WONTFIX | QA Contact: | Aharon Canan <acanan> | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | unspecified | ||||||
| Version: | --- | CC: | amureini, bugs, derez, fsimonce, gickowic, gklein, laravot, nlevinki, rbalakri, tnisan, yeylon, ylavi | ||||
| Target Milestone: | --- | ||||||
| Target Release: | --- | ||||||
| Hardware: | All | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2016-03-28 14:33:11 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
|
||||||
Federico, does this seem related to your recent refactoring? When executing disconnectStorageServer, storage cache refresh is triggered regardless of the passed connections list. Federico, it it safe to remove that at least in some of the cases? or do we want to leave it? (In reply to Allon Mureinik from comment #1) > Federico, does this seem related to your recent refactoring? What I've been able to reproduce is a broader issue that is not related to any DisconnectStorageServer refactoring. It seems that the logic in filterConnectionsUsedByOthers (ISCSIStorageHelper.java:~108) is fairly simple and if you have more than one domain on the same connection (on different luns) it decides that we should not disconnect from the server. It doesn't take in account any of the followings (even if it should): - consider only the storage domains of the pool(s) that the vds is connected to - consider only the direct luns that are currently in use by a VM - if all the remaining relevant storage domains are in maintenance then don't filter (=> allow disconnection) Once we solved these issues that are the pre-requisites we'll see if there's something more to address in the specific DestroyStoragePool flow. (In reply to Federico Simoncelli from comment #3) > (In reply to Allon Mureinik from comment #1) > > Federico, does this seem related to your recent refactoring? > > What I've been able to reproduce is a broader issue that is not related to > any DisconnectStorageServer refactoring. > > It seems that the logic in filterConnectionsUsedByOthers > (ISCSIStorageHelper.java:~108) is fairly simple and if you have more than > one domain on the same connection (on different luns) it decides that we > should not disconnect from the server. > > It doesn't take in account any of the followings (even if it should): > > - consider only the storage domains of the pool(s) that the vds is connected > to > - consider only the direct luns that are currently in use by a VM > - if all the remaining relevant storage domains are in maintenance then > don't filter (=> allow disconnection) > > Once we solved these issues that are the pre-requisites we'll see if there's > something more to address in the specific DestroyStoragePool flow. I agree, there are issues on that helper that should addressed, i assume that the reason that it wasn't done before is to avoid synchronization between the different flows. Fede, regardless of that - can you please reply to https://bugzilla.redhat.com/show_bug.cgi?id=1059778#c2 ? (In reply to Liron Aravot from comment #2) > When executing disconnectStorageServer, storage cache refresh is triggered > regardless of the passed connections list. > Federico, it it safe to remove that at least in some of the cases? or do we > want to leave it? If I understand correctly you're suggesting to skip sdCache.refreshStorage() when connList for disconnectStorageServer is empty. Can you explain how is that relevant to the bz? My take on this is that engine shouldn't send connect/disconnectStorageServer with an empty connList and on the other side vdsm should raise an exception in that case (we would have caught this much earlier). (In reply to Federico Simoncelli from comment #5) > (In reply to Liron Aravot from comment #2) > > When executing disconnectStorageServer, storage cache refresh is triggered > > regardless of the passed connections list. > > Federico, it it safe to remove that at least in some of the cases? or do we > > want to leave it? > > If I understand correctly you're suggesting to skip sdCache.refreshStorage() > when connList for disconnectStorageServer is empty. > > > My take on this is that engine shouldn't send > connect/disconnectStorageServer with an empty connList and on the other side > vdsm should raise an exception in that case (we would have caught this much > earlier). That's exactly what I've asked, if we can avoid executing it when the connection list is empty. iirc we needed it because of the refreshStorage - can it be safely not be executed for all dc versions? thanks. (In reply to Liron Aravot from comment #6) > (In reply to Federico Simoncelli from comment #5) > > (In reply to Liron Aravot from comment #2) > > > When executing disconnectStorageServer, storage cache refresh is triggered > > > regardless of the passed connections list. > > > Federico, it it safe to remove that at least in some of the cases? or do we > > > want to leave it? > > > > If I understand correctly you're suggesting to skip sdCache.refreshStorage() > > when connList for disconnectStorageServer is empty. > > > > > > My take on this is that engine shouldn't send > > connect/disconnectStorageServer with an empty connList and on the other side > > vdsm should raise an exception in that case (we would have caught this much > > earlier). > > That's exactly what I've asked, if we can avoid executing it when the > connection list is empty. As I said I expect vdsm to raise an exception early if connList is empty (on master, not 3.5 material). (Unrelated to the solution for this bug) > iirc we needed it because of the refreshStorage - What is "it"? (Previous "it" was "refreshStorage" but here it doesn't make sense) > can it be safely not be executed for all dc versions? This is something that you're probably more qualified to answer, did anyone ever use disconnectStorageServer to trigger a refreshStorage as an undocumented side-effect? Anyway that is not relevant to this bz, this bug is about engine sending disconnectStorageServer with an empty connList where instead it should contain the connections. The discussion about an undocumented use of disconnectStorageServer and refreshStorage should take place somewhere else. (In reply to Federico Simoncelli from comment #7) > (In reply to Liron Aravot from comment #6) > > (In reply to Federico Simoncelli from comment #5) > > > (In reply to Liron Aravot from comment #2) > > > > When executing disconnectStorageServer, storage cache refresh is triggered > > > > regardless of the passed connections list. > > > > Federico, it it safe to remove that at least in some of the cases? or do we > > > > want to leave it? > > > > > > If I understand correctly you're suggesting to skip sdCache.refreshStorage() > > > when connList for disconnectStorageServer is empty. > > > > > > > > > My take on this is that engine shouldn't send > > > connect/disconnectStorageServer with an empty connList and on the other side > > > vdsm should raise an exception in that case (we would have caught this much > > > earlier). > > > > That's exactly what I've asked, if we can avoid executing it when the > > connection list is empty. > > As I said I expect vdsm to raise an exception early if connList is empty (on > master, not 3.5 material). > (Unrelated to the solution for this bug) > > > iirc we needed it because of the refreshStorage - > > What is "it"? (Previous "it" was "refreshStorage" but here it doesn't make > sense) > > > can it be safely not be executed for all dc versions? > > This is something that you're probably more qualified to answer, did anyone > ever use disconnectStorageServer to trigger a refreshStorage as an > undocumented side-effect? > > Anyway that is not relevant to this bz, this bug is about engine sending > disconnectStorageServer with an empty connList where instead it should > contain the connections. > > The discussion about an undocumented use of disconnectStorageServer and > refreshStorage should take place somewhere else. Today disconnectStorageServer/connectStorageServer is executed even without connection and triggers refreshStorage on the vdsm side, my question was to clarify whether we can safely remove that without risk on all versions, as refreshStorage() is executed we won't be able to do that. The second issue to inspect is how we choose which connections to pass on the disconnectStorageServer call and if we do pass all needed connections (which we'll probably won't change for 3.5). > Today disconnectStorageServer/connectStorageServer is executed even without
> connection and triggers refreshStorage on the vdsm side, my question was to
> clarify whether we can safely remove that without risk on all versions, as
> refreshStorage() is executed we won't be able to do that.
bad phrasing here, what i meant to ask is whether removing it can cause to regression, as it might. we won't remove the calls at least for now.
Seems like a low impact bug, with a fix the involves high risk (probably need a synchronization mechanism around connections) - deferring. This is an automated message. This Bugzilla report has been opened on a version which is not maintained anymore. Please check if this bug is still relevant in oVirt 3.5.4. If it's not relevant anymore, please close it (you may use EOL or CURRENT RELEASE resolution) If it's an RFE please update the version to 4.0 if still relevant. Target release should be placed once a package build is known to fix a issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for a oVirt release. In oVirt testing is done on single stream by default. Therefore I'm removing the 4.0 flag. If you think this bug must be tested in 4.0 as well, please re-add the flag. Please note we might not have testing resources to handle the 4.0 clone. Fixing this bug involves major code changes at the engine side and potentially at vdsm side. Approaches that we need to evaluate: * vdsm "new" conn mgmt api * deamon maintenance * ref counting at vdsm side All approaches involves major code changes hence, the decision to move this bug to 4.0.0. (In reply to Ala Hino from comment #15) > Fixing this bug involves major code changes at the engine side and > potentially at vdsm side. > > Approaches that we need to evaluate: > * vdsm "new" conn mgmt api > * deamon maintenance > * ref counting at vdsm side > > All approaches involves major code changes hence, the decision to move this > bug to 4.0.0. So why do it all? What's the REAL value in fixing this? I'd CLOSE-WONTFIX it. (In reply to Yaniv Kaul from comment #16) > (In reply to Ala Hino from comment #15) > > Fixing this bug involves major code changes at the engine side and > > potentially at vdsm side. > > > > Approaches that we need to evaluate: > > * vdsm "new" conn mgmt api > > * deamon maintenance > > * ref counting at vdsm side > > > > All approaches involves major code changes hence, the decision to move this > > bug to 4.0.0. > > So why do it all? What's the REAL value in fixing this? > > I'd CLOSE-WONTFIX it. @Allon - can we close this? (IIUC, it's merely a performance issue at best?) |
Created attachment 857550 [details] engine logs Description of problem: Engine sends DisconnectStorageServer with empty connectionList during remove data center flow: 2014-01-30 15:28:04,600 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStorageServerVDSCommand] (org.ovirt.thread.pool-6-thread-25) [43035533] START, DisconnectStorageServerVDSCommand(HostName = gold-v 2014-01-30 15:28:05,248 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStorageServerVDSCommand] (org.ovirt.thread.pool-6-thread-25) [43035533] FINISH, DisconnectStorageServerVDSCommand, return: {}, log Version-Release number of selected component (if applicable): ovirt-engine-3.4.0-0.2.master.20140112020439.git9ad8529.el6.noarch How reproducible: 100% Steps to Reproduce: DC with 1 storage domain, 1 host 1. Move domain to maintenance 2. Remove datacenter Actual results: Engine logs show DisconnectStorageServer with empty connectionList sent Expected results: If there are no servers to disconnect, there is no need to send DisconnectStorageServer Additional info: