Bug 1363759
| Summary: | Dashboard - storage values are not refreshed correctly and shows zeros | | |
|---|---|---|---|
| Product: | [oVirt] ovirt-engine-dashboard | Reporter: | Lukas Svaty <lsvaty> |
| Component: | Core | Assignee: | Alexander Wels <awels> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Pavel Novotny <pnovotny> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | unspecified | CC: | awels, bugs, lsvaty, mgoldboi, oourfali, sradco |
| Target Milestone: | ovirt-4.0.6 | Flags: | rule-engine: ovirt-4.0.z+ mgoldboi: planning_ack+ oourfali: devel_ack+ lsvaty: testing_ack+ |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | All | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-11-15 11:41:35 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | UX | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Lukas Svaty
2016-08-03 13:39:12 UTC
As mentioned in the original bug, I don't see how we can fix that without polling DWH very frequently, which will have an impact on performance. I don't think this should block the dashboard RFE. Currently targeting it to 4.1 for consideration there.

Why are the zeros displayed and not the last status gathered from DWH?

(In reply to Lukas Svaty from comment #2)
> Why are the zeros displayed and not the last status gathered from DWH?

Alexander?

Because the last status from the DWH actually returns a negative value. Basically the following happens:

1. DWH reads data from the engine database, and the dashboard gets the values and shows them.
2. User detaches a storage domain from the engine.
3. The storage domain is flagged as unavailable in one of the tables in the DWH, but the used data is not changed.
4. The query looks up the total available storage (which is now down one storage domain), but the used data has not been updated yet, so that is still the value from before.
5. Available = total - used. But total is lower now while used is not. This can lead to negative values. We have code in place that basically sets a value to 0 if it is negative.
6. After a while the used value is updated as well, and everything is fine.

IMO the problem is the different intervals at which some of the data is updated in the DWH.

(In reply to Alexander Wels from comment #4)
> Because the last status from the DWH actually returns a negative value.
> Basically the following happens:
>
> 1. DWH reads data from the engine database, and the dashboard gets the values
> and shows them.
> 2. User detaches a storage domain from the engine.
> 3. The storage domain is flagged as unavailable in one of the tables in the
> DWH, but the used data is not changed.
> 4. The query looks up the total available storage (which is now down one
> storage domain), but the used data has not been updated yet, so that is
> still the value from before.

Do both the total and used come from DWH?

> 5. Available = total - used. But total is lower now while used is not. This
> can lead to negative values. We have code in place that basically sets a
> value to 0 if it is negative.
> 6. After a while the used value is updated as well, and everything is fine.
>
> IMO the problem is the different intervals at which some of the data is
> updated in the DWH.

Yes, everything in the dashboard comes from the DWH, with the exception of the inventory cards at the top. So total/used comes from the DWH database.

Shirly, what do you suggest we do? We take the last 5 minutes of data, so the total in the latest sample doesn't contain the last domain, but the previous samples do contain it. How can we overcome this?

Actually, I took another look at the queries and this is the problem:

The total query basically looks at the last sample to determine whether it should include an SD in the calculation or not. So when you detach the SD, the next sample will immediately exclude the detached SD.

The used query takes the average used over the last 5 minutes (this is what is shown in the center donut, and this is also the reason we have 2 queries, one for total and one for used). Now if you detach the SD, the last sample will not be included in the average, but the previous 4 will be.

We have several options to fix this:

1. Make the total an average over the last 5 minutes like the used.
2. Make the used not an average, but simply look at the last sample.
3. Modify the query to exclude all samples from the average if the last sample says the SD is not active.

@Moran,
Which option would you like?

(In reply to Alexander Wels from comment #8)
> Actually, I took another look at the queries and this is the problem:
>
> The total query basically looks at the last sample to determine whether it
> should include an SD in the calculation or not. So when you detach the SD,
> the next sample will immediately exclude the detached SD.
>
> The used query takes the average used over the last 5 minutes (this is what
> is shown in the center donut, and this is also the reason we have 2 queries,
> one for total and one for used). Now if you detach the SD, the last sample
> will not be included in the average, but the previous 4 will be.
>
> We have several options to fix this:
> 1. Make the total an average over the last 5 minutes like the used.
> 2. Make the used not an average, but simply look at the last sample.
> 3. Modify the query to exclude all samples from the average if the last
> sample says the SD is not active.
>
> @Moran,
> Which option would you like?

The most appealing to me would be option 2, since I think it gives the current status, and the nature of storage statistics is different from CPU and memory, which are very dynamic and need to be normalized. What do you think would be the downsides of going with this option?

I personally don't see any downsides to any of the options; in certain circumstances they will give slightly different data, all of which is valid IMO. If you want to go with option #2, I will implement that.

(In reply to Alexander Wels from comment #10)
> I personally don't see any downsides to any of the options; in certain
> circumstances they will give slightly different data, all of which is valid
> IMO. If you want to go with option #2, I will implement that.

Let's just make sure that if we make this change, we do it in a consistent manner across the dashboard.

(In reply to Moran Goldboim from comment #11)
> (In reply to Alexander Wels from comment #10)
> > I personally don't see any downsides to any of the options; in certain
> > circumstances they will give slightly different data, all of which is valid
> > IMO. If you want to go with option #2, I will implement that.
>
> Let's just make sure that if we make this change, we do it in a
> consistent manner across the dashboard.

Can you please attach a screenshot of the issue?

I was not able to reproduce this issue; it was fixed in a previous release. Thus closing.
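
For readers following the thread, the mismatch between the two queries can be sketched in a few lines of Python. This is only an illustration under assumptions, not the dashboard's actual queries: the sample layout, the domain names `sd1`/`sd2`, the sizes, and all function names are hypothetical. It models five one-minute samples in which `sd2` is detached just before the last sample, then compares the behavior described above (total from the last sample, used averaged over the window, negative results clamped to 0) with option 2 (used taken from the last sample as well).

```python
# Hypothetical data model: one dict per one-minute sample, oldest first,
# mapping storage-domain name -> (active, total_gb, used_gb).
both_attached = {"sd1": (True, 100, 20), "sd2": (True, 500, 450)}
sd2_detached = {"sd1": (True, 100, 20), "sd2": (False, 500, 450)}
samples = [both_attached] * 4 + [sd2_detached]


def total_from_last_sample(samples):
    """Total capacity of the domains that are active in the newest sample."""
    return sum(total for active, total, _ in samples[-1].values() if active)


def used_averaged(samples):
    """Average over the window of the used space of active domains per sample."""
    per_sample = [
        sum(used for active, _, used in s.values() if active) for s in samples
    ]
    return sum(per_sample) / len(per_sample)


def used_from_last_sample(samples):
    """Option 2: used space of the domains active in the newest sample only."""
    return sum(used for active, _, used in samples[-1].values() if active)


def available(total, used):
    """Available = total - used, with negative results clamped to 0."""
    return max(0, total - used)


total = total_from_last_sample(samples)      # 100: sd2 is already excluded
avg_used = used_averaged(samples)            # 380.0: four samples still count sd2
print(available(total, avg_used))            # 0, because 100 - 380 is negative
print(available(total, used_from_last_sample(samples)))  # 80 with option 2
```

With both values taken from the same sample, total and used always describe the same set of storage domains, so the subtraction cannot go negative merely because a domain was detached inside the averaging window.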