| Summary: | Windowed functions don't work on a complex query, | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Yaniv Lavi <ylavi> | ||||
| Component: | postgresql | Assignee: | Tom Lane <tgl> | ||||
| Status: | CLOSED NOTABUG | QA Contact: | qe-baseos-daemons | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | unspecified | ||||||
| Version: | 6.1 | CC: | hhorak | ||||
| Target Milestone: | rc | ||||||
| Target Release: | --- | ||||||
| Hardware: | x86_64 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2012-01-12 06:31:09 UTC | Type: | --- | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Attachments: |
|
||||||
|
Description
Yaniv Lavi
2011-11-01 22:59:17 UTC
Created attachment 531210 [details]
The example query.
This request was evaluated by Red Hat Product Management for inclusion in the current release of Red Hat Enterprise Linux. Because the affected component is not scheduled to be updated in the current release, Red Hat is unfortunately unable to address this request at this time. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux. If you would like it considered as an exception in the current release, please ask your support representative. AFAICT, the bug is in the query not in Postgres. AVG() with an OVER clause is not an aggregate function, it is a window function. In this example, since there is a GROUP BY, the window function is supposed to act on the grouped rows, so it must have an argument expression that is well-defined for a grouped row. And receive_rate_percent isn't well-defined, because it is not one of the grouping columns. It is a bit unhelpful that Postgres doesn't show an error cursor position :-(. I'll see about improving that. But I don't believe the query is valid. From my experience with other databases, windowed functions are performed after the select is created in memory. this means that it doesn't matter what is the select, only the partition. The fact that postgres is not able to perform this select is a major disadvantage. I don't understand why postgres is not able to run aggregate functions and windowed functions at the same time. grouping should ignore windowed functions. (In reply to comment #5) > grouping should ignore windowed functions. I'm sorry, but you're wrong, and I challenge you to find another DBMS that will execute this query as written. The SQL standard is perfectly clear that a window function operates on the group rows when there is a GROUP BY. In SQL:2008, section 4.14.9 Windowed tables, third paragraph: If a <table expression> is grouped and also has a window, then there is a syntactic transformation that segregates the grouping into a <derived table>, so that the window partitions consist of rows of the <derived table> rather than groups of rows. (In reply to comment #6) > (In reply to comment #5) > > grouping should ignore windowed functions. > > I'm sorry, but you're wrong, and I challenge you to find another DBMS that will > execute this query as written. The SQL standard is perfectly clear that a > window function operates on the group rows when there is a GROUP BY. In > SQL:2008, section 4.14.9 Windowed tables, third paragraph: > > If a <table expression> is grouped and also has a window, then there is a > syntactic transformation that segregates the grouping into a <derived table>, > so that the window partitions consist of rows of the <derived table> rather > than groups of rows. But this still means the query will run on the rows according to the partition you define on the derived table. postgres fails to do this. The point is that the query is attempting to refer to a column that doesn't exist in the derived grouped table. The grouped table contains the grouping columns, the results of plain aggregate functions over each group, and nothing else. Window functions run *after* plain aggregate functions, and are logically a separate level of query operations. (The syntax the SQL committee chose for them makes this remarkably obscure :-(). Perhaps it would be more profitable to discuss how to write this query correctly. I'm not sure what it is you are actually trying to get the average of? The Rx_rate and Tx_rate columns are averages over each vm_interface_id/interface_name/history_datetime group, but what are you trying to accomplish with window_col? I want the average of receive_rate_percent by the id without minding the datetime. (In reply to comment #9) > I want the average of receive_rate_percent by the id without minding the > datetime. Hm. Not sure if it's possible to do that in the same query that produces this other slicing of the data. You could get an approximation to it with avg(avg(receive_rate_percent)) OVER (PARTITION BY v3_0_vm_interface_daily_history_view.vm_interface_id) which would produce the average of the per-datetime averages. (Here, the inner avg() is a plain aggregate and the outer one is a window function.) However, if the per-datetime groups contain significantly different numbers of observations (original rows), you'd be weighting the observations in the smaller groups more heavily than those in the larger groups. Not knowing the table design, I don't know if this is a problem or not. If that doesn't work for you then I think you need to do a bunch of sub-selects, that is (SELECT avg(receive_rate_percent) FROM tables WHERE vm_interface_id matches the outer value) AS window_col Closing -- I see no bug here. |