RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 830326 - mysql-server poor correlated subquery performance
Summary: mysql-server poor correlated subquery performance
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: mysql
Version: 6.0
Hardware: x86_64
OS: Linux
unspecified
high
Target Milestone: rc
: ---
Assignee: Jakub Dorňák
QA Contact: qe-baseos-daemons
URL:
Whiteboard:
Depends On:
Blocks: 947775 1070830 1159820
TreeView+ depends on / blocked
 
Reported: 2012-06-08 20:57 UTC by Charles Kloote
Modified: 2015-10-14 06:48 UTC (History)
5 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-10-14 06:48:06 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
DB schema for bug recreation test (200.10 KB, application/x-gzip)
2012-07-18 21:16 UTC, Charles Kloote
no flags Details

Description Charles Kloote 2012-06-08 20:57:27 UTC
Description of problem:
Experiencing the issue as described in mysql bug #64557.

http://bugs.mysql.com/bug.php?id=64557

Version-Release number of selected component (if applicable):

mysql-server-5.1.61-1.el6_2.1.x86_64

How reproducible:

Launch a particular query against our dataset with multiple joins and a correlated subquery.

Actual results:
Query does not return result and must eventually be killed.

Expected Results:
Query should return very quickly. On RHEL5 with mysql-server 5.0.X, the same query against the same dataset returns over 1000 rows in 0.33 seconds.

Additional information:
I can't share our internal data publicly, but I'm willing to help reproduce if needed. I'm mainly curious if the fix from 5.6.5 described in the mysql bug can get backported into the RHEL 6 5.1 package, assuming this is the same issue.

Comment 2 Tom Lane 2012-07-17 15:23:08 UTC
That mysql bug is pretty devoid of any concrete information :-(.  Can you put together a self-contained test case?

Comment 3 Charles Kloote 2012-07-18 13:05:38 UTC
Yeah, we'll see if we can make it happen with a dummy schema and data, then update this bug. May take a few days to put something together.

Comment 4 Tom Lane 2012-07-18 14:05:30 UTC
Thanks.  Just to clarify the situation: the mysql 5.6.5 release notes (http://dev.mysql.com/doc/refman/5.6/en/news-5-6-5.html) indicate that quite a lot of work was done in the area of optimizing "foo IN (sub-query)" constructs; it looks like a near-total rewrite of that section of the optimizer.  If that's what fixed your problem, there's no chance of maintaining a back-port into 5.1 or even 5.5.  The only thing that gives me some hope is that the test query shown in upstream bug 64557 includes GROUP BY in the sub-query, but the documentation at http://dev.mysql.com/doc/refman/5.6/en/semi-joins.html says that the new semi-join handling doesn't cover such cases.  So for this specific case, the fix might not have come from the semi-join rewrite but from some more localized change that we might stand a chance of back-porting.

So in short, we need to see a test case that pretty exactly embodies the specific pain point you're having, and then we can try to trace what fixed it.  It may well end up that the only sane answer will be "you gotta move to 5.6 when it's out", but we'll take a look at it.

Comment 5 Charles Kloote 2012-07-18 21:16:59 UTC
Created attachment 599002 [details]
DB schema for bug recreation test

Comment 6 Charles Kloote 2012-07-18 21:22:04 UTC
I have uploaded a small gzipped MySQL dump. Here's a query against that dump that demonstrates the problem:

SELECT foo.foo_id, foo.data, bar.bar_id, bar.timestamp FROM foo JOIN bar on foo.foo_id = bar.foo_id WHERE ( bar.foo_id = ( SELECT bar.foo_id FROM bar WHERE bar.foo_id = foo.foo_id ORDER BY bar.timestamp DESC LIMIT 1 ) ) LIMIT 1000;

On RHEL5 with MySQL 5.0.45, the results return nearly instantly.
On RHEL6 with MySQL 5.1.61 however: 1000 rows in set (1.93 sec)

On much larger tables with real production data, that execution time gets much much longer.

Let me know if you have any questions. Yeah, if this isn't something that can be backported, we do have workarounds by breaking this up into two queries, but I'd still be curious to learn the results of your testing.

Comment 7 Tom Lane 2012-07-18 21:47:50 UTC
Thanks for the test case!  Just by eyeball, it does not look like a candidate to be a semi-join, so maybe there's a chance of doing something for you.

Honza Horak had expressed interest in studying this, so I'm going to let him do the initial research.

Comment 8 Honza Horak 2012-07-19 11:57:33 UTC
Well, I did some testing and it seems the reality is a bit different than we expected. I used dataset and test query from comment #6 from Charles tried it on mysql-5.0.95, mysql-5.5.24 and mysql-5.6.6-m9, which came right from upstream repository. Here are results:

----------------------------
| mysql-5.0.95   | ~0.02 s |
| mysql-5.5.24   | ~4.00 s |
| mysql-5.6.6-m9 | ~2.00 s |
----------------------------

So the performance issue doesn't seem to be actually fixed even in the present source code, which means Charles probably hit another issue than #64557.

Going a bit deeper, EXPLAIN shows the same result in mysql-5.5.24 and in mysql-5.6.6:

+----+--------------------+-------+-------+---------------+
| id | select_type        | table | type  | possible_keys |
+----+--------------------+-------+-------+---------------+
|  1 | PRIMARY            | foo   | ALL   | PRIMARY       |
|  1 | PRIMARY            | bar   | ref   | foo_id        |
|  2 | DEPENDENT SUBQUERY | bar   | index | foo_id        |
+----+--------------------+-------+-------+---------------+
-----------+---------+------+------+-------------+
 key       | key_len | ref  | rows | Extra       |
-----------+---------+------+------+-------------+
 NULL      | NULL    | NULL |  885 | Using where |
 foo_id    | 4       | func |   49 | Using where |
 timestamp | 4       | NULL |    1 | Using where |
-----------+---------+------+------+-------------+

...meanwhile mysql-5.0.95 shows a bit different EXPLAIN:

+----+--------------------+-------+------+---------------+--------+
| id | select_type        | table | type | possible_keys | key    |
+----+--------------------+-------+------+---------------+--------+
|  1 | PRIMARY            | foo   | ALL  | PRIMARY       | NULL   |
|  1 | PRIMARY            | bar   | ref  | foo_id        | foo_id |
|  2 | DEPENDENT SUBQUERY | bar   | ref  | foo_id        | foo_id |
+----+--------------------+-------+------+---------------+--------+
---------+----------------------+------+-----------------------------+
 key_len | ref                  | rows | Extra                       |
---------+----------------------+------+-----------------------------+
 NULL    | NULL                 | 1099 | Using where                 | 
 4       | func                 |   50 | Using where                 | 
 4       | bug830326.foo.foo_id |   50 | Using where; Using filesort | 
---------+----------------------+------+-----------------------------+

That means that the use of timestamp index significantly slows down the query, which is what one wouldn't expect. I'd say it is a bug, but I haven't found such bug report at bugs.mysql.com. Charles, would you mind reporting that performance issue or shall I?

Anyway, using "IGNORE INDEX (timestamp)" for bar table in the subquery I got the same fast results as in mysql-5.0.95, so it could be a fix for Charles at least.

Comment 9 Charles Kloote 2012-07-19 18:23:46 UTC
Since you're working directly with the upstream MySQL source it probably makes the most sense for you to file the bug, if that works for you. I can confirm that doing IGNORE INDEX (timestamp) on that query improves performance to 0.08 sec, so that definitely gives us a good workaround for the time being.

Comment 10 Honza Horak 2012-07-20 07:02:15 UTC
Reported as http://bugs.mysql.com/bug.php?id=65957.

Comment 11 RHEL Program Management 2012-12-14 08:03:42 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 14 Honza Horak 2015-10-14 06:48:06 UTC
Upstream only fixed this in 5.6+ releases and doesn't plan to backport it to older releases. Due to big differences in planner implementation we're not able to backport the fix into 5.1 and since using downstrea-only patch would be too risky, I'm closing this bug and don't plan to fix it.

Please, mind that there is still a work-around to specify "IGNORE INDEX" to force planner to choose better execution plan.


Note You need to log in before you can comment on or make changes to this bug.