Archive for the ‘range’ Category

On queries with many values in the IN clause

Апрель 10th, 2012
A few customers with rather extreme needs have contacted us about a performance issue with the range optimizer. Our solution to the problem is to introduce a new variable in MySQL 5.6, eq_range_index_dive_limit, which can be used to control whether or not the range optimizer will a) do index dives, or b) use index statistics when estimating the number of rows in the ranges of the query. The former method gives a far more accurate estimate while the latter costs a lot less to compute.

This is what the help text has to tell about the variable:

The optimizer will use existing index statistics instead of doing index dives for equality ranges if the number of equality ranges for the index is larger than or equal to [the value of variable]. If set to 0, index dives are always used.
"Equality range" means predicates using operators IN() or =, and it's important to notice that the number of such ranges is counted on a per index basis. Furthermore, index dives are not used on unique/primary indexes, so only non-unique indexes are affected. 

The two ways row estimates can be computed

For as long as there have been a range access method in MySQL, the number of rows in a range has been estimated by diving down the index to find the start and end of the range and use these to count the number of rows between them. This technique is accurate, and is therefore a good basis to make the best possible execution plan (QEP) for the query. Unfortunately, it involves two index dives for each range. That's not a big deal if you run normal queries with a few equality ranges like

SELECT title FROM albums WHERE artist IN (10, 12)
But some MySQL users have queries with hundreds of values in the IN clause. For such queries it may take considerably longer to optimize the query than to execute it. This is when it makes sense to use the less accurate index statistics. Consider this example:

Read more »

PlanetMySQL Voting: Vote UP / Vote DOWN

On queries with many values in the IN clause

Апрель 10th, 2012
A few customers with rather extreme needs have contacted us about a performance issue with the range optimizer. Our solution to the problem is to introduce a new variable in MySQL 5.6, eq_range_index_dive_limit, which can be used to control whether or not the range optimizer will a) do index dives, or b) use index statistics when estimating the number of rows in the ranges of the query. The former method gives a far more accurate estimate while the latter costs a lot less to compute.

This is what the help text has to tell about the variable:

The optimizer will use existing index statistics instead of doing index dives for equality ranges if the number of equality ranges for the index is larger than or equal to [the value of variable]. If set to 0, index dives are always used.
"Equality range" means predicates using operators IN() or =, and it's important to notice that the number of such ranges is counted on a per index basis. Furthermore, index dives are not used on unique/primary indexes, so only non-unique indexes are affected. 

The two ways row estimates can be computed

For as long as there have been a range access method in MySQL, the number of rows in a range has been estimated by diving down the index to find the start and end of the range and use these to count the number of rows between them. This technique is accurate, and is therefore a good basis to make the best possible execution plan (QEP) for the query. Unfortunately, it involves two index dives for each range. That's not a big deal if you run normal queries with a few equality ranges like

SELECT title FROM albums WHERE artist IN (10, 12)
But some MySQL users have queries with hundreds of values in the IN clause. For such queries it may take considerably longer to optimize the query than to execute it. This is when it makes sense to use the less accurate index statistics. Consider this example:

Read more »

PlanetMySQL Voting: Vote UP / Vote DOWN

Index Condition Pushdown to the rescue!

Март 19th, 2012
A while ago, I explained how range access in a multiple-part index works and why MySQL can't utilize key parts beyond the first occurrence of some often used comparison operators. Luckily, there is a great improvement underway in MySQL 5.6 that will remedy much of this limitation. Meet Index Condition Pushdown.

How does ICP work?

Index Condition Pushdown is a new way for MySQL to evaluate conditions. Instead of evaluating conditions on rows read from a table, ICP makes it possible to evaluate conditions in the index and thereby avoid looking at the table if the condition is false.

Let's assume that we have a multiple-part index covering columns (keypart_1, ..., keypart_n). Further assume that we have a condition with a comparison operator on keypart_1 that does not allow keypart_2,...,keypart_n to be used as a range condition (more on that here).

When MySQL does range access without ICP enabled, the index is traversed to locate rows in the table that are within the range(s) relevant for the conditions on keypart_1. The rows in these ranges are read from the table and returned from the storage engine to the MySQL server. The server then evaluates the remaining conditions, including those that apply to keypart_2,..,keypart_n.

Read more »

PlanetMySQL Voting: Vote UP / Vote DOWN

Optimizer tracing: how to configure it

Октябрь 4th, 2011
In this blog post, my colleague Jørgen Løland described a new feature of MySQL 5.6: Optimizer Tracing. I recommend reading his article, as it presents this new feature in a simple, easy-to-read manner.

The Optimizer Tracing feature can help understanding what the Optimizer is doing; it is available since milestone 5.6.3, announced October 3rd at Oracle Open World (here is the changelog). It's good to see it mature now; I remember that Sergey Petrunia did the first prototype back in March 2009!

Today  I will be giving some must-have tips related to handling big traces.

First thing to know, a trace lives in main memory (internally it is allocated on the heap or free store of the MySQL Server). An SQL statement which gives the optimizer a lot of work (for example, by joining many tables) will generate a large trace. Up to gigabytes in some pathological cases! To avoid hogging memory then, a trace will never grow beyond the value of the @@optimizer_trace_max_mem_size session variable. Which has a default of 16 kilobytes; yes, it's a low value, to protect the innocent. I often raise it in my session, to 1 megabyte, which is enough for most queries in my daily work:

mysql> SET OPTIMIZER_TRACE_MAX_MEM_SIZE=1000000;

If a trace was truncated because exceeding this limit, the column INFORMATION_SCHEMA.OPTIMIZER_TRACE.MISSING_BYTES_BEYOND_MAX_MEM_SIZE shows a non-zero value: the number of bytes which could not be added to the trace.

When I read a trace I like to scroll up and down in it, search for some optimization phase (that means searching for some keyword in the trace); to do this, I set, in the "mysql" command-line client, the pager to "less" (I'm using Unix):

mysql> pager less;

This is very useful. I have all "less" commands under hand: can save to a file, can search for a regular expression match, search forward or backward... One last benefit is that when I have finished reading, quitting "less" makes the trace go away from the terminal, it does not linger on my screen forever, does not occupy the terminal's display history...

When there are many tables to join, as I said above the trace can grow a lot, because the Optimizer evaluates many possible join orders (using an algorithm known as "greedy search"): each evaluated join order is mentioned in the trace. Greedy search ends up being the greatest part of the trace. What if I want to see the trace of what happens in the Optimizer after greedy search is complete? If I set @@optimizer_trace_max_mem_size to a low value, it will trim greedy search and what follows. If I set @@optimizer_trace_max_mem_size to a high value, to see what I want, greedy search will be traced which will possibly exceed the amount of memory I can afford on this machine... It would be nice if I could tell the system: "please do not trace this and that". In my case, it would be "please do not trace greedy search".

As an example, let's consider twenty tables, t1...t20, all similar to this one:

mysql> CREATE TABLE t1 (a INT, INDEX(a)) ENGINE=MYISAM;
mysql> INSERT INTO t1 VALUES(x),(y); # x and y being some integers

and let's run this query, after turning tracing on:
mysql> SET OPTIMIZER_TRACE="ENABLED=ON,END_MARKER=ON";
mysql> EXPLAIN SELECT 1 FROM t1  JOIN t2 ON t2.a=t1.a JOIN t3 ON t3.a=t2.a JOIN t4 ON t4.a=t3.a JOIN t5 ON t5.a=t4.a JOIN t6 ON t6.a=t5.a JOIN t7 ON t7.a=t6.a JOIN t8 ON t8.a=t7.a JOIN t9 ON t9.a=t8.a JOIN t10 ON t10.a=t9.a JOIN t11 ON t11.a=t10.a JOIN t12 ON t12.a=t11.a JOIN t13 ON t13.a=t12.a JOIN t14 ON t14.a=t13.a JOIN t15 ON t15.a=t14.a JOIN t16 ON t16.a=t15.a JOIN t17 ON t17.a=t16.a JOIN t18 ON t18.a=t17.a JOIN t19 ON t19.a=t18.a JOIN t20 ON t20.a=t19.a;
+----+-------------+-------+-------+---------------+------+---------+------------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+------------+------+--------------------------+
| 1 | SIMPLE | t1 | index | a | a | 5 | NULL | 2 | Using where; Using index |
| 1 | SIMPLE | t2 | ref | a | a | 5 | test.t1.a | 1 | Using where; Using index |
| 1 | SIMPLE | t3 | ref | a | a | 5 | test.t2.a | 1 | Using where; Using index |
| 1 | SIMPLE | t4 | ref | a | a | 5 | test.t1.a | 1 | Using index |
| 1 | SIMPLE | t5 | ref | a | a | 5 | test.t1.a | 1 | Using index |
| 1 | SIMPLE | t6 | ref | a | a | 5 | test.t1.a | 1 | Using index |
| 1 | SIMPLE | t7 | ref | a | a | 5 | test.t1.a | 1 | Using index |
| 1 | SIMPLE | t8 | ref | a | a | 5 | test.t1.a | 1 | Using index |
| 1 | SIMPLE | t9 | ref | a | a | 5 | test.t1.a | 1 | Using where; Using index |
| 1 | SIMPLE | t10 | ref | a | a | 5 | test.t9.a | 1 | Using where; Using index |
| 1 | SIMPLE | t11 | ref | a | a | 5 | test.t1.a | 1 | Using where; Using index |
| 1 | SIMPLE | t12 | ref | a | a | 5 | test.t11.a | 1 | Using where; Using index |
| 1 | SIMPLE | t13 | ref | a | a | 5 | test.t12.a | 1 | Using where; Using index |
| 1 | SIMPLE | t14 | ref | a | a | 5 | test.t12.a | 1 | Using where; Using index |
| 1 | SIMPLE | t15 | ref | a | a | 5 | test.t12.a | 1 | Using where; Using index |
| 1 | SIMPLE | t16 | ref | a | a | 5 | test.t12.a | 1 | Using where; Using index |
| 1 | SIMPLE | t17 | ref | a | a | 5 | test.t12.a | 1 | Using where; Using index |
| 1 | SIMPLE | t18 | ref | a | a | 5 | test.t11.a | 1 | Using where; Using index |
| 1 | SIMPLE | t19 | ref | a | a | 5 | test.t12.a | 1 | Using where; Using index |
| 1 | SIMPLE | t20 | ref | a | a | 5 | test.t11.a | 1 | Using where; Using index |
+----+-------------+-------+-------+---------------+------+---------+------------+------+--------------------------+
As optimizer developer, it catches my eye that some tables have "Using where" and others don't. "Using where" tells me that a condition is evaluated after fetching a row from the table, but does not tell me what condition. To know more, I will look at the trace. But I'm not interested in the trace of greedy search, which likely accounts, in this 20-table example, for most of the total trace, and would just make my reading less comfortable, or even hog memory, if I had more than those 20 tables or more indices on them, or more complex conditions. So I do:

mysql> SET OPTIMIZER_TRACE_FEATURES="GREEDY_SEARCH=OFF";
and run the EXPLAIN again. To show the resulting trace, I'm posting a screenshot of how it is displaid by the JsonView Firefox add-on. Using this add-on requires taking two precautions:
  • turning off the end_marker flag of @@optimizer_trace (this flag is good for human-readability but is not JSON-compliant)
  • sending the trace to a file without escaping of newlines (which is why I use INTO DUMPFILE instead of INTO OUTFILE):

mysql> SET OPTIMIZER_TRACE="END_MARKER=OFF";
mysql> SELECT TRACE INTO DUMPFILE "/tmp/trace.json" FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE;

Here's the trace now in Firefox:

We see how greedy_search=off has eliminated the trace of greedy search ("considered_execution_plans"), replacing it with just an ellipsis ("...")! Then there is what I wanted to see: the part after greedy search, named "attached_conditions_summary", which describes what condition is behind each "Using where" in EXPLAIN. There are equality conditions of course. Some coming directly from the "JOIN ON" conditions. Some deduced by equality propagation (if t1.a=t2.a and t2.a=t3.a then t1.a=t3.a, a new condition which we see is attached to t3). There are also "IS NOT NULL" conditions; indeed, an equality condition of the form t1.a=t2.a allows us to deduce that both t1.a and t2.a are not NULL, so if rows of t1 are retrieved first they can be filtered with this deduced condition, known as early NULL filtering.

There are other flags in @@optimizer_trace_features; we added a flag for each feature which we think can make the trace grow unreasonably large:
  • greedy search
  • repeated execution of subqueries (one execution per row)
  • repeated range optimization (known as "range checked for each record" in EXPLAIN)
  • range optimization in general.
I think I have shown enough for today. A positive note to finish: in the Optimizer team, this tracing has already helped us to debug some problems, so has started to fulfill its goal!

For the curious: the Optimizer Trace is covered in full detail, including the above, and more, in a chapter of the "MySQL Internals" manual.

PlanetMySQL Voting: Vote UP / Vote DOWN

Optimizer tracing: Query Execution Plan descriptions beyond EXPLAIN

Октябрь 4th, 2011
Understanding why MySQL chooses a particular join order or why table scan is chosen instead of range scan is often very hard even for experienced MySQL users. Two almost identical queries, differing only in constant values, may produce completely different plans. That's why we're introducing a great new feature in 5.6: Optimizer Tracing. The target users of this feature are developers and MySQL users experienced enough to understand the ins and outs of EXPLAIN.

What Optimizer Tracing is
You may already have guessed this, but optimizer tracing is a printout  of important decisions the MySQL optimizer has done during the process of making the Query Execution Plan.

The trace is presented in JSON format which is easy to read both for humans and others.

Currently, the optimizer trace includes short explanations for:
  • Why the join order was chosen.
  • Important query transformations like IN to EXISTS.
  • The access methods applicable and why one of them was chosen.
  • Which conditions are attached to each table (i.e., what hides behind "Using where" in EXPLAIN)
More coverage will be added as we go along.


What Optimizer Tracing is NOT
The feature is not, and never will be, complete in the sense that it does not describe all choices the optimizer does. However, we're going to add more information when we find valuable things to add. We will not guarantee backwards compatibility like we do for EXPLAIN. This gives us the freedom to improve tracing at a high pace without going through multi-release deprecation warnings etc. In other words: optimizer tracing is not a replacement for EXPLAIN.

A quick tour
The best way get a feeling of optimizer tracing is to give it a try. Below is a teaser:
Read more »

PlanetMySQL Voting: Vote UP / Vote DOWN

Tips and tricks: Killer response time for non-overlapping intervals

Сентябрь 27th, 2011
Assume you have a table where you store non-overlapping intervals using two columns, e.g. IP ranges. IP ranges are simple to represent using integer notation:

CREATE TABLE ip_owner (
   owner_id int NOT NULL,
   /* some columns */
   ip_start_int bigint NOT NULL,      /* IP address converted to integer */
   ip_end_int bigint NOT NULL,        /* IP address converted to integer */
   PRIMARY KEY (owner_id),
   INDEX ip_range (ip_start_int, ip_end_int)
) ENGINE=InnoDB;


And then you find yourself in a situation where you want to know who, if anyone, owns the IP address X. This can be done using the following query:

SELECT * FROM ip_owner WHERE ip_start_int <= X AND ip_end_int >= X;

MySQL can resolve this using a range scan, but will unfortunately only be able to use the ip_start_int <= X part of the condition as a range as explained here. Thus, the query will either be resolved by range scan if fairly few records have ip_start_int <= X or table scan otherwise. That means unreliable response time because it will be much quicker to query low-valued IPs than high valued IPs. I inserted 1M records into the table before running the queries below:

Read more »

PlanetMySQL Voting: Vote UP / Vote DOWN

The MySQL range access method explained

Август 23rd, 2011
The range access method uses an index to read a subset of rows that form one or multiple continuous index value intervals. The intervals are defined by the query's range predicates, which are comparisons using any of =, <=>, IN(), IS NULL, IS NOT NULL, >, <, >=, <=, BETWEEN, !=, <> or LIKE.

Some examples:
SELECT * FROM blog WHERE author_id IN (1, 7, 8, 10)
SELECT * FROM orders WHERE value > 1000

You know that the range access method is used when EXPLAIN shows type=range.

Naturally, there has to be an index on the column used by the range predicate. Since indexes are ordered, MySQL will, for each interval, dive down the index using the interval start value and read it's way through the index leaves until it reaches the interval end value:



Read more »

PlanetMySQL Voting: Vote UP / Vote DOWN

The MySQL range access method explained

Август 23rd, 2011
The range access method uses an index to read a subset of rows that form one or multiple continuous index value intervals. The intervals are defined by the query's range predicates, which are comparisons using any of =, <=>, IN(), IS NULL, IS NOT NULL, >, <, >=, <=, BETWEEN, !=, <> or LIKE.

Some examples:
SELECT * FROM blog WHERE author_id IN (1, 7, 8, 10)
SELECT * FROM orders WHERE value > 1000

You know that the range access method is used when EXPLAIN shows type=range.

Naturally, there has to be an index on the column used by the range predicate. Since indexes are ordered, MySQL will, for each interval, dive down the index using the interval start value and read it's way through the index leaves until it reaches the interval end value:



Read more »

PlanetMySQL Voting: Vote UP / Vote DOWN

Data Warehousing Best Practices: Comparing Oracle to MySQL, part 2 (partitioning)

Июль 30th, 2010

At Kscope this year, I attended a half day in-depth session entitled Data Warehousing Performance Best Practices, given by Maria Colgan of Oracle. My impression, which was confirmed by folks in the Oracle world, is that she knows her way around the Oracle optimizer.

See part 1 for the introduction and talking about power and hardware. This part will go over the 2nd “P”, partitioning. Learning about Oracle’s partitioning has gotten me more interested in how MySQL’s partitioning works, and I do hope that MySQL partitioning will develop to the level that Oracle partitioning does, because Oracle’s partitioning looks very nice (then again, that’s why it costs so much I guess).

Partition – Larger tables or fact tables can benefit from partitioning because it makes data load easier and can increase join performance and use data elimination. Parallel execution can be done with partitioning due to partition pruning. The degree of parallelism should be a power of 2, because of hash-based algorithm in hash partitioning. To translate this to the MySQL world, if you are using LINEAR HASH partitioning, then you should use a degree of parallelism that is a power of 2 (I checked, and indeed. Otherwise, use a degree of parallelism that makes sense given the number of partitions you have.

One important note that during Pythian’s testing of MySQL partitioning, we found that all partitions were locked when an INSERT occurs, for the duration of the INSERT. Bulk-loading with MySQL partitioning is not as fast as it would be if MySQL allowed partition pruning for INSERTs.

So, what should be partitioned? For the first level of partitioning, the goal is to enable partitioning pruning and simplify data management. The most typical partitioning is range or interval partitioning on a date column. Interval partitioning is you say what the partition is (date, month) and partition is automatically created. MySQL does not have interval partitioning, and I have seen typical first-level partitioning be range or list based on a date or timestamp column. Note that if you use a timestamp field, the partitioning expression is optimized if you use TO_DAYS(timestamp_field) or YEAR(timestamp_field). In my experience, using anything else (such as DATE(timestamp_field)) actually makes partitioning slower than not using partitioning at all. Note that this is based on tests I did a few months ago, and your mileage may vary.

So — how do you decide partitioning strategy? Ask yourself:

  • What range of data do the queries touch – a quarter, a year?
  • What is the data loading frequency?
  • Is an incremental load required?
  • How much data is involved, a day, a week, a month?

The answers to the above questions will tell you about how big your interval needs to be. The best scenario is that all answers are the same, “we load every day, and people query by day.” If the answers are different weight access a higher priority than loading, because most people care more about query performance than performance of ETL.

This is true even if your intervals have different sizes — ie sales per day are much bigger in Dec but that’s OK. However, Maria recommends that the subpartition be as evenly divided as possible.

Easier to look at more partitions than to look at a partition that’s too big. But you don’t want too many partitions, max Oracle allows partitions is 1 million partitions, prior to 11g it was 64,000. “Stick closer to 64,000 than 1 million”. MySQL’s limitation is 1024 per table.

For the second level of partitioning, also called subpartitioning, the goal is to allow for multi-level pruning and improve join performance. In Oracle, the most typical subpartition is hash or list – in MySQL, you can only subpartition by hash or key.

How do you decide subpartitioning strategy?

  • Select the dimension queried most frequently on the fact table OR
  • Pick the common join column

For example, if you want to look at sales per day, per store, you would choose “per day” as the partition and “per store” as the subpartition.

If you do not have a good partition on logical elements (like grouping), then you can subpartition using hash partitioning on common joins — perhaps surrogate keys, or using join key of the largest table involved in the join.

For example, if the sales table is partitioned and another big table is product, you can hash subpartition product_id.

Because there’s overhead in partitions (loading metadata, reading metadata), make sure size of partitions and subpartitions is >20 Mb. So better to have a 30 Mb subpartition than a 15 Mb subpartition. [I have no idea if this is true in MySQL or not -- I think the general concept is true, because there is some overhead, but I have no idea about the 20 Mb figure and why that's true for Oracle, nor do I know what is true in MySQL.]

One easy calculation is double the # of CPUs, round up to nearest power of 2. If you’re executing in parallel, Oracle will use 2x CPUs. (all this advice, by the way, follows 80/20 rule, this is probably good for about 80% of the environments out there). Of course, MySQL does not do parallel execution very well, so this probably does not apply.

Oracle knows it can get partition elimination while it does a join.

If 2 tables have the same degree of parallelism (same # of buckets) and are partitioned in the same way on the join column (say, customer_id in a subpartition of sales and a partition of customer), Oracle will match the partitions when joining:

sales table joined with customer table can change into 4 small joins:
sales sub part 1 joins with customer part 1
sales sub part 2 joins with customer part 2
sales sub part 3 joins with customer part 3
sales sub part 4 joins with customer part 4

And with parallelism, the total time is now reduced to the time it takes to do one of those smaller joins.

This is also why you want to have a power of 2 for buckets – because cores/processors come in powers of 2. Partition-wise joins like this can also be done with range or list, assuming both tables in the join have the same buckets.

I have no idea if MySQL partitioning works this way, but it’s certainly a functionality that makes sense to me.


PlanetMySQL Voting: Vote UP / Vote DOWN