Archive for the ‘optimization’ Category

Profiling your slow queries using pt-query-digest and some love from Percona Server

Декабрь 28th, 2011
This guide will get you up and running with how to identify the bottleneck queries using the excellent tool pt-query-digest. You will learn how to use and analyze the output returned by pt-query-digest. You will also learn some differences between slow query logging in various MySQL versions. Later on in the post I will also show you how to make use of the extra diagnostic data available with Percona Server.
PlanetMySQL Voting: Vote UP / Vote DOWN

Flag based COUNT using MySQL

Октябрь 27th, 2011

I was asked for help in optimizing a MySQL query where flags are stored in a database and references should be counted based on the flag value. Sounds not complicated at first, but there are several flags that should be counted and also just once per reference. A lot of food for GROUP BY you may think. Having said this, search and group for flags in this table would be really slow due to a very poor cardinality. But let's start with the actual problem. The example is fictitious, but I did my best to find a general use case for this problem.

Read the rest »


PlanetMySQL Voting: Vote UP / Vote DOWN

Optimized Pagination using MySQL

Октябрь 24th, 2011

Dealing with large data sets makes it necessary to pick out only the newest or the hottest elements and not displaying everything. In order to have older items still available, Pagination navigation's have become established. However, implementing a Pagination with MySQL is one of those problems that can be optimized poorly with MySQL and certainly other RDBM systems. However, knowing the underlying database can also help in optimizing pagination queries, because there is no real copy and paste solution.

Read the rest »


PlanetMySQL Voting: Vote UP / Vote DOWN

OR conditions considered bad… Or? And a workaround.

Июль 19th, 2011
Some things are known to be just bad. GOTOs used to be one such thing (something I still use them, but only where appropriate, which isn't that many places). Maybe it is just so, that some things are useful, but not for everything, so maybe the issue is that they are used inappropriately. Or?

The OR condition is one such things in MySQL circles! Oh, you have an OR condition! That is going to be so slow! sort of. And the reason an OR is "slow" is that as MySQL will use only one index for each statement, only one "side" or the or condition can use an index. Or sometimes even worse, MySQL will consider using an index that is common to the two "sides" or is outside the OR conditition, despite that fact that there are perfectly fine, highly selective indexes on both sides of the OR condition.

If you ask me, this is not a fault with the OR condition but rather a problem with the MySQL optimizer. Why in heavens name can't a single statement use two indexes, if that is what it takes? And let me let you in on a little secret: MySQL can use multiple indexes for one statement! But that depends on what you mean with a statement. And MySQL means something slightly different than many of us do!

Without further ado, lets have a look at an example. We work at a retail store, and a package from us has been stuck at the post office. We want to check what product this is, but we don't know the product id. What the guy who called us from the post-office said was something that looked like a brand name, that I can map to a brand ID, the number of units in the package and the weight. But to be honest, the last two weren't terribly reliable. OK, lets find the product in the product table, which looks like this:
CREATE TABLE `product` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`brand_id` int(11) NOT NULL,
`quantity` int(11) NOT NULL,
`weight` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `ix_brand` (`brand_id`),
KEY `ix_weight_brand` (`weight`,`brand_id`),
KEY `ix_quantity_brand` (`quantity`,`brand_id`)
) ENGINE=InnoDB AUTO_INCREMENT=2321295 DEFAULT CHARSET=utf8

I know for certain that brand_id is 6, I already looked that up. But there are millions of products in the product table! Luckily, looking for approriate products using brand_id and either quantity or weight should be easy, right? We know now that the weight is 41 and quantity is 78. And we have approriate indexes, this should not be a big deal, right:
SELECT id FROM product WHERE brand_id = 6 AND (weight = 41 OR quantity = 78)

Well, although this works, it is a big sluggish, real slow actually. Lets look at what mySQL does with this statement:
EXPLAIN SELECT id FROM product WHERE brand_id = 6 AND (weight = 41 OR quantity = 78)
And what we get is this:
+----+-------------+---------+------+--------------------------------------------+----------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+------+--------------------------------------------+----------+---------+-------+------+-------------+
| 1 | SIMPLE | product | ref | ix_brand,ix_weight_brand,ix_quantity_brand | ix_brand | 4 | const | 4291 | Using where |
+----+-------------+---------+------+--------------------------------------------+----------+---------+-------+------+-------------+
That wasn't so good. Let's try a different way:
EXPLAIN SELECT id FROM product WHERE (brand_id = 6 AND weight = 41) or (brand_id = 6 AND quantity = 78);

And that will result in the same query path. Only one index can be used, and there is one index that fits with both paths, that on brand_id, so MySQL picks that. Using FORCE_INDEX will work, but still only 1 index will be used, and the result may well be even worse, as a FORCE_INDEX on, say the ix_weight_brand index, will make the other path, on quantity, dead slow! What you would like MySQL to do, which doesn't seem so complicated, is to realize that there are two distinct paths here which can be looked up using an index real easy, execute them both and merge the results. But no, MySQL will not DO that! Only 1 index per statement, that's it. Or?

Well, when you understand that MySQL will only use one index per statement, consider what MySQL means with statement here. For a SELECT it is the individual SELECT statement that is the statement, which sounds reasonable until you consider a UNION! Each and every statement in a UNION is considered a separate statement (in this particular case that is, but it is a but messy, UNIONs in MySQL are a bit of a kludge, really)! So if we rewrite the statement above as a UNION, which is easily done for many queries involving OR-conditions, you get something like this:
SELECT id FROM product
WHERE brand_id = 6 AND weight = 41
UNION
SELECT id FROM product
WHERE brand_id = 6 AND quantity = 78;

What are we saying here? We are telling MySQL that these are actually two separate paths, which is what we did with the OR condition, but in this case, MySQL can use two indexes, and will nicely merge the results, so an explain looks like this:
+------+--------------+------------+------+----------------------------+-------------------+---------+-------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+--------------+------------+------+----------------------------+-------------------+---------+-------------+------+-------------+
| 1 | PRIMARY | product | ref | ix_brand,ix_weight_brand | ix_weight_brand | 8 | const,const | 31 | Using index |
| 2 | UNION | product | ref | ix_brand,ix_quantity_brand | ix_quantity_brand | 8 | const,const | 1 | Using index |
| NULL | UNION RESULT | <union1,2> | ALL | NULL | NULL | NULL | NULL | NULL | |
+------+--------------+------------+------+----------------------------+-------------------+---------+-------------+------+-------------+
This latter query is often so much faster than the alternatives, and we have tricked MySQL into using two indexes and merge the result. But for some reason, MySQL is unable to figure this one out for itself. Is this an important tip I am giving you here? Is this a neat optimization trick that I am handing out? Short-term, the answer is yes.

But am I with this saying that you should stay clear of OR-conditions? Absolutely not, no way. What I am presenting here is an awkward way of circumventing some obvious flaws with the MySQL optimizer, and this should be fixed! But what I AM saying is this: If you currently have big performance problems with MySQL SELECTs involving OR conditions, you might consider rewriting those statements to UNIONs, sometimes that hels. But do not do this will ALL your OR-conditions, only where you have to and it makes sense. Let's meanwhile wait for the MySQL developers to fix this. (No, I'm not good enough at the optimizer code or most other parts of the MySQL kernel to fix this myself. I'm happy to build things around MySQL, but I do not have the time to get more involved with the kernel).

And before I keave you for now: This was tested with MySQL 5.5.7 on Linux. I have NOT checked for fixes, updates to this, but I do hope it has NOT been fixed? Why? Why do I now want it fixed?? Have I gone bonkers? Yes, I am bonkers, but that's not the issue here, the issue is that such rather involved fixes to the optimizer is NOT something I want introduced in the middle of a GA release! But I'd be really glad to have it fixed in 5.6 or whatever that release is to be called! And yes, I am ware this is not exactly with the optimizer itself, but more so with the query execution, but for now, I have decided to call it the optimimizer anyway, as the sun is shining and the weather is nice and all that, sometime around christmas I might consider changing my mind.

Cheers for now
/Karlsson

PlanetMySQL Voting: Vote UP / Vote DOWN

On Covering Indexes and Their Impact on Performance

Июль 3rd, 2011
The purpose of this post is to describe what covering indexes are and how they can be used to improve the performance of queries. People mostly use indexes to filter or sort the results but not much thought is given to actually reduce the disk reads by using proper indexes. So I will show you how to reduce disk reads and hence improve the performance of queries by utilizing indexes properly.
PlanetMySQL Voting: Vote UP / Vote DOWN

Optimizing UPDATE and DELETE statements

Февраль 24th, 2011

While most people look at performance optimizations for SELECT statements, UPDATE and DELETE statements are often overlooked. These can benefit from the principles of analyzing the Query Execution Plan (QEP). You can only run an EXPLAIN on a SELECT statement, however it’s possible to rewrite an UPDATE or DELETE statement to perform like a SELECT statement.

To optimize an UPDATE, look at the WHERE clause. If you are using the PRIMARY KEY, no further analysis is necessary. If you are not, it is of benefit to rewrite your UPDATE statement as a SELECT statement and obtain a QEP as previously detailed to ensure optimal indexes are used. For example:

UPDATE t
SET	c1 = ‘x’, c2 = ‘y’, c3 = 100
WHERE c1 = ‘x’
AND	d = CURDATE()

You can rewrite this UPDATE statement as a SELECT statement for using EXPLAIN:

EXPLAIN SELECT c1, c2, c3 FROM	t WHERE c1 = ‘x’ AND	d = CURDATE()

You should now apply the same principles as you would when optimizing SELECT statements.


PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL Query Optimization – Tip # 1 – Avoid using wildcard character at the start of a LIKE pattern.

Ноябрь 23rd, 2010
The more I go through others SQL, there are some common mistakes that I see developers making over and over again, so I thought why not start a series of tips that can help developers optimize their queries and avoid common pitfalls. So this post is a part of that series of tips, and this is the first tip "Avoid using a wild card character at the start of a LIKE pattern".
PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL Paginated displays – How to kill performance vs How to improve performance!

Октябрь 30th, 2010
Pagination is used very frequently in many websites, be it search results or most popular posts they are seen everywhere. But the way how it is typically implemented is naive and prone to performance degradation. In this article I attempt on explaining the performance implications of poorly designed pagination implementation. I have also analyzed how Google, Yahoo and Facebook handle pagination implementation. Then finally i present my suggestion which will greatly improve the performance related to pagination.
PlanetMySQL Voting: Vote UP / Vote DOWN

Resolve many-to-many relations a bit different with MySQL

Сентябрь 2nd, 2010

In database modeling, a m:n relationship is usually resolved by an additional table. But what if this relation is used only for archiving and the number of links in the resulting table is not too high? In that context, I got the idea to store all referring ID's as CSV string directly into a TEXT column of one of the referring tables. I came to this idea, because otherwise I would have to build complicated foreign keys and this way I also save one additional table. Certainly, this only makes sense if the data is not frequently accessed as foreign key. Nevertheless, I would like to tackle the problem, even if the implementation is very MySQL-oriented.

Read the rest »


PlanetMySQL Voting: Vote UP / Vote DOWN

Improving MySQL Productivity – From Design to Implementation

Июль 2nd, 2010

My closing presentation at the dedicated MySQL track at ODTUG Kaleidoscope 2010 discussed various techniques and best practices for improving the ROI of developer resources using MySQL. Included in the sections on Design, Security, Development, Testing, Implementation, Instrumentation and Support were also a number of horror stories of not what to do, combined with practical examples of improving productivity.


PlanetMySQL Voting: Vote UP / Vote DOWN