Archive for the ‘optimization’ Category
Profiling your slow queries using pt-query-digest and some love from Percona Server
Декабрь 28th, 2011PlanetMySQL Voting: Vote UP / Vote DOWN
Flag based COUNT using MySQL
Октябрь 27th, 2011I was asked for help in optimizing a MySQL query where flags are stored in a database and references should be counted based on the flag value. Sounds not complicated at first, but there are several flags that should be counted and also just once per reference. A lot of food for GROUP BY you may think. Having said this, search and group for flags in this table would be really slow due to a very poor cardinality. But let's start with the actual problem. The example is fictitious, but I did my best to find a general use case for this problem.
PlanetMySQL Voting: Vote UP / Vote DOWN
Optimized Pagination using MySQL
Октябрь 24th, 2011Dealing with large data sets makes it necessary to pick out only the newest or the hottest elements and not displaying everything. In order to have older items still available, Pagination navigation's have become established. However, implementing a Pagination with MySQL is one of those problems that can be optimized poorly with MySQL and certainly other RDBM systems. However, knowing the underlying database can also help in optimizing pagination queries, because there is no real copy and paste solution.
PlanetMySQL Voting: Vote UP / Vote DOWN
OR conditions considered bad… Or? And a workaround.
Июль 19th, 2011The OR condition is one such things in MySQL circles! Oh, you have an OR condition! That is going to be so slow! sort of. And the reason an OR is "slow" is that as MySQL will use only one index for each statement, only one "side" or the or condition can use an index. Or sometimes even worse, MySQL will consider using an index that is common to the two "sides" or is outside the OR conditition, despite that fact that there are perfectly fine, highly selective indexes on both sides of the OR condition.
If you ask me, this is not a fault with the OR condition but rather a problem with the MySQL optimizer. Why in heavens name can't a single statement use two indexes, if that is what it takes? And let me let you in on a little secret: MySQL can use multiple indexes for one statement! But that depends on what you mean with a statement. And MySQL means something slightly different than many of us do!
Without further ado, lets have a look at an example. We work at a retail store, and a package from us has been stuck at the post office. We want to check what product this is, but we don't know the product id. What the guy who called us from the post-office said was something that looked like a brand name, that I can map to a brand ID, the number of units in the package and the weight. But to be honest, the last two weren't terribly reliable. OK, lets find the product in the product table, which looks like this:
CREATE TABLE `product` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`brand_id` int(11) NOT NULL,
`quantity` int(11) NOT NULL,
`weight` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `ix_brand` (`brand_id`),
KEY `ix_weight_brand` (`weight`,`brand_id`),
KEY `ix_quantity_brand` (`quantity`,`brand_id`)
) ENGINE=InnoDB AUTO_INCREMENT=2321295 DEFAULT CHARSET=utf8
I know for certain that brand_id is 6, I already looked that up. But there are millions of products in the product table! Luckily, looking for approriate products using brand_id and either quantity or weight should be easy, right? We know now that the weight is 41 and quantity is 78. And we have approriate indexes, this should not be a big deal, right:
SELECT id FROM product WHERE brand_id = 6 AND (weight = 41 OR quantity = 78)
Well, although this works, it is a big sluggish, real slow actually. Lets look at what mySQL does with this statement:
EXPLAIN SELECT id FROM product WHERE brand_id = 6 AND (weight = 41 OR quantity = 78)
And what we get is this:
+----+-------------+---------+------+--------------------------------------------+----------+---------+-------+------+-------------+That wasn't so good. Let's try a different way:
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+------+--------------------------------------------+----------+---------+-------+------+-------------+
| 1 | SIMPLE | product | ref | ix_brand,ix_weight_brand,ix_quantity_brand | ix_brand | 4 | const | 4291 | Using where |
+----+-------------+---------+------+--------------------------------------------+----------+---------+-------+------+-------------+
EXPLAIN SELECT id FROM product WHERE (brand_id = 6 AND weight = 41) or (brand_id = 6 AND quantity = 78);
And that will result in the same query path. Only one index can be used, and there is one index that fits with both paths, that on brand_id, so MySQL picks that. Using FORCE_INDEX will work, but still only 1 index will be used, and the result may well be even worse, as a FORCE_INDEX on, say the ix_weight_brand index, will make the other path, on quantity, dead slow! What you would like MySQL to do, which doesn't seem so complicated, is to realize that there are two distinct paths here which can be looked up using an index real easy, execute them both and merge the results. But no, MySQL will not DO that! Only 1 index per statement, that's it. Or?
Well, when you understand that MySQL will only use one index per statement, consider what MySQL means with statement here. For a SELECT it is the individual SELECT statement that is the statement, which sounds reasonable until you consider a UNION! Each and every statement in a UNION is considered a separate statement (in this particular case that is, but it is a but messy, UNIONs in MySQL are a bit of a kludge, really)! So if we rewrite the statement above as a UNION, which is easily done for many queries involving OR-conditions, you get something like this:
SELECT id FROM product
WHERE brand_id = 6 AND weight = 41
UNION
SELECT id FROM product
WHERE brand_id = 6 AND quantity = 78;
What are we saying here? We are telling MySQL that these are actually two separate paths, which is what we did with the OR condition, but in this case, MySQL can use two indexes, and will nicely merge the results, so an explain looks like this:
+------+--------------+------------+------+----------------------------+-------------------+---------+-------------+------+-------------+This latter query is often so much faster than the alternatives, and we have tricked MySQL into using two indexes and merge the result. But for some reason, MySQL is unable to figure this one out for itself. Is this an important tip I am giving you here? Is this a neat optimization trick that I am handing out? Short-term, the answer is yes.
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+--------------+------------+------+----------------------------+-------------------+---------+-------------+------+-------------+
| 1 | PRIMARY | product | ref | ix_brand,ix_weight_brand | ix_weight_brand | 8 | const,const | 31 | Using index |
| 2 | UNION | product | ref | ix_brand,ix_quantity_brand | ix_quantity_brand | 8 | const,const | 1 | Using index |
| NULL | UNION RESULT | <union1,2> | ALL | NULL | NULL | NULL | NULL | NULL | |
+------+--------------+------------+------+----------------------------+-------------------+---------+-------------+------+-------------+
But am I with this saying that you should stay clear of OR-conditions? Absolutely not, no way. What I am presenting here is an awkward way of circumventing some obvious flaws with the MySQL optimizer, and this should be fixed! But what I AM saying is this: If you currently have big performance problems with MySQL SELECTs involving OR conditions, you might consider rewriting those statements to UNIONs, sometimes that hels. But do not do this will ALL your OR-conditions, only where you have to and it makes sense. Let's meanwhile wait for the MySQL developers to fix this. (No, I'm not good enough at the optimizer code or most other parts of the MySQL kernel to fix this myself. I'm happy to build things around MySQL, but I do not have the time to get more involved with the kernel).
And before I keave you for now: This was tested with MySQL 5.5.7 on Linux. I have NOT checked for fixes, updates to this, but I do hope it has NOT been fixed? Why? Why do I now want it fixed?? Have I gone bonkers? Yes, I am bonkers, but that's not the issue here, the issue is that such rather involved fixes to the optimizer is NOT something I want introduced in the middle of a GA release! But I'd be really glad to have it fixed in 5.6 or whatever that release is to be called! And yes, I am ware this is not exactly with the optimizer itself, but more so with the query execution, but for now, I have decided to call it the optimimizer anyway, as the sun is shining and the weather is nice and all that, sometime around christmas I might consider changing my mind.
Cheers for now
/Karlsson
PlanetMySQL Voting: Vote UP / Vote DOWN
On Covering Indexes and Their Impact on Performance
Июль 3rd, 2011PlanetMySQL Voting: Vote UP / Vote DOWN
Optimizing UPDATE and DELETE statements
Февраль 24th, 2011While most people look at performance optimizations for SELECT statements, UPDATE and DELETE statements are often overlooked. These can benefit from the principles of analyzing the Query Execution Plan (QEP). You can only run an EXPLAIN on a SELECT statement, however it’s possible to rewrite an UPDATE or DELETE statement to perform like a SELECT statement.
To optimize an UPDATE, look at the WHERE clause. If you are using the PRIMARY KEY, no further analysis is necessary. If you are not, it is of benefit to rewrite your UPDATE statement as a SELECT statement and obtain a QEP as previously detailed to ensure optimal indexes are used. For example:
UPDATE t SET c1 = ‘x’, c2 = ‘y’, c3 = 100 WHERE c1 = ‘x’ AND d = CURDATE()
You can rewrite this UPDATE statement as a SELECT statement for using EXPLAIN:
EXPLAIN SELECT c1, c2, c3 FROM t WHERE c1 = ‘x’ AND d = CURDATE()
You should now apply the same principles as you would when optimizing SELECT statements.
PlanetMySQL Voting: Vote UP / Vote DOWN
MySQL Query Optimization – Tip # 1 – Avoid using wildcard character at the start of a LIKE pattern.
Ноябрь 23rd, 2010PlanetMySQL Voting: Vote UP / Vote DOWN
MySQL Paginated displays – How to kill performance vs How to improve performance!
Октябрь 30th, 2010PlanetMySQL Voting: Vote UP / Vote DOWN
Resolve many-to-many relations a bit different with MySQL
Сентябрь 2nd, 2010In database modeling, a m:n relationship is usually resolved by an additional table. But what if this relation is used only for archiving and the number of links in the resulting table is not too high? In that context, I got the idea to store all referring ID's as CSV string directly into a TEXT column of one of the referring tables. I came to this idea, because otherwise I would have to build complicated foreign keys and this way I also save one additional table. Certainly, this only makes sense if the data is not frequently accessed as foreign key. Nevertheless, I would like to tackle the problem, even if the implementation is very MySQL-oriented.
PlanetMySQL Voting: Vote UP / Vote DOWN
Improving MySQL Productivity – From Design to Implementation
Июль 2nd, 2010My closing presentation at the dedicated MySQL track at ODTUG Kaleidoscope 2010 discussed various techniques and best practices for improving the ROI of developer resources using MySQL. Included in the sections on Design, Security, Development, Testing, Implementation, Instrumentation and Support were also a number of horror stories of not what to do, combined with practical examples of improving productivity.
PlanetMySQL Voting: Vote UP / Vote DOWN