Archive for the ‘sort_buffer_size’ Category

More on understanding sort_buffer_size

Май 11th, 2010

There have been a few posts by Sheeri and Baron today on the MySQL sort_buffer_size variable. I wanted to add some more information about this buffer, what is impacted when it is changed, and what to do about it?

The first thing you need to know is the sort_buffer_size is a per session buffer. That is this memory is assigned per connection/thread. I’ve seen clients that set this assuming it’s a global buffer Don’t Assume – Per Session Buffers.

Second, internally in the OS usage independently of MySQL, there is a threshold > 256K. From Monty Taylor “if buffer is set to over 256K, it uses mmap() instead of malloc() for memory allocation. Actually – this is a libc malloc thing and is tunable, but defaults to 256k. From the manual:” . He goes on in a further to shows that impact > 256K for a buffer is 37x slower. This applies to all per session buffers, not just sort buffer. Now I have heard recently about this limit being 512K. I wasn’t able to nail down the specific speaker to see if this was a newer library or kernel or OS.

With MySQL instrumentation and the sort_buffer_size we are lucky, there is the Sort_merge_passes status variable. While it’s not perfect, it does indicate if the size of the buffer is in-sufficient, however even if we use a sort_buffer_size of say 256K, and you see Sort_merge_passes increasing slowly, does not indicate you have to increase the buffer.

So, all this does not tell you how to tune the buffer? Unfortunately with MySQL there is no actual easy answer. You do need to monitor the mysqld memory usage overall, especially if you are using persistent connections. A connection/thread will not release the memory assigned until it is closed, so it’s important to monitor for memory creep of the PGA, knowing what your initial SGA is. Morgan Tocker wrote a patch in Bug #33540 to create a RESET CONNECTION type command.

You do need to look for memory as a bottleneck. You need to learn how MySQL use memory, not just the sort_buffer_size. I actually started many years ago to write global/session variables to indicate when buffers were used, and how much and I started with the sort_buffer_size which was buried down in some very old filesort code. When I sought the input of an expert C coder around this, they wondered how the code, especially a loop handler even actually worked.

Nobody knows what the optimal setting is, and that’s the problem. In certain areas especially memory usage the MySQL instrumentation is simply non-existent, and I’d like to see this as something that is fixed.

In conclusion, if I ever see a sort_buffer_size above 256K, e.g. 1M or 2M, I always reset it to 256K. My reasoning is simple. Until you have evidence in your specific environment increasing the buffer makes performance better, it’s better to use a smaller value. There are bigger wins, like not using sorting, or better design, or even better simplifying or eliminating SQL.

References

sort_buffer_size and Knowing Why
How to tune MySQL’s sort_buffer_size
Read Buffer performance hit
more on malloc() speed


PlanetMySQL Voting: Vote UP / Vote DOWN

sort_buffer_size and Knowing Why

Май 10th, 2010

In How to tune MySQL’s sort_buffer_size, Baron gives a condescending viewpoint on how to tune the sort_buffer_size variable in MySQL. In a much-nicer-nutshell, his advice is “do not change sort_buffer_size from the default.”

Baron did not explain the logic behind his reasoning, he handwaves that “people utterly ruin their server performance and stability with it,” but does not explain how changing the sort_buffer_size kills performance and stability. Regardless of how respected and knowledgeable the source, NEVER take any advice that tells you what to do or how to do it without understanding WHY.

This article will explain the “why” of Baron’s point, and it will also talk more about understanding why, an integral part against the “Battle against any guess.” Baron’s recommendation to leave sort_buffer_size as the default is just as bad as all the advice given to change the sort_buffer_size, because all that advice (including Baron’s) does not explain the underlying causes.

First, I explain the sort_buffer_size issue. The sort buffer size, as the name implies, is a memory buffer used when ordering is needed (usually for GROUP BY and ORDER BY clauses, when the index used for the filter/join does not follow the GROUP/ORDER BY order). Increasing the sort_buffer_size means allowing more memory to be used for the sorting process.

Increasing the sort_buffer_size usually improves performance because more memory is used in sorting. It can be detrimental to performance because the full size of the sort buffer is allocated for each thread that needs to do a sort, even if that sort does not need a very large sort buffer.

A better optimization would be to change the schema and/or queries so that all that sorting is not necessary. Increasing the sort_buffer_size gives you a false sense of security that your server is performing better. Your server is performing the same tasks, only faster — the best optimization is to make the tasks smaller or eliminate some tasks. If you can have queries without so much sorting, that’s a much better optimization than changing sort_buffer_size.

That being said, increasing the sort_buffer_size is a perfectly acceptable stop-gap solution that can be implemented RIGHT NOW (it’s a dynamic variable), while you examine your queries by doing a query review with a tool such as mk-query-digest. This is indeed what Pythian does — and, by the way, not only do we recommend that course of action, but we explain it to you and help you find and optimize the queries in question.

That all assumes that having lots of sorts that require lots of memory is a bad thing. It may be that you have tuned your queries and schema such that you have eliminated as many sorts as you can, but some may remain. An intensive data mining server is a good example of a situation in which permanently increasing the sort_buffer_size may be the right solution.

Now that we have the specifics of this situation out of the way, let’s look at the Battle Against Any Guess. This is a movement against guessing games. Understanding what you are doing is essential; in the case of sort_buffer_size, you can believe that you know what you are doing by increasing sort_buffer_size. However, the real solution to the problem lies in changing the queries, not changing the memory patterns.

There is a 6-page description of the “Battle against any guess” in the Northern California Oracle User Group’s May Journal, starting on page 13. The examples are specific to Oracle, but the points made are sound even if you do not know Oracle well. For example:

Blindly implementing best practices is nothing different from guesswork; we are applying some past-proven solutions without measuring how they stand against our requirements, and without testing whether they bring us any closer to the targets we have. Industry has become so obsessed with best practices that we commonly see projects in which reviewing an environment for compliance with best practices is the ultimate goal.

One good reason you need to know *why* is also mentioned in the article: The second danger of best practices is that they easily become myths. The technology keeps improving and issues addressed by certain best practices might not be relevant anymore in the next software version.

So, even from respected folks like Baron or myself, do not take advice on face value. Ask why, understand why, and then think if there is another level. It is not always easy; often you think you understand but really you miss that other level – such as with sort_buffer_size.


PlanetMySQL Voting: Vote UP / Vote DOWN

How to tune MySQL’s sort_buffer_size

Май 9th, 2010

I perpetually see something like the following:

My server load is high and my queries are slow and my server crashes. Can you help me tune my server? Here is some information.

[random sample of SHOW GLOBAL STATUS, like the query cache counters]

my.cnf:

[mysqld]
key_buffer_size=1500M
query_cache_size= 64M
max_connections = 256
key_buffer = 8M
sort_buffer_size = 100M
read_buffer_size = 8M
delay_key_write = ALL

There are many problems in this my.cnf file, but the sort_buffer_size is a glaring one that identifies the user as someone who should not be playing with live ammunition. Therefore, I have developed an advanced process for tuning sort_buffer_size, which you can follow to get amazing performance improvements. It’s magical.

  1. How expert are you?
    • I know that there is a sort buffer, and that it is related to sort_merge_passes. When sort_merge_passes is high, I have been told to increase the sort_buffer_size. I also know that it is somehow related to the number of sorts the server does, so when there are a lot of sorts shown in variables like Sort_rows and Sort_scan, I think I should also increase it. You are a beginner.
    • I have been administering MySQL for many years. I know that there are two sort algorithms inside MySQL. I know exactly how to optimize the key cache hit ratio. You are a novice.
    • I have read every blog post Peter Zaitsev ever wrote, and I can improve on them all. You are an expert.
  2. Based on your score on the scale above, find your optimal sort_buffer_size tuning algorithm below:
    • Beginners and novices should leave this setting at its default, and comment it out of the configuration file.
    • Experts don’t need me to tell them what to do, but most of them will leave this setting at its default, and comment it out of the configuration file.

The most amazing thing about sort_buffer_size is how many people utterly ruin their server performance and stability with it, but insist that they know it’s vital to change it instead of leaving at its default. I do not know why this is always the case. Why don’t people choose random variables to destroy their performance? It’s not as though there is a shortage to choose from. Why does everyone always pick sort_buffer_size instead of something else? It’s like a flame drawing the moths in.

Feel free to ask questions if anything is unclear, but be prepared for a direct answer if you ask for tuning advice.

PS: I considered a simpler tuning guide, such as Domas’s guide to tuning the query cache, but I am convinced that people need more a complex guide for the sort_buffer_size, or they will not believe in the validity of the instructions. I base this on multiple experiences being paid a lot of money to suggest not setting sort_buffer_size to 256M, and being told that I must be an idiot.

Related posts:

  1. Sanity-check features in MySQL MySQL has
  2. Don’t forget about SHOW PROFILES It seems t
  3. I’ll be speaking at the O’Reilly MySQL Conference 2010 I’m

Related posts brought to you by Yet Another Related Posts Plugin.


PlanetMySQL Voting: Vote UP / Vote DOWN