Archive for the ‘baag’ Category

sort_buffer_size and Knowing Why

May 10th, 2010

In How to tune MySQL’s sort_buffer_size, Baron gives a condescending viewpoint on how to tune the sort_buffer_size variable in MySQL. In a much-nicer-nutshell, his advice is “do not change sort_buffer_size from the default.”

Baron did not explain the logic behind his reasoning. He hand-waves that “people utterly ruin their server performance and stability with it,” but does not explain how changing the sort_buffer_size kills performance and stability. Regardless of how respected and knowledgeable the source, NEVER take any advice that tells you what to do or how to do it without understanding WHY.

This article will explain the “why” of Baron’s point, and it will also talk more about understanding why, an integral part of the “Battle Against Any Guess.” Baron’s recommendation to leave sort_buffer_size as the default is just as bad as all the advice given to change the sort_buffer_size, because none of that advice (including Baron’s) explains the underlying causes.

First, I explain the sort_buffer_size issue. The sort buffer size, as the name implies, is a memory buffer used when ordering is needed (usually for GROUP BY and ORDER BY clauses, when the index used for the filter/join does not follow the GROUP/ORDER BY order). Increasing the sort_buffer_size means allowing more memory to be used for the sorting process.

Increasing the sort_buffer_size usually improves performance because more memory is available for sorting. However, it can also be detrimental to performance, because the full size of the sort buffer is allocated for each thread that needs to do a sort, even if that sort does not need a very large sort buffer.
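To make the per-thread cost concrete, here is a small back-of-the-envelope sketch in Python. It only does the multiplication implied above: if every sorting thread gets a full buffer, the memory held by sort buffers is the buffer size times the number of threads sorting at once. The function name, the buffer sizes, and the thread count are all made up for illustration; they are not MySQL defaults or measurements from any real server.

```python
MIB = 1024 * 1024


def worst_case_sort_memory(sort_buffer_size_bytes: int,
                           concurrent_sorting_threads: int) -> int:
    """Upper bound on memory held by sort buffers.

    Assumes, as described in the text, that each thread doing a sort
    allocates the full sort buffer regardless of how much it needs.
    """
    if sort_buffer_size_bytes < 0 or concurrent_sorting_threads < 0:
        raise ValueError("sizes and thread counts must be non-negative")
    return sort_buffer_size_bytes * concurrent_sorting_threads


if __name__ == "__main__":
    # Hypothetical workload: 100 threads sorting at the same moment.
    threads = 100
    # Hypothetical candidate settings, not recommendations.
    for size in (2 * MIB, 32 * MIB, 256 * MIB):
        total = worst_case_sort_memory(size, threads)
        print(f"sort_buffer_size={size // MIB} MiB x {threads} threads "
              f"= {total // MIB} MiB")
```

With these invented numbers, the jump from a 2 MiB buffer to a 256 MiB buffer takes the worst case from 200 MiB to 25,600 MiB (25 GiB), which is the sort of growth that could hurt a server even when most of the individual sorts are small.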

A better optimization would be to change the schema and/or queries so that all that sorting is not necessary. Increasing the sort_buffer_size gives you a false sense of security that your server is performing better. Your server is performing the same tasks, only faster — the best optimization is to make the tasks smaller or eliminate some tasks. If you can have queries without so much sorting, that’s a much better optimization than changing sort_buffer_size.

That being said, increasing the sort_buffer_size is a perfectly acceptable stop-gap solution that can be implemented RIGHT NOW (it’s a dynamic variable), while you examine your queries by doing a query review with a tool such as mk-query-digest. This is indeed what Pythian does — and, by the way, not only do we recommend that course of action, but we explain it to you and help you find and optimize the queries in question.

That all assumes that having lots of sorts that require lots of memory is a bad thing. It may be that you have tuned your queries and schema such that you have eliminated as many sorts as you can, but some may remain. An intensive data mining server is a good example of a situation in which permanently increasing the sort_buffer_size may be the right solution.

Now that we have the specifics of this situation out of the way, let’s look at the Battle Against Any Guess. This is a movement against guessing games. Understanding what you are doing is essential; in the case of sort_buffer_size, you can believe that you know what you are doing by increasing sort_buffer_size. However, the real solution to the problem lies in changing the queries, not changing the memory patterns.

There is a 6-page description of the “Battle against any guess” in the Northern California Oracle User Group’s May Journal, starting on page 13. The examples are specific to Oracle, but the points made are sound even if you do not know Oracle well. For example:

Blindly implementing best practices is nothing different from guesswork; we are applying some past-proven solutions without measuring how they stand against our requirements, and without testing whether they bring us any closer to the targets we have. Industry has become so obsessed with best practices that we commonly see projects in which reviewing an environment for compliance with best practices is the ultimate goal.

One good reason you need to know *why* is also mentioned in the article: The second danger of best practices is that they easily become myths. The technology keeps improving and issues addressed by certain best practices might not be relevant anymore in the next software version.

So, even from respected folks like Baron or myself, do not take advice at face value. Ask why, understand why, and then think about whether there is another level. It is not always easy; often you think you understand but really you miss that other level, as with sort_buffer_size.



Hotsos Symposium 2010 — Battle Against Any Guess Is Won

March 9th, 2010

Video fragments of my session posted at the end — read on.

I arrived at the Omni Mandalay Hotel on Sunday evening with Dan Norris. I was flying through Chicago and it turned out that Dan was on the same flight, only a few rows behind me. Small world.

Preparations for the conference were very chaotic on my part and, of course, I didn’t have either of my presentations ready. I was very stressed and getting sick as well; it looked like a complete disaster waiting to happen. I’d like to say that I was feeling like Doug Burns, as he often manages to get sick just before a conference. Of course, I had worked on my slides for the last few days as well as on the flight, and the presentation was slowly getting there, but boy was I tired!

I quickly said hello to the crowd in the bar on the way to my room and rushed away to do some more damage to my slides. And then I had a brilliant idea: I could still see one of my best mates and do something good for my presentation! I asked Doug if he was interested in a preview (he probably wasn’t interested but he couldn’t say that to me), especially since my session wasn’t on his original agenda. Of course, that would mean that he had to leave a bunch of other good friends and spend some time tête-à-tête with me. Knowing Doug, this is one of the hardest things to ask of him, but it shows how good a friend he is! (Plus, everyone thinks that he is anti-social anyway. Shhhh!)

Doug made my day: while he provided lots of ideas and feedback on a few things that I was lacking, he generally approved of the idea and confirmed that it wasn’t totally crazy. I guess that was all I needed back then, and Doug knew how nervous I was about it. (Thanks mate!)

So I called Sunday a day very early and went to bed before midnight. I really needed some sleep. I was woken by the alarm at 5 AM (I had woken up a few times during the night to look at the clock, making sure I didn’t sleep through), and the slides were ready just before lunch. I even managed to do a test run and it took 65 minutes, a wee bit too long for a one-hour session. But it was a good test and I knew I had to be just a bit more concise in a few parts.

My morning was very productive. Unfortunately, I missed the opening keynote from Tom Kyte. Such a pity! If what Doug wrote is true, Tom was talking about the mistakes we make *because* of our experience and our assumptions. This was exactly one of the points I was making in my Battle Against Any Guess: experience is a danger. I wish I could have seen Tom’s example. Oh well, maybe another time.

I managed to attend half of Richard Foote’s session on indexes, but my mind was far away with my own slides. I did manage to focus on the bitmap indexes part, though, and the myth of bitmap indexes not working well for columns with high cardinality. Very interesting conclusions. I’m still wondering how much overhead updates would add to such a bitmap index.

After lunch, it was my turn. I had ordered a few copies of the latest OakTable book, Expert Oracle Practices: Oracle Database Administration from the Oak Table, which I co-authored with a bunch of other Oakies. I contributed chapter 1 of the book, titled just like my presentation: Battle Against Any Guess. The plan was to give a copy away during the presentation and do a draw for another one at the end of the session. I was so nervous that I forgot about it until the end of the session, so I just did a draw for two copies. The lucky winners were Lynn-Georgia Tesch and Surendra Anchula. Congratulations! For the rest of you who left your contact details, please stay tuned and we’ll organize a few things online.

Now to the main topic of this post: my presentation. What’s unusual about this session is that it’s not the technical material that I usually do but a more conceptual and motivational talk. Could I pull it off? Well, I think it went fairly well in general, even though I did identify a few rough spots and my lack of mastery of the English language. I might need to work a little bit more on the flow of the presentation.

We had quite a few good laughs. Later, people in the next hall were asking about it and Dan was making jokes on the stage, so it must have been loud. Anyway, I think nobody fell asleep and I managed to get people thinking about the topic. I received many “thank you” notes yesterday and compliments on a good session, so by the end of the day I was more and more pleased. Thanks everyone for attending, and especially big thanks to those of you who brought examples from your own battles to my attention. If you have more to discuss, contact me by email (my last name) {at} pythian.com.

Thanks to Marco Gralike for recording some fragments and sharing them. I think he has more to come.

These are the introductory couple of minutes. You can definitely notice how nervous I am starting out on the stage:

Solving the wrong problem example:

That’s all for now. Stay tuned — more to come.

