Archive for the ‘mysql’ Category

Hotsos Symposium 2010 — Battle Against Any Guess Is Won

March 9th, 2010

Video fragments of my session posted at the end — read on.

I arrived at the Omni Mandalay Hotel on Sunday evening with Dan Norris. I was flying through Chicago and it turned out that Dan was on the same flight, only a few rows behind me. Small world.

Preparations for the conference were very chaotic on my part and, of course, I didn’t have either of my presentations ready. I was very stressed and getting sick as well; it looked like a complete disaster waiting to happen. I’d say I was feeling like Doug Burns, who often manages to get sick just before a conference. Of course, I worked on my slides for the last few days as well as on the flight, and the presentation was slowly getting there, but boy was I tired!

I quickly said hello to the crowd in the bar on the way to my room and rushed away to do some more damage to my slides. And then I had a brilliant idea: I could still see one of my best mates and do something good for my presentation! I asked Doug if he was interested in a preview (he probably wasn’t, but he couldn’t say that to me), especially since my session wasn’t on his original agenda. Of course, that would mean that he had to leave a bunch of other good friends and spend some time tête-à-tête. Knowing Doug, this is one of the hardest things to ask of him, but it shows how good a friend he is! (Plus, everyone thinks that he is anti-social anyway. Shhhh!)

Doug made my day. While he provided lots of ideas and feedback on a few things that I was lacking, he generally approved of the idea and confirmed that it wasn’t totally crazy. I guess that was all I needed back then, and Doug knew how nervous I was about it. (Thanks mate!)

So I called it a day very early on Sunday and went to bed before midnight. I really needed some sleep. I was woken up by the alarm at 5AM (I woke up a few times during the night to look at the clock, making sure I didn’t oversleep) and the slides were ready just before lunch. I even managed to do a test run and it took 65 minutes, a wee bit too long for a one-hour session. But it was a good test and I knew I had to be just a bit more concise in a few parts.

My morning was very productive. Unfortunately, I missed the opening keynote from Tom Kyte. Such a pity! If what Doug wrote is true, Tom was talking about the mistakes we make *because* of our experience and our assumptions. This was exactly one of the points I was making in my Battle Against Any Guess: experience is dangerous. I wish I could have seen Tom’s example. Oh well, maybe another time.

I managed to attend half of Richard Foote’s session on indexes but my mind was far away, with my own slides. I did manage to focus on the bitmap indexes part, though, and on the myth that bitmap indexes do not work well for columns with high cardinality. Very interesting conclusions. I’m still wondering how much overhead updates would add to such a bitmap index.

After lunch, it was my turn. I had ordered a few copies of the latest OakTable book, Expert Oracle Practices: Oracle Database Administration from the Oak Table, which I co-authored with a bunch of other Oakies. I contributed chapter 1 of the book, titled just like my presentation: Battle Against Any Guess. The plan was to give a copy away during the presentation and do a draw for another one at the end of the session. I was so nervous that I forgot about it until the end of the session, so I just did a draw for two copies. The lucky winners were Lynn-Georgia Tesch and Surendra Anchula. Congratulations! For the rest of you who left contact details: please stay tuned and we’ll organize a few things online.

Now the main topic of this post: my presentation. What’s unusual about this session is that it’s not the technical stuff that I usually do but a more conceptual and motivational talk. Could I pull it off? Well, I think it went fairly well in general, even though I did identify a few rough spots and my less than perfect command of English. I might need to work a little bit more on the flow of the presentation.

We had quite a few good laughs. Later, people in the next hall were asking about it and Dan was making jokes on stage, so it must have been loud. Anyway, I think nobody fell asleep and I managed to get people thinking about the topic. I received many “thank you” notes yesterday and compliments on a good session, so by the end of the day I was more and more pleased. Thanks everyone for attending, and especially big thanks to those of you who brought examples from your own battles to my attention. If you have more to discuss, contact me by email: (my last name) {at} pythian.com.

Thanks to Marco Gralike for recording some fragments and sharing them. I think he has more to come.

This is the introductory couple of minutes. You can definitely notice how nervous I am starting out on stage:

Solving the wrong problem example:

That’s all for now. Stay tuned — more to come.



How do I identify the MySQL my.cnf file?

March 9th, 2010

As part of my upcoming FREE my.cnf check advice, I first need to ask people to provide their current MySQL configuration file, commonly found as a file named my.cnf.

If only that question was easy to answer!

Use of configuration files

By default, MySQL reads at least one configuration file from the default locations listed below. It also takes a cascade approach: every file found in those paths is read in turn, so when the same value is set in more than one file you can see unexpected behavior, because one file silently overrides another.
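
The cascade can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not MySQL's own option parser: the file contents are made-up examples, `merge_option_files` is a hypothetical helper, and it assumes that the last value read for an option is the one that wins.

```python
def merge_option_files(files):
    """Merge parsed option files in read order; a later file overrides an earlier one.

    `files` is a list of (path, options) pairs, where `options` is a dict of
    settings taken from one group such as [mysqld]. Returns the effective
    settings and, for each setting, the path that supplied the final value.
    """
    effective = {}
    source = {}
    for path, options in files:
        for name, value in options.items():
            effective[name] = value
            source[name] = path
    return effective, source


# Made-up example contents, listed in the order they would be read.
files = [
    ("/etc/my.cnf", {"max_connections": "100", "key_buffer_size": "16M"}),
    ("/etc/mysql/my.cnf", {"max_connections": "500"}),
]
effective, source = merge_option_files(files)
print(effective["max_connections"], "from", source["max_connections"])
```

Tracking which file supplied each final value is the useful part here: it is exactly the information you are missing when a setting does not behave as the file you are looking at says it should.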

You can, however, specify --no-defaults to use no configuration file at all, or add options on the command line, so even looking at all configuration files is no guarantee of seeing your operating configuration.

For most environments, however, these complexities do not exist.

Default Location

By default, and on single-instance MySQL servers, you are most likely to find this file named my.cnf at:

  • /etc/my.cnf
  • /etc/mysql/my.cnf

These are known as the global options files.

Alternative Locations

MySQL has both instance-specific and user-specific locations. For an instance-specific file, the location is:

  • $MYSQL_HOME/my.cnf

where MYSQL_HOME is a defined environment variable. Historical MySQL versions also looked at [datadir]/my.cnf; however, I am unaware whether this is applicable in 5.x versions.

You can also specify options on a per-user basis for default inclusion. These are found at:

  • $HOME/.my.cnf
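
As a rough aid, the locations above can be checked with a few lines of Python. This is a sketch that assumes only the paths named in this post; `candidate_option_files` is a hypothetical helper, and it knows nothing about distro include directories or files given on the command line.

```python
import os


def candidate_option_files(environ=None):
    """List the default my.cnf locations named above: global, instance, user."""
    environ = os.environ if environ is None else environ
    paths = ["/etc/my.cnf", "/etc/mysql/my.cnf"]
    mysql_home = environ.get("MYSQL_HOME")
    if mysql_home:
        # Instance-specific file, only when MYSQL_HOME is defined.
        paths.append(os.path.join(mysql_home, "my.cnf"))
    home = environ.get("HOME")
    if home:
        # Per-user file.
        paths.append(os.path.join(home, ".my.cnf"))
    return paths


for path in candidate_option_files():
    marker = "found" if os.path.isfile(path) else "missing"
    print(f"{marker:8} {path}")
```

Run it as the same operating system user that starts MySQL, since both MYSQL_HOME and HOME depend on whose environment you are in.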

Distro specific locations

Ubuntu, for example, also provides the ability to add options via an include directory (its /etc/mysql/my.cnf contains an !includedir directive pointing at /etc/mysql/conf.d/).

Specifying a configuration at runtime

While you may have these default files, you may elect to start mysql with a specific configuration file as specified by --defaults-file. This option overrides all global/instance/user locations and uses just this configuration file. You can also specify additional configuration that supplements rather than overrides the defaults with --defaults-extra-file.

What files are on my system?

Again, assuming the default names, you can perform a brute-force check with:

$ sudo find / -name "*my*cnf"

This is actually worthwhile, especially if you find a /root/.my.cnf file, which holds default MySQL settings for the operating system ‘root’ user.

MySQL recommendations

MySQL by default provides a number of recommended files; however, these are generally outdated, especially for newer hardware. These files include my-huge.cnf, my-large.cnf, my-medium.cnf, my-small.cnf and my-innodb-heavy-4G.cnf. Don’t assume that replacing your configuration with one of these files will make your system perform better.

MySQL made some attempt to correct these, and at least some very poor defaults, with MySQL 5.4; however, I am unsure what’s in MySQL 5.5.

MySQL Configuration at runtime

While several commands can help with identifying your configuration files, printing defaults, etc., it’s also possible to change your configuration at runtime. Such changes may not be reflected in your configuration files, which creates an additional mismatch.




Data Comparison Methods Overview

March 9th, 2010

Data comparison is a difficult and resource-intensive process. For convenience, it can be divided into several steps.
First, you choose the tables from a database on one server to compare with tables from a database on the other server. You also choose the columns to compare and a column that will serve as the comparison key.
The next step is to select either all data from these tables or some specified part of it.
The third and most important step is the comparison of the two tables by the selected comparison key. During this process the status of each record is set to “only in source”, “only in target”, “different”, or “equal”.
The final steps are selecting records for synchronization and the synchronization itself: the records to synchronize are chosen, an update script is created, and then the script is executed.
You can read a detailed description of the comparison process here.

Now let’s look at the third step (data comparison) in detail.

There are several methods of data comparison, which differ mainly in where the comparison is performed: on the server side or on the client PC.

Data comparison on the server side is performed using the resources of the server.
The comparison algorithm is the following:
1. A checksum is calculated for each record of each of the two tables;
2. The checksum of every record from one table is compared with the checksum of the corresponding record from the other table, and the records are judged equal or different accordingly;
3. The comparison result is stored in a temporary table on the server.
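
The three steps can be mirrored in a short Python sketch. On a real server this work would be done in SQL; the code below only illustrates the logic. The choice of MD5, the `repr`-based serialization and the names `row_checksum` and `compare_by_checksum` are assumptions made for the example, not details of any particular product.

```python
import hashlib


def row_checksum(row):
    """Checksum of one record (MD5 over the serialized column values)."""
    # repr() keeps None and the empty string distinct before hashing.
    joined = "\x1f".join(repr(value) for value in row)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def compare_by_checksum(source, target):
    """Compare two tables given as {comparison_key: row}; returns {key: status}."""
    result = {}
    for key in source.keys() | target.keys():
        if key not in target:
            result[key] = "only in source"
        elif key not in source:
            result[key] = "only in target"
        elif row_checksum(source[key]) == row_checksum(target[key]):
            result[key] = "equal"
        else:
            result[key] = "different"
    return result


source = {1: (1, "alice"), 2: (2, "bob"), 3: (3, "carol")}
target = {1: (1, "alice"), 2: (2, "bobby"), 4: (4, "dave")}
statuses = compare_by_checksum(source, target)
for key in sorted(statuses):
    print(key, statuses[key])
```

Note that the "equal" verdict is only as good as the checksum: two rows are declared equal because their checksums match, not because their values were compared.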

Performance indicators:
1. The speed of data comparison depends directly on the server’s capacity and load;
2. The maximum size of a database that can be compared is limited by the resources of the server itself.

Advantages:
1. There is no need to transfer large amounts of data over the network to the client PC for comparison, which saves network traffic;
2. The speed of comparison does not depend on the client PC’s resources;
3. Blob data of any size can be compared.

Disadvantages:
1. Because of how record checksums are calculated, different data can in some cases produce the same checksum, so a record receives the “equal” status instead of the expected “different”;
2. There is no flexibility in the use of synchronization and comparison options;
3. It is not possible to view the differences between records or to exclude some records from synchronization manually;
4. Creating the synchronization script still requires transferring data from the server to the client;
5. Calculating checksums for a large number of records can consume all server resources;
6. Extra space must be provided on the server to store the comparison results in the temporary table.

As we can see, this method has more disadvantages than advantages, which is why it is rarely used.

Data comparison on the client PC is performed using the client machine’s resources; the server only provides the data. This method can in turn be divided into several variants, depending on how the comparison information is stored.

Comparing data on the local PC when the comparison result is stored in RAM.
The comparison algorithm is the following:
1. The server passes all data from both tables to the local PC;
2. Every record of every table is placed in RAM and compared without checksum calculation;
3. If a record gets the “only in source”, “only in target” or “equal” status, only its comparison key is kept in RAM. If records get the “different” status, they are kept in RAM in full.
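
Here is a minimal Python sketch of that storage rule. It assumes both tables have already been fetched as dictionaries keyed by the comparison key; `compare_in_memory` is a hypothetical name, and the sketch does not reproduce how dbForge implements it.

```python
def compare_in_memory(source, target):
    """Direct (no checksum) comparison of two tables given as {key: row}.

    Keeps only the key for "only in source", "only in target" and "equal"
    records, and the full pair of rows for "different" records.
    """
    keys_only = {"only in source": [], "only in target": [], "equal": []}
    different = {}
    for key in sorted(source.keys() | target.keys()):
        if key not in target:
            keys_only["only in source"].append(key)
        elif key not in source:
            keys_only["only in target"].append(key)
        elif source[key] == target[key]:
            keys_only["equal"].append(key)
        else:
            different[key] = (source[key], target[key])
    return keys_only, different


source = {1: (1, "alice"), 2: (2, "bob"), 3: (3, "carol")}
target = {1: (1, "alice"), 2: (2, "bobby"), 4: (4, "dave")}
keys_only, different = compare_in_memory(source, target)
print(keys_only)
print(different)
```

Only `different` grows with the size of the rows, which is why memory use in this approach depends on how much the two databases differ.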

Performance indicators:
1. The speed of data comparison depends directly on the client PC’s resources and on the speed of data transfer over the network;
2. The maximum size of a database that can be compared is limited by the amount of RAM on the client PC. It also depends on how much the compared databases differ: the fewer differing records there are, the larger the databases that can be compared.

Advantages:
1. Minimal server load: the server performs only simple data selection;
2. The simplest comparison algorithm, because records are sorted on the client side;
3. Flexibility in the use of comparison options;
4. Minimal size of the comparison data store;
5. The status of every record is always correct, whatever the data.

Disadvantages:
1. An extra data selection is needed to view records with the “only in source”, “only in target”, or “equal” status;
2. An extra data selection is needed to create a synchronization script;
3. An OutOfMemory exception may be raised when the databases contain a lot of differing data;
4. Blob data can be compared only up to the amount of free RAM.

This method is implemented in dbForge Data Compare for SQL Server v1.10 and dbForge Data Compare for MySQL v2.00. It allows comparing databases of any size, provided that their data does not differ much.

Comparing data on the local PC when the comparison result is stored as a cached file on disk.
The comparison algorithm is the following:
The server passes all data from both tables, sorted by comparison key, to the local PC. Data is read byte by byte, compared without checksum calculation, and written to a file on disk.
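
Because both inputs arrive sorted by comparison key, the comparison can be done as a streaming merge that holds only one record from each side in memory. The Python sketch below illustrates the idea under stated assumptions: `merge_compare` and `write_results` are hypothetical names, the CSV layout of the cache file is invented for the example, and only the status is written, whereas a real tool would also cache the record data.

```python
import csv
import os
import tempfile


def merge_compare(source_rows, target_rows):
    """Yield (key, status) from two iterables of (key, row) sorted by key."""
    src, tgt = iter(source_rows), iter(target_rows)
    s, t = next(src, None), next(tgt, None)
    while s is not None or t is not None:
        if t is None or (s is not None and s[0] < t[0]):
            yield s[0], "only in source"
            s = next(src, None)
        elif s is None or t[0] < s[0]:
            yield t[0], "only in target"
            t = next(tgt, None)
        else:
            # Same key on both sides: compare the rows directly.
            yield s[0], "equal" if s[1] == t[1] else "different"
            s, t = next(src, None), next(tgt, None)


def write_results(results, path):
    """Stream comparison results to a file on disk instead of keeping them in RAM."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        for key, status in results:
            writer.writerow([key, status])


source = [(1, "a"), (2, "b"), (4, "d")]
target = [(1, "a"), (2, "B"), (3, "c")]
with tempfile.TemporaryDirectory() as workdir:
    cache_path = os.path.join(workdir, "compare_cache.csv")
    write_results(merge_compare(source, target), cache_path)
    with open(cache_path, newline="") as handle:
        cached = list(csv.reader(handle))
print(cached)
```

The merge is only correct if the client's `<` comparison agrees with the order in which the server sorted the keys. For integer keys that is trivial; for string keys it depends on the server's collation, which is one way to read the difficulty with string comparison keys noted under the disadvantages.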

Performance indicators:
1. The speed of data comparison depends directly on the client PC’s resources and on the speed of data transfer over the network;
2. The maximum size of a database that can be compared is limited by free disk space and does not depend on how much the data in the databases differs.

Advantages:
1. Medium server load: the server performs data sorting and selection;
2. No extra requests to the server are needed to view records or to create the synchronization script;
3. The status of every record is always correct, whatever the data;
4. Blob data can be compared up to the amount of free disk space.

Disadvantages:
1. A complex comparison algorithm for records whose comparison key is of a string data type;
2. A complex algorithm for building the disk cache that stores temporary information.

We can see that the only disadvantage of this method is the difficulty of implementation, and that it has more advantages than the methods described above. That’s why it will be used in the new versions: dbForge Data Compare for SQL Server v2.00 and dbForge Data Compare for MySQL v3.00.



SQLyog MySQL GUI 8.3 Has Been Released

March 9th, 2010

Changes (as compared to 8.22) include:

Features:
* Added an option to define a ‘color code’ for a connection. The color will be used as background color in the Object Browser.
* A Query Builder session can now be saved and resumed.
* In Query Builder a table alias can be defined for any table by double-clicking the title bar of the table symbol.
* In the RESULT tab, results can now be retrieved page-wise. This is ON by default in this build, with a defined LIMIT of 1000 rows. The user can change the setting for a specific query, and for that query the setting persists across sessions. Also read the ‘miscellaneous’ paragraph below.
* Added a context menu to Query Builder canvas.

Bug Fixes:
* Deleting a user would leave non-global privileges orphaned in the ‘mysql’ database. Now we use DROP USER syntax if the server supports it.
* Also, using the EDIT USER dialogue to change the host or user specifier for a user would not move non-global privileges. We have split the old ALTER USER dialogue into two: an EDIT USER and a RENAME USER dialogue. The latter will use RENAME USER syntax if the server supports it.
* On Wine, Data Sync could generate a malformed XML string, which would cause Data Sync to abort.
* Fixed an issue where SSH-tunneling failed with public/private key authentication. Technically the fix is in the PLINK binary shipped with SQLyog.
* SJA failed to send notification mails if Yahoo SMTP servers were used. Note that the fix disables the encryption option with Yahoo SMTP servers – but it won’t work anyway due to a non-standard SMTP implementation server-side.
* When importing data from a Universe ODBC-source, string data could be truncated.
* The fix in 8.22 for the issue that the horizontal scrollbar in GRID would sometimes not appear was not complete. It could still happen.
* SQLyog will now trim trailing whitespace in the Connection Manager and Create object dialogs to avoid MySQL errors.
* Opening a file from ‘recent files’ list could crash SQLyog if a Query Builder or Schema Designer tab was selected and the file specified was not a valid XML file for that tab. This bug was introduced in beta 1.
* When calling a Stored Procedure with more than one SELECT statement from ‘Notifications Services’ only one result set was sent by mail.
* The sja.log file had no line-breaks between what was recorded for two jobs.
* On multi-monitor systems, resizable dialogues could open on the wrong monitor. The new implementation works like this: on multi-monitor systems the main program dialogue and a ‘first child dialogue’ (example: ALTER TABLE) will open where they were closed (if possible); second and higher child dialogues (example: table advanced properties) will always open on top of their ‘parent’ dialogue. Non-resizable dialogues (such as confirmation boxes) will always open on top of their ‘parent’.
* With multiple SSH-tunnelled connections open, stopping and re-executing queries in multiple connections in quick succession could crash SQLyog.
* If more than one comment occurred before a SELECT statement in the editor, the statement was not identified as a SELECT statement by the Query Profiler and the Query Profiler TAB would not display.
* We did not validate client-side whether the user had checked the autoincrement option for a bit column in the Create/Alter table dialog.
* If an error occurred while renaming a trigger, the trigger was lost because SQLyog did not recreate it.
* Small GUI fixes.

Miscellaneous:
* The default LIMIT setting for the DATA tab has been removed. The setting is not required since we introduced table-level persistence for the number of rows displayed. The default for new tables that have not been opened before is 50, but when the user changes the value and next refreshes, SQLyog will save the LIMIT for that particular table persistently across sessions. This, in combination with page-wise display in the RESULT tab, results in a more uniform user interface for the DATA and RESULT tabs.

Downloads: http://webyog.com/en/downloads.php
Purchase: http://webyog.com/en/buy.php



Tip: faster than TRUNCATE

March 9th, 2010

TRUNCATE is usually a fast operation (much faster than DELETE FROM). But sometimes it just hangs; I’ve had several such unpleasant incidents with InnoDB (Plugin) tables which were being written to heavily. The TRUNCATE hung; nothing else would work; minutes passed.

TRUNCATE on tables with no FOREIGN KEYs should be fast: it translates to dropping the table and creating a new one (though it all depends on the MySQL version; see the manual).

What’s faster than TRUNCATE, then? If you have neither triggers nor FOREIGN KEYs, a RENAME TABLE can come to the rescue. Instead of:

TRUNCATE log_table;

Do:

CREATE TABLE log_table_new LIKE log_table;
RENAME TABLE log_table TO log_table_old, log_table_new TO log_table;
DROP TABLE log_table_old;

I found this to work well for me. Do note that AUTO_INCREMENT values can be tricky here: the “new” table is created with an AUTO_INCREMENT value that the still-active “working” table uses up right away. If you care about not reusing the same AUTO_INCREMENT values, you can:

ALTER TABLE log_table_new AUTO_INCREMENT=some high enough value;

Just before renaming.
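
If you do this for more than one table, the sequence is easy to generate. The Python sketch below only builds the statements as strings, in the order described above; it does not connect to MySQL. The helper names and the `_new`/`_old` suffixes are assumptions for the example, and you would pick the AUTO_INCREMENT value yourself.

```python
def quote_identifier(name):
    """Quote a MySQL identifier with backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def rename_swap_statements(table, auto_increment=None):
    """Build the CREATE / (ALTER) / RENAME / DROP sequence for one table."""
    current = quote_identifier(table)
    new = quote_identifier(table + "_new")
    old = quote_identifier(table + "_old")
    statements = [f"CREATE TABLE {new} LIKE {current}"]
    if auto_increment is not None:
        # Optional step: start the new table above values already handed out.
        statements.append(f"ALTER TABLE {new} AUTO_INCREMENT={int(auto_increment)}")
    statements.append(f"RENAME TABLE {current} TO {old}, {new} TO {current}")
    statements.append(f"DROP TABLE {old}")
    return statements


for statement in rename_swap_statements("log_table", auto_increment=1000000):
    print(statement + ";")
```

The same caveat applies as for the hand-written version: this is only suitable for tables without triggers or FOREIGN KEYs.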

I do not have a good explanation for why RENAME TABLE manages to respond faster than TRUNCATE.



Speaking at the O’Reilly MySQL Conference & Expo: "A look into a MySQL DBA’s toolchest"

March 9th, 2010


O'Reilly MySQL Conference & Expo 2010
I'm happy to announce that my talk "Making MySQL administration a breeze - a look into a MySQL DBA's toolchest" has been accepted for this year's edition of the MySQL Conference & Expo in Santa Clara, which will take place on April 12-15, 2010. The session is currently scheduled for Wednesday 14th, 10:50 in Ballroom E.

My plan is to provide an overview of the most popular utilities and applications that a MySQL DBA should be aware of to make his life easier. The focus will be on Linux/Unix applications available under open source licenses that ease tasks related to user administration, setting up and administering replication setups, performing backups and security audits.

Of course I will cover the usual suspects (e.g. Maatkit); some of these are actually collections of different utilities themselves. As it's impossible to go over each individual component in the given time frame, I will try to pick out the most popular/useful parts related to the scopes mentioned above. But I will also cover some lesser known gems that might be worth taking a look at. What's the most valued tool in your toolchest? I am still looking for more inspiration.

I look forward to being at the conference again and meeting with colleagues and friends in the MySQL community. Judging from the current schedule, it will be a very interesting mix of talks.

If you're interested in attending, you should consider registering soon! The early registration ends on March 15th. Until then, I encourage you to make use of this "Friend of Speaker" discount code (25% off): mys10fsp



Drizzle BoF at the MySQL Conference and Expo

March 9th, 2010

At the 2010 O’Reilly MySQL Conference and Expo there will be a Drizzle BoF!

It’s currently scheduled for 7pm on April 13th.

Come along, it will be awesome.



Speaking At The MySQL Users Conference

March 9th, 2010
My proposal has been accepted, yay!

I'll be speaking on a topic that I feel passionate about: MySQL Server Diagnostics Beyond Monitoring. MySQL has limitations when it comes to monitoring and diagnosing, as has been widely documented in several blogs.

My goal is to share my experience from the last few years and, hopefully, learn from what others have done. If you have a pressing issue, feel free to comment on this blog and I'll do my best to include the case in my talk and/or post a reply if time allows.

I will also be discussing my future plans on sarsql. I've been silent about this utility mostly because I've been implementing it actively at work. I'll post a road map shortly based on my latest experience.

I'm excited about meeting many old friends (most of them now fellow MySQL alumni) and making new ones. I hope to see you there!


An SQL Puzzle?

March 9th, 2010

Dear Lazy Web,

What should the result of the SELECT below be? Assume InnoDB for all storage engines.

CREATE TABLE t1 (a int, b int);
insert into t1 values (1,1),(1,2);
CREATE TEMPORARY TABLE t2 (a int, b int, primary key (a));
BEGIN;
INSERT INTO t2 values(100,100);
CREATE TEMPORARY TABLE IF NOT EXISTS t2 (primary key (a)) select * from t1;

# The above statement will correctly produce an ERROR 23000: Duplicate entry '1' for key 'PRIMARY'
# What should the below result be?

SELECT * from t2;
COMMIT;



Qsh.pl: distributed query tool

March 8th, 2010
I've written quite a few tools over time to connect to many mysql servers and run queries. Most of these have been pretty specific to a small set of tasks, such as running an alter across many servers. Any sysadmin who is in charge of many servers is probably familiar with dsh, and as I was using it recently I realized how all those specific tools I've written for mysql could be generalized into a dsh-like tool. Thus, Qsh.pl was born! (download at launchpad)

Usage should be familiar to anyone who has used dsh before; it will even read group files made for dsh in /etc/dsh/group/ or /usr/local/etc/group/.

Here's an example where this tool was quite useful. I was getting a query error for SHOW GLOBAL STATUS. This was a curious result since we're running mysql 5.0 everywhere. So what better way to find out which machines are complaining than just run it everywhere:

# qsh.pl -Mcg all_servers --user root --ask-pass --db=test -e 'SHOW GLOBAL STATUS' 2>error.log
{snip ... lots of output}
Done. Total time 2.919

My group file for all_servers includes 120 mysql servers; executing that query on all of them took a total of 2.9 seconds. Not bad. I also redirected stderr to a file, so any query errors are easy to find:
# cat error.log
myserver1: Query Error (1064) You have an error in your SQL syntax ...
myserver2: Query Error (1064) You have an error in your SQL syntax ...

OK, we found all the servers that return an error. Why do they complain?
# qsh.pl -Mcm myserver1,myserver2,myserver3 --user root --ask-pass --db=test -e 'SELECT VERSION()'
Password:
myserver1: +-----------------+
myserver1: | VERSION() |
myserver1: +-----------------+
myserver1: | 4.1.21-standard |
myserver1: +-----------------+
myserver3: +----------------------------+
myserver3: | VERSION() |
myserver3: +----------------------------+
myserver3: | 5.0.66a-enterprise-gpl-log |
myserver3: +----------------------------+
myserver2: +-----------------+
myserver2: | VERSION() |
myserver2: +-----------------+
myserver2: | 4.1.21-standard |
myserver2: +-----------------+
Done. Total time 0.063
Ooops! That's right, still a few old versions for legacy reasons.

That's just one example of how I used it. There are probably lots of use cases out there, but since it's new I'm still learning to rely on it. It certainly makes things faster when I can think about querying many servers at once, and is a more efficient way to work when dealing with many machines. It might be useful for:

+ comparing explain plans between many machines
+ altering large tables across many slaves before promoting one to master
+ grabbing status output from many machines to feed into awk or sed
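
For readers who want to see the general idea without the tool itself, here is a minimal Python sketch of the fan-out pattern: run the same query on many hosts in parallel and keep results and errors apart, keyed by host. It is not how Qsh.pl is implemented. `run_everywhere` is a hypothetical helper, and `fake_version_query` is a stand-in for a real database call that simply echoes versions from the example above.

```python
from concurrent.futures import ThreadPoolExecutor


def run_everywhere(hosts, run_query, max_workers=32):
    """Run run_query(host) on every host in parallel.

    Returns (results, errors): two dicts keyed by host name.
    """
    results, errors = {}, {}

    def attempt(host):
        try:
            return host, run_query(host), None
        except Exception as exc:  # keep going when one server fails
            return host, None, exc

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for host, value, exc in pool.map(attempt, hosts):
            if exc is None:
                results[host] = value
            else:
                errors[host] = exc
    return results, errors


# Stand-in data: a real run_query would connect to the host and run SQL.
FAKE_VERSIONS = {
    "myserver1": "4.1.21-standard",
    "myserver3": "5.0.66a-enterprise-gpl-log",
}


def fake_version_query(host):
    return FAKE_VERSIONS[host]  # raises KeyError for a host we cannot "reach"


results, errors = run_everywhere(
    ["myserver1", "myserver3", "myserver9"], fake_version_query
)
for host in sorted(results):
    print(f"{host}: {results[host]}")
for host in sorted(errors):
    print(f"{host}: Query Error {errors[host]!r}")
```

Keeping errors in their own dictionary plays the same role as redirecting stderr to error.log: one failing server does not stop the run, and the failures are easy to list afterwards.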
