Archive for the ‘SQL’ Category

The MySQL init-script mess

Апрель 25th, 2012

I don’t think there is a single good-quality MySQL init script for a Unix-like operating system. On Windows, there is the service facility, and I used to write Windows services. Based on that, I believe Windows has a pretty good claim to better reliability for start/stop services with MySQL.

What’s wrong with the init scripts? Well, let me count the reasons! Wait, I’m out of fingers and toes.

I’ll just mention the two annoying ones that I’ve run into most recently. Both are on Debian, though there is nothing especially broken about Debian’s init scripts. The first one comes from parsing my.cnf wrong and not treating pid-file and pid_file identically. The server treats them identically, thus, so should any other program that reads the my.cnf file (there’s this program called my_print_defaults… use it!). The second bug is because Debian uses two configuration files for start/stop services: the init script reads /etc/mysql/debian.cnf for no discernable reason. (I guess they never heard of using [sections] in the /etc/mysql/my.cnf file, or just reading the [mysqld] section.) So if you configure your server to place its socket in a non-default location, you have to redundantly update /etc/mysql/debian.cnf too, or the init script will fail. Duplication of configuration parameters is just stupid, period.

These are fairly mundane bugs. I’ve seen literally dozens more. Part of the problem is that each distribution that packages up and redistributes MySQL tends to ship with their own init script, instead of reusing the official scripts provided by MySQL. Understandable, because mysqld_safe is generic and doesn’t really integrate well with any system’s init facilities. But man, do they reinvent a bunch of lovely bugs, mostly related to things like parsing the .cnf files, handling pid files, handling sockets, special user accounts, braindead look-before-you-leap patterns of pinging before actually doing a task, stupid timeouts, wrong handling of log files and log file rotation, dumb hacks with syslog, failing to check for real evidence of a running process (you can’t trust what a cache file on disk says!), adding facepalm-worthy CHECK TABLES automatically on every table on server startup, and on and on.

The official mysqld_safe script tends to be a little less broken, in my experience, but still has many unlovely behaviors and missing features that I’d consider to be bugs.

I haven’t even mentioned the “manage multiple instances” scripts yet. Boy, do those have a ton of bugs. They do stupid things like grepping configuration files for strings that may or may not be in the configuration files. I remember one emergency case where MySQL couldn’t be started on a box because the string “mysql_multi” didn’t exist in a my.cnf file clearly designed for multiple instances to run. I added a comment to the effect of “# This comment is necessary for mysql_multi to work” and the problem was solved. A sane script would actually check for multiple instance definitions, not for some arbitrary string of characters. Anyway, this is just one tiny example, I don’t mean to dwell on it.

What happens when you have a bad init script? All kinds of things. You can’t shut down the server gracefully, so if you shut down the system, you hard-crash MySQL eventually, and good luck getting replication back after that in most cases. You can’t start the server correctly, or it reports the wrong thing and then tries to start several instances, and the second one borks the first one’s pid file and/or socket, causing the aforementioned shutdown problem or worse. And on it goes.

My principle is usually “don’t complain, do something about it.” But there’s a problem, in this case: writing a good init script is actually a significantly complex software engineering project. It is NOT “just a script.” (Insert my usual rant about the need for an actual test suite.) And that is not something I am working on at the moment, nor has it ever become my priority for the last several years. So in this case I’m complaining, because the writing on the wall says that I am probably never going to work on this, and I’d at least like there to be some visibility about what a serious problem this is.

Distribution maintainers could probably improve the situation significantly by taking a look at each other’s bug reports. If everyone solved the same bugs everyone else has solved (and don’t forget bugs in mysqld_safe, too) that would be a big step forward.

Further Reading:


PlanetMySQL Voting: Vote UP / Vote DOWN

A post-conference tradition

Апрель 19th, 2012

It’s nice that this happened again this year:

The question I ask myself every year is, When am I going to get time to study all of these great blog posts?

Further Reading:


PlanetMySQL Voting: Vote UP / Vote DOWN

Percona Toolkit gripes welcome

Апрель 18th, 2012
What are your pet peeves about the tools? I have a lot. For example, pt-table-sync doesn’t let me do things I want it to do sometimes (and I know it could be made to do them). Another example: pt-query-digest’s tcpdump parser doesn’t pay attention to TCP sequence numbers, so sometimes it invents a really long-running query where there isn’t one. Post your gripes, big and small, in the comments! (This blog post is an attempt to do a 37signals approach to prioritizing: the stuff people complain about repeatedly is what you should focus on. No promises about fixing the gripes — I just want to hear them!)

Further Reading:


PlanetMySQL Voting: Vote UP / Vote DOWN

Another great MySQL 5.6 milestone release

Апрель 11th, 2012

Oracle has released another huge batch of improvements to MySQL slated for MySQL 5.6. The pace of innovation in MySQL is accelerating rapidly and the improvements are impressive. I have not read all of the blog posts thoroughly yet, but I’ve skimmed and it looks fantastic.

The link posted above is ONE of the sets of improvements: there are also a bunch of other posts about new improvements to the query optimizer, InnoDB, and probably more. But I don’t see a single table-of-contents blog post for those that I can link to. Hence I recommend that you look at Planet MySQL to find these posts.

Further Reading:


PlanetMySQL Voting: Vote UP / Vote DOWN

Another great MySQL 5.6 milestone release

Апрель 11th, 2012

Oracle has released another huge batch of improvements to MySQL slated for MySQL 5.6. The pace of innovation in MySQL is accelerating rapidly and the improvements are impressive. I have not read all of the blog posts thoroughly yet, but I’ve skimmed and it looks fantastic.

The link posted above is ONE of the sets of improvements: there are also a bunch of other posts about new improvements to the query optimizer, InnoDB, and probably more. But I don’t see a single table-of-contents blog post for those that I can link to. Hence I recommend that you look at Planet MySQL to find these posts.

Further Reading:


PlanetMySQL Voting: Vote UP / Vote DOWN

Automated, integrated sharding — the new killer database feature

Апрель 10th, 2012

MySQL became wildly successful in part because it had built-in, simple replication. Sure, it had lots of interesting failure scenarios and was not great at first — it is much better these days — but it was nevertheless successful because there was a single, out-of-the-box, not-very-complex way to do replication. I have opined many times before that this was one of the killer features missing from PostgreSQL. I think that can large explain why MySQL became more popular more quickly.

The new killer feature is automatic sharding, in my opinion. If you’re not accustomed to the word, “sharding” means partitioning of a large dataset across many servers.

It is easy to poke fun at MongoDB’s current limitations, but for all that, it has a story to tell about sharding. There is One Right Way To Do It in MongoDB, and it’s a part of the product.

I don’t see sharding being added into the core of MySQL itself, but there are some very interesting efforts headed towards MySQL. There are at least the following companies providing sharding via a proxy or middleware solution, with a lot of other features also available in some products:

In addition, there are community-based efforts, such as Shard-Query and the Spider storage engine. And there’s MySQL (NDB) Cluster, and commercial rip-out-and-plug-in replacements for MySQL such as Clustrix.

Am I missing any? I probably am. You can see and talk to many of these companies at this week’s MySQL conference, by the way.

Further Reading:


PlanetMySQL Voting: Vote UP / Vote DOWN

Sessions I want to see at MySQL conference 2012

Апрель 9th, 2012

In case you live under a rock, the MySQL conference starts on Tuesday. There are 8 concurrent tracks of content. Aside from my own talks, here are the sessions I would most like to see.

On Tuesday:

  • Tutorial: Innodb and XtraDB Architecture and Performance Optimization. Peter’s talks are always great, and this is a perennial favorite. If you have never been, you should go to this one. You will realize how little you really know about InnoDB/XtraDB, and how much it matters.
  • Tutorial: Linux and H/W optimizations for MySQL. Last year I tweeted that Yoshinori’s tutorial was the 3rd edition of High Performance MySQL, and I really wasn’t exaggerating that much.
  • BoF session: Percona XtraDB Cluster. I believe that this is one of the few truly groundbreaking changes to MySQL — right up there with “MySQL has transactions now” and “MySQL adds replication.”

On Wednesday:

On Thursday:

Further Reading:


PlanetMySQL Voting: Vote UP / Vote DOWN

Can COUNT(*) be used in MySQL on InnoDB tables?

Апрель 2nd, 2012
COUNT() function returns a number of rows returned by a query. In a popular opinion COUNT(*) should not be used on InnoDB tables, but this is only half true. If a query performs filtering on any column, there is no relevant difference in how COUNT(*) will be executed regardless of the storage engine. In any [...]
PlanetMySQL Voting: Vote UP / Vote DOWN

Progress on High Performance MySQL, 4th Edition

Апрель 1st, 2012

With the 3rd edition of High Performance MySQL finally complete, I’ve begun work on the 4th edition. As you know, technology moves much faster than printing presses, and a book is outdated very quickly, so this is a never-ending project. I’m also outlining the 5th edition in anticipation of starting it immediately afterwards.

I’m looking for your input on what I should cover in the new edition. Should I discuss MySQL’s intra-query parallelization across multiple CPU cores? Should I explore how cloud computing platforms enable higher performance at lower cost than dedicated hardware? Should I explain the bizarre bug in MySQL’s datetime type that causes it to skip the day after March 31st every year? What are your suggestions?

Further Reading:


PlanetMySQL Voting: Vote UP / Vote DOWN

Progress on High Performance MySQL, 4th Edition

Апрель 1st, 2012

With the 3rd edition of High Performance MySQL finally complete, I’ve begun work on the 4th edition. As you know, technology moves much faster than printing presses, and a book is outdated very quickly, so this is a never-ending project. I’m also outlining the 5th edition in anticipation of starting it immediately afterwards.

I’m looking for your input on what I should cover in the new edition. Should I discuss MySQL’s intra-query parallelization across multiple CPU cores? Should I explore how cloud computing platforms enable higher performance at lower cost than dedicated hardware? Should I explain the bizarre bug in MySQL’s datetime type that causes it to skip the day after March 31st every year? What are your suggestions?

Further Reading:


PlanetMySQL Voting: Vote UP / Vote DOWN