Archive for the ‘SQL’ Category

Proper SQL table alias use conventions

Март 11th, 2010

After seeing quite some SQL statements over the years, something is bugging me: there is no consistent convention as for how to write an SQL query.

I’m going to leave formatting, upper/lower-case issues aside, and discuss a small part of the SQL syntax: table aliases. Looking at three different queries, I will describe what I find to be problematic table alias use.

Using the sakila database, take a look at the following queries:

Query #1

SELECT
 R.rental_date, C.customer_id, C.first_name, C.last_name
FROM
 rental R
 JOIN customer C USING (customer_id)
WHERE
 R.rental_date >= DATE('2005-10-01')
 AND C.store_id=1;

The above looks for film rentals done in a specific store (store #1), as of Oct. 1st, 2005.

Query #2

SELECT
 F.title, C.name
FROM
 film AS F
 JOIN film_category AS S ON (F.film_id = S.film_id)
 JOIN category AS C ON (S.category_id = C.category_id)
WHERE F.length > 180;

The above lists the title and category for all films longer than three hours.

Query #3

SELECT c.customer_id, c.last_name
FROM
  customer c
  INNER JOIN address a ON (c.address_id = a.address_id)
  INNER JOIN (
    SELECT
      c.city_id
    FROM
      city AS c
      JOIN country s ON (c.country_id = s.country_id)
    WHERE
      s.country LIKE 'F%'
  ) s1 USING (city_id)
WHERE
  create_date > DATE('2005-10-01');

The above lists customers created after Oct. 1st, 2005, and who live in countries starting with an ‘F’. The query could be solved without a subquery, but there’s a good reason why I made it so.

The problems

I used very different conventions on any one of the queries, and sometimes within each query. And it’s common that I see the same on a customer’s site, what with having many programmers do the SQL coding. Again, I will only discuss the table aliases conventions. I’ll leaver the rest to the reader.

Here’s where I see problems:

  • Query #1: In itself, it looks fine. Rental turns to R, Customer turns to C. I will comment on this slightly later on when I provide my full opinion.
  • Query #2: So film turns to F, category turns to C. What should film_category turn into? Out of letters? Let’s just go for S, shall we? But S has nothing do with film_category. Yet it’s so commonly seen.
  • Query #2: We’re using the AS keyword now. We didn’t use it before.
  • Queries #1, #2: Hold on. Wasn’t C taken for customer in Query #1? Now, in Query #2 it stands for category? I’m beginning to get confused.
  • Query #3: Now aliases are lower case; I was just getting used to them being upper case.
  • Query #3: But, hey, c is back to customer!
  • Query #3: Or, is it? Take a look at the subquery. Theres another c in there! This time it’s city! And it’s perfectly valid syntax. We actually have two identical aliases in the same query.
  • Query #3: If I could, I would name country with c as well. But I can’t. So why not throw in s again?
  • Query #3: and now I don’t even bother using the alias when accessing the create_date. Well, there’s no such column in any of the other tables!

Proper conventions

What I find so disturbing is that whenever I read a complex query, I need to go back and forth, back and forth between table aliases (found everywhere in the query) and their declaration point. Such irregularities make the queries difficult to read.

Any of the above issues could be justified. But I wish to make some suggestions:

  • Decide whether you’re going for upper or lower case.
  • Do not use the same alias twice in your query, even if it’s valid.
  • Aliases do not have to be single character. film_category may just as well be FC.
  • Do not alias something that is hard to interpret. s does not stand for country.
  • Think ahead: use same aliases throughout all your queries, as far as you can. If uniqueness is a problem, make for longer aliases. Use cust instead of c.

The above should make for more organized and readable SQL code. Remember: what one programmer finds as a very intuitive alias, is unintuitive to another!

My own convention

Simple: I only use aliases when using self joins. I am aware that queries are much longer what with long table names. I go farther than that: I prefer fully qualifying questionable columns throughout the query. Yes, it makes the query even longer.

I know this does not appeal to many. But there’s no confusion. And it’s easily searchable. And it’s consistent. And if properly formatted, as in the above queries, is well readable.

Now please join me in asking Oracle if they can add multi-line Strings for java, as there are for python.


PlanetMySQL Voting: Vote UP / Vote DOWN

Maatkit BoF session at the MySQL conference

Март 9th, 2010

I’ve submitted a Birds of a Feather session for Maatkit at the upcoming MySQL conference. It’s not on the public schedule yet, but it has been accepted and scheduled for 19:00 on 13 Apr 2010. See you there!

Related posts:

  1. Presentation uploaded for Maatkit talk at MySQL Conference The slides
  2. I’ll be speaking at the O’Reilly MySQL Conference 2010 I’m
  3. Learn about Maatkit at the MySQL Conference I’m

Related posts brought to you by Yet Another Related Posts Plugin.


PlanetMySQL Voting: Vote UP / Vote DOWN

Tip: faster than TRUNCATE

Март 9th, 2010

TRUNCATE is usually a fast operation (much faster than DELETE FROM). But sometimes it just hangs; I’ve has several such uncheerful events with InnoDB (Plugin) tables which were extensively written to. The TRUNCATE hanged; nothing else would work; minutes pass.

TRUNCATE on tables with no FOREIGN KEYs should act fast: it translate to dropping the table and creating a new one (and it all depends on the MySQL version, see the manual).

What’s faster than TRUNCATE, then? If you don’t have triggers nor FOREIGN KEYs, a RENAME TABLE can come to the rescue. Instead of:

TRUNCATE log_table

Do:

CREATE TABLE log_table_new LIKE log_table;
RENAME TABLE log_table TO log_table_old, log_table_new TO log_table;
DROP TABLE log_table_old;

I found this to work well for me. Do note that AUTO_INCREMENT values can be tricky here: the “new” table is created with an AUTO_INCREMENT value which is immediately taken in the “working” table. If you care about not using same AUTO_INCREMENT values, you can:

ALTER TABLE log_table_new AUTO_INCREMENT=some high enough value;

Just before renaming.

I do not have a good explanation as for why the RENAME TABLE succeeds to respond faster than TRUNCATE.


PlanetMySQL Voting: Vote UP / Vote DOWN

NoSQL doesn’t mean non-relational

Март 8th, 2010

It seems that a lot of people equate non-SQL databases with non-relational-ness, or malign the word relational. This is pretty much pure ignorance. If you’ve ever uttered a sentence that includes the phrase “…non-relational database…” then I have two suggestions for you.

  1. Study relational algebra. At a bare minimum, read the Wikipedia article on relational algebra. There is much more you could do — take a class on the topic, or read C.J. Date’s SQL and Relational Theory (my review). Ask yourself how similar SQL is to the relational algebra. How is relational algebra different from SELECT and GROUP BY? Is relational theory about relationships between data? What part do transactions play in relational algebra? Is MySQL a relational database? What about PostgreSQL, Oracle, or DB2?
  2. Now that you understand relational theory more, choose a database that you think is non-relational and write a formal proof that it is not relationally complete. Please do post a link to the proof in the comments.

The truth is, a non-relational database would be of very little use. In layman’s terms, it would mean you have some data that represents true statements, and a piece of software designed to answer questions using those facts, and you can’t answer simple first-order logic questions with the software. How is this an improvement? How is this useful?

Related posts:

  1. A review of SQL and Relational Theory by C. J. Date SQL and Re
  2. InnoDB is a NoSQL database As long as
  3. On the unhelpfulness of NoSQL My favorit

Related posts brought to you by Yet Another Related Posts Plugin.


PlanetMySQL Voting: Vote UP / Vote DOWN

A growing trend: InnoDB mutex contention

Март 4th, 2010

I’ve been noticing an undeniable trend in my consulting engagements in the last year or so, and when I vocalized this today, heads nodded all around me. Everyone sees a growth in the number of cases where otherwise well-optimized systems are artificially limited by InnoDB contention problems.

A year ago, I simply wasn’t seeing the need for analysis of GDB backtraces en masse. These days, I’m writing custom tools to gather and analyze backtraces. A year ago, I simply looked at the SEMAPHORE section of SHOW INNODB STATUS. These days I’m writing custom tools to aggregate and reformat that data so I can interpret it more easily. And I’m actually seeing cases of this type of problem multiple times every week. I remember the first time I ran into a server that was literally optimized to the limit, but struggling under the load. It was something new for me, not that long ago. Oh, I’d seen it before, plenty, but was always able to point out where something could be improved without changing InnoDB itself. Now it’s commonplace: schemas are fine — check. Queries are all well-indexed — check. Everything else — check. InnoDB is bottlenecked and absolutely nothing can be improved — check.

Part of the difference is the rapidly improving hardware. It’s getting hard to buy a server with fewer than 8 or even 16 cores, and 16GB of RAM feels like something I’d install in a wristwatch. But I also suspect that if I’d been characterizing the workload of servers over time in a way that was easy to compare, I’d see a clear trend towards bigger data and more queries per second. We’re just pushing MySQL + InnoDB harder today than we ever have before.

What can be done? Well, InnoDB needs to be improved, that’s all. Oracle, Percona, Google, Facebook and others are working on it, and in many cases these efforts have yielded dramatic results. But there is still much room for improvement.

Related posts:

  1. Hindsight on a scalable replacement for InnoDB A while ag
  2. What do you know about Oracle’s InnoDB+ storage engine? That
  3. Xtrabackup is for InnoDB tables too, not just XtraDB Just thoug

Related posts brought to you by Yet Another Related Posts Plugin.


PlanetMySQL Voting: Vote UP / Vote DOWN

Learn how to achieve PCI compliance with MySQL

Март 3rd, 2010

One of my colleagues, Ryan Lowe, has just heard that his session on PCI compliance with MySQL has been accepted at the upcoming MySQL conference. Ryan is highly qualified to present this topic, and not many people can say that; I certainly can’t claim that title myself. If you’re looking to learn how to make your MySQL installation PCI-compliant, there’s also not a lot of trustworthy information online. Personally — and really, no bias just because he’s my colleague — I think this is a great session for the MySQL conference, which I sometimes thought didn’t have enough diversity of topics in past years. We need more stuff like this to give people a reason to return after they’ve gone for 2 or 3 years in a row.

Related posts:

  1. Learn about Maatkit at the MySQL Conference I’m
  2. Learn about mk-query-digest at PgEast 2010 I’ll
  3. There will be an O’Reilly MySQL Conference in April 2010 O’Re

Related posts brought to you by Yet Another Related Posts Plugin.


PlanetMySQL Voting: Vote UP / Vote DOWN

Meet Xaprb at the training course in NYC this Friday

Февраль 24th, 2010

I’ll be helping Morgan Tocker deliver the second half of his training course for MySQL Developers/DBAs in New York City in a few days (more Percona training). It was a snap decision at the last minute, but I’m hoping I’ll still get to meet some folks there. If we’ve corresponded over email or blog comments and you would like to get together, ping me in the comments here!

If you’re in the New York City area and you use MySQL, you should consider attending this course, too. Morgan knows his stuff and has written a good curriculum. Attendees give his courses excellent feedback, and the price is very reasonable. Oh, and I’ll be there too, did I mention that? You can pick my brain, no extra charge. Bring organic free-trade chocolate to assist with extra-tough questions.

Related posts:

  1. Xaprb uses a hybrid of Journalist and deLight themes A while ag
  2. Recap of CPOSC 2009, plus slides Yesterday
  3. Sessions of interest at the Percona Performance Conference Having wri

Related posts brought to you by Yet Another Related Posts Plugin.


PlanetMySQL Voting: Vote UP / Vote DOWN

Charset support in MySQL is really not all that complex

Февраль 23rd, 2010

The headline is flame-bait, don’t take it. I just wanted to point something out about character sets and collations in MySQL.

To the uninitiated, it may seem overwhelming. Everything has a character set! Everything has a collation! And they act weirdly! The server has one. The database has one (oh, and it changes magically as I USE different databases.) Every table has one, and columns too. Is that all? NO! My connection has one! Kill me now!

Relax. In truth, only one kind of thing actually has a charset/collation. That is values. And values are stored in columns. The only thing that really has a charset/collation is a column.[1]

What about all the rest of those things — connection, database, server, table? Those are just defaults, which determine what charset/collation a value gets if it isn’t overridden. So if the table’s default charset is utf8, and you add a column without saying what it should be — why, it’ll be utf8. If the database’s default is latin1 and you add a table without saying what its default should be, it’ll be latin1.

[1] It’s not quite true. Literal values in SQL statements are values too, as are @user_variables. But if your connection’s charset is latin1 and you say SELECT "xaprb", without an explicit introducer, you’re really saying SELECT _latin1 "xaprb". Again, the only thing that really has a charset/collation is a value. Other things are just defaults.

Related posts:

  1. What data types does your innovative storage engine NOT support? I’ve
  2. The difference between a unique index and primary key in MySQL There
  3. Sanity-check features in MySQL MySQL has

Related posts brought to you by Yet Another Related Posts Plugin.


PlanetMySQL Voting: Vote UP / Vote DOWN

Cary Millsap: Thinking Clearly about Performance

Февраль 23rd, 2010

Cary Millsap has a concise, readable paper on performance. Anyone involved in database performance optimization should read it. Cary’s writing has heavily influenced my mk-query-digest tool for analyzing MySQL/PostgreSQL/Memcached/HTTP query performance, and I think you’ll get a lot more from mk-query-digest if you read this paper — and you should also read his book, reviewed here. It’s one of the top books on my Essential Books List.

Related posts:

  1. A review of Optimizing Oracle Performance by Cary Millsap Optimizing
  2. Sessions of interest at the Percona Performance Conference Having wri
  3. An ongoing thread of blogs on MySQL performance In the las

Related posts brought to you by Yet Another Related Posts Plugin.


PlanetMySQL Voting: Vote UP / Vote DOWN

I’ll be speaking at the O’Reilly MySQL Conference 2010

Февраль 20th, 2010

O'Reilly MySQL Conference & Expo 2010

I’m speaking at the O’Reilly MySQL Conference 2010. I hope I don’t lose my voice, because I have four sessions!

You can click through on the links above to learn more about each session. I’m also looking forward to the other sessions. Here’s a sample of a few that I have my eye on:

The schedule is far from complete, because the conference committee is still working on it, but the proposals to choose from are impressive. Stay tuned as more talks are approved and the schedule fills out, and don’t wait too long to register and book your flight! This is going to be a banner year.


Related posts:

  1. Submit your proposals for MySQL conference 2010 The MySQL
  2. There will be an O’Reilly MySQL Conference in April 2010 O’Re
  3. Speaking at EdUI Conference 2009 I’m

Related posts brought to you by Yet Another Related Posts Plugin.


PlanetMySQL Voting: Vote UP / Vote DOWN