Archive for the ‘ISO 9075’ Category

A small issue of SQL standards

Май 26th, 2010
From a functional perspective, the core SQL support in all major and minor RDBMS-es is reasonably similar. In this light, it's sometimes quite disturbing to find how some very basic things work so differently across different products. Consider this simple statement:

SELECT 'a' /* this is a comment */ 'b'
FROM onerow

What should the result be? (You can assume that onerow is an existing table that contains one row)

It turns out popular RDBMS-es mostly disagree with one another.

In Oracle XE, we get this:
SELECT  'a' /* comment */ 'b'
*
ERROR at line 1:
ORA-00923: FROM keyword not found where expected

PostgreSQL 8.4 also treats it as a syntax error, and is thus compatible with Oracle:
ERROR:  syntax error at or near "'b'"
LINE 1: SELECT 'a' /* this is a comment */ 'b'

In Microsoft SQL Server 2008 we get:
b
-
a

(1 rows affected)
As you can see, MS SQL treats the query as SELECT 'a' AS b FROM onerow.

Finally, in MySQL, we get:

+----+
| a |
+----+
| ab |
+----+
1 row in set (0.00 sec)
So in MySQL, its as if the comment isn't there at all, and as if the string literals 'a' and 'b' are actually just one string literal 'ab'.

So what does the SQL standard say? In my copy of the 2003 edition, I find this (ISO/IEC 9075-2:2003 (E) 5.3 <literal>, page 145):
Syntax Rules
1) In a <character string literal> or <national character string literal>, the sequence:
<quote> <character representation>... <quote> <separator> <quote> <character representation>... <quote>
is equivalent to the sequence
<quote> <character representation>... <character representation>... <quote>
If we lookup the definition of <separator>, it reads:
<separator> ::= { <comment> | <white space> }...
So in this case, MySQL does the "right" thing, and basically ignores the comment, treating 'a' and 'b' as a single string constant 'ab'.

If you try the same thing with a single line comment, all products mentioned react the same as with the initial query, except for PostgreSQL, which now treats the query according to the standard.

Now don't get me wrong. This post is not designed to bash or glorify any of the products mentioned. I think all of them are great in their own way. I am quite aware that although MySQL happens to adhere to the standard here, it violates it in other places. Finally, I should point out that I don't have a specific opinion on what the right behavior should be. I just want it to be the same on all platforms.

At the same time, I realize that for SQL it's probably too late - up to an extent, incompatibility is considered normal, and database professionals tend to be specialized in particular products anyway. So I'm not holding my breath for the grand unification of SQL dialects.

When I encountered this issue, I did have to think about that other rathole of incompatibilities I have to deal with professionally, which is web-browsers. An interesting development there is the HTML 5 specification, which actually defines an algorithm for parsing HTML - even invalid HTML. This is quite different from the approach taken by most standards, which typically define only an abstract grammar, but leave the implementation entirely up to the vendors. In theory, providing parsing instructions as detailed as done in HTML 5 should make it easier to create correct parsers, and hopefully this will contribute to a more robust web.

Anyway. That was all. Back to work...

PlanetMySQL Voting: Vote UP / Vote DOWN