Archive for the ‘bug’ Category

A tale of a bug…

Июль 22nd, 2010

So I sometimes get asked if we funnel back bug reports or patches back to MySQL from Drizzle. Also, MariaDB adds some interest here as they are a lot closer (and indeed compatible with) to MySQL. With Drizzle, we have deviated really quite heavily from the MySQL codebase. There are still some common areas, but they’re getting rarer (especially to just directly apply a patch).

Back in June 2009, while working on Drizzle at Sun, I found a bug that I knew would affect both. The patch would even directly apply (well… close, but I made one anyway).

So the typical process of me filing a MySQL bug these days is:

  • Stewart files bug
  • In the next window of Sveta being awake, it’s verified.

This happened within a really short time.

Unfortunately, what happens next isn’t nearly as awesome.

Namely, nothing. For a year.

So a year later, I filed it in launchpad for MariaDB.

So, MariaDB is gearing up for a release, it’s a relatively low priority bug (but it does have a working, correct and obvious patch), within 2 months, Monty applied it and improved the error checking around it.

So MariaDB bug 588599 is Fix Committed (June 2nd 2010 – July 20th 2010), MySQL Bug 45377 is still Verified (July 20th 2009 – ….).

(and yes, this tends to be a general pattern I find)

But Mark says he gets things through… so yay for him.2


PlanetMySQL Voting: Vote UP / Vote DOWN

Best Practices: Additional User Security

Июнь 3rd, 2010

By default MySQL allows you to create user accounts and privileges with no password. In my earlier MySQL Best Practices: User Security I describe how to address the default installation empty passwords.

For new user accounts, you can improve this default behavior using the SQL_MODE variable, with a value of NO_AUTO_CREATE_USER. As detailed via the 5.1 Reference Manual

NO_AUTO_CREATE_USER

Prevent the GRANT statement from automatically creating new users if it would otherwise do so, unless a nonempty password also is specified.

Having set this variable I attempted to show the error of operation to demonstrate in my upcoming “MySQL Idiosyncrasies that bite” presentation.

Confirm Settings

mysql> show global variables like 'sql_mode';
+---------------+---------------------+
| Variable_name | Value               |
+---------------+---------------------+
| sql_mode      | NO_AUTO_CREATE_USER |
+---------------+---------------------+
1 row in set (0.00 sec)

mysql> show session variables like 'sql_mode';
+---------------+---------------------+
| Variable_name | Value               |
+---------------+---------------------+
| sql_mode      | NO_AUTO_CREATE_USER |
+---------------+---------------------+
1 row in set (0.00 sec)

Create error condition

mysql> CREATE USER superuser@localhost;
Query OK, 0 rows affected (0.00 sec)
mysql> GRANT ALL ON *.* TO superuser@localhost;
Query OK, 0 rows affected (0.00 sec)
mysql> exit

What the? Surely this isn’t right.

$ mysql -usuperuser

mysql> SHOW GRANTS;
+--------------------------------------------------------+
| Grants for superuser@localhost                         |
+--------------------------------------------------------+
| GRANT ALL PRIVILEGES ON *.* TO 'superuser'@'localhost' |
+--------------------------------------------------------+

mysql> SELECT VERSION();
+-----------+
| VERSION() |
+-----------+
| 5.1.39    |
+-----------+

Well that’s broken functionality.

What should happen as described in Bug #43938 is a cryptic message as reproduced below.

mysql> GRANT SELECT ON foo.* TO 'geert12'@'localhost';
ERROR 1133 (42000): Can't find any matching row in the user table
mysql> GRANT SELECT ON *.* TO geert12@localhost IDENTIFIED BY 'foobar';
Query OK, 0 rows affected (0.00 sec)

It seems however that the user of CREATE USER first nullifies this expected behavior.


PlanetMySQL Voting: Vote UP / Vote DOWN

New CREATE TABLE performance record!

Июнь 3rd, 2010

4 min 20 sec

So next time somebody complains about NDB taking a long time in CREATE TABLE, you’re welcome to point them to this :)

  • A single CREATE TABLE statement
  • It had ONE column
  • It was an ENUM column.
  • With 70,000 possible values.
  • It was 605kb of SQL.
  • It ran on Drizzle

This was to test if you could create an ENUM column with greater than 216 possible values (you’re not supposed to be able to) – bug 589031 has been filed.

How does it compare to MySQL? Well… there are other problems (Bug 54194 – ENUM limit of 65535 elements isn’t true filed). Since we don’t have any limitations in Drizzle due to the FRM file format, we actually get to execute the CREATE TABLE statement.

Still, why did this take four and a half minutes? I luckily managed to run poor man’s profiler during query execution. I very easily found out that I had this thread constantly running check_duplicates_in_interval(), which does a stupid linear search for duplicates. It turns out, that for 70,000 items, this takes approximately four minutes and 19.5 seconds. Bug 589055 CREATE TABLE with ENUM fields with large elements takes forever (where forever is defined as a bit over four minutes) filed.

So I replaced check_duplicates_in_interval() with a implementation using a hash table (boost::unordered_set actually) as I wasn’t quite immediately in the mood for ripping out all of TYPELIB from the server. I can now run the CREATE TABLE statement in less than half a second.

So now, I can run my test case in much less time and indeed check for correct behaviour rather quickly.

I do have an urge to find out how big I can get a valid table definition file to though…. should be over 32MB…


PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL 5.5.4 looks awesome.

Апрель 16th, 2010

Been at the MySQL conference the last few days, and I have to say, I’m really blown away by MySQL 5.5.4’s improvements.  Last year I keynoted and I begged Oracle on stage to realize that MySQL and InnoDB under one roof represented opportunity.  It’s clear they heard the community – this is some serious progress, and right when we needed it.

Jeremy Zawodny’s blog post covers most of the stuff I’m really excited about, and there are some great detailed technical slides here and here, but I wanted to go into a little more detail on one important improvment:  We’ve been plagued by MySQL’s undo slot limits for an awfully long time.  Basically, you could have 512 INSERT transactions and 512 UPDATE transactions running at once, for a grand total of 1024.  If you use INSERT … ON DUPLICATE KEY UPDATE, though, it takes two of those spots, meaning you get 512 concurrent transactions.  On modern hardware, it’s trivially easy to hit this limit.

I’ve had an Enterprise support ticket open for years on the issue, there’s been a MySQL bug for a long time, and there was basically no movement.  In fact, I’d gotten so frustrated about this issue, I’d basically decided this year was our last year of Enterprise MySQL support.  It was one of the sole reasons we paid for support for the last few years – the promise that a fix was just around the corner.  I felt good about voting with my dollars, and contributing back to a core technology we depend on, but enough was enough.

Lo and behold, it’s fixed!  You can now have a whopping 128K transactions in flight.  Best of all, it’s far more performant than it used to be!  And craziest of all?  If you run 5.5.4 on a database, then roll back to some older release, the change still takes effect.  Backwards bug and performance fixing – that’s a new one on me.

THANK YOU ORACLE!

Shameless plug – we’re hiring. And it’s a blast.



PlanetMySQL Voting: Vote UP / Vote DOWN

Ubuntu Karmic’s Network Manager Issues

Январь 14th, 2010
Since Ubuntu 8.04 aka Hardy Heron, I've had issues with every new release. As Ubuntu evolves into being a viable desktop OS alternative, its complexity has been growing and with the new and improved looks new challenges arise. This bug in particular has been very difficult to diagnose and I can't imagine anyone without enough Linux experience to overcome it on their own, so I decided to summarize the steps I took to fix it ... and vent my frustration at the end.

The Symptom

I came across the issue for the first time while trying Ubuntu's Karmic Netbook remix. After overcoming the typical Broadcom wifi driver, Network Manager would connect, but Firefox would fail to load the web pages 90% of the time. Using ping in the command line worked just fine. Maybe I needed to update the software packages to get the latest patches, surprise, apt-get was having similar problems and timing out. So the problem was deep in the OS layer.

After a lot fiddling and some googling I found bug #417757:
[...] In Karmic, DNS lookups take a very long time with some routers, because glibc's DNS resolver tries to do IPv6 (AAAA) lookups even if there are no (non-loopback) IPv6 interfaces configured. Routers which do not repond to this cause the lookup to take 20 seconds (until the IPv6 query times out). [...]
These routers are common place in many households and most users are completely unaware that they have their own DNS servers, what IPv6 means or even how to update the router's firmware if needed.

The Solution(s)

Going through the comments in the bug I found several recommendations, some made more sense than others, but these are the 2 I used. Most regular users will feel comfortable with these steps. I haven't tried, but it might not be necessary to apply both.

Disable IPv6

You should apply this one especially if the networks to which you connect are not using IPv6 (most home and public networks). Otherwise, skip it. The solution is explained here. To edit the /etc/sysctl.conf file use:
sudo vi /etc/sysctl.conf
Replace vi with your editor of choice. Reboot before retrying the connection.

You can try the setting without changing your system configuration or restarting the machine using the following command:
sudo sysctrl -w net.ipv6.conf.all.disable_ipv6=1

Use OpenDNS or Google DNS Servers

If the previous solution isn't enough and/or you want to try these DNS servers instead of relying on your router or ISP's DNS servers (in many cases it'll improve the DNS lookup performance) edit your /etc/dhcp3/dhclient.conf file using the following command:
sudo vi /etc/dhcp3/dhclient.conf
Add the following lines after the line starting with #prepend:
# OpenDNS servers
prepend domain-name-servers 208.67.222.222, 208.67.220.220;
# Google DNS servers
prepend domain-name-servers 8.8.8.8, 8.8.4.4;
Or if you want to use the GUI, you can follow the instructions in How to setup Google Public DNS in Ubuntu 9.10 Karmic Koala, the instructions work with any of the IP addresses above. Once you apply these changes, restart your box and retry.

The Editorial

If you just read the article for the technical content, this is a good spot to stop. If you're interested in my rant, keep going.

It is well known that long term Linux users have been frustrated by the complications that have been popping up with video (in particular dual head setups), sound and networking in the releases post Ubuntu 8.04 (Hardy). The last 3 releases have improved the overall GUI usability a lot, but they have introduced a number of bugs and issues that make those improvements irrelevant. It's easy to find articles in the web about these issues and I've been hearing and reading about them at multiple Linux and MySQL forums.

Then there are comment like this one which miss the point completely:
[...] > You can't tell your grandmother to edit some config files because her internet is slow
Does your grandmother use Ubuntu then? If so, then just help her out in fixing the issue :) [...]
This goes against what bug #1 is trying to address: Wider Linux adoption.

My requests to the Ubuntu community are:
  1. Stop fiddling with the UI and start solving real usability problems. Without easy display, sound and network integration supporting widespread installed hardware, only the übergeeks are going to use Linux and bug #1 will still remain unsolved long after Ubuntu's Zippy Zebra release (I made up the name).
  2. This is another example where the Open Source community can be as bad as regular companies addressing real world needs. The OSS advantage is in the community members that, instead of spreading FUD and useless comments and articles, come up with proper suggestions and contributions. Unfortunately, sometimes it takes time to find them and until a given software package is forked, no real progress is made.
I promise that on my next article I'll write about a MySQL topic.

PlanetMySQL Voting: Vote UP / Vote DOWN

Will your production MySQL server survive a restart?

Август 30th, 2009
Do you know if your production MySQL servers will come back up when restarted? A recent support episode illustrates a number of best practices. The task looked trivial: Update a production MySQL server (replication master) with a configuration tuned and tested on a development server. Clean shutdown, change configuration, restart. Unfortunately, the MySQL daemon did not just ‘come back’, leaving 2 sites offline. Thus begins an illuminating debugging story.
First place to look is the daemon error log, which revealed that the server was segfaulting, seemingly at the end of or just after InnoDB recovery. Reverting to the previous configuration did not help, nor did changing the InnoDB recovery mode. Working with the client, we performed a failover to a replication slave, while I got a second opinion from a fellow engineer to work out what had gone wrong on the server.
Since debug symbols weren’t shown in the stack trace, we needed to generate a symbol file (binary was unstripped) to use with the resolve_stack_dump utility. The procedure for obtaining this is detailed in the MySQL manual. With a good stack trace in hand, we were able (with assistance from an old friend, thanks Dean!) to narrow the crash down to bug 38856 (also see 37027). A little further investigation showed that the right conditions did exist to trigger this bug:
  • expire_logs_days = 14 # had been set in the my.cnf
  • the binlog.index file did not match the actual state of log files (i.e. some had been manually deleted, or deleted by a script)
So with this knowledge, it was possible to bring the MySQL server back up. It turned out that the expire_logs_days had perhaps been added to the configuration but not tested at the time (the server had not been restarted for 3 months). This had placed the system in a state, unbeknownst to the administrators, where it would not come back up after a restart. It was an interesting (if a tad stressful) incident as it shows the reasons for many best practices – which most of us know and follow – but worth re-capping here.
  • even seemingly trivial maintenance can potentially trigger downtime
  • plan any production maintenance in the quiet zone, and be sure to allow enough time to deal with the unforeseen
  • don’t assume your live server will ‘just restart’
  • put my.cnf under revision control (check out “etckeeper”, a standard Ubuntu package; it can keep track of everything in /etc using bzr, svn or git)
  • do not make un-tested changes to config, test immediately, preferably on dev or staging system
  • be ready to failover (test regularly like a fire drill); this is another reason why master-master setups are more convenient than mere master-slave
  • replication alone is NOT a backup
  • don’t remove binlogs or otherwise touch anything in data dir behind mysql’s back
  • have only 1 admin per server so you don’t step on each other’s toes (but share credentials with 2IC for emergencies only)
  • use a trusted origin for your binary packages, just building and passing the basis test-suite is not always sufficient
  • know how to get a good stack trace with symbols, to help find bug reports
  • be familiar with bugs.mysql.com, but it still helps to ask others as they might have just seen something similar and can help you quickly find what you’re looking for!
  • and last but very important: it really pays to find the root cause to a problem (and prevention requires it!), so a “postmortum” on a dead server is very important… if we had just wiped that server, the problem might have reoccurred with another server later.

PlanetMySQL Voting: Vote UP / Vote DOWN

Be careful with BETWEEN clauses, because the MySQL optimizer is not smarter than a fifth grader!

Август 23rd, 2009
edit: I filed MySQL bug#46867 about this issue.

Ask anyone who has learned to count the following two questions. The answer to both of which should be yes.

Q:Is 5 between 1 and 10? A: Yes

Q:Is 5 between 10 and 1? A: Yes

Ask MySQL those same questions:

mysql>  (select 'Yes' 
           from dual 
          where 5 between 1 and 10
        ) 
        union 
        (select 'No'
        ) 
        limit 1;
+-----+
| Yes |
+-----+
| Yes |
+-----+
1 row in set (0.00 sec)

mysql>  (select 'Yes' 
          from dual 
         where 5 between 10 and 1 
        ) 
        union 
        (select 'No') 
        limit 1;
+-----+
| Yes |
+-----+
| No  |
+-----+
1 row in set (0.00 sec)


This is a problem because applications may produce BETWEEN clauses. I don't think most applications specifically check that the lower bounds is lower than the upper bounds in left to right order. Logically, a number between 1 and 10 is also between 10 and 1.

I encountered this on my own benchmarking application that generates BETWEEN clauses.

So generate BETWEEN clauses with caution. Perhaps you would be better off with two >= and <= expressions instead of the more convenient BETWEEN.
PlanetMySQL Voting: Vote UP / Vote DOWN