Archive for the ‘bugs’ Category

Cloud openness contemplated

Апрель 15th, 2010

I caught some of the keynotes and discussion at the Linux Foundation Collaboration Summit today, and was particularly interested in the panel discussion on open source and cloud computing. While we are used to hearing and talking about how important open source software is to cloud computing (open source giving to cloud computing), moderator John Mark Walker posed the question of whether cloud computing gives back? The discussion also rightfully focused on openness in cloud computing, how open source might or might not translate to cloud openness and the importance of data to be open as well.

The discussion also centered on some issues regarding open standards and how open is open enough for cloud computing? It may depend on who you ask, but I tend to think that the flexibility, interoperability and portability advantages of open source software will dictate its continued use and true openness in the cloud.

However, this is not always the case. When we consider openness in the mobile market, we see that while open source software is going into more and more smartphones and mobile devices, by the time it gets into the product and into the hands of consumers, it ends up closed. This is not necessarily a violation of open source license, either in rule or in spirit, but rather the use, incorporation and reliance on open source alongside proprietary products, strategies and companies, typically under a permissive license. Much of it also has to do with the need, both perceived and real, for control of code in these devices among hardware, software, wireless carrier and other players with a stake.

Another interesting perspective of what open source means, or doesn’t mean, in terms of cloud computing, standards and interoperability comes from the Xen community’s Simon Crosby of Citrix.

One of the most interesting things to watch when considering whether cloud computing gives back to open source is the AGPLv3 license, which is viewed in different ways as both a burden and a boon to network-based, distributed development by various parties. We continue to see vendors, such as mobile software player Funambol, as strong supporters of AGPL while others, such as Google, continue their resistence to it.

The AGPL also came up in the Linux Foundation Collaboration summit panel again, and while I don’t think the license currently serves as the answer to whether cloud computing gives back to open source, we do see some benefits to open source from cloud computing, both in terms of code, projects and communities and the commercial vendors leveraging open source software. In terms of code, large users of open source software projects, such Linux, MySQL, Hadoop, Cassandra, help to raise the profile and credibility of open source. Whether corporations or university campuses, these large users can also be among the most active community participants — driving features and shaking out bugs, and most prolific code contributors — creating features and extensions and enlarging the ecosystem. In terms of commercial open source vendors, cloud computing can also mitigate the challenges of balancing and differentiating free, community versions and separate, paid versions. If the vendor is able to offer support, services or even extensions with the cloud version of its software, it is easily separated from a free, community version that may be available for free, but not from the cloud.

Of course, there is more that cloud computing can do for open source and there is much more that has to be done to ensure true openness in cloud computing, particularly when some existing and emerging defacto standards are anything but open, but for all that open source is to cloud computing, cloud computing seems to be returning the favor to some degree already.


PlanetMySQL Voting: Vote UP / Vote DOWN

Missleading Innodb message on recovery

Февраль 16th, 2010

As I wrote about 2 years ago the feature of Innodb to store copy of master’s position in Slave’s Innodb tablespace got broken. There is a lot of discussions at the corresponding bug report while outcome of the fix remained uncertain for me (the bug is market duplicate while the bugs it seems to be duplicate for describe different issues).
Anyway. The customer came to me today having the following message in the error log after Slave crash while running MySQL 5.1.41 (extra virgin version without XtraDB or Percona Patches)

InnoDB: In a MySQL replication slave the last master binlog file
InnoDB: position 0 10000000, file name mysql-bin.005000
InnoDB: Last MySQL binlog file position 0 20000000, file name ./mysql-bin.003000

He has tried to restart replication from mysql-bin.005000 position 10000000 which failed with “Could not find first log file name in binary log index file” error message

Looking at the Master for this slave I could see its binary log files going only to about 2000, so the binary log file mentioned did not ever exist on this master. What is going on ?
The thing is Innodb does not update this information any more however if it is stored in the tablespace the code is still there to print it. This database was running older MySQL version a while back which was updated to MySQL 5.1 months ago, moved to the new hardware by physical copy and the log file numbers restarted back from 1 but tablespace still contained the ancient data.

I reported the bug on this which should be easy to fix. Otherwise it is easy mistake to make. We also have a patch which restores this information and uses it on slave crash recovery.


Entry posted by peter | No comment

Add to: delicious | digg | reddit | netscape | Google Bookmarks


PlanetMySQL Voting: Vote UP / Vote DOWN

Getting around optimizer limitations with an IN() list

Январь 10th, 2010

There was a discussion on LinkedIn one month ago that caught my eye:

Database search by "within x number of miles" radius?

Anyone out there created a zipcode database and created a "search within x numer of miles" function ?
Thankful for any tips you can throw my way..

J

A few people commented that some solutions wouldn't scale. To understand why these sorts of geographic search queries are problematic in MySQL, it's best to show some execution plans on dummy data:

SQL:
  1. EXPLAIN SELECT * FROM coordinates FORCE INDEX (x_y_col_a) WHERE x BETWEEN 30 AND 40
  2. AND y BETWEEN 50 AND 60 AND col_a = 'set1';
  3. +----+-------------+-------------+-------+---------------+-----------+---------+------+------+-------------+
  4. | id | select_type | TABLE       | type  | possible_keys | KEY       | key_len | ref  | rows | Extra       |
  5. +----+-------------+-------------+-------+---------------+-----------+---------+------+------+-------------+
  6. 1 | SIMPLE      | coordinates | range | x_y_col_a     | x_y_col_a | 38      | NULL | 4032 | USING WHERE |
  7. +----+-------------+-------------+-------+---------------+-----------+---------+------+------+-------------+
  8. 1 row IN SET (0.00 sec)
  9.  
  10. EXPLAIN SELECT * FROM coordinates FORCE INDEX (x_y_col_a) WHERE x BETWEEN 30 AND 40;
  11. +----+-------------+-------------+-------+---------------+-----------+---------+------+------+-------------+
  12. | id | select_type | TABLE       | type  | possible_keys | KEY       | key_len | ref  | rows | Extra       |
  13. +----+-------------+-------------+-------+---------------+-----------+---------+------+------+-------------+
  14. 1 | SIMPLE      | coordinates | range | x_y_col_a     | x_y_col_a | 3       | NULL | 4032 | USING WHERE |
  15. +----+-------------+-------------+-------+---------------+-----------+---------+------+------+-------------+
  16. 1 row IN SET (0.01 sec)
  17.  
  18. SELECT count(*) FROM coordinates FORCE INDEX (x_y_col_a) WHERE x BETWEEN 30 AND 40
  19. AND y BETWEEN 50 AND 60 AND col_a = 'set1';
  20. +----------+
  21. | count(*) |
  22. +----------+
  23. |        1 |
  24. +----------+
  25. 1 row IN SET (0.00 sec)
  26.  
  27. SELECT count(*) FROM coordinates FORCE INDEX (x_y_col_a) WHERE x BETWEEN 30 AND 40;
  28. +----------+
  29. | count(*) |
  30. +----------+
  31. |     1664 |
  32. +----------+
  33. 1 row IN SET (0.01 sec)

Did you notice that we estimate just as many rows on the first EXPLAIN as the second one? That doesn't make any sense! The index covers x,y and col_a and should be eliminating a lot of searching, since there is only one row which meets this condition!

The reason for this is simply a missing feature of the MySQL optimizer - and it has to do with using x BETWEEN 30 and 40 (and it's also true with x >= 30 AND x <= 40). Using a range like this prevents us from using the rest of the index. There is a workaround, but it's not pretty:

SQL:
  1. EXPLAIN SELECT * FROM coordinates WHERE x IN
  2. (30.30,30.61,31.18,31.26,31.72,32.11,32.25,32.30,32.42,32.91,33.27,
  3. 33.69,33.79,33.93,34.62,34.78,35.10,35.41,36.62,36.93,37.17,38.93,39.20,
  4. 39.56,39.84,39.87) AND y IN (59.58,56.81,57.27,54.14,56.43,51.87,54.59,
  5. 59.56,57.42,54.13,56.79,59.45) AND col_a = 'set1';
  6. +----+-------------+-------------+-------+---------------+-----------+---------+------+------+-------------+
  7. | id | select_type | TABLE       | type  | possible_keys | KEY       | key_len | ref  | rows | Extra       |
  8. +----+-------------+-------------+-------+---------------+-----------+---------+------+------+-------------+
  9. 1 | SIMPLE      | coordinates | range | x_y_col_a     | x_y_col_a | 38      | NULL312 | USING WHERE |
  10. +----+-------------+-------------+-------+---------------+-----------+---------+------+------+-------------+
  11. 1 row IN SET (0.00 sec)

The ugliest thing about this, is that in real life you wouldn't know all your possible values of X or Y, and so you may end up with a very big IN list. The workaround to this, is to create steppings of the value X and Y that we can use for indexes:

SQL:
  1. ALTER TABLE coordinates ADD x_floor INT NOT NULL, ADD y_floor INT NOT NULL, DROP INDEX x_y_col_a,
  2. ADD INDEX x_floor_y_floor_col_a (x_floor, y_floor, col_a);
  3.  
  4. UPDATE coordinates SET x_floor = FLOOR(x), y_floor = FLOOR(y);
  5.  
  6. EXPLAIN SELECT * FROM coordinates WHERE x_floor IN (30,31,32,33,34,35,36,37,38,39,40)
  7. AND y_floor IN (50,51,52,53,54,55,56,57,58,59,60) AND col_a = 'set1'\G
  8. *************************** 1. row ***************************
  9.            id: 1
  10.   select_type: SIMPLE
  11.         TABLE: coordinates
  12.          type: range
  13. possible_keys: x_floor_y_floor_col_a
  14.           KEY: x_floor_y_floor_col_a
  15.       key_len: 40
  16.           ref: NULL
  17.          rows: 121
  18.         Extra: USING WHERE
  19. 1 row IN SET (0.00 sec)

Fantastic! The only remaining problem with this query is that it's not quite identical to our original. In this query 60.79 will be floored to 60 (and erroneously included in our results):

SQL:
  1. SELECT * FROM coordinates WHERE x_floor IN (30,31,32,33,34,35,36,37,38,39,40)
  2. AND y_floor IN (50,51,52,53,54,55,56,57,58,59,60) AND col_a = 'set1';
  3. +-----+-------+-------+-------+-------+---------+---------+
  4. | id  | x     | y     | col_a | col_b | x_floor | y_floor |
  5. +-----+-------+-------+-------+-------+---------+---------+
  6. | 144 | 33.79 | 54.59 | set1  | NULL  |      33 |      54 |
  7. 38 | 39.20 | 60.79 | set1  | NULL  |      39 |      60 |
  8. +-----+-------+-------+-------+-------+---------+---------+
  9. 2 rows IN SET (0.00 sec)

However, this is a quick fix by re-including the original WHERE conditions (we are just no longer using an index on them):

SQL:
  1. EXPLAIN SELECT * FROM coordinates WHERE x_floor IN (30,31,32,33,34,35,36,37,38,39,40)
  2. AND y_floor IN (50,51,52,53,54,55,56,57,58,59,60)
  3. AND col_a = 'set1' AND x BETWEEN 30 AND 40 AND y BETWEEN 50 AND 60\G
  4. *************************** 1. row ***************************
  5.            id: 1
  6.   select_type: SIMPLE
  7.         TABLE: coordinates
  8.          type: range
  9. possible_keys: x_floor_y_floor_col_a
  10.           KEY: x_floor_y_floor_col_a
  11.       key_len: 40
  12.           ref: NULL
  13.          rows: 121
  14.         Extra: USING WHERE
  15. 1 row IN SET (0.00 sec)
  16.  
  17. SELECT * FROM coordinates WHERE x_floor IN (30,31,32,33,34,35,36,37,38,39,40)
  18. AND y_floor IN (50,51,52,53,54,55,56,57,58,59,60)
  19. AND col_a = 'set1' AND x BETWEEN 30 AND 40 AND y BETWEEN 50 AND 60;
  20. +-----+-------+-------+-------+-------+---------+---------+
  21. | id  | x     | y     | col_a | col_b | x_floor | y_floor |
  22. +-----+-------+-------+-------+-------+---------+---------+
  23. | 144 | 33.79 | 54.59 | set1  | NULL  |      33 |      54 |
  24. +-----+-------+-------+-------+-------+---------+---------+
  25. 1 row IN SET (0.00 sec)

Conclusion:
My examples were only on a small amount of data (16 000 rows) that fitted in memory, but the original query would have full table scanned if I didn't use a FORCE INDEX hint. Add more data, and if X can't filter out enough rows by itself this can create a real problem.

Workarounds are all very good, but they also make applications more difficult to maintain. If you really want to do these types of queries, you should give Sphinx a try.


Entry posted by Morgan Tocker | 3 comments

Add to: delicious | digg | reddit | netscape | Google Bookmarks


PlanetMySQL Voting: Vote UP / Vote DOWN

Getting around optimizer limitations with an IN() list

Январь 10th, 2010

There was a discussion on LinkedIn one month ago that caught my eye:

Database search by "within x number of miles" radius?

Anyone out there created a zipcode database and created a "search within x numer of miles" function ?
Thankful for any tips you can throw my way..

J

A few people commented that some solutions wouldn't scale. To understand why these sorts of geographic search queries are problematic in MySQL, it's best to show some execution plans on dummy data:

SQL:
  1. EXPLAIN SELECT * FROM coordinates FORCE INDEX (x_y_col_a) WHERE x BETWEEN 30 AND 40
  2. AND y BETWEEN 50 AND 60 AND col_a = 'set1';
  3. +----+-------------+-------------+-------+---------------+-----------+---------+------+------+-------------+
  4. | id | select_type | TABLE       | type  | possible_keys | KEY       | key_len | ref  | rows | Extra       |
  5. +----+-------------+-------------+-------+---------------+-----------+---------+------+------+-------------+
  6. 1 | SIMPLE      | coordinates | range | x_y_col_a     | x_y_col_a | 38      | NULL | 4032 | USING WHERE |
  7. +----+-------------+-------------+-------+---------------+-----------+---------+------+------+-------------+
  8. 1 row IN SET (0.00 sec)
  9.  
  10. EXPLAIN SELECT * FROM coordinates FORCE INDEX (x_y_col_a) WHERE x BETWEEN 30 AND 40;
  11. +----+-------------+-------------+-------+---------------+-----------+---------+------+------+-------------+
  12. | id | select_type | TABLE       | type  | possible_keys | KEY       | key_len | ref  | rows | Extra       |
  13. +----+-------------+-------------+-------+---------------+-----------+---------+------+------+-------------+
  14. 1 | SIMPLE      | coordinates | range | x_y_col_a     | x_y_col_a | 3       | NULL | 4032 | USING WHERE |
  15. +----+-------------+-------------+-------+---------------+-----------+---------+------+------+-------------+
  16. 1 row IN SET (0.01 sec)
  17.  
  18. SELECT count(*) FROM coordinates FORCE INDEX (x_y_col_a) WHERE x BETWEEN 30 AND 40
  19. AND y BETWEEN 50 AND 60 AND col_a = 'set1';
  20. +----------+
  21. | count(*) |
  22. +----------+
  23. |        1 |
  24. +----------+
  25. 1 row IN SET (0.00 sec)
  26.  
  27. SELECT count(*) FROM coordinates FORCE INDEX (x_y_col_a) WHERE x BETWEEN 30 AND 40;
  28. +----------+
  29. | count(*) |
  30. +----------+
  31. |     1664 |
  32. +----------+
  33. 1 row IN SET (0.01 sec)

Did you notice that we estimate just as many rows on the first EXPLAIN as the second one? That doesn't make any sense! The index covers x,y and col_a and should be eliminating a lot of searching, since there is only one row which meets this condition!

The reason for this is simply a missing feature of the MySQL optimizer - and it has to do with using x BETWEEN 30 and 40 (and it's also true with x >= 30 AND x <= 40). Using a range like this prevents us from using the rest of the index. There is a workaround, but it's not pretty:

SQL:
  1. EXPLAIN SELECT * FROM coordinates WHERE x IN
  2. (30.30,30.61,31.18,31.26,31.72,32.11,32.25,32.30,32.42,32.91,33.27,
  3. 33.69,33.79,33.93,34.62,34.78,35.10,35.41,36.62,36.93,37.17,38.93,39.20,
  4. 39.56,39.84,39.87) AND y IN (59.58,56.81,57.27,54.14,56.43,51.87,54.59,
  5. 59.56,57.42,54.13,56.79,59.45) AND col_a = 'set1';
  6. +----+-------------+-------------+-------+---------------+-----------+---------+------+------+-------------+
  7. | id | select_type | TABLE       | type  | possible_keys | KEY       | key_len | ref  | rows | Extra       |
  8. +----+-------------+-------------+-------+---------------+-----------+---------+------+------+-------------+
  9. 1 | SIMPLE      | coordinates | range | x_y_col_a     | x_y_col_a | 38      | NULL312 | USING WHERE |
  10. +----+-------------+-------------+-------+---------------+-----------+---------+------+------+-------------+
  11. 1 row IN SET (0.00 sec)

The ugliest thing about this, is that in real life you wouldn't know all your possible values of X or Y, and so you may end up with a very big IN list. The workaround to this, is to create steppings of the value X and Y that we can use for indexes:

SQL:
  1. ALTER TABLE coordinates ADD x_floor INT NOT NULL, ADD y_floor INT NOT NULL, DROP INDEX x_y_col_a,
  2. ADD INDEX x_floor_y_floor_col_a (x_floor, y_floor, col_a);
  3.  
  4. UPDATE coordinates SET x_floor = FLOOR(x), y_floor = FLOOR(y);
  5.  
  6. EXPLAIN SELECT * FROM coordinates WHERE x_floor IN (30,31,32,33,34,35,36,37,38,39,40)
  7. AND y_floor IN (50,51,52,53,54,55,56,57,58,59,60) AND col_a = 'set1'\G
  8. *************************** 1. row ***************************
  9.            id: 1
  10.   select_type: SIMPLE
  11.         TABLE: coordinates
  12.          type: range
  13. possible_keys: x_floor_y_floor_col_a
  14.           KEY: x_floor_y_floor_col_a
  15.       key_len: 40
  16.           ref: NULL
  17.          rows: 121
  18.         Extra: USING WHERE
  19. 1 row IN SET (0.00 sec)

Fantastic! The only remaining problem with this query is that it's not quite identical to our original. In this query 60.79 will be floored to 60 (and erroneously included in our results):

SQL:
  1. SELECT * FROM coordinates WHERE x_floor IN (30,31,32,33,34,35,36,37,38,39,40)
  2. AND y_floor IN (50,51,52,53,54,55,56,57,58,59,60) AND col_a = 'set1';
  3. +-----+-------+-------+-------+-------+---------+---------+
  4. | id  | x     | y     | col_a | col_b | x_floor | y_floor |
  5. +-----+-------+-------+-------+-------+---------+---------+
  6. | 144 | 33.79 | 54.59 | set1  | NULL  |      33 |      54 |
  7. 38 | 39.20 | 60.79 | set1  | NULL  |      39 |      60 |
  8. +-----+-------+-------+-------+-------+---------+---------+
  9. 2 rows IN SET (0.00 sec)

However, this is a quick fix by re-including the original WHERE conditions (we are just no longer using an index on them):

SQL:
  1. EXPLAIN SELECT * FROM coordinates WHERE x_floor IN (30,31,32,33,34,35,36,37,38,39,40)
  2. AND y_floor IN (50,51,52,53,54,55,56,57,58,59,60)
  3. AND col_a = 'set1' AND x BETWEEN 30 AND 40 AND y BETWEEN 50 AND 60\G
  4. *************************** 1. row ***************************
  5.            id: 1
  6.   select_type: SIMPLE
  7.         TABLE: coordinates
  8.          type: range
  9. possible_keys: x_floor_y_floor_col_a
  10.           KEY: x_floor_y_floor_col_a
  11.       key_len: 40
  12.           ref: NULL
  13.          rows: 121
  14.         Extra: USING WHERE
  15. 1 row IN SET (0.00 sec)
  16.  
  17. SELECT * FROM coordinates WHERE x_floor IN (30,31,32,33,34,35,36,37,38,39,40)
  18. AND y_floor IN (50,51,52,53,54,55,56,57,58,59,60)
  19. AND col_a = 'set1' AND x BETWEEN 30 AND 40 AND y BETWEEN 50 AND 60;
  20. +-----+-------+-------+-------+-------+---------+---------+
  21. | id  | x     | y     | col_a | col_b | x_floor | y_floor |
  22. +-----+-------+-------+-------+-------+---------+---------+
  23. | 144 | 33.79 | 54.59 | set1  | NULL  |      33 |      54 |
  24. +-----+-------+-------+-------+-------+---------+---------+
  25. 1 row IN SET (0.00 sec)

Conclusion:
My examples were only on a small amount of data (16 000 rows) that fitted in memory, but the original query would have full table scanned if I didn't use a FORCE INDEX hint. Add more data, and if X can't filter out enough rows by itself this can create a real problem.

Workarounds are all very good, but they also make applications more difficult to maintain. If you really want to do these types of queries, you should give Sphinx a try.


Entry posted by Morgan Tocker | 3 comments

Add to: delicious | digg | reddit | netscape | Google Bookmarks


PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL Cluster 7.0.8a and 6.3.27a released (source)

Октябрь 7th, 2009
Fixes regression - bug 47844.

Configurator and Sandbox scripts are updated to use this version.

If you build from source then you can upgrade using these scripts:
Upgrade script from 7.0.7 -> 7.0.8a is here.
Upgrade script from 7.0.8 -> 7.0.8a is here.
Upgrade script from 6.3.26-> 6.3.27a is here.

Follow the instructions in this blog post!

PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL Cluster 7.0.8/6.3.27 — serious regression!!!

Октябрь 6th, 2009
A regression in MySQL Cluster 7.0.8 and 6.3.27 has caused them to be removed from download.

Problem:
If you have a mysql server with higher node id than 49, then it will cause cluster to crash.

An updated version of MySQL Cluster 7.0.8, called 7.0.8a will be released very soon.

The Configurator has been reverted back to use 7.0.7 and 6.3.26.

PlanetMySQL Voting: Vote UP / Vote DOWN

Data node stuck in phase 101 — what to do?

Август 31st, 2009
Sometimes I have seen and heard about that a data node gets stuck in start phase 101.

Unfortunately it is difficult to reproduce this (found no way yet), so no bug fix is in the pipe yet.

What happens is that in sp 101, the starting data node (actually a block called SUMA) should reconnect to the mysql server and take over the event handling (sending events to the mysql server), but it never gets the reconnect to the mysql server(s). A better explanation is here :)

If you see your data node stuck here then try the following:
1) Restart the mysql servers (one by one), the data node should now start
or
2) Restart the mysql servers and restart the data node.

Only do 2) if 1) does not work.

And if you know how to reproduce - let us know!

PlanetMySQL Voting: Vote UP / Vote DOWN

OPTIMIZE TABLE on Cluster (revisited)

Август 4th, 2009
Jonas just wrote a patch to this bug on OPTIMIZE TABLE, and the issue that was also discussed in this blog post. Jonas also fixed this bug when he was at it.

Before, OPTIMIZE TABLE hardly freed up any pages and to defragment you had to do a rolling restart of the data nodes.

Now, there is only a 2% discrepancy between OPTIMIZE TABLE and doing a rolling restart. This is great stuff.
This will fix will make it into 6.3.26 and 7.0.7.

See below for details:

Creating two tables, t5 and t6:

CREATE TABLE `t5` (
`id` varchar(32) CHARACTER SET utf8 NOT NULL,
`name` varchar(32) NOT NULL DEFAULT '',
`unit` int(11) DEFAULT NULL,
PRIMARY KEY (`id`,`name`)
) ENGINE=ndbcluster DEFAULT CHARSET=latin1

And then the following table:

'CREATE TABLE `t6` ( `id` varchar(32) CHARACTER SET utf8 NOT NULL, `name` varchar(32) NOT NULL DEFAULT '', c0 int unsigned not null, c1 int unsigned not null, data2 varchar(255), `unit` int(11) DEFAULT NULL, PRIMARY KEY (`id`,`name`) ) ENGINE=ndbcluster DEFAULT CHARSET=latin1;


Then we use hugoLoad (a test program that isn't built by default in the src distribution of cluster) to load it with 1M records.

./hugoLoad -r 1000000 -d test t5
./hugoLoad -r 1000000 -d test t6


Memory usage after population:

Node 2: Data usage is 45%(14890 32K pages of total 32768)
Node 2: Index usage is 22%(3722 8K pages of total 16416)
Node 3: Data usage is 45%(14890 32K pages of total 32768)
Node 3: Index usage is 22%(3722 8K pages of total 16416)


Delete 500K records from t5.


mysql> set ndb_use_transactions=0;
Query OK, 0 rows affected (0.00 sec)

mysql> delete from t5 limit 100000;
Query OK, 100000 rows affected (2.73 sec)

mysql> delete from t5 limit 100000;
Query OK, 100000 rows affected (3.13 sec)

mysql> delete from t5 limit 100000;
Query OK, 100000 rows affected (3.11 sec)

mysql> delete from t5 limit 100000;
Query OK, 100000 rows affected (3.13 sec)

mysql> delete from t5 limit 100000;
Query OK, 100000 rows affected (2.97 sec)


Memory usage after deletion:


Node 2: Data usage is 45%(14890 32K pages of total 32768)
Node 2: Index usage is 19%(3120 8K pages of total 16416)
ndb_mgm> Node 3: Data usage is 45%(14890 32K pages of total 32768)
Node 3: Index usage is 19%(3121 8K pages of total 16416)


Run OPTIMIZE TABLE:


mysql> optimize table t5;
+---------+----------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+---------+----------+----------+----------+
| test.t5 | optimize | status | OK |
+---------+----------+----------+----------+
1 row in set (1 min 41.94 sec)



CPU UTIL during OPTIMIZE TABLE:
User:23.8% Sys: 4.9% ndbd

Memory usage after OPTIMIZE TABLE:


ndb_mgm> Node 2: Data usage is 39%(13090 32K pages of total 32768)
Node 2: Index usage is 19%(3120 8K pages of total 16416)
Node 3: Data usage is 39%(13090 32K pages of total 32768)
Node 3: Index usage is 19%(3121 8K pages of total 16416)


Delete 500K (50%) of the records from t6:

mysql> set ndb_use_transactions=0;
Query OK, 0 rows affected (0.00 sec)

mysql> delete from t6 limit 100000;
Query OK, 100000 rows affected (2.77 sec)

mysql> delete from t6 limit 100000;
Query OK, 100000 rows affected (3.05 sec)

mysql> delete from t6 limit 100000;
Query OK, 100000 rows affected (3.12 sec)

mysql> delete from t6 limit 100000;
Query OK, 100000 rows affected (3.19 sec)

mysql> delete from t6 limit 100000;
Query OK, 100000 rows affected (3.24 sec)


Memory usage after deletion:


Node 2: Data usage is 39%(13090 32K pages of total 32768)
Node 2: Index usage is 15%(2521 8K pages of total 16416)
Node 3: Data usage is 39%(13090 32K pages of total 32768)
Node 3: Index usage is 15%(2522 8K pages of total 16416)


Run OPTIMIZE TABLE:


mysql> optimize table t6;
+---------+----------+----------+----------+
| Table | Op | Msg_type | Msg_text |
+---------+----------+----------+----------+
| test.t6 | optimize | status | OK |
+---------+----------+----------+----------+
1 row in set (1 min 45.83 sec)


Memory usage after OPTIMIZE TABLE:


Node 2: Data usage is 28%(9249 32K pages of total 32768)
Node 2: Index usage is 15%(2521 8K pages of total 16416)
Node 3: Data usage is 28%(9249 32K pages of total 32768)
Node 3: Index usage is 15%(2522 8K pages of total 16416)


After restart of node 3:


Node 2: Data usage is 28%(9249 32K pages of total 32768)
Node 2: Index usage is 15%(2521 8K pages of total 16416)
Node 3: Data usage is 26%(8568 32K pages of total 32768)
Node 3: Index usage is 13%(2236 8K pages of total 16416)


Only two percent diff between rolling restart and OPTIMIZE TABLE. Excellent. Thank you Jonas for fixing this.

Upgrade 6.3.x to 7.0.6

Август 4th, 2009
Currently a number of users has reported upgrade issues on 6.3.x to 7.0.6 (below is an example what might happen to you).

Thus, we don't recommend you to upgrade from 6.3.x to 7.0.6 in a production system yet.
However, we are very interested in bug reports and and help with testing.
We are also of course working on fixing these issues and will keep you posted when things has improved. Sorry for the inconvenience.

ERROR 1005 (HY000): Can't create table 'test.#sql-3d25_aa905' (Errno: 140)

mysql> show warnings;
+-------+------+--------------------------------------------------------------------------+
| Level | Code | Message |
+-------+------+--------------------------------------------------------------------------+
| Error | 1296 | Error 755 'Invalid tablespace' from NDB |
| Error | 1005 | Can't create table 'test.#sql-3d25_aa905' (Errno: 140) |
+-------+------+--------------------------------------------------------------------------+