Archive for the ‘information_schema’ Category

Finding tables without primary keys

Сентябрь 5th, 2011
I was checking a third party server, and I needed to find if there were tables without primary keys. This is important to know, not only because the lack of primary keys affects performance and data accuracy in general, but also because in row-based replication performance can degrade beyond belief when updating tables without primary keys.Anyway, I did not remember off the bat any method to get this information from a server with thousands of tables, and thus I went to find a solution on my own.My first instinct called for using the COLUMNS table from the INFORMATIOn_SCHEMA, and so I came up with this query, where I sum the number of columns that are inside either a PRIMARY or UNIQUE key and filter only the ones where such sum is zero (i.e. no primary or unique keys):

select
table_schema,table_name
from
information_schema.columns
group by
table_schema,table_name
having
sum(if(column_key in ('PRI','UNI'), 1,0)) = 0;
This query got the job done, and it was quite quick as well.Then, since I was chatting with Sheeri Cabral about other matters, I asked her if she could come up with an alternative solution. She suggested a LEFT JOIN between the information_schema.tables and information_schema.statistics, which I translated into this query:

select
t.table_schema, t.table_name
from
information_schema.tables t
left join information_schema. statistics s
on t.table_schema=s.table_schema and t.table_name=s.table_name
and s.non_unique=0
where
s.table_name is null;
This query works on the principle that it removes from the tables list all the ones for which there is no corresponding table in the statistics table.This query also works. Using both queries in a relatively empty server did not show any significant difference. But since I knew that I had to use this method on a very busy server, with a lot of tables, I quickly created 1,000 databases, each containing 5 tables, two of which did not have any primary or unique key.Now came the first surprise.The query with GROUP BY took about 0.5 seconds, while the one using LEFT JOIN used 11 seconds.I was about to congratulate myself for my acumen, when I realized that, in addition to schema and table names, I also needed the table engine.For the second query, that is not a problem. Adding the engine to the columns list works OK, and the query runs in 11 seconds like before.The first query, though, can't list the engine. There is no 'engine' in the COLUMNS table. So I needed a JOIN. Thus my query became

select
t.table_schema,t.table_name,engine
from
information_schema.tables t
inner join information_schema .columns c
on t.table_schema=c.table_schema and t.table_name=c.table_name
group by
t.table_schema,t.table_name
having
sum(if(column_key in ('PRI','UNI'), 1,0)) =0;
Guess what? This query ran in 17 seconds (!). So much for my instinct!Joins without keys are not efficient in MySQL, and tables in the information schema are no exception.If anyone has a more efficient method of getting a list of tables without primary key (the list including schema name, table name, and engine), I am curious to know.

PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL Global status difference using single query

Август 12th, 2011

Have just read MySQL Global status difference using MySQL procedures / functions, by Andres Karlsson. Have commented, but realized I did not provide with a direct answer. In the comment, I suggested checking out a solution based on views, found in common_schema. But the solution in common_schema is split into two views, due to the fact views cannot handle derived tables subqueries.

Well, here’s a single query to do that: it checks GLOBAL_STATUS twice, 10 seconds apart in the following sample. It uses SLEEP() to actually wait between the two reads. Yes, you can do that with a query.

The following query shows all GLOBAL_STATUS values that have changed during the sample period.

SELECT STRAIGHT_JOIN
   LOWER(gs0.VARIABLE_NAME) AS variable_name,
   gs0.VARIABLE_VALUE AS variable_value_0,
   gs1.VARIABLE_VALUE AS variable_value_1,
   (gs1.VARIABLE_VALUE - gs0.VARIABLE_VALUE) AS variable_value_diff,
   (gs1.VARIABLE_VALUE - gs0.VARIABLE_VALUE) / 10 AS variable_value_psec,
   (gs1.VARIABLE_VALUE - gs0.VARIABLE_VALUE) * 60 / 10 AS
variable_value_pminute
FROM
   (
     SELECT
       VARIABLE_NAME,
       VARIABLE_VALUE
     FROM
       INFORMATION_SCHEMA.GLOBAL_STATUS
     UNION ALL
     SELECT
       '',
       SLEEP(10)
     FROM DUAL
   ) AS gs0
   JOIN INFORMATION_SCHEMA.GLOBAL_STATUS gs1 USING (VARIABLE_NAME)
WHERE
   gs1.VARIABLE_VALUE != gs0.VARIABLE_VALUE
;
+-----------------------------------+------------------+------------------+---------------------+---------------------+------------------------+
| variable_name                     | variable_value_0 | variable_value_1 | variable_value_diff | variable_value_psec | variable_value_pminute |
+-----------------------------------+------------------+------------------+---------------------+---------------------+------------------------+
| aborted_clients                   | 2210669          | 2210686          |                  17 |                 1.7 |                    102 |
| bytes_received                    | 53259933210      | 53260211104      |              277894 |             27789.4 |                1667364 |
| bytes_sent                        | 351130988015     | 351132884956     |             1896941 |            189694.1 |               11381646 |
| com_change_db                     | 3760546          | 3760584          |                  38 |                 3.8 |                    228 |
| com_delete                        | 6774784          | 6774801          |                  17 |                 1.7 |                    102 |
| com_insert                        | 52743750         | 52744012         |                 262 |                26.2 |                   1572 |
| com_insert_select                 | 13362650         | 13362740         |                  90 |                   9 |                    540 |
| com_select                        | 51722818         | 51723107         |                 289 |                28.9 |                   1734 |
| com_set_option                    | 108564134        | 108564754        |                 620 |                  62 |                   3720 |
| com_show_collations               | 3760530          | 3760568          |                  38 |                 3.8 |                    228 |
| com_show_processlist              | 366078           | 366082           |                   4 |                 0.4 |                     24 |
| com_show_status                   | 366047           | 366051           |                   4 |                 0.4 |                     24 |
| com_show_variables                | 3760535          | 3760573          |                  38 |                 3.8 |                    228 |
| com_update                        | 6271283          | 6271324          |                  41 |                 4.1 |                    246 |
| connections                       | 3781382          | 3781420          |                  38 |                 3.8 |                    228 |
| created_tmp_disk_tables           | 983223           | 983224           |                   1 |                 0.1 |                      6 |
| created_tmp_tables                | 9134044          | 9134126          |                  82 |                 8.2 |                    492 |
| handler_commit                    | 125798040        | 125798688        |                 648 |                64.8 |                   3888 |
| handler_delete                    | 6849562          | 6849578          |                  16 |                 1.6 |                     96 |
| handler_read_first                | 5332451          | 5332498          |                  47 |                 4.7 |                    282 |
| handler_read_key                  | 373910509        | 373912469        |                1960 |                 196 |                  11760 |
| handler_read_next                 | 850122025        | 850170403        |               48378 |              4837.8 |                 290268 |
| handler_read_rnd                  | 255104660        | 255105932        |                1272 |               127.2 |                   7632 |
| handler_read_rnd_next             | 992505444        | 992549948        |               44504 |              4450.4 |                 267024 |
| handler_update                    | 27930283         | 27930465         |                 182 |                18.2 |                   1092 |
| handler_write                     | 2051582925       | 2051602416       |               19491 |              1949.1 |                 116946 |
| innodb_buffer_pool_pages_data     | 77232            | 77243            |                  11 |                 1.1 |                     66 |
| innodb_buffer_pool_pages_dirty    | 626              | 645              |                  19 |                 1.9 |                    114 |
| innodb_buffer_pool_pages_flushed  | 38429812         | 38430032         |                 220 |                  22 |                   1320 |
| innodb_buffer_pool_pages_misc     | 4294922063       | 4294922052       |                 -11 |                -1.1 |                    -66 |
| innodb_buffer_pool_read_requests  | 1933796064       | 1933871603       |               75539 |              7553.9 |                 453234 |
| innodb_buffer_pool_reads          | 11360212         | 11360214         |                   2 |                 0.2 |                     12 |
| innodb_buffer_pool_write_requests | 1074109722       | 1074115394       |                5672 |               567.2 |                  34032 |
| innodb_data_fsyncs                | 5583880          | 5583905          |                  25 |                 2.5 |                    150 |
| innodb_data_read                  | 3339489280       | 3339501568       |               12288 |              1228.8 |                  73728 |
| innodb_data_reads                 | 11796492         | 11796494         |                   2 |                 0.2 |                     12 |
| innodb_data_writes                | 105587582        | 105588145        |                 563 |                56.3 |                   3378 |
| innodb_data_written               | 3721600000       | 3727315968       |             5715968 |            571596.8 |               34295808 |
| innodb_dblwr_pages_written        | 38429812         | 38430032         |                 220 |                  22 |                   1320 |
| innodb_dblwr_writes               | 596503           | 596506           |                   3 |                 0.3 |                     18 |
| innodb_log_write_requests         | 380978894        | 380981368        |                2474 |               247.4 |                  14844 |
| innodb_log_writes                 | 74407604         | 74407990         |                 386 |                38.6 |                   2316 |
| innodb_os_log_fsyncs              | 2310799          | 2310807          |                   8 |                 0.8 |                     48 |
| innodb_os_log_written             | 2905292800       | 2906502656       |             1209856 |            120985.6 |                7259136 |
| innodb_pages_created              | 1341584          | 1341593          |                   9 |                 0.9 |                     54 |
| innodb_pages_read                 | 13117652         | 13117654         |                   2 |                 0.2 |                     12 |
| innodb_pages_written              | 38429812         | 38430032         |                 220 |                  22 |                   1320 |
| innodb_rows_deleted               | 6849552          | 6849568          |                  16 |                 1.6 |                     96 |
| innodb_rows_inserted              | 43787980         | 43788207         |                 227 |                22.7 |                   1362 |
| innodb_rows_read                  | 4289845136       | 4289919560       |               74424 |              7442.4 |                 446544 |
| innodb_rows_updated               | 24119627         | 24119809         |                 182 |                18.2 |                   1092 |
| key_read_requests                 | 41262330         | 41262338         |                   8 |                 0.8 |                     48 |
| open_files                        | 7                | 5                |                  -2 |                -0.2 |                    -12 |
| opened_files                      | 4212920          | 4212924          |                   4 |                 0.4 |                     24 |
| questions                         | 253158874        | 253160331        |                1457 |               145.7 |                   8742 |
| select_full_join                  | 546              | 547              |                   1 |                 0.1 |                      6 |
| select_range                      | 721945           | 721947           |                   2 |                 0.2 |                     12 |
| select_scan                       | 12828865         | 12828989         |                 124 |                12.4 |                    744 |
| sort_range                        | 170971           | 170973           |                   2 |                 0.2 |                     12 |
| sort_rows                         | 255175383        | 255176655        |                1272 |               127.2 |                   7632 |
| sort_scan                         | 534078           | 534080           |                   2 |                 0.2 |                     12 |
| table_locks_immediate             | 142673687        | 142674454        |                 767 |                76.7 |                   4602 |
| threads_cached                    | 7                | 8                |                   1 |                 0.1 |                      6 |
| threads_connected                 | 5                | 10               |                   5 |                 0.5 |                     30 |
| threads_created                   | 840486           | 840509           |                  23 |                 2.3 |                    138 |
+-----------------------------------+------------------+------------------+---------------------+---------------------+------------------------+

Some values don’t make sense to do difference on (e.g. threads_connected), since they present with momentary status and are not incrementing as others (e.g. threads_created).

Happy SQLing!


PlanetMySQL Voting: Vote UP / Vote DOWN

Announcing common_schema: common views & routines for MySQL

Июль 13th, 2011

Today I have released common_schema, a utility schema for MySQL which includes many views and functions, and is aimed to be installed on any MySQL server.

What does it do?

There are views answering for all sorts of useful information: stuff related to schema analysis, data dimensions, monitoring, processes & transactions, security, internals… There are basic functions answering for common needs.

Some of the views/routines simply formalize those queries we tend to write over and over again. Others take the place of external tools, answering complex questions via SQL and metadata. Still others help out with SQL generation.

Here are a few highlights:

There’s more. Take a look at the common_schema documentation for full listing. And it’s evolving: I’ve got quite a few ideas already for future components.

Some of these views rely on heavyweight INFORMATION_SCHEMA tables. You should be aware of the impact and risks.

What do I need to install?

There’s no script or executable file. It’s just a schema. The distribution in an SQL file which generates common_schema. Much like a dump file.

What are the system requirements?

It’s just between you and your MySQL. There are currently three distribution files, dedicated for different versions of MySQL (and allowing for increased functionality):

  • common_schema_mysql_51: fits all MySQL >= 5.1 distributions
  • common_schema_innodb_plugin: fits MySQL >= 5.1, with InnoDB plugin + INFORMATION_SCHEMA tables enabled
  • common_schema_percona_server: fits Percona Server >= 5.1

Refer to the documentation for more details.

What are the terms of use?

common_schema is released under the BSD license.

Where can I download it?

On the common_schema project page. Enjoy it!


PlanetMySQL Voting: Vote UP / Vote DOWN

Announcing common_schema: common views & routines for MySQL

Июль 13th, 2011

Today I have released common_schema, a utility schema for MySQL which includes many views and functions, and is aimed to be installed on any MySQL server.

What does it do?

There are views answering for all sorts of useful information: stuff related to schema analysis, data dimensions, monitoring, processes & transactions, security, internals… There are basic functions answering for common needs.

Some of the views/routines simply formalize those queries we tend to write over and over again. Others take the place of external tools, answering complex questions via SQL and metadata. Still others help out with SQL generation.

Here are a few highlights:

There’s more. Take a look at the common_schema documentation for full listing. And it’s evolving: I’ve got quite a few ideas already for future components.

Some of these views rely on heavyweight INFORMATION_SCHEMA tables. You should be aware of the impact and risks.

What do I need to install?

There’s no script or executable file. It’s just a schema. The distribution in an SQL file which generates common_schema. Much like a dump file.

What are the system requirements?

It’s just between you and your MySQL. There are currently three distribution files, dedicated for different versions of MySQL (and allowing for increased functionality):

  • common_schema_mysql_51: fits all MySQL >= 5.1 distributions
  • common_schema_innodb_plugin: fits MySQL >= 5.1, with InnoDB plugin + INFORMATION_SCHEMA tables enabled
  • common_schema_percona_server: fits Percona Server >= 5.1

Refer to the documentation for more details.

What are the terms of use?

common_schema is released under the BSD license.

Where can I download it?

On the common_schema project page. Enjoy it!


PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL security: inconsistencies

Июнь 22nd, 2011

Doing some work with MySQL security, I’ve noticed a few inconsistencies. They’re mostly not-too-terrible for daily work, except they get in my way right now.

The ALL PRIVILEGES inconsistency

The preferred way of assigning account privileges in MySQL is by way of using GRANT.

With GRANT, one assigns one or more privileges to an account, such as SELECT, UPDATE, ALTER, SUPER ,etc. Sometimes it makes sense for an account to have complete control over a domain. For example, the root account is typically assigned with all privileges. Or, some user may require all possible privileges on a certain schema.

Instead of listing the entire set of privileges, the ALL PRIVILEGES meta-privilege can be used. There is a fine issue to notice here; typically this is not a problem, but I see it as a flaw. Assume the following account:

root@mysql-5.1.51> GRANT ALL PRIVILEGES ON world.* TO 'world_user'@'localhost';

root@mysql-5.1.51> SHOW GRANTS FOR 'world_user'@'localhost';
+---------------------------------------------------------------+
| Grants for world_user@localhost                               |
+---------------------------------------------------------------+
| GRANT USAGE ON *.* TO 'world_user'@'localhost'                |
| GRANT ALL PRIVILEGES ON `world`.* TO 'world_user'@'localhost' |
+---------------------------------------------------------------

This makes sense. We granted ALL PRIVILEGES and we see that the account is granted with ALL PRIVILEGES.

Now notice the following:

root@mysql-5.1.51> GRANT ALTER, ALTER ROUTINE, CREATE, CREATE ROUTINE, CREATE TEMPORARY TABLES, CREATE VIEW, DELETE, DROP, EVENT, EXECUTE, INDEX, INSERT, LOCK TABLES, REFERENCES, SELECT, SHOW VIEW, TRIGGER, UPDATE ON `world`.* TO 'other_user'@'localhost';

root@mysql-5.1.51> SHOW GRANTS FOR 'other_user'@'localhost';
+---------------------------------------------------------------+
| Grants for other_user@localhost                               |
+---------------------------------------------------------------+
| GRANT USAGE ON *.* TO 'other_user'@'localhost'                |
| GRANT ALL PRIVILEGES ON `world`.* TO 'other_user'@'localhost' |
+---------------------------------------------------------------+

I didn’t ask for ALL PRIVILEGES. I explicitly listed what I thought should be an account’s privileges. It just so happens that these make for the entire set of privileges available on the schema domain.

You might think this is a nice feature, an ease out MySQL provides with. I do not see it this way.

My preferred way of upgrading MySQL version involves exporting and importing of the GRANTs. That is, I do not dump and load the mysql system tables, but rather export all the SHOW GRANTS FOR … (e.g. with mk-show-grants), then execute these on the new version. This process was extremely useful on upgrades from 5.0 to 5.1, where some mysql system tables were modified.

Now, consider the case where some new MySQL version introduced a new set of privileges. My ‘other_user’@'localhost’ was not created with that set of privileges, nor did I intend it to have them. However, when exporting with SHOW GRANTS, the account is said to have ALL PRIVILEGES. When executed on the new version, the account will have privileges which I never assigned it.

Typically, this is not an issue. I mean, how many times do I assign an account with the entire set of privileges, yet do not intend it to have all privileges? Nevertheless, this makes for an inconsistency. It is unclear, by way of definition, which privileges are assigned to a user, without knowing the context of the version and the set of privileges per version. It makes for an inconsistency when moving between versions. And right now I’m working on some code which doesn’t like these inconsistencies.

The WITH GRANT OPTION inconsistency

An account can be granted with the WITH GRANT OPTION privilege, which means the account’s user can assign her privileges to other accounts. The inconsistency I found is that the GRANT mechanism is fuzzy with regard to GRANT OPTION, and falsely presents us with the wrong impression.

Let’s begin with the bottom line: the WITH GRANT OPTION can only be set globally for an account-domain combination. Consider:

root@mysql-5.1.51> GRANT INSERT, DELETE, UPDATE ON world.City TO 'gromit'@'localhost';
Query OK, 0 rows affected (0.00 sec)

root@mysql-5.1.51> GRANT SELECT ON world.City TO 'gromit'@'localhost' WITH GRANT OPTION;
Query OK, 0 rows affected (0.00 sec)

root@mysql-5.1.51> SHOW GRANTS FOR 'gromit'@'localhost';
+--------------------------------------------------------------------------------------------------+
| Grants for gromit@localhost                                                                      |
+--------------------------------------------------------------------------------------------------+
| GRANT USAGE ON *.* TO 'gromit'@'localhost'                                                       |
| GRANT SELECT, INSERT, UPDATE, DELETE ON `world`.`City` TO 'gromit'@'localhost' WITH GRANT OPTION |
+--------------------------------------------------------------------------------------------------+

The syntax of first two queries leads us to believe that we’re only providing the WITH GRANT OPTION for the SELECT privilege. But that is not so: the WITH GRANT OPTION is assigned for all privileges on world.City to ‘gromit’@'localhost’.

The syntax would be more correct if we were to write something like:

GRANT GRANT_OPTION ON world.* TO 'gromit'@'localhost';

That would make it clear that this privilege does not depend on other privileges set on the specified domain.

The USAGE inconsistency

You can GRANT the USAGE privilege, but you may never REVOKE it. To revoke USAGE means to DROP USER.

The missing ROUTINES_PRIVILEGES inconsistency

INFORMATION_SCHEMA provides with four privileges tables: USER_PRIVILEGES, SCHEMA_PRIVILEGES, TABLE_PRIVILEGES, COLUMN_PRIVILEGES, which map well to mysql‘s user, db, tables_priv and columns_priv tables, respectively.

Ahem, which INFORMATION_SCHEMA table maps to mysql.procs_priv?


PlanetMySQL Voting: Vote UP / Vote DOWN

Monitoring MySQL SQL statements the way it SHOULD be done!

Ноябрь 23rd, 2010
You may have read a previous post of mine, back in April this year, where I wrote about using the MySQL 5.5 Audit interface to SQL Statement monitoring. There was a bunch of comments and some ideas, but not much happened. Until now that is.

Hereby I release the first version of SQLStats, which is a much enhanced version of what I described in the previous post. This is a MySQL Plugin for MySQL 5.5 that allows you to monitor the most recently executed, and the most frequently executed statements using 2 INFORMATION_SCHEMA tables. The thing is not very complex, to be honest, but it does do the job. So what was the job then? Well, looking at what this plugin does, it goes something like this:
  • Allows you to monitor ALL SQL statements executed by the server.
  • The SQL statements are "normalized", meaning that literals / constants are removed before comparison.
  • Data is saved in memory. No disk access and very little overhead.
  • Data is retrieved from INFORMATION_SCHEMA tables, just a simple SELECT and you know what is going on in the server.
I have done some basic testing of the overhead of this plugin, and it wasn't even noticable. If you have used the general query log for this stuff before, know what overhead I'm talking about. With SQLStats, the overhead is close to 0.

There is no need for MySQL Proxy or anything like that. There is no need to change something in the Client or in the Connector. To be honest, there are a couple of things I want to add to the plugin eventually, but this is a starting point at least. To use it: download it, install the tomcat / mysql monitoring server, install the monitoring agent and ... No wait, that was MySQL Enterprise Monitor, this is how you do it: download, build, install and use it. That's it.

So where can I get it, you ask (or maybe you don't, but I'm gonna tell you anyway). It's on sourceforge, and you can download it from here:
http://sourceforge.net/projects/sqlstats/

There are two files to download: A simple PDF documents with some basic usage and configuration tips, and a source-code package (which also includes the same PDF).

What would I want from you? Ideas for future development, bug reports and a few beers, that's it, not too much to ask, eh?

/Karlsson
Also, did I mention that the overhead is VERY low...

PlanetMySQL Voting: Vote UP / Vote DOWN

Thoughts and ideas for Online Schema Change

Октябрь 7th, 2010

Here’s a few thoughts on current status and further possibilities for Facebook’s Online Schema Change (OSC) tool. I’ve had these thoughts for months now, pondering over improving oak-online-alter-table but haven’t got around to implement them nor even write them down. Better late than ever.

The tool has some limitations. Some cannot be lifted, some could. Quoting from the announcement and looking at the code, I add a few comments. I conclude with a general opinion on the tool’s abilities.

“The original table must have PK. Otherwise an error is returned.”

This restriction could be lifted: it’s enough that the table has a UNIQUE KEY. My original oak-online-alter-table handled that particular case. As far as I see from their code, the Facebook code would work just as well with any unique key.

However, this restriction is of no real interest. As we’re mostly interested in InnoDB tables, and since any InnoDB table should have a PRIMARY KEY, we shouldn’t care too much.

“No foreign keys should exist. Otherwise an error is returned.”

Tricky stuff. With oak-online-alter-table, changes to the original table were immediately reflected in the ghost table. With InnoDB tables, that meant same transaction. And although I never got to update the text and code, there shouldn’t be a reason for not using child-side foreign keys (the child-side is the table on which the FK constraint is defined).

The Facebook patch works differently: it captures changes and writes them to a delta table,  to be later (asynchronously) analyzed and make for a replay of actions on the ghost table.

So in the Facebook code, some cases will lead to undesired behavior. Consider two tables, country and city, with city holding a RESTRICT/NO ACTION foreign key on country‘s id. Now consider the scenario:

  1. Rows from city are DELETEd, where the country Id is Spain’s.
    • city‘s ghost table is still unaffected, Spain’s cities are still there.
    • A change is written to the delta table to mark these rows for deletion.
  2. A DELETE is issued on country‘s Spain record.
    • The DELETE should work, from the user’s perspective
    • But it will fail: city’s ghost table has not received the changes yet. There’s still matching rows. The NO ACTION constraint will fail the DELETE statement.

Now, this does not lead to corruption, just to seemingly unreasonable behavior on the database part. This behavior is probably undesired. NO ACTION constraint won’t do.

However, with CASCADE or SET NULL options, there is less of an issue: operations on the parent table (e.g. country) cannot fail. We must make sure operations on the ghost table make it consistent with the original table (e.g. city).

Consider the following scenario:

  1. A new country is created, called “Sleepyland”. An INSERT is made to country.
    • Both city and city‘s ghost are immediately aware of it.
  2. A new town is created and INSERTed to city. The town is called “Naphaven”.
    • The change takes time to propagate to city‘s ghost table.
  3. Meanwhile, we realized we made a mistake. We’ve been had. There’s no such city nor country.
    1. We DELETE “Naphaven” from city.
    2. We DELETE “Sleepyland” from country.
    • Note that city‘s ghost table still hasn’t caught up with the changes.
  4. Eventually, the INSERT statement for “Naphaven” reaches city‘s ghost table.
    • What should happen now? The INSERT cannot succeed.
    • Will this fail the entire process?

Looking at the PHP code, I see that changes written on the delta table are blindly replayed on the ghost table.

Since the process is asynchronous, this should not be the case. We can solve the above if we use INSERT IGNORE instead of INSERT. The statement will fail without failing anything else. The row cannot exist, and that’s because the original row does not exist anymore.

Unlike a replication corruption, this does not lead to accumulation mistakes. The replay is static, somewhat like in binary log format. Changes are just written, regardless of existing data.

I have given this considerable thought, and I can’t say I’ve covered all the possible scenario. However I believe that with proper use of INSERT IGNORE and REPLACE INTO (two statements I heavily relied on with oak-online-alter-table), correctness can be achieved.

There’s the small pain of re-generating the foreign key definition on the “ghost” table (CREATE TABLE LIKE … does not copy FK definitions). And since foreign key names are unique, a new name must be picked up. Not pretty, but perfectly doable.

“No AFTER_{INSERT/UPDATE/DELETE} triggers must exist.”

It would be nicer if MySQL had an ALTER TRIGGER statement. There isn’t such statement. If there were such an atomic statement, then we would be able to rewrite the trigger, so as to add our own code to the end of the trigger’s code. Yuck. Would be even nicer if we were allowed to have multiple triggers of same event.

So, we are left with DROP and CREATE triggers. Alas, this makes for a short period where the trigger does not exist. Bad. The easy solution would be to LOCK WRITE the table, but apparently you can’t DROP the trigger (*) when the table is locked. Sigh.

(*) Happened to me, apparently to Facebook too; With latest 5.1 (5.1.51) version this actually works. With 5.0 it didn’t use to; this needs more checking.

Use of INFORMATION_SCHEMA

As with oak-online-alter-table, the OSC checks for triggers, indexes, column by searching on the INFORMATION_SCHEMA tables. This makes for nice SQL for getting the exact listing and types of PRIMARY KEY columns, whether or not AFTER triggers exist, and so on.

I’ve always considered this to be the weak part of openark-kit, that it relies on INFORMATION_SCHEMA so much. It easier, it’s cleaner, it’s even more correct to work that way — but it just puts too much locks. I think Baron Schwartz (and now Daniel Nichter) did amazing work on analyzing table schemata by parsing the SHOW CREATE TABLE and other SHOW commands regex-wise with Maatkit. It’s a crazy work! Had I written openark-kit in Perl, I would have just import their code. But I’m too lazy busy to do the conversion from Perl to Python, and rewrite that code, what with all the debugging.

OSC is written in PHP. Again, much conversion work. I think performance-wise this is an important step to make.

A word for the critics

Finally, a word for the critics. I’ve read some Facebook/MySQL bashing comments and wish to relate.

In his interview to The Register, Mark Callaghan gave the example that “Open Schema Change lets the company update indexes without user downtime, according to Callaghan”.

PostgeSQL was mentioned for being able to add index with only read locks taken, or being able to do the work with no locks using CREATE INDEX CONCURRENTLY. I wish MySQL had that feature! Yes, MySQL has a lot to improve upon, and the latest PostgreSQL 9.0 brings valuable new features. (Did I make it clear I have no intention of bashing PostgreSQL? If not, please re-read this paragraph until convinced).

Bashing related to the notion of MySQL being so poor that Facebook used an even poorer mechanism to work out the ALTER TABLE.

Well, allow me to add a few words: the CREATE INDEX is by far not the only thing you can achieve with OSC (although it may be Facebook’s major concern). You should be able to:

  • Add columns
  • Drop columns
  • Convert character sets
  • Modify column types
  • Add partitioning
  • Reorganize partitioning
  • Compress the table
  • Otherwise changing table format
  • Heck, you could even modify the storage engine! (To other transactional engine)

These are giant steps. How easy would it be to write these down into the database? It only takes a few weeks time to work out a working solution with reasonable limitations, just using the resources the MySQL server provides you with. The MySQL@Facebook team should be given credit for that.


PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL analytics: information_schema polling for table engine percentages

Сентябрь 23rd, 2010

If you’ve ever needed to know how the data and index percentages per table engine were laid out on your MySQL server, but didn’t have the time to write out a query… here it is!

select
(select (sum(DATA_LENGTH)+sum(INDEX_LENGTH))/(POW(1024,3)) as total_size from tables) as total_size_gb,
(select sum(INDEX_LENGTH)/(POW(1024,3)) as index_size from tables) as total_index_gb,
(select sum(DATA_LENGTH)/(POW(1024,3)) as data_size from tables) as total_data_gb, 

(select ((sum(INDEX_LENGTH) / ( sum(DATA_LENGTH) + sum(INDEX_LENGTH)))*100) as perc_index from tables) as perc_index,
(select ((sum(DATA_LENGTH) / ( sum(DATA_LENGTH) + sum(INDEX_LENGTH)))*100) as perc_data from tables) as perc_data,

(select ((sum(INDEX_LENGTH) / ( sum(DATA_LENGTH) + sum(INDEX_LENGTH)))*100) as perc_index from tables where ENGINE='innodb') as innodb_perc_index,
(select ((sum(DATA_LENGTH) / ( sum(DATA_LENGTH) + sum(INDEX_LENGTH)))*100) as perc_data from tables where ENGINE='innodb') as innodb_perc_data,

(select ((sum(INDEX_LENGTH) / ( sum(DATA_LENGTH) + sum(INDEX_LENGTH)))*100) as perc_index from tables where ENGINE='myisam') as myisam_perc_index,
(select ((sum(DATA_LENGTH) / ( sum(DATA_LENGTH) + sum(INDEX_LENGTH)))*100) as perc_data from tables where ENGINE='myisam') as myisam_perc_data,

(select (select sum(INDEX_LENGTH) from tables where ENGINE='innodb') / (select sum(INDEX_LENGTH) from tables))*100 as perc_total_index_innodb,
(select (select sum(DATA_LENGTH) from tables where ENGINE='innodb') / (select sum(DATA_LENGTH) from tables))*100 as perc_total_data_innodb,

(select (select sum(INDEX_LENGTH) from tables where ENGINE='myisam') / (select sum(INDEX_LENGTH) from tables))*100 as perc_total_index_myisam,
(select (select sum(DATA_LENGTH) from tables where ENGINE='myisam') / (select sum(DATA_LENGTH) from tables))*100 as perc_total_data_myisam

from tables limit 1\G

This will output something along the lines of:

          total_size_gb: 2.7505537783727
         total_index_gb: 0.018660545349121
          total_data_gb: 2.7318932330236
             perc_index: 0.6784
              perc_data: 99.3216
      innodb_perc_index: 0.6502
       innodb_perc_data: 99.3498
      myisam_perc_index: 17.5120
       myisam_perc_data: 82.4880
perc_total_index_innodb: 95.6713
 perc_total_data_innodb: 99.8607
perc_total_index_myisam: 4.3287
 perc_total_data_myisam: 0.1393

PlanetMySQL Voting: Vote UP / Vote DOWN

Improving InnoDB Transaction Reporting

Август 18th, 2010

Everybody knows that parsing the output of SHOW ENGINE INNODB STATUS is hard, especially when you want to track the information historically, or want to aggregate any of the more dynamic sections such as the TRANSACTIONS one.

Within the InnoDB plugin the INFORMATION_SCHEMA.INNODB_TRX table was added, which allowed you to at least get some of the information on each transaction, but not the full breadth of of information that SHOW ENGINE INNODB STATUS provided.

“This is nice..” I thought “..but why not go the whole hog..?”.. And so I set about doing that, and opened up Bug#53336. In a very short time, I was in a review process with the InnoDB team, the patch was cleaned up, and (after a little hiccup) everything has been pushed for the next milestone.

Here’s the docs notes that I wrote for it today (with a little more annotation):

This patch adds the following columns:

mysql> DESC innodb_trx;
+----------------------------+---------------------+------+-----+---------------------+-------+
| FIELD                      | Type                | NULL | KEY | DEFAULT             | Extra |
+----------------------------+---------------------+------+-----+---------------------+-------+
...
| trx_operation_state        | varchar(64)         | YES  |     | NULL                |       |
| trx_tables_in_use          | bigint(21) UNSIGNED | NO   |     | 0                   |       |
| trx_tables_locked          | bigint(21) UNSIGNED | NO   |     | 0                   |       |
| trx_lock_structs           | bigint(21) UNSIGNED | NO   |     | 0                   |       |
| trx_lock_memory_bytes      | bigint(21) UNSIGNED | NO   |     | 0                   |       |
| trx_rows_locked            | bigint(21) UNSIGNED | NO   |     | 0                   |       |
| trx_rows_modified          | bigint(21) UNSIGNED | NO   |     | 0                   |       |
| trx_concurrency_tickets    | bigint(21) UNSIGNED | NO   |     | 0                   |       |
| trx_isolation_level        | varchar(16)         | NO   |     |                     |       |
| trx_unique_checks          | int(1)              | NO   |     | 0                   |       |
| trx_foreign_key_checks     | int(1)              | NO   |     | 0                   |       |
| trx_last_foreign_key_error | varchar(256)        | YES  |     | NULL                |       |
| trx_adaptive_hash_latched  | int(1)              | NO   |     | 0                   |       |
| trx_adaptive_hash_timeout  | bigint(21) UNSIGNED | NO   |     | 0                   |       |
+----------------------------+---------------------+------+-----+---------------------+-------+
22 rows IN SET (0.05 sec)

Most of these are self explanatory, a lot of them duplicate the information within SHOW ENGINE INNODB STATUS (so you now no longer need to parse the output of that to get complete transaction information).

Below are some of my notes. For comparison, here’s an example of a transaction from the SHOW ENGINE INNODB STATUS statement:

---TRANSACTION 517, ACTIVE 1 sec, OS thread id 2958520320 inserting
mysql TABLES IN USE 2, locked 2
189 LOCK struct(s), heap size 27968, 54389 row LOCK(s), undo log entries 2406
MySQL thread id 2, query id 36 localhost root Sending DATA
INSERT IGNORE INTO t1 (SELECT (i * rand())*100, 1 FROM t1)

trx_operation_state - corresponds to “inserting”, InnoDB’s internal transaction state
trx_tables_in_use - corresponds to “mysql tables in use 2″, the number of table locks requested by MySQL via external_lock()
trx_tables_locked - corresponds to “locked 2″, the number of actual table locks taken via external_lock()
trx_lock_structs - corresponds to “189 lock struct(s)”, the size of the lock struct list
trx_lock_memory_bytes - corresponds to “heap size 27968″, the number of bytes allocated to locks structs
trx_rows_locked - corresponds to “54389 row lock(s)”, an estimation of the number of rows locked (delete marked rows may make it imprecise)
trx_rows_modified - corresponds to “undo log entries 2406″, the number of rows modified in the transaction (inserted, updated, deleted)

Not seen in the transaction output above:

trx_concurrency_tickets - corresponds to “thread declared inside InnoDB 89″ for a transaction in SHOW ENGINE INNODB STATUS, the number of concurrency tickets remaining for the transaction when innodb_thread_concurrency != 0
trx_isolation_level - the transactions isolation level
trx_unique_checks - whether the transaction has “SET UNIQUE_CHECKS = 0″
trx_foreign_key_checks - whether the transaction has “SET FOREIGN_KEY_CHECKS = 0″
trx_last_foreign_key_error - if the last statement in the transaction resulted in an FK error, the error text is printed here
trx_adaptive_hash_latched - corresponds to “holds adaptive hash latch” being printed for a transaction in SHOW ENGINE INNODB STATUS
trx_adaptive_hash_timeout - when innodb_adapative_hash_index is enabled (default), statements that try to get the adapative hash latch spin 10000 (BTR_SEA_TIMEOUT) times, re-trying getting the adaptive hash latch, before giving up. lower numbers here for a lot of transactions may indicate contention on the adaptive hash latch

In summary - no more having to parse SHOW ENGINE INNODB STATUS output for transaction information, now you can just do it with a SQL (with all it’s aggregation goodness as well if you want to)!


PlanetMySQL Voting: Vote UP / Vote DOWN

INFORMATION_SCHEMA tables are case sensitive

Июль 20th, 2010

I wanted to get examples of some of the extra information that the Percona server has in its INFORMATION_SCHEMA metadata, and in doing so, I stumbled across an interesting MySQL bug/feature/point — INFORMATION_SCHEMA tables (which are actually system views) are case sensitive when used in comparisons:

mysql> use information_schema;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql>  select @@version;
+--------------------+
| @@version          |
+--------------------+
| 5.1.36-xtradb6-log |
+--------------------+
1 row in set (0.00 sec)

mysql> use information_schema;
Database changed
mysql> show tables like 'innodb%';
Empty set (0.00 sec)

mysql>  show tables like 'INNODB%';
+----------------------------------------+
| Tables_in_information_schema (INNODB%) |
+----------------------------------------+
| INNODB_BUFFER_POOL_PAGES_INDEX         |
| INNODB_RSEG                            |
| INNODB_LOCKS                           |
| INNODB_BUFFER_POOL_PAGES               |
| INNODB_TRX                             |
| INNODB_INDEX_STATS                     |
| INNODB_LOCK_WAITS                      |
| INNODB_CMP_RESET                       |
| INNODB_CMP                             |
| INNODB_CMPMEM_RESET                    |
| INNODB_BUFFER_POOL_PAGES_BLOB          |
| INNODB_CMPMEM                          |
| INNODB_TABLE_STATS                     |
+----------------------------------------+
13 rows in set (0.00 sec)

It is not just for the new tables Percona has added:

mysql> show tables like 'table%';

Empty set (0.00 sec)

mysql> show tables like 'TABLE%';

+---------------------------------------+
| Tables_in_information_schema (TABLE%) |
+---------------------------------------+
| TABLES                                |
| TABLE_CONSTRAINTS                     |
| TABLE_PRIVILEGES                      |
+---------------------------------------+
3 rows in set (0.00 sec)

And it is not due to the collation:

mysql> show global variables like '%collat%';
+----------------------+-------------------+
| Variable_name        | Value             |
+----------------------+-------------------+
| collation_connection | latin1_swedish_ci |
| collation_database   | latin1_swedish_ci |
| collation_server     | latin1_swedish_ci |
+----------------------+-------------------+
3 rows in set (0.00 sec)

mysql> show session variables like '%collat%';
+----------------------+-------------------+
| Variable_name        | Value             |
+----------------------+-------------------+
| collation_connection | latin1_swedish_ci |
| collation_database   | utf8_general_ci   |
| collation_server     | latin1_swedish_ci |
+----------------------+-------------------+
3 rows in set (0.00 sec)

PlanetMySQL Voting: Vote UP / Vote DOWN