Archive for the ‘performance’ Category

New flushing algorithm in InnoDB

April 13th, 2012

In MySQL labs release April 2012 we have reworked the flushing heuristics in InnoDB. In this post I’ll give an overview of what we have changed and the various configuration variables we have introduced to fine tune the flushing algorithm. If you are interested in finding out how our new flushing algorithm fares in benchmarks you can get these details in Dimitri’s well explained blog here.

Flushing means writing dirty pages to disk. I have explained adaptive_flushing and the types of flushing in some detail in my previous notes. Please go through those notes if you want to make sense of what follows.

The page_cleaner thread checks the state of the system every second. It takes into account the number of dirty pages, the amount of reusable redo space, the rate at which redo is generated, and the IO capacity for which the server is configured, and based on these factors decides how many pages need to be flushed.

In the new scheme of things the page_cleaner thread uses a single non-linear formula to calculate how many pages we need to flush to have sufficient reusable redo space. This is different from the current flushing heuristic, where async_water_mark is taken as the point at which we change our flushing algorithm. Similarly, instead of using innodb_max_dirty_pages_pct as a switch that triggers flushing, we have introduced the concept of a range: flushing to control the dirty pages percentage starts once we cross the low water mark and gets more and more aggressive as we near the high water mark.

There are four new configuration variables. Note that if your system is not experiencing any IO spikes due to checkpoints then you can probably leave all of the following as is. All the variables are global in scope and can be set dynamically.

  • innodb_adaptive_flushing_lwm: Low water mark, measured as a percentage of total redo log size, at which adaptive flushing kicks in. If the currently unusable redo space is less than this value, no background flushing will happen. Default value is 10 and permissible values are 0 – 70.
  • innodb_max_dirty_pages_pct_lwm: Low water mark of dirty pages in %age where preflushing to control dirty page ratio kicks in. Default value is 0 which has the special meaning of this value having no effect. Allowable values are 0 – 99.
  • innodb_max_io_capacity: InnoDB generally attempts to work within the limits of innodb_io_capacity. However, if it needs to do aggressive flushing then innodb_max_io_capacity defines the limit to which the write IOPs can be stretched. Default value is 2000 which is ten times the default value for innodb_io_capacity.
  • innodb_flushing_avg_loops: Number of iterations for which we keep the previously calculated snapshot of the flushing state. This variable is roughly a measure of how smooth you want the transition in the flushing activity to be. The higher the value, the smoother the transition in flushing in the face of a rapidly changing workload. A lower value implies that the flushing algorithm is more responsive, but it also means that flushing activity can become spiky when confronted with a quickly changing workload. Default value is 30 and permissible range is 1 – 1000.
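
Since all four variables are global and dynamic, they can be inspected and changed at runtime. The following is a minimal sketch, assuming a server built from this labs release (the variable names are the ones listed above and may differ in other releases). The values are illustrative placeholders inside the stated ranges, not tuning recommendations.

```sql
-- Inspect the current settings (names as given in this post).
SHOW GLOBAL VARIABLES LIKE 'innodb_adaptive_flushing_lwm';
SHOW GLOBAL VARIABLES LIKE 'innodb_max_dirty_pages_pct_lwm';
SHOW GLOBAL VARIABLES LIKE 'innodb_max_io_capacity';
SHOW GLOBAL VARIABLES LIKE 'innodb_flushing_avg_loops';

-- Illustrative values only; each lies within the range quoted above.
-- Start adaptive flushing once 20% of the redo log is unusable.
SET GLOBAL innodb_adaptive_flushing_lwm = 20;
-- Start preflushing when 50% of buffer pool pages are dirty
-- (assumed to be below innodb_max_dirty_pages_pct on this server).
SET GLOBAL innodb_max_dirty_pages_pct_lwm = 50;
-- Allow bursts of up to 4000 write IOPS during aggressive flushing.
SET GLOBAL innodb_max_io_capacity = 4000;
-- Average over more iterations for a smoother flushing rate.
SET GLOBAL innodb_flushing_avg_loops = 60;
```

Changing one variable at a time and watching the flushing metrics between changes makes it easier to attribute any effect to a specific setting.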

Finally, there is a wealth of information available to you to monitor the impact of changing these variables. You can get a good inside view of how flushing activity is working by querying the innodb_metrics table.

mysql> select name, comment from information_schema.innodb_metrics where name like 'Buffer_flush%';
+-------------------------------------+----------------------------------------------------------------+
| name                                | comment                                                        |
+-------------------------------------+----------------------------------------------------------------+
| buffer_flush_batch_scanned          | Total pages scanned as part of flush batch                     |
| buffer_flush_batch_num_scan         | Number of times buffer flush list flush is called              |
| buffer_flush_batch_scanned_per_call | Pages scanned per flush batch scan                             |
| buffer_flush_batch_total_pages      | Total pages flushed as part of flush batch                     |
| buffer_flush_batches                | Number of flush batches                                        |
| buffer_flush_batch_pages            | Pages queued as a flush batch                                  |
| buffer_flush_neighbor_total_pages   | Total neighbors flushed as part of neighbor flush              |
| buffer_flush_neighbor               | Number of times neighbors flushing is invoked                  |
| buffer_flush_neighbor_pages         | Pages queued as a neighbor batch                               |
| buffer_flush_n_to_flush_requested   | Number of pages requested for flushing.                        |
| buffer_flush_avg_page_rate          | Average number of pages at which flushing is happening         |
| buffer_flush_lsn_avg_rate           | Average redo generation rate                                   |
| buffer_flush_pct_for_dirty          | Percent of IO capacity used to avoid max dirty page limit      |
| buffer_flush_pct_for_lsn            | Percent of IO capacity used to avoid reusable redo space limit |
| buffer_flush_adaptive_total_pages   | Total pages flushed as part of adaptive flushing               |
| buffer_flush_adaptive               | Number of adaptive batches                                     |
| buffer_flush_adaptive_pages         | Pages queued as an adaptive batch                              |
| buffer_flush_sync_total_pages       | Total pages flushed as part of sync batches                    |
| buffer_flush_sync                   | Number of sync batches                                         |
| buffer_flush_sync_pages             | Pages queued as a sync batch                                   |
| buffer_flush_background_total_pages | Total pages flushed as part of background batches              |
| buffer_flush_background             | Number of background batches                                   |
| buffer_flush_background_pages       | Pages queued as a background batch                             |
+-------------------------------------+----------------------------------------------------------------+
23 rows in set (0.00 sec)
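
To watch the counters themselves rather than their descriptions, a query along the following lines may help. It is a sketch that assumes the build exposes the innodb_monitor_enable variable and the count and status columns of innodb_metrics, as MySQL 5.6 does. The selection of counters is just one plausible starting point.

```sql
-- Assumption: many innodb_metrics counters are disabled by default
-- and need to be switched on before they start collecting.
SET GLOBAL innodb_monitor_enable = 'buffer_flush_%';

-- Sample the counters that show why pages are being flushed
-- (dirty page pressure versus redo space pressure) and how fast.
SELECT name, `count`, status
  FROM information_schema.innodb_metrics
 WHERE name IN ('buffer_flush_pct_for_dirty',
                'buffer_flush_pct_for_lsn',
                'buffer_flush_avg_page_rate',
                'buffer_flush_lsn_avg_rate',
                'buffer_flush_n_to_flush_requested');
```

Running the SELECT at regular intervals before and after changing a flushing variable gives a rough picture of how the change affects flushing behavior.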



InnoDB performance improvements

April 11th, 2012

The problem
After making several performance fixes, notable among them the kernel mutex split and the new handling of read-only transactions (in particular non-locking auto-commit read-only transactions), we weren’t seeing any increase in transactions per second (TPS) on our high-end hardware. On one particular host, with 24 cores and 2 threads per core, the TPS using Sysbench was a tepid 5.6K at 16 threads and more or less plateaued up to 1K user threads. No matter what config settings we used, we would more or less end up with the same result.

We ended up getting together for a meeting in Paris to discuss this issue, and during the brainstorming one of the potential issues that cropped up was the effect of cache coherence and/or false sharing. Using the excellent Linux tool perf, we were able to narrow it down to a global statistic counter in row_sel_search_for_mysql(). Mikael Ronstrom explains this in more detail.

The solution
Create a generic counter class (InnoDB code is now C++) that splits the counter into multiple (configurable) slots that sit on separate 64-byte cache lines. Use the thread id of the updating thread to index into a slot, which reduces the contention/sharing. It had the desired effect: the TPS went from 5.6K to 15K at 64 user threads and stayed close to stable right up to 1K, with very slow degradation. This was using Sysbench OLTP_RO for autocommit non-locking read-only queries (Sysbench option --oltp-skip-trx=off).

The code and binary can be downloaded from labs release downloads, the current release is mysql-5.6.6-labs-april-2012-*. See the code in include/os0thread.h. The new class is ib_counter_t.

We have now refactored the code and grouped all the InnoDB statistic counters in srv_counter_t. This will help in further consolidation and improvements. Currently, most of the InnoDB config and statistics variables are defined in srv0srv.cc (with a few exceptions). We need to start paying even more attention to their layout and alignment from now on. There seem to be some false sharing issues that we haven’t completely identified yet.

Results
I think it is better to look at Dimitri’s blog for results that reflect the improvements.

Enjoy!



April 2012 Labs Release – Online DDL Improvements

April 11th, 2012

This feature is a continuation of the “Fast Index Creation” feature described in “Fast Index Creation in the InnoDB Storage Engine”. Now you can perform other kinds of DDL operations on InnoDB tables online: that is, with minimal delay for operations on that table, and without rebuilding the entire table. This enhancement improves responsiveness and availability in busy production environments, where making a table unavailable for seconds or minutes whenever its column definitions change is not practical.

The DDL operations enhanced by this feature are these variations on the ALTER TABLE statement:

  • Create secondary indexes: CREATE INDEX name ON table (col_list) or ALTER TABLE table ADD INDEX name (col_list)

    Drop secondary indexes: DROP INDEX name ON table; or ALTER TABLE table DROP INDEX name

    Creating and dropping secondary indexes on InnoDB tables has avoided the table-copying behavior since the days of MySQL 5.1 with the InnoDB Plugin. Now, the table remains available for read and write operations while the index is being created or dropped. The CREATE INDEX or DROP INDEX statement only finishes after all transactions that are modifying the table are completed, so that the initial state of the index reflects the most recent contents of the table.

    Previously, modifying the table while an index was being created or dropped typically resulted in a deadlock that cancelled the insert, update, or delete statement on the table.

  • Changing the auto-increment value for a column: ALTER TABLE table AUTO_INCREMENT=next_value;

    Especially in a distributed system using replication or sharding, you sometimes reset the auto-increment counter for a table to a specific value. The next row inserted into the table uses the specified value for its auto-increment column. You might also use this technique in a data warehousing environment where you periodically empty all the tables and reload them, and you can restart the auto-increment sequence from 1.

  • Drop a foreign key constraint: ALTER TABLE tbl DROP FOREIGN KEY fk_name

    Currently, online DDL only includes the DROP operation for foreign keys, not ADD to create foreign keys.

    If you do not know the names of the foreign key constraints on a particular table, issue the following statement and find the constraint name in the CONSTRAINT clause for each foreign key:

    show create table table\G
    

    Or, query the information_schema.table_constraints table and use the constraint_name and constraint_type columns to identify the foreign key names.

    As a consequence of this enhancement, you can now also drop a foreign key and its associated index in a single statement, which previously required separate statements in a strict order:

    ALTER TABLE  table DROP FOREIGN KEY  constraint, DROP INDEX index;
    
  • Rename a column: ALTER TABLE tbl CHANGE old_col_name new_col_name datatype

    When you keep the same data type and only change the column name, this operation can always be performed online. As part of this enhancement, you can now rename a column that is part of a foreign key constraint, which was not allowed before.
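
The statement forms listed above can be put together into one short session. This is a sketch only: the table orders, its columns, and the index and constraint names are hypothetical and stand in for your own schema.

```sql
-- Hypothetical table, column, index, and constraint names throughout.

-- Create and drop a secondary index.
CREATE INDEX i_customer ON orders (customer_id);
DROP INDEX i_customer ON orders;

-- Reset the auto-increment counter to a specific next value.
ALTER TABLE orders AUTO_INCREMENT = 100000;

-- Look up foreign key names when they were auto-generated.
SELECT constraint_name
  FROM information_schema.table_constraints
 WHERE table_schema = 'test'
   AND table_name = 'orders'
   AND constraint_type = 'FOREIGN KEY';

-- Drop a foreign key and its supporting index in one statement.
ALTER TABLE orders
  DROP FOREIGN KEY fk_orders_customer,
  DROP INDEX i_orders_customer;

-- Rename a column, repeating its existing definition exactly.
ALTER TABLE orders CHANGE cust_note customer_note VARCHAR(255) NOT NULL;
```

For the rename, copying the column definition from the SHOW CREATE TABLE output avoids accidentally changing the data type along with the name.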

Note

As your database schema evolves with new columns, data types, constraints, indexes, and so on, keep your CREATE TABLE statements up to date with the latest table definitions. Even with the performance improvements of online DDL, it is more efficient to create stable database structures at the beginning, rather than creating part of the schema and then issuing ALTER TABLE statements afterward.

The main exception to this guideline is for secondary indexes on tables with large numbers of rows. It is typically most efficient to create the table with all details specified except the secondary indexes, load the data, then create the secondary indexes.

Whatever sequence of CREATE TABLE, CREATE INDEX, ALTER TABLE, and similar statements went into putting a table together, you can capture the SQL needed to reconstruct the current form of the table by issuing the statement SHOW CREATE TABLE table\G (uppercase \G required for tidy formatting). This output shows clauses such as numeric precision, NOT NULL, and CHARACTER SET that are sometimes added behind the scenes, and you might otherwise leave out when cloning the table on a new system or setting up foreign key columns with identical type.

Performance and Availability Considerations

Avoiding a table copy during DDL improves several aspects of MySQL operation, such as performance, concurrency, availability, and scalability:

  • By avoiding the disk I/O and CPU cycles to rebuild the table, you minimize the overall load on the database and maintain good performance and high throughput during the DDL operation.

  • Because the DDL operation completes in less time, there is a shorter period when queries and DML operations on that table are blocked, making your application more responsive.

  • Because less data is read into the buffer pool, you avoid purging frequently accessed data from the memory cache, which typically causes a temporary performance dip after a DDL operation.

  • Because there is a shorter period when queries and DML operations queue up waiting for the DDL to finish, there is less locking and waiting for other resources all throughout the MySQL server. Reducing this type of overhead leads to greater scalability, even for operations not involving the table being altered.

Benchmarking

To judge the relative performance of online DDL operations, you can run such operations on a big InnoDB table using current and earlier versions of MySQL. You can also run all the performance tests under the latest MySQL version, simulating the previous DDL behavior for the “before” results, by setting the old_alter_table system variable. Issue the statement set old_alter_table=1 in the session, and measure DDL performance to record the “before” figures. Then set old_alter_table=0 to re-enable the newer, faster behavior, and run the DDL operations again to record the “after” figures.

For a basic idea of whether a DDL operation does its changes in-place or performs a table copy, look at the “rows affected” value displayed after the command finishes. For example, here are lines you might see after doing different types of DDL operations:

  • Changing the default value of a column (super-fast, does not affect the table data at all):

    Query OK, 0 rows affected (0.07 sec)
  • Adding an index (takes time, but 0 rows affected shows that the table is not copied):

    Query OK, 0 rows affected (21.42 sec)
  • Changing the data type of a column (takes substantial time and does require rebuilding all the rows of the table):

    Query OK, 1671168 rows affected (1 min 35.54 sec)

For example, before running a DDL operation on a big table, you might check whether the operation will be fast or slow as follows:

  1. Clone the table structure.

  2. Populate the cloned table with a tiny amount of data.

  3. Run the DDL operation on the cloned table.

  4. Check whether the “rows affected” value is zero or not. A non-zero value means the operation will require rebuilding the entire table, which might require special planning. For example, you might do the DDL operation during a period of scheduled downtime, or on each replication slave server one at a time.
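
The four steps above can be sketched as follows. The names big_table_probe and i_probe are hypothetical, and the ALTER TABLE statement is a placeholder for whichever DDL operation you actually intend to run.

```sql
-- 1. Clone the table structure (columns and indexes are copied;
--    CREATE TABLE ... LIKE does not copy foreign key definitions).
CREATE TABLE big_table_probe LIKE big_table;

-- 2. Populate the clone with a tiny amount of data.
INSERT INTO big_table_probe SELECT * FROM big_table LIMIT 100;

-- 3. Run the candidate DDL operation on the clone.
ALTER TABLE big_table_probe ADD INDEX i_probe (data_type);

-- 4. Read the "rows affected" figure printed for step 3:
--    zero suggests an in-place change, non-zero a table rebuild.

-- Clean up the probe table afterward.
DROP TABLE big_table_probe;
```

Because the clone holds only a handful of rows, the probe finishes quickly even when the operation would take minutes on the real table.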

For a deeper understanding of the reduction in MySQL processing, examine the PERFORMANCE_SCHEMA and INFORMATION_SCHEMA tables related to InnoDB before and after DDL operations, to see the number of physical reads, writes, memory allocations, and so on.
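
As a simpler stand-in for the PERFORMANCE_SCHEMA and INFORMATION_SCHEMA tables, one rough way to make such a before-and-after comparison is to sample a few server-wide InnoDB status counters around the DDL statement. The sketch below assumes an otherwise idle test server, since these counters cover all activity and not only the DDL operation. The index name i_probe is hypothetical.

```sql
-- Snapshot before the DDL operation.
SHOW GLOBAL STATUS LIKE 'Innodb_data_reads';
SHOW GLOBAL STATUS LIKE 'Innodb_data_writes';
SHOW GLOBAL STATUS LIKE 'Innodb_pages_written';

-- The operation being measured (placeholder DDL).
ALTER TABLE big_table ADD INDEX i_probe (data_type);

-- Snapshot after; subtract the earlier values to estimate the
-- physical reads and writes attributable to the DDL.
SHOW GLOBAL STATUS LIKE 'Innodb_data_reads';
SHOW GLOBAL STATUS LIKE 'Innodb_data_writes';
SHOW GLOBAL STATUS LIKE 'Innodb_pages_written';
```

Repeating the measurement with old_alter_table=1 gives a comparable set of figures for the table-copying behavior.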

Background Information

Historically, the MySQL server and InnoDB have each kept their own metadata about table and index structures. The MySQL server stores this information in .frm files that are not protected by a transactional mechanism, while InnoDB has its own data dictionary as part of the system tablespace. If a DDL operation was interrupted by a crash or other unexpected event partway through, the metadata could be left inconsistent between these two locations, causing problems such as startup errors or inability to access the table that was being altered. Now that InnoDB is the default storage engine, addressing such issues is a high priority. These enhancements to DDL operations reduce the window of opportunity for such issues to occur.

Examples

Here are code examples showing some operations whose performance, concurrency, and scalability are improved by this new feature.

Example 1. Schema Setup Code for Online DDL Experiments

Here is the code that sets up the initial tables used in these demonstrations:

/*
Setup code for the online DDL demonstration:
- Set up some config variables.
- Create 2 tables that are clones of one of the INFORMATION_SCHEMA tables
  that always has some data. The "small" table has a couple of thousand rows.
  For the "big" table, keep doubling the data until it reaches over a million rows.
- Set up a primary key for the sample tables, since we are demonstrating InnoDB aspects.
*/ 

set autocommit = 0;
set foreign_key_checks = 1;
set global innodb_file_per_table = 1;
set old_alter_table=0;
prompt mysql: 

use test;

\! echo "Setting up 'small' table:"
drop table if exists small_table;
create table small_table as select * from information_schema.columns;
alter table small_table add id int unsigned not null primary key auto_increment;
select count(id) from small_table;

\! echo "Setting up 'big' table:"
drop table if exists big_table;
create table big_table as select * from information_schema.columns;
show create table big_table\G

insert into big_table select * from big_table;
insert into big_table select * from big_table;
insert into big_table select * from big_table;
insert into big_table select * from big_table;
insert into big_table select * from big_table;
insert into big_table select * from big_table;
insert into big_table select * from big_table;
insert into big_table select * from big_table;
insert into big_table select * from big_table;
insert into big_table select * from big_table;
commit;

alter table big_table add id int unsigned not null primary key auto_increment;
select count(id) from big_table;

Running this code gives this output, condensed for brevity:

Setting up 'small' table:

Query OK, 1678 rows affected (0.13 sec)
Query OK, 1678 rows affected (0.07 sec)

+-----------+
| count(id) |
+-----------+
|      1678 |
+-----------+
1 row in set (0.00 sec)

Setting up 'big' table:

Query OK, 1678 rows affected (0.17 sec)

*************************** 1. row ***************************
       Table: big_table
Create Table: CREATE TABLE `big_table` (
  `TABLE_CATALOG` varchar(512) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `TABLE_SCHEMA` varchar(64) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `TABLE_NAME` varchar(64) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `COLUMN_NAME` varchar(64) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `ORDINAL_POSITION` bigint(21) unsigned NOT NULL DEFAULT '0',
  `COLUMN_DEFAULT` longtext CHARACTER SET utf8,
  `IS_NULLABLE` varchar(3) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `DATA_TYPE` varchar(64) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `CHARACTER_MAXIMUM_LENGTH` bigint(21) unsigned DEFAULT NULL,
  `CHARACTER_OCTET_LENGTH` bigint(21) unsigned DEFAULT NULL,
  `NUMERIC_PRECISION` bigint(21) unsigned DEFAULT NULL,
  `NUMERIC_SCALE` bigint(21) unsigned DEFAULT NULL,
  `DATETIME_PRECISION` bigint(21) unsigned DEFAULT NULL,
  `CHARACTER_SET_NAME` varchar(32) CHARACTER SET utf8 DEFAULT NULL,
  `COLLATION_NAME` varchar(32) CHARACTER SET utf8 DEFAULT NULL,
  `COLUMN_TYPE` longtext CHARACTER SET utf8 NOT NULL,
  `COLUMN_KEY` varchar(3) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `EXTRA` varchar(30) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `PRIVILEGES` varchar(80) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `COLUMN_COMMENT` varchar(1024) CHARACTER SET utf8 NOT NULL DEFAULT ''
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

Query OK, 1678 rows affected (0.09 sec)
Query OK, 3356 rows affected (0.07 sec)
Query OK, 6712 rows affected (0.17 sec)
Query OK, 13424 rows affected (0.44 sec)
Query OK, 26848 rows affected (0.63 sec)
Query OK, 53696 rows affected (1.72 sec)
Query OK, 107392 rows affected (3.02 sec)
Query OK, 214784 rows affected (6.28 sec)
Query OK, 429568 rows affected (13.25 sec)
Query OK, 859136 rows affected (28.16 sec)
Query OK, 1718272 rows affected (1 min 9.22 sec)

+-----------+
| count(id) |
+-----------+
|   1718272 |
+-----------+
1 row in set (1.75 sec)

Example 2. Speed and Efficiency of CREATE INDEX and DROP INDEX

Here is a sequence of statements demonstrating the relative speed of CREATE INDEX and DROP INDEX statements. For a small table, the elapsed time is less than a second whether we use the fast or slow technique, so we look at the “rows affected” output to verify which operations can avoid the table rebuild. For a large table, the difference in efficiency is obvious because skipping the table rebuild saves substantial time.

\! clear

-- Make sure we're using the new-style fast DDL.
-- Outside of benchmarking and testing, you would
-- never enable the old_alter_table setting.
set old_alter_table=0;

\! echo "=== Create and drop index (small table, new/fast technique) ==="
\! echo
\! echo "Data size (kilobytes) before index created: "
\! du -k data/test/small_table.ibd
create index i_dtyp_small on small_table (data_type);
\! echo "Data size after index created: "
\! du -k data/test/small_table.ibd
drop index i_dtyp_small on small_table;

-- Revert to the older slower DDL for comparison.
set old_alter_table=1;

\! echo "=== Create and drop index (small table, old/slow technique) ==="
\! echo
\! echo "Data size (kilobytes) before index created: "
\! du -k data/test/small_table.ibd
create index i_dtyp_small on small_table (data_type);
\! echo "Data size after index created: "
\! du -k data/test/small_table.ibd
drop index i_dtyp_small on small_table;

-- In the above example, we examined the "rows affected" number,
-- ideally looking for a zero figure. Let's try again with a larger
-- sample size, where we'll see that the actual time taken can
-- vary significantly.

-- Back to the new/fast behavior:
set old_alter_table=0;

\! echo "=== Create and drop index (big table, new/fast technique) ==="
\! echo
\! echo "Data size (kilobytes) before index created: "
\! du -k data/test/big_table.ibd
create index i_dtyp_big on big_table (data_type);
\! echo "Data size after index created: "
\! du -k data/test/big_table.ibd
drop index i_dtyp_big on big_table;

-- Let's see that again, in slow motion:
set old_alter_table=1;

\! echo "=== Create and drop index (big table, old/slow technique) ==="
\! echo
\! echo "Data size (kilobytes) before index created: "
\! du -k data/test/big_table.ibd
create index i_dtyp_big on big_table (data_type);
\! echo "Data size after index created: "
\! du -k data/test/big_table.ibd
drop index i_dtyp_big on big_table;

Running this code gives this output, condensed for brevity:

=== Create and drop index (small table, new/fast technique) ===

Data size (kilobytes) before index created:
384	data/test/small_table.ibd
Query OK, 0 rows affected (0.04 sec)

Data size after index created:
432	data/test/small_table.ibd
Query OK, 0 rows affected (0.02 sec)

=== Create and drop index (small table, old/slow technique) ===

Data size (kilobytes) before index created:
432	data/test/small_table.ibd
Query OK, 1678 rows affected (0.12 sec)

Data size after index created:
448	data/test/small_table.ibd
Query OK, 1678 rows affected (0.10 sec)

=== Create and drop index (big table, new/fast technique) ===

Data size (kilobytes) before index created:
315392	data/test/big_table.ibd
Query OK, 0 rows affected (33.32 sec)

Data size after index created:
335872	data/test/big_table.ibd
Query OK, 0 rows affected (0.02 sec)

=== Create and drop index (big table, old/slow technique) ===

Data size (kilobytes) before index created:
335872	data/test/big_table.ibd
Query OK, 1718272 rows affected (1 min 5.01 sec)

Data size after index created:
348160	data/test/big_table.ibd
Query OK, 1718272 rows affected (46.59 sec)

Example 3. Concurrent DML During CREATE INDEX and DROP INDEX

Here are some snippets of code that I ran in separate mysql sessions connected to the same database, to illustrate DML statements (insert, update, or delete) running at the same time as CREATE INDEX and DROP INDEX.

CREATE INDEX statement (in session 1):

/*
CREATE INDEX statement to run against a table while
insert/update/delete statements are modifying the
column being indexed.
*/

-- We'll run this script in one session, while simultaneously running DML
-- statements against test/big_table.table_name in another session.

use test;
create index i_concurrent on big_table(table_name);

DROP INDEX statement (in session 1):

/*
DROP INDEX statement to run against a table while
insert/update/delete statements are modifying the
column being indexed.
*/

-- We'll run this script in one session, while simultaneously running DML
-- statements against test/big_table.table_name in another session.

use test;
drop index i_concurrent on big_table;

DML statements (in session 2). I ran these DELETE statements while the CREATE INDEX was running. Because DROP INDEX in this case takes less than a second, I started the DELETE first and then ran the DROP INDEX while the DML was in progress. In each case, the DDL statement waited until the transaction with the DELETE was finished. (That’s why the timing numbers are higher than in the other examples. I waited for a while before issuing the ROLLBACK statement at the mysql command line.)

/*
Some queries and insert/update/delete statements to run against a table
while an index is being created or dropped. Previously, these operations
would have stalled during the index create/drop period and possibly
timed out or deadlocked.
*/

-- We'll run this script in one session, while simultaneously creating and dropping
-- an index on test/big_table.table_name in another session.

-- In our test instance, that column has about 1.7M rows, with 136 different values.

set autocommit = 0;
use test;

select distinct character_set_name from big_table where table_name = 'FILES';
delete from big_table where table_name = 'FILES';
select distinct character_set_name from big_table where table_name = 'FILES';

-- I'll issue the final rollback interactively, not via script,
-- the better to control the timing.
-- rollback;

Running this code gives this output, condensed for brevity:

mysql: source concurrent_ddl_create.sql
Database changed
Query OK, 0 rows affected (1 min 25.15 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql: source concurrent_ddl_drop.sql
Database changed
Query OK, 0 rows affected (24.98 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql: source concurrent_dml.sql
Query OK, 0 rows affected (0.00 sec)

Database changed
+--------------------+
| character_set_name |
+--------------------+
| NULL               |
| utf8               |
+--------------------+
2 rows in set (0.32 sec)

Query OK, 38912 rows affected (1.84 sec)

Empty set (0.01 sec)

mysql: rollback;
Query OK, 0 rows affected (1.05 sec)

Example 4. Renaming a Column

Here is a demonstration of using ALTER TABLE to rename a column. We use the new, fast DDL mechanism to change the name, then the old, slow DDL mechanism (with old_alter_table=1) to restore the original column name.

Note
  • Because the syntax for renaming a column also involves re-specifying the data type, be very careful to specify exactly the same data type to avoid a costly table rebuild. In this case, we checked the output of show create table table\G and copied any clauses such as CHARACTER SET and NOT NULL from the original column definition.

  • Again, renaming a column for a small table is fast enough that we need to examine the “rows affected” number to verify that the new DDL mechanism is more efficient than the old one. With a big table, the difference in elapsed time makes the improvement obvious.

/*
Run through a sequence of 'rename column' statements.
Because this operation involves only metadata, not table data,
it is fast for big and small tables with the new DDL mechanism.
The old mechanism still rebuilds the table, as the timings show.
*/

\! clear

\! echo "Rename column (fast technique, small table):"
set old_alter_table=0;
alter table small_table change `IS_NULLABLE` `NULLABLE`
  varchar(3) character set utf8 not null;
\! echo "Rename back to original name (slow technique):"
set old_alter_table=1;
alter table small_table change `NULLABLE` `IS_NULLABLE`
  varchar(3) character set utf8 not null;

\! echo "Rename column (fast technique, big table):"
set old_alter_table=0;
alter table big_table change `IS_NULLABLE` `NULLABLE`
  varchar(3) character set utf8 not null;
\! echo "Rename back to original name (slow technique):"
set old_alter_table=1;
alter table big_table change `NULLABLE` `IS_NULLABLE`
  varchar(3) character set utf8 not null;
set old_alter_table=0;

Running this code gives this output, condensed for brevity:

Rename column (fast technique, small table):
Query OK, 0 rows affected (0.13 sec)

Rename back to original name (slow technique):
Query OK, 1678 rows affected (0.35 sec)

Rename column (fast technique, big table):
Query OK, 0 rows affected (0.11 sec)

Rename back to original name (slow technique):
Query OK, 1718272 rows affected (1 min 0.00 sec)

Example 5. Dropping Foreign Keys

Here is a demonstration of foreign keys, including improvement to the speed of dropping a foreign key constraint.

/*
Demonstrate aspects of foreign keys that are or aren't affected by the DDL improvements.
- Create a new table with only a few values to serve as the parent table.
- Set up the 'small' and 'big' tables as child tables using a foreign key.
- Verify that the ON DELETE CASCADE clause makes changes ripple from parent to child tables.
- Drop the foreign key constraints, and optionally associated indexes.
  (This is the operation that is sped up.)
*/

\! clear

-- Make sure foreign keys are being enforced, and allow
-- rollback after doing some DELETEs that affect both
-- parent and child tables.
set foreign_key_checks = 1;
set autocommit = 0;

-- Create a parent table, containing values that we know are already present
-- in the child tables.
drop table if exists schema_names;
create table schema_names (id int unsigned not null primary key auto_increment,
  schema_name varchar(64) character set utf8 not null, index i_schema (schema_name))
  as select distinct table_schema schema_name from small_table;

show create table schema_names\G
show create table small_table\G
show create table big_table\G

-- Creating the foreign key constraint isn't any faster than before.
-- It still involves a table rebuild, as illustrated by the "rows affected" figure.
alter table small_table add constraint small_fk
  foreign key i_table_schema (table_schema) references schema_names(schema_name)
  on delete cascade;
alter table big_table add constraint big_fk
  foreign key i_table_schema (table_schema) references schema_names(schema_name)
  on delete cascade;

show create table small_table\G
show create table big_table\G

select schema_name from schema_names order by schema_name;
select count(table_schema) howmany, table_schema from small_table group by table_schema;
select count(table_schema) howmany, table_schema from big_table group by table_schema;

-- schema_names is the parent table.
-- big_table is the child table.
-- (One row in the parent table can have many "children" in the child table.)
-- Changes to the parent table can ripple through to the child table.
-- For example, removing the value 'test' from schema_names.schema_name will
-- result in the removal of 20K or so rows from big_table.

delete from schema_names where schema_name = 'test';

select schema_name from schema_names order by schema_name;
select count(table_schema) howmany, table_schema from small_table group by table_schema;
select count(table_schema) howmany, table_schema from big_table group by table_schema;

-- Because we've turned off autocommit, we can still get back those deleted rows
-- if the DELETE was issued by mistake.
rollback;

select schema_name from schema_names order by schema_name;
select count(table_schema) howmany, table_schema from small_table group by table_schema;
select count(table_schema) howmany, table_schema from big_table group by table_schema;

-- All of the cross-checking between parent and child tables would be
-- deadly slow if there wasn't the requirement for the corresponding
-- columns to be indexed!

-- But we can get rid of the foreign key using a fast operation
-- that doesn't rebuild the table.
-- If we didn't specify a constraint name when setting up the foreign key, we would
-- have to find the auto-generated name such as 'big_table_ibfk_1' in the
-- output from 'show create table'.

-- For the small table, we'll drop the foreign key and the associated index.
-- Having an index on a small table is less critical.

\! echo "DROP FOREIGN KEY and INDEX from small_table:"
alter table small_table drop foreign key small_fk, drop index small_fk;

-- For the big table, we'll drop the foreign key and leave the associated index.
-- If we are still doing queries that reference the indexed column, the index is
-- very important to avoid a full table scan of the big table.
\! echo "DROP FOREIGN KEY from big_table:"
alter table big_table drop foreign key big_fk;

show create table small_table\G
show create table big_table\G

Running this code gives the following output, condensed for brevity:

Query OK, 4 rows affected (0.03 sec)

*************************** 1. row ***************************
       Table: schema_names
Create Table: CREATE TABLE `schema_names` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `schema_name` varchar(64) CHARACTER SET utf8 NOT NULL,
  PRIMARY KEY (`id`),
  KEY `i_schema` (`schema_name`)
) ENGINE=InnoDB AUTO_INCREMENT=8 DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

*************************** 1. row ***************************
       Table: small_table
Create Table: CREATE TABLE `small_table` (
  `TABLE_CATALOG` varchar(512) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `TABLE_SCHEMA` varchar(64) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `TABLE_NAME` varchar(64) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `COLUMN_NAME` varchar(64) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `ORDINAL_POSITION` bigint(21) unsigned NOT NULL DEFAULT '0',
  `COLUMN_DEFAULT` longtext CHARACTER SET utf8,
  `IS_NULLABLE` varchar(3) CHARACTER SET utf8 NOT NULL,
  `DATA_TYPE` varchar(64) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `CHARACTER_MAXIMUM_LENGTH` bigint(21) unsigned DEFAULT NULL,
  `CHARACTER_OCTET_LENGTH` bigint(21) unsigned DEFAULT NULL,
  `NUMERIC_PRECISION` bigint(21) unsigned DEFAULT NULL,
  `NUMERIC_SCALE` bigint(21) unsigned DEFAULT NULL,
  `DATETIME_PRECISION` bigint(21) unsigned DEFAULT NULL,
  `CHARACTER_SET_NAME` varchar(32) CHARACTER SET utf8 DEFAULT NULL,
  `COLLATION_NAME` varchar(32) CHARACTER SET utf8 DEFAULT NULL,
  `COLUMN_TYPE` longtext CHARACTER SET utf8 NOT NULL,
  `COLUMN_KEY` varchar(3) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `EXTRA` varchar(30) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `PRIVILEGES` varchar(80) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `COLUMN_COMMENT` varchar(1024) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1679 DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

*************************** 1. row ***************************
       Table: big_table
Create Table: CREATE TABLE `big_table` (
  `TABLE_CATALOG` varchar(512) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `TABLE_SCHEMA` varchar(64) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `TABLE_NAME` varchar(64) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `COLUMN_NAME` varchar(64) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `ORDINAL_POSITION` bigint(21) unsigned NOT NULL DEFAULT '0',
  `COLUMN_DEFAULT` longtext CHARACTER SET utf8,
  `IS_NULLABLE` varchar(3) CHARACTER SET utf8 NOT NULL,
  `DATA_TYPE` varchar(64) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `CHARACTER_MAXIMUM_LENGTH` bigint(21) unsigned DEFAULT NULL,
  `CHARACTER_OCTET_LENGTH` bigint(21) unsigned DEFAULT NULL,
  `NUMERIC_PRECISION` bigint(21) unsigned DEFAULT NULL,
  `NUMERIC_SCALE` bigint(21) unsigned DEFAULT NULL,
  `DATETIME_PRECISION` bigint(21) unsigned DEFAULT NULL,
  `CHARACTER_SET_NAME` varchar(32) CHARACTER SET utf8 DEFAULT NULL,
  `COLLATION_NAME` varchar(32) CHARACTER SET utf8 DEFAULT NULL,
  `COLUMN_TYPE` longtext CHARACTER SET utf8 NOT NULL,
  `COLUMN_KEY` varchar(3) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `EXTRA` varchar(30) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `PRIVILEGES` varchar(80) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `COLUMN_COMMENT` varchar(1024) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`id`),
  KEY `big_fk` (`TABLE_SCHEMA`)
) ENGINE=InnoDB AUTO_INCREMENT=1718273 DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

Query OK, 1678 rows affected (0.10 sec)
Query OK, 1718272 rows affected (1 min 14.54 sec)

*************************** 1. row ***************************
       Table: small_table
Create Table: CREATE TABLE `small_table` (
  `TABLE_CATALOG` varchar(512) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `TABLE_SCHEMA` varchar(64) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `TABLE_NAME` varchar(64) CHARACTER SET utf8 NOT NULL DEFAULT '',
  ...
  `COLUMN_COMMENT` varchar(1024) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`id`),
  KEY `small_fk` (`TABLE_SCHEMA`),
  CONSTRAINT `small_fk` FOREIGN KEY (`TABLE_SCHEMA`)
  REFERENCES `schema_names` (`schema_name`)
  ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=1679 DEFAULT CHARSET=latin1
1 row in set (0.12 sec)

*************************** 1. row ***************************
       Table: big_table
Create Table: CREATE TABLE `big_table` (
  `TABLE_CATALOG` varchar(512) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `TABLE_SCHEMA` varchar(64) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `TABLE_NAME` varchar(64) CHARACTER SET utf8 NOT NULL DEFAULT '',
  ...
  `COLUMN_COMMENT` varchar(1024) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`id`),
  KEY `big_fk` (`TABLE_SCHEMA`),
  CONSTRAINT `big_fk` FOREIGN KEY (`TABLE_SCHEMA`)
  REFERENCES `schema_names` (`schema_name`)
  ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=1718273 DEFAULT CHARSET=latin1
1 row in set (0.01 sec)

+--------------------+
| schema_name        |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| test               |
+--------------------+
4 rows in set (0.00 sec)

+---------+--------------------+
| howmany | table_schema       |
+---------+--------------------+
|     563 | information_schema |
|     286 | mysql              |
|     786 | performance_schema |
|      43 | test               |
+---------+--------------------+
4 rows in set (0.01 sec)

+---------+--------------------+
| howmany | table_schema       |
+---------+--------------------+
|  576512 | information_schema |
|  292864 | mysql              |
|  804864 | performance_schema |
|   44032 | test               |
+---------+--------------------+
4 rows in set (2.10 sec)

Query OK, 1 row affected (1.52 sec)

+--------------------+
| schema_name        |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
+--------------------+
3 rows in set (0.00 sec)

+---------+--------------------+
| howmany | table_schema       |
+---------+--------------------+
|     563 | information_schema |
|     286 | mysql              |
|     786 | performance_schema |
+---------+--------------------+
3 rows in set (0.00 sec)

+---------+--------------------+
| howmany | table_schema       |
+---------+--------------------+
|  576512 | information_schema |
|  292864 | mysql              |
|  804864 | performance_schema |
+---------+--------------------+
3 rows in set (1.74 sec)

Query OK, 0 rows affected (0.60 sec)

+--------------------+
| schema_name        |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| test               |
+--------------------+
4 rows in set (0.00 sec)

+---------+--------------------+
| howmany | table_schema       |
+---------+--------------------+
|     563 | information_schema |
|     286 | mysql              |
|     786 | performance_schema |
|      43 | test               |
+---------+--------------------+
4 rows in set (0.01 sec)

+---------+--------------------+
| howmany | table_schema       |
+---------+--------------------+
|  576512 | information_schema |
|  292864 | mysql              |
|  804864 | performance_schema |
|   44032 | test               |
+---------+--------------------+
4 rows in set (1.59 sec)

DROP FOREIGN KEY and INDEX from small_table:
Query OK, 0 rows affected (0.02 sec)

DROP FOREIGN KEY from big_table:
Query OK, 0 rows affected (0.02 sec)

*************************** 1. row ***************************
       Table: small_table
Create Table: CREATE TABLE `small_table` (
  `TABLE_CATALOG` varchar(512) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `TABLE_SCHEMA` varchar(64) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `TABLE_NAME` varchar(64) CHARACTER SET utf8 NOT NULL DEFAULT '',
  ...
  `COLUMN_COMMENT` varchar(1024) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1679 DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

*************************** 1. row ***************************
       Table: big_table
Create Table: CREATE TABLE `big_table` (
  `TABLE_CATALOG` varchar(512) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `TABLE_SCHEMA` varchar(64) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `TABLE_NAME` varchar(64) CHARACTER SET utf8 NOT NULL DEFAULT '',
  ...
  `COLUMN_COMMENT` varchar(1024) CHARACTER SET utf8 NOT NULL DEFAULT '',
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`id`),
  KEY `big_fk` (`TABLE_SCHEMA`)
) ENGINE=InnoDB AUTO_INCREMENT=1718273 DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

Example 6. Changing Auto-Increment Value

Here is an illustration of increasing the auto-increment lower limit for a table column, demonstrating how this operation now avoids a table rebuild, plus some other fun facts about InnoDB auto-increment columns.

/*
If this script is run after foreign_key.sql, the schema_names table is
already set up. But to allow this script to run multiple times without
running into duplicate ID errors, we set up the schema_names table
all over again.
*/

\! clear

\! echo "=== Adjusting the Auto-Increment Limit for a Table ==="
\! echo

drop table if exists schema_names;
create table schema_names (id int unsigned not null primary key auto_increment,
  schema_name varchar(64) character set utf8 not null,
  index i_schema (schema_name))
  as select distinct table_schema schema_name from small_table;

\! echo "Initial state of schema_names table."
\! echo "AUTO_INCREMENT is included in SHOW CREATE TABLE output."
\! echo "Note how MySQL reserved a block of IDs,"
\! echo "but only needed 4 of them in this transaction,"
\! echo "so the next inserted values would get IDs 8 and 9."
show create table schema_names\G
select * from schema_names order by id;

\! echo "Inserting even a tiny amount of data can produce gaps in the ID sequence."
insert into schema_names (schema_name) values ('eight'), ('nine');

set old_alter_table=0;
\! echo "Bumping auto-increment lower limit to 20 (fast mechanism):"
alter table schema_names auto_increment=20;

\! echo "Inserting 2 rows that should get IDs 20 and 21:"
insert into schema_names (schema_name) values ('foo'), ('bar');
commit;

set old_alter_table=1;
\! echo "Bumping auto-increment lower limit to 30 (slow mechanism):"
alter table schema_names auto_increment=30;

\! echo "Inserting 2 rows that should get IDs 30 and 31:"
insert into schema_names (schema_name) values ('bletch'),('baz');
commit;

select * from schema_names order by id;

set old_alter_table=0;

\! echo "Final state of schema_names table."
\! echo "AUTO_INCREMENT value shows the next inserted row would get ID=32."
show create table schema_names\G

Running this code gives the following output, condensed for brevity:

=== Adjusting the Auto-Increment Limit for a Table ===

Query OK, 0 rows affected (0.01 sec)

Query OK, 4 rows affected (0.02 sec)
Records: 4  Duplicates: 0  Warnings: 0

Initial state of schema_names table.
AUTO_INCREMENT is included in SHOW CREATE TABLE output.
Note how MySQL reserved a block of IDs,
but only needed 4 of them in this transaction,
so the next inserted values would get IDs 8 and 9.
*************************** 1. row ***************************
       Table: schema_names
Create Table: CREATE TABLE `schema_names` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `schema_name` varchar(64) CHARACTER SET utf8 NOT NULL,
  PRIMARY KEY (`id`),
  KEY `i_schema` (`schema_name`)
) ENGINE=InnoDB AUTO_INCREMENT=8 DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

+----+--------------------+
| id | schema_name        |
+----+--------------------+
|  1 | information_schema |
|  2 | mysql              |
|  3 | performance_schema |
|  4 | test               |
+----+--------------------+
4 rows in set (0.00 sec)

Inserting even a tiny amount of data can produce gaps in the ID sequence.
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

Bumping auto-increment lower limit to 20 (fast mechanism):
Query OK, 0 rows affected (0.01 sec)

Inserting 2 rows that should get IDs 20 and 21:
Query OK, 2 rows affected (0.00 sec)

Bumping auto-increment lower limit to 30 (slow mechanism):
Query OK, 8 rows affected (0.02 sec)

Inserting 2 rows that should get IDs 30 and 31:
Query OK, 2 rows affected (0.00 sec)

Query OK, 0 rows affected (0.01 sec)

+----+--------------------+
| id | schema_name        |
+----+--------------------+
|  1 | information_schema |
|  2 | mysql              |
|  3 | performance_schema |
|  4 | test               |
|  8 | eight              |
|  9 | nine               |
| 20 | foo                |
| 21 | bar                |
| 30 | bletch             |
| 31 | baz                |
+----+--------------------+
10 rows in set (0.00 sec)

Final state of schema_names table.
AUTO_INCREMENT value shows the next inserted row would get ID=32.
*************************** 1. row ***************************
       Table: schema_names
Create Table: CREATE TABLE `schema_names` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `schema_name` varchar(64) CHARACTER SET utf8 NOT NULL,
  PRIMARY KEY (`id`),
  KEY `i_schema` (`schema_name`)
) ENGINE=InnoDB AUTO_INCREMENT=32 DEFAULT CHARSET=latin1
1 row in set (0.00 sec)


MySQL Conference 2012 Day 0

April 11th, 2012
Wow, what a lot has changed since the last MySQL conference I blogged about in 2007. MySQL has been acquired twice, once as MySQL by Sun and the second time around bundled with Sun when Oracle bought Sun. The conference is no longer organized by O’Reilly but by Percona. And the MySQL database itself has [...]

Improvements for many-table joins in MySQL 5.6

April 11th, 2012
A lot has happened in MySQL 5.6 for queries joining many tables. For the most common use cases we have drastically reduced the cost of finding the execution plan. We have also improved the heuristics and removed bugs so that the final plan is often better than it used to be. Read on if you are one of those people who do 15 way joins!

Finding a query execution plan
First some background. You can skip this part if you know how MySQL picks the table join order in 5.5.

When presented with a query, MySQL will try to find the best order to join tables by employing a greedy search algorithm. The outcome is what we call a query execution plan, or QEP. When you join just a few tables, there's no problem calculating the cost of all join order combinations and then picking the best plan. However, since there are (#tables)! possible combinations, the cost of calculating them all soon becomes too high: for five tables there are 120 combinations, which is no problem to compute. For 10 tables there are 3.6 million combinations, and for 15 tables there are 1307 billion. You get the picture. For this reason, MySQL makes a trade-off: it uses heuristics to explore only promising plans. This is supposed to significantly reduce the number of plans MySQL needs to calculate, but at the same time you risk not finding the best one.
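The combination counts quoted above are plain factorial growth. As a minimal sketch (written in Python for illustration, not taken from the MySQL optimizer; the function name `join_orders` is made up here), the figures can be reproduced like this:

```python
import math


def join_orders(num_tables: int) -> int:
    """Number of possible join orders for num_tables tables, i.e. n!."""
    return math.factorial(num_tables)


# Print the count for the table counts mentioned in the text.
for n in (5, 10, 15):
    print(f"{n:2d} tables -> {join_orders(n):,} possible join orders")
```

This only counts permutations. It says nothing about how the greedy search prunes them, which is the part the heuristics are responsible for.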


InnoDB 2012 Spring Labs Release

April 10th, 2012

The InnoDB team is pleased to announce the 2012 Spring labs release, with several much anticipated new features and performance enhancements. Please download mysql-5.6-labs-april-2012 from MySQL Labs and give it a try. Do not forget to provide your feedback.

The 2012 Spring labs release on MySQL Labs consists of the following InnoDB new features, which are not in the newly released MySQL 5.6.5 DMR yet:

  • Online DDL: some of the DDLs are now truly online, including ADD INDEX, SET DEFAULT, and DROP FOREIGN KEY.
  • Memcached plugin: with additional features, such as SASL support.
  • Transportable tablespace: allows users to export data files and import them into another MySQL instance.
  • Persistent statistics ON/OFF switch: the ability to control persistent statistics at the table level.
  • Option for specifying locations of InnoDB tables: allows user to choose the location of specific tables.

This labs release also includes several performance and scalability improvements, especially on modern CPUs:

  • Reduced false sharing
  • Configurable fast mutexes
  • Improved adaptive flushing
  • Improved neighbor flushing

With those improvements, (InnoDB) read-only performance reaches a new high. Please see Mikael’s blog for some of the improvements. You will see the benchmark results on DimitriK’s blog. And the InnoDB team will also continue publishing technical details in the coming days on this site.

We intend to bring these new features and improvements into future development milestone releases and GA releases. Thanks for being interested in InnoDB!



Benchmarking MySQL Replication with Multi-Threaded Slaves

April 10th, 2012

The objective of this benchmark is to measure the performance improvement achieved when enabling the Multi-Threaded Slave enhancement delivered as part of MySQL 5.6.

As the results demonstrate, Multi-Threaded Slaves deliver 5x higher replication performance based on a configuration with 10 databases/schemas. For real-world deployments, higher replication performance directly translates to:

· Improved consistency of reads from slaves (i.e. reduced risk of reading "stale" data)

· Reduced risk of data loss should the master fail before replicating all events in its binary log (binlog)

The multi-threaded slave splits processing between worker threads based on schema, allowing updates to be applied in parallel, rather than sequentially. This delivers benefits to those workloads that isolate application data using databases - e.g. multi-tenant systems deployed in cloud environments.

Multi-Threaded Slaves are just one of many enhancements to replication previewed as part of the MySQL 5.6 Development Release, which include:

· Global Transaction Identifiers coupled with MySQL utilities for automatic failover / switchover and slave promotion

· Crash Safe Slaves and Binlog

· Optimized Row Based Replication

· Replication Event Checksums

· Time Delayed Replication

These and many more are discussed in the “MySQL 5.6 Replication: Enabling the Next Generation of Web & Cloud Services” Developer Zone article.

Back to the benchmark - details are as follows.


Environment
The test environment consisted of two Linux servers:

· one running the replication master

· one running the replication slave.

Only the slave was involved in the actual measurements, and was based on the following configuration:

- Hardware: Oracle Sun Fire X4170 M2 Server

- CPU: 2 sockets, 6 cores with hyper-threading, 2930 MHz.

- OS: 64-bit Oracle Enterprise Linux 6.1
- Memory: 48 GB

Test Procedure
Initial Setup:

Two MySQL servers were started on two different hosts, configured as replication master and slave.

10 sysbench schemas were created, each with a single table:

CREATE TABLE `sbtest` (
   `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
   `k` int(10) unsigned NOT NULL DEFAULT '0',
   `c` char(120) NOT NULL DEFAULT '',
   `pad` char(60) NOT NULL DEFAULT '',
   PRIMARY KEY (`id`),
   KEY `k` (`k`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

10,000 rows were inserted in each of the 10 tables, for a total of 100,000 rows. When the inserts had replicated to the slave, the slave threads were stopped. The slave data directory was copied to a backup location and the slave threads' position in the master binlog was noted.

10 sysbench clients, each configured with 10 threads, were spawned at the same time to generate a random schema load against each of the 10 schemas on the master. Each sysbench client executed 10,000 "update key" statements:

UPDATE sbtest set k=k+1 WHERE id = <random row>

In total, this generated 100,000 update statements to later replicate during the test itself.

Test Methodology:
The number of slave workers to test with was configured using:

SET GLOBAL slave_parallel_workers=<workers>

Then the slave IO thread was started and the test waited for all the update queries to be copied over to the relay log on the slave.

The benchmark clock was started and then the slave SQL thread was started. The test waited for the slave SQL thread to finish executing the 100k update queries, doing "select master_pos_wait()". When master_pos_wait() returned, the benchmark clock was stopped and the duration calculated.

The calculated duration from the benchmark clock should be close to the time it took for the SQL thread to execute the 100,000 update queries. The 100k queries divided by this duration gave the benchmark metric, reported as Queries Per Second (QPS).

Test Reset:

The test-reset cycle was implemented as follows:

· the slave was stopped

· the slave data directory replaced with the previous backup

· the slave restarted with the slave threads replication pointer repositioned to the point before the update queries in the binlog.

The test could then be repeated with an identical set of queries but a different number of slave worker threads, enabling a fair comparison.

The Test-Reset cycle was repeated 3 times for each worker count from 0 to 24, and the QPS metric was calculated and averaged for each worker count.
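The metric described in the test methodology (queries divided by elapsed time, averaged over the repeated runs) can be sketched as follows. This is an illustrative Python sketch, not the benchmark harness itself; the function names and the example durations are assumptions made up for this example and are not measurements from the test.

```python
def queries_per_second(num_queries: int, duration_seconds: float) -> float:
    """QPS for one run: queries applied divided by the benchmark-clock duration."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return num_queries / duration_seconds


def average_qps(num_queries: int, durations: list[float]) -> float:
    """Average the per-run QPS over the repeated test-reset cycles."""
    rates = [queries_per_second(num_queries, d) for d in durations]
    return sum(rates) / len(rates)


# Hypothetical durations (seconds) for three runs at one worker count.
example_durations = [25.0, 20.0, 25.0]
print(f"average QPS: {average_qps(100_000, example_durations):.1f}")
```

Averaging the per-run rates, rather than dividing total queries by total time, is one possible reading of "calculated and averaged"; the post does not say which was used.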

MySQL Configuration
The relevant configuration settings used for MySQL are as follows:

binlog-format=STATEMENT
relay-log-info-repository=TABLE
master-info-repository=TABLE

As described in the test procedure, the slave_parallel_workers setting was modified as part of the test logic. The consequence of changing this setting is:

0 worker threads:
   - current (i.e. single threaded) sequential mode
   - 1 x IO thread and 1 x SQL thread
   - SQL thread both reads and executes the events

1 worker thread:
   - sequential mode
   - 1 x IO thread, 1 x Coordinator SQL thread and 1 x Worker thread
   - coordinator reads the event and hands it to the worker who executes

2+ worker threads:
   - parallel execution
   - 1 x IO thread, 1 x Coordinator SQL thread and 2+ Worker threads
   - coordinator reads events and hands them to the workers who execute them

Results
Figure 1 below shows that Multi-Threaded Slaves deliver ~5x higher replication performance when configured with 10 worker threads, with the load evenly distributed across our 10 x schemas. This result is compared to the current replication implementation which is based on a single SQL thread only (i.e. zero worker threads).

Figure 1: 5x Higher Performance with Multi-Threaded Slaves

The following figure shows more detailed results, with QPS sampled and reported as the worker threads are incremented.

The raw numbers behind this graph are reported in the Appendix section of this post.



Figure 2: Detailed Results

As the results above show, the configuration does not scale noticeably from 5 to 9 worker threads. When configured with 10 worker threads, however, throughput increases significantly. The conclusion therefore is that it is desirable to configure the same number of worker threads as schemas.

Other conclusions from the results:

· Running with 1 worker compared to zero workers just introduces overhead without the benefit of parallel execution.

· As expected, having more workers than schemas adds no visible benefit.

Aside from what is shown in the results above, testing also demonstrated that the following settings had a very positive effect on slave performance:


relay-log-info-repository=TABLE
master-info-repository=TABLE

For 5+ workers, it was up to 2.3 times as fast to run with TABLE compared to FILE.

Conclusion

As the results demonstrate, Multi-Threaded Slaves deliver significant performance increases to MySQL replication when handling multiple schemas.

This, and the other replication enhancements introduced in MySQL 5.6 are fully available for you to download and evaluate now from the MySQL Developer site (select Development Release tab).

You can learn more about MySQL 5.6 from the documentation.

Please don’t hesitate to comment on this or other replication blogs with feedback and questions.

Appendix – Detailed Results



What is the proper size of InnoDB logs?

April 10th, 2012

In one of my previous posts, “How to resize InnoDB logs?”, I gave advice on how to safely change the size of transaction logs. This time, I will explain why doing it may become necessary.

A brief introduction to InnoDB transaction logs

The transaction logs handle REDO logging, which means they keep a record of all recent modifications performed by queries in any InnoDB table. But they are a lot more than just an archive of transactions. The logs play an important part in the process of handling writes. When a transaction commits, InnoDB synchronously makes a note of any changes in the log, while updating the actual table files happens asynchronously and may take place much later. Each log entry is assigned a Log Sequence Number (LSN), an incremental value that always uniquely identifies a change.

InnoDB writes changes to the log as a transaction commits, while updating data in the table file happens asynchronously and may take place much later

 

Such a design serves the database in two ways.

First, it optimizes MySQL performance. It enables InnoDB to use light sequential I/O to store the modifications on disk as transactions commit and delay expensive random I/O required for data and index updates for when it is more convenient. Buffered updates may be then rearranged or merged in order to further optimize disk access.

And second, after a crash InnoDB can use the log contents to perform recovery. As updates to the data files are done asynchronously, in the event of an unclean shutdown, any modifications that existed only in the buffer pool would be lost. The ability to replay changes from the log deals with the problem.

InnoDB logs are circular, which means that they form a loop. When the database reaches the end of the last file (commonly there are two files), it begins writing into the first one again.


They are also fixed in size, so in order to prevent them from filling up, InnoDB implements a background mechanism called checkpointing, which manages the process of synchronizing modified pages to disk (step (3) in the first illustration). As it runs, it marks exactly one log record as the checkpoint, which states that all modifications carrying an older (lower) LSN value have already been safely stored in the data files. This means the log contents prior to that LSN are no longer needed for recovery or any other purpose. As the checkpoint progresses, continuously chasing the most recent LSN, it keeps freeing the log space behind it, which can then be reused by future writes.
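The relationship between the most recent LSN, the checkpoint LSN and the reusable log space can be sketched with a little arithmetic. This is a deliberately simplified model in Python, not InnoDB's actual accounting (which also reserves headers and safety margins); the function names and the sample numbers are assumptions made up for illustration.

```python
MB = 1024 * 1024


def checkpoint_age(current_lsn: int, checkpoint_lsn: int) -> int:
    """Bytes of log written since the checkpoint, still needed for recovery."""
    if checkpoint_lsn > current_lsn:
        raise ValueError("checkpoint cannot be ahead of the current LSN")
    return current_lsn - checkpoint_lsn


def reusable_log_space(total_log_bytes: int, current_lsn: int, checkpoint_lsn: int) -> int:
    """Bytes that can be written before the log writer catches up with the checkpoint."""
    return total_log_bytes - checkpoint_age(current_lsn, checkpoint_lsn)


# Hypothetical example: two 5MB log files, checkpoint lagging 3MB behind the writer.
total = 2 * 5 * MB
free = reusable_log_space(total, current_lsn=103 * MB, checkpoint_lsn=100 * MB)
print(f"{free // MB} MB reusable")
```

In this model, the closer the checkpoint age gets to the total log size, the less room is left for new writes, which is the situation the sizing advice in this post is meant to avoid.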

The challenges

What are the main challenges of choosing the right size for InnoDB logs?

Making them small enough to avoid an unnecessarily long distance between the checkpoint LSN and the most recent LSN, which would lead to needlessly long recovery times.

With newer MySQL versions this should not be a significant factor anymore as the recovery process has been greatly optimized in MySQL 5.1.46 and in 5.5.

In older versions, however, the process used to be slow and would often need hours to replay transactions from even medium-sized logs such as 512MB or 1GB, so it is a risk that needs to be evaluated.

Making them large enough to fit writes over a sufficiently long period of time, so that InnoDB has some room to maneuver in deciding when to flush some of the buffer pool contents.

When the transaction logs are set too small for a given workload, MySQL performance may suffer. During busier periods incoming writes may start pushing the LSN faster than the checkpoint can progress, and the log space will start filling up. After a threshold is exceeded, checkpointing becomes very intense as InnoDB spots the upcoming problem that it may soon run short of free log space. When this doesn’t help, the engine may need to start blocking queries while it flushes the buffer pool contents and advances the checkpoint.


What size works?

When installing a new database, the transaction logs often need to be configured without much knowledge about the future workload. One can simply try choosing a reasonable size such as 64M for a fairly average database. There is no reason to go below this value. Of course, the more writes this new database is expected to take, the larger the logs may need to be. In any case it is extremely important to always change the default size, which is just two files of 5MB. Such a configuration is not sufficient for any serious purpose, so never allow it, even on a development server.

With a database already running in testing or in production, the necessary size can be calculated based on the rate at which data is written into the transaction logs, as this information is available from MySQL. A good rule says that the logs should be able to hold at least one hour's worth of changes. In order to come up with a number, simply check how quickly the Log Sequence Number progresses. Be sure not to check this during quiet periods, as the size has to be tuned for peak usage.

Calculating the size

The information can be found in InnoDB status output, which you can obtain with SHOW ENGINE INNODB STATUS:

mysql> \P grep 'Log sequence number'
PAGER set to 'grep 'Log sequence number''
mysql> SHOW ENGINE INNODB STATUS\G
Log sequence number 21060750647056
1 row in set (0.06 sec)

Using the command line pager helps limit the output to only the relevant information.

In reality, the LSN value represents a byte offset from the start of the transaction log, counted since the database was initialized. Seeing how it changes essentially means seeing how much data was written into the log over a period of time. So to figure out the amount of changes happening to a database:

  • check the most recent LSN using the method shown above
  • wait some time
  • check the LSN again
  • subtract the two values

Here is how to do it in practice:

mysql> \P grep 'Log sequence number'
PAGER set to 'grep 'Log sequence number''
mysql> SHOW ENGINE INNODB STATUS\G SELECT SLEEP(60); SHOW ENGINE INNODB STATUS\G
Log sequence number 21057170602213
1 row in set (0.06 sec)

1 row in set (1 min 0.00 sec)

Log sequence number 21057190468976
1 row in set (0.05 sec)

The two LSN values are 21057170602213 and 21057190468976. Let’s calculate the difference:

mysql> select ROUND((21057190468976 - 21057170602213)/ 1024 / 1024) as MB;
+------+
| MB   |
+------+
|   19 |
+------+
1 row in set (0.00 sec)

19 megabytes were written into the log in one minute. Using this information, how large should the logs be to hold at least one hour's worth of writes? 19MB * 60 minutes / 2 files = 570MB. The division by two comes from the fact that InnoDB uses two log files by default, while innodb_log_file_size sets the size of each individual file. In practice this means setting innodb_log_file_size to at least 512MB, the nearest conventional size, which holds slightly less than the full hour.
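The same arithmetic can be written as a small helper. This is only a sketch of the calculation described above: the function name and its parameters are made up for illustration, and the defaults assume two log files and a one-hour target.

```python
MB = 1024 * 1024

def log_file_size_mb(lsn_start, lsn_end, interval_secs,
                     files_in_group=2, retention_secs=3600):
    """Per-file log size (in MB) needed to hold retention_secs of writes."""
    if interval_secs <= 0 or files_in_group <= 0:
        raise ValueError("interval and file count must be positive")
    bytes_written = lsn_end - lsn_start
    bytes_per_sec = bytes_written / interval_secs
    total_bytes = bytes_per_sec * retention_secs
    return total_bytes / files_in_group / MB

# The two LSN values sampled 60 seconds apart in the example above.
size = log_file_size_mb(21057170602213, 21057190468976, 60)
print(f"suggested innodb_log_file_size: about {size:.0f}MB per file")
```

The result comes out a little under the 570MB figure because the helper does not round the one-minute delta to 19MB before multiplying.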

In MySQL 5.1 or newer, a query against INFORMATION_SCHEMA.GLOBAL_STATUS can be used instead of looking at the InnoDB status output.

SELECT @a1 := variable_value AS a1
FROM information_schema.global_status
WHERE variable_name = 'innodb_os_log_written'
UNION ALL
SELECT Sleep(60)
UNION ALL
SELECT @a2 := variable_value AS a2
FROM information_schema.global_status
WHERE variable_name = 'innodb_os_log_written';

SELECT ROUND((@a2-@a1) * 60 / 1024 / 1024 / @@innodb_log_files_in_group) as MB;

OurSQL Episode 86: Speed Demon

April 9th, 2012

This week we talk with Martin Farach-Colton of Tokutek about what's new with TokuDB.

News/Events/Feedback
Oracle announced the MySQL Connect Conference, to be held in San Francisco on Saturday, September 29th and Sunday, September 30th. The call for papers and registration will open on April 16th.

There is a free webcast on efficiently deploying new MySQL applications on Windows.
