Archive for the ‘Backups’ Category

Kontrollkit – new version available for download

Апрель 27th, 2010
Just a quick notice to let everyone know that there is a new version of Kontrollkit available. There are some required bug fixes to the formerly new python backup script and some Solaris compatible changes to the various my.cnf files. You can download the new version here: http://kontrollsoft.com/software-downloads, or here: http://code.google.com/p/kontrollkit/
PlanetMySQL Voting: Vote UP / Vote DOWN

How fast is FLUSH TABLES WITH READ LOCK?

Апрель 25th, 2010

A week or so ago at the MySQL conference, I visited one of the backup vendors in the Expo Hall. I started to chat with them about their MySQL backup product. One of the representatives told me that their backup product uses FLUSH TABLES WITH READ LOCK, which he admitted takes a global lock on the whole database server. However, he proudly told me that it only takes a lock for "a couple of milliseconds." This is a harmful misconception that many backup vendors seem to hold dear.

The truth is, this command can take a lock for an indeterminate amount of time. It might complete in milliseconds on a test system in a laboratory, but I have seen it take an extremely long time on production systems, measured in many minutes, or potentially even hours. And during this time, the server will get completely blocked (not just read-only!) To understand why, let's look at what this command actually does. There are several important parts of processing involved in the command.

Requesting the lock

The FLUSH TABLES WITH READ LOCK command immediately requests the global read lock. As soon as this happens, even before the lock is granted to it, all other processes attempting to modify anything in the system are locked out. In theory, this might not seem so bad because after all, the command acquires only a read lock. Other commands that need only a read lock can coexist with this. However, in practice, most tables are both read and written. The first write query to each table will immediately block against the requested global read lock, and subsequent read queries will block against the write query's requested table lock, so the real effect is that the table is exclusively locked, and all new requests into the system are blocked. Even read queries!

Waiting for the lock

Before the FLUSH TABLES WITH READ LOCK command can successfully acquire the lock, anything else that currently holds the lock must finish what it's doing. That means that every currently running query, including SELECT queries, must finish. So if there is a long-running query on the system, or an open transaction or another process that holds a table lock, the FLUSH TABLES WITH READ LOCK command itself will block until the other queries finish and all locks are released. This can take a very long time. It is not uncommon for me to log on to a customer's system and see a query that has been running for minutes or hours. If such a query were to begin running just before the FLUSH TABLES WITH READ LOCK command is issued, the results could be very bad.

Here's one example of what the system can look like while this process is ongoing:

SQL:
  1. mysql> SHOW processlist;
  2. +----+------+-----------+------+------------+------+-------------------+----------------------------------------------------------------------+
  3. | Id | User | Host      | db   | Command    | Time | State             | Info                                                                 |
  4. +----+------+-----------+------+------------+------+-------------------+----------------------------------------------------------------------+
  5. 4 | root | localhost | test | Query      |   80 | Sending DATA      | SELECT count(*) FROM t t1 JOIN t t2 JOIN t t3 JOIN t t4 WHERE t1.b=0 |
  6. 5 | root | localhost | test | Query      |   62 | Flushing TABLES   | FLUSH TABLES WITH READ LOCK                                          |
  7. 6 | root | localhost | test | FIELD List |   35 | Waiting FOR TABLE |                                                                      |
  8. 7 | root | localhost | test | Query      |    0 | NULL              | SHOW processlist                                                     |
  9. +----+------+-----------+------+------------+------+-------------------+----------------------------------------------------------------------+
  10. 4 rows IN SET (0.00 sec)

Notice that connection 6 can't even log in because it was a MySQL command-line client that wasn't started with -A, and it's trying to get a list of tables and columns in the current database for tab-completion. Note also that "Flushing tables" is a misnomer -- connection 5 is not flushing tables yet. It's waiting to get the lock.

Flushing tables

After the FLUSH TABLES WITH READ LOCK command finally acquires the lock, it must begin flushing data. This does not apply to all storage engines. However, MyISAM does not attempt to flush its own data to the disk during normal processing. It relies on the operating system to flush the data blocks to disk when it decides to. As a result, a system that has a lot of MyISAM data might have a lot of dirty blocks in the operating system buffer cache. This can take a long time to flush. During that time, the entire system is still locked. After all the data is finished, the FLUSH TABLES WITH READ LOCK command completes and sends its response to the client that issued it.

Holding the lock

The final part of this command is the duration during which the lock is held. The lock is released with UNLOCK TABLES or a number of other commands. Most backup systems that use FLUSH TABLES WITH READ LOCK are performing a relatively short operation inside of the lock, such as initiating a filesystem snapshot. So in practice, this often ends up being the shortest portion of the operation.

Conclusion

A backup system that is designed for real production usage must not assume that FLUSH TABLES WITH READ LOCK will complete quickly. In some cases, it is unavoidable. This includes backing up a mixture of MyISAM and InnoDB data. But many installations do not mix their data this way, and should be able to configure a backup system to avoid this global lock. There is no reason to take a lock at all for backing up only InnoDB data. Completely lock-free backups are easy to take. Backup vendors should build this capability into their products.


Entry posted by Baron Schwartz | No comment

Add to: delicious | digg | reddit | netscape | Google Bookmarks


PlanetMySQL Voting: Vote UP / Vote DOWN

xtrabackup-1.2

Апрель 13th, 2010

Dear Community,

The notice is hereby given that next version 1.2 of XtraBackup software is released.

The list of changes in this version includes:
Changelog:

  • XtraBackup supports now XtraDB 10
  • tar4ibd supports variable page size and fast_checksum of XtraDB
  • Supported 32bit platform for Centos 5, Centos 4, Debian lenny and etch, Ubuntu Karmic, Jaunty, Intrepid, Hardy, FreeBSD

Fixeb bugs in this release:

The binary packages for as well as source code of the XtraBackup is available on http://www.percona.com/percona-builds/XtraBackup/XtraBackup-1.2/.

Debian and RPM are available in Percona repository.

Since now debian packages in Percona APT repository are signed by our private key.

To check authenticity of the packages you need to install the public key:

CODE:
  1. # wget http://www.percona.com/percona-builds/RPM-GPG-KEY-percona
  2.  
  3. # apt-key add RPM-GPG-KEY-percona

To verify source code, you have to install the key in trastedkeys.gpg:

CODE:
  1. # wget http://www.percona.com/percona-builds/RPM-GPG-KEY-percona
  2.  
  3. # gpg --no-default-keyring --keyring trustedkeys.gpg --import RPM-GPG-KEY-percona

The project lives on Launchpad : https://launchpad.net/percona-xtrabackup and you can report bug to Launchpad bug system:
https://launchpad.net/percona-xtrabackup/+filebug. The documentation is available on our Wiki.

For general questions use our Pecona-discussions group, and for development question Percona-dev group.

For support, commercial and sponsorship inquiries contact Percona.


Entry posted by Aleksandr Kuzminsky | No comment

Add to: delicious | digg | reddit | netscape | Google Bookmarks


PlanetMySQL Voting: Vote UP / Vote DOWN

Kontrollkit – new backup script is partition space aware

Март 31st, 2010
I’ve been wanting to write a backup script for a while now that does the following: reads the partition information for the directory that you are backing up into and computes the used/available/percentage-available space. Then it reads the total data size from the MySQL tables and ensures that you have enough space on disk (for [...]
PlanetMySQL Voting: Vote UP / Vote DOWN

Statistics of InnoDB tables and indexes available in xtrabackup

Сентябрь 14th, 2009

If you ever wondered how big is that or another index in InnoDB ... you had to calculate it yourself by multiplying size of row (which I should add is harder in the case of a VARCHAR - since you need to estimate average length) on count of records. And it still would be quite inaccurate as secondary indexes tend to take more space. So we added more detailed index statistics into our xtrabackup utility. The thanks for this feature goes to a well known Social Network who sponsored the development.

We chose to put this into xtrabackup for a couple of reasons - the first is that running statistics on your backup database does not need to hurt production servers, and the second reason is that running statistic on a stopped database is more accurate than with online (although online is also supported, but you may have inexact results).

Let's see how it works. I have one table with size 13Gb what was filled during about 2.5 years.
The table is:

CODE:
  1. CREATE TABLE `link_out104` (
  2.   `domain_id` int(10) unsigned NOT NULL,
  3.   `link_id` int(10) unsigned NOT NULL auto_increment,
  4.   `url_from` varchar(255) NOT NULL,
  5.   `url_to` varchar(255) NOT NULL,
  6.   `anchor` varchar(255) NOT NULL,
  7.   `from_site_id` int(10) unsigned NOT NULL,
  8.   `from_forum_id` int(10) unsigned NOT NULL,
  9.   `from_author_id` int(10) unsigned NOT NULL,
  10.   `from_message_id` bigint(20) unsigned NOT NULL,
  11.   `message_published` timestamp NOT NULL default CURRENT_TIMESTAMP,
  12.   `kind` enum('link','img') NOT NULL,
  13.   `url_title` varchar(255) NOT NULL,
  14.   `isexternal` tinyint(3) unsigned NOT NULL,
  15.   `revert_domain` varchar(255) NOT NULL,
  16.   `url_prefix` varchar(255) NOT NULL,
  17.   `from_domain_id` int(10) unsigned NOT NULL,
  18.   `ext` varchar(25) NOT NULL,
  19.   `linktype` enum('html','video','mp3','image','pdf','other') NOT NULL,
  20.   `message_day` date NOT NULL,
  21.   `mod_is` tinyint(3) unsigned NOT NULL default '0',
  22.   `is_adult` tinyint(3) unsigned NOT NULL default '0',
  23.   PRIMARY KEY  (`link_id`),
  24.   UNIQUE KEY `domain_id_2` (`domain_id`,`link_id`),
  25.   KEY `domain_id` (`domain_id`,`from_site_id`,`message_published`),
  26.   KEY `revert_domain` (`revert_domain`,`url_prefix`(80)),
  27.   KEY `from_site_id` (`from_site_id`,`message_published`),
  28.   KEY `site_message` (`from_site_id`,`message_day`,`isexternal`),
  29.   KEY `from_message_id` (`from_message_id`,`link_id`)
  30. ) ENGINE=InnoDB AUTO_INCREMENT=26141165 DEFAULT CHARSET=utf8;

And size of file is about 12.88 GB

-rw-r--r-- 1 root root 13832814592 Sep 10 14:41 link_out104.ibd

So to get statistics we run:

xtrabackup --stats --tables=art.link* --datadir=/mnt/data/mysql/

which will show something like this:

CODE:
  1. <INDEX STATISTICS>
  2.  
  3.   table: art/link_out104, index: PRIMARY, space id: 12, root page 3
  4.   estimated statistics in dictionary:
  5.     key vals: 25265338, leaf pages 497839, size pages 498304
  6.   real statistics:
  7.      level 2 pages: pages=1, data=5395 bytes, data/pages=32%
  8.      level 1 pages: pages=415, data=6471907 bytes, data/pages=95%
  9.         leaf pages: recs=25958413, pages=497839, data=7492026403 bytes, data/pages=91%
  10.  
  11.   table: art/link_out104, index: domain_id_2, space id: 12, root page 4
  12.   estimated statistics in dictionary:
  13.     key vals: 27755790, leaf pages 23125, size pages 26495
  14.   real statistics:
  15.      level 2 pages: pages=1, data=510 bytes, data/pages=3%
  16.      level 1 pages: pages=30, data=393125 bytes, data/pages=79%
  17.         leaf pages: recs=25958413, pages=23125, data=337459369 bytes, data/pages=89%
  18.  
  19.   table: art/link_out104, index: domain_id, space id: 12, root page 5
  20.   estimated statistics in dictionary:
  21.     key vals: 3006231, leaf pages 43255, size pages 49600
  22.   real statistics:
  23.      level 2 pages: pages=1, data=2850 bytes, data/pages=17%
  24.      level 1 pages: pages=114, data=1081375 bytes, data/pages=57%
  25.         leaf pages: recs=25953873, pages=43255, data=545031333 bytes, data/pages=76%
  26.  
  27.   table: art/link_out104, index: revert_domain, space id: 12, root page 6
  28.   estimated statistics in dictionary:
  29.     key vals: 1204830, leaf pages 133869, size pages 153984
  30.   real statistics:
  31.      level 3 pages: pages=1, data=373 bytes, data/pages=2%
  32.      level 2 pages: pages=6, data=58143 bytes, data/pages=59%
  33.      level 1 pages: pages=832, data=9146283 bytes, data/pages=67%
  34.         leaf pages: recs=25839414, pages=133869, data=1566961607 bytes, data/pages=71%
  35.  
  36.   table: art/link_out104, index: from_site_id, space id: 12, root page 7
  37.   estimated statistics in dictionary:
  38.     key vals: 330426, leaf pages 33889, size pages 38848
  39.   real statistics:
  40.      level 2 pages: pages=1, data=1764 bytes, data/pages=10%
  41.      level 1 pages: pages=84, data=711669 bytes, data/pages=51%
  42.         leaf pages: recs=25956416, pages=33889, data=441259072 bytes, data/pages=79%
  43.  
  44.   table: art/link_out104, index: site_message, space id: 12, root page 8
  45.   estimated statistics in dictionary:
  46.     key vals: 1399286, leaf pages 32260, size pages 36992
  47.   real statistics:
  48.      level 2 pages: pages=1, data=1680 bytes, data/pages=10%
  49.      level 1 pages: pages=80, data=677460 bytes, data/pages=51%
  50.         leaf pages: recs=25956043, pages=32260, data=441252731 bytes, data/pages=83%
  51.  
  52.   table: art/link_out104, index: from_message_id, space id: 12, root page 9
  53.   estimated statistics in dictionary:
  54.     key vals: 25964521, leaf pages 27979, size pages 28160
  55.   real statistics:
  56.      level 2 pages: pages=1, data=798 bytes, data/pages=4%
  57.      level 1 pages: pages=38, data=587559 bytes, data/pages=94%
  58.         leaf pages: recs=25958413, pages=27979, data=441293021 bytes, data/pages=96%

The output is intensive, let me highlight some points:

CODE:
  1. table: art/link_out104, index: PRIMARY, space id: 12, root page 3
  2.         leaf pages: recs=25958413, pages=497839, data=7492026403 bytes, data/pages=91%

It says that PRIMARY key (which is the table by itself, as InnoDB is clustering data by primary key) takes 497839 pages ( 16KB each) and size of data 7492026403 bytes or (6.98 GB). And density ( fitting data into pages) is quite good - 91%. But it was expected, as table is really mostly inserted in, updates and deletes are rare).

And let's take index domain_id

CODE:
  1. table: art/link_out104, index: domain_id, space id: 12, root page 5
  2.         leaf pages: recs=25953873, pages=43255, data=545031333 bytes, data/pages=76%

you can see the allocated pages (43255 pages or 708689920 bytes) are filled only by 76% ( data takes 545031333 bytes). And that means that 150MB are just waste of space. Which is really even worse for key revert_domain

leaf pages: recs=25839414, pages=133869, data=1566961607 bytes, data/pages=71%
.

For this key about 600MB is empty.

This needs a bit of explaining:
This does not have as good efficiency as the primary key, but a lot of this is to be expected. In a lot of cases we insert into the primary key in order which makes things very predictable, but the inserts into the secondary key index are random - which leads to a lot of page splits.

One helpful new feature to address this is in XtraDB/InnoDB plugin - fast index creation. With this feature, InnoDB creates indexes by sort, so page fill factor should be quite good.

To check that, there is xtrabackup --stats for index domain_id created for table in Barracuda format with Fast creation method:

CODE:
  1. table: art/link_out104, index: domain_id, space id: 15, root page 49160
  2.   estimated statistics in dictionary:
  3.     key vals: 5750565, leaf pages 34383, size pages 34496
  4.   real statistics:
  5.      level 2 pages: pages=1, data=1375 bytes, data/pages=8%
  6.      level 1 pages: pages=55, data=859575 bytes, data/pages=95%
  7.         leaf pages: recs=25958413, pages=34383, data=545126673 bytes, data/pages=96%

As you see this time it takes 34383 pages (compare to 43255 in previous statistics).

Though it would be interesting to see how it will grow with further inserts, and I also suspect random INSERTS into so dense space going to be slower than in previous case.

The --stats is not in xtrabackup release yet, only in source code repository, but should be released quite soon.

And the last point of the post - if you are badly missing some features in MySQL, InnoDB, InnoDB-plugin, XtraDB, XtraBackup - you know whom ask for!


Entry posted by Vadim | No comment

Add to: delicious | digg | reddit | netscape | Google Bookmarks


PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL 5.1 and InnoDB Hot Backup Gotcha

Сентябрь 1st, 2009

Recently while we were building a slave with a newer version of MySQL 5.1 from an InnoDB Hot backup, the following error occurred when we ran mysql_upgrade:

mysql.time_zone                                    OK
mysql.time_zone_leap_second                        OK
mysql.time_zone_name                               OK
mysql.time_zone_transition                         OK
mysql.time_zone_transition_type                    OK
mysql.user                                         OK
Running 'mysql_fix_privilege_tables'...
ERROR 13 (HY000) at line 311: Can't get stat of './mysql/general_log.CSV' (Errcode: 2)
ERROR 13 (HY000) at line 316: Can't get stat of './mysql/slow_log.CSV' (Errcode: 2)
FATAL ERROR: Upgrade failed

The problem is that in MySQL 5.1, it is possible to log the slow query log and general log to tables in the mysql schema (source: Selecting General Query and Slow Query Log Output Destinations). These tables are created by default as CSV tables for performance reasons, and are created even if MySQL is set not to log to tables.

CSV tables, however, are not copied as part of the InnoDB Hot Backup process (by the wrapper script innobackupex.pl), thus creating this error. The fix to get the slave working was to copy the .CSV and .CSM files (each CSV table has three files: a .FRM file, a .CSV file, and a .CSM file), and re-run mysql_upgrade.

To avoid having to fix the CSV table issue every time, we looked at patching innobackup.pl to backup CSV tables properly. We found it involved a simple change. To fix it for InnoDB Hot Backup, simply replace line 1449 from the original:

    my $wildcard = '*.{frm,MYD,MYI,MRG,TRG,TRN,opt}';

to:

    my $wildcard = '*.{frm,MYD,MYI,MRG,TRG,TRN,CSM,CSV,opt}';

Xtrabackup, with its wrapper innobackupex.pl, also has the same issue with not backing up CSV tables. Similarly, the fix is to change line 1811 of innobackupex.pl from:

    my $wildcard = '*.{frm,MYD,MYI,MRG,TRG,TRN,ARM,ARZ,opt,par}';

to:

    my $wildcard = '*.{frm,MYD,MYI,MRG,TRG,TRN,ARM,ARZ,CSM,CSV,opt,par}';

We submitted this issue to Percona (innobackupex fails to backup CSV tables causing mysql_upgrade to fail) and to Innobase (innobackup.pl fails to backup CSV tables causing mysql_upgrade to fail).


PlanetMySQL Voting: Vote UP / Vote DOWN

xtrabackup-0.9

Август 21st, 2009

Dear Community,

The release 0.9 of the opensource backup tool for InnoDB and XtraDB is available for download.

Changelog:

  • tar4ibd could not treat over 8GB file, now max 64GB
  • prepare-speed-hack is added

Fixed bugs:

The binary packages for RHEL4,5, Debian, FreeBSD, MacOS as well as source code of the XtraBackup is available on http://www.percona.com/mysql/xtrabackup/0.9/.

The project lives on Launchpad : https://launchpad.net/percona-xtrabackup and you can report bug to Launchpad bug system:
https://launchpad.net/percona-xtrabackup/+filebug. The documentation is available on our Wiki.

For general questions use our Pecona-discussions group, and for development question Percona-dev group.

For support, commercial and sponsorship inquiries contact Percona.


Entry posted by Aleksandr Kuzminsky | No comment

Add to: delicious | digg | reddit | netscape | Google Bookmarks


PlanetMySQL Voting: Vote UP / Vote DOWN