Archive for the ‘Backups’ Category

Ten things to remember about MySQL backups

Май 4th, 2012

Read the original article at Ten things to remember about MySQL backups

 

  1. Use Hot Backups

Hot backups are an excellent way to backup MySQL.  They can run without blocking your application, and save tons on restore time.  Percona’s xtrabackup tool is a great way to do this.  We wrote a howto on using xtrabackup for hotbackups.

  1. Use Logical Backups

Just because we love hot backups using xtrabackup doesn’t mean mysqldump isn’t useful.  Want to load data into Amazon RDS?  Want to isolate and load only one schema, or just one table?  All these great uses make mysqldump indispensable.  Use it in combination with periodic hot backups to give you more recovery options.

  1. Replication isn’t a backup

While replication provides a great way to keep a hot copy of your production database, it’s not the same as a backup.  Why?  Operator error, that’s why!  People make mistakes, drop tables and database schemas that later need to be restored.  This can and will happen, so head off the disaster by doing real backups.

As an additional note, if you’re using replication, you surely want to perform regular checksums of your data.  These ensure that the primary and secondary do indeed contain the same data.

  1. Firedrills & Restore Time

The only way to be sure your backup is complete is to test restoring everything.  Yes it’s a pain, but it will inevitably be a learning experience.  You’ll document the process to speed it up in future tests, you’ll learn how long recovery takes, and find additional pieces to the pie that must be kept in place.  Doing this in advance of d-day is

Different backups have different recovery times.  In the industry vernacular, your RTO or recovery time objective should inform what will work for you.  Although a mysqldump may take 30 minutes to complete, your restore of that data might take 8 hours or more.  That’s due in part to rebuilding all those indexes.  When you perform the dump one create index statement is formulated from the data dictionary, but on import the data must be sorted and organized to rebuild the index from scratch.  Percona’s mysqldump utility will capitalize on MySQL’s fast index rebuild for Innodb tables.  According to the Percona guys this can bring a big improvement in import time.  Yet another great reason to use the Percona distro!

  1. Transaction Logs

If you want to be able to do point in time recovery, you’ll need all the binlog files as well.  These are being created all the time, while new transactions are completed in your database. If your last backup was last night at 3am, and you want to recovery today until 3pm, you’ll need all the binary logs from the intervening hours to apply to that backup.  This process is called point-in-time recovery, and can bring your database restore up to the current commited transactions.

  1. Backup Config Files

Don’t forget that lonely /etc/my.cnf file.  That’s an important part of a backup if you’re rebuilding on a newly built server.  It may not need to be backed up with the same frequency, but should be included.

  1. Stored Code & Grants

Stored procedures, triggers and functions are all stored in the mysql database schema.  If you are doing a restore of just one database schema, you may not have this, or it may make the restore more complicated.  So it can be a good idea to backup code separately.  mysqldump can do this with the –routines option.  Hot backups by their nature, will capture everything in the entire instance – that is all database schemas including the system ones.

Grants are another thing you may want to backup separately.  For the same reasons as stored code, grants are stored in the system tables.  Percona toolkit includes a nice tool for this called pt-show-grants.  We recommend running this periodically anyway, as it’ll give you some perspective on permissions granted in your database.  You’re reviewing those right?

  1. Events & Cronjobs

MySQL allows the running of events inside the database.  SHOW EVENTS or SHOW EVENTS schema_name will display the events scheduled.

You may also have cronjobs enabled.  Use crontab -l to display those for specific users.  Be sure to check at least “mysql” and “root” users as well as other possible application users on the server.

  1. Monitoring

Backups are a nit picky job, and often you don’t know if they’re complete until it’s time to restore.  That’s why we recommend firedrills above, and they’re very important.  You can also monitor the backups themselves.  Use an error log with mysqldump or xtrabackup, and check that logfile for new messages.  In addition you can check the size of the resulting backup file.  If it has changed measurably from the recent backup sizes, it may indicate problems.  Is your backup size 0, something serious is wrong.  Half the size of recent ones, it may have failed halfway through, or the filesystem filled up.

  1. Security

This is often overlooked area, but may be a concern for some environments.  Is the data contained in your backup sensitive?  Consider where the backups are stored and retained for long term.  Reason who has access to those files, and make use of the least privileges rule.

 

 

 

For more articles like these go to iHeavy, Inc +1-212-533-6828


PlanetMySQL Voting: Vote UP / Vote DOWN

Percona XtraBackup 1.4

Ноябрь 22nd, 2010

Percona XtraBackup 1.4 is now available for download.

Version 1.4 fixes problems related to incremental backups. If you do incremental backups, it’s strongly recommended that you upgrade to this release.

Functionality Added or Changed

  • Incremental backups have changed and now allow the restoration of full backups containing certain rollback transactions that previously caused problems. Please see Preparing the Backups and the --apply-log-only option. (From innobackupex, the --redo-only option should be used). (Yasufumi Kinoshita)
  • The XtraBackup Test Suite was implemented and is now a standard part of each distribution. (Aleksandr Kuzminsky)
  • Other New Features:
    • The --prepare option now reports xtrabackup_binlog_pos_innodb if the information exists. (Yasufumi Kinoshita)
    • When --prepare is used to restore a partical backup, the data dictionary is now cleaned and contains only tables that exist in the backup. (Yasufumi Kinoshita))
    • The --table option was extended to accept several regular expression arguments, separated by commas. (Yasufumi Kinoshita)
  • Other Changes:
    • Ported to the Percona Server 5.1.47-11 code base. (Yasufumi Kinoshita)
    • XtraBackup now uses the memory allocators of the host operating system, rather than the built-in InnoDB allocators (see Using Operating System Memory Allocators). (Yasufumi Kinoshita)

Bugs Fixed

  • Bug #595770 – Binaries are stripped by rpmbuild, so __os_install_post is redefined to change the default behaviour. (Aleksandr Kuzminsky)
  • Bug #589639 – Fixed a problem of hanging when tablespaces were deleted during the recovery process. (Yasufumi Kinoshita)
  • Bug #611960 – Fixed a segmentation fault in “xtrabackup”. (Yasufumi Kinoshita)
  • Miscellaneous important fixes related to incremental backups.

Release Notes for this and previous releases of Percona Xtrabackup can be found in our Wiki.

The latest downloads are available on our website. The latest source code can be found on Launchpad.

Please report any bugs found at Bugs in Percona XtraBackup.

For general questions, use our Percona Discussions Group, and for development questions our Percona Development Group.

For support, commercial, and sponsorship inquiries, contact Percona.


Entry posted by Fred Linhoss | No comment

Add to: delicious | digg | reddit | netscape | Google Bookmarks


PlanetMySQL Voting: Vote UP / Vote DOWN

Why do I recommend switching over from MyISAM to Innodb!

Ноябрь 16th, 2010
Although MyISAM has been the default storage engine for MySQL but its soon going to change with the release of MySQL server 5.5. Not only that, more and more people are shifting over to the Innodb storage engine and the reasons for that is the tremendous benefits, not only in terms of performance, concurrency, ACID-transactions, foreign key constraints, but also because of the way it helps out the DBA with hot-backups support, automatic crash recovery and avoiding data inconsistencies which can prove to be a pain with MyISAM. In this article I try to hammer out the reasons why you should move on to using Innodb instead of MyISAM.
PlanetMySQL Voting: Vote UP / Vote DOWN

Lost innodb tables, xfs and binary grep

Ноябрь 9th, 2010

Before I start a story about the data recovery case I worked on yesterday, here’s a quick tip – having a database backup does not mean you can restore from it. Always verify your backup can be used to restore the database! If not automatically, do this manually, at least once a month. No, seriously – in most of the recovery cases I worked on, customers did have some sort of backup, but it just wasn’t working, complete and what not. Someone set it up and never bothered to check if it still works after a while.

Anyway, this post is not really about the backups but rather about few interesting things I learned during last recovery case.

First, some facts about the system and how data was lost:

  • MySQL had a dedicated partition on XFS file system
  • Server was running innodb_file_per_table
  • There was a production master and two slaves, all had same setting
  • Developer accidentally ran DROP DATABASE X on the wrong machine (production master)
  • All slaves followed and dropped their copy of the data
  • The important tables were all InnoDB
  • Having a backup, customer has first attempted to restore from backup on the production master

Luckily (or rather, unfortunately) backup only had table definitions but not the data so no data was written to file system. Mind however that restoring a backup could have been fatal if it would have written some junk data as that would have overwritten the deleted files. Now, here’s what I learned while working on this case:

Recovering from XFS is possible. Just a month ago we had a team meeting in Mallorca where we went through various data loss scenarios. One of them was deleted files on xfs – we all agreed on few things:

  • recovering files from xfs is hard, if at all possible
  • we had no recovery cases on xfs, most likely because:
  • whoever is using xfs, is smart enough to have backups set up properly

Now I’m not picking on the customer or anything – indeed they did have a backup set up, it’s just that some (most important) tables weren’t backed up. We did not try any of the file recovery tools for xfs – apparently they are all targeting specific file types and sure enough InnoDB is not one of the supported files. What we did is we simply ran page_parser on the (already) unmounted file system treating it as a raw device. I was surprised how amazingly simple and fast it was (did you know that latest version of page_parser identifies pages by infimum and supremum records?) – 10G partition was scanned in like 5 minutes and all 4G of innodb pages were successfully written to a separate partition. That’s the easy part though – you run page parser, wait and see what you get.

If InnoDB Data Dictionary was not overwritten by an attempt to restore from the backup, actually second part would’ve been quite easy too, but it was so I could no longer identify correct PK id for specific tables by just mapping data dictionary table records to index records. Instead I had to grep for specific character sequences against all pages. Note however that only works for text in uncompressed text columns (varchar, char, text) but what if tables don’t have any text columns at all? Then, you read further.

GNU grep won’t match binary strings. This isn’t new, I kind of knew grep couldn’t look for binary “junk”, but I really needed it to. Why? Well, here’s few of the scenarios we’ve gone through yesterday:

1. There was this rather big table with integer and enum columns only, where we knew a rather unique PK, well something like 837492636 so we needed a way to find pages that match it. InnoDB would internally store integers in 4-bytes rather than 10 bytes if it were stored as a sequence of characters, so “grep -r 837492636 /dir” would not have worked.

2. There was another table, a small one with 4 smallint columns where all we could match on was a sequence of numbers from a single record – customer knew that there was at least one row with the following sequence: 7, 3, 7, 8. Matching by any of the numbers would be insane as it would match all of the pages while matching on numbers as a sequence of characters would not work for many reasons.

This is where I found bgrep which was exactly the tool for the task. In the case number one, I have just converted number 837492636 to it’s binary representation 0×31EB1F9C and ran “bgrep 31EB1F9C /dir” – there were only like 10 other matches across the 4 gigabytes of pages, some of them probably from the secondary pages, but when you only have that many pages to check it’s really simple.

Second case seemed somewhat complicated, but it really wasn’t – all of the columns were fixed size – 2bytes each, so the thing we had to look for was this sequence: 0007000300070008. I was expecting a lot of mismatches but in fact I ended up with only one match pointing exactly to the right page and so the right index id.

The other thing I would note about bgrep – it was so much faster than matching text using grep, so if you happen to have a lot of data to scan and you have to choose between matching text and number, matching a number using bgrep may work much better.

We are considering shipping bgrep as part of percona recovery toolset, with some additional converters so we can match against various date/time columns as well.


Entry posted by Aurimas Mikalauskas | 2 comments

Add to: delicious | digg | reddit | netscape | Google Bookmarks


PlanetMySQL Voting: Vote UP / Vote DOWN

An argument for not using mysqldump

Ноябрь 8th, 2010

I have a 5G mysqldump which takes 30 minutes to restore from backup.  That means that when the database reaches 50G, it should take 30×10=5 hours to restore.  Right?  Wrong.

Mysqldump recovery time is not linear.  Bigger tables, or tables with more indexes will always take more time to restore.

If I restore from a raw backup (LVM snapshot, xtrabackup, innodb hot backup), it is very easy to model how much longer recovery time will take:

Backup is 80G
Copy is at 70MB/s.
10G is already complete.
= ((80-10) * 1024)/70/60 = ~17 minutes

I can tell progress with mysqldump by monitoring the rate at which show global status like 'Handler_write'; increases and compare it to my knowledge of about how many rows are in each table.  But progress != a magic number like “17 minutes”.  Not unless I do a lot of complex modeling.

I am not saying a 5 hour recovery is good or bad.  What I am saying is knowing remaining time is very important during disaster recovery.  Being able to say “we’ll be back at 2PM” is much better than saying “we’ll be back between 1PM and 4PM.. maybe”.


Entry posted by Morgan Tocker | No comment

Add to: delicious | digg | reddit | netscape | Google Bookmarks


PlanetMySQL Voting: Vote UP / Vote DOWN

Easy MySQL: how to backup databases to a remote machine

Август 14th, 2010

Here’s a simple answer to a simple question. “How do I run a backup of MySQL to another machine without writing to the local server’s filesystem?” – this is especially useful if you are running out of space on the local server and cannot write a temporary file to the filesystem during backups.

Method one – this writes a remote file.
mysqldump [options] [db_name|--all-databases]| gzip -c | ssh user@host.com "cat > /path/to/new/file.sql.gz"

Method two – this writes directly into a remote mysql server
mysqldump [options] [db_name|--all-databases]| mysql --host=[remote host] –user=root –password=[pass] [db_name]


PlanetMySQL Voting: Vote UP / Vote DOWN

Why you can’t rely on a replica for disaster recovery

Август 1st, 2010

A couple of weeks ago one of my colleagues and I worked on a data corruption case that reminded me that sometimes people make unsafe assumptions without knowing it. This one involved SAN snapshotting that was unsafe.

In a nutshell, the client used SAN block-level replication to maintain a standby/failover MySQL system, and there was a failover that didn’t work; both the primary and fallback machine had identically corrupted data files. After running fsck on the replica, the InnoDB data files were entirely deleted.

When we arrived on the scene, there was a data directory with an 800+ GB data file, which we determined had been restored from a SAN snapshot. Accessing this file caused a number of errors, including warnings about accessing data outside of the partition boundaries. We were eventually able to coax the filesystem into truncating the data file back to a size that didn’t contain invalid pointers and could be read without errors on the filesystem level. From InnoDB’s point of view, though, it was still completely corrupted. The “InnoDB file” contained blocks of data that were obviously from other files, such as Python exception logs. The SAN snapshot was useless for practical purposes. (The client decided not to try to extract the data from the corrupted file, which we have specialized tools for doing. It’s an intensive process that costs a little money.)

The problem was that the filesystem was ext2, with no journaling and no consistency guarantees. A snapshot on the SAN is just the same as cutting the power to the machine — the block device is in an inconsistent state. A filesystem that can survive that has to ensure that it writes the data to the block device such that it can bring into a consistent state later. The techniques for doing this include things like ordered writes and meta-data journaling. But ext2 does not know how to do that. The data that’s seen by the SAN is some jumble of blocks that represents the most efficient way to transfer the changed blocks over the interconnect, without regard to logical consistency on the filesystem level.

Two things can help avoid such a disaster: 1) get qualified advice and 2) don’t trust the advice; backups and disaster recovery plans must be tested periodically.

This case illustrates an important point that I repeat often. The danger of using a replica as a backup is that data loss on the primary can affect the replica, too. This is true no matter what type of replication is being used. In this case it’s block-level SAN replication. DRBD would behave just the same way. At a higher level, MySQL replication has the same weakness. If you rely on a MySQL slave for a “backup,” you’ll be out of luck when someone accidentally runs DROP TABLE on your master. That statement will promptly replicate over and drop the table off your “backup.”

I still see people using a replica as a backup, and I know it’s just a matter of time before they lose data. In my experience, the types of errors that will propagate through replication are much more common than those that’ll be isolated to just one machine, such as hardware failures.


Entry posted by Baron Schwartz | No comment

Add to: delicious | digg | reddit | netscape | Google Bookmarks


PlanetMySQL Voting: Vote UP / Vote DOWN

Recover BLOB fields

Июль 1st, 2010

For a long time long types like BLOB, TEXT were not supported by Percona InnoDB Recovery Tool. The reason consists in a special way InnoDB stores BLOBs.

An InnoDB table is stored in a clustered index called PRIMARY. It must exist even if a user hasn't defined the primary index. The PRIMARY index pages are identified by 8-bytes number index_id. The highest 4 bytes are always 0, so index_id is often notated as o:<4 bytes number>, e.g. 0:258. The pages are ordered in a binary tree. Primary index is used as a key. Inside a page records are stored in a linked list.

InnoDB page by default is 16k. Obviously if a record is too long, a single page can't store it. If the total record size is less than UNIV_PAGE_SIZE/2 - 200 (this is roughly 7k) then the full record is stored in the page of PRIMARY index. Let's call it internal. In InnoDB sources they have type FIL_PAGE_INDEX*. If the record is longer than 7k bytes, only first 768 bytes of every BLOB field are stored internally. The rest is stored in external pages. They have type FIL_PAGE_TYPE_BLOB. Page type is stored in a FIL_PAGE_TYPE field of the page header . In an earlier post Peter described in details how BLOBs are stored.

Let me illustrate a record format of the example of the table:

CODE:
  1. CREATE TABLE `t1` (
  2. `ID` int(11) unsigned NOT NULL,
  3. `NAME` varchar(120),
  4. `N_FIELDS` int(10),
  5. PRIMARY KEY (`ID`)
  6. ) ENGINE=InnoDB DEFAULT
  7. CHARSET=latin1

Here COMPACT format is used, which is default in MySQL >= 5.1.

The record consists of four parts:

  1. Offsets. Effectively these are field lengths. Only variable length types have offsets. The offset can be one or two bytes depending on maximum field size. The highest bit of the offset is 1 if the field is stored in external pages (i.e. long BLOB field)
  2. NULL fields. A bit per NULL-able field padded to minimum number of bytes to store all flags.
  3. So called extra bytes. These are 5 bytes where different flags are stored like "record is deleted" flag. The last two bytes are the relative pointer to the next record in the page.
  4. User data. TRX_ID is a transaction id. PTR_ID is a pointer in a rollback segment to the old version of the record.

So if a field is the long one, it has 1) The highest bit of the offset is set to "1", 2) After 768 bytes there are 20 bytes in the end where the following:

  1. BTR_EXTERN_SPACE_ID - space id where the next piece of the field is stored
  2. BTR_EXTERN_PAGE_NO - page id
  3. BTR_EXTERN_OFFSET - offset inside a page. An external page has a header. The similar pointer to the next page is stored in it.
  4. BTR_EXTERN_LEN - length of the next piece.

The external pages are linked until BTR_EXTERN_PAGE_NO is FIL_NULL.

Percona InnoDB Recovery Tool supports now recovery of long fields. It is still in development branch, but should be released after QA tests.

The complexity of BLOB fields brings prerequisites to successfully recover a record with BLOB : all pieces of the BLOB field must be reachable by pointers. That means BTR_EXTERN_PAGE_NO, BTR_EXTERN_OFFSET and BTR_EXTERN_LEN must not be corrupted.

The tool outputs the recovered table in tab-separated values format. BLOBs are printed in a hex form - 0ACD86...

To upload the table back you should utilize UNHEX function:

CODE:
  1. mysql>
  2. LOAD DATA INFILE '/path/to/datafile'
  3. REPLACE INTO TABLE <table_name>
  4. FIELDS TERMINATED BY '\t'
  5. OPTIONALLY ENCLOSED BY '"'
  6. LINES STARTING BY '<table_name>\t'
  7. (id,sessionid,uniqueid,username,nasipaddress,@var1,@var2,etc)
  8. SET
  9.   blobfield = UNHEX(@var1),
  10.   datefield2 = FROM_UNIXTIME(@var2,'%Y %D %M %h:%i:%s %x');

* - there is a typo in Recovery of Lost or Corrupted InnoDB Tables Presentation


Entry posted by Aleksandr Kuzminsky | No comment

Add to: delicious | digg | reddit | netscape | Google Bookmarks


PlanetMySQL Voting: Vote UP / Vote DOWN

A backup today saves you tomorrow

Апрель 30th, 2010

Whether you’re working with MySQL, MySQL Cluster, or any other RDBMS, every database with a requirement for persistent data should have a backup. As a Production DBA you’re the insurance policy to safeguard the data. Bad things do happen. Backups are your safety net to ensure you always have a way to recover the database should the worst happen and it becomes irreparable.

There are many ways to produce a consistent backup of MySQL, I have listed a few of the options available below; Remember backups are your safety net, failing to retrieve a consistent backup when you need it most can be a very career limiting move, so no matter which backup method you choose always test your backups!

Logical Backups
The ever popular mysqldump is a backup and export utility that can be used to make logical backups. With a logical backup, the data and schema are exported from the database and written using a series of create and insert SQL statements to a flat file, to restore the backup, the file is simply passed back to MySQL to recreate the database objects and data using the SQL code.
For small to medium sized databases a logical backup can be an excellent choice, however when working with larger databases, they can take a long time to both produce and recover. For this reason, logical backups aren’t usually the primary backup method when working with VLDBs, however if it’s possible it’s still a great idea to try and use logical backups as part of a wider backup policy to provide an added layer of safety.
There are a number of tools available that perform similar functions to mysqldump, one good example of these would be mk-parallel-dump available from the Maatkit toolkit, whilst not strictly a logical backup tool, by breaking tables into chunks and dumping the data on multiple concurrent threads mk-parallel-dump can significantly increase the speed of exporting data.

Database File Backups
A common way to backup MySQL is to make a copy of the database files on disk. In an environment that’s not operating 24/7 this can easily be done by shutting down MySQL and copying the files out of hours, however even in an environment that’s running 24/7 where there are no obvious opportunities for downtime, there are still a number of options available to make a consistent copy of the database files.
mysqlhotcopy is a perl script that can be used to automate copying the database files of MyISAM and Archive tables. mysqlhotcopy is able to take a consistent copy of the files whilst MySQL is still running as it both flushes the tables and acquires a read lock prior to the copy. The main issue with this method is the tables are locked for the duration of the copy, depending on the size of the database this can obviously have a noticeable impact.
By installing MySQL on a logical volume manager such as LVM or ZFS, it’s possible to take a consistant copy of the database files by taking a snapshot of the filesystem where the database file reside. This method isn’t limited to MyISAM and Archive tables, and can safely be used for InnoDB. Again when working with MyISAM it’s important to flush and lock the tables prior to the snapshot, however as the lock is only required whilst the snapshot is created, this doesn’t have the same impact as with mysqlhotcopy. mylvmbackup is a very useful perl script written by Lenz Grimmer that automates a LVM Snapshot and database file backup for both MyISAM and InnoDB tables. Snapshots can be a sound alternative for backups of databases of any size, but they do require some additional technical knowledge, additionally they can have a performance overhead, so it’s important to benchmark a Snapshot to ensure the impact on the operation isn’t intrusive.
Another alternative to backup the database files without interrupting the live service would be to use MySQL Replication to replicate the database to a slave, the slave can then be stopped to allow a backup of the database files offline. In most cases, introducing an additional slave, requires additional hardware – this obviously increases the cost of the overall solution, MySQL Replication is also asynchronous so it’s important to monitor a slave when using it for backups.

3rd Party Tools
There are a number of third party tools available for backing up MySQL, Zmanda Recovery Manager for MySQL provides an automated way of executing a backup using mysqldump or one of the various database file backup techniques mentioned above. ZRM for MySQL is available in both Community and Enterprise Editions, the commercial Enterprise Edition extends the number of backup options supported and ties in with a GUI to provide an added featureset.
InnoDB HotBackup is a tool commercially available from InnoBase (a subsidery of Oracle) that allows a consistent backup to be taken on MyISAM or InnoDB without interupting the database or requiring additional architecture. XtraBackup is an open source tool developed by Percona that adopts the same approach as InnoDB HotBackup and allows a consistent backup to be taken on MyISAM, InnoDB or Percona’s propriety engine XtraDB. Both InnoDB HotBackup and Xtrabackup are capable alternatives to database file backups.

MySQL Cluster
MySQL Cluster has a native backup tool that can be executed from the Management Node to take a consistent backup of the data on each Data Node.
mysqldump can be used in combination on the Application Nodes to backup the Database Schemas including any routines, triggers, and events, it can also be used to create a logical backup or data export on MySQL Cluster but to ensure consistency the Cluster has to be put into single user mode.


PlanetMySQL Voting: Vote UP / Vote DOWN

Kontrollkit – new version available for download

Апрель 27th, 2010
Just a quick notice to let everyone know that there is a new version of Kontrollkit available. There are some required bug fixes to the formerly new python backup script and some Solaris compatible changes to the various my.cnf files. You can download the new version here: http://kontrollsoft.com/software-downloads, or here: http://code.google.com/p/kontrollkit/
PlanetMySQL Voting: Vote UP / Vote DOWN