Archive for the ‘XtraBackup’ Category

Chain Copying to Multiple hosts

Май 18th, 2012

This week I was given the task of repopulating our entire primary database cluster.  This was due to an alter that had to be performed on our largest table.  It was easiest to run it on one host and populate the dataset from that host everywhere.

I recalled a while back reading a blog post from Tumblr about how to chain a copy to multiple hosts using a combination of nc, tar, and pigz.  I used this, with a few other things to greatly speed up our repopulation process.  As I was repopulating production servers, I did a combination of raw data copy and xtrabackup streams across our servers, depending on the position in our replication setup.

For a normal straight copy, here’s what I did:

On the last host, configure netcat to listen and then pipe the output through pigz and tar to uncompress and untar.  This needs to be run in the destination directory:

nc -l 1337 | pigz -d | tar -xvf -

On any hosts in the middle of the chain, you do the same thing with one extra step.  Using a fifo to redirect the stream to the next host:

mkfifo copy_redirect
nc next_host_in_chain 1337 <copy_redirect &
nc -l 1337 | tee copy_redirect | pigz -d | tar -xvf -

And on the source host you actually make the stream.  This is where I differed the most from what Tumblr had written.  I added a progress bar using pv.

tar -c /data/mysql/ | pv --size $( du -sh /data/mysql/ | cut -f1 ) | pigz | nc first_host_in_chain 1337

To do this with an xtrabackup stream, the commands are similar.  On each host, tar needs to add the “i” flag (to become “tar -xvfi -”).  The progress bar here became slightly less accurate, but was still a good rough estimate of the progress.  On the source host, the command became:

innobackupex --stream=tar /tmp/ --slave-info  | pv --size $( du -sh /data/mysql/ | cut -f1 ) | pigz | nc first_host_in_chain 1337

I found that using this method, for a raw copy, I was able to achieve between 300 and 350 MB/sec copying large tables.  Smaller tables averaged slower speeds.  I didn’t do enough testing here to see where the bottleneck was.  I can say that it was not network, cpu, or io.  Our servers involved have 10 GBit network and FusionIO drives.  Increasing the compression level may have helped add some throughput here as well.  Copying a 1.4 TB Dataset to 3 destination servers took under 2 hours.

This is definitely a tool that I will be adding to my arsenal to use on a regular basis.



PlanetMySQL Voting: Vote UP / Vote DOWN

Backup your sandbox with XtraBackup

Апрель 15th, 2012
Today I tried to make incremental backups of a MariaDB instance in a MySQL sandbox with Percona XtraBackup.
I used the recently released XtraBackup 2.0. And of course there is documentation about making incremental backups. 

MySQL sandbox makes it easy to run many different MySQL versions on one machine. It does this by changing the port number, data directory, UNIX socket location and a whole lot more.

So I first started with a full backup and after that I used that backup as a base for the incremental backups. To do that I had to specify the port number which is 5522 and the username and password for the msandbox account. As MySQL uses a UNIX socket instead of a TCP connection if the hostname is localhost I specified 127.0.0.1 as hostname to force a TCP connection. That worked!

Then I created the incremental backup by using the --incremental option and the --incremental-basedir option to specify the location of the full backup. That also worked!

Then I tried to make a backup while putting some load on my database. I did use  "INSERT INTO test1(col1) SELECT col1 FROM test1" to do this.

The full and incremental backups still worked, or at least that's what the backup script told me. But the size of the incremental backups was quite small. And I noticed that the LSN was very small and not increasing. The xtrabackup_checkpoints file also told me that the backups where all for exactly the same LSN. As the LSN is only for InnoDB, I verified the table type for my test tables. And my test tables were in fact InnoDB. A "SHOW ENGINE INNODB STATUS\G" told me that the LSN was in fact increasing.

It turned out that XtraBackup was making backups of /var/lib/mysql instead of ~/sandboxes/msb_5_5_22-mariadb/data/. Adding "--defaults-file=~/sandboxes/msb_5_5_22-mariadb/my.sandbox.cnf" to the innobackupex command did correct this.

After specifying the correct config file I did try to make backups under load again. It failed due to the logfiles being too small. So I stopped the database, removed the ib_logfile's and started the database with a larger InnoDB logfile size.

Then It all worked flawlessly!

So you should make sure that your backup completes without errors AND that your backups is from the right database. Of course testing restores regularly would also detect this.

PlanetMySQL Voting: Vote UP / Vote DOWN

Sessions at the Percona Live MySQL Conference that interest me

Март 9th, 2012

For the past many years, there’s been a conference in April, at the Santa Clara Convention Centre where the topic has been MySQL and the surrounding ecosystem. The first year I went, I gave a talk on the new features in MySQL Cluster 5.1 to a overflowing room of attendees. For me, it’s an event that’s mixed with speaking about something I’ve been working on and talking to other attendees about everything from how a particular part of the server works to where we can escape to for nearby good vegan food.

So, I thought I’d share some of the sessions that I’m really looking forward to. My selection is probably atypical, but may be interesting to others. I’m not going to list the keynotes, although they are often of a lot of value. I’m also going to attempt to avoid listing a few really awesome well known speakers simply because there are other really interesting sessions that also need exposure!

  • Starring Sakila: Building Data Warehouses and BI solutions using MySQL and Pentaho
    I need to base decisions off data, not simply a gut feeling (I’m not Stephen Colbert after all). I ran into a bunch of stumbling blocks when trying to work with Pentaho a couple of weeks ago, and I’m really hoping that this session shines some light on how to use it to better and more easily make arguments based on evidence to others in the company.
  • Testing MySQL Databases: The State Of The Art
    I’ve worked with Patrick for several years now, and he’s currently a valuable member of my team at Percona. For those who are interested in the state of the art of open source database testing, this is the session to be in.
  • Getting InnoDB Compression Ready for Facebook Scale
    This session is on at the same time as I’m speaking, so I probably won’t be able to attend (people keep coming to my sessions so I usually can’t sneak out). I’m really interested in how they’ve modified the compression code to help with their (large) workload.
  • Backing Up Facebook
    I hear that Facebook has a couple of database servers, a few dozen users and a few floppy disks full of data. This should be a fun story :)
  • Introducing XtraBackup Manager
    Being responsible for XtraBackup development at Percona, the XtraBackup topics really interest me. Lachlan has been working on a simple backup manager for XtraBackup to help create something that is a more complete backup solution than a tool which simply creates a backup.
  • Extending Xtrabackup – A Point-In-Time System
    Another good case of using XtraBackup as part of a comprehensive backup strategy. I have to be honest, I’m looking for ways in which we can improve XtraBackup to better fit the needs of people. It may be that there are a few small things we can do to make it easier for people do deploy and use.
  • Getting Started with Drizzle 7.1
    We’re about to do the 7.1 release of Drizzle! If you’re interested in having a SQL database that is designed to be used in large scale web applications and cloud environments, come along to this talk.
  • MySQL Idiosyncrasies That Bite
    I have to admit, I’m interested in Ronalds talk here to basically ensure we didn’t miss fixing anything in Drizzle. I do promise not to at any point yell out “Fixed in Drizzle” though.

Go here to register: http://www.percona.com/live/mysql-conference-2012/ (early bird pricing and discounted hotel rooms end March 12th, so you want to register sooner rather than later).


PlanetMySQL Voting: Vote UP / Vote DOWN

Announcing Percona XtraBackup 1.6.5

Февраль 11th, 2012

Percona is glad to announce the release of Percona XtraBackup 1.6.5 on 10 February, 2012 (Downloads are available here and from the Percona Software Repositories).

This release is purely composed of bug fixes and is the current stable release of Percona XtraBackup.

There are some important bug fixes around incremental backups, parallel backups and backups on databases with the system tablespace being multiple files. The full release notes and details are available here: http://www.percona.com/doc/percona-xtrabackup/release-notes/1.6/1.6.5.html


PlanetMySQL Voting: Vote UP / Vote DOWN

Announcing Percona XtraBackup 1.6.5

Февраль 11th, 2012

Percona is glad to announce the release of Percona XtraBackup 1.6.5 on 10 February, 2012 (Downloads are available here and from the Percona Software Repositories).

This release is purely composed of bug fixes and is the current stable release of Percona XtraBackup.

There are some important bug fixes around incremental backups, parallel backups and backups on databases with the system tablespace being multiple files. The full release notes and details are available here: http://www.percona.com/doc/percona-xtrabackup/release-notes/1.6/1.6.5.html


PlanetMySQL Voting: Vote UP / Vote DOWN

Xtrabackup 1.6.4 for Solaris 10 and 11

Январь 27th, 2012

If you need Xtrabackup for Solaris 10 and 11 (x64), you can download it from the link on the following page:

xtrabackup-solaris10_x86_64

Fwiw, we needed this for a server, and it’s not a standard package available for download, so I just wanted to make this available to all.

(Sorry if you’ve read this for a third time, and for the double link – I think the planet mysql feed might reject posts with direct ftp links.)

Hope this helps.


PlanetMySQL Voting: Vote UP / Vote DOWN

Xtrabackup 1.6.4 for Solaris 10 and 11

Январь 27th, 2012

If you need Xtrabackup for Solaris 10 and 11 (x64), you can download it from the link on the following page:

xtrabackup-solaris10_x86_64

Fwiw, we needed this for a server, and it’s not a standard package available for download, so I just wanted to make this available to all.

(Sorry if you’ve read this for a third time, and for the double link – I think the planet mysql feed might reject posts with direct ftp links.)

Hope this helps.


PlanetMySQL Voting: Vote UP / Vote DOWN

dbqp and Xtrabackup testing

Ноябрь 8th, 2011

So I’m back from the Percona dev team’s recent meeting.  While there, we spent a fair bit of time discussing Xtrabackup development.  One of our challenges is that as we add richer features to the tool, we need equivalent testing capabilities.  However, it seems a constant in the MySQL world that available QA tools often leave something to be desired.  The randgen is a literal wonder-tool for database testing, but it is also occasionally frustrating / doesn’t scratch every testing itch.  It is based on technology SQL Server was using in 1998 (MySQL began using it in ~2007, IIRC).  So this is no knock, it is merely meant to be an example of a poor QA engineer’s frustrations ; )  While the current Xtrabackup test suite is commendable, it also has its limitations. Enter the flexible, adaptable, and expressive answer: dbqp.

One of my demos at the dev meeting was showing how we can set up tests for Xtrabackup using the unittest paradigm.  While this sounds fancy, basically, we take advantage of Python’s unittest and write classes that use their code.  The biggest bit dbqp does is search the specified server code (to make sure we have everything we should), allocate and manage servers as requested by the test cases, and do some reporting and management of the test cases.  As the tool matures, I will be striving to let more of the work be done by unittest code rather than things I have written : )

To return to my main point, we now have two basic tests of xtrabackup:

Basic test of backup + restore:

  1. Populate server
  2. Take a validation snapshot (mysqldump)
  3. Take the backup (via innobackupex)
  4. Clean datadir
  5. Restore from backup
  6. Take restored state snapshot and compare to original state

Slave setup

  1. Similar to our basic test except we create a slave from the backup, replicating from the backed up server.
  2. After the initial setup, we ensure replication is set up ok, then we do additional work on the master and compare master and slave states

One of the great things about this is that we have the magic of assertions.  We can insert them at any point of the test we feel like validating and the test will fail with useful output at that stage.  The backup didn’t take correctly?  No point going through any other steps — FAIL! : )  The assertion methods just make it easy to express what behavior we are looking for.  We want the innobackupex prepare call to run without error?
Boom goes the dynamite!:

# prepare our backup
cmd = ("%s --apply-log --no-timestamp --use-memory=500M "
"--ibbackup=%s %s" %( innobackupex
, xtrabackup
, backup_path))
retcode, output = execute_cmd(cmd, output_path, exec_path, True)
self.assertEqual(retcode, 0, msg = output)

From these basic tests, it will be easy to craft more complex test cases.  Creating the slave test was simply matter of adapting the initial basic test case slightly.  Our plans include: *heavy* crash testing of both xtrabackup and the server, enhancing / expanding replication tests by creating heavy randgen loads against the master during backup and slave setup, and other assorted crimes against database software.  We will also be porting the existing test suite to use dbqp entirely…who knows, we may even start working on Windows one day ; )

These tests are by no means the be-all-end-all, but I think they do represent an interesting step forward.  We can now write actual, honest-to-goodness Python code to test the server.  On top of that, we can make use of the included unittest module to give us all sorts of assertive goodness to express what we are looking for.  We will need to and plan to refine things as time moves forward, but at the moment, we are able to do some cool testing tricks that weren’t easily do-able before.

If you’d like to try these tests out, you will need the following:
* dbqp (bzr branch lp:dbqp)
* DBD:mysql installed (test tests use the randgen and this is required…hey, it is a WONDER-tool!) : )
* Innobackupex, a MySQL / Percona server and the appropriate xtrabackup binary.

To run them:
$./dbqp.py –suite=xtrabackup_basic –basedir=/path/to/mysql –xtrabackup-path=/mah/path –innobackupex-path=/mah/other/path –default-server-type=mysql –no-shm

Some next steps for dbqp include:
1)  Improved docs
2)  Merging into the Percona Server trees
3)  Setting up test jobs in Jenkins (crashme / sqlbench / randgen)
4)  Other assorted awesomeness

Naturally, this testing goodness will also find its way into Drizzle (which currently has a 7.1 beta out).  We definitely need to see some Xtrabackup test cases for Drizzle’s version of the tool (mwa ha ha!) >: )


PlanetMySQL Voting: Vote UP / Vote DOWN

xtrabackup incremental backups

Август 12th, 2011

We wanted to optimize and test backups for one of our new large scale setup before we could finalize on the backup plan. Our challenge was the data volume was almost 30x more than our normal volumes. Since this involved large volume of data we thought this might be a good candidate to test incremental backups. We wrote a wrapper script to save the states between full backups and incremental backups and did some tests with smaller data sets. It worked perfectly fine. The version of xtrabackup we were testing was 1.6.2.

The incremental backups were completing well within 30minutes. We set out to test what would happen if the amount of incremental diff was large. While doing this test, we started getting backup failures with the following error

20110726_225312.log-110727 00:56:10 innobackupex: Starting to backup .frm, .MRG, .MYD, .MYI,
20110726_225312.log-innobackupex: .TRG, .TRN, .ARM, .ARZ, .CSM, .CSV and .opt files in
20110726_225312.log-innobackupex: subdirectories of ‘/var/lib/mysql’
20110726_225312.log:innobackupex: Error: Broken pipe at /var/backups/xtrabackup/bin/innobackupex line 336.
20110726_225312.log-Backup failed.
20110726_225312.log-Deleting bad data directory…
20110726_225312.log-Done. Removed /ddmysql/backups/20110726_225312

Initially we thought this was a one of case and made quite a few reruns. It was failing consistently. We wanted to get to the bottom of this. This was important for us to finalize the backup plan. We ran the innobackupex script through a debugger to find out where it was failing. The broken pipe error was getting triggered when the script was trying to send a ping to the mysql server so that the connection doesn’t timeout after the number of seconds mentioned in “mysql_keep_alive_timeout“  variable. This was set to around 1800. Our conclusion was when the incremental was of smaller size, the script never got to send the timeout ping. We reran the script with a much larger timeout value for a larger incremental data set and the backup was completed successfully.

The script seems to lose the connection handle during  course of execution and when it tries to ping on the handle it encounters a broken pipe. Wondering if there are other tried and tested ways to overcome this issue. Bug has been logged with the xtrabackup team to see if there are other options.


Tagged: broken pipe, bug, incremental, innobackupex, mysql, percona, xtrabackup

PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL data backup: going beyond mysqldump

Март 29th, 2011

A user on a linux user group mailing list asked about this, and I was one of the people replying. Re-posting here as I reckon it’s of wider interest.

> [...] tens of gigs of data in MySQL databases.
> Some in memory tables, some MyISAM, a fair bit InnoDB. According to my
> understanding, when one doesn’t have several hours to take a DB
> offline and do dbbackup, there was/is ibbackup from InnoBase.. but now
> that MySQL and InnoBase have both been ‘Oracle Enterprised’, said
> product is now restricted to MySQL Enterprise customers..
>
> Some quick searching has suggested Percona XtraBackup as a potential
> FOSS alternative.
> What backup techniques do people employ around these parts for backups
> of large mixed MySQL data sets where downtime *must* be minimised?
>
> Has your backup plan ever been put to the test?

You should put it to the test regularly, not just when it’s needed.
An untested backup is not really a backup, I think.

At Open Query we tend to use dual master setups with MMM, other replication slaves, mysqldump, and XtracBackup or LVM snapshots. It’s not just about having backups, but also about general resilience, maintenance options, and scalability. I’ll clarify:

  • XtraBackup and LVM give you physical backups. that’s nice if you want to recover or clone a complete instance as-is. But if anything is wrong, it’ll be all stuffed (that is, you can sometimes recover InnoDB tablespaces and there are tools for it, but time may not be on your side). Note that LVM cannot snapshot between multiple volumes consistently, so if you have your InnoDB ibdata/IBD files and iblog files on separate spindles, using LVM is not suitable.
  • mysqldump for logical (SQL) backups. Most if not all setups should have this. Even if the file(s) were to be corrupted, they’re still readable since it’s plain SQL. You can do partial restores, which is handy in some cases. It’ll be slower to load so having *only* an SQL dump of a larger dataset is not a good idea.
  • some of the above backups can and should *also* be copied off-site. that’s for extra safety, but in terms of recovery speed it may not be optimal and should not be relied upon.
  • having dual masters is for easier maintenance without scheduled outages, as well as resilience when for instance hardware breaks (and it does).
  • slaves. You can even delay a slave (Maatkit has a tool for this), so that would give you a live correct image even in case of a user error, provided you get to it in time. Also, you want enough slack in your infra to be able to initialise a new slave off an existing one. Scaling up at a time when high load is already occurring can become painful if your infra is not prepared for it.

A key issue to consider is this… if the dataset is sufficiently large, and the online requirements high enough, you can’t afford to just have backups. Why? Because, how quickly can you deploy new suitable hardware, install OS, do restore, validate, put back online?

In many cases one or more aspects of the above list simply take too long, so my summary would be “then you don’t really have a backup”. Clients tend to argue with me on that, but only fairly briefly, until they see the point: if a restore takes longer than you can afford, that backup mechanism is unsuitable.

So, we use a combination of tools and approaches depending on needs, but in general terms we aim for keeping the overall environment online (individual machines can and will fail! relying on a magic box or SAN to not fail *will* get you bitten) to vastly reduce the instances where an actual restore is required.
Into that picture also comes using separate test/staging servers to not have developers stuff around on live servers (human error is an important cause of hassles).

In our training modules, we’ve combined the backups, recovery and replication topics as it’s clearly all intertwined and overlapping. Discussing backup techniques separate from replication and dual master setups makes no sense to us. It needs to be put in place with an overall vision.

Note that a SAN is not a backup strategy. And neither is replication on its own.


PlanetMySQL Voting: Vote UP / Vote DOWN