Archive for the ‘update’ Category

TokuDB v6.0: Download Available

April 30th, 2012

TokuDB v6.0 is full of great improvements, like getting rid of slave lag, better compression, improved checkpointing, and support for XA.

I’m happy to announce that TokuDB v6.0 is now generally available and can be downloaded here.

Sysbench Performance

I wanted to take this time to talk about one more under-the-hood goody we’ve added to v6.0. In particular, we’ve been working on our locking schemes and have made some nice improvements in multi-threaded performance. In TokuDB v5.2, we outperformed InnoDB on sysbench by about 20% out to 64 threads. The following shows the performance of TokuDB v6.0 vs InnoDB on the same test:

InnoDB now has better multi-threading as well, so with standard compression on, we are now neck and neck with InnoDB out to 64 client threads, and then pull ahead out to 1024 client threads. With high compression, we top out at 72% faster than InnoDB!

We hope you enjoy this and all the other TokuDB v6.0 improvements.

To learn more about TokuDB:

  • Read the press release here.
  • Hear me talk about TokuDB v6.0 on the MySQL Database Community Podcast in Episode 86.
  • Read the Bloor Research Report on TokuDB v6.0 here.


TokuDB v6.0: Frequent Checkpoints with No Performance Hit

April 13th, 2012

Checkpointing, which involves periodically writing out dirty pages from memory, is central to the design of crash recovery for both TokuDB and InnoDB. A key issue in designing a checkpointing system is how often to checkpoint, and TokuDB takes a very different approach from InnoDB. The checkpoint frequency of InnoDB depends on the amount of fuzzy checkpointing, the log-file size, and how often the memory fills with dirty pages, but the upshot is that it runs a checkpoint infrequently. TokuDB runs a checkpoint one minute after the last one ended.

Frequent checkpoints make for fast recovery. Once MySQL crashes, the storage engine needs to replay the log to get back to a correct state. The length of the log is a function of the time since the last checkpoint. And replaying the log is single threaded. So TokuDB recovers in minutes, and usually much faster. If InnoDB crashes late in its checkpoint cycle it can take hours or more to recover. Indeed, there is considerable lore around making InnoDB recover faster.
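To make that relationship concrete, here is a toy model of the argument above: if the log grows at a steady rate and replay runs at a fixed single-threaded rate, recovery time grows linearly with the time since the last checkpoint. The function name and both rates (10 MB/s of log written, 5 MB/s replayed) are made-up illustrative values, not measurements of TokuDB or InnoDB.

```shell
#!/usr/bin/env bash
# Toy model only: all rates passed in below are hypothetical.
# replay_seconds SECONDS_SINCE_CHECKPOINT LOG_MB_PER_SEC REPLAY_MB_PER_SEC
replay_seconds() {
    # log size = elapsed time * log write rate; replay time = log size / replay rate
    awk -v t="$1" -v w="$2" -v r="$3" 'BEGIN { printf "%.0f\n", t * w / r }'
}

echo "Crash 1 minute after a checkpoint: $(replay_seconds 60 10 5) s of replay"
echo "Crash 1 hour after a checkpoint:   $(replay_seconds 3600 10 5) s of replay"
```

Under these assumed rates, a crash one minute after a checkpoint costs about two minutes of replay, while a crash an hour after a checkpoint costs about two hours. The real numbers depend entirely on the workload; only the linear shape is the point.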

So what’s the downside to frequent checkpoints? Up until now, the answer was simple: when you are in a checkpoint, your performance drops. This was famously illustrated for InnoDB when Vadim Tkachenko at Percona Consulting showed that MySQL could become completely unresponsive for minutes at a time during an InnoDB checkpoint. We see a similar outcome here:

In this case we see a stall in which the throughput drops to around 25%, and the stall lasts for minutes. I want to stress that fuzzy checkpointing was designed to help avoid catastrophic checkpoints, and sometimes it works, but the tpcc benchmark shows that it doesn’t always work.

In previous versions of TokuDB, we also had a dip in performance associated with checkpoints, but frequent checkpoints are also smaller checkpoints, so our performance would drop to around 80% of peak for a couple of seconds. A drop to 80% is better than a drop to 25%, but we knew we could do better.

And we did. As of TokuDB v6.0, we’ve eliminated the performance variability from checkpointing. We’re still checkpointing just as frequently, so you still get fast recovery. How? It was a combination of reducing the amount of work a checkpoint needs to do and fixing the locking interaction between checkpoints and other operations. Below is a sysbench benchmark. This is a case where InnoDB checkpoint behavior is as good as it gets, and I wanted to compare us with InnoDB’s best case, not its worst case.

Sysbench performance with different compressors

This graph shows that TokuDB v6.0 has no checkpoint variability. It turns out that TokuDB v6.0 with standard compression has about the same average TPS as TokuDB v5.2, but with no checkpointing artifacts. Finally, if you have the CPU budget for it, turning on aggressive compression gives a big boost in transactions per second, still with no checkpointing variability.

So in a nutshell, I feel like we’ve taken care of the checkpointing issue. As of TokuDB v6.0, we have the upside of frequent checkpoints — small logs and fast recovery — without the downside of variability. The engineering team at Tokutek is pretty proud of these results.

To learn more about TokuDB:

  • Download a free trial of TokuDB.
  • Read the press release here.
  • Hear me talk about TokuDB v6.0 on the MySQL Database Community Podcast in Episode 86.
  • Come to our booth #410 at Percona Live.


TokuDB v6.0: Even Better Compression

April 11th, 2012

A key feature of our new TokuDB v6.0 release, which I have been blogging about this week, is compression. Compression is always on in TokuDB, and the compression we’ve achieved in the past has been quite good. See a previous post on the 18x compression achieved by TokuDB v5.0 on one benchmark. In our latest release, we’ve updated the way compression works and achieved a 50% improvement in compression.

I decided to present numbers on the same set of data as the old post, so see that post for experimental details.

But first, what are the changes? TokuDB compresses large blocks of data — on the order of MB, rather than the 16KB that InnoDB uses — which is a big part of why we can get better compression. For InnoDB, compression is attempted on 16KB pieces, with inefficiencies if the block compresses too little or too much. InnoDB’s compression woes are well documented.

In TokuDB v6.0, you can choose between two types of compression by setting the ROW_FORMAT in the CREATE TABLE or ALTER TABLE commands. One compression setting, “standard,” uses less CPU. The other setting, “aggressive,” uses more CPU but usually does a better job of compressing, sometimes much better.

Let’s look at the numbers (benchmark details here).

Comparison of Compression Levels

In this case, we’ve achieved 29x compression!
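As a quick sanity check on what these ratios mean for disk space: a compression ratio of N means the data occupies 1/N of its original size. The helper name below is mine, purely for illustration; the 18x and 29x figures are the ones quoted in this post and the earlier one.

```shell
#!/usr/bin/env bash
# space_saved_pct RATIO -> percentage of the original size that is saved
# (hypothetical helper for illustration; not part of TokuDB)
space_saved_pct() {
    awk -v ratio="$1" 'BEGIN { printf "%.1f\n", 100 * (1 - 1 / ratio) }'
}

echo "18x compression saves $(space_saved_pct 18)% of the original space"
echo "29x compression saves $(space_saved_pct 29)% of the original space"
```

The percentages sit close together because both ratios are already high; the more telling comparison is the on-disk size itself, which goes from 1/18 to 1/29 of the original.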

So when should you use the standard compressor and when should you use the more aggressive compressor? Compression is all done in the background, so it basically depends on the number of cores you have. If you have enough idle cores, the aggressive compressor will not slow down your database — in fact, the following graph shows that you can use TokuDB’s aggressive compressor to improve your overall database performance.

Sysbench performance with different compressors

If you don’t have enough spare cores, then the standard compressor may be better, since in that case, the compressor may contend with other parts of the system for CPU resources. The exact cutoff depends on the particulars of your system, but an easy rule of thumb might be to use standard if you have 6 or fewer cores, and otherwise use aggressive.
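The rule of thumb above is easy to script. This is only a sketch of that rule, not a tuning tool: the function name is mine, and the strings it prints are the informal names used in this post, not the actual ROW_FORMAT values, which you should take from the TokuDB documentation.

```shell
#!/usr/bin/env bash
# choose_compressor [CORES]
# Rule of thumb from the post: 6 or fewer cores -> standard, otherwise aggressive.
# If CORES is omitted, the number of online processors is used.
choose_compressor() {
    local cores="${1:-$(getconf _NPROCESSORS_ONLN)}"
    if [ "$cores" -le 6 ]; then
        echo "standard"
    else
        echo "aggressive"
    fi
}

echo "Suggested compressor for this machine: $(choose_compressor)"
```

Keep in mind that the cutoff depends on how busy those cores already are, so treat the output as a starting point and measure both settings on your own workload.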

In either case, you get great compression. Compression performance is strongly affected by many factors, and we are always on the lookout for interesting use cases, so please post any interesting results you might get with the two settings.

To learn more about TokuDB:

  • Download a free trial of TokuDB.
  • Read the press release here.
  • Hear me talk about TokuDB v6.0 on the MySQL Database Community Podcast in Episode 86.
  • Come to our booth #410 at Percona Live.
  • Catch Tokutek Software Engineer Leif Walsh’s presentation at Percona Live on April 11th at 4:30 pm.
  • Catch Tokutek VP of Marketing Lawrence Schwartz’s Lightning Talk at Percona Live on April 11th at 6:30 pm.


Evidenzia Upgrades to TokuDB v5.2 to Address Storage Growth and Scale Performance

February 27th, 2012

Ensuring sufficient disk I/O to catch copyright violations at network speed.

Evidenzia GmbH & Co. KG

Issues addressed:

  • Storage growth, including maxed-out disk I/O utilization
  • Performance issues and business impact due to slow selects
  • Inability to revise data schema on the fly

The Company: Evidenzia GmbH & Co. KG is one of the leading partners of the software, movie and music industry when it comes to tracing copyright infringements and illegal file sharing activities in peer-to-peer networks. Evidenzia helps copyright owners in protecting their intellectual property. Their powerful technologies enable copyright owners to trace and document illegal file sharing activities in P2P networks reliably. All data and documentation may then be used as evidence in court.

The Challenge: Evidenzia ingests a large amount of logging information each hour. The data not only needs to be processed in parallel for instant reporting, but also has to be stored in case it is ever needed as evidence in a legal case. To meet these needs, Evidenzia logs IP addresses while also performing a connect to each peer. In the process, the software fetches data to match it to the copyrighted material for proof of copyright violation.

“Prior to TokuDB, we were using InnoDB for storing all the data. We found that as the tables grew bigger, the selects were becoming slower, taking as much as an hour or more, and the disk I/O was growing higher,” according to Director of Operations Bastian Axter.

To keep up with the workload, Evidenzia had considered several options, but they failed to meet program performance and price goals. These included:

Flash memory (SSD cache) – Storing all the data on SSD was much too expensive, so Evidenzia considered using an SSD cache inside the RAID controller. After testing this approach, Evidenzia discovered that it would not help because there was still too much data spread randomly across the disk, and the cache could not help with random reads.

Partitioning – “Partitioning was one option that was reviewed to divide up the load,” Axter said. “However, the management overhead that would have been required for all the tables and partitions was excessive. This approach would clearly have introduced more problems than it could have solved and would have resulted in additional management headache.”

The Solution: With TokuDB v5.2, Evidenzia can do all the inserts and selects in parallel and also delete deprecated data out of the same table, without the need to call an “optimize table” or slow down the other processes (insert/select). In addition, the compression of TokuDB tables proved invaluable in keeping the required disk space low.

“The fast indexes and the ability to delete without having to optimize the table, as well as the unique ability of Hot Index addition, really brought home how powerful TokuDB is,” according to Axter. “For these reasons, we were able to convert other tables to TokuDB as well.”

Below is a graph of the disk-usage (I/O max 100%) of the primary database, which shows the dramatic drop in disk I/O at week 46 when Evidenzia deployed TokuDB:

Disk utilization before and after TokuDB

“Most of the I/O came from the long running selects; they are gone since we introduced TokuDB into production,” according to Axter. “The overall impact on disk I/O was impressive, dropping from near 80-100% down to 5-10%.”

The Benefits: 

Cost Savings: With growth in InnoDB, as selects were slowing down, disk I/O was rising. Evidenzia would have had to buy additional drives just to keep up with the I/O. In addition, the compression on InnoDB wasn’t up to the task of being able to significantly shrink the tables on disk. “With TokuDB, we saved over 70% on storage,” according to Axter.

Performance: “There was an immediate impact with selects with TokuDB. These went from taking over an hour down to taking just minutes,” noted Axter. “Not only did TokuDB assist us with the select slowdown from large tables, but it also addressed our problems with deletes. Prior to TokuDB, deletes of already processed and archived data were far too slow because of the huge and slow fragmented indexes.”

Flexibility of Operations: With InnoDB, running “optimize table” to rebuild the indexes was too disruptive to the business since it would block the whole logging process. With TokuDB, however, indexes don’t fragment and so they never require the database to be taken offline to rebuild them.



1 Billion Insertions – The Wait is Over!

January 26th, 2012

iiBench measures the rate at which a database can insert new rows while maintaining several secondary indexes. We ran this for 1 billion rows with TokuDB and InnoDB starting last week, right after we launched TokuDB v5.2. While TokuDB completed it in 15 hours, InnoDB took 7 days.

The results are shown below. At the end of the test, TokuDB’s insertion rate remained at 17,028 inserts/second whereas InnoDB had dropped to 1,050 inserts/second. That is a difference of over 16x. Our complete set of benchmarks for TokuDB v5.2 can be found here.
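For anyone who wants to check the arithmetic, the two ratios can be reproduced from the figures quoted above. The helper name is mine, and the wall-clock ratio treats "7 days" as exactly 168 hours, which is an approximation.

```shell
#!/usr/bin/env bash
# ratio A B -> A divided by B, rounded to one decimal place
# (hypothetical helper for illustration)
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# Insertion rates at the end of the test (inserts/second)
echo "Final insertion-rate ratio: $(ratio 17028 1050)x"

# Total run time: roughly 7 days for InnoDB versus 15 hours for TokuDB
echo "Wall-clock ratio:           $(ratio $((7 * 24)) 15)x"
```

The wall-clock ratio comes out lower than the final-rate ratio because InnoDB starts out faster and slows down as the indexes outgrow memory, so its average rate over the whole run is higher than its rate at the end.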

Benchmark Details: Ubuntu 10.10; 2x Xeon X5460; 16GB RAM; 8x 146GB 10k SAS in RAID10. Each data point is the average insertion rate for the last 2 million rows. 

We developed the iiBench benchmark to measure performance for a use case that occurs commonly in production applications, such as online advertising, social media, and network management.

iiBench simulates a pattern of usage for always-on applications that:

  • Require fast query performance and hence require indexes
  • Have high data insert rates
  • Cannot wait for offline batch processing and hence require the indexes be maintained as data comes in

Note that iiBench was created as an open-source benchmark, which allows others to freely use it, extend it, and contribute their changes back. We originally unveiled the benchmark in the context of a challenge issued at the 2008 OpenSQL Camp. Since then, iiBench has been downloaded and used many times, and ported by the community (in this case, Mark Callaghan) to a Python script.

Please let us know any feedback you have on iiBench. For additional information on…

  • iibench overview click here
  • TokuDB version 5.2 Overview click here
  • TokuDB version 5.2 Performance, including iibench, SysBench, Compression, and TPCC-like, click here


Upgrading Tungsten Replicator: as easy as …

September 23rd, 2011

When I talked about the usability improvements of Tungsten Replicator, I did not mention the procedure for upgrading. I was reminded about it by a question in the TR mailing list, and since the question was very relevant, I updated the Tungsten Cookbook with some quick upgrading instructions.

A quick upgrading procedure is as important as the installer. Since we release software quite often, either because we have scheduled features to release or because of bug fixes, users want to apply a new release to an existing installation without much fuss. You can do the upgrade with a very quick and painless procedure.

Let's suppose that you have installed one Tungsten Replicator cluster using this command:

#
# using tungsten-replicator 2.0.4
#
TUNGSTEN_HOME=/home/tungsten/installs/master_slave
./tools/tungsten-installer \
--master-slave \
--master-host=r1 \
--datasource-user=tungsten \
--datasource-password=secret \
--service-name=dragon \
--home-directory=$TUNGSTEN_HOME \
--cluster-hosts=r1,r2,r3,r4 \
--start-and-report
If you want to upgrade to the very latest Tungsten Replicator 2.0.5, build 321, this is what you need to do.
  • Get the latest tarball, and expand it;
  • Stop the replicator;
  • Run the update command (this will also restart the replicator);
  • Check that the replicator is running again.
The actual upgrade command is the ./tools/update line in the following script.

#
# using tungsten-replicator 2.0.5-321 (get it from bit.ly/tr20_builds)
#
TUNGSTEN_HOME=/home/tungsten/installs/master_slave
HOSTS=(r1 r2 r3 r4)
for HOST in "${HOSTS[@]}"
do
    # stop the replicator on the remote host
    ssh "$HOST" "$TUNGSTEN_HOME/tungsten/tungsten-replicator/bin/replicator" stop
    # the actual upgrade command (it also restarts the replicator)
    ./tools/update --host="$HOST" --user=tungsten --release-directory="$TUNGSTEN_HOME" -q
    # check that the replicator is running again
    "$TUNGSTEN_HOME/tungsten/tungsten-replicator/bin/trepctl" -host "$HOST" services
done
One benefit of this procedure, in addition to being brief and effective, is that the previous binaries are preserved. Before the upgrade, you will see:

$ ls -lh ~/installs/master_slave/ ~/installs/master_slave/releases
/home/tungsten/installs/master_slave/:
total 32K
drwxrwxr-x 3 tungsten tungsten 4.0K Sep 22 22:03 backups
drwxrwxr-x 2 tungsten tungsten 4.0K Sep 22 22:03 configs
drwxrwxr-x 3 tungsten tungsten 4.0K Sep 22 22:03 relay
drwxrwxr-x 4 tungsten tungsten 4.0K Sep 22 22:06 releases
drwxrwxr-x 2 tungsten tungsten 4.0K Sep 22 22:03 service-logs
drwxrwxr-x 2 tungsten tungsten 4.0K Sep 22 22:03 share
drwxrwxr-x 3 tungsten tungsten 4.0K Sep 22 22:03 thl
lrwxrwxrwx 1 tungsten tungsten 75 Sep 22 22:06 tungsten -> /home/tungsten/installs/master_slave/releases/tungsten-replicator-2.0.4
/home/tungsten/installs/master_slave/releases:
total 8.0K
drwxr-xr-x 6 tungsten tungsten 4.0K Sep 22 22:03 tungsten-replicator-2.0.4
The 'tungsten' directory is a symlink to the actual binaries inside the 'releases' directory. After the upgrade, the same directory looks like this:

$ ls -lh ~/installs/master_slave/ ~/installs/master_slave/releases
/home/tungsten/installs/master_slave/:
total 32K
drwxrwxr-x 3 tungsten tungsten 4.0K Sep 22 22:03 backups
drwxrwxr-x 2 tungsten tungsten 4.0K Sep 22 22:03 configs
drwxrwxr-x 3 tungsten tungsten 4.0K Sep 22 22:03 relay
drwxrwxr-x 4 tungsten tungsten 4.0K Sep 22 22:06 releases
drwxrwxr-x 2 tungsten tungsten 4.0K Sep 22 22:03 service-logs
drwxrwxr-x 2 tungsten tungsten 4.0K Sep 22 22:03 share
drwxrwxr-x 3 tungsten tungsten 4.0K Sep 22 22:03 thl
lrwxrwxrwx 1 tungsten tungsten 75 Sep 22 22:06 tungsten -> /home/tungsten/installs/master_slave/releases/tungsten-replicator-2.0.5-321

/home/tungsten/installs/master_slave/releases:
total 8.0K
drwxr-xr-x 6 tungsten tungsten 4.0K Sep 22 22:03 tungsten-replicator-2.0.4
drwxr-xr-x 6 tungsten tungsten 4.0K Sep 22 22:06 tungsten-replicator-2.0.5-321
If you made manual changes to the files in 2.0.4, you will be able to retrieve them. Upgrading from earlier versions of Tungsten Replicator is not as smooth. Since we changed the installation format, it has become incompatible with previous versions. Clusters running TR 2.0.3 need to be reinstalled manually. The next upgrade, though, will be much faster!
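If you want to confirm which release is active without reading the full listing, you can ask the symlink directly. This is a small sketch based only on the directory layout shown above; active_release is a hypothetical helper of mine, not a Tungsten Replicator tool, and the default path is the one used in this post.

```shell
#!/usr/bin/env bash
# Report which release the 'tungsten' symlink currently points to.
# active_release is a hypothetical helper, not part of Tungsten Replicator.
active_release() {
    # $1 is the installation home; print only the release directory name
    basename "$(readlink "$1/tungsten")"
}

TUNGSTEN_HOME="${TUNGSTEN_HOME:-/home/tungsten/installs/master_slave}"

if [ -L "$TUNGSTEN_HOME/tungsten" ]; then
    echo "Active release: $(active_release "$TUNGSTEN_HOME")"
    echo "Releases kept on disk:"
    ls -1 "$TUNGSTEN_HOME/releases"
else
    echo "No 'tungsten' symlink found under $TUNGSTEN_HOME"
fi
```

Run it on each host after the upgrade; the active release should be the new build, with the previous one still listed under 'releases'.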


Dude, Where’s my Fractal Tree?

July 18th, 2011

Unless you are Ashton Kutcher (@aplusk), or one of his Hollywood buddies, you don’t need to read any further. Allow me to explain…

Over the weekend, we launched our new website. This type of announcement used to be interesting in the high-tech world. I heard Kara Swisher of the WSJ’s All Things D speak at a MassTLC event in May. She admitted that back in the 1990s, when the web was just getting into high gear, a new website from an interesting company might actually get some coverage. Not anymore.

I’ve also been told at all the SEO classes I’ve taken that as much as marketing folks sweat over every detail, link, font and color, none of it matters. As far as Google is concerned, your site could be set entirely in 5-point grey Courier on a black background – as long as you have the right keywords and lots of links to you, you’ll be ranked well and the right people can find you.

So, who can I share my excitement with over our new site? It occurred to me this weekend – Ashton Kutcher and his Hollywood pals!

With Ashton’s recent investment in MemSQL maybe Hollywood is finally getting hip to databases! If that’s the case, then Ashton, do we have a site for you! The all new Tokutek.com brings you:

We hope you’ll swing by and check us out!



Update on “A Tale Of a Bug”

August 4th, 2010

The bug I talked about a little while ago has now also had the fix I wrote committed to the mysql-trunk 5.5.6-m3 repository.



OpenSQL Camp Europe: Time to cast your votes!

July 15th, 2010

If you wonder why there hasn't been an update from me for quite a while: I just returned from two months of paternity leave, in which I actually managed to stay away from the PC most of the time. In the meantime, I've officially become an Oracle employee and there are a lot of administrative things to take care of... But it feels good to be back!

During my absence, Giuseppe and Felix kicked off the Call for Papers for this year's European OpenSQL Camp, which will again take place in parallel to FrOSCon in St. Augustin (Germany) on August 21st/22nd. We've received a number of great submissions, and now we would like to ask you, our community, about your favourites!

Basically it's "one vote per person per session" and you can cast your votes in two ways, either by twittering @opensqlcamp or via the opensqlcamp mailing list. The procedure is outlined in more detail on this wiki page.

As we need to finalize the schedule and inform the speakers, the voting period will close this coming Sunday, 18th of July. So don't hesitate, cast your votes now! Based on your feedback we will compile the session schedule for this year's camp. Thanks for your help!

