Archive for the ‘TokuDB’ Category

TokuDB v6.0: Frequent Checkpoints with No Performance Hit

April 13th, 2012

Checkpointing, which involves periodically writing out dirty pages from memory, is central to the design of crash recovery for both TokuDB and InnoDB. A key issue in designing a checkpointing system is how often to checkpoint, and TokuDB takes a very different approach from InnoDB. InnoDB’s checkpoint frequency depends on the amount of fuzzy checkpointing, the log-file size, and how often memory fills with dirty pages, but the upshot is that it runs a checkpoint infrequently. TokuDB runs a checkpoint one minute after the last one ended.

Frequent checkpoints make for fast recovery. Once MySQL crashes, the storage engine needs to replay the log to get back to a correct state. The length of the log is a function of the time since the last checkpoint. And replaying the log is single threaded. So TokuDB recovers in minutes, and usually much faster. If InnoDB crashes late in its checkpoint cycle it can take hours or more to recover. Indeed, there is considerable lore around making InnoDB recover faster.
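To make the relationship between checkpoint frequency and recovery time concrete, here is a rough back-of-the-envelope sketch. It assumes the log grows at a steady rate and is replayed by a single thread at a steady rate; the function name and both rates are hypothetical numbers chosen for illustration, not measurements of TokuDB or InnoDB.

```python
def replay_seconds(seconds_since_checkpoint, log_bytes_per_second, replay_bytes_per_second):
    """Estimate single-threaded log replay time after a crash.

    All inputs are hypothetical; real rates depend on workload and hardware.
    """
    # The log to replay grows with the time since the last checkpoint.
    log_bytes = seconds_since_checkpoint * log_bytes_per_second
    # Replay is single threaded, so time is simply bytes / replay rate.
    return log_bytes / replay_bytes_per_second


# Hypothetical rates: 10 MB/s of log written, 20 MB/s replayed.
frequent = replay_seconds(60, 10e6, 20e6)           # crash one minute after a checkpoint
infrequent = replay_seconds(2 * 3600, 10e6, 20e6)   # crash two hours after a checkpoint
print(frequent, infrequent)
```

Under these made-up rates, the one-minute case replays in seconds while the two-hour case takes an hour, which is the shape of the argument above even if the real numbers differ.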

So what’s the downside to frequent checkpoints? Up until now, the answer was simple: when you are in a checkpoint, your performance drops. This was famously illustrated for InnoDB when Vadim Tkachenko at Percona Consulting showed that MySQL could become completely unresponsive for minutes at a time during an InnoDB checkpoint. We see a similar outcome here:

In this case we see a stall in which the throughput drops to around 25%, and the stall lasts for minutes. I want to stress that fuzzy checkpointing was designed to help avoid catastrophic checkpoints, and sometimes it works, but the tpcc benchmark shows that it doesn’t always work.

In previous versions of TokuDB, we also had a dip in performance associated with checkpoints, but frequent checkpoints are also smaller checkpoints, so our performance would drop to around 80% of peak for a couple of seconds. A drop to 80% is better than a drop to 25%, but we knew we could do better.

And we did. As of TokuDB v6.0, we’ve eliminated the performance variability from checkpointing. We’re still checkpointing just as frequently, so you still get fast recovery. How? It was a combination of reducing the amount of work a checkpoint needs to do and fixing the locking interaction between checkpoints and other operations. Below is a sysbench benchmark. This is a case where InnoDB checkpoint behavior is as good as it gets, and I wanted to compare us with InnoDB’s best case, not its worst case.

Sysbench performance with different compressors

This graph shows that TokuDB v6.0 has no checkpoint variability. It turns out that TokuDB v6.0 with standard compression has about the same average TPS as TokuDB v5.2, but with no checkpointing artifacts. Finally, if you have the CPU budget for it, turning on aggressive compression gives a big boost in transactions per second, still with no checkpointing variability.

So in a nutshell, I feel like we’ve taken care of the checkpointing issue. As of TokuDB v6.0, we have the upside of frequent checkpoints — small logs and fast recovery — without the downside of variability. The engineering team at Tokutek is pretty proud of these results.

To learn more about TokuDB:

  • Download a free trial of TokuDB.
  • Read the press release here.
  • Hear me talk about TokuDB v6.0 on the MySQL Database Community Podcast in Episode 86.
  • Come to our booth #410 at Percona Live.


TokuDB v6.0: Even Better Compression

April 11th, 2012

A key feature of our new TokuDB v6.0 release, which I have been blogging about this week, is compression. Compression is always on in TokuDB, and the compression we’ve achieved in the past has been quite good. See a previous post on the 18x compression achieved by TokuDB v5.0 on one benchmark. In our latest release, we’ve updated the way compression works and got a 50% improvement in compression.

I decided to present numbers on the same set of data as the old post, so see that post for experimental details.

But first, what are the changes? TokuDB compresses large blocks of data — on the order of MB, rather than the 16KB that InnoDB uses — which is a big part of why we can get better compression. For InnoDB, compression is attempted on 16KB pieces, with inefficiencies if the block compresses too little or too much. InnoDB’s compression woes are well documented.

In TokuDB v6.0, you can choose between two types of compression by setting the ROW_FORMAT in the CREATE TABLE or ALTER TABLE commands. One compression setting, “standard,” uses less CPU. The other setting, “aggressive,” uses more CPU but usually does a better job of compressing, sometimes much better.

Let’s look at the numbers (benchmark details here).

Comparison of Compression Levels

In this case, we’ve achieved 29x compression!

So when should you use the standard compressor and when should you use the more aggressive compressor? Compression is all done in the background, so it basically depends on the number of cores you have. If you have enough idle cores, the aggressive compressor will not slow down your database — in fact, the following graph shows that you can use TokuDB’s aggressive compressor to improve your overall database performance.

Sysbench performance with different compressors

If you don’t have enough spare cores, then the standard compressor may be better, since in that case, the compressor may contend with other parts of the system for CPU resources. The exact cutoff depends on the particulars of your system, but an easy rule of thumb might be to use standard if you have 6 or fewer cores, and otherwise use aggressive.
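The rule of thumb above can be written down as a small helper. This is only a sketch: the function name and the returned strings are labels for the two settings described in this post, not actual ROW_FORMAT values, and the cutoff of 6 cores is just the rough guideline given above.

```python
import os


def choose_compressor(cores=None, cutoff=6):
    """Return "standard" or "aggressive" using the post's rule of thumb.

    The returned strings are descriptive labels only (hypothetical),
    not the literal ROW_FORMAT values accepted by TokuDB.
    """
    if cores is None:
        # os.cpu_count() may return None if the count cannot be determined.
        cores = os.cpu_count() or 1
    return "standard" if cores <= cutoff else "aggressive"


print(choose_compressor(4))   # few cores: prefer the cheaper compressor
print(choose_compressor(16))  # many cores: spare CPU for aggressive compression
```

Total core count is only a proxy for idle cores, so treat the result as a starting point and measure on your own workload.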

In either case, you get great compression. Compression performance is strongly affected by many factors, and we are always on the lookout for interesting use cases, so please post any interesting results you might get with the two settings.

To learn more about TokuDB:

  • Download a free trial of TokuDB.
  • Read the press release here.
  • Hear me talk about TokuDB v6.0 on the MySQL Database Community Podcast in Episode 86.
  • Come to our booth #410 at Percona Live.
  • Catch Tokutek Software Engineer Leif Walsh’s presentation at Percona Live on April 11th at 4:30 pm
  • Catch Tokutek VP of Marketing Lawrence Schwartz’s Lightning Talk at Percona Live on April 11th at 6:30 pm


TokuDB v6.0: Getting Rid of Slave Lag

April 10th, 2012

Master/slave replication is an important tool that gets used in many ways: distributing read loads among many slaves for performance, using a slave for backups so the master can handle live load, geographically distributed disaster recovery, etc. The Achilles’ heel of slave performance is that slave workloads are single-threaded. The master can have many clients inserting, updating, and querying, whereas the slave has only one insertion client: the master. InnoDB single-client performance is much slower than its multi-client performance, which means that the bottleneck in a master/slave system is often the rate at which a slave can keep up.

If the master has an average transactions per second (tps) that is higher than what the slave can handle, the slave will fall further and further behind. If the slaves are being used to distribute read workload, for example, the results they produce will fall further out of date. If a slave is used to generate a backup (e.g. the slave is taken offline to produce a backup snapshot), then the slave has a harder time catching up with the master once it comes back online, and if it never catches up, the value of a backup is reduced.

So slave lag caused by single-client performance is a big problem. The good news is that TokuDB has enough data ingestion horsepower that it can keep up with some big single-threaded workloads. We’ve been able to show this with our newly released TokuDB v6.0.

Here’s what we did to measure the impact of slave lag. We made a version of TPCC that generates transactions at some user-definable rate. We fed the transactions to a master/slave combo for 60s. At the rates we measured, the master was able to finish all transactions during the 60s window. Then we waited to see how long it would take the slave to finish its work. At 1000tps, both TokuDB and InnoDB slaves were able to complete the work in 60s, which means there was no slave lag. By the time we got to 3000tps, InnoDB was taking over 140s to finish the work, meaning that it was falling more than 80s behind every 60s. TokuDB was still keeping up.
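The lag arithmetic in this experiment is simple enough to sketch. The helper names below are made up for illustration, and the projection assumes the slave keeps falling behind at a constant rate, which is a simplification of real replication behavior.

```python
def lag_per_window(window_seconds, slave_complete_seconds):
    """Seconds the slave falls behind for each window of master work."""
    # A slave that finishes within the window has no lag.
    return max(0, slave_complete_seconds - window_seconds)


def lag_after(total_seconds, window_seconds, slave_complete_seconds):
    """Projected lag after running at the same rate for total_seconds.

    Assumes the lag accumulates linearly (a simplifying assumption).
    """
    windows = total_seconds / window_seconds
    return windows * lag_per_window(window_seconds, slave_complete_seconds)


print(lag_per_window(60, 60))    # slave keeps up: no lag
print(lag_per_window(60, 140))   # slave needs 140s for 60s of work
print(lag_after(3600, 60, 140))  # projected lag after one hour at that rate
```

With the 3000tps figure above (140s to finish 60s of work), the slave loses 80s per minute, so under this linear assumption an hour of sustained load would leave it well over an hour behind.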

Slave Complete Time

You can find all the details of the experiment here.

In TokuDB v6.0 we are introducing XA (two-phase transactions), which is a common way that binlog replication has been implemented in MySQL. Combined with great slave performance, this makes TokuDB a great choice for replication.

To learn more about TokuDB:

  • Download a free trial of TokuDB.
  • Read the press release here.
  • Hear me talk about TokuDB v6.0 on the MySQL Database Community Podcast in Episode 86.
  • Come to our booth #410 at Percona Live.
  • Catch Tokutek Software Engineer Leif Walsh’s presentation at Percona Live on April 11th at 4:30 pm
  • Catch Tokutek VP of Marketing Lawrence Schwartz’s Lightning Talk at Percona Live on April 11th at 6:30 pm


Announcing TokuDB v6.0: Less Slave Lag and More Compression

April 9th, 2012

We are excited to announce TokuDB® v6.0, the latest version of Tokutek’s flagship storage engine for MySQL and MariaDB.

This version offers feature and performance enhancements over previous releases, support for XA (two-phase transactional commits), better compression, and reduced performance variability associated with checkpointing. This release also brings TokuDB support up to date on MySQL v5.1, MySQL v5.5 and MariaDB v5.2. There’s a lot of great technical stuff under the hood in this release and I’ll be reviewing the improvements one-by-one over the course of this week.

I’ll be posting more details about the new features and performance, so here’s an overview of what’s in store.

Replication Slave Lag
One of the things TokuDB does well is single-threaded insertions, which translates directly into less slave lag. With TokuDB v6.0, we introduce support for XA, which ensures a more robust environment for many replication use cases. High insertion rate and XA support make TokuDB a drop-in replacement for InnoDB in replication environments. In the next blog, I’ll be giving some performance numbers.

Compression
TokuDB has great compression. Starting with TokuDB v6.0, you’ll have a choice between standard compression and aggressive compression. Aggressive compression uses more cores and usually does a significantly better job at compressing. I’ll get into the details of our new compression feature in another post.

Checkpoint variability
TokuDB checkpoints frequently, which makes recovery super-fast. InnoDB checkpoints infrequently, because checkpoints slow InnoDB performance significantly. Our customers tell us they require stable and stall-free performance, even in the face of checkpointing. With TokuDB v6.0 we deliver that, and with no drop in throughput. We think you’ll be happy with the results: frequent checkpoints and fast recovery with no performance hit! Details are forthcoming.

Performance
This release continues our improvements for multi-client scaling and in-memory performance. We’ve made great strides. Numbers for this in an upcoming post.

TokuDB v6.0 maintains all our established advantages: fast trickle load, fast bulk load, fast range queries through clustering indexes, hot schema changes, no fragmentation, and full MySQL compatibility for ease of installation. See our benchmark page for details.

Replication, compression, reduced variability, improved performance and support for MySQL v5.5. Enjoy!



OLTP and OLAP – Have Your Cake and Eat it Too!

April 2nd, 2012

Looks like we’ll be having some more fun at the Percona Live MySQL Conference! In addition to our booth and my colleague Tim’s talk, my lightning talk was accepted. The title is “OLTP and OLAP – Have Your Cake and Eat it Too!” The lightning talks, given in a TBD order, will start Wednesday evening (April 11th) at around 6:30 pm.

Below is the abstract I submitted.

 

OLTP and OLAP – Have Your Cake and Eat it Too!

The real end-game for Big Data is to have transactional and analytic data on the same database. Just imagine the maximum value one could get out of designing analytics as part of operations.

Databases used to be all-serving 40 years ago, with one type of indexing structure used ubiquitously. This all worked well with the simple drives and applications of the time. As the decades wore on, applications, storage, and database needs evolved. In response, the basic database diverged into all sorts of specialty products (e.g., column store, NoSQL, etc.). However, it needn’t be this way.

We’ll walk you through the decades and show the path from single database to pluralism back to single database. All in 5 minutes!



Looking for Global Collisions

March 28th, 2012

On Monday, I took a break from planning for the upcoming Percona Live MySQL Conference (where we have a session, lightning talk, booth, and other misc activities planned) to attend the UK-Massachusetts Innovation Economies Conference at the MIT Media Lab. The event featured Gov. Deval Patrick, MIT Media Lab Director Joi Ito, industry experts such as Sheila Marcelo (founder of Care.com), and UK trade representatives, including Minister Mark Prisk. A key topic was entrepreneurship, including how breakthroughs happen.

From Left: Consul General Phil Budden, Gov. Deval Patrick, Mark Prisk MP, and Joi Ito

The discussion started with a focus on home: the Commonwealth of Massachusetts. First, Patrick noted that MA is doing well in entrepreneurship, with $2.7B in VC in 2010 and the top position in the Kauffman New Economy Index. Next, Ito got into details around technology models, noting that he is trying to spread the proven low-cost software model to other areas that MA shines in, such as big data, infrastructure, life sciences, and robotics. Finally, consulate members noted how Boston has become an example of entrepreneurship for the UK. Stew McTavish, who was here from ideaSpace in the UK, has been looking at our incubator and early-stage centers, such as the CIC, to learn more. He engaged Akhil Nigam, from MassChallenge, on this topic.

In their discussion, Nigam and McTavish spoke about how important it is to facilitate collisions, or meetings of random people with similar goals and interests. The analogy they used was particle physics, where the most interesting phenomena occur when things collide. Stew noted that there are two ways to “increase collisions” for startups: make them move faster (give them more money, advice, talent, etc.) and make the space smaller (stick many diverse groups in a building, like at the CIC).

Do collisions help? McTavish had a real life example from ideaSpace. He spoke about how one investor left a meeting to go get some coffee after realizing the CEO’s pitch wasn’t a fit for his fund.  He bumped into another entrepreneur working down the hall, started talking, and liked what he heard. Six months later, he ended up investing £200k in the second company. Proof positive that unexpected hallway “collisions” can be beneficial.

It’s exactly this type of frequent, active, and interesting set of global collisions that I look forward to next month at the Percona Live MySQL conference. While the talks are generally excellent, much of the value of the show is from having a talented and diverse crowd come together with unexpected hallway (or Pedro’s) conversations where solutions are hatched, partnerships are forged, and networks are built.

 




Win Free MySQL Conference Tickets!

March 19th, 2012

Tokutek and Percona are giving away free tickets to the Percona Live MySQL Conference and Expo (worth $995 each), and you can win them! We’re also giving away copies of High Performance MySQL, 3rd Edition (worth $55 each).

This year’s event is the best ever, with a better lineup of talks and speakers than ever before.  It’s the one event you should not miss if you’re at all interested in MySQL.  We really want you to be there — and that’s why we’re joining with Percona to give away free tickets! It’s easy to enter:

  • Follow our Twitter feed, and retweet us when we mention this contest
  • Tweet “My favorite #MySQL conference session” with a link to your favorite
  • “Like” your favorite conference session with Facebook
  • +1 your favorite conference session via Google Plus

To Tweet, “like,” or +1 a session, just browse to the session and use the social sharing buttons on it.

It is OK to enter multiple times — each time you enter increases your chances of winning. The contest runs until Thursday, so you can enter on multiple days and increase your odds further.

The official contest rules, including more ways to enter the contest, are on Percona’s blog post.

Good luck!



Big Data and MySQL – a Discussion with SiliconANGLE on theCUBE

March 13th, 2012

Given all the focus and hype on Big Data, I was excited to have the chance at the recent O’Reilly Strata Show to sit down with Jeff Kelly, one of the top rated “Big Data” analysts, to give a MySQL perspective. Below is my interview with Jeff Kelly and David Floyer.

http://siliconangle.tv/video/cube-strata-conference-2012-lawrence-schwartz

In the segment, you’ll find a number of topics. These include indexing technology, NoSQL vs. MySQL, when to use flash drives, how to avoid partitioning, and customer use cases.

David makes a particularly salient point in the discussion. He notes that the real end-game for Big Data is to have transactional and analytic data on the same database. His thought was that one could get maximum value out of designing analytics as part of operations. We think this is a critical area too, and it is also one thing that TokuDB does really well. TokuDB plays here by allowing such high insertion rates that it is easy to stand up many indexes and improve query performance on freshly arriving data. Likewise, having hot schema changes gives analysts a lot of flexibility to slice and dice the data in a very dynamic fashion.

Jeff ended the conversation by asking what Big Data can do for society as a whole, a big picture question he posed to everyone on theCube at Strata. Our example for this was one of our users who is leveraging machine data for exploration of astrophysical phenomena with a fleet of satellites. There were other great examples from the conference keynote speeches as well. While on a day-to-day level we get caught up in the details of our data technologies, I hope the big picture topics get some exploration as well at the upcoming Percona MySQL conference.  It’s these big picture questions that drive new and exciting approaches to Big Data and can also serve as the motivator for new talent to enter into the industry.



O’Reilly Strata 2012: The Year of the Data Scientist

March 5th, 2012

We had the privilege this past week to be invited to be part of the 2012 O’Reilly Strata “Making Data Work” Conference. Some of our photos from the event are here. At the event, we were excited to have Tokutek described in front of the approximately 2,500 attendees during the keynote sessions.

Overall, the diversity of topics discussed at the conference was impressive, spanning databases, developer tools, data visualization techniques, customer stories, and business implications. The full agenda is here.

For those who missed it, here are some great resources:

At the show, Tokutek was one of ten companies selected for the Startup Showcase. In this process, we were the only database company to receive an honorable mention.

We had a number of great conversations with participants at the show. Common themes and questions we received around MySQL focused on how to scale performance of MySQL, when to consider flash drives or more RAM, and considerations for keeping MySQL + TokuDB over going to NoSQL.

As part of the show, I also had the chance to talk with O’Reilly’s Mac Slocum about Tokutek.

With all the interest in Big Data, Tim O’Reilly summed up the conference well, saying “data science is the new black”. 2012 is clearly the year of the data scientist – and we have the database that will make him or her successful.

 

