Archive for the ‘benchmarking’ Category

TokuDB v6.0: Download Available

April 30th, 2012

TokuDB v6.0 is full of great improvements, like getting rid of slave lag, better compression, improved checkpointing, and support for XA.

I’m happy to announce that TokuDB v6.0 is now generally available and can be downloaded here.

Sysbench Performance

I wanted to take this time to talk about one more under-the-hood goody we’ve added to v6.0. In particular, we’ve been working on our locking schemes and have made some nice improvements in multi-threaded performance. In TokuDB v5.2, we outperformed InnoDB on sysbench by about 20% out to 64 threads. The following shows the performance of TokuDB v6.0 vs InnoDB on the same test:

InnoDB now has better multi-threading as well, so with standard compression on, we are now neck and neck with InnoDB out to 64 client threads, and then pull ahead out to 1024 client threads. With high compression, we top out at 72% faster than InnoDB!

We hope you enjoy this and all the other TokuDB v6.0 improvements.

To learn more about TokuDB:

  • Read the press release here.
  • Hear me talk about TokuDB v6.0 on the MySQL Database Community Podcast in Episode 86.
  • Read the Bloor Research Report on TokuDB v6.0 here.


Benchmarking your MySQL servers

April 17th, 2012


Benchmarking tools like Sysbench and DBT2 have helped a lot of DBAs measure the performance of their MySQL databases. Benchmarking shows you how far your current setup will really go. In this part, you will learn how to install sysbench on Ubuntu and on enterprise Linux distributions.

1. Installing sysbench on Ubuntu is as easy as issuing an apt-get command. You can also download the source from SourceForge.

On Ubuntu, run the following:


sudo apt-get install sysbench

2. Installing on enterprise Linux such as Oracle Linux or Red Hat Linux: download the RPM file from rpmfind.net or any other site that provides RPM downloads. You can also check this source.

3. Once the file is downloaded, try to install it. The install will fail if rpm cannot find the required dependencies.


rpm -ivh sysbench-0.4.12-5.el6.x86_64.rpm

In this example, I used a 64-bit machine with Oracle Linux 6.0 installed. Several dependencies are required before sysbench can be installed.

1. The machine must have the GCC compiler. To install it:

yum install gcc

Yum will not work on an unregistered Linux system, so to get this done you need to create a local repository.

2. Other dependencies


libcrypto.so.10; libssl.so.10 (openssl-1.0.0-4.el6.x86_64.rpm)
libgssapi_krb5.so.2 (krb5-libs-1.8.2-3.el6.x86_64.rpm)
libldap_r-2.4.so.2 (openldap-2.4.19-15.el6.x86_64.rpm)
libpq.so.5 (postgresql-libs-8.4.4-2.el6.x86_64.rpm)
libc.so.6; libcrypt.so.1; libm.so.6; libpthread.so.0 (glibc-2.12-1.7.el6.x86_64.rpm)
libfreebl3.so (nss-softokn-freebl-3.12.7-1.1.el6.x86_64.rpm)

All these dependencies are available on your Linux DVD/ISO installation media.

Sysbench Test Modes

1. Create the 'sbtest' database first.
2. Prepare the database:


sysbench --db-driver=mysql --test=oltp --mysql-table-engine=innodb --oltp-table-size=1000000 --mysql-socket=/var/lib/mysql/mysql.sock --mysql-user=root --mysql-password=yourpassword prepare
This command creates the 'sbtest' table and inserts 1M records.

3. OLTP Test

OLTP Read Only

sysbench --db-driver=mysql --num-threads=16 --max-requests=100000 --test=oltp --oltp-table-size=1000000 --mysql-socket=/var/lib/mysql/mysql.sock --oltp-read-only --mysql-user=root --mysql-password=yourpassword run

OLTP Read+Write

sysbench --db-driver=mysql --num-threads=16 --max-requests=100000 --test=oltp --oltp-table-size=1000000 --mysql-socket=/var/lib/mysql/mysql.sock --oltp-test-mode=complex --mysql-user=root --mysql-password=yourpassword run
This command runs the actual benchmark with 16 client threads, limiting the total number of requests to 100,000.

4. CPU Test

sysbench --test=cpu --cpu-max-prime=20000 run

5. Thread Test - This test mode was written to benchmark scheduler performance.

sysbench --num-threads=64 --test=threads --thread-yields=100 --thread-locks=2 run
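
To repeat the OLTP runs above at several thread counts, the command line can be assembled programmatically. The following is a minimal Python sketch and not part of sysbench: the helper name oltp_command and the thread counts in the loop are hypothetical, while the flags are the ones used in the commands above.

```python
import shlex

def oltp_command(threads, read_only, socket="/var/lib/mysql/mysql.sock",
                 user="root", password="yourpassword"):
    """Build the argument list for one sysbench OLTP run (nothing is executed)."""
    args = [
        "sysbench",
        "--db-driver=mysql",
        f"--num-threads={threads}",
        "--max-requests=100000",
        "--test=oltp",
        "--oltp-table-size=1000000",
        f"--mysql-socket={socket}",
    ]
    # Read-only runs use --oltp-read-only; read+write runs use the complex mode.
    args.append("--oltp-read-only" if read_only else "--oltp-test-mode=complex")
    args += [f"--mysql-user={user}", f"--mysql-password={password}", "run"]
    return args

# Hypothetical sweep: print one command line per thread count.
for n in (1, 4, 16, 64):
    print(shlex.join(oltp_command(n, read_only=True)))
```

To actually launch a run, the returned list could be passed to subprocess.run(args, check=True). Building a list rather than a single string avoids shell-quoting problems with the password.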


TokuDB v6.0: Frequent Checkpoints with No Performance Hit

April 13th, 2012

Checkpointing, which involves periodically writing out dirty pages from memory, is central to the design of crash recovery for both TokuDB and InnoDB. A key issue in designing a checkpointing system is how often to checkpoint, and TokuDB takes a very different approach from InnoDB. The frequency of InnoDB checkpoints depends on the amount of fuzzy checkpointing, the log-file size, and how often the memory fills with dirty pages, but the upshot is that it runs a checkpoint infrequently. TokuDB runs a checkpoint one minute after the last one ended.

Frequent checkpoints make for fast recovery. Once MySQL crashes, the storage engine needs to replay the log to get back to a correct state. The length of the log is a function of the time since the last checkpoint. And replaying the log is single threaded. So TokuDB recovers in minutes, and usually much faster. If InnoDB crashes late in its checkpoint cycle it can take hours or more to recover. Indeed, there is considerable lore around making InnoDB recover faster.
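
The link between checkpoint age and recovery time can be made concrete with a rough model. The Python sketch below assumes the log grows at a constant rate and is replayed at a constant rate. Both rates and the function name are hypothetical values chosen only for illustration; they are not measurements of TokuDB or InnoDB.

```python
def replay_seconds(seconds_since_checkpoint, log_write_mb_per_s, replay_mb_per_s):
    """Estimate single-threaded replay time for the log written since the last checkpoint."""
    log_mb = seconds_since_checkpoint * log_write_mb_per_s
    return log_mb / replay_mb_per_s

# Hypothetical rates: 10 MB/s of log written, 5 MB/s replayed.
for label, age in (("1 minute", 60), ("1 hour", 3600)):
    print(label, "since checkpoint ->", replay_seconds(age, 10, 5), "s of replay")
```

In this simplified model, recovery time grows linearly with the age of the last checkpoint, which is why a short checkpoint interval bounds the recovery time.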

So what’s the downside to frequent checkpoints? Up until now, the answer was simple: when you are in a checkpoint, your performance drops. This was famously illustrated for InnoDB when Vadim Tkachenko at Percona Consulting showed that MySQL could become completely unresponsive for minutes at a time during an InnoDB checkpoint. We see a similar outcome here:

In this case we see a stall in which the throughput drops to around 25%, and the stall lasts for minutes. I want to stress that fuzzy checkpointing was designed to help avoid catastrophic checkpoints, and sometimes it works, but the tpcc benchmark shows that it doesn’t always work.

In previous versions of TokuDB, we also had a dip in performance associated with checkpoints, but frequent checkpoints are also smaller checkpoints, so our performance would drop to around 80% of peak for a couple of seconds. A drop to 80% is better than a drop to 25%, but we knew we could do better.

And we did. As of TokuDB v6.0, we’ve eliminated the performance variability from checkpointing. We’re still checkpointing just as frequently, so you still get fast recovery. How? It was a combination of reducing the amount of work a checkpoint needs to do and fixing the locking interaction between checkpoints and other operations. Below is a sysbench benchmark. This is a case where InnoDB checkpoint behavior is as good as it gets, and I wanted to compare us with InnoDB’s best case, not its worst case.

Sysbench performance with different compressors

This graph shows that TokuDB v6.0 has no checkpoint variability. It turns out that TokuDB v6.0 with standard compression has about the same average TPS as TokuDB v5.2, but with no checkpointing artifacts. Finally, if you have the CPU budget for it, turning on aggressive compression gives a big boost in transactions per second, still with no checkpointing variability.

So in a nutshell, I feel like we’ve taken care of the checkpointing issue. As of TokuDB v6.0, we have the upside of frequent checkpoints — small logs and fast recovery — without the downside of variability. The engineering team at Tokutek is pretty proud of these results.

To learn more about TokuDB:

  • Download a free trial of TokuDB.
  • Read the press release here.
  • Hear me talk about TokuDB v6.0 on the MySQL Database Community Podcast in Episode 86.
  • Come to our booth #410 at Percona Live.


TokuDB v6.0: Even Better Compression

April 11th, 2012

A key feature of our new TokuDB v6.0 release, which I have been blogging about this week, is compression. Compression is always on in TokuDB, and the compression we’ve achieved in the past has been quite good. See a previous post on the 18x compression achieved by TokuDB v5.0 on one benchmark. In our latest release, we’ve updated the way compression works and got a 50% improvement in compression.

I decided to present numbers on the same set of data as the old post, so see that post for experimental details.

But first, what are the changes? TokuDB compresses large blocks of data — on the order of MB, rather than the 16KB that InnoDB uses — which is a big part of why we can get better compression. For InnoDB, compression is attempted on 16KB pieces, with inefficiencies if the block compresses too little or too much. InnoDB’s compression woes are well documented.

In TokuDB v6.0, you can choose between two types of compression by setting the ROW_FORMAT in the CREATE TABLE or ALTER TABLE commands. One compression setting, “standard,” uses less CPU. The other setting, “aggressive,” uses more CPU but usually does a better job of compressing, sometimes much better.

Let’s look at the numbers (benchmark details here).

Comparison of Compression Levels

In this case, we’ve achieved 29x compression!

So when should you use the standard compressor and when should you use the more aggressive compressor? Compression is all done in the background, so it basically depends on the number of cores you have. If you have enough idle cores, the aggressive compressor will not slow down your database — in fact, the following graph shows that you can use TokuDB’s aggressive compressor to improve your overall database performance.

Sysbench performance with different compressors

If you don’t have enough spare cores, then the standard compressor may be better, since in that case, the compressor may contend with other parts of the system for CPU resources. The exact cutoff depends on the particulars of your system, but an easy rule of thumb might be to use standard if you have 6 or fewer cores, and otherwise use aggressive.
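
The rule of thumb above can be written down as a small helper. This is only a sketch in Python: the function name choose_compressor is hypothetical, and os.cpu_count() reports logical CPUs, which may be more than the number of physical cores on machines with hyper-threading.

```python
import os

def choose_compressor(cores):
    """Rule of thumb from the text: standard for 6 or fewer cores, else aggressive."""
    return "standard" if cores <= 6 else "aggressive"

# cpu_count() can return None on some platforms, so fall back to 1.
cores = os.cpu_count() or 1
print(f"{cores} cores -> {choose_compressor(cores)} compression")
```

Since the exact cutoff depends on the workload, the result is best treated as a starting point to confirm with your own benchmark.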

In either case, you get great compression. Compression performance is strongly affected by many factors, and we are always on the lookout for interesting use cases, so please post any interesting results you might get with the two settings.

To learn more about TokuDB:

  • Download a free trial of TokuDB.
  • Read the press release here.
  • Hear me talk about TokuDB v6.0 on the MySQL Database Community Podcast in Episode 86.
  • Come to our booth #410 at Percona Live.
  • Catch Tokutek Software Engineer Leif Walsh’s presentation at Percona Live on April 11th at 4:30 pm
  • Catch Tokutek VP of Marketing Lawrence Schwartz’s Lightning Talk at Percona Live on April 11th at 6:30 pm


MySQL Conference and Expo Talk on Benchmarking

February 2nd, 2012

I’ll be speaking on April 11th at 4:30 pm in Room 4 at the Percona Conference and Expo. The topic will be “Creating a Benchmark Infrastructure That Just Works.”

Throughout my career I’ve been involved with maintaining the performance of database applications and have therefore created many benchmark frameworks. At Tokutek, an important part of my role is measuring the performance of our storage engine over time and versus competing solutions. There is nothing proprietary about what I’ve created; it can be used anywhere.

My presentation will cover how I created the benchmark infrastructure at Tokutek:

  • Hardware and software considerations (including physical vs. virtual)
  • Selecting benchmarks
  • Capturing detailed information during the benchmark
  • Automation
  • Storing results
  • Visualization
  • Trend analysis
  • Continuous integration (monitoring the performance of future versions)
  • Self-service (let people get the information they want)

Track: Tools
Experience level: Intermediate

Tokutek is also a sponsor of the show and will have an expo booth. So, I hope to see you at my talk and/or at our booth.



1 Billion Insertions – The Wait is Over!

January 26th, 2012

iiBench measures the rate at which a database can insert new rows while maintaining several secondary indexes. We ran this for 1 billion rows with TokuDB and InnoDB starting last week, right after we launched TokuDB v5.2. While TokuDB completed it in 15 hours, InnoDB took 7 days.

The results are shown below. At the end of the test, TokuDB’s insertion rate remained at 17,028 inserts/second whereas InnoDB had dropped to 1,050 inserts/second. That is a difference of over 16x. Our complete set of benchmarks for TokuDB v5.2 can be found here.
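
The ratios quoted above can be checked with a few lines of arithmetic. The Python sketch below uses only the figures from this post; the run times are the rounded values given (15 hours and 7 days), so the time ratio is approximate.

```python
tokudb_rate = 17_028   # inserts/second at the end of the run
innodb_rate = 1_050    # inserts/second at the end of the run

tokudb_hours = 15      # rounded run time from the post
innodb_hours = 7 * 24  # 7 days

rate_ratio = tokudb_rate / innodb_rate
time_ratio = innodb_hours / tokudb_hours

print(f"final insert rate: {rate_ratio:.1f}x")
print(f"total run time:    {time_ratio:.1f}x")
```

The gap in final insert rate is larger than the gap in total run time, which is what you would expect if InnoDB starts out faster than its final rate and slows down as the indexes grow.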

Benchmark Details: Ubuntu 10.10; 2x Xeon X5460; 16GB RAM; 8x 146GB 10k SAS in RAID10. Each data point is the average insertion rate for the last 2 million rows. 

We developed the iiBench benchmark to measure performance for a use case that occurs commonly in production applications, such as online advertising, social media, and network management.

iiBench simulates a pattern of usage for always-on applications that:

  • Require fast query performance and hence require indexes
  • Have high data insert rates
  • Cannot wait for offline batch processing and hence require the indexes be maintained as data comes in

Note that iiBench was created as an open-source benchmark, which allows others to freely use it, extend it, and contribute their changes back. We originally unveiled the benchmark in the context of a challenge issued at the 2008 OpenSQL camp. Since then, iiBench has been downloaded and used many times, and ported by the community (in this case, Mark Callaghan) to a Python script.

Please let us know any feedback you have on iiBench. For additional information on…

  • iibench overview click here
  • TokuDB version 5.2 Overview click here
  • TokuDB version 5.2 Performance, including iibench, SysBench, Compression, and TPCC-like, click here


Compression Benchmarking: Size vs. Speed (I want both)

September 15th, 2011

I’m creating a library of benchmarks and test suites that will run as part of a Continuous Integration (CI) process here at Tokutek. My goal is to regularly measure several aspects of our storage engine over time: performance, correctness, memory/CPU/disk utilization, etc. I’ll also be running tests against InnoDB and other databases for comparative analysis. I plan on posting a series of blog entries as my CI framework evolves; for now, I have the results of my first benchmark.

Compression is an always-on feature of TokuDB. There are no server/session variables to enable compression or change the compression level (one goal of TokuDB is to have as few tuning parameters as possible). My compression benchmark uses iiBench to measure the insert performance and compression achieved by TokuDB and InnoDB. I tested InnoDB compression with two values of key_block_size (4k and 8k) and with compression disabled.


As you can see in the above graph, compression allows the database to use significantly less disk space. TokuDB achieved 51% compression, while InnoDB achieved 50% compression for key_block_size=4 and 47% compression for key_block_size=8. [Note: The random nature of the iiBench data makes it difficult to compress.]


Traditionally there is a “size versus speed” trade-off when compressing data. Data compression utilities have long offered variable levels of aggressiveness: spending more time compressing files usually results in smaller files. The InnoDB benchmarks bear this out: as the compression level increases, insert performance declines. On the other hand, TokuDB achieves the highest level of compression while outperforming InnoDB in all scenarios, even InnoDB without compression. TokuDB is running 33.4x faster than InnoDB configured to achieve similar levels of compression. Note that “Inserts per Second” was measured as the exit velocity of the benchmark run (the average of the last million inserts).
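
For readers who want to compute the same metric from their own runs, here is a minimal Python sketch of an exit-velocity calculation. It assumes the benchmark logs cumulative (rows inserted, elapsed seconds) pairs at regular intervals. The function name and the sample data are hypothetical and are not taken from iiBench itself.

```python
def exit_velocity(samples, window_rows=1_000_000):
    """Average inserts/second over the last `window_rows` rows.

    `samples` is a list of (total_rows_inserted, elapsed_seconds) pairs in
    increasing order, e.g. one pair logged per reporting interval. If no
    sample falls exactly on the window boundary, the window is widened to
    the nearest earlier sample.
    """
    end_rows, end_time = samples[-1]
    target = end_rows - window_rows
    # Walk backwards to the latest sample at or before the start of the window.
    for rows, seconds in reversed(samples):
        if rows <= target:
            return (end_rows - rows) / (end_time - seconds)
    raise ValueError("run is shorter than the requested window")

# Hypothetical log: 500k rows per sample, slowing down over time.
samples = [(0, 0.0), (500_000, 10.0), (1_000_000, 25.0),
           (1_500_000, 45.0), (2_000_000, 75.0)]
print(exit_velocity(samples))
```

Measuring only the tail of the run captures steady-state behavior once the indexes no longer fit in memory, rather than the fast early phase.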

How much compression can be achieved?

To answer this I decided to load some web application performance data (log style data with stored procedure names, database instance names, begin and ending execution timestamps, duration, row counts, and parameter values). TokuDB achieved 18x compression, far more than InnoDB. It also loaded the data much faster but that is a blog entry for another day…
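
This post quotes compression both as a percentage (51%) and as a ratio (18x), and the two are easy to convert. The Python sketch below assumes that "N% compression" means the on-disk size shrank by N percent; that reading is an assumption, and the function names are hypothetical.

```python
def ratio_from_percent_saved(percent_saved):
    """Convert 'space saved' in percent to a compression ratio (original / compressed)."""
    return 100.0 / (100.0 - percent_saved)

def percent_saved_from_ratio(ratio):
    """Convert a compression ratio (original / compressed) to 'space saved' in percent."""
    return 100.0 * (1.0 - 1.0 / ratio)

print(f"51% saved is about {ratio_from_percent_saved(51):.2f}x")
print(f"18x is about {percent_saved_from_ratio(18):.1f}% saved")
```

Under that assumption, 51% on the random iiBench data corresponds to roughly 2x, which shows how much more compressible the log-style data is.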


Benchmark details

Application

  • iiBench, insert 25mm rows, 1000 rows per commit

Environment

  • Intel Core-i7/920 @ 3.6GHz, 12GB DDR3 @ 1600MHz, 2 x SATA II
  • Ubuntu 11.04, TokuDB 5.0.4, MySQL 5.1.52, InnoDB plug-in 1.0.13

Server/Session Variables

  • unique_checks=1
  • tokudb_commit_sync=0
  • tokudb_cache_size=2G
  • innodb_buffer_pool_size=2G
  • innodb_flush_method=O_DIRECT
  • innodb_doublewrite=false
  • innodb_flush_log_at_trx_commit=0
  • innodb_log_file_size=1000M
  • innodb_file_per_table=true
  • innodb_log_buffer_size=16M
  • innodb_file_format=barracuda


MySQL Community – what do you want in a load testing framework?

May 10th, 2011

So I’ve been doing a fair number of automated load tests these past six months, primarily with Sysbench, which is a fine, fine tool. First I started using some simple bash-based loop controls to automate my overnight testing, but as usually happens with shell scripts, they grew unwieldy and I rewrote them in Python. Now I have some flexible and easily configurable code for sysbench-based MySQL benchmarking to offer the community. I’ve always been a fan of giving back to such a helpful group of people – you’ll never hear me complain about “my time isn’t free”. So, let me know what you want in an ideal testing environment (from a load testing framework automation standpoint) and I’ll integrate it into my existing framework and then release it under the BSD license. The main goal here is to have a standardized modular framework, based on sysbench, that allows anyone to compare their server performance via repeatable tests. It’s fun to see other people’s benchmarks, but they’re often difficult to repeat and compare since most tests aren’t fully documented in their blog posts – this could be a solution to that.

Currently I have the harness doing iterations based on:

  • incrementing system values (choose a global dynamic variable, e.g. sync_binlog=0-1000)
  • storage engine vs storage engine for the same workload
  • thread quantity increments for read-only or read+write
  • N-nodes in a cluster workloads with WRR traffic distribution (need to code WLC and others)
  • QPS testing for connection pool vs open/close connection
  • multi-table vs single-table workloads

Outputs available: CSV, XML, JSON for easy integration into any number of the various graphing frameworks available. I’ll probably code up a lightweight Python HTTP server preloaded with Highcharts and Sparklines so you can see your benchmarks easily without having to roll your own graphs.
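
To give a feel for what such iterations and outputs might look like, here is a minimal Python sketch. It is not the harness described in this post: the dimension lists, function names, and field names are all hypothetical, and it only generates the run plan and serializes it, without launching sysbench.

```python
import csv
import io
import itertools
import json

# Hypothetical dimensions to iterate over.
ENGINES = ["innodb", "tokudb"]
THREADS = [1, 8, 64]
MODES = ["read-only", "read+write"]

def run_plan():
    """Yield one dict per benchmark run: every engine x thread count x mode."""
    for engine, threads, mode in itertools.product(ENGINES, THREADS, MODES):
        yield {"engine": engine, "threads": threads, "mode": mode}

def to_json(rows):
    """Serialize the rows as a JSON array."""
    return json.dumps(list(rows), indent=2)

def to_csv(rows):
    """Serialize the rows as CSV with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["engine", "threads", "mode"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

print(to_csv(run_plan()))
```

A real harness would add result fields such as transactions per second to each row after the run. Keeping the plan as plain dicts makes it straightforward to emit the same data as CSV or JSON for a graphing library.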

Quick now, tell me what you’d like me to code for you!

