Archive for the ‘FusionIO’ Category

Virident tachIOn: New player on Flash PCI-E cards market

June 15th, 2010

(Note: The review was done as part of our consulting practice, but is totally independent and fully reflects our opinion)

In my talk at the MySQL Conference and Expo 2010, “An Overview of Flash Storage for Databases”, I mentioned that other players would most likely be coming soon. I was not actually aware of any real names at that time; it was just a guess, as the PCI-E market is really attractive, so FusionIO could not stay alone for long. So I am not surprised to see a new card from Virident, and I was lucky enough to test a pre-production sample of the Virident tachIOn 400GB SLC card.

I think it is fair to say that Virident targets the segment where FusionIO currently has a monopoly, and that it will finally bring some competition to the market, which I believe is good for end users. I am looking forward to price competition (not having real numbers, I can only guess that vendors still put a high margin into the price), as well as to competition in performance in general, in stable performance under high load in particular, and in capacity and data reliability.

The price list for Virident tachIOn cards already shows the price competition: the indicative price for the tachIOn 400GB is $13,600 (that is $34/GB), and the entry-level card is 200GB at $6,800 (there is also a 300GB card in the product line). The price for the FusionIO 160GB SLC card (from dell.com, as of 14-Jun-2010) is $6,308.99 (that is about $39.4/GB).
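
The per-GB figures can be checked with a few lines of Python. This is only a sketch of the arithmetic: the prices and capacities are the ones quoted above, while the function and variable names are made up for illustration.

```python
# Price per GB of nominal capacity for the cards quoted in the text.
# Prices are the list prices mentioned above; nothing here is measured.
cards = {
    "Virident tachIOn 400GB": (13600.00, 400),
    "Virident tachIOn 200GB": (6800.00, 200),
    "FusionIO 160GB SLC": (6308.99, 160),
}


def price_per_gb(price_usd, capacity_gb):
    """Return the price in USD per GB of nominal capacity."""
    return price_usd / capacity_gb


for name, (price, capacity) in cards.items():
    print(f"{name}: ${price_per_gb(price, capacity):.2f}/GB")
```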

A few words about the product: I know that the Virident engineering team concentrated on getting stable write performance during long-running write activity and in cases when space utilization is close to 100%. As you may know (check my presentation), SSD design requires background “garbage collector” activity, which needs free space to operate, and the Virident card already has enough space reserved to deliver stable write performance even when the disk is almost full.

As for reliability, I think the design of the card is quite neat. The card itself contains a bunch of replaceable flash modules, and each individual module can be replaced in case of failure. Internally, the modules are also joined in a RAID (this is fully transparent to the end user).

All this gives a good level of confidence in data reliability: if a single module fails, the internal RAID allows operations to continue, and after the module is replaced the RAID is rebuilt. This still leaves the controller on the card as a single point of failure, but in that case all flash modules can be safely relocated to a new card with a working controller. (Note: this was not tested by Percona engineers, but is taken from the vendor’s specification.)

As for power failures, the flash modules also come with capacitors, which guarantee that data is delivered to the final media even if power is lost on the main host. (Note: this was not tested by Percona engineers, but is taken from the vendor’s specification.)

Now to the most interesting part: performance numbers. I ran the sysbench fileio benchmark with a 16KB block size to see what maximal performance we can expect.

Server specification is:

  • Supermicro X8DTH series motherboard
  • 2 x Xeon E5520 (2.27GHz) processors w/HT enabled (16 logical cores)
  • 64GB of ECC/Registered DDR3 DRAM
  • CentOS 5.3, 2.6.18-164 kernel
  • Filesystem is XFS, formatted with the mkfs.xfs -s size=4096 option (size=4096, the sector size, is very important for getting aligned IO requests) and mounted with the nobarrier option
  • Benchmark: sysbench fileio on 100GB file, 16KB blocksize

The raw results are available on the Wiki.

And here are the graphs for random reads, random writes and sequential writes:

I think it is very interesting to see the distribution of the 95% response time results (a time of 0 is obviously a problem in sysbench, which does not have enough time resolution for such very fast operations).
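
To illustrate how a coarse timer can turn a 95% response time into 0, here is a minimal sketch. It is a generic nearest-rank percentile, not sysbench's actual implementation, and both the latency samples and the 1 ms timer resolution are hypothetical values chosen only for the example.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def quantize(samples_ms, resolution_ms):
    """Round each latency down to a multiple of the timer resolution."""
    return [math.floor(s / resolution_ms) * resolution_ms for s in samples_ms]


# Hypothetical sub-millisecond latencies in ms (illustration only, not measured).
latencies_ms = [0.08, 0.09, 0.10, 0.11, 0.12, 0.12, 0.13, 0.15, 0.20, 0.90]

exact_p95 = percentile(latencies_ms, 95)
coarse_p95 = percentile(quantize(latencies_ms, 1.0), 95)  # assumed 1 ms timer

print(f"95% response time, exact timer: {exact_p95} ms")
print(f"95% response time, 1 ms timer:  {coarse_p95} ms")
```

With every request finishing in under one timer tick, all quantized samples collapse to zero, and so does the reported percentile.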

As you can see, we can get about 400MB/sec of random write bandwidth with 8-16 threads, with a 95% response time of 3.1ms (8 threads) and 3.8ms (16 threads).
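
For readers who think in requests per second rather than bandwidth, the figure converts as follows. This is a sketch that assumes 16KB means 16384 bytes and that MB is counted in binary units (1024 × 1024 bytes); the decimal variant is shown for comparison, and the function name is made up for illustration.

```python
def iops(bandwidth_mb_s, block_size_bytes=16 * 1024, bytes_per_mb=1024 * 1024):
    """Convert bandwidth at a fixed block size into requests per second."""
    return bandwidth_mb_s * bytes_per_mb / block_size_bytes


# About 400MB/sec of random writes at a 16KB block size.
print(f"binary MB:  {iops(400):.0f} writes/sec")
print(f"decimal MB: {iops(400, bytes_per_mb=10**6):.0f} writes/sec")
```

Either way, 400MB/sec at this block size corresponds to roughly 25,000 write requests per second.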

One issue I should mention: despite the good response time results, the maximal response time can in some cases jump to 300 ms per request. I was told this corresponds to garbage collector activity and will be fixed in the production release of the driver.

I think it is fair to make a comparison with a FusionIO card, especially for the write-pressure case. As you may know, FusionIO recommends reserving space to get sustainable write performance (Tuning Techniques for Writes).

I took a FusionIO ioDrive 160GB SLC card and tested it fully formatted (file size 145GB) and formatted with 25% space reservation (file size 110GB), and compared it with the Virident card with a 390GB file. This also allows us to see whether the Virident tachIOn card can sustain writes when it is fully utilized.

As a disclaimer, I want to mention that the Virident tachIOn card was fine-tuned by Virident engineers, while the FusionIO card was tuned only by me, and I may not have all the knowledge needed for FusionIO tuning.

The first graph is random reads, to compare read performance:

As you can see, with 1 and 4 threads FusionIO is better, while with more threads the Virident card scales better.

And now random writes:

You can see that FusionIO definitely needs space reservation to provide high write bandwidth, and this comes at a cost (reserving 25% of the space raises the price per usable GB by about 33%).
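
The cost effect of space reservation can be worked out directly: reserving a fraction r of the card multiplies the price per usable GB by 1/(1-r), so for r = 0.25 the increase is about 33%. The sketch below uses the FusionIO 160GB list price quoted earlier in this post; the function names are hypothetical.

```python
def cost_per_usable_gb(price_usd, capacity_gb, reserved_fraction):
    """Price per GB that remains usable after reserving part of the card."""
    usable_gb = capacity_gb * (1.0 - reserved_fraction)
    return price_usd / usable_gb


def cost_increase(reserved_fraction):
    """Relative increase in $/GB caused by reserving a fraction of the card."""
    return 1.0 / (1.0 - reserved_fraction) - 1.0


full = cost_per_usable_gb(6308.99, 160, 0.0)
reserved = cost_per_usable_gb(6308.99, 160, 0.25)

print(f"no reservation:  ${full:.2f}/GB")
print(f"25% reservation: ${reserved:.2f}/GB")
print(f"increase:        {cost_increase(0.25):.0%}")
```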

In conclusion I can highlight:

  • I am impressed with the architecture design with replaceable individual flash modules; I think it establishes a new high-end standard for flash devices
  • With a single card you can get over 1GB/sec bandwidth in random reads (16-64 working threads), which is the best result I’ve seen so far (again, for a single card)
  • Random write bandwidth exceeds 400MB/sec (8-16 working threads)
  • Random read/write mix results are also impressive, which can be quite important in workloads like FlashCache, where the card is under concurrent read and write pressure
  • Quite stable sequential write performance (important for log-related activity in MySQL)

I am looking forward to presenting results for sysbench oltp and tpcc workloads, and also in FlashCache mode.


Entry posted by Vadim | No comment


FlashCache: tpcc workload with FusionIO card as cache

June 3rd, 2010

This run is very similar to what I had on the Intel SSD X25-M, but now I use a FusionIO 80GB SLC card. I chose this card as the smallest available card (and therefore the cheapest; on Dell.com you can see it for about $3K). There is also the FusionIO IO-Xtreme 80GB card, which however is MLC-based and might not be the best choice for FlashCache usage (as there is a high write rate on FlashCache for both reading from and writing to disks, so its lifetime could be short).

Also, the Facebook team released a WriteThrough module for FlashCache, which could be a good trade-off if you want extra assurance of data consistency and your load is mostly read-bound, so I tested this mode as well.

The whole setup is similar to the previous post, so let me just post the results with FlashCache on FusionIO in 20% dirty pages, 80% dirty pages and write-through modes. I used the full 80GB for caching (the total size of data is about 100GB).

Conclusions from the graph:

  • With 80% dirty pages we have about 4x better throughput (compared to RAID).
  • Write-through mode gives about a 2x gain, but remember that the load is very write-intensive and all the benefits in write-through mode come only from cached reads, so this is pretty good for this scenario

With this post I finish my runs on FlashCache for now. I think it may be considered for real usage; at least you can evaluate how it works on your workloads.


Entry posted by Vadim | No comment


MySQL 5.5.4 in tpcc-like workload

April 22nd, 2010

MySQL-5.5.4 is a great release with performance improvements; let’s see how it performs in a tpcc-like workload.

The full details are on Wiki page
http://www.percona.com/docs/wiki/benchmark:mysql:554-tpcc:start

I took MySQL-5.5.4 with InnoDB-1.1 and the tpcc-mysql benchmark with 200 warehouses (200W, about 18GB worth of data). The InnoDB log files are 3.8GB in size, and I ran with different buffer pools from 20GB down to 6GB. The storage is a FusionIO 320GB MLC card with XFS-nobarrier.

The raw results are available on the Wiki; here are the graphical results.

I intentionally put all lines on the same graph to show trends.

It seems adaptive_flushing is not able to keep up, and you can see periodic drops when InnoDB starts flushing. I hope the InnoDB team will fix this before 5.5 GA.

I expect a reasonable request for how this compares with Percona Server/XtraDB, so here is the same load on our server:

As you can see, our adaptive_checkpoint algorithm performs much more stably.

And for a direct comparison, here are side-by-side results for the 10GB buffer_pool case.

So, as you can see, InnoDB is doing great and trying to keep performance even, whereas in the previous release there was about a 1.7x difference. I expect to see more improvements in 5.5 GA.


Entry posted by Vadim | No comment


The innodb_plugin – a pleasant surprise!

March 4th, 2010

I’ve heard about the innodb_plugin but had not had time to put it to the test.

Recently, though, due to some problems I’ve been having with the MySQL Enterprise Monitor (Merlin), I’ve had to try a few changes and had the opportunity to try out the innodb plugin.

I have been using Merlin for some time and like it a lot. It is not perfect but does a good job for me.  However, since upgrading to version 2.1 I have been having some database load problems. I long ago split the merlin server into a front- and back-end server with the backend running a standard MySQL 5.1 Advanced package. That has been working fine.

I have been monitoring more and more mysqld servers and recently the database backend could not cope. Basically, the writes of data collected from the agents and the deletes of old data (purging) caused too much I/O, and that is on a box with 6 disks in RAID-10 with a battery-backed write cache.

So I upgraded the db server to a new box with lots of memory and a 300 GB Fusion IO card. I expected all problems to go away. Well, not quite. In spite of the solid state drive, which was not I/O bound, and the CPU, which was not CPU bound, mysqld could not keep up with the load. This was running the MySQL 5.1.44 Advanced rpm. Looking more deeply, it seems that mysqld itself was the bottleneck and there was too much contention on the PK between the different INSERTing and DELETEing threads.

The Merlin team suggested trying the innodb_plugin (1.0.6) and all of a sudden the bottleneck seems to have gone away.

This is the iostat output taken before switching to the innodb plugin:

Device:    rrqm/s wrqm/s   r/s     w/s  rsec/s  wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
fioa         0.00   0.00  4.21 4049.50   33.67 32337.07    16.83 16168.54     7.99   192.42    6.31   0.25 100.22
fioa         0.00   0.00  9.78 4379.44  153.29 34942.12    76.65 17471.06     8.00     0.00    6.17   0.00   0.00
fioa         0.00   0.00 20.80 4455.40  408.00 35552.60   204.00 17776.30     8.03     0.00    5.96   0.00   0.00
fioa         0.00   0.00 21.80 4685.00  267.20 37391.20   133.60 18695.60     8.00    24.00    5.87   0.21 100.02
fioa         0.00   0.00 23.60 5200.40  320.00 41490.40   160.00 20745.20     8.00     0.00    6.04   0.00   0.00
fioa         0.00   0.00 15.17 5020.36  202.79 40055.09   101.40 20027.54     7.99     0.00    5.65   0.00   0.00

This is the iostat output taken after switching to the innodb plugin:

Device:    rrqm/s wrqm/s   r/s     w/s  rsec/s   wsec/s   avgrq-sz avgqu-sz   await  svctm  %util
fioa         0.00   0.00  4.40 2344.80  160.00 201520.00     85.85     0.00    0.41   0.00   0.00
fioa         0.00   0.00  2.00 2188.20   64.00 193500.80     88.38     0.00    0.41   0.00   0.00
fioa         0.00   0.00  1.80 2113.60   57.60 200889.20     94.99     0.00    0.44   0.00   0.00
fioa         0.00   0.00  0.40 1961.60   12.80 194291.20     99.03     0.00    0.43   0.00   0.00
fioa         0.00   0.00  0.00 2118.00    0.00 202496.40     95.61     0.00    0.47   0.00   0.00
fioa         0.00   0.00  0.00 2030.00    0.00 191482.40     94.33     0.00    0.46   0.00   0.00
fioa         0.00   0.00  0.00 2152.60    0.00 208485.20     96.85     0.00    0.44   0.00   0.00
fioa         0.00   0.00  0.00 1936.20    0.00 178732.40     92.31     0.00    0.42   0.00   0.00
fioa         0.00   0.00  4.79 1249.70  153.29 115475.45     92.17     0.00    0.40   0.00   0.00
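
To put the two samples on a common scale, the wsec/s column can be converted to MiB/sec, and dividing it by w/s gives the average write size. The sketch below uses the first row of each table and relies on iostat counting sectors of 512 bytes; reading the result as "fewer but much larger writes" is an interpretation of the figures, and the function names are made up for illustration.

```python
SECTOR_BYTES = 512  # iostat reports rsec/s and wsec/s in 512-byte sectors


def sectors_to_mib_per_s(sectors_per_s):
    """Convert a sectors-per-second rate into MiB per second."""
    return sectors_per_s * SECTOR_BYTES / (1024 * 1024)


def avg_write_size_kib(wsec_per_s, writes_per_s):
    """Average size of one write request in KiB."""
    return wsec_per_s * SECTOR_BYTES / 1024 / writes_per_s


# First row of each iostat table above (w/s and wsec/s columns).
before = {"w/s": 4049.50, "wsec/s": 32337.07}
after = {"w/s": 2344.80, "wsec/s": 201520.00}

for label, row in (("before", before), ("after", after)):
    mib = sectors_to_mib_per_s(row["wsec/s"])
    size = avg_write_size_kib(row["wsec/s"], row["w/s"])
    print(f"{label}: {mib:.1f} MiB/sec written, {size:.1f} KiB per write")
```

The computed write sizes agree with the avgrq-sz column (about 8 sectors before, about 86 sectors after).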

Note: the first set of figures was taken using CentOS 4, and the second using CentOS 5. The IO statistics aren’t exactly identical and on CentOS 5 for some reason the driver appears not to be providing all the stats. However the wsec/s value clearly shows a significant performance improvement and the original problem mysqld was having of not being able to purge as fast as it was inserting data seems to have been solved. At least initial signs seem to indicate this. The only configuration change made to the server was the following:

ignore_builtin_innodb
plugin-load=innodb=ha_innodb_plugin.so;innodb_trx=ha_innodb_plugin.so;innodb_locks=ha_innodb_plugin.so;innodb_lock_waits=ha_innodb_plugin.so;innodb_cmp=ha_innodb_plugin.so;innodb_cmp_reset=ha_innodb_plugin.so;innodb_cmpmem=ha_innodb_plugin.so;innodb_cmpmem_reset=ha_innodb_plugin.so

innodb_adaptive_flushing = 1
innodb_io_capacity = 1000

Conclusion: if you are having performance problems on your MySQL server and the hardware is perhaps not the bottleneck, then try using the plugin. It may make a big difference.

Also a big thanks to the Merlin team for helping me out with this problem and getting things up and running.

