Archive for the ‘TokuDB’ Category

Challenges of Big Databases with MySQL – IOUG Presentation

May 24th, 2012

Many database management tasks become difficult as you move from millions of rows and gigabytes of data to billions of rows and terabytes of data. Such tasks include ingesting data while maintaining indexes; changing schemas without downtime; and supporting connections, replication, and backup. For some scaling problems (connections and replication), MySQL® is better than most of the competition. For others, such as indexing, schema changes, and backup, MySQL has typically been harder to use. Fortunately, the tasks MySQL does well are in its core, whereas the tasks that are more difficult can be solved with storage engine plug-ins.

I recently gave a talk at IOUG Collaborate, a copy of which can be found here. This presentation discusses how MySQL’s storage engines have recently made dramatic progress in large database manageability.

A complete list of MySQL talks from the show can be found here.



SwRI Chooses TokuDB to Tackle Machine Data for an 800M+ Record Database

May 16th, 2012

Tackling machine data on the ground to ensure successful operations for NASA in space

Issues addressed:

  • Scaling MySQL to multi-terabytes
  • Insertion rates as InnoDB hit a performance wall
  • Schema flexibility to handle an evolving data model

The Company:  Southwest Research Institute (SwRI) is an independent, nonprofit applied research and development organization. The staff of more than 3,000 specializes in the creation and transfer of technology in engineering and the physical sciences. Currently, SwRI is part of an international team working on the NASA Magnetospheric Multiscale (MMS) mission. MMS is a Solar Terrestrial Probes mission comprising four identically instrumented spacecraft that will use Earth’s magnetosphere as a laboratory to study the microphysics of three fundamental plasma processes: magnetic reconnection, energetic particle acceleration, and turbulence.

The Challenge:  SwRI is responsible for archiving an enormous quantity of data generated by the Hot Plasma Composition Analyzer (HPCA). The device is used to count hydrogen, helium, and oxygen ions in space at different energy levels. These instruments require extensive calibration data, and each one is a customized, high-precision device that is built, tested, and integrated by hand. SwRI must capture and store all the test and calibration data during the 2-3 week bursts of activity that are required for each of the 4 devices.

“During each of these calibration runs, there are several data sources flowing into the server, each one leading to an index in the database,” said Greg Dunn, a Senior Research Engineer at SwRI. “Each packet that arrives gets a timestamp, message type, file name and location associated with it. A second process goes through that data and parses it out – information such as voltage, temperature, pressure, current, ion energy, particle counts, and instrument health must be inserted into the database for every record. This can load the database with up to 400 or 500 inserts per second.”

“Being able to monitor the performance of the instrument and judge the success of the tests and calibrations in near real time is critical to the project,” noted Dunn. “There are limited windows to do testing cycles and make adjustments for any issues that arise. Any significant slip in the testing could cost tens of thousands of dollars and jeopardize the timing of the satellite launch.”

“We started seeing red flags with InnoDB early in the ramp-up phase of the project, as our initial data set hit 400GB,” said Dunn. “Size was the first issue. Each test run was generating around 94 million inserts or around 90GB of data, quickly exceeding the capacity allocated for the program. In addition, as our database grew to 800M records, we saw InnoDB insertion performance drop off to a trickle. Even with modest data streams at 100 records per second, InnoDB was topping out at 45 insertions per second. Being able to monitor these crucial calibration activities in a timely fashion and in a cost effective manner was at risk.”
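The numbers in that quote can be sanity-checked with a little arithmetic. The sketch below is only a back-of-the-envelope check, not anything SwRI ran: it assumes decimal gigabytes and a stream that holds steady at 100 records per second, and the function names are made up for illustration.

```python
# Back-of-the-envelope check of the figures quoted above.
# Assumptions (not from the source): 1 GB = 10**9 bytes, and the
# incoming stream holds steady at 100 records per second.

INSERTS_PER_RUN = 94_000_000   # "around 94 million inserts" per test run
BYTES_PER_RUN = 90 * 10**9     # "around 90GB of data" per test run
INCOMING_PER_SEC = 100         # "modest data streams at 100 records per second"
INNODB_PER_SEC = 45            # "topping out at 45 insertions per second"


def bytes_per_insert(total_bytes: int, inserts: int) -> float:
    """Average on-disk footprint of one inserted record."""
    return total_bytes / inserts


def backlog_after(seconds: int, incoming: int, sustained: int) -> int:
    """Records left waiting when the stream outpaces the insert rate."""
    return max(incoming - sustained, 0) * seconds


if __name__ == "__main__":
    size = bytes_per_insert(BYTES_PER_RUN, INSERTS_PER_RUN)
    print(f"{size:.0f} bytes per insert")  # prints "957 bytes per insert"
    # One hour of falling behind by 55 records every second:
    print(backlog_after(3600, INCOMING_PER_SEC, INNODB_PER_SEC))  # prints 198000
```

Under those assumptions each record costs roughly 1 KB, and an hour at 45 insertions per second leaves almost 200,000 records waiting, which is why near-real-time monitoring was at risk.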

To keep up with the workload and data set, SwRI considered several options, but they failed to meet program performance and price goals. These included:

Partitioning / Separate Databases – “We considered partitioning, but this can be a challenge to set up and it introduces additional complexity,” said Dunn. “We also looked at putting each calibration into its own database, but that would have made it much more difficult to correlate across different databases.”

Additional RAM – “Increasing the available RAM from 12 GB up to 100 GB was not enough by itself,” claimed Dunn. “We briefly considered keeping everything in RAM, but that was not a realistic or efficient way to address a data set size that was promising to grow to several terabytes by the end of the program.”

The Solution:  Once TokuDB was installed, SwRI’s big data management headache quickly subsided. “The impact to our required storage was dramatic,” noted Dunn. “We benefited from over 9x compression. In our comparison benchmarks, we went from 452GB with InnoDB to 49GB with TokuDB.”
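The "over 9x" figure follows directly from the two benchmark sizes Dunn quotes. As a quick check (the helper names here are hypothetical, chosen just for this sketch):

```python
# Compression figures from the benchmark sizes quoted above.

INNODB_GB = 452   # size of the benchmark data set under InnoDB
TOKUDB_GB = 49    # size of the same data set under TokuDB


def compression_ratio(before_gb: float, after_gb: float) -> float:
    """How many times smaller the data set became."""
    return before_gb / after_gb


def space_saved(before_gb: float, after_gb: float) -> float:
    """Fraction of the original storage that is no longer needed."""
    return 1 - after_gb / before_gb


if __name__ == "__main__":
    print(f"{compression_ratio(INNODB_GB, TOKUDB_GB):.2f}x")  # prints "9.22x"
    print(f"{space_saved(INNODB_GB, TOKUDB_GB):.1%}")         # prints "89.2%"
```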

There was also a dramatic improvement in performance. “Suddenly, we no longer had to struggle to keep up with hundreds of insertions per second,” stated Dunn. “Our research staff could immediately see whether or not the experiment was running correctly and whether the test chamber was being used effectively. We didn’t have to worry that insufficient data analysis horsepower might lead to downstream schedule delays.”

The Benefits: 

Cost Savings: “The hardware savings were impressive,” noted Dunn. “With InnoDB, going to larger servers, adding 100s of GBs of additional RAM along with many additional drives would have easily cost $20,000 or more, and still would not have addressed all our needs. TokuDB was by far both a cheaper and simpler solution.”

Hot Column Addition: “As we continue to build out the system and retool the experiments, flexibility in schema remains important,” stated Dunn. “TokuDB’s capability to quickly add columns of data is a good match for our environment, where our facility is still evolving and sometimes has new sensors or monitors installed that need to be added to existing large tables.”
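To make the schema change Dunn describes concrete, here is a minimal sketch of adding a column for a newly installed sensor to a table that already holds data. It uses SQLite from Python's standard library purely as a stand-in, so it shows only the shape of the operation and says nothing about TokuDB's hot column addition or its performance. The table and column names are invented for illustration.

```python
import sqlite3

# Stand-in illustration only: SQLite, not MySQL/TokuDB.
# Table and column names are hypothetical.


def add_sensor_column():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE readings (ts TEXT, voltage REAL, temperature REAL)"
    )
    conn.execute(
        "INSERT INTO readings VALUES ('2012-05-16T00:00:00', 5.0, 21.5)"
    )

    # A new pressure sensor is installed: extend the existing table.
    conn.execute("ALTER TABLE readings ADD COLUMN pressure REAL")
    conn.execute(
        "INSERT INTO readings VALUES ('2012-05-16T00:00:01', 5.1, 21.6, 101.3)"
    )

    # Rows written before the change read back NULL (None) for the new column.
    rows = conn.execute(
        "SELECT ts, pressure FROM readings ORDER BY ts"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in add_sensor_column():
        print(row)
```

In MySQL the corresponding statement is an `ALTER TABLE ... ADD COLUMN`; the point of the quote is that on large tables this operation needs to be fast enough not to interrupt data collection.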

Fast Loader: “The open source toolset that Tokutek designed to parallelize the loading of the database was very helpful,” said Dunn.  “We were able to bring down the load of the database from MySQL dump backup from 30 hours to 7 hours.”



Tokutek Welcomes Gerry Narvaja!

May 14th, 2012

We are excited to have Gerry Narvaja start today at Tokutek! Gerry has spent more than 25 years in the software industry, most of them working with databases for different kinds of applications, from embedded to large-scale web products. Gerry worked first at MySQL, and then Sun Microsystems supporting the Sales teams. In 2008 he transitioned into being a Senior MySQL DBA. Gerry graduated as an Electronic Engineer from I.T.B.A (Instituto Tecnológico de Buenos Aires) and has an M.B.A. from Universidad del Salvador in collaboration with S.U.N.Y.A (State University of NY at Albany).

Gerry enjoys helping users to solve complex database production issues. For almost a year he has been co-hosting the popular MySQL Community podcast, OurSQL, which was given the MySQL Community Contributor of the Year 2012 award at the recent Percona MySQL Users Conference. Gerry and Martín Farach-Colton, our CTO, will also be speaking next month at the first ever Latin American MySQL / MariaDB Conference in Argentina.

Please feel free to drop Gerry a line at gerry@tokutek.com with your toughest MySQL and MariaDB issues!



Percona Live Slides and Video Available: The Right Read Optimization is Actually Write Optimization

May 10th, 2012

In April, I gave a talk at Percona Live about why The Right Read Optimization is Actually Write Optimization. It was my first industry talk, so I was delighted when someone in the audience said “I feel like I just earned a college credit.”

Box offered to host everyone’s slides from the conference here (mine is here). A big thanks from me to Sheeri Cabral, for recording my talk and posting it online!

The talk starts with why write optimization is what you want in many situations, especially if you need read optimization. Then I get into some of the theory on optimizing writes by laying out your data better on disk. We approach this gradually, beginning with how B-trees work, progressing through a few simple rules for getting better performance, and looking at some of the tradeoffs inherent in these techniques.



Tokutek and PalominoDB Partner to Bring Scale, Performance to Database Deployments

May 2nd, 2012

MySQL storage engine provider joins forces with leading database consultants to deliver support for growing number of MySQL and MariaDB customers

Lexington, MA – (May 2, 2012) – Tokutek, the leader in high-performance and agile database storage engines, today announced a strategic partnership with PalominoDB, a premier database operations and engineering consultancy, to provide database services and support to joint customers. Tokutek’s storage engine will be complemented with PalominoDB’s operational excellence, 24×7 on-call support and access to the company’s skilled team of professional database administrators (DBAs).

“TokuDB has immeasurably improved our ability to react to changing business requirements in a large data environment. The ability to change schemas and indexes on the fly and no need to repair fragmented indexes has led to a simplification of our environment and reduced maintenance windows,” said Adrian Roston, CTO, Frequency. “With PalominoDB’s knowledge and expertise, we were rapidly able to leverage TokuDB’s advantages and substantially improve our system’s throughput.”

TokuDB is a highly scalable, zero-maintenance downtime MySQL Storage Engine that delivers indexing-based query acceleration, improved replication performance, unparalleled compression, and hot schema modifications. Under the agreement, PalominoDB will provide end-to-end solutions and support for MySQL and MariaDB systems that run on the TokuDB storage engine.

“Tokutek’s ability to improve database performance brings an entirely new value proposition to MySQL,” said Laine Campbell, Owner and CEO at PalominoDB.

“In partnering with Tokutek, PalominoDB is making a firm commitment to expanding MySQL’s viability as an enterprise-class database capable of supporting complex queries with high data rates on terabyte-scale databases.”

“PalominoDB brings unrivaled domain expertise and a range of service offerings to the MySQL and MariaDB market,” said John Partridge, President and CEO of Tokutek. “Tokutek’s partnership with PalominoDB will help TokuDB deployments go smoothly and provide access to extended support and design capabilities for customers needing those services.”

 

About PalominoDB

For startups and established companies of all sizes, PalominoDB provides ongoing operational support and professional expertise in database architecture, performance and scale. With a focus on open-source and other best-in-class software components, and extensive experience in all major and emerging database technologies, PalominoDB engages with customers to develop custom, cost-effective projects and long-term support contracts in areas from system design to automation to business intelligence and more. PalominoDB is renowned for an emphasis on transparency, communication and responsiveness, as well as providing operational excellence for leading companies including Zappos, Chegg, Technorati, Slideshare and Zendesk. For more information, please visit www.palominodb.com

About Tokutek Inc.
Tokutek, Inc. is the leader in high-performance and agile database storage engines. TokuDB is a highly scalable, zero-maintenance downtime MySQL Storage Engine that delivers indexing-based query acceleration, improved replication performance, unparalleled compression, and hot schema modifications. TokuDB is a “drop-in” storage engine requiring no changes to MySQL applications or code and is fully ACID and MVCC compliant. The company is headquartered in Lexington, MA and has offices in New York, NY. For more information, visit tokutek.com.




TokuDB v6.0: Download Available

April 30th, 2012

TokuDB v6.0 is full of great improvements, like getting rid of slave lag, better compression, improved checkpointing, and support for XA.

I’m happy to announce that TokuDB v6.0 is now generally available and can be downloaded here.

Sysbench Performance

I wanted to take this time to talk about one more under-the-hood goody we’ve added to v6.0. In particular, we’ve been working on our locking schemes and have made some nice improvements in multi-threaded performance. In TokuDB v5.2, we outperformed InnoDB on sysbench by about 20% out to 64 threads. The following shows the performance of TokuDB v6.0 vs InnoDB on the same test:

InnoDB now has better multi-threading as well, so with standard compression on, we are now neck and neck with InnoDB out to 64 client threads, and then pull ahead out to 1024 client threads. With high compression, we top out at 72% faster than InnoDB!

We hope you enjoy this and all the other TokuDB v6.0 improvements.

To learn more about TokuDB:

  • Read the press release here.
  • Hear me talk about TokuDB v6.0 on the MySQL Database Community Podcast in Episode 86.
  • Read the Bloor Research Report on TokuDB v6.0 here.


My Talk on Tuesday at IOUG COLLABORATE 12

April 20th, 2012

Challenges of Big Databases with MySQL

Many database management tasks become difficult as you move from millions of rows and gigabytes of data to billions of rows and terabytes of data. Such tasks include ingesting data while maintaining indexes; changing schemas without downtime; and supporting connections, replication, and backup. For some scaling problems (connections and replication), MySQL is better than most of the competition. For others, such as indexing, schema changes, and backup, MySQL has typically been harder to use. Fortunately, the tasks MySQL does well are in its core, whereas the tasks that are more difficult can be solved with storage engine plug-ins.

This presentation discusses how MySQL’s storage engines have recently made dramatic progress in large database manageability. I’ll be speaking Tuesday (4/24) 8:00 am in Lagoon D. Details can be found here. A complete list of MySQL talks can be found here.



Percona MySQL Conference and Expo Week in Review

April 18th, 2012

Thanks to everyone who came by our booth, saw Leif’s presentation on Read Optimization, or attended my Lightning Talk on OLTP and OLAP at the Percona MySQL Conference and Expo. It was an incredible week and a great place to launch TokuDB v6.0! A big thanks to Percona for a great event, to Pythian for a fantastic dinner, and to SkySQL for a worthwhile follow-on. We are also very grateful to Network World for giving us a product of the week award, and to Bloor Research for an insightful review of TokuDB v6.0.

Mr. Bill Gets Hammered by Big Data

For those who missed it, here is a copy of Leif’s presentation with a good photo from Percona. Thanks to Sheeri for her tweet as well. In addition, here is a copy of my Lightning Talk (in case you were too distracted by Mr. Bill). There were some great photos taken by Mark Lehmann (including the one shown above, and those in the “Scanner Wars”) as well as by Percona. Thanks to Erin, Sheeri, Amrith and Ernie for their tweets too!

I considered a detailed conference review, but others have already captured the event so well that there was little to add. In case you missed it, there are great write-ups by O’Reilly, Percona, Shlomi, and several others.

Thanks again to those who came by!

 

