Archive for the ‘mysql’ Category
Manage hierarchical data with MySQL stored procedures
Май 21st, 2012PlanetMySQL Voting: Vote UP / Vote DOWN
How having many tables affects MySQL memory usage?
Май 18th, 2012You could say: what could be the reason for having really big number of tables? Just design the application properly! It’s not always that easy. And this post isn’t really about arguing whether having many tables is good or not, it’s about what happens in terms of memory usage if you already reached that point.
Btw what do I mean by *many*? From my experience it’s tens of thousends or even millions rather than hundreds.
The inspiration for me to write this post was strong desire to try out the latest declared improvements in that area announced to be done in MySQL 5.6. _1
What I did was a very simple test where I loaded sql dump of databases and tables definitions only into different versions of MySQL. During the load time I was monitoring memory usage of mysqld process (RSS value from ps aux) as a function of tables number. Additionaly I compared disk usage in case you choose single innodb table space versus file per table, also in relation to number of tables.
Settings I used for test were all default and I chose lowest possible innodb buffer pool:
innodb_buffer_pool_size = 8Mexcept when I started to test Percona versions I realized that the lowest allowed value there was 32MB, this warning appeared in error log after the service start:
“[Warning] option ‘innodb-buffer-pool-size’: signed value 8388608 adjusted to 33554432″
So both Percona versions I tested had 32MB and all others had 8MB but I think it wasn’t that important in this particular test.
Additionally for Percona versions I used the
innodb_dict_size_limitvariable introduced already in 2009 _2.
The sample table set that I used has around 50 different tables with various number of fields and indexes.
Here are the results of creating more than a million tables:
I wonder what is the difference between the patch that Percona did few years ago (and which was available starting from 5.0.77-b13 of Percona Server) and latest fix available in MySQL 5.6.x branch. Any way It’s really nice to see it among all other great improvements that are happening in 5.6.x development line.
OK, now let’s see how the disk usage is affected in this simple test by just creating empty tables in two scenarios: single InnoDB ibdata file and when innodb_file_per_table option is set. In this case I observed no differences between various MySQL versions, so you can see only one here.
Related links:
http://www.mysqlperformanceblog.com/2010/05/06/how-much-memory-innodb-dictionary-can-take/
http://fromdual.com/how-mysql-behaves-with-many-schemata-tables-and-partitions
http://dev.mysql.com/doc/refman/5.6/en/innodb-performance.html#innodb-performance-table-cache
http://blogs.innodb.com/wp/2011/12/improving-innodb-memory-usage-continued/
http://ma.tt/2006/03/wordpress-and-lyceum/
References:
- http://blogs.innodb.com/wp/2011/12/improving-innodb-memory-usage/
- http://www.mysqlperformanceblog.com/2009/02/11/limiting-innodb-data-dictionary/
PlanetMySQL Voting: Vote UP / Vote DOWN
Upcoming conferences to learn more about MySQL Cluster & Replication
Май 18th, 2012There are a couple of conferences coming up where you can expect to learn about the latest developments in MySQL Cluster and MySQL Replication (as well as what else is happening in MySQL 5.6).
The first is the Oracle MySQL Innovation Day which is being held in Oracle HQ at Redwood Shores. This is an all-day event on 5th June – unfortunately I won’t be able to attend this one but there will be lots of great Cluster and replication sessions. If you can’t make it out to California then there will be a live Webcast. You can register here to attend in person or join the webcast.
The second is MySQL Connect – this runs the weekend before Oracle OpenWorld in San Francisco; it’s not until 29th September but it’s worth registering now to get the early bird pricing and save $500 (end 13th July). There are lots of great sessions lined up both from the MySQL experts within Oracle and users and community members.
PlanetMySQL Voting: Vote UP / Vote DOWN
Log Buffer #272, A Carnival of the Vanities for DBAs
Май 18th, 2012PlanetMySQL Voting: Vote UP / Vote DOWN
Log Buffer #272, A Carnival of the Vanities for DBAs
Май 18th, 2012PlanetMySQL Voting: Vote UP / Vote DOWN
Chain Copying to Multiple hosts
Май 18th, 2012This week I was given the task of repopulating our entire primary database cluster. This was due to an alter that had to be performed on our largest table. It was easiest to run it on one host and populate the dataset from that host everywhere.
I recalled a while back reading a blog post from Tumblr about how to chain a copy to multiple hosts using a combination of nc, tar, and pigz. I used this, with a few other things to greatly speed up our repopulation process. As I was repopulating production servers, I did a combination of raw data copy and xtrabackup streams across our servers, depending on the position in our replication setup.
For a normal straight copy, here’s what I did:
On the last host, configure netcat to listen and then pipe the output through pigz and tar to uncompress and untar. This needs to be run in the destination directory:
nc -l 1337 | pigz -d | tar -xvf -
On any hosts in the middle of the chain, you do the same thing with one extra step. Using a fifo to redirect the stream to the next host:
mkfifo copy_redirect
nc next_host_in_chain 1337 <copy_redirect &
nc -l 1337 | tee copy_redirect | pigz -d | tar -xvf -
And on the source host you actually make the stream. This is where I differed the most from what Tumblr had written. I added a progress bar using pv.
tar -c /data/mysql/ | pv --size $( du -sh /data/mysql/ | cut -f1 ) | pigz | nc first_host_in_chain 1337
To do this with an xtrabackup stream, the commands are similar. On each host, tar needs to add the “i” flag (to become “tar -xvfi -”). The progress bar here became slightly less accurate, but was still a good rough estimate of the progress. On the source host, the command became:
innobackupex --stream=tar /tmp/ --slave-info | pv --size $( du -sh /data/mysql/ | cut -f1 ) | pigz | nc first_host_in_chain 1337
I found that using this method, for a raw copy, I was able to achieve between 300 and 350 MB/sec copying large tables. Smaller tables averaged slower speeds. I didn’t do enough testing here to see where the bottleneck was. I can say that it was not network, cpu, or io. Our servers involved have 10 GBit network and FusionIO drives. Increasing the compression level may have helped add some throughput here as well. Copying a 1.4 TB Dataset to 3 destination servers took under 2 hours.
This is definitely a tool that I will be adding to my arsenal to use on a regular basis.
PlanetMySQL Voting: Vote UP / Vote DOWN
Meet the MySQL Experts Podcast: MySQL Replication Global Transaction Identifiers & HA Utilities
Май 17th, 2012In the latest episode
of our “Meet The MySQL Experts” podcast, Luis Soares,
Engineering Manager of MySQL Replication discusses the new Global Transaction
Identifiers (GTIDs) that are part of the latest MySQL 5.6 Development Release. We are also joined by Chuck Bell who
discusses how the new MySQL HA utilities use GTIDs to create a self-healing
replication topology.
In the podcast, we cover how GTIDs and the HA utilities are implemented, how they are
configured and considerations for their use.
You can also learn
more from Luis’ blog on GTIDs in MySQL 5.6 and Chuck’s blog on the HA utilities.
Of course, GTIDs are
just one of the major new features of MySQL replication. For a complete
overview, take a look at our DevZone article: MySQL 5.6 Replication - Enabling the Next Generation of Web & Cloud
Services.
You can try out MySQL
5.6 and GTIDs by downloading the Development Release (select Development Release tab)
Enjoy the GTID podcast and let us know what topics you would like covered in future podcasts! Also check out the library of Meet the MySQL Experts podcasts
PlanetMySQL Voting: Vote UP / Vote DOWN
Training in London next week
Май 16th, 2012I’m going to deliver MySQL Training next week (May 21-24) in London.
This is a rare opportunity as I do not personally deliver a lot of Training, especially outside of US. There are still some places left if you want to sign up.
You will also get a signed copy of High Performance MySQL 3rd edition as an attendee.
PlanetMySQL Voting: Vote UP / Vote DOWN
Benchmarking single-row insert performance on Amazon EC2
Май 16th, 2012I have been working for a customer benchmarking insert performance on Amazon EC2, and I have some interesting results that I wanted to share. I used a nice and effective tool iiBench which has been developed by Tokutek. Though the “1 billion row insert challenge” for which this tool was originally built is long over, but still the tool serves well for benchmark purposes.
OK, let’s start off with the configuration details.
Configuration
First of all let me describe the EC2 instance type that I used.
EC2 Configuration
I chose m2.4xlarge instance as that’s the instance type with highest memory available, and memory is what really really matters.
High-Memory Quadruple Extra Large Instance 68.4 GB of memory 26 EC2 Compute Units (8 virtual cores with 3.25 EC2 Compute Units each) 1690 GB of instance storage 64-bit platform I/O Performance: High API name: m2.4xlarge
As for the IO configuration I chose 8 x 200G EBS volumes in software RAID 10.
Now let’s come to the MySQL configuration.
MySQL Configuration
I used Percona Server 5.5.22-55 for the tests. Following is the configuration that I used:
## InnoDB options innodb_buffer_pool_size = 55G innodb_log_file_size = 1G innodb_log_files_in_group = 4 innodb_buffer_pool_instances = 4 innodb_adaptive_flushing = 1 innodb_adaptive_flushing_method = estimate innodb_flush_log_at_trx_commit = 2 innodb_flush_method = O_DIRECT innodb_max_dirty_pages_pct = 50 innodb_io_capacity = 800 innodb_read_io_threads = 8 innodb_write_io_threads = 4 innodb_file_per_table = 1 ## Disabling query cache query_cache_size = 0 query_cache_type = 0
You can see that the buffer pool is sized at 55G and I am using 4 buffer pool instances to reduce the contention caused by buffer pool mutexes. Another important configuration that I am using is that I am using “estimate” flushing method available only on Percona Server. The “estimate” method reduces the impact of traditional InnoDB log flushing, which can cause downward spikes in performance. Other then that, I have also disabled query cache to avoid contention caused by query cache on write heavy workload.
OK, so that was all about the configuration of the EC2 instance and MySQL.
Now as far as the benchmark itself is concerned, I made no code changes to iiBench, and used the version available here. But I changed the table to use range partitioning. I defined a partitioning scheme such that every partition would hold 100 million rows.
Table Structure
The table structure of the table with no secondary indexes is as follows:
CREATE TABLE `purchases_noindex` ( `transactionid` int(11) NOT NULL AUTO_INCREMENT, `dateandtime` datetime DEFAULT NULL, `cashregisterid` int(11) NOT NULL, `customerid` int(11) NOT NULL, `productid` int(11) NOT NULL, `price` float NOT NULL, PRIMARY KEY (`transactionid`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 /*!50100 PARTITION BY RANGE (transactionid) (PARTITION p0 VALUES LESS THAN (100000000) ENGINE = InnoDB, PARTITION p1 VALUES LESS THAN (200000000) ENGINE = InnoDB, PARTITION p2 VALUES LESS THAN (300000000) ENGINE = InnoDB, PARTITION p3 VALUES LESS THAN (400000000) ENGINE = InnoDB, PARTITION p4 VALUES LESS THAN (500000000) ENGINE = InnoDB, PARTITION p5 VALUES LESS THAN (600000000) ENGINE = InnoDB, PARTITION p6 VALUES LESS THAN (700000000) ENGINE = InnoDB, PARTITION p7 VALUES LESS THAN (800000000) ENGINE = InnoDB, PARTITION p8 VALUES LESS THAN (900000000) ENGINE = InnoDB, PARTITION p9 VALUES LESS THAN (1000000000) ENGINE = InnoDB, PARTITION p10 VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
While the structure of the table with secondary indexes is as follows:
CREATE TABLE `purchases_index` ( `transactionid` int(11) NOT NULL AUTO_INCREMENT, `dateandtime` datetime DEFAULT NULL, `cashregisterid` int(11) NOT NULL, `customerid` int(11) NOT NULL, `productid` int(11) NOT NULL, `price` float NOT NULL, PRIMARY KEY (`transactionid`), KEY `marketsegment` (`price`,`customerid`), KEY `registersegment` (`cashregisterid`,`price`,`customerid`), KEY `pdc` (`price`,`dateandtime`,`customerid`) ) ENGINE=InnoDB AUTO_INCREMENT=11073789 DEFAULT CHARSET=latin1 /*!50100 PARTITION BY RANGE (transactionid) (PARTITION p0 VALUES LESS THAN (100000000) ENGINE = InnoDB, PARTITION p1 VALUES LESS THAN (200000000) ENGINE = InnoDB, PARTITION p2 VALUES LESS THAN (300000000) ENGINE = InnoDB, PARTITION p3 VALUES LESS THAN (400000000) ENGINE = InnoDB, PARTITION p4 VALUES LESS THAN (500000000) ENGINE = InnoDB, PARTITION p5 VALUES LESS THAN (600000000) ENGINE = InnoDB, PARTITION p6 VALUES LESS THAN (700000000) ENGINE = InnoDB, PARTITION p7 VALUES LESS THAN (800000000) ENGINE = InnoDB, PARTITION p8 VALUES LESS THAN (900000000) ENGINE = InnoDB, PARTITION p9 VALUES LESS THAN (1000000000) ENGINE = InnoDB, PARTITION p10 VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
Also, I ran 5 instances of iiBench simultaneously to simulate 5 concurrent connections writing to the table, with each instance of iiBench writing 200 million single row inserts, for a total of 1 billion rows. I ran the test both with the table purchases_noindex which has no secondary index and only a primary index, and against the table purchases_index which has 3 secondary indexes. Another thing I would like to share is that, the size of the table without secondary indexes is 56G while the size of the table with secondary indexes is 181G.
Now let’s come down to the interesting part.
Results
With the table purchases_noindex, that has no secondary indexes, I was able to achieve an avg. insert rate of ~25k INSERTs Per Second, while with the table purchases_index, the avg. insert rate reduced to ~9k INSERTs Per Second. Let’s take a look at the graphs have a better view of the whole picture.
Note, in the above graph, we have “millions of rows” on the x-axis and “INSERTs Per Second” on the y-axis.
The reason why I have chosen to show “millions of rows” on the x-axis so that we can see the impact of growth in data-set on the insert rate.
We can see that adding the secondary indexes to the table has decreased the insert rate by 3x, and its not even consistent. While with the table having no secondary indexes, you can see that the insert rate is pretty much constant remaining between ~25k to ~26k INSERTs Per Second. But on the other hand, with the table having secondary indexes, we can see that there are regular spikes in the insert rate, and the variation in the rate can be classified as large, because it varies between ~6.5k to ~12.5k INSERTs per second, with noticeable spikes after every 100 million rows inserted.
I noticed that the insert rate drop was mainly caused by IO pressure caused by increase in flushing and checkpointing activity. This caused spikes in write activity to the point that the insert rate was decreased.
Conclusion
As we all now there are pros and cons to using secondary indexes. While secondary indexes cause read performance to improve, but they have an impact on the write performance. Well most of the apps rely on read performance and hence having secondary indexes is an obvious choice. But for those applications that are write mostly or that rely a lot on write performance, reducing the no. of secondary indexes or even going away with secondary indexes causes a write throughput increase of 2x to 3x. In this particular case, since I was mostly concerned with write performance, so I went ahead to choose a table structure with no secondary indexes. Other important things to consider when you are concerned with write performance is using partitioning to reduce the size of the B+tree, having multiple buffer pool instances to reduce contention problems caused by buffer pool mutexes, using “estimate” checkpoint method to reduce chances of log flush storms and disabling the query cache.
PlanetMySQL Voting: Vote UP / Vote DOWN
SwRI Chooses TokuDB to Tackle Machine Data for an 800M+ Record Database
Май 16th, 2012Tackling machine data on the ground to ensure successful operations for NASA in space
Issues addressed:
- Scaling MySQL to multi-terabytes
- Insertion rates as InnoDB hit a performance wall
- Schema flexibility to handle an evolving data model
The Company: Southwest Research Institute (SwRI) is an independent, nonprofit applied research and development organization. The staff of more than 3,000 specializes in the creation and transfer of technology in engineering and the physical sciences. Currently, SwRI is part of an international team working on the NASA Magnetospheric Multiscale (MMS) mission. MMS is a Solar Terrestrial Probes mission comprising four identically instrumented spacecraft that will use Earth’s magnetosphere as a laboratory to study the microphysics of three fundamental plasma processes: magnetic reconnection, energetic particle acceleration, and turbulence.
The Challenge: SwRI is responsible for archiving an enormous quantity of data generated by the Hot Plasma Composition Analyzer (HPCA). The device is used to count hydrogen, helium, and oxygen ions in space at different energy levels. These instruments require extensive calibration data and each one is a customized, high precision device that is built, tested, and integrated by hand. SwRI must capture and store all the test and calibration data during the 2-3 week bursts activity that are required for each of the 4 devices.
“During each of these calibration runs, there are several data sources flowing into the server, each one leading to an index in the database,” said Greg Dunn, a Senior Research Engineer at SWRI. “Each packet that arrives gets a timestamp, message type, file name and location associated with it. A second process goes through that data and parses it out – information such as voltage, temperature, pressure, current, ion energy, particle counts, and instrument health must be inserted into the database for every record. This can load the database with up to 400 or 500 inserts per second.”
“Being able to monitor the performance of the instrument and judge the success of the tests and calibrations in near real time is critical to the project,” noted Dunn. “There are limited windows to do testing cycles and make adjustments for any issues that arise. Any significant slip in the testing could cost tens of thousands of dollars and jeopardize the timing of the satellite launch.”
“We started seeing red flags with InnoDB early in the ramp-up phase of the project, as our initial data set hit 400GB,” said Dunn. “Size was the first issue. Each test run was generating around 94 million inserts or around 90GB of data, quickly exceeding the capacity allocated for the program. In addition, as our database grew to 800M records, we saw InnoDB insertion performance drop off to a trickle. Even with modest data streams at 100 records per second, InnoDB was topping out at 45 insertions per second. Being able to monitor these crucial calibration activities in a timely fashion and in a cost effective manner was at risk.”
To keep up with the workload and data set, SwRI considered several options, but they failed to meet program performance and price goals. These included:
Partitioning / Separate Databases – “We considered partitioning, but this can be a challenge to set up and it introduces additional complexity,” said Dunn. “We also looked at putting each calibration into its own database, but that would have made it much more difficult to correlate across different databases.”
Additional RAM – “Increasing the available RAM from 12 GB up to 100 GB was not enough by itself,” claimed Dunn. “We briefly considered keeping everything in RAM, but that was not a realistic or efficient way to address a data set size that was promising to grow to several terabytes by the end of the program.”
The Solution: Once TokuDB was installed, SwRI’s big data management headache quickly subsided. “The impact to our required storage was dramatic,” noted Dunn. “We benefited from over 9x compression. In our comparison benchmarks, we went from 452GB with InnoDB to 49GB with TokuDB.”
There was also a dramatic improvement in performance. “Suddenly, we no longer had to struggle to keep up with hundreds of insertions per second,” stated Dunn. “Our research staff could immediately see whether or not the experiment was running correctly and whether the test chamber was being used effectively. We didn’t have to worry that insufficient data analysis horsepower might lead to downstream schedule delays.”
The Benefits:
Cost Savings: “The hardware savings were impressive,” noted Dunn. “With InnoDB, going to larger servers, adding 100s of GBs of additional RAM along with many additional drives would have easily cost $20,000 or more, and still would not have addressed all our needs. TokuDB was by far both a cheaper and simpler solution.”
Hot Column Addition: “As we continue to build out the system and retool the experiments, flexibility in schema remains important,” stated Dunn. “TokuDB’s capability to quickly add columns of data is a good match for our environment, where our facility is still evolving and sometimes has new sensors or monitors installed that need to be added to existing large tables.”
Fast Loader: “The open source toolset that Tokutek designed to parallelize the loading of the database was very helpful,” said Dunn. “We were able to bring down the load of the database from MySQL dump backup from 30 hours to 7 hours.”
PlanetMySQL Voting: Vote UP / Vote DOWN



