More Oracle ACEs for MySQL

Январь 27th, 2012
Oracle ACEs for MySQL

As Keith announced today, there are two more Oracle ACE Directors for MySQL expertise. In case you are wondering how an ACE Director compares to a regular ACE, here is an overview and some FAQ.

PlanetMySQL Voting: Vote UP / Vote DOWN

Statistics counters for Multi Range Read

Январь 27th, 2012

MariaDB 5.3 has now three statistics counters for Multi Range Read optimization:

MariaDB [test]> show status like 'Handler_mrr%';
+-------------------------------+-------+
| Variable_name                 | Value |
+-------------------------------+-------+
| Handler_mrr_extra_key_sorts   | 0     |
| Handler_mrr_extra_rowid_sorts | 0     |
| Handler_mrr_init              | 0     |
+-------------------------------+-------+
3 rows in set (0.08 sec)

I’ve just added the first two. The reason for having them is as follows: the point of MRR is to provide speedup over regular execution by doing reads in disk order. In order to make reads in disk order, MRR needs buffer space where it accumulates and sorts read requests. If there are too many read requests to fit into the buffer, MRR will make multiple accumulate-sort-read passes.

Doing multiple passes allows MRR to operate when having limited buffer space, but the speedup will be not as great as with one big disk-ordered read sweep. The purpose of Handler_mrr_extra_key_sorts and Handler_mrr_extra_rowid_sorts is to count the additional accumulate-sort-read passes, so you’re able to tell if you will benefit from increasing your @@mrr_buffer_size and @@join_buffer_size settings.

There are two counters, _extra_key_sorts and _extra_rowid_sorts, because MariaDB has two places where it will do sorting:

  1. sort rowids before reading table records
  2. sort key values before making a bunch of index lookups

MRR code will try to distribute buffer space between them in an optimal way. The decision is a guess based on the available statistics, and can be wrong. Having both counters will allow us to check how the guess will work in practice.

p.s. if you could not make any sense of anything above, try reading Multi Range Read page in our knowlegebase. We have just put there a hopefully-readable explanation of what MRR is.


PlanetMySQL Voting: Vote UP / Vote DOWN

Oracle MySQL Developer Days — Germany & France

Январь 27th, 2012
MySQL is coming to Frankfurt am Main in Germany for a MySQL Developer Day, Febuary 9th.

This developer day has a wide range of sessions and two great resources:  Johannes Schlüter, MySQL Software Developer (great PHP resource as well) and Giuseppe Maxia, Oracle ACE Director for MySQL.

Take advantage of this chance to learn more about MySQL !  More information on the event in Germany can be found here.

Paris France will also host a Developer day in March 21st. More information on this will becoming soon.

PlanetMySQL Voting: Vote UP / Vote DOWN

Using jemalloc to fix a performance problem

Январь 27th, 2012

We are in the process of upgrading our RPM build toolchain from gcc 4.1 and glibc 2.5 to gcc 4.6 and glibc 2.13. The initial build was OK as there were not many new compiler warnings and regression tests passed. I then ran sysbench to check for obvious performance problems and there was a problem. The graph shows that performance was much worse at high concurrency for glibc 2.13.

 

The performance test is simple. I ran a modified version of sysbench and setup a database with 8 InnoDB tables with 2M rows in each table. The InnoDB buffer pool caches all of the tables. Then 8 sysbench processes are started, each process uses between 8 and 128 threads and each process uses only one of the 8 tables. The test transaction is a query that fetches one row by primary key.

 

The test is done by starting mysqld, warming up the buffer pool, running 8 sysbench processes for 8 threads per-process, then for 16 threads per-process, ..., then for 128 threads per-process. The mysqld process is not restarted before each iteration. Many statistics are collected after each iteration. The tests were run on 2-socket hosts. One used Linux 2.6.18-194 and has 4 cores per socket. The other used Linux 2.6.38 and has 6 cores per socket.

 

I did not know whether the problem was in MySQL source, gcc, glibc or the combination of the older Linux kernel and newer glibc. I looked at SHOW MUTEX STATUS and there were 135 times more OS Waits for kernel_mutex with glibc 2.13. Then I looked at SHOW ENGINE INNODB STATUS and there was much more mutex activity for glibc 2.13:

Mutex spin waits 120293328, rounds 2735707989, OS waits 63755681

Spin rounds per wait: 22.74 mutex

 

Compare this to the activity for glibc 2.5 listed below. With glibc 2.13 there were 3X more calls to the mutex spin-wait function, and each call did much more spinning (22.74 versus 2.37). Finally there were many more times when a thread could not get the mutex while doing the spin-wait and then waited on a condition variable (~64M versus ~478k).

Mutex spin waits 43606684, rounds 103394151, OS waits 477980

Spin rounds per wait: 2.37 mutex

 

The output from vmstat was also interesting for the high-concurrency tests when performance was bad. The context switch rate and user CPU rate were much higher with glibc 2.13. From the procs column the sum of the r and b columns were much lower with glibc 2.13.  The sum was similar to the number of client connections (and mysqld threads) for glibc 2.5. But the sum was very small, about 50, for glibc 2.3. I joked that someone implemented an M:N thread library in glibc. But I am not an expert on vmstat columns.

 

The initial test host used Linux 2.6.18, the kernel for CentOS 5.2. I repeated the tests on a host with Linux 2.6.38 and they were the same. So at this point I can rule out the interation between glibc and an old kernel. At this point I also began to get a lot of help. Someone else looked at pre-processed source and disassembly for mutex_spin_wait and its helper functions. But the code looked OK. Then they confirmed that the busy-wait loop ran for the same time with both versions of the gcc compiler (4.1 and 4.6). A long time ago there was a busy-wait loop that didn't wait.

 

Given that performance was similar up to 512 concurrent clients my guess was that glibc or the new pthread library were the problem and that the new version of gcc was OK. I was not willing to rule out MySQL source and found a bug where it used invalid values to set thread priorities, but fixing this bug did not fix the performance problem.

 

I ran a few more tests using the perf tool. This is like oprofile, but much better and I think one of our Linux kernel gurus (Arun) is working on it. The output from that showed much more time was spent in mutex related functions. I also received advice from another systems guru (Paul) that I should repeat the tests using jemalloc. I ignored this advice at first because I didn't think it would help (insert laugh track here).

 

I repeated the test with thread_cache_size=2000. It had been set to 0 prior to this. This fixed the performance problem although this was a workaround rather than a fix. Why would calls to pthread_create and pthread_exit have such an impact on performance? To confirm that the problem was thread create/destroy I ran the test for 512 concurrent clients in a loop with thread_cache_size=0 so that each iteration of the test created and destroyed 512 threads. Performance dropped on the third or fourth iteration. But all that I have now is a workaround not a fix.

 

Paul repeated his request for me to run a test with jemalloc. I consented to do this as long as it was easy for me to setup. I didn't want to build an RPM linked with jemalloc just yet. So I modified mysqld_safe to use LD_PRELOAD with jemalloc when starting mysqld. This fixed the problem as displayed in this graph. The four lines are described below. Performance is only bad when glibc 2.13 is used without one of jemalloc or thread_cache_size=2000.

  • jemalloc_nocache - uses glibc 2.13, jemalloc and thread_cache_size=0
  • glibc213_cache - uses glibc 2.13 and thread_cache_size=2000
  • glibc213_nocache - uses glibc 2.13 and thread_cache_size=0
  • glibc25_nocache - uses glibc 2.5 and thread_cache_size=0

Have you experienced this with glibc 2.13? Ubuntu and Debian use eglibc rather than glibc and I don't know whether that makes a difference. Paul knew that jemalloc might fix this problem because he fixed a similar problem elsewhere at work. Eventually we will write a reproduction case in a small amount of C, but for now my summary of his description is that the thread stacks changed from being allocated with MAP_32BIT in the old glibc to now using the 64-bit address space. This conflicts with malloc and there is more fragmentation in the 64-bit mmap space. The issue is fixed by jemalloc because it is better at avoiding fragmentation.

 

It is likely that the problem can also be fixed by allocating thread stacks with malloc before calling pthread_create. I don't know yet whether MySQL source should be changed to do that. I filed bug 64150 for this.


PlanetMySQL Voting: Vote UP / Vote DOWN

Well deserved !

Январь 27th, 2012
Giuseppe Maxia and Sarah Novotny both have now been awarded Oracle ACE Directors for MySQL !

They both are well respected in the MySQL community and have achieved great things so far. We look forward to seeing what they will do next.

PlanetMySQL Voting: Vote UP / Vote DOWN

Transparent Encryption for NDB nodes (MySQL Cluster) – a First Look

Январь 27th, 2012

MySQL Cluster usage has certainly continued to spread and recently accelerate well beyond its initial telco vertical roots into Healthcare, Financial Services, SaaS and more. With those additions it certainly becomes desirable for many to provide transparent encryption on the NDB nodes where the data, logs, and checkpoints that write to disk. I’ll not go into all those reasons in this blog, but certainly there are plenty, these white papers provide  more details, especially if you are running within hosted, managed, or cloud environments platforms.

The solution for ndb in a nutshell was straight forward:
1 Set up Gazzang ezNcrypt Flex Platform
2 Stop the ndb process prior to encrypting the ndb_data directory
3 Encrypt the ndb_data directory
ezncrypt -e @ndbdata /home/mysql/my_cluster/ndb_data
4 Add a Flex ACL Rules granting ndbd access to the encryption keys.
ezncrypt-access-control -a “ALLOW @ndbdata * /home/mysql/mysql-cluster-gpl-7.1.18-linux-i686-glibc23/bin/ndbd”
5 Restart ndb

Note: if you setup a single node test environment or if for some some reason want to run it for you will also need to add a rule for ndb_mgmd then also add -

ezncrypt-access-control -a “ALLOW @ndbdata * /home/mysql/mysql-cluster-gpl-7.1.18-linux-i686-glibc23/bin/ndb_mgmd”

 

Certainly there are many more things you can do to protect MySQL Cluster data on Linux – and I will follow through with those details or details on usage in specific environments including clouds, but this is a good start and shows how easy this is to accomplish, and Gazzang adds key management, process, access, monitoring, and many other benefits aside from the encryption itself. For more ideas around that see this EMA paper

With the release of of our 2.2.2 product coming in February of 2012 you will see that we have added ndb to our supported engines list to MySQL. Gazzangs platform is simple and easy to install and as you can see here. If you are interested just Try it out.



PlanetMySQL Voting: Vote UP / Vote DOWN

Transparent Encryption for NDB nodes (MySQL Cluster) – a First Look

Январь 27th, 2012

MySQL Cluster usage has certainly continued to spread and recently accelerate well beyond its initial telco vertical roots into Healthcare, Financial Services, SaaS and more. With those additions it certainly becomes desirable for many to provide transparent encryption on the NDB nodes where the data, logs, and checkpoints that write to disk. I’ll not go into all those reasons in this blog, but certainly there are plenty, these white papers provide  more details, especially if you are running within hosted, managed, or cloud environments platforms.

The solution for ndb in a nutshell was straight forward:
1 Set up Gazzang ezNcrypt Flex Platform
2 Stop the ndb process prior to encrypting the ndb_data directory
3 Encrypt the ndb_data directory
ezncrypt -e @ndbdata /home/mysql/my_cluster/ndb_data
4 Add a Flex ACL Rules granting ndbd access to the encryption keys.
ezncrypt-access-control -a “ALLOW @ndbdata * /home/mysql/mysql-cluster-gpl-7.1.18-linux-i686-glibc23/bin/ndbd”
5 Restart ndb

Note: if you setup a single node test environment or if for some some reason want to run it for you will also need to add a rule for ndb_mgmd then also add -

ezncrypt-access-control -a “ALLOW @ndbdata * /home/mysql/mysql-cluster-gpl-7.1.18-linux-i686-glibc23/bin/ndb_mgmd”

 

Certainly there are many more things you can do to protect MySQL Cluster data on Linux – and I will follow through with those details or details on usage in specific environments including clouds, but this is a good start and shows how easy this is to accomplish, and Gazzang adds key management, process, access, monitoring, and many other benefits aside from the encryption itself. For more ideas around that see this EMA paper

With the release of of our 2.2.2 product coming in February of 2012 you will see that we have added ndb to our supported engines list to MySQL. Gazzangs platform is simple and easy to install and as you can see here. If you are interested just Try it out.



PlanetMySQL Voting: Vote UP / Vote DOWN

Chance to give your views on MySQL Cluster 7.2 content

Январь 27th, 2012

MySQL Cluster 7.2 Quick PollThe MySQL Cluster 7.2 Development Milestone Release has been out for a while now and we’d love to hear which are your favourite features – it takes just a few seconds to complete the Quick-Poll. It should literally take seconds to complete and will provide us with valuable feedback on the kind of features are most useful – so that we can build more of them in the future!


PlanetMySQL Voting: Vote UP / Vote DOWN

Chance to give your views on MySQL Cluster 7.2 content

Январь 27th, 2012

MySQL Cluster 7.2 Quick PollThe MySQL Cluster 7.2 Development Milestone Release has been out for a while now and we’d love to hear which are your favourite features – it takes just a few seconds to complete the Quick-Poll. It should literally take seconds to complete and will provide us with valuable feedback on the kind of features are most useful – so that we can build more of them in the future!


PlanetMySQL Voting: Vote UP / Vote DOWN

Last chance to vote for MySQL+ community awards 2011, VOTE NOW !

Январь 27th, 2012

You have until Jan. 31 to vote for your favorites tools and services, so, vote now !

Thanks again all folks for your keen interest and your involvement, it was a big surprise to see so many contributors

Follow this link to vote : http://www.mysqlplus.net/2012/01/05/vote-mysqlplus-community-awards-2011/

And come february the 1st for the final results…

 

 


PlanetMySQL Voting: Vote UP / Vote DOWN