Archive for the ‘gdb’ Category

Using Aspersa to capture diagnostic data

Май 7th, 2010

I frequently encounter MySQL servers with intermittent problems that don’t happen when I’m watching the server. Gathering good diagnostic data when the problem happens is a must. Aspersa includes two utilities to make this easier.

The first is called ’stalk’. It would be called ‘watch’ but that’s already a name of a standard Unix utility. It simply watches for a condition to happen and fires off the second utility.

This second utility does most of the work. It is called ‘collect’ and by default, it gathers stats on a number of things for 30 seconds. It names these statistics according to the time it was started, and places them into a directory for analysis.

Here’s a sample of how to use the tools. In summary: get them and make them executable, then configure them; then start a screen session and run the ’stalk’ utility as root. Go do something else and come back later to check! A code sample follows.

$ wget http://aspersa.googlecode.com/svn/trunk/stalk
$ wget http://aspersa.googlecode.com/svn/trunk/collect
$ chmod +x stalk collect
$ mkdir -p ~/bin
$ mv stalk collect ~/bin
$ vim ~/bin/stalk # Configure it
$ screen -S stalking.the.server
$ sudo ~/bin/stalk

Inside the ’stalk’ tool, you’ll see a few things you can configure. By default, it tries to connect to mysqld via mysqladmin and see how many threads are connected to the server. If this increases over 100 (a sample number you should almost certainly change), or if it can’t connect to mysqld, then it fires off the ‘collect’ tool, or whatever else you configure it to execute.

The ‘collect’ tool, by default, captures a variety of things including disk usage, cpu usage, internal status from mysqld, and even oprofile (which it saves using the standard oprofile save feature; you must use opreport to get your report later). There is also a commented-out section to run GDB if you want stack traces. This is not enabled by default because that’ll freeze mysqld briefly. Usually this is OK if mysqld is already unresponsive during the problem!

Related posts:

  1. Aspersa, a new opensource toolkit Some of th
  2. How to capture debugging information with mk-loadavg Maatkit
  3. What data types does your innovative storage engine NOT support? I’ve

Related posts brought to you by Yet Another Related Posts Plugin.


PlanetMySQL Voting: Vote UP / Vote DOWN

A growing trend: InnoDB mutex contention

Март 4th, 2010

I’ve been noticing an undeniable trend in my consulting engagements in the last year or so, and when I vocalized this today, heads nodded all around me. Everyone sees a growth in the number of cases where otherwise well-optimized systems are artificially limited by InnoDB contention problems.

A year ago, I simply wasn’t seeing the need for analysis of GDB backtraces en masse. These days, I’m writing custom tools to gather and analyze backtraces. A year ago, I simply looked at the SEMAPHORE section of SHOW INNODB STATUS. These days I’m writing custom tools to aggregate and reformat that data so I can interpret it more easily. And I’m actually seeing cases of this type of problem multiple times every week. I remember the first time I ran into a server that was literally optimized to the limit, but struggling under the load. It was something new for me, not that long ago. Oh, I’d seen it before, plenty, but was always able to point out where something could be improved without changing InnoDB itself. Now it’s commonplace: schemas are fine — check. Queries are all well-indexed — check. Everything else — check. InnoDB is bottlenecked and absolutely nothing can be improved — check.

Part of the difference is the rapidly improving hardware. It’s getting hard to buy a server with fewer than 8 or even 16 cores, and 16GB of RAM feels like something I’d install in a wristwatch. But I also suspect that if I’d been characterizing the workload of servers over time in a way that was easy to compare, I’d see a clear trend towards bigger data and more queries per second. We’re just pushing MySQL + InnoDB harder today than we ever have before.

What can be done? Well, InnoDB needs to be improved, that’s all. Oracle, Percona, Google, Facebook and others are working on it, and in many cases these efforts have yielded dramatic results. But there is still much room for improvement.

Related posts:

  1. Hindsight on a scalable replacement for InnoDB A while ag
  2. What do you know about Oracle’s InnoDB+ storage engine? That
  3. Xtrabackup is for InnoDB tables too, not just XtraDB Just thoug

Related posts brought to you by Yet Another Related Posts Plugin.


PlanetMySQL Voting: Vote UP / Vote DOWN

Read ahead…

Январь 2nd, 2010

Mark wrote about how to find situations where InnoDB read-ahead is a bottleneck. What he didn’t disclose, though, is his trick to disable read-ahead without restart or recompile of MySQL. See, there’s no internal “disable read ahead knob”. But there is…

buf_read_ahead_random(...){ ...
       if (srv_startup_is_before_trx_rollback_phase) {
                /* No read-ahead to avoid thread deadlocks */
                return(0);
        }

This variable is tested at two functions – buf_read_ahead_linear() and buf_read_ahead_linear() and nowhere else. So yeah, “server startup is before transaction rollback phase” is another way of saying “don’t do read ahead, please please”.

gdb -ex "set  srv_startup_is_before_trx_rollback_phase=1" \
    --batch -p $(pidof mysqld)

And many servers bottlenecked on this became much much much faster (and 2000 concurrent threads running dropped to 10). Of course, this is most visible in high-latency-high-throughput I/O situations, but we’re hitting this contention spot on local disk setups too.

Don’t forget to have the fix if gdb decides to be nasty and locks up your server :)


PlanetMySQL Voting: Vote UP / Vote DOWN

Read ahead…

Январь 2nd, 2010

Mark wrote about how to find situations where InnoDB read-ahead is a bottleneck. What he didn’t disclose, though, is his trick to disable read-ahead without restart or recompile of MySQL. See, there’s no internal “disable read ahead knob”. But there is…

buf_read_ahead_random(...){ ...
       if (srv_startup_is_before_trx_rollback_phase) {
                /* No read-ahead to avoid thread deadlocks */
                return(0);
        }

This variable is tested at two functions – buf_read_ahead_linear() and buf_read_ahead_linear() and nowhere else. So yeah, “server startup is before transaction rollback phase” is another way of saying “don’t do read ahead, please please”.

gdb -ex "set  srv_startup_is_before_trx_rollback_phase=1" \
    --batch -p $(pidof mysqld)

And many servers bottlenecked on this became much much much faster (and 2000 concurrent threads running dropped to 10). Of course, this is most visible in high-latency-high-throughput I/O situations, but we’re hitting this contention spot on local disk setups too.

Don’t forget to have the fix if gdb decides to be nasty and locks up your server :)


PlanetMySQL Voting: Vote UP / Vote DOWN

GDB 7!

Октябрь 9th, 2009

I wasn’t prepared for this. After spending months playing with GDB development trees I somehow entirely missed that 7.0 release is getting close, and took me more than an hour to spot it.

My favorite features are python scripting and non-stop debugging. I was toying around with python scripting for a while, and was planning to make backtraces make sense. Having hands that open means that one can see PHP backtraces, when gdb’ing apache, see table names and states when MySQL thread access handler interfaces, or remote IPs and users, when it is writing to network. Process inspection can simply rock, if right tools are created using these new capabilities, and I’m way too excited when I think about those. “Always have debugging symbols” gets way more meaning now.

Another issue I’ve been trying to resolve lately is avoiding long locking periods for running processes (directly attaching to process can freeze its work for a second or so, which isn’t that tolerable in production environments). GDB is getting closer to the async debugging capabilities – where one can run a debugger without actually stopping anything.

So, congratulations GDB team, now it is job for us to find all the uses of the tool. It has been invaluable so far, but this is much much more.


PlanetMySQL Voting: Vote UP / Vote DOWN

How GDB helped me fix a Drizzle Bug

Сентябрь 3rd, 2009

The other day I found a nice surprise on my inbox. Jay Pipes asked me if I'd like to try fixing a small bug on Drizzle. It looked pretty simple, and the bug report included a big part of the fix. I accepted without a doubt.

I decided to first change trans_prealloc_size from uint32_t to uint64_t. That was done on drizzled/session.h.
Then, I went to drizzle/set_var.cc and changed sys_trans_prealloc_size from sys_var_session_uint32_t to sys_var_session_uint64_t (and removed the two extra parameters).

At first, that looked like it was everything I needed to do. I compiled drizzle, executed the queries that were included on the bug report, and that almost worked!

set session transaction_prealloc_size=1024*1024*1024*4;
set session transaction_prealloc_size=1024*1024*1024*5;
and
set session transaction_prealloc_size=1024*1024*1024*6; were all being truncated to "4294966272"

The pending problem was that I no longer received a warning telling me that those values had been truncated.

I looked again at the code, used gdb to step through the code, but I just couldn't tell what the problem was.
I was then forced to do what I do when I'm stuck on a computer problem, I had to take a break :).

... And that was a great idea, because while away from the computer, I thought of looking at sys_var_session_uint64_t::update, and compare it to sys_var_session_uint32_t::update.

It turned out that sys_var_session_uint64_t::update was missing a call to throw_bounds_warning(), no wonder I wasn't getting the truncation warnings.

It was time to enable the test suite that targeted this bug, run this particular test, commit, push .... and I suddenly thought that I should also run the regular tests, you know, make test. And I was glad I did. There were about 7 test cases that failed.

The firs thought was that these changes I just made uncovered some mysterious bugs on Drizzle. But after looking closely at the test cases, I noticed that there was something else going on.

If I executed this query: set group_concat_max_len = 4;, the session variable group_concat_max_len ended up with a value of 4294967295.

It was time for some serious GDB'ing. There are a few articles here, here and here that explain how to use gdb and Drizzle, but I start it a little different.

Starting GDB and Drizzle

On the terminal, I run:
[wizard]$ gdb --args /Applications/drizzle/sbin/drizzled --port=3306 --basedir=/Applications/drizzle --datadir=/Applications/drizzle/var

After a few seconds, I am at the (gdb) prompt. This is the time to set breakpoints, which would help me find this little issue.
I tried doing this:
(gdb)break drizzled/set_var.cc:744
but gdb told me: "No source file named drizzled/set_var.cc", I had this same problem trying to use gdb on the MySQL Proxy, and after trying different things, I found the correct way to specify the file name:
(gdb)break set_var.cc:744

*Note: when debugging plugins, you specify the filenames like this:
(gdb)break plugin/logging_gearman/logging_gearman.cc:150

(include the plugin folder)

That one breakpoint was all I needed, I then started drizzle by typing:
(gdb)run
After pressing enter, you see many lines, but the last few are (similar to)

. done
InnoDB: The InnoDB memory heap is disabled
InnoDB: Mutexes and rw_locks use GCC atomic builtins.
090827 23:45:16 InnoDB: highest supported file format is Barracuda.
090827 23:45:16 InnoDB Plugin 1.0.3 started; log sequence number 46439
Listening on :::3306
Listening on 0.0.0.0:3306
/Applications/mysql/enterprise/drizzle/sbin/drizzled: ready for connections.
Version: '2009.08.1124' Source distribution (bug-fixes)

Drizzle was ready for my tests. I went to a new terminal and executed

set group_concat_max_len = 4;
The drizzle prompt hanged there (waiting on the breakpoint I set earlier)

This is the code near the breakpoint

bool sys_var_session_uint64_t::update(Session *session, set_var *var)
{
uint64_t tmp= var->save_result.uint64_t_value;

if (tmp > max_system_variables.*offset) // <-- this was line 744
  throw_bounds_warning(session, true, true, getName(), (int64_t) tmp);
  tmp= max_system_variables.*offset;
if (option_limits)
  tmp= fix_unsigned(session, tmp, option_limits);
if (var->type == OPT_GLOBAL)
{
/* Lock is needed to make things safe on 32 bit systems */
pthread_mutex_lock(&LOCK_global_system_variables);
global_system_variables.*offset= (uint64_t) tmp;
pthread_mutex_unlock(&LOCK_global_system_variables);
}
else
session->variables.*offset= (uint64_t) tmp;
return 0;
}


Back on the GDB terminal, I saw:

Breakpoint 1, sys_var_session_uint64_t::update (this=0x4dec60, session=0x1044e00, var=0x10470c8) at set_var.cc:744
744 if (tmp > max_system_variables.*offset)

It was time to look at the value of tmp, so I typed
(gdb) print tmp
$4 = 4

So far, so good, I wanted to see 4 as the result, let's step through the code, and see when tmp gets a much higher value.
(gdb) step
746 tmp= max_system_variables.*offset;

Check the value of tmp


(gdb) print tmp
$5 = 4
(gdb) step
748 if (option_limits)
(gdb) print tmp
$6 = 4294967295

There it was, whatever happened between line 746 and 748 is causing the problem. By now you may have already spotted the issue, but I just didn't see it. I stared at that piece of code for about a minute, and then I realized what the problem was.
The if statement on line 744 was a single line statement, before I added the call to throw_bounds_warning(). Because I am so used to having brackets, even for single line if statement, I forgot to add a pair of {}.

*Note: For completeness, once you are done stepping through the code, you can type
(gdb)cont
and the program will continue to run as normal. On my case, I could then run on the drizzle terminal:


drizzle> show variables like "group_concat_max_len";
+----------------------+------------+
| Variable_name | Value |
+----------------------+------------+
| group_concat_max_len | 4294967295 |
+----------------------+------------+
1 row in set (0.01 sec)


Conclusion.
If you don't have another pair of eyes to look over your code, you could ask gdb for some help :)


PlanetMySQL Voting: Vote UP / Vote DOWN

A script snippet for aggregating GDB backtraces

Август 31st, 2009

A short time ago in a galaxy nearby, Domas Mituzas wrote about contention profiling with GDB stack traces. Mark Callaghan found the technique useful, and contributed an awk script (in the comments) to aggregate stack traces and identify which things are blocking most threads. I’ve used it myself a time or five. But I’ve found myself wanting it to be fancier, for various reasons. So I wrote a little utility that can aggregate and pretty-print backtraces. It can handle unresolved symbols, and aggregate by only the first N lines of the stack trace. Here’s an example of a mysqld instance that’s really, really frozen up:

bt-aggregate -4 samples/backtrace.txt | head -n12
2396 threads with the following stack trace:
        #0  0x00000035e7c0a4b6 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
        #1  0x00000000005f2bd8 in open_table ()
        #2  0x00000000005f3fb4 in open_tables ()
        #3  0x00000000005f4247 in open_and_lock_tables_derived ()

4 threads with the following stack trace:
        #0  0x00000035e7c0a4b6 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
        #1  0x0000000000780099 in os_event_wait_low ()
        #2  0x000000000077de42 in os_aio_simulated_handle ()
        #3  0x000000000074a261 in fil_aio_wait ()

Related posts:

  1. A tweak to column alignment for the mext script I tweaked
  2. The unexpected consequences of SELinux I’ve

Related posts brought to you by Yet Another Related Posts Plugin.


PlanetMySQL Voting: Vote UP / Vote DOWN

Evil replication management

Июль 30th, 2009

When one wants to script automated replication chain building, certain things are quite annoying, like immutable replication configuration variables. For example, at certain moments log_slave_updates is more than needed, and thats what the server says:

mysql> show variables like 'log_slave_updates';
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| log_slave_updates | OFF   |
+-------------------+-------+
1 row in set (0.00 sec)

mysql> set global log_slave_updates=1;
ERROR 1238 (HY000): Variable 'log_slave_updates' is a read only variable

Of course, there are few options, roll in-house fork (heheeeee!), restart your server, and keep warming up your tens of gigabytes of cache arenas, or wait for MySQL to ship a feature change in next major release. Then there are evil tactics:

mysql> system gdb -p $(pidof mysqld)
                       -ex "set opt_log_slave_updates=1" -batch
mysql> show variables like 'log_slave_updates';
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| log_slave_updates | ON    |
+-------------------+-------+
1 row in set (0.00 sec)

I don’t guarantee safety of this when slave is running, but… stopping and starting slave threads is somewhat cheaper, than stopping and starting big database instance, right?

What else can we do?

mysql> show slave status \G
...
     Replicate_Do_DB: test
...
mysql> system gdb -p $(pidof mysqld)
          -ex 'call rpl_filter->add_do_db(strdup("hehehe"))' -batch
mysql> show slave status \G
...
      Replicate_Do_DB: test,hehehe
...

It is actually possible to add all sorts of filters this way, rpl_filter.h can be good reference :) So now that you want to throw out some data from your slaves, restart isn’t needed. Unfortunately, deleting entries isn’t possible via rpl_filter methods, but you can always edit base_ilists, can’t you?

P.S. having this functionality inside server would definitely be best.

On binaries and -fomit-frame-pointer

Июль 27th, 2009

Over last few years 64-bit x86 platform has became ubiquitous, thus making stupid memory limitations a thing of some forgotten past. 64-bit processors made internal memory structures bigger, but compensated that with twice the amount and twice larger registers.

But there’s one thing that definitely got worse – gcc, the compiler, has a change in default compilation options – it omits frame pointers from binaries in x86_64 architecture. People have advocated against that back in 1997 because of very simple reasons, that are still very much existing today too – frame pointers are needed for efficient stack trace calculations, and stack traces are very very useful, sometimes.

So, this change means that oprofile is not able to give hierarchical profiles, it also means that MySQL crash information will not include most useful bit – the failing stack trace. This decision has been just because of performance reasons – frame pointer takes whole register (though on x86_32 that meant 1 out of 8, on x86_64 it is 1 out of 16), which could be used to optimize the application.

I tested two MySQL builds, one built with ‘-O3 -g -fno-omit-frame-pointer’ and other with -fomit-frame-pointer instead – and performance difference was negligible. It was around 1% in sysbench tests, and slightly over 3% at tight-loop select benchmark(100000000,(select asin(5+5)+sin(5+5))); on a 2-cpu Opteron box.

The summary suggestion for this flag would be very simple. If you don’t care about fixing or making your product faster, you can probably use 1% speed-up. But if getting actual real-time performance data can lead to much better performance fixes, and fast introspection means qualified engineers can diagnose problems much faster, 1% or even 3% is not that much. So, add ‘-fno-omit-frame-pointer’ to CFLAGS and CXXFLAGS, and enjoy things getting back to normal :)

By the way, while I was at it, I did some empiric tests of other options, but one that irks me most is not using ‘-g’ on production binaries. See, debugging information, symbol tables, etc – they all cause around 0% performance difference. The only difference is that e.g. mysqld binary will weight 30M, instead of 6M (though that fat will not be loaded into RAM, and will only cost diskspace).

Why does debugging information matter? It doesn’t, if you don’t attempt to be power-user. It does, if you enjoy crazy debugger tricks (like one here or here). Oh, and of course, bonus GDB trick, how to run KILL without connecting to server:

(gdb) thread apply all bt
...
(gdb) thread 2
[Switching to thread 2 (Thread 0x44a76950 (LWP 23955))]#0  ...
(gdb) bt
#0  0x00007f7821e68e1d in pthread_cond_timedwait...
#1  0x000000000052dc0e in Item_func_sleep::val_int (this=0x12317f0)...
#2  0x0000000000501484 in Item::send (this=0x12317f0, ...
...
#15 0x00000000005caf29 in do_command (thd=0x11da290) ...
...
(gdb) frame 15
#15 0x00000000005caf29 in do_command (thd=0x11da290) at sql_parse.cc:854
854	  return_value= dispatch_command(command, thd, packet+1, ...
(gdb) set thd->killed = THD::KILL_QUERY
(gdb) continue

And the client gets ‘ERROR 1317 (70100): Query execution was interrupted’ :-)