Archive for the ‘tuning’ Category

Cluster — spins/contentions and thread stuck in..

September 2nd, 2010
I get a number of questions about contentions and "stuck in.." messages, so here is some explanation of:
  • Contention
  • Thread Stuck in
  • What you can do about it
In 99% of cases, the contentions written to the data nodes' out file (ndb_X_out.log) are nothing to pay attention to.

sendbufferpool waiting for lock, contentions: 6000 spins: 489200
sendbufferpool waiting for lock, contentions: 6200 spins: 494721


Each spin is read from the L1 cache (4 cycles on a Nehalem (3.2GHz), so about a nanosecond).
1 spin = 1.25E-09 seconds (1.25ns)

In the above we have:
(494721-489200)/(6200-6000)= 27 spins/contention
Time spent on a contention=27 x 1.25E-09=3.375E-08 seconds (0.03375 us)

So we don't have a problem..

Another example: here the lock guards a job buffer (JBA = JobBuffer A; in short, it handles signals for heartbeats and some other small things, while all other traffic goes over JobBuffer B).

jbalock thr: 1 waiting for lock, contentions: 145000 spins: 3280892543
jbalock thr: 1 waiting for lock, contentions: 150000 spins: 3403539479


(3403539479-3280892543)/(150000-145000)=24529 spins/contention
Time spent on a contention: 3.06613E-05 seconds (30.66us )
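The arithmetic in both examples can be reproduced with a few lines of Python. This is only a sketch: the function name `contention_cost` is made up for illustration, and the per-spin cost is the 4-cycles-at-3.2GHz assumption stated above. Note that the first example above rounds down to 27 spins per contention; the unrounded figure is about 27.6.

```python
# Seconds per spin, as assumed in the post: 4 cycles at 3.2 GHz.
SECONDS_PER_SPIN = 4 / 3.2e9  # 1.25e-09


def contention_cost(first, second, seconds_per_spin=SECONDS_PER_SPIN):
    """Return (spins per contention, seconds per contention) between two
    (contentions, spins) samples taken from consecutive log lines."""
    contentions = second[0] - first[0]
    spins = second[1] - first[1]
    spins_per_contention = spins / contentions
    return spins_per_contention, spins_per_contention * seconds_per_spin


# sendbufferpool samples from the first log excerpt
print(contention_cost((6000, 489200), (6200, 494721)))
# jbalock samples from the second log excerpt
print(contention_cost((145000, 3280892543), (150000, 3403539479)))
```

Working from deltas between two log lines, rather than from the absolute counters, gives the cost over the most recent interval only.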

This is a bit higher than I would have expected, and I think more analysis is needed. However, I tend not to get these contentions on a busy system.

Ndb kernel thread X is stuck in ..

Ndb kernel thread 4 is stuck in: Job Handling elapsed=100
Watchdog: User time: 82 System time: 667
Ndb kernel thread 4 is stuck in: Job Handling elapsed=200
Watchdog: User time: 82 System time: 668
Ndb kernel thread 4 is stuck in: Job Handling elapsed=300
Watchdog: User time: 82 System time: 669
Ndb kernel thread 4 is stuck in: Job Handling elapsed=400
Watchdog: User time: 82 System time: 670

The important thing here is to look at how User time and System time behave.
If User time is constant (as it is here, at 82ms) but System time is growing (667, 668, etc.), the OS kernel is busy.
Slow network? Sub-optimal kernel version? NIC drivers? swapping? some kernel process using too much cpu?

If User time is growing it is probably because the ndb kernel is overloaded.
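As a rough illustration, the check described above can be automated by pulling the Watchdog figures out of the log and seeing which counter moves. The regular expression matches the log format shown in this post; the function name `diagnose` and the simple "whichever grew more" rule are assumptions made for this sketch, not anything the data node provides.

```python
import re

# Matches the "Watchdog:" part of the log lines shown in this post.
WATCHDOG = re.compile(r"Watchdog: User time: (\d+)\s+System time: (\d+)")


def diagnose(log_text):
    """Classify a run of watchdog samples by which CPU counter is growing."""
    samples = [(int(u), int(s)) for u, s in WATCHDOG.findall(log_text)]
    if len(samples) < 2:
        return "not enough samples"
    user_delta = samples[-1][0] - samples[0][0]
    system_delta = samples[-1][1] - samples[0][1]
    if user_delta > system_delta:
        return "user time growing: data node probably overloaded"
    if system_delta > user_delta:
        return "system time growing: look at the OS (network, drivers, swap)"
    return "inconclusive"


log = """\
Ndb kernel thread 4 is stuck in: Job Handling elapsed=100
Watchdog: User time: 82 System time: 667
Ndb kernel thread 4 is stuck in: Job Handling elapsed=200
Watchdog: User time: 82 System time: 668
Ndb kernel thread 4 is stuck in: Job Handling elapsed=300
Watchdog: User time: 82 System time: 669
"""
print(diagnose(log))
```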

What can you do about this?
  • In config.ini:
    RealtimeScheduler=1

    LockExecThreadToCPU=[cpuids]
  • Check that cpuspeed is not running ( yum remove cpuspeed )
  • .. and finally ask us to optimize more!
Also, pay attention to whether you get the contentions on an idle Cluster or a busy Cluster.


Databases: Normalization or Denormalization. Which is the better technique?

August 30th, 2010

There has long been a debate over which approach performs better: normalized or denormalized databases. This article is my attempt to work out the right strategy, because neither approach can be rejected outright. I will start off by discussing the pros and cons of both approaches.

Pros and Cons of a Normalized database design.

Normalized databases fare very well when the application is write-intensive and the write load exceeds the read load, for the following reasons:

  • Normalized tables are usually smaller and have a smaller footprint because the data is divided vertically among many tables. This allows them to perform better, as they are small enough to fit into the buffer.
  • The updates are very fast because the data to be updated is located at a single place and there are no duplicates.
  • Similarly the inserts are very fast because the data has to be inserted at a single place and does not have to be duplicated.
  • The selects are fast in cases where data has to be fetched from a single table, because normalized tables are normally small enough to fit into the buffer.
  • Because the data is not duplicated, there is less need for heavy-duty GROUP BY or DISTINCT queries.

Although there seems to be much in favor of normalized tables, with all the pros outlined above, the main cause of concern with fully normalized tables is that normalized data means joins between tables. This joining means that read operations suffer, because indexing strategies do not work well across table joins.

Now let's have a look at the pros and cons of a denormalized database design.

Pros and cons of denormalized database design.

Denormalized databases fare well under heavy read load, when the application is read-intensive, for the following reasons:

  • The data is present in the same table so there is no need for any joins, hence the selects are very fast.
  • A single table with all the required data allows much more efficient index usage. If the columns are indexed properly, then results can be filtered and sorted by utilizing the same index. While in the case of a normalized table, since the data would be spread out in different tables, this would not be possible.

Although selects can be very fast on denormalized tables for the reasons mentioned above, the data is duplicated, so updates and inserts become complex and costly.

That said, neither approach can be entirely neglected, because a real-world application is going to have both read loads and write loads. Hence the correct way is to use both the normalized and denormalized approaches, depending on the situation.

Using normalized and denormalized approaches together.

The most common way of mixing denormalized and normalized approaches is to duplicate related columns from one table into another table. Let me show you by example:

Suppose you have a products table and an orders table.
The normalized approach would be to only have the product_id in the orders table and all the other product related information in the products table.

But that would make a query that filters by product_name and sorts by order_date inefficient, because the two columns are stored in different tables.

In a fully normalized schema, such a query would be performed in the following manner:

SELECT product_name, order_date
FROM orders INNER JOIN products USING(product_id)
WHERE product_name like 'A%'
ORDER by order_date DESC

As you can see MySQL here will have to scan the order_date index on the orders table and then compare the corresponding product_name in the products table to see if the name starts with A.

The above query can be drastically improved by denormalizing the schema a little bit, so that the orders table now includes the product_name column as well.

SELECT product_name, order_date
FROM orders
WHERE product_name like 'A%'
ORDER by order_date DESC

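See how the query has become much simpler: there is no join now, and a single index on the columns (product_name, order_date) can satisfy the filter and supply both selected columns without touching another table. Note that because LIKE 'A%' is a range condition, MySQL may still have to sort the matching rows by order_date.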

So can both techniques be used together? Yes, because real-world applications have a mix of read and write loads.

Final words.

Although a denormalized schema can greatly improve performance under extreme read loads, updates and inserts become complex because the data is duplicated and hence has to be updated or inserted in more than one place.

One clean way to go about solving this problem is through the use of triggers. For example in our case where the orders table has the product_name column as well, when the value of product_name has to be updated, then it can simply be done in the following way:

  • Set up a trigger on the products table that updates product_name in the orders table whenever product_name changes in the products table.
  • Execute the update query on the products table. The data would automatically be updated in the orders table because of the trigger.
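The two steps above can be sketched end to end. To keep the example runnable without a MySQL server, it uses SQLite through Python's sqlite3 module, which is an assumption made purely for illustration: MySQL trigger syntax differs slightly (for example, it requires FOR EACH ROW), and the trigger name, sample rows, and column types here are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE orders (
    order_id     INTEGER PRIMARY KEY,
    product_id   INTEGER,
    product_name TEXT,      -- denormalized copy of products.product_name
    order_date   TEXT
);
-- Step 1: propagate renames from products to the duplicated column.
CREATE TRIGGER products_name_sync
AFTER UPDATE OF product_name ON products
BEGIN
    UPDATE orders SET product_name = NEW.product_name
    WHERE product_id = NEW.product_id;
END;
INSERT INTO products VALUES (1, 'Anvil');
INSERT INTO orders VALUES (10, 1, 'Anvil', '2010-08-30');
""")

# Step 2: update only the products table; the trigger fixes up orders.
conn.execute("UPDATE products SET product_name = 'Axe' WHERE product_id = 1")
synced_name = conn.execute(
    "SELECT product_name FROM orders WHERE order_id = 10"
).fetchone()[0]
print(synced_name)
```

The application only ever writes to products, so the duplicated column cannot drift out of sync through a forgotten second UPDATE. The cost is that every rename now touches all matching order rows.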

However, when denormalizing the schema, do take into consideration how often you would be updating records compared to how often you would be executing SELECTs. When mixing normalization and denormalization, focus on denormalizing tables that are read-intensive, and keep write-intensive tables normalized.



dbbenchmark.com – now supporting MySQL on OSX 10.6

August 29th, 2010

Just a quick note to let everyone know that our new benchmarking script now supports OSX 10.6 on Intel hardware. That means you can run one simple command and get all of the sequential and random INSERT and SELECT performance statistics for your database. As usual the script is open source and released under the new BSD license. Give it a try by downloading now! See the download page for more details.



dbbenchmark.com – Benchmarking script now available

August 28th, 2010

You can download the first release of the benchmarking script here: http://code.google.com/p/dbbenchmark/

Please read the README file or consult the Support page before running the benchmarks.



dbbenchmark.com – Site launched

August 28th, 2010

Welcome to DBbenchmarks.com, a publicly accessible database that tracks anonymously submitted data about MySQL server performance. You can use this site to research the performance of certain types of hardware when running MySQL. Our open-source benchmarking script is free to own and use; we only ask that you allow the script to connect to this database and submit the results. All results and data collected are anonymous and viewable on this site. We only track performance data from MySQL; you can see the list on the About page.

Check out the database of benchmarks here: [link]



Kontrollbase – new version available with improved analytics

August 23rd, 2010
A new version of Kontrollbase – the enterprise monitoring, analytics, reporting, and historical analysis webapp for MySQL database administrators and advanced users of MySQL databases – is available for download. There are several upgrades to the reporting code with improved alert algorithms as well as a new script for auto-archiving of the statistics table based [...]

What do MySQL Consultants do?

July 8th, 2010

One role of a MySQL consultant is to review an existing production system. Sometimes you have sufficient time and access, and other times you don’t. If I am given a limited time here is a general list of things I look at.

  1. Review Server architecture, OS, Memory, Disks (including raid and partition type), Network etc
  2. Review server load and identify physical bottleneck
  3. Look at all running processes
  4. Look specifically at MySQL processes
  5. Review MySQL Error Log
  6. Determine MySQL version
  7. Look at MySQL configuration (e.g. /etc/my.cnf)
  8. Look at running MySQL Variables
  9. Look at running MySQL status (x n times)
  10. Look at running MySQL INNODB status (x n times) if used
  11. Get Database and Schema Sizes
  12. Get Database Schema
  13. Review Slow Query Log
  14. Capture query sample via SHOW FULL PROCESSLIST (locked and long running)
  15. Analyze Binary Log file
  16. Capture all running SQL

Here are some of the commands I would run.

2. Review server load and identify physical bottleneck

$ vmstat 5 720 > vmstat.`date +%y%m%d.%H%M%S`.txt

4. Look at MySQL processes

$ ps -eopid,fname,rss,vsz,user,command | grep -e "RSS" -e "mysql"
  PID COMMAND    RSS    VSZ USER     COMMAND
 5463 grep       764   5204 ronald   grep -e RSS -e mysql
13894 mysqld_s   596   3936 root     /bin/sh /usr/bin/mysqld_safe
13933 mysqld   4787812 5127208 mysql /usr/sbin/mysqld --basedir=/usr --datadir=/vol/mysql/mysqldata --user=mysql --pid-file=/var/run/mysqld/mysqld.pid --skip-external-locking --port=3306 --socket=/var/run/mysqld/mysqld.sock
13934 logger     608   3840 root     logger -p daemon.err -t mysqld_safe -i -t mysqld

$ ps -eopid,fname,rss,vsz,user,command | grep " mysqld " | grep -v grep | awk '{print $3,$4}'
4787820 5127208

5. Review MySQL Error Log

The error log can be found in various places depending on the operating system and configuration. It is important to find the right log; SHOW GLOBAL VARIABLES LIKE 'log_error' will show the location.

This is generally overlooked, yet it can quickly identify some underlying problems with a MySQL environment.

7. Look at MySQL configuration

$ [ -f /etc/my.cnf ] &&  cat /etc/my.cnf
$ [ -f /etc/mysql/my.cnf ] &&  cat /etc/mysql/my.cnf
$ find / -name  "*my*cnf" 2>/dev/null

8. Look at running MySQL Variables

$ mysqladmin -uroot -p variables

9. Look at running MySQL status (x n times)

$ mysqladmin -uroot -p extended-status

It is important to run this several times at regular intervals, say 60 seconds, 60 minutes, or 24 hours.
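Repeated samples become useful once they are turned into rates. Below is a minimal sketch that parses two saved outputs of the command above and computes per-second deltas. The function names and the two sample snapshots are made up for illustration, and the parser assumes the usual two-column table that mysqladmin prints. Rates are only meaningful for cumulative counters such as Questions, not for gauges such as Threads_connected.

```python
def parse_status(text):
    """Parse the table printed by `mysqladmin extended-status` into a dict
    of counter name -> int, skipping non-numeric values."""
    status = {}
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) == 2 and cells[1].isdigit():
            status[cells[0]] = int(cells[1])
    return status


def per_second(before, after, interval_seconds):
    """Rate of change for every counter present in both snapshots."""
    return {
        name: (after[name] - before[name]) / interval_seconds
        for name in after
        if name in before
    }


# Hypothetical snapshots taken 60 seconds apart (values are invented).
sample_t0 = """
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| Com_select    | 1000  |
| Questions     | 5000  |
+---------------+-------+
"""
sample_t1 = """
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| Com_select    | 1600  |
| Questions     | 6200  |
+---------------+-------+
"""
rates = per_second(parse_status(sample_t0), parse_status(sample_t1), 60)
print(rates)
```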

I also have dedicated scripts that can perform this. Check out Log MySQL Stats.

11. Get Database and Schema Sizes

Check out my scripts on my MySQL DBA page

14. Capture Locked statements

Check out my script for Capturing MySQL sessions.

15. Analyze Binary Log file

Check out my post on using mk-query-digest.

16. Capture all SQL

Check out my post on DML Stats per table

Moving forward

Of course the commands I run exceed this initial list, and gathering this information is only



Benchmarking MySQL ACID performance with SysBench

June 21st, 2010

A couple of questions I get a lot from MySQL customers are “how will this hardware upgrade improve my transactions per second (TPS)” and “what level of TPS will MySQL perform on this hardware if I’m running ACID settings?” Running sysbench against MySQL with different values for per-thread and global memory buffer sizes, ACID settings, and other settings gives me concrete values to bring to the customer to show the impact that more RAM, faster CPUs, faster disks, or cnf changes have on the server. Here are some examples for a common question: “If I’m using full ACID settings vs non-ACID settings what performance am I going to get from this server?”

Let’s find out by running sysbench with the following settings (most are self explanatory – if not the man page can explain them):

  • sysbench --test=oltp --db-driver=mysql --oltp-table-size=1000000 --mysql-engine-trx=yes --oltp-test-mode=complex --oltp-read-only=off --oltp-dist-type=special --max-requests=0 --num-threads=8 --max-time=120 --init-rng=on run

MySQL Settings:

In the first test MySQL is set to the following ACID related settings. This will give us results for TPS performance without full ACID compliance – very common settings on a server that is handling blogs, ad serving, general business websites, and other roles where full ACID is not required and performance is valued over the benefits of full ACID. These are important settings when we look at the difference in performance when we change to full ACID in the second test.

  • innodb_flush_log_at_trx_commit = 0
  • sync_binlog=0
  • transaction-isolation=REPEATABLE-READ

System configuration and InnoDB buffer pool size:

  • Xeon E5345 Series 2.33GHz 8-core, 16GB RAM, Local SATA 7.2K disks
  • innodb_buffer_pool_size = 10G

Full result set from sysbench:

Summary OLTP test statistics:

  • queries performed:
  • transactions:                        172426 (1436.83 per sec.)
  • read/write requests:                 3276664 (27304.51 per sec.)
  • other operations:                    344882 (2873.91 per sec.)

Take away results:

We can simplify the results by looking at the following TPS results for this non-ACID test:

  • transactions:                        172426 (1436.83 per sec.)

Let’s go ahead and run the test again with different ACID settings. This will give us the TPS results for full ACID compliance:

  • innodb_flush_log_at_trx_commit = 1
  • sync_binlog=1
  • transaction-isolation=REPEATABLE-READ

We get the following results for TPS:

  • transactions:                     3197   (26.58 per sec.)
  • read/write requests:                 60743  (505.04 per sec.)
  • other operations:                    6394   (53.16 per sec.)

Final Results:

So as you can see, switching between non-ACID and full ACID settings (on the same server, with only those values in the cnf changed) makes a huge difference on this standard database server: 1436.83 TPS versus 26.58 TPS, roughly a 54x drop. We can now hand this data to the customer and they will know what impact the settings will have on their application’s performance and what to expect when running full ACID vs non-ACID.
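To put a number on the gap, the two TPS figures reported above can be compared directly. This is just arithmetic on the sysbench results from this post; the variable names are illustrative.

```python
non_acid_tps = 1436.83   # innodb_flush_log_at_trx_commit=0, sync_binlog=0
full_acid_tps = 26.58    # innodb_flush_log_at_trx_commit=1, sync_binlog=1

slowdown = non_acid_tps / full_acid_tps          # how many times slower
retained = full_acid_tps / non_acid_tps * 100    # percent of TPS kept

print(f"full ACID is {slowdown:.1f}x slower, "
      f"retaining {retained:.2f}% of the non-ACID TPS")
```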

More info on using sysbench here: http://sysbench.sourceforge.net



Tuning InnoDB Concurrency Tickets

May 25th, 2010

InnoDB has an oft-unused parameter innodb_concurrency_tickets that seems widely misunderstood. From the docs: "The number of threads that can enter InnoDB concurrently is determined by the innodb_thread_concurrency variable. A thread is placed in a queue when it tries to enter InnoDB if the number of threads has already reached the concurrency limit. When a thread is allowed to enter InnoDB, it is given a number of “free tickets” equal to the value of innodb_concurrency_tickets, and the thread can enter and leave InnoDB freely until it has used up its tickets. After that point, the thread again becomes subject to the concurrency check (and possible queuing) the next time it tries to enter InnoDB. The default value is 500..."

What this means from a practical perspective is that each query is allocated 500 tickets when it begins executing. Each time it enters InnoDB, this number is decremented until it reaches zero ("entering InnoDB" appears only to occur when a row is accessed). When it reaches zero, it may or may not be put into a queue to wait before continuing execution. InnoDB doesn't provide us a way to determine how many concurrency tickets a query uses, making this parameter notoriously difficult to tune. It is important to note that this variable only comes into play when innodb_thread_concurrency is greater than zero.

On a stock install of MySQL, here are some example queries and the corresponding number of concurrency tickets used for each:

SQL:
mysql> CREATE TABLE test_table (
    ->     id int
    -> ) ENGINE=InnoDB; -- 0 Tickets Used
Query OK, 0 rows affected (0.36 sec)

mysql> INSERT INTO test_table (id) VALUES (1); -- 0 Tickets Used
Query OK, 1 row affected (0.00 sec)

mysql> SELECT * FROM test_table; -- 1 Ticket Used
+------+
| id   |
+------+
|    1 |
+------+
1 row in set (0.00 sec)

mysql> INSERT INTO test_table (id) VALUES (2),(3); -- 0 Tickets Used
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> SELECT COUNT(*) FROM test_table; -- 3 Tickets Used
+----------+
| COUNT(*) |
+----------+
|        3 |
+----------+
1 row in set (0.00 sec)

mysql> UPDATE test_table SET id=4 WHERE id=1; -- 4 Tickets Used (because no index, a table scan is performed)
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> ALTER TABLE test_table ADD INDEX (id); -- 5 Tickets Used
Query OK, 3 rows affected (0.01 sec)
Records: 3  Duplicates: 0  Warnings: 0

And now on to a more interesting scenario: foreign keys

SQL:
mysql> CREATE TABLE parent (id INT NOT NULL,
    ->                      PRIMARY KEY (id)
    -> ) ENGINE=INNODB; -- 0 Tickets Used
Query OK, 0 rows affected (0.01 sec)

mysql> CREATE TABLE child (id INT, parent_id INT,
    ->                     INDEX par_ind (parent_id),
    ->                     FOREIGN KEY (parent_id) REFERENCES parent(id)
    ->                       ON DELETE CASCADE
    -> ) ENGINE=INNODB; -- 0 Tickets Used
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO parent (id) VALUES (1),(2),(3),(4); -- 3 Tickets Used
Query OK, 4 rows affected (0.03 sec)

mysql> INSERT INTO child (id, parent_id) VALUES (1,1),(1,1),(2,1); -- 2 Tickets Used
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> DELETE FROM child WHERE 1; -- 6 Tickets Used
Query OK, 3 rows affected (0.02 sec)

mysql> ALTER TABLE `child` ADD PRIMARY KEY (`id`,`parent_id`); -- 0 Tickets Used
Query OK, 0 rows affected (0.02 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> INSERT INTO `child` (`id`,`parent_id`) VALUES (1,1), (1,2), (2,1),(2,2); -- 3 Tickets Used
Query OK, 4 rows affected (0.01 sec)
Records: 4  Duplicates: 0  Warnings: 0

So, how can we put this into practice, since this information isn't available to most users?

INSERT w/PRIMARY KEY defined: Number of rows inserted - 1
INSERT w/FOREIGN KEY constraint: Number of rows inserted - 1
SELECT: 1 ticket per row returned
UPDATE: 1 ticket per row examined + 1 ticket per row updated
DELETE: 1 ticket per row examined + 1 ticket per row deleted
ALTER: (2 * rows in the table) - 1
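These rules of thumb are simple enough to encode. The sketch below is only an illustration of the list above: the function name and signature are invented, and clamping at zero is an assumption made so that the single-row INSERT and the ALTER on an empty table match the 0 tickets observed in the transcripts.

```python
def estimate_tickets(statement, rows, rows_changed=0):
    """Estimate concurrency tickets from the rules of thumb above.

    rows is rows inserted (INSERT), rows returned (SELECT), rows examined
    (UPDATE/DELETE) or rows in the table (ALTER); rows_changed is the
    number of rows updated or deleted.
    """
    statement = statement.upper()
    if statement == "INSERT":
        return max(rows - 1, 0)
    if statement == "SELECT":
        return rows
    if statement in ("UPDATE", "DELETE"):
        return rows + rows_changed
    if statement == "ALTER":
        return max(2 * rows - 1, 0)
    raise ValueError(f"no rule of thumb for {statement!r}")


# Cross-check against the transcripts in this post.
print(estimate_tickets("INSERT", 4))                  # parent insert
print(estimate_tickets("UPDATE", 3, rows_changed=1))  # table-scan update
print(estimate_tickets("DELETE", 3, rows_changed=3))  # DELETE FROM child
print(estimate_tickets("ALTER", 3))                   # ADD INDEX on 3 rows
```

One caveat from the transcripts themselves: SELECT COUNT(*) over three rows used 3 tickets while returning a single row, so for aggregates it is probably safer to pass the number of rows read.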

As with any performance optimization effort, you will want to optimize for the common case. If you have a very simple workload, you can calculate these values by hand. But for most workloads with a complex access pattern, we'll need to estimate or wait for InnoDB to expose this information to us.

What happens in the case where I have two distinct access patterns: single row primary-key lookups and SELECT statements that examine 900 rows? If innodb_concurrency_tickets is set to 500, then all of the single row PK lookups will execute without ever being subject to an additional concurrency check (there is always one when a thread first enters InnoDB), while the 900-row SELECT statements will always be subject to one additional concurrency check (we actually care less about the concurrency check itself than the possibility that the query may become queued). Your first instinct may be to increase innodb_concurrency_tickets to >=900 in this case, but that isn't necessarily the best decision. As stated in the docs, the number of threads that can enter InnoDB is limited by innodb_thread_concurrency (which is why these two variables are most often tuned in concert). To continue the example, if innodb_thread_concurrency is set to 8 and eight 900-row SELECT statements come in, they will effectively block the PK lookups until one of them is subject to a concurrency check or completes execution and exits InnoDB. If innodb_concurrency_tickets had been increased to >= 900, then ALL of the PK lookups would be blocked until the 900-row SELECT statements complete execution.

With a maximum value of 4,294,967,295 this has the potential to block other queries for a significant amount of time. Setting innodb_concurrency_tickets too high can have startlingly negative performance implications. On the other hand, if we determine that 99% of the traffic are these single row PK lookups and only 1% are the 900-row SELECTs, we may be tempted to lower the setting to 1 to accommodate the "typical case". The effects of this, though, would be to cause the 900-row SELECT statements to be subject to 899 concurrency checks. This means 899 potential opportunities to be queued! So, as with most other parameters, this is a balancing act.
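The counts used in this example (one additional check at 500 tickets, 899 at 1 ticket) follow from a simple ceiling division. The helper below is a sketch with an invented name, and it assumes one ticket per row, as the example does.

```python
import math


def extra_concurrency_checks(rows, tickets):
    """Additional concurrency checks a query needing `rows` tickets faces,
    beyond the one made when the thread first enters InnoDB."""
    if rows <= 0:
        return 0
    return math.ceil(rows / tickets) - 1


for tickets in (1, 500, 900):
    print(tickets, extra_concurrency_checks(900, tickets))
```

Each additional check is an opportunity to be queued, which is why the value drops to zero only once the ticket count reaches the size of the largest query you care about.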

It really comes down to the importance of the applicable queries. If those 900-row SELECT statements were actually 10,000-row selects, this would become a more pressing issue. If they are reporting queries used only internally, then it is not so much of an issue and you can leave innodb_concurrency_tickets rather small. If, on the other hand, these are the queries that lead to revenue generation, you may want to give them a bit more dedicated CPU time so they execute that much faster (even at the expense of the PK lookups). In other words, if you're optimizing for throughput in this scenario, you will tune innodb_concurrency_tickets to the 99th percentile of small PK lookups. If you're optimizing for response time, you would set it larger to accommodate the larger (important) select statements.

A quick sysbench run gives us the following results (X-axis is innodb_concurrency_tickets, Y-axis is txn/sec. More is better). Since all sysbench queries are 10 rows or less, we don't really expect to see much of a difference here:

Details:

CODE:
sysbench --test=oltp --oltp-table-size=80000000 --oltp-read-only=off --init-rng=on --num-threads=16 --max-requests=0 --oltp-dist-type=uniform --max-time=300 --mysql-user=root --mysql-socket=/var/lib/mysql/mysql.sock run

Applicable my.cnf settings:

CODE:
innodb_buffer_pool_size=24G
innodb_data_file_path=ibdata1:10M:autoextend
innodb_file_per_table=1
innodb_flush_log_at_trx_commit=1
innodb_log_buffer_size=8M
innodb_log_files_in_group=2
innodb_log_file_size=1900M
innodb_thread_concurrency=16
innodb_flush_method=O_DIRECT
innodb_write_io_threads=8
innodb_read_io_threads=8
innodb_io_capacity=500
innodb_max_dirty_pages_pct=90
max_connections=3000
query_cache_size=0
skip-name-resolve
table_cache=10000


Entry posted by Ryan Lowe



Tuning your Cluster with ndbinfo (7.1) part 1 of X

April 20th, 2010
The new ndbinfo interface in 7.1 is really useful to assist in tuning MySQL Cluster. Here is an example (more will follow):

I started with one test where I inserted two blobs (1KB + 1KB) in one table.
From 16 threads (colocated with one mysqld, two data nodes, separate computers) and one application driving the load, I reached about 6960 TPS, and the utilization of the redo buffers (controlled by the parameter RedoBuffer in config.ini) looked like this:

mysql> select * from ndbinfo.logbuffers;
+---------+----------+--------+----------+----------+--------+
| node_id | log_type | log_id | log_part | total    | used   |
+---------+----------+--------+----------+----------+--------+
|       3 | REDO     |      0 |        1 | 50331648 | 196608 |
|       3 | REDO     |      0 |        2 | 50331648 | 294912 |
|       3 | REDO     |      0 |        3 | 50331648 | 131072 |
|       3 | REDO     |      0 |        4 | 50331648 | 229376 |
|       4 | REDO     |      0 |        1 | 50331648 | 229376 |
|       4 | REDO     |      0 |        2 | 50331648 | 262144 |
|       4 | REDO     |      0 |        3 | 50331648 | 163840 |
|       4 | REDO     |      0 |        4 | 50331648 | 229376 |
+---------+----------+--------+----------+----------+--------+
8 rows in set (0.01 sec)
Which is basically nothing.

I then increased the load and inserted 2 x 5120B BLOBs (from 16 threads, one MySQL server), and ran at an insert speed of 4320TPS:
mysql> select * from ndbinfo.logbuffers;
+---------+----------+--------+----------+----------+----------+
| node_id | log_type | log_id | log_part | total    | used     |
+---------+----------+--------+----------+----------+----------+
|       3 | REDO     |      0 |        1 | 50331648 | 11468800 |
|       3 | REDO     |      0 |        2 | 50331648 | 31522816 |
|       3 | REDO     |      0 |        3 | 50331648 | 42008576 |
|       3 | REDO     |      0 |        4 | 50331648 | 43057152 |
|       4 | REDO     |      0 |        1 | 50331648 | 14090240 |
|       4 | REDO     |      0 |        2 | 50331648 | 17432576 |
|       4 | REDO     |      0 |        3 | 50331648 | 10321920 |
|       4 | REDO     |      0 |        4 | 50331648 | 12615680 |
+---------+----------+--------+----------+----------+----------+

Above you can see that the redo buffers are used (the load will be spread around, and it is hard to catch a moment where the load is even on all buffers), and now the application started to throw the error "Got temporary error 1221 'REDO buffers overloaded (increase RedoBuffer)' from NDBCLUSTER (1297)"
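The raw byte counts are easier to judge as percentages. Here is a small sketch that computes utilization per node and log part from the second ndbinfo.logbuffers result in this post; the function name is illustrative, and the rows are copied by hand rather than fetched from a live cluster.

```python
def redo_utilization(rows):
    """Map (node_id, log_part) -> percentage of the redo buffer in use."""
    return {
        (node_id, log_part): 100.0 * used / total
        for node_id, log_part, total, used in rows
    }


# (node_id, log_part, total, used) from the 4320TPS run
rows = [
    (3, 1, 50331648, 11468800),
    (3, 2, 50331648, 31522816),
    (3, 3, 50331648, 42008576),
    (3, 4, 50331648, 43057152),
    (4, 1, 50331648, 14090240),
    (4, 2, 50331648, 17432576),
    (4, 3, 50331648, 10321920),
    (4, 4, 50331648, 12615680),
]
for key, pct in sorted(redo_utilization(rows).items()):
    print(key, f"{pct:.1f}%")
```

On these numbers, node 3 has two log parts above 80% while node 4 stays below 35%, which matches the remark that the load is rarely even across buffers at any given moment.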

I can now follow the instruction to increase the REDO buffer, but would it help in this case?
No, no and no.
The disk is too slow to keep up and cannot write to disk at the same rate as the application writes.

'iostat' gives:
> iostat -kx 1

Device:    rrqm/s   wrqm/s  r/s      w/s rkB/s     wkB/s avgrq-sz avgqu-sz await svctm %util
cciss/c0d0   0.00     0.00 0.00     0.00  0.00      0.00     0.00     0.00  0.00  0.00  0.00
cciss/c0d1   0.00 27796.00 0.00  1454.00  0.00 115196.00   158.45    12.03  8.25  0.66 95.30
dm-0         0.00     0.00 0.00 29270.00  0.00 117080.00     8.00   274.79  9.33  0.03 95.20
dm-1         0.00     0.00 0.00     0.00  0.00      0.00     0.00     0.00  0.00  0.00  0.00
dm-2         0.00     0.00 0.00     0.00  0.00      0.00     0.00     0.00  0.00  0.00  0.00
dm-3         0.00     0.00 0.00     0.00  0.00      0.00     0.00     0.00  0.00  0.00  0.00
dm-4         0.00     0.00 0.00     0.00  0.00      0.00     0.00     0.00  0.00  0.00  0.00
dm-5         0.00     0.00 0.00     0.00  0.00      0.00     0.00     0.00  0.00  0.00  0.00


And here you can see that the disks are heavily utilized. This means that I have two options now if I want to be able to sustain the 4320TPS insert load:
  • Increase the number of data nodes (computers), so that instead of two computers I have four and the load is spread across more hardware
  • Improve my disk subsystem (add better disks, e.g., 2-4 disk spindles to spread the load on), or put the REDO log on device cciss/c0d1 and the LCP on device cciss/c0d0.
Could the CPU also have been a bottleneck in this case? No, it was not the issue. The CMVMI thread (one of the data node's threads) was spending 44.4% of a CPU polling data from the other nodes, and it is reading in quite large packets, which is why it was the heaviest CPU user among the data node threads.
5453 root      20   0 6594m 4.1g 6956 R 44.4 51.9   4:05.64 ndbmtd
5471 root      20   0 6594m 4.1g 6956 S 32.5 51.9   3:39.07 ndbmtd
5474 root      20   0 6594m 4.1g 6956 R 26.6 51.9   2:25.55 ndbmtd
5475 root      20   0 6594m 4.1g 6956 S 23.7 51.9   2:25.01 ndbmtd
5476 root      20   0 6594m 4.1g 6956 R 23.7 51.9   2:20.83 ndbmtd
5473 root      20   0 6594m 4.1g 6956 R 21.7 51.9   2:26.57 ndbmtd
