<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>PlanetMysql.ru - information about the MySQL DBMS &#187; production</title>
	<atom:link href="http://planetmysql.ru/category/production/feed/" rel="self" type="application/rss+xml" />
	<link>http://planetmysql.ru</link>
	<description>A blog about MySQL, the most popular DBMS</description>
	<lastBuildDate>Fri, 25 May 2012 06:11:10 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
		<item>
		<title>Get social and healthy with GlassFish</title>
		<link>http://blogs.sun.com/theaquarium/entry/get_social_and_healthy_with?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=get-social-and-healthy-with-glassfish</link>
		<comments>http://blogs.sun.com/theaquarium/entry/get_social_and_healthy_with#comments</comments>
		<pubDate>Mon, 04 Apr 2011 07:01:24 +0000</pubDate>
		<dc:creator>The Aquarium</dc:creator>
				<category><![CDATA[glassfish]]></category>
		<category><![CDATA[pointdebate]]></category>
		<category><![CDATA[production]]></category>
		<category><![CDATA[reference]]></category>
		<category><![CDATA[stories]]></category>
		<category><![CDATA[story]]></category>
		<category><![CDATA[tinyhabit]]></category>

		<guid isPermaLink="false">http://blogs.sun.com/theaquarium/entry/get_social_and_healthy_with</guid>
		<description><![CDATA[Two new stories have been published this week and both of them use GlassFish 3.1 in production. If you haven't seen it before, "Stories" is a blog about production use of GlassFish by small, medium, and large users, built on questionnaires in which users share their experience with the rest of the community.

The first story is PointDebate, a "social network company that stir up, engage and give voice to most diverse opinions". They've been following all the recent GlassFish updates pretty closely and now run the latest 3.1 version (only a month after it was released). Their application is built using Java EE 6, and JSF in particular, with RichFaces. The full architecture includes MySQL as well as EHCache and uses JMS to "decouple operations" (a somewhat underutilized architectural pattern if you ask me).

The second story, TinyHabits, an online service "to maintain a healthy lifestyle despite leading a busy life that leaves very little time to incorporate healthy habits", is yet another Java EE 6 application with GlassFish as a platform chosen for its simplicity and robust administration and monitoring. This service also just moved to the latest and greatest version 3.1 (from 3.0.1), also uses JSF 2.0 (with PrimeFaces this time), uses both PostgreSQL and MongoDB, and runs production on Amazon EC2. Check it out.]]></description>
			<content:encoded><![CDATA[<p>
Two new <a href="http://blogs.sun.com/stories">stories</a> have been published this week and both of them use GlassFish 3.1 in production. If you haven't seen it before, "Stories" is a blog about production use of GlassFish by small, medium, and large users, built on questionnaires in which users share their experience with the rest of the community.
</p>

<table><tr>
<td valign="top">
<p>
The first story is <a href="http://pointdebate.net/">PointDebate</a>, a <em>"social network company that stir up, engage and give voice to most diverse opinions"</em>. They've been following all the recent GlassFish updates pretty closely and now run the latest 3.1 version (only a month after it was released). Their application is built using Java EE 6, and JSF in particular, with RichFaces. The full architecture includes MySQL as well as EHCache and uses JMS to <em>"decouple operations"</em> (a somewhat underutilized architectural pattern if you ask me).
</p>
</td>
<td>
<a href="http://blogs.sun.com/stories/entry/pointdebate_online_communication_platform_using" title="PointDebate: Online communication platform using Java EE 6 &amp; GlassFish 3.1">
  <img src="http://blogs.sun.com/theaquarium/resource/pointdebate-logo.png" alt="PointDebate logo" hspace="4" vspace="4" align="left" />
</a>
</td>
</tr></table>


<table><tr><td>
<a href="http://blogs.sun.com/stories/entry/tinyhabit_healthy_lifestyle_using_java" title="Tinyhabit - Healthy lifestyle using Java EE 6, GlassFish 3.1, and NetBeans">
  <img src="http://blogs.sun.com/theaquarium/resource/tinyhabit-logo.png" alt="TinyHabit logo" hspace="4" vspace="4" align="left" />
</a>
</td>
<td valign="top">
<p>
The second story, <a href="http://tinyhabits.com">TinyHabits</a>, an online service <em>"to maintain a healthy lifestyle despite leading a busy life that leaves very little time to incorporate healthy habits"</em>, is yet another Java EE 6 application with GlassFish as a platform chosen for its simplicity and robust administration and monitoring. This service also just moved to the latest and greatest version 3.1 (from 3.0.1), also uses JSF 2.0 (with PrimeFaces this time), uses both PostgreSQL and MongoDB, and runs production on Amazon EC2. <a href="http://blogs.sun.com/stories/entry/tinyhabit_healthy_lifestyle_using_java">Check it out</a>.
</p>
</td></tr></table>]]></content:encoded>
			<wfw:commentRss>http://planetmysql.ru/2011/04/04/get-social-and-healthy-with-glassfish/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>MySQL Partitioning – can save you or kill you</title>
		<link>http://www.mysqlperformanceblog.com/2010/12/11/mysql-partitioning-can-save-you-or-kill-you/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=mysql-partitioning-%25e2%2580%2593-can-save-you-or-kill-you</link>
		<comments>http://www.mysqlperformanceblog.com/2010/12/11/mysql-partitioning-can-save-you-or-kill-you/#comments</comments>
		<pubDate>Sat, 11 Dec 2010 07:25:54 +0000</pubDate>
		<dc:creator>MySQL Performance Blog</dc:creator>
				<category><![CDATA[InnoDB]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[production]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=4174</guid>
		<description><![CDATA[I have wanted for a while to write about using MySQL Partitioning for performance optimization, and I just got a relevant customer case to illustrate it. First you need to understand how partitions work internally. At the low level, partitions are separate tables. This means that when you do a lookup by the partitioning key you will look at one (or some) of the partitions; lookups by other keys, however, will need to perform a lookup in all partitions and hence can be a lot slower. The gain for updates typically comes from having a smaller BTREE on the active partition(s), which allows for a much better fit in memory. Having potentially fewer levels in the BTREE is not that significant an issue.
So let's look at an example:
SQL:
CREATE TABLE `tbl` (
  `id` bigint(20) UNSIGNED AUTO_INCREMENT NOT NULL,
  `uu` varchar(255) DEFAULT NULL,
  `data` bigint(20) UNSIGNED DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `uu` (`uu`)
) ENGINE=InnoDB
The access pattern for this table is lookups of data by "uu", which holds UUID values, plus a number of deletes by "id" and a bunch of inserts. The deletes are mainly clustered around the most recent id values.
The table (and index) is much larger than the buffer pool size.
The first problem was replication lag, which was mainly due to modifying the uu index. This is because UUID() spreads value prefixes very well, effectively giving almost uniform access to the whole BTREE. To solve this problem partitioning was a good choice - PARTITION BY HASH (id div 10000000) PARTITIONS 32 - this splits the data into 32 partitions, placing sequential ranges of 10M values in the same partition - very handy if you have very active access to values which have been added to the table recently.
Using this trick, replication could be sped up about 10 times, as the couple of partitions which were actively used could fit in the buffer pool completely, so replication became CPU bound (single thread) instead of IO bound.
You could celebrate, but hey... you need to check the impact on the master too. The master in its turn was getting a lot of lookups by the uu value, which is not part of the partitioning key, and hence we're looking at 32 logical lookups, one per partition. True, only one of the partitions would contain the value, but many of them will require physical IO and going down to the leaf key to verify that such a value does not exist, which reduced performance for random selects by UUID from 400 to 20 per second (from a single thread).
Decreasing the number of partitions made replication less efficient, but the number of selects the table could deliver increased, and there seems to be a reasonable number which would allow replication to perform better than it does now while selects still perform at the rate the system needs.
What is the takeaway? When you're creating partitions, think clearly about what you're trying to achieve. Partitioning is not some magic feature which just makes everything a lot faster. I've seen some people applying partitioning to basically all of their tables without much thought, and believe me, the results were not pretty.
    
    Entry posted by Peter Zaitsev]]></description>
			<content:encoded><![CDATA[<p>I have wanted for a while to write about using MySQL Partitioning for performance optimization, and I just got a relevant customer case to illustrate it. First you need to understand how partitions work internally. At the low level, partitions are separate tables. This means that when you do a lookup by the partitioning key you will look at one (or some) of the partitions; lookups by other keys, however, will need to perform a lookup in all partitions and hence can be a lot slower. The gain for updates typically comes from having a smaller BTREE on the active partition(s), which allows for a much better fit in memory. Having potentially fewer levels in the BTREE is not that significant an issue.</p>
<p>So let's look at an example:</p>
<div><span>SQL:</span>
<div>
<div>
<ol>
<li>
<div><span>CREATE</span> <span>TABLE</span> <span>`tbl`</span> <span>&#40;</span></div>
</li>
<li>
<div>&nbsp; <span>`id`</span> bigint<span>&#40;</span><span>20</span><span>&#41;</span> <span>UNSIGNED</span> <span>AUTO_INCREMENT</span> <span>NOT</span> <span>NULL</span>,</div>
</li>
<li>
<div>&nbsp; <span>`uu`</span> varchar<span>&#40;</span><span>255</span><span>&#41;</span> <span>DEFAULT</span> <span>NULL</span>,</div>
</li>
<li>
<div>&nbsp; <span>`data`</span> bigint<span>&#40;</span><span>20</span><span>&#41;</span> <span>UNSIGNED</span> <span>DEFAULT</span> <span>NULL</span>,</div>
</li>
<li>
<div>&nbsp; <span>PRIMARY</span> <span>KEY</span> <span>&#40;</span><span>`id`</span><span>&#41;</span>,</div>
</li>
<li>
<div>&nbsp; <span>KEY</span> <span>`uu`</span> <span>&#40;</span><span>`uu`</span><span>&#41;</span></div>
</li>
<li>
<div><span>&#41;</span> ENGINE=InnoDB </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>The access pattern for this table is lookups of data by "uu", which holds UUID values, plus a number of deletes by "id" and a bunch of inserts. The deletes are mainly clustered around the most recent id values.<br />
The table (and index) is much larger than the buffer pool size.</p>
<p>The first problem was replication lag, which was mainly due to modifying the uu index. This is because UUID() spreads value prefixes very well, effectively giving almost uniform access to the whole BTREE. To solve this problem partitioning was a good choice - <strong>PARTITION BY HASH (id div 10000000) PARTITIONS 32</strong> - this splits the data into 32 partitions, placing sequential ranges of 10M values in the same partition - very handy if you have very active access to values which have been added to the table recently.</p>
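<p>To make the mapping concrete, here is a minimal Python sketch of where a given id lands under the partitioning clause above. It relies on the documented behavior of HASH partitioning (a row goes to partition MOD(expr, N)); the function and constant names are made up for illustration and are not part of MySQL.</p>

```python
# Sketch of how a row is assigned to a partition under
# PARTITION BY HASH (id div 10000000) PARTITIONS 32.
# Names below are illustrative only, not MySQL identifiers.
RANGE_SIZE = 10_000_000   # ids per sequential range, from the post
NUM_PARTITIONS = 32       # partition count, from the post


def partition_for(row_id: int) -> int:
    """Return the partition number that holds the given id.

    HASH partitioning stores a row in partition MOD(expr, N), where expr
    is the partitioning expression and N is the number of partitions.
    `id div 10000000` is integer division, which matches `//` for the
    non-negative ids of an UNSIGNED AUTO_INCREMENT column.
    """
    return (row_id // RANGE_SIZE) % NUM_PARTITIONS


if __name__ == "__main__":
    for row_id in (1, 9_999_999, 10_000_000, 319_999_999, 320_000_000):
        print(row_id, "->", partition_for(row_id))
```

<p>Ids 0 through 9,999,999 map to partition 0, the next 10M to partition 1, and so on, wrapping around after 320M ids, so the most recent inserts and deletes concentrate on one or two partitions at a time.</p>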
<p>Using this trick, replication could be sped up about 10 times, as the couple of partitions which were actively used could fit in the buffer pool completely, so replication became CPU bound (single thread) instead of IO bound.</p>
<p>You could celebrate, but hey... you need to check the impact on the master too. The master in its turn was getting a lot of lookups by the uu value, which is not part of the partitioning key, and hence we're looking at 32 logical lookups, one per partition. True, only one of the partitions would contain the value, but many of them will require physical IO and going down to the leaf key to verify that such a value does not exist, which reduced performance for random selects by UUID from 400 to 20 per second (from a single thread).</p>
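<p>The fan-out described above can be illustrated with a toy model, assuming each partition behaves like an independent table with its own primary key and uu index. This is plain Python, not MySQL internals; the class and method names are hypothetical, and it only counts how many partitions a lookup has to probe.</p>

```python
# Toy model (an assumption for illustration, not MySQL code): every
# partition is a separate table with its own indexes.
RANGE_SIZE = 10_000_000
NUM_PARTITIONS = 32


class PartitionedTable:
    def __init__(self):
        # One primary key index {id: uu} and one secondary index
        # {uu: id} per partition.
        self.by_id = [dict() for _ in range(NUM_PARTITIONS)]
        self.by_uu = [dict() for _ in range(NUM_PARTITIONS)]

    def _partition(self, row_id):
        return (row_id // RANGE_SIZE) % NUM_PARTITIONS

    def insert(self, row_id, uu):
        p = self._partition(row_id)
        self.by_id[p][row_id] = uu
        self.by_uu[p][uu] = row_id

    def find_by_id(self, row_id):
        """Lookup by the partitioning key: exactly one partition is probed."""
        p = self._partition(row_id)
        return self.by_id[p].get(row_id), 1

    def find_by_uu(self, uu):
        """Lookup by a non-partitioning key: every partition is probed,
        because the value says nothing about which partition holds it."""
        found, probed = None, 0
        for index in self.by_uu:
            probed += 1
            if uu in index:
                found = index[uu]
        return found, probed


if __name__ == "__main__":
    t = PartitionedTable()
    t.insert(5, "uuid-a")
    t.insert(25_000_000, "uuid-b")
    print(t.find_by_id(25_000_000))
    print(t.find_by_uu("uuid-b"))
```

<p>In the model, a lookup by id touches one index while a lookup by uu touches all 32, which is the source of the extra physical IO once those indexes do not fit in the buffer pool.</p>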
<p>Decreasing the number of partitions made replication less efficient, but the number of selects the table could deliver increased, and there seems to be a reasonable number which would allow replication to perform better than it does now while selects still perform at the rate the system needs.</p>
<p>What is the takeaway? When you're creating partitions, think clearly about what you're trying to achieve. Partitioning is not some magic feature which just makes everything a lot faster. I've seen some people applying partitioning to basically all of their tables without much thought, and believe me, the results were not pretty.</p>
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by Peter Zaitsev</p>]]></content:encoded>
			<wfw:commentRss>http://planetmysql.ru/2010/12/11/mysql-partitioning-%e2%80%93-can-save-you-or-kill-you/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How well does your table fit in the InnoDB buffer pool?</title>
		<link>http://www.mysqlperformanceblog.com/2010/12/09/how-well-does-your-table-fits-in-innodb-buffer-pool/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=how-well-does-your-table-fits-in-innodb-buffer-pool</link>
		<comments>http://www.mysqlperformanceblog.com/2010/12/09/how-well-does-your-table-fits-in-innodb-buffer-pool/#comments</comments>
		<pubDate>Fri, 10 Dec 2010 01:59:48 +0000</pubDate>
		<dc:creator>MySQL Performance Blog</dc:creator>
				<category><![CDATA[InnoDB]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[percona]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[production]]></category>
		<category><![CDATA[xtradb]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=4166</guid>
		<description><![CDATA[Understanding how well your tables and indexes fit in the buffer pool is often very helpful for understanding why some queries are IO bound and others are not - it may be because the tables and indexes they access are not in the cache, for example having been washed away by other queries. MySQL Server does not provide any information of this type; Percona Server, however, adds a number of tables to Information Schema which make this information available. It is just a few queries away:
SQL:
mysql> SELECT `schema` AS table_schema, innodb_sys_tables.name AS table_name, innodb_sys_indexes.name AS index_name, cnt, dirty, hashed, round(cnt*100/index_size,2) fit_pct
    FROM (SELECT index_id, count(*) cnt, sum(dirty=1) dirty, sum(hashed=1) hashed FROM innodb_buffer_pool_pages_index GROUP BY index_id) bp
    JOIN innodb_sys_indexes ON id=index_id
    JOIN innodb_sys_tables ON table_id=innodb_sys_tables.id
    JOIN innodb_index_stats ON innodb_index_stats.table_name=innodb_sys_tables.name AND innodb_sys_indexes.name=innodb_index_stats.index_name AND innodb_index_stats.table_schema=innodb_sys_tables.schema
    ORDER BY cnt DESC LIMIT 20;
+--------------+--------------+--------------+------+-------+--------+---------+
| table_schema | table_name   | index_name   | cnt  | dirty | hashed | fit_pct |
+--------------+--------------+--------------+------+-------+--------+---------+
| test         | a            | c            | 7976 |     0 |      0 |   13.73 |
| test         | a            | PRIMARY      |   59 |     0 |      0 |    0.08 |
| sbtest       | sbtest#P#p1  | PRIMARY      |   22 |     0 |      0 |   22.68 |
| sbtest       | sbtest#P#p0  | PRIMARY      |   22 |     0 |      0 |   22.68 |
| sbtest       | sbtest#P#p2  | PRIMARY      |   21 |     0 |      0 |   21.65 |
| sbtest       | sbtest#P#p3  | PRIMARY      |   18 |     0 |      0 |   18.56 |
| sbtest       | sbtest#P#p3  | k            |    4 |     0 |      0 |  100.00 |
| sbtest       | sbtest#P#p2  | k            |    4 |     0 |      0 |  100.00 |
| sbtest       | sbtest#P#p1  | k            |    4 |     0 |      0 |  100.00 |
| sbtest       | sbtest#P#p0  | k            |    4 |     0 |      0 |  100.00 |
| stats        | TABLES       | PRIMARY      |    2 |     0 |      0 |   66.67 |
| stats        | TABLES       | TABLE_SCHEMA |    1 |     0 |      0 |  100.00 |
| percona      | transactions | PRIMARY      |    1 |     0 |      0 |  100.00 |
+--------------+--------------+--------------+------+-------+--------+---------+
13 rows in set (0.04 sec)
This query shows how many pages of each index are in the buffer pool (cnt), how many of them are dirty (dirty), and what percentage of the index fits in memory (fit_pct).
For illustration purposes I've created one table with partitions to show that you get the real "physical" table name, which identifies the table down to the partition. This is very helpful for analysis of your
access to partitions - you can actually check whether your "hot" partitions really end up in the cache and the "cold" ones stay out of it, or whether something is happening which pushes them out of the cache.
You can use this feature to tune the buffer pool invalidation strategy, for example by playing with innodb_old_blocks_pct and innodb_old_blocks_time while actually observing the data stored in the buffer pool rather than using some form of temporary measures.
I often check these stats during warmup to see what really gets warmed up first, as well as how the buffer pool is affected by batch jobs, ALTER TABLE, OPTIMIZE TABLE, etc. - the lasting impact these may have on system performance is often caused by their impact on the buffer pool, which may take hours to recover.
This tool can also be helpful for capacity planning/performance management. In many cases you would learn that you need a certain fit in the buffer pool for tables/indexes for reasonable performance; you may try to calculate it too, but that may be pretty hard as there are a lot of variables, including page fill factors, etc.
    
    Entry posted by Peter Zaitsev]]></description>
			<content:encoded><![CDATA[<p>Understanding how well your tables and indexes fit in the buffer pool is often very helpful for understanding why some queries are IO bound and others are not - it may be because the tables and indexes they access are not in the cache, for example having been washed away by other queries. MySQL Server does not provide any information of this type; <a href="http://www.percona.com/software/percona-server/">Percona Server</a>, however, adds a number of tables to Information Schema which make this information available. It is just a few queries away:</p>
<div><span>SQL:</span>
<div>
<div>
<ol>
<li>
<div>mysql&gt; <span>SELECT</span> <span>`schema`</span> <span>AS</span> table_schema,innodb_sys_tables.name <span>AS</span> table_name,innodb_sys_indexes.name <span>AS</span> index_name,cnt,dirty,hashed,round<span>&#40;</span>cnt*<span>100</span>/index_size,<span>2</span><span>&#41;</span> fit_pct&nbsp; &nbsp;<span>FROM</span> <span>&#40;</span><span>SELECT</span> index_id,count<span>&#40;</span>*<span>&#41;</span> cnt,sum<span>&#40;</span>dirty=<span>1</span><span>&#41;</span> dirty ,sum<span>&#40;</span>hashed=<span>1</span><span>&#41;</span> hashed <span>FROM</span> innodb_buffer_pool_pages_index <span>GROUP</span> <span>BY</span> index_id<span>&#41;</span> bp <span>JOIN</span> innodb_sys_indexes <span>ON</span> id=index_id <span>JOIN</span> innodb_sys_tables <span>ON</span> table_id=innodb_sys_tables.id <span>JOIN</span> innodb_index_stats <span>ON</span> innodb_index_stats.table_name=innodb_sys_tables.name <span>AND</span> innodb_sys_indexes.name=innodb_index_stats.index_name <span>AND</span> innodb_index_stats.table_schema=innodb_sys_tables.schema&nbsp; <span>ORDER</span> <span>BY</span> cnt <span>DESC</span> <span>LIMIT</span> <span>20</span>;</div>
</li>
<li>
<div>+<span>--------------+--------------+--------------+------+-------+--------+---------+</span></div>
</li>
<li>
<div>| table_schema | table_name&nbsp; &nbsp;| index_name&nbsp; &nbsp;| cnt&nbsp; | dirty | hashed | fit_pct |</div>
</li>
<li>
<div>+<span>--------------+--------------+--------------+------+-------+--------+---------+</span></div>
</li>
<li>
<div>| test&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;| a&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; | c&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; | <span>7976</span> |&nbsp; &nbsp; &nbsp;<span>0</span> |&nbsp; &nbsp; &nbsp; <span>0</span> |&nbsp; &nbsp;<span>13</span>.<span>73</span> |</div>
</li>
<li>
<div>| test&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;| a&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; | <span>PRIMARY</span>&nbsp; &nbsp; &nbsp; |&nbsp; &nbsp;<span>59</span> |&nbsp; &nbsp; &nbsp;<span>0</span> |&nbsp; &nbsp; &nbsp; <span>0</span> |&nbsp; &nbsp; <span>0</span>.<span>08</span> |</div>
</li>
<li>
<div>| sbtest&nbsp; &nbsp; &nbsp; &nbsp;| sbtest<span>#P#p1&nbsp; | PRIMARY&nbsp; &nbsp; &nbsp; |&nbsp; &nbsp;22 |&nbsp; &nbsp; &nbsp;0 |&nbsp; &nbsp; &nbsp; 0 |&nbsp; &nbsp;22.68 |</span></div>
</li>
<li>
<div>| sbtest&nbsp; &nbsp; &nbsp; &nbsp;| sbtest<span>#P#p0&nbsp; | PRIMARY&nbsp; &nbsp; &nbsp; |&nbsp; &nbsp;22 |&nbsp; &nbsp; &nbsp;0 |&nbsp; &nbsp; &nbsp; 0 |&nbsp; &nbsp;22.68 |</span></div>
</li>
<li>
<div>| sbtest&nbsp; &nbsp; &nbsp; &nbsp;| sbtest<span>#P#p2&nbsp; | PRIMARY&nbsp; &nbsp; &nbsp; |&nbsp; &nbsp;21 |&nbsp; &nbsp; &nbsp;0 |&nbsp; &nbsp; &nbsp; 0 |&nbsp; &nbsp;21.65 |</span></div>
</li>
<li>
<div>| sbtest&nbsp; &nbsp; &nbsp; &nbsp;| sbtest<span>#P#p3&nbsp; | PRIMARY&nbsp; &nbsp; &nbsp; |&nbsp; &nbsp;18 |&nbsp; &nbsp; &nbsp;0 |&nbsp; &nbsp; &nbsp; 0 |&nbsp; &nbsp;18.56 |</span></div>
</li>
<li>
<div>| sbtest&nbsp; &nbsp; &nbsp; &nbsp;| sbtest<span>#P#p3&nbsp; | k&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; |&nbsp; &nbsp; 4 |&nbsp; &nbsp; &nbsp;0 |&nbsp; &nbsp; &nbsp; 0 |&nbsp; 100.00 |</span></div>
</li>
<li>
<div>| sbtest&nbsp; &nbsp; &nbsp; &nbsp;| sbtest<span>#P#p2&nbsp; | k&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; |&nbsp; &nbsp; 4 |&nbsp; &nbsp; &nbsp;0 |&nbsp; &nbsp; &nbsp; 0 |&nbsp; 100.00 |</span></div>
</li>
<li>
<div>| sbtest&nbsp; &nbsp; &nbsp; &nbsp;| sbtest<span>#P#p1&nbsp; | k&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; |&nbsp; &nbsp; 4 |&nbsp; &nbsp; &nbsp;0 |&nbsp; &nbsp; &nbsp; 0 |&nbsp; 100.00 |</span></div>
</li>
<li>
<div>| sbtest&nbsp; &nbsp; &nbsp; &nbsp;| sbtest<span>#P#p0&nbsp; | k&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; |&nbsp; &nbsp; 4 |&nbsp; &nbsp; &nbsp;0 |&nbsp; &nbsp; &nbsp; 0 |&nbsp; 100.00 |</span></div>
</li>
<li>
<div>| stats&nbsp; &nbsp; &nbsp; &nbsp; | <span>TABLES</span>&nbsp; &nbsp; &nbsp; &nbsp;| <span>PRIMARY</span>&nbsp; &nbsp; &nbsp; |&nbsp; &nbsp; <span>2</span> |&nbsp; &nbsp; &nbsp;<span>0</span> |&nbsp; &nbsp; &nbsp; <span>0</span> |&nbsp; &nbsp;<span>66</span>.<span>67</span> |</div>
</li>
<li>
<div>| stats&nbsp; &nbsp; &nbsp; &nbsp; | <span>TABLES</span>&nbsp; &nbsp; &nbsp; &nbsp;| TABLE_SCHEMA |&nbsp; &nbsp; <span>1</span> |&nbsp; &nbsp; &nbsp;<span>0</span> |&nbsp; &nbsp; &nbsp; <span>0</span> |&nbsp; <span>100</span>.<span>00</span> |</div>
</li>
<li>
<div>| percona&nbsp; &nbsp; &nbsp; | transactions | <span>PRIMARY</span>&nbsp; &nbsp; &nbsp; |&nbsp; &nbsp; <span>1</span> |&nbsp; &nbsp; &nbsp;<span>0</span> |&nbsp; &nbsp; &nbsp; <span>0</span> |&nbsp; <span>100</span>.<span>00</span> |</div>
</li>
<li>
<div>+<span>--------------+--------------+--------------+------+-------+--------+---------+</span></div>
</li>
<li>
<div><span>13</span> rows in set <span>&#40;</span><span>0</span>.<span>04</span> sec<span>&#41;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>This query shows how many pages of each index are in the buffer pool (cnt), how many of them are dirty (dirty), and what percentage of the index fits in memory (fit_pct).<br />
For illustration purposes I've created one table with partitions to show that you get the real "physical" table name, which identifies the table down to the partition. This is very helpful for analysis of your<br />
access to partitions - you can actually check whether your "hot" partitions really end up in the cache and the "cold" ones stay out of it, or whether something is happening which pushes them out of the cache.</p>
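<p>As a quick check of how fit_pct is derived, here is a small Python sketch of the round(cnt*100/index_size,2) expression from the query. It assumes that index_size is expressed in pages, like cnt; the function name is made up for illustration.</p>

```python
def fit_pct(cached_pages: int, index_size_pages: int) -> float:
    """Percentage of an index currently held in the buffer pool.

    Mirrors round(cnt*100/index_size, 2) from the query, under the
    assumption that both values are page counts.
    """
    return round(cached_pages * 100 / index_size_pages, 2)


if __name__ == "__main__":
    # An index of 3 pages with 2 of them cached.
    print(fit_pct(2, 3))
    # A small index that is cached completely.
    print(fit_pct(4, 4))
```

<p>A value of 100.00 means every page of the index is cached, while a low value on a heavily used index suggests lookups against it will often go to disk.</p>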
<p>You can use this feature to tune the buffer pool invalidation strategy, for example by playing with innodb_old_blocks_pct and innodb_old_blocks_time while actually observing the data stored in the buffer pool rather than using some form of temporary measures.</p>
<p>I often check these stats during warmup to see what really gets warmed up first, as well as how the buffer pool is affected by batch jobs, ALTER TABLE, OPTIMIZE TABLE, etc. - the lasting impact these may have on system performance is often caused by their impact on the buffer pool, which may take hours to recover.</p>
<p>This tool can also be helpful for capacity planning/performance management. In many cases you would learn that you need a certain fit in the buffer pool for tables/indexes for reasonable performance; you may try to calculate it too, but that may be pretty hard as there are a lot of variables, including page fill factors, etc.</p>
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by Peter Zaitsev</p>]]></content:encoded>
			<wfw:commentRss>http://planetmysql.ru/2010/12/10/how-well-does-your-table-fits-in-innodb-buffer-pool/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Thinking about running OPTIMIZE on your Innodb Table ?  Stop!</title>
		<link>http://www.mysqlperformanceblog.com/2010/12/09/thinking-about-running-optimize-on-your-innodb-table-stop/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=thinking-about-running-optimize-on-your-innodb-table-stop</link>
		<comments>http://www.mysqlperformanceblog.com/2010/12/09/thinking-about-running-optimize-on-your-innodb-table-stop/#comments</comments>
		<pubDate>Thu, 09 Dec 2010 21:06:16 +0000</pubDate>
		<dc:creator>MySQL Performance Blog</dc:creator>
				<category><![CDATA[InnoDB]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[production]]></category>
		<category><![CDATA[xtradb]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=4163</guid>
		<description><![CDATA[InnoDB/XtraDB tables often benefit from being reorganized. You get the data physically laid out in primary key order, as well as a better fill factor for primary key and index pages, and so use less space.
It is just that OPTIMIZE TABLE might not be the best way to do it.
If you're running the InnoDB Plugin or Percona Server with XtraDB, you get the benefit of a great new feature: the ability to build indexes by sort instead of via insertion. This process can be a lot faster, especially for large indexes that would get inserts in very random order, such as indexes on a UUID column or something similar. It also produces a much better fill factor. The problem is that OPTIMIZE TABLE for InnoDB tables does not take advantage of it, for whatever reason.
Let's take a look at a little benchmark I did by running OPTIMIZE a second time on a table that is some 10 times larger than the amount of memory I allocated for the buffer pool:
SQL:




CREATE TABLE `a` &#40;


&#160; `id` int&#40;10&#41; UNSIGNED NOT NULL AUTO_INCREMENT,


&#160; `c` char&#40;64&#41; DEFAULT NULL,


&#160; PRIMARY KEY &#40;`id`&#41;,


&#160; KEY `c` &#40;`c`&#41;


&#41; ENGINE=InnoDB AUTO_INCREMENT=12582913 DEFAULT CHARSET=latin1


&#160;


mysql&#62; SELECT * FROM a ORDER BY id LIMIT 10;


+----+------------------------------------------+


&#124; id &#124; c&#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#124;


+----+------------------------------------------+


&#124;&#160; 1 &#124; 813cf02d7d65de2639014dd1fb574d4c481ecac7 &#124;


&#124;&#160; 2 &#124; 62960f5d5d50651e5a5983dacaedfa9a73a9ee87 &#124;


&#124;&#160; 3 &#124; cea33998792ffe28b16b9272b950102a9633439f &#124;


&#124;&#160; 4 &#124; 8346a7afa0a0791693338d96a07a944874340a1c &#124;


&#124;&#160; 5 &#124; b00faaa432f507a0d16d2940ca8ec36699f141c8 &#124;


&#124;&#160; 6 &#124; 8e00926cf6c9b13dc8e0664a744b7116c5c61036 &#124;


&#124;&#160; 7 &#124; f151fe34b66fd4d28521d5e7ccb68b0d5d81f21b &#124;


&#124;&#160; 8 &#124; 7fceb5afa200a27b81cab45f94903ce04d6f24db &#124;


&#124;&#160; 9 &#124; 0397562dc35b5242842d68de424aa9f0b409d60f &#124;


&#124; 10 &#124; af8efbaef7010a1a3bfdff6609e5c233c897e1d5 &#124;


+----+------------------------------------------+


10 rows in set &#40;0.04 sec&#41;


&#160;


# These are just random SHA1 hashes


&#160;


mysql&#62; OPTIMIZE TABLE a;


+--------+----------+----------+-------------------------------------------------------------------+


&#124; Table&#160; &#124; Op&#160; &#160; &#160; &#160;&#124; Msg_type &#124; Msg_text&#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#124;


+--------+----------+----------+-------------------------------------------------------------------+


&#124; test.a &#124; optimize &#124; note&#160; &#160; &#160;&#124; Table does not support optimize, doing recreate + analyze instead &#124;


&#124; test.a &#124; optimize &#124; status&#160; &#160;&#124; OK&#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#124;


+--------+----------+----------+-------------------------------------------------------------------+


2 rows in set &#40;3 hours 3 min 35.15 sec&#41;


&#160;


mysql&#62; ALTER TABLE a DROP KEY c;


Query OK, 0 rows affected &#40;0.46 sec&#41;


Records: 0&#160; Duplicates: 0&#160; Warnings: 0


&#160;


mysql&#62; OPTIMIZE TABLE a;


+--------+----------+----------+-------------------------------------------------------------------+


&#124; Table&#160; &#124; Op&#160; &#160; &#160; &#160;&#124; Msg_type &#124; Msg_text&#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#124;


+--------+----------+----------+-------------------------------------------------------------------+


&#124; test.a &#124; optimize &#124; note&#160; &#160; &#160;&#124; Table does not support optimize, doing recreate + analyze instead &#124;


&#124; test.a &#124; optimize &#124; status&#160; &#160;&#124; OK&#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#160; &#124;


+--------+----------+----------+-------------------------------------------------------------------+


2 rows in set &#40;4 min 5.52 sec&#41;


&#160;


mysql&#62; ALTER TABLE a ADD KEY&#40;c&#41;;


Query OK, 0 rows affected &#40;5 min 51.83 sec&#41;


Records: 0&#160; Duplicates: 0&#160; Warnings: 0 






That's right! Optimizing the table straight away takes over 3 hours, while dropping the indexes besides the primary key, optimizing the table and adding them back takes about 10 minutes, which is close to a 20x speed difference, with a more compact index in the end.
So if you're considering running OPTIMIZE on your tables, consider using this trick. It is especially handy when you're running it on a slave, where it is OK for the table to be exposed without indexes for some time.
Note, though, that nothing stops you from using LOCK TABLES on an InnoDB table to ensure there isn't a ton of queries starting to read the table with no indexes and bringing the box down.
You can also use this trick for an ALTER TABLE that requires a table rebuild. Dropping all indexes, doing the ALTER and then adding them back can be a lot faster than a straight ALTER TABLE.
P.S. I do not know why this was not done when support for creating indexes by sorting was implemented. It looks very strange to me to have this feature implemented while the majority of high-level commands
or tools (like mysqldump) do not take advantage of it and still use the old, slow method of building indexes by insertion.
    
    Entry posted by peter &#124;
      2 comments
    ]]></description>
			<content:encoded><![CDATA[<p>InnoDB/XtraDB tables often benefit from being reorganized. You get the data physically laid out in primary key order, as well as a better fill factor for primary key and index pages, and so use less space.<br />
It is just that OPTIMIZE TABLE might not be the best way to do it.</p>
<p>If you're running the InnoDB Plugin or Percona Server with XtraDB, you get the benefit of a great new feature: the ability to build indexes by sort instead of via insertion. This process can be a lot faster, especially for large indexes that would get inserts in very random order, such as indexes on a UUID column or something similar. It also produces a much better fill factor. The problem is that OPTIMIZE TABLE for InnoDB tables does not take advantage of it, for whatever reason.</p>
<p>Let's take a look at a little benchmark I did by running OPTIMIZE a second time on a table that is some 10 times larger than the amount of memory I allocated for the buffer pool:</p>
<div><span>SQL:</span>
<div>
<div>
<ol>
<li>
<div><span>CREATE</span> <span>TABLE</span> <span>`a`</span> <span>&#40;</span></div>
</li>
<li>
<div>&nbsp; <span>`id`</span> int<span>&#40;</span><span>10</span><span>&#41;</span> <span>UNSIGNED</span> <span>NOT</span> <span>NULL</span> <span>AUTO_INCREMENT</span>,</div>
</li>
<li>
<div>&nbsp; <span>`c`</span> char<span>&#40;</span><span>64</span><span>&#41;</span> <span>DEFAULT</span> <span>NULL</span>,</div>
</li>
<li>
<div>&nbsp; <span>PRIMARY</span> <span>KEY</span> <span>&#40;</span><span>`id`</span><span>&#41;</span>,</div>
</li>
<li>
<div>&nbsp; <span>KEY</span> <span>`c`</span> <span>&#40;</span><span>`c`</span><span>&#41;</span></div>
</li>
<li>
<div><span>&#41;</span> ENGINE=InnoDB <span>AUTO_INCREMENT</span>=<span>12582913</span> <span>DEFAULT</span> CHARSET=latin1</div>
</li>
<li>
<div>&nbsp;</div>
</li>
<li>
<div>mysql&gt; <span>SELECT</span> * <span>FROM</span> a <span>ORDER</span> <span>BY</span> id <span>LIMIT</span> <span>10</span>;</div>
</li>
<li>
<div>+<span>----+------------------------------------------+</span></div>
</li>
<li>
<div>| id | c&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; |</div>
</li>
<li>
<div>+<span>----+------------------------------------------+</span></div>
</li>
<li>
<div>|&nbsp; <span>1</span> | 813cf02d7d65de2639014dd1fb574d4c481ecac7 |</div>
</li>
<li>
<div>|&nbsp; <span>2</span> | 62960f5d5d50651e5a5983dacaedfa9a73a9ee87 |</div>
</li>
<li>
<div>|&nbsp; <span>3</span> | cea33998792ffe28b16b9272b950102a9633439f |</div>
</li>
<li>
<div>|&nbsp; <span>4</span> | 8346a7afa0a0791693338d96a07a944874340a1c |</div>
</li>
<li>
<div>|&nbsp; <span>5</span> | b00faaa432f507a0d16d2940ca8ec36699f141c8 |</div>
</li>
<li>
<div>|&nbsp; <span>6</span> | 8e00926cf6c9b13dc8e0664a744b7116c5c61036 |</div>
</li>
<li>
<div>|&nbsp; <span>7</span> | f151fe34b66fd4d28521d5e7ccb68b0d5d81f21b |</div>
</li>
<li>
<div>|&nbsp; <span>8</span> | 7fceb5afa200a27b81cab45f94903ce04d6f24db |</div>
</li>
<li>
<div>|&nbsp; <span>9</span> | 0397562dc35b5242842d68de424aa9f0b409d60f |</div>
</li>
<li>
<div>| <span>10</span> | af8efbaef7010a1a3bfdff6609e5c233c897e1d5 |</div>
</li>
<li>
<div>+<span>----+------------------------------------------+</span></div>
</li>
<li>
<div><span>10</span> rows in set <span>&#40;</span><span>0</span>.<span>04</span> sec<span>&#41;</span></div>
</li>
<li>
<div>&nbsp;</div>
</li>
<li>
<div><span># These are just random SHA1 hashes</span></div>
</li>
<li>
<div>&nbsp;</div>
</li>
<li>
<div>mysql&gt; <span>OPTIMIZE</span> <span>TABLE</span> a;</div>
</li>
<li>
<div>+<span>--------+----------+----------+-------------------------------------------------------------------+</span></div>
</li>
<li>
<div>| Table&nbsp; | Op&nbsp; &nbsp; &nbsp; &nbsp;| Msg_type | Msg_text&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; |</div>
</li>
<li>
<div>+<span>--------+----------+----------+-------------------------------------------------------------------+</span></div>
</li>
<li>
<div>| test.a | optimize | note&nbsp; &nbsp; &nbsp;| Table does not support optimize, doing recreate + analyze instead |</div>
</li>
<li>
<div>| test.a | optimize | status&nbsp; &nbsp;| OK&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; |</div>
</li>
<li>
<div>+<span>--------+----------+----------+-------------------------------------------------------------------+</span></div>
</li>
<li>
<div><span>2</span> rows in set <span>&#40;</span><span>3</span> hours <span>3</span> min <span>35</span>.<span>15</span> sec<span>&#41;</span></div>
</li>
<li>
<div>&nbsp;</div>
</li>
<li>
<div>mysql&gt; <span>ALTER</span> <span>TABLE</span> a <span>DROP</span> <span>KEY</span> c;</div>
</li>
<li>
<div>Query OK, <span>0</span> rows affected <span>&#40;</span><span>0</span>.<span>46</span> sec<span>&#41;</span></div>
</li>
<li>
<div>Records: <span>0</span>&nbsp; Duplicates: <span>0</span>&nbsp; Warnings: <span>0</span></div>
</li>
<li>
<div>&nbsp;</div>
</li>
<li>
<div>mysql&gt; <span>OPTIMIZE</span> <span>TABLE</span> a;</div>
</li>
<li>
<div>+<span>--------+----------+----------+-------------------------------------------------------------------+</span></div>
</li>
<li>
<div>| Table&nbsp; | Op&nbsp; &nbsp; &nbsp; &nbsp;| Msg_type | Msg_text&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; |</div>
</li>
<li>
<div>+<span>--------+----------+----------+-------------------------------------------------------------------+</span></div>
</li>
<li>
<div>| test.a | optimize | note&nbsp; &nbsp; &nbsp;| Table does not support optimize, doing recreate + analyze instead |</div>
</li>
<li>
<div>| test.a | optimize | status&nbsp; &nbsp;| OK&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; |</div>
</li>
<li>
<div>+<span>--------+----------+----------+-------------------------------------------------------------------+</span></div>
</li>
<li>
<div><span>2</span> rows in set <span>&#40;</span><span>4</span> min <span>5</span>.<span>52</span> sec<span>&#41;</span></div>
</li>
<li>
<div>&nbsp;</div>
</li>
<li>
<div>mysql&gt; <span>ALTER</span> <span>TABLE</span> a <span>ADD</span> <span>KEY</span><span>&#40;</span>c<span>&#41;</span>;</div>
</li>
<li>
<div>Query OK, <span>0</span> rows affected <span>&#40;</span><span>5</span> min <span>51</span>.<span>83</span> sec<span>&#41;</span></div>
</li>
<li>
<div>Records: <span>0</span>&nbsp; Duplicates: <span>0</span>&nbsp; Warnings: <span>0</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>That's right! Optimizing the table straight away takes over 3 hours, while dropping the indexes besides the primary key, optimizing the table and adding them back takes about 10 minutes, which is close to a <strong>20x speed difference</strong>, with a more compact index in the end.</p>
<p>So if you're considering running OPTIMIZE on your tables, consider using this trick. It is especially handy when you're running it on a slave, where it is OK for the table to be exposed without indexes for some time.<br />
Note, though, that nothing stops you from using LOCK TABLES on an InnoDB table to ensure there isn't a ton of queries starting to read the table with no indexes and bringing the box down.</p>
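<p>Before dropping anything it helps to know exactly which secondary indexes will have to come back. A minimal sketch, assuming the benchmark table a in schema test as shown in the OPTIMIZE output above, lists them from INFORMATION_SCHEMA. The column alias index_columns is only an illustrative name:</p>
<div><span>SQL:</span>
<div>
<div>
<ol>
<li>
<div>-- every index on test.a except the primary key, with its columns in index order</div>
</li>
<li>
<div>SELECT INDEX_NAME, NON_UNIQUE,</div>
</li>
<li>
<div>&nbsp; GROUP_CONCAT(COLUMN_NAME ORDER BY SEQ_IN_INDEX) AS index_columns</div>
</li>
<li>
<div>FROM INFORMATION_SCHEMA.STATISTICS</div>
</li>
<li>
<div>WHERE TABLE_SCHEMA = 'test' AND TABLE_NAME = 'a' AND INDEX_NAME != 'PRIMARY'</div>
</li>
<li>
<div>GROUP BY INDEX_NAME, NON_UNIQUE;</div>
</li>
</ol>
</div>
</div>
</div>
<p>This sketch ignores prefix lengths (the SUB_PART column), so when prefix indexes are involved the output of SHOW CREATE TABLE is probably the safer source for the definitions to re-create.</p>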
<p>You can also use this trick for an ALTER TABLE that requires a table rebuild. Dropping all indexes, doing the ALTER and then adding them back can be a lot faster than a straight ALTER TABLE.</p>
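<p>As a rough sketch of that ALTER TABLE variant, assume a hypothetical table t with secondary indexes k1 on col1 and k2 on col2, and a column change that forces a rebuild. None of these names come from the benchmark, and the column change is only an example:</p>
<div><span>SQL:</span>
<div>
<div>
<ol>
<li>
<div>-- hypothetical table, column and index names throughout</div>
</li>
<li>
<div>ALTER TABLE t DROP KEY k1, DROP KEY k2;</div>
</li>
<li>
<div>-- the change that rebuilds the table, now with only the primary key to maintain</div>
</li>
<li>
<div>ALTER TABLE t MODIFY col2 BIGINT UNSIGNED NOT NULL;</div>
</li>
<li>
<div>-- re-create the secondary indexes in one statement</div>
</li>
<li>
<div>ALTER TABLE t ADD KEY k1 (col1), ADD KEY k2 (col2);</div>
</li>
</ol>
</div>
</div>
</div>
<p>While the indexes are absent nothing enforces uniqueness for dropped UNIQUE keys and queries that relied on the indexes fall back to full scans, so this fits a slave or a maintenance window better than a busy master.</p>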
<p>P.S. I do not know why this was not done when support for creating indexes by sorting was implemented. It looks very strange to me to have this feature implemented while the majority of high-level commands<br />
or tools (like mysqldump) do not take advantage of it and still use the old, slow method of building indexes by insertion.</p>
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by peter |
      <a href="http://www.mysqlperformanceblog.com/2010/12/09/thinking-about-running-optimize-on-your-innodb-table-stop/#comments">2 comments</a></p>
]]></content:encoded>
			<wfw:commentRss>http://planetmysql.ru/2010/12/10/thinking-about-running-optimize-on-your-innodb-table-stop/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Getting History of Table Sizes in MySQL</title>
		<link>http://www.mysqlperformanceblog.com/2010/12/08/getting-history-of-table-sizes-in-mysql/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=getting-history-of-table-sizes-in-mysql</link>
		<comments>http://www.mysqlperformanceblog.com/2010/12/08/getting-history-of-table-sizes-in-mysql/#comments</comments>
		<pubDate>Thu, 09 Dec 2010 06:02:44 +0000</pubDate>
		<dc:creator>MySQL Performance Blog</dc:creator>
				<category><![CDATA[mysql]]></category>
		<category><![CDATA[production]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=4148</guid>
		<description><![CDATA[One data point that is very helpful, but that surprisingly few people have, is the history of table sizes. Projecting data growth is a very important component of capacity planning, and simply watching the space used on a partition grow is not very helpful.
Now that MySQL 5.0+ has INFORMATION_SCHEMA, collecting and keeping this data is very easy:
SQL:




CREATE DATABASE stats;


USE stats;


CREATE TABLE `tables` &#40;


`DAY` date NOT NULL,


`TABLE_SCHEMA` varchar&#40;64&#41; NOT NULL DEFAULT '',


`TABLE_NAME` varchar&#40;64&#41; NOT NULL DEFAULT '',


`ENGINE` varchar&#40;64&#41; DEFAULT NULL,


`TABLE_ROWS` bigint&#40;21&#41; UNSIGNED DEFAULT NULL,


`DATA_LENGTH` bigint&#40;21&#41; UNSIGNED DEFAULT NULL,


`INDEX_LENGTH` bigint&#40;21&#41; UNSIGNED DEFAULT NULL,


`DATA_FREE` bigint&#40;21&#41; UNSIGNED DEFAULT NULL,


`AUTO_INCREMENT` bigint&#40;21&#41; UNSIGNED DEFAULT NULL,


PRIMARY KEY&#40;DAY,TABLE_SCHEMA,TABLE_NAME&#41;,


KEY&#40;TABLE_SCHEMA,TABLE_NAME&#41;


&#41; ENGINE=INNODB DEFAULT CHARSET=utf8; 






And use this query to populate it:
SQL:




INSERT INTO stats.tables SELECT DATE&#40;NOW&#40;&#41;&#41;,TABLE_SCHEMA,TABLE_NAME,ENGINE,TABLE_ROWS,DATA_LENGTH,INDEX_LENGTH,DATA_FREE,AUTO_INCREMENT FROM INFORMATION_SCHEMA.TABLES; 






I put it in cron to run nightly as:
Crontab:




1&#160; 0&#160; &#160;*&#160; *&#160; &#160; *&#160; &#160; mysql -u root -e "INSERT INTO stats.tables SELECT DATE(NOW()),TABLE_SCHEMA,TABLE_NAME,ENGINE,TABLE_ROWS,DATA_LENGTH,INDEX_LENGTH,DATA_FREE,AUTO_INCREMENT FROM INFORMATION_SCHEMA.TABLES" 






Though if you want to keep it completely inside MySQL, you can create an appropriate event in MySQL 5.1+.
Unless you have millions of tables, this is something you can set up and forget, and a year later, when someone asks you about the growth rate of an individual table, you will have it handy.
    
    Entry posted by peter &#124;
      One comment
    ]]></description>
			<content:encoded><![CDATA[<p>One data point that is very helpful, but that surprisingly few people have, is the history of table sizes. Projecting data growth is a very important component of capacity planning, and simply watching the space used on a partition grow is not very helpful.</p>
<p>Now that MySQL 5.0+ has INFORMATION_SCHEMA, collecting and keeping this data is very easy:</p>
<div><span>SQL:</span>
<div>
<div>
<ol>
<li>
<div><span>CREATE</span> <span>DATABASE</span> stats;</div>
</li>
<li>
<div><span>USE</span> stats;</div>
</li>
<li>
<div><span>CREATE</span> <span>TABLE</span> <span>`tables`</span> <span>&#40;</span></div>
</li>
<li>
<div><span>`DAY`</span> date <span>NOT</span> <span>NULL</span>,</div>
</li>
<li>
<div><span>`TABLE_SCHEMA`</span> varchar<span>&#40;</span><span>64</span><span>&#41;</span> <span>NOT</span> <span>NULL</span> <span>DEFAULT</span> <span>''</span>,</div>
</li>
<li>
<div><span>`TABLE_NAME`</span> varchar<span>&#40;</span><span>64</span><span>&#41;</span> <span>NOT</span> <span>NULL</span> <span>DEFAULT</span> <span>''</span>,</div>
</li>
<li>
<div><span>`ENGINE`</span> varchar<span>&#40;</span><span>64</span><span>&#41;</span> <span>DEFAULT</span> <span>NULL</span>,</div>
</li>
<li>
<div><span>`TABLE_ROWS`</span> bigint<span>&#40;</span><span>21</span><span>&#41;</span> <span>UNSIGNED</span> <span>DEFAULT</span> <span>NULL</span>,</div>
</li>
<li>
<div><span>`DATA_LENGTH`</span> bigint<span>&#40;</span><span>21</span><span>&#41;</span> <span>UNSIGNED</span> <span>DEFAULT</span> <span>NULL</span>,</div>
</li>
<li>
<div><span>`INDEX_LENGTH`</span> bigint<span>&#40;</span><span>21</span><span>&#41;</span> <span>UNSIGNED</span> <span>DEFAULT</span> <span>NULL</span>,</div>
</li>
<li>
<div><span>`DATA_FREE`</span> bigint<span>&#40;</span><span>21</span><span>&#41;</span> <span>UNSIGNED</span> <span>DEFAULT</span> <span>NULL</span>,</div>
</li>
<li>
<div><span>`AUTO_INCREMENT`</span> bigint<span>&#40;</span><span>21</span><span>&#41;</span> <span>UNSIGNED</span> <span>DEFAULT</span> <span>NULL</span>,</div>
</li>
<li>
<div><span>PRIMARY</span> <span>KEY</span><span>&#40;</span>DAY,TABLE_SCHEMA,TABLE_NAME<span>&#41;</span>,</div>
</li>
<li>
<div><span>KEY</span><span>&#40;</span>TABLE_SCHEMA,TABLE_NAME<span>&#41;</span></div>
</li>
<li>
<div><span>&#41;</span> ENGINE=INNODB <span>DEFAULT</span> CHARSET=utf8; </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>And use this query to populate it:</p>
<div><span>SQL:</span>
<div>
<div>
<ol>
<li>
<div><span>INSERT</span> <span>INTO</span> stats.tables <span>SELECT</span> DATE<span>&#40;</span>NOW<span>&#40;</span><span>&#41;</span><span>&#41;</span>,TABLE_SCHEMA,TABLE_NAME,ENGINE,TABLE_ROWS,DATA_LENGTH,INDEX_LENGTH,DATA_FREE,<span>AUTO_INCREMENT</span> <span>FROM</span> INFORMATION_SCHEMA.<span>TABLES</span>; </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>I put it in cron to run nightly as:</p>
<div><span>Crontab:</span>
<div>
<div>
<ol>
<li>
<div><span>1</span>&nbsp; <span>0</span>&nbsp; &nbsp;*&nbsp; *&nbsp; &nbsp; *&nbsp; &nbsp; mysql -u root -e <span>"INSERT INTO stats.tables SELECT DATE(NOW()),TABLE_SCHEMA,TABLE_NAME,ENGINE,TABLE_ROWS,DATA_LENGTH,INDEX_LENGTH,DATA_FREE,AUTO_INCREMENT FROM INFORMATION_SCHEMA.TABLES"</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>Though if you want to keep it completely inside MySQL, you can create an appropriate event in MySQL 5.1+.</p>
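<p>A minimal sketch of such an event, assuming the stats.tables table defined above. The event name collect_table_sizes is hypothetical, and the event scheduler has to be enabled for the event to fire:</p>
<div><span>SQL:</span>
<div>
<div>
<ol>
<li>
<div>-- the scheduler is off by default in MySQL 5.1; this setting does not survive a restart</div>
</li>
<li>
<div>SET GLOBAL event_scheduler = ON;</div>
</li>
<li>
<div>-- hypothetical event name; runs the same INSERT as the cron job, once a day</div>
</li>
<li>
<div>CREATE EVENT stats.collect_table_sizes</div>
</li>
<li>
<div>&nbsp; ON SCHEDULE EVERY 1 DAY</div>
</li>
<li>
<div>&nbsp; DO</div>
</li>
<li>
<div>&nbsp; &nbsp; INSERT INTO stats.tables</div>
</li>
<li>
<div>&nbsp; &nbsp; SELECT DATE(NOW()),TABLE_SCHEMA,TABLE_NAME,ENGINE,TABLE_ROWS,DATA_LENGTH,INDEX_LENGTH,DATA_FREE,AUTO_INCREMENT</div>
</li>
<li>
<div>&nbsp; &nbsp; FROM INFORMATION_SCHEMA.TABLES;</div>
</li>
</ol>
</div>
</div>
</div>
<p>With no STARTS clause the first run happens when the event is created and then every 24 hours after that. Because the primary key begins with DAY, a second collection on the same day fails with a duplicate-key error instead of double-counting, so it is probably best to use either the cron job or the event, not both.</p>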
<p>Unless you have millions of tables, this is something you can set up and forget, and a year later, when someone asks you about the growth rate of an individual table, you will have it handy.</p>
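<p>Once some history has accumulated, a query along the following lines could answer the growth question. It is only a sketch against the stats.tables table defined above: the 30-day window, the LIMIT and the aliases first_day, last_day and bytes_grown are arbitrary choices for illustration:</p>
<div><span>SQL:</span>
<div>
<div>
<ol>
<li>
<div>-- largest minus smallest recorded size per table over the last 30 days</div>
</li>
<li>
<div>SELECT TABLE_SCHEMA, TABLE_NAME,</div>
</li>
<li>
<div>&nbsp; MIN(`DAY`) AS first_day, MAX(`DAY`) AS last_day,</div>
</li>
<li>
<div>&nbsp; MAX(DATA_LENGTH + INDEX_LENGTH) - MIN(DATA_LENGTH + INDEX_LENGTH) AS bytes_grown</div>
</li>
<li>
<div>FROM stats.tables</div>
</li>
<li>
<div>WHERE `DAY` &gt;= CURDATE() - INTERVAL 30 DAY</div>
</li>
<li>
<div>GROUP BY TABLE_SCHEMA, TABLE_NAME</div>
</li>
<li>
<div>ORDER BY bytes_grown DESC</div>
</li>
<li>
<div>LIMIT 10;</div>
</li>
</ol>
</div>
</div>
</div>
<p>The difference between the largest and smallest size equals the growth only for tables that grow steadily. For a table that was purged or rebuilt inside the window it shows the size of the swing instead, so the per-day rows are worth a look before drawing conclusions.</p>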
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by peter |
      <a href="http://www.mysqlperformanceblog.com/2010/12/08/getting-history-of-table-sizes-in-mysql/#comments">One comment</a></p>
]]></content:encoded>
			<wfw:commentRss>http://planetmysql.ru/2010/12/09/getting-history-of-table-sizes-in-mysql/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Replication of MEMORY (HEAP) Tables</title>
		<link>http://www.mysqlperformanceblog.com/2010/10/15/replication-of-memory-heap-tables/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=replication-of-memory-heap-tables</link>
		<comments>http://www.mysqlperformanceblog.com/2010/10/15/replication-of-memory-heap-tables/#comments</comments>
		<pubDate>Sat, 16 Oct 2010 05:36:51 +0000</pubDate>
		<dc:creator>MySQL Performance Blog</dc:creator>
				<category><![CDATA[mysql]]></category>
		<category><![CDATA[production]]></category>
		<category><![CDATA[Replication]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=3705</guid>
		<description><![CDATA[Some applications need to store transient data that is frequently regenerated, and MEMORY tables look like a very good match for this sort of task. Unfortunately this will bite you when you look to add replication to your environment, as MEMORY tables do not play well with replication.
The reason is very simple &#8211; with both STATEMENT and ROW replication, the binary logs contain changes to the data. This requires the data to be the same on the master and the slave. When you restart the slave you lose the contents of your MEMORY tables and replication breaks. STATEMENT replication will often continue to run, with the contents of the table just being
different, as there are few checks on whether statements produce the same results on the slave. ROW replication will
complain about the row not existing for an UPDATE or DELETE operation.
So what can you do?
Use an InnoDB table instead: InnoDB is quite fast when the data fits in memory, so for most applications its performance will be enough, and it will save you from all the complexity of the different workarounds.
Do not replicate MEMORY tables: If you do not really need the MEMORY table on the slaves, you can skip replicating it by specifying replicate-ignore-table=db.memory_table. Note that you should not be using STATEMENT level replication with INSERT &#8230; SELECT into this memory table for this to work. Be careful using data
on the slave in this case, as the table will be empty. Another nice trick sometimes is to make the slave generate its own
copy of the table, for example by running the same cron jobs the master runs to refresh this table periodically.
Restart slaves carefully: I would not use this as a long-term solution, as there are going to be cases when the slave does not restart normally &#8211; the power goes down, MySQL crashes, etc. If, however, you are using a MEMORY table in replication and just want to do a restart without breaking replication, you can do the following:
Add skip-slave-start to your my.cnf; run SLAVE STOP; dump all your MEMORY tables using mysqldump; restart MySQL as planned; load the dumped tables; run SLAVE START; remove skip-slave-start from the config file. Be careful using this with master-master or chain/tree replication: in that case you will need to disable binary logging while loading the data from mysqldump, as you may not want these changes to be replicated.
What could have been done better?
MySQL could have features to make this more convenient. It would be great to have a MEMORY table option that saves the table to an on-disk file on shutdown and loads it back on startup. Of course you would lose the data on an unclean start, but it is still handy for a lot of cases.
We could have an option similar to slave-skip-errors but specified on a per-table basis. This would allow me to simply ignore all replication errors for a MEMORY table, which would make things more robust if the table is
regenerated periodically. It can be helpful in many other cases too.
    
    Entry posted by peter &#124;
      No comment
    ]]></description>
			<content:encoded><![CDATA[<p>Some Applications need to store some transient data which is frequently regenerated and MEMORY table look like a very good match for this sort of tasks.   Unfortunately this will bite when you will be looking to add Replication to your environment as MEMORY tables do not play well with replication.</p>
<p>The reason is very simple: both STATEMENT and ROW replication record changes to the data in the binary logs.  This requires the data to be the same on the master and the slave.    When you restart the slave you lose the contents of your MEMORY tables and replication breaks.   STATEMENT replication will often continue to run, with the contents of the table simply being different, as there are few checks on whether statements produce the same results on the slave.  ROW replication will complain about the row not existing for an UPDATE or DELETE operation.  </p>
<p>So what can you do?</p>
<p><strong>Use InnoDB Tables Instead </strong>    InnoDB is quite fast when the data fits in memory, so for most applications its performance will be sufficient and it will save you from all the complexity of the workarounds below. </p>
<p><strong>Do not replicate MEMORY tables</strong>  If you do not really need the MEMORY table on the slaves you can skip replicating it by specifying <strong>replicate-ignore-table=db.memory_table</strong>.   Note that for this to work you should not be using STATEMENT level replication with INSERT ... SELECT into this MEMORY table.   Be careful using the data on the slave in this case, as the table will be empty.     Another nice trick is sometimes to have the slave generate its own copy of the table, for example by running the same cron jobs the master runs to refresh this table periodically.</p>
<p><strong>Restart Slaves Carefully </strong>  I would not rely on this as a long-term solution, because there will be times when the slave does not restart cleanly: power goes down, MySQL crashes, etc.   If, however, you are using MEMORY tables in replication and just want to do a planned restart without breaking replication, you can do the following:<br />
Add <strong>skip-slave-start</strong> to your my.cnf;  run STOP SLAVE;  dump all your MEMORY tables using mysqldump;  restart MySQL as planned;  load the dumped tables;  run START SLAVE;  remove skip-slave-start from the config file.    Be careful using this with master-master or chain/tree replication.   In that case you will need to disable binary logging while loading the data from mysqldump, as you probably do not want these changes to be replicated.</p>
<p>What could be done better?</p>
<p>MySQL could have features to make this more convenient.  It would be great to have a MEMORY table option which would save the table to an on-disk file on shutdown and load it back on startup.  Of course you would still lose the data after an unclean shutdown, but it would be handy in a lot of cases.</p>
<p>We could also have an option similar to <strong>slave-skip-errors</strong> but specified on a per-table basis.  This would let me simply ignore all replication errors for a MEMORY table, which would make things more robust if the table is regenerated periodically.  It could be helpful in many other cases too.</p>
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by peter |
      <a href="http://www.mysqlperformanceblog.com/2010/10/15/replication-of-memory-heap-tables/#comments">No comment</a></p>
]]></content:encoded>
			<wfw:commentRss>http://planetmysql.ru/2010/10/16/replication-of-memory-heap-tables/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The story of one MySQL Upgrade</title>
		<link>http://www.mysqlperformanceblog.com/2010/10/08/the-story-of-one-mysql-upgrade/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=the-story-of-one-mysql-upgrade</link>
		<comments>http://www.mysqlperformanceblog.com/2010/10/08/the-story-of-one-mysql-upgrade/#comments</comments>
		<pubDate>Sat, 09 Oct 2010 01:10:02 +0000</pubDate>
		<dc:creator>MySQL Performance Blog</dc:creator>
				<category><![CDATA[mysql]]></category>
		<category><![CDATA[production]]></category>
		<category><![CDATA[Replication]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=3684</guid>
		<description><![CDATA[I recently worked on upgrading MySQL from one of the very early MySQL 5.0 versions to Percona Server 5.1.  This was a classic upgrade scenario of the kind that can cause surprises.  A master and a few slaves needed to be upgraded.  It is a shared database used by tons of applications written by many people over a timeframe of more than 5 years. It did not have any extensive test suite we could use for validation. As you might guess, in such cases some of the original authors have moved on and nobody is exactly sure what the applications do or do not do with the database.  The database is production critical, with a serious role in a serious company, so we cannot just do a reckless upgrade.
First we needed to do a sanity check on the existing replication setup.  As we would be checking replication consistency down the road, we needed to make sure replication was in sync to begin with, to avoid false positives.  mk-table-checksum is the tool for this.   It turned out replication did indeed have an issue replicating triggers.  The problem should be fixed by the upgrade, so we just had to take this into account.
We moved the database to MySQL 5.1.  As the database is relatively small we did a mysqldump and load, which is the safest way considering we are talking about 4 years' worth of changes between versions.  We also ran mysql_fix_privilege_tables to ensure all new privileges were added, which is something I frequently see forgotten.
The next step was to set up MySQL 5.0 to 5.1 replication to see if it runs properly.  It turned out it did not, because of an old bug which I have also seen causing upgrade problems in a number of other environments.   INSERT ON DUPLICATE KEY UPDATE had an unfair share of replication issues in MySQL 5.0.   There are a number of ways the problem can be solved, but first we decided to see how broad it was.   We let the slave replicate with slave-skip-errors=1105 to see if any other problems showed up, and in the meantime we went over the binary logs for the last month to see how frequently this functionality was used.  Happily there were only a few INSERT ON DUPLICATE KEY UPDATE query instances, and only one of them was into a table with an AUTO_INCREMENT column (and so affected by this bug). It was easy enough to change that single application not to use INSERT ON DUPLICATE KEY UPDATE in this instance, so that was done.
So replication was running properly, but did the data match?  (This check would also catch data improperly loaded by mysqldump, if there was any.)   We stopped the 5.0 and 5.1 slaves at the same position and used mk-table-checksum to ensure the data was in sync.  mk-table-checksum can use replication to check consistency, but comparing 2 servers directly is faster and we had spare capacity we could use.      First we ran the check using the default CHECKSUM TABLE algorithm.  A number of tables reported mismatched checksums, while running SELECT INTO OUTFILE and diffing the resulting files showed no differences.  It turns out there have been some subtle changes to CHECKSUM TABLE over the years which can produce different checksums in some cases.   Rerunning the check using the BIT_XOR algorithm eliminated those false positives.   One more table remained, though.    We used mk-table-sync --print as a diff tool for MySQL to see what was different in the tables.  It turned out one of the float columns stored "-0" in MySQL 5.0, but it was displayed as "0" when the data was loaded into Percona Server 5.1. This was not an issue for the application and could be ignored.
So at this point we were sure the write traffic replicated properly to the new setup. It was time to check how read traffic behaves.  We stopped both slaves at the same position again and used tcpdump and mk-query-digest to get sample read traffic from both the master and a slave.   The --sample=50 (or similar) option is important, so that only a limited number of samples is checked for each query type; otherwise it can take a lot of time.  Running mk-upgrade with these queries showed some result differences which turned out to be false positives too, thanks to the CHECKSUM TABLE method mk-upgrade uses by default to compare result sets.  --compare-results-method rows helped to remove them, and we were down to only query time differences.   In most cases the query time differences were not significant or Percona Server 5.1 did better, but there were a couple of queries where the optimizer plan changed to a significantly worse one, and they were flagged to be fixed.
At this point we were confident enough that the slaves could handle the traffic and we could put them in production.  Before upgrading the master, however, we had to think about a rollback plan in case something went wrong and we needed to go back to MySQL 5.0 on the master.   To do this we set up replication from Percona Server 5.1 back to MySQL 5.0 and performed the same checks again; happily replication worked and there was no "drift".  This allowed us to simply hook up the old MySQL 5.0 master and all its slaves as a slave of the new master and keep it for some time, making rollback to the old setup trivial.  This was the best choice because, along with the new MySQL version, the upgrade involved a new operating system and new hardware, and any of them could be a potential cause of rollback.
A MySQL upgrade is, in my opinion, a process where hiring an external consultant especially makes sense.  The team, even if it includes a skilled MySQL DBA, typically does not need to go through major version upgrades more frequently than every 3-5 years, so unless a lot of applications are being upgraded by the same team it is hard to accumulate experience.  Also, the problems you encounter during an upgrade are very different depending on the versions involved: the upgrade from MySQL 4.1 to 5.0 had a very different set of issues than the upgrade from MySQL 5.0 to 5.1.
Also, Maatkit is awesome, though I believe you know that already.
    
    Entry posted by peter &#124;
      No comment
]]></description>
			<content:encoded><![CDATA[<p>I recently worked on upgrading MySQL from one of the very early MySQL 5.0 versions to Percona Server 5.1.  This was a classic upgrade scenario of the kind that can cause surprises.  A master and a few slaves needed to be upgraded.  It is a shared database used by tons of applications written by many people over a timeframe of more than 5 years. It did not have any extensive test suite we could use for validation. As you might guess, in such cases some of the original authors have moved on and nobody is exactly sure what the applications do or do not do with the database.  The database is production critical, with a serious role in a serious company, so we cannot just do a <a href="http://www.mysqlperformanceblog.com/2010/01/05/upgrading-mysql/">reckless upgrade</a>.</p>
<p>First we needed to do a sanity check on the existing replication setup.  As we would be checking replication consistency down the road, we needed to make sure replication was in sync to begin with, to avoid false positives.  <a href="http://www.maatkit.org/doc/mk-table-checksum.html">mk-table-checksum</a> is the tool for this.   It turned out replication did indeed have an issue replicating triggers.  The problem should be fixed by the upgrade, so we just had to take this into account.</p>
<p>We moved the database to MySQL 5.1.  As the database is relatively small we did a mysqldump and load, which is the safest way considering we are talking about 4 years' worth of changes between versions.  We also ran <strong>mysql_fix_privilege_tables</strong> to ensure all new privileges were added, which is something I frequently see forgotten.</p>
<p>The next step was to set up MySQL 5.0 to 5.1 replication to see if it runs properly.  It turned out it did not, because of an <a href="http://bugs.mysql.com/bug.php?id=24432">old bug</a> which I have also seen causing upgrade problems in a number of other environments.   INSERT ON DUPLICATE KEY UPDATE had an unfair share of replication issues in MySQL 5.0.   There are a number of ways the problem can be solved, but first we decided to see how broad it was.   We let the slave replicate with <strong>slave-skip-errors=1105</strong> to see if any other problems showed up, and in the meantime we went over the binary logs for the last month to see how frequently this functionality was used.  Happily there were only a few INSERT ON DUPLICATE KEY UPDATE query instances, and only one of them was into a table with an AUTO_INCREMENT column (and so affected by this bug). It was easy enough to change that single application not to use INSERT ON DUPLICATE KEY UPDATE in this instance, so that was done.</p>
<p>So replication was running properly, but did the data match?  (This check would also catch data improperly loaded by mysqldump, if there was any.)   We stopped the 5.0 and 5.1 slaves at the same position and used mk-table-checksum to ensure the data was in sync.  mk-table-checksum can use replication to check consistency, but comparing 2 servers directly is faster and we had spare capacity we could use.      First we ran the check using the default CHECKSUM TABLE algorithm.  A number of tables reported mismatched checksums, while running SELECT INTO OUTFILE and diffing the resulting files showed no differences.  It turns out there have been some subtle changes to CHECKSUM TABLE over the years which can produce different checksums in some cases.   Rerunning the check using the BIT_XOR algorithm eliminated those false positives.   One more table remained, though.    We used <a href="http://www.maatkit.org/doc/mk-table-sync.html">mk-table-sync --print</a> as a diff tool for MySQL to see what was different in the tables.  It turned out one of the float columns stored "-0" in MySQL 5.0, but it was displayed as "0" when the data was loaded into Percona Server 5.1. This was not an issue for the application and could be ignored.</p>
<p>So at this point we were sure the write traffic replicated properly to the new setup. It was time to check how read traffic behaves.  We stopped both slaves at the same position again and used tcpdump and <a href="http://www.maatkit.org/doc/mk-query-digest.html">mk-query-digest</a> to get sample read traffic from both the master and a slave.  The <strong>--sample=50</strong> (or similar) option is important, so that only a limited number of samples is checked for each query type; otherwise it can take a lot of time.  Running <a href="http://www.maatkit.org/doc/mk-upgrade.html">mk-upgrade</a> with these queries showed some result differences which turned out to be false positives too, thanks to the CHECKSUM TABLE method mk-upgrade uses by default to compare result sets.  <strong>--compare-results-method rows</strong> helped to remove them, and we were down to only query time differences.   In most cases the query time differences were not significant or Percona Server 5.1 did better, but there were a couple of queries where the optimizer plan changed to a significantly worse one, and they were flagged to be fixed.</p>
<p>At this point we were confident enough that the slaves could handle the traffic and we could put them in production.  Before upgrading the master, however, we had to think about a rollback plan in case something went wrong and we needed to go back to MySQL 5.0 on the master.   To do this we set up replication from Percona Server 5.1 back to MySQL 5.0 and performed the same checks again; happily replication worked and there was no "drift".  This allowed us to simply hook up the old MySQL 5.0 master and all its slaves as a slave of the new master and keep it for some time, making rollback to the old setup trivial.  This was the best choice because, along with the new MySQL version, the upgrade involved a new operating system and new hardware, and any of them could be a potential cause of rollback.</p>
<p>A MySQL upgrade is, in my opinion, a process where hiring an external consultant especially makes sense.  The team, even if it includes a skilled MySQL DBA, typically does not need to go through major version upgrades more frequently than every 3-5 years, so unless a lot of applications are being upgraded by the same team it is hard to accumulate experience.  Also, the problems you encounter during an upgrade are very different depending on the versions involved: the upgrade from MySQL 4.1 to 5.0 had a very different set of issues than the upgrade from MySQL 5.0 to 5.1.</p>
<p>Also, <a href="http://www.maatkit.org/">Maatkit</a> is awesome, though I believe you know that already.</p>
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by peter |
      <a href="http://www.mysqlperformanceblog.com/2010/10/08/the-story-of-one-mysql-upgrade/#comments">No comment</a></p>
]]></content:encoded>
			<wfw:commentRss>http://planetmysql.ru/2010/10/09/the-story-of-one-mysql-upgrade/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Cache Miss Storm</title>
		<link>http://www.mysqlperformanceblog.com/2010/09/10/cache-miss-storm/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=cache-miss-storm</link>
		<comments>http://www.mysqlperformanceblog.com/2010/09/10/cache-miss-storm/#comments</comments>
		<pubDate>Sat, 11 Sep 2010 00:07:35 +0000</pubDate>
		<dc:creator>MySQL Performance Blog</dc:creator>
				<category><![CDATA[mysql]]></category>
		<category><![CDATA[production]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=3569</guid>
		<description><![CDATA[I recently worked on a problem which showed up as follows: a rather lightly loaded MySQL server (probably 5% CPU usage and close to zero IO) would spike to hundreds of threads running at the same time, causing an intense utilization spike and leaving the server very unresponsive for anywhere from half a minute to ten minutes, until everything went back to normal.  What was interesting is that the same query was taking a large portion of the slots in PROCESSLIST. I do not just mean queries with the same fingerprint, but literally the same query with the same constants.
What we observed was a cache miss storm, a situation which can happen with memcache (as in this case) as well as with the query cache.  If you have an item which is expensive to generate but which gets a lot of hits in the cache, you can get into a situation where many clients miss in the cache at once and all attempt to re-create the item, pushing the server into overload. And because a lot of requests are being processed in parallel, the initial request may take a lot longer than if it ran all by itself, increasing the time it takes the server to recover.
What do I mean by an expensive query in this case? It is a query whose request rate is too high to be served with 100% misses for a portion of time.  For example, if I have 100 accesses per second for a given cache object and it takes 500ms to populate it, it is still too expensive, because during the 500ms it takes to populate the item, 50 requests will be started (this is the average case; because of random arrivals the worst case is worse), which take 25 seconds to deal with (assuming there is just one execution unit). Because we normally have multiple cores and multiple drives it can be less than that, but it is enough to cause a hiccup of a few seconds, which is unacceptable for a lot of modern applications.
How can you deal with this problem?  You should carefully watch frequently accessed cache items as well as cache items which take long to generate in case of a cache miss.   To find the first kind for memcached you can use mk-query-digest to analyze which items are requested frequently; it can decode memcached wire traffic.   For the second you can add instrumentation to your applications or take a look at the MySQL slow query log, which is good enough if you populate each cache item with a single query.
Optimize the query if you can.  This is a good thing to do in any case, but it may not be the whole solution.  Some query patterns get slower over time as the data size grows or the execution plan changes, and some items can become hot unexpectedly due to changes in content interest or the launch of new features.
Use a Smarter Cache  Especially with memcache, it is you who decides how to populate the cache. There are a number of techniques you can use to avoid this problem, such as probabilistic invalidation; you can also put a special value in the cache to indicate the item is being updated right now, so other clients had better wait rather than start populating it themselves.  For the MySQL Query Cache the solution would be to make queries wait for the first query started to complete. Unfortunately this has not been implemented so far.
Pre-Populate the cache  In some cases you cannot easily change how caching works, especially if it is built into the application; however, it may be easy enough to identify hot items and pre-populate them before they expire. So if an item expires in 15 minutes you can refresh it every 10 minutes and get basically no misses.  This works best when there are only a few hot cache entries causing the problem.
So if your server has seemingly random spikes of activity, check this out: a cache miss storm could be one of the possible causes.
    
    Entry posted by peter &#124;
      No comment
]]></description>
			<content:encoded><![CDATA[<p>I recently worked on a problem which showed up as follows: a rather lightly loaded MySQL server (probably 5% CPU usage and close to zero IO) would spike to hundreds of threads running at the same time, causing an intense utilization spike and leaving the server very unresponsive for anywhere from half a minute to ten minutes, until everything went back to normal.  What was interesting is that the same query was taking a large portion of the slots in PROCESSLIST. I do not just mean queries with the same fingerprint, but literally the same query with the same constants.</p>
<p>What we observed was a <strong>cache miss storm</strong>, a situation which can happen with memcache (as in this case) as well as with the query cache.  If you have an item which is expensive to generate but which gets a lot of hits in the cache, you can get into a situation where many clients miss in the cache at once and all attempt to re-create the item, pushing the server into overload. And because a lot of requests are being processed in parallel, the initial request may take a lot longer than if it ran all by itself, increasing the time it takes the server to recover.</p>
<p>What do I mean by an expensive query in this case? It is a query whose request rate is too high to be served with 100% misses for a portion of time.  For example, if I have 100 accesses per second for a given cache object and it takes 500ms to populate it, it is still too expensive, because during the 500ms it takes to populate the item, 50 requests will be started (this is the average case; because of random arrivals the worst case is worse), which take 25 seconds to deal with (assuming there is just one execution unit). Because we normally have multiple cores and multiple drives it can be less than that, but it is enough to cause a hiccup of a few seconds, which is unacceptable for a lot of modern applications.</p>
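<p>The arithmetic above can be written out as a rough sketch. The function name and the <code>execution_units</code> parameter are illustrative assumptions, and the model is deliberately naive: it assumes evenly spaced arrivals and work that divides perfectly across execution units.</p>

```python
def miss_storm_backlog(requests_per_second, populate_seconds, execution_units=1):
    """Estimate the duplicate work queued while one hot cache item is rebuilt."""
    # Requests that arrive, miss, and start their own rebuild during one rebuild.
    duplicate_requests = requests_per_second * populate_seconds
    # Each of those requests repeats the full rebuild cost.
    total_work_seconds = duplicate_requests * populate_seconds
    # Idealized assumption: the work spreads evenly over all execution units.
    drain_seconds = total_work_seconds / execution_units
    return duplicate_requests, drain_seconds


# The example from the text: 100 requests per second, 500ms to populate.
requests, drain = miss_storm_backlog(100, 0.5)
print(requests, drain)
```

<p>With the numbers from the text this gives 50 duplicate requests and 25 seconds of work on a single execution unit; passing a larger <code>execution_units</code> shows how more cores shorten the hiccup without removing it.</p>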
<p>How can you deal with this problem?  You should carefully watch <strong>frequently accessed cache items</strong> as well as cache items which <strong>take long to generate</strong> in case of a cache miss.   To find the first kind for memcached you can use <a href="http://www.maatkit.org/doc/mk-query-digest.html">mk-query-digest</a> to analyze which items are requested frequently; it can decode memcached wire traffic.   For the second you can add instrumentation to your applications or take a look at the MySQL slow query log, which is good enough if you populate each cache item with a single query.</p>
<p><strong>Optimize the query if you can.</strong>  This is a good thing to do in any case, but it may not be the whole solution.  Some query patterns get slower over time as the data size grows or the execution plan changes, and some items can become hot unexpectedly due to changes in content interest or the launch of new features. </p>
<p><strong>Use a Smarter Cache</strong>  Especially with memcache, it is you who decides how to populate the cache. There are a number of techniques you can use to avoid this problem, such as probabilistic invalidation; you can also put a special value in the cache to indicate the item is being updated right now, so other clients had better wait rather than start populating it themselves.  For the MySQL Query Cache the solution would be to make queries wait for the first query started to complete. Unfortunately this has not been implemented so far.</p>
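<p>A minimal sketch of the "special value" idea follows. It is an illustration only, not the code used in the case described: <code>DictCache</code> is a hypothetical in-process stand-in for memcached, and <code>UPDATING</code> and <code>get_or_rebuild</code> are made-up names. The one property it relies on is an atomic "store only if absent" operation, which memcached provides as its <code>add</code> command.</p>

```python
import time

UPDATING = "__updating__"  # hypothetical marker meaning "someone is rebuilding this"


class DictCache:
    """Tiny in-process stand-in for memcached, for illustration only."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def add(self, key, value):
        # Store only if the key is absent, like the memcached add command.
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def set(self, key, value):
        self._data[key] = value


def get_or_rebuild(cache, key, rebuild, wait_seconds=0.05, max_waits=100):
    """Return the cached value, letting only one caller rebuild a missing item."""
    for _ in range(max_waits):
        value = cache.get(key)
        if value is not None and value != UPDATING:
            return value  # normal cache hit
        if value is None and cache.add(key, UPDATING):
            # We placed the marker first, so we are the one who rebuilds.
            value = rebuild()
            cache.set(key, value)
            return value
        # Someone else is rebuilding: wait instead of piling on.
        time.sleep(wait_seconds)
    # Waited too long; compute the value ourselves without touching the cache.
    return rebuild()
```

<p>In a real deployment the marker would need a short expiry, and would need to be removed if the rebuild fails, so that a crashed client does not leave everyone else waiting.</p>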
<p><strong>Pre-Populate the cache </strong> In some cases you cannot easily change how caching works, especially if it is built into the application; however, it may be easy enough to identify hot items and pre-populate them before they expire. So if an item expires in 15 minutes you can refresh it every 10 minutes and get basically no misses.  This works best when there are only a few hot cache entries causing the problem.</p>
<p>So if your server has seemingly random spikes of activity, check this out: a cache miss storm could be one of the possible causes.</p>
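<p>The refresh schedule can be sketched like this. The cache here is a plain dictionary holding (value, expiry time) pairs, and all names are hypothetical; in practice the refresh would be something like a cron job writing to the real cache with a 15 minute expiry.</p>

```python
TTL_SECONDS = 15 * 60      # items expire after 15 minutes
REFRESH_SECONDS = 10 * 60  # but are rebuilt every 10 minutes


def refresh_hot_items(cache, builders, now):
    """Rebuild every known hot item and store it with a fresh expiry time."""
    for key, build in builders.items():
        cache[key] = (build(), now + TTL_SECONDS)


def lookup(cache, key, now):
    """Return the cached value, or None if it is missing or has expired."""
    entry = cache.get(key)
    if entry is None or now >= entry[1]:
        return None
    return entry[0]


# Simulate 30 minutes of lookups, one per minute, with a refresh every 10 minutes.
cache = {}
builders = {"front_page": lambda: "rendered page"}
misses = 0
for now in range(0, 1801, 60):
    if now % REFRESH_SECONDS == 0:
        refresh_hot_items(cache, builders, now)
    if lookup(cache, "front_page", now) is None:
        misses += 1
print(misses)
```

<p>Because each refresh lands 5 minutes before the previous copy would expire, readers never see a miss as long as the refresh job keeps running.</p>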
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by peter |
      <a href="http://www.mysqlperformanceblog.com/2010/09/10/cache-miss-storm/#comments">No comment</a></p>
]]></content:encoded>
			<wfw:commentRss>http://planetmysql.ru/2010/09/11/cache-miss-storm/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Estimating Replication Capacity</title>
		<link>http://www.mysqlperformanceblog.com/2010/07/20/estimating-replication-capacity/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=estimating-replication-capacity</link>
		<comments>http://www.mysqlperformanceblog.com/2010/07/20/estimating-replication-capacity/#comments</comments>
		<pubDate>Wed, 21 Jul 2010 02:51:11 +0000</pubDate>
		<dc:creator>MySQL Performance Blog</dc:creator>
				<category><![CDATA[mysql]]></category>
		<category><![CDATA[production]]></category>
		<category><![CDATA[Replication]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=3373</guid>
		<description><![CDATA[It is easy for MySQL replication to become a bottleneck even when the master server is not seriously loaded, and the more cores and hard drives servers get, the larger the difference becomes, as long as replication remains a single-threaded process.   At the same time it is a lot easier to optimize your system when replication runs normally: if you need to add or remove indexes or make other schema changes, you would probably be looking at methods involving replication if you cannot take your system down. So here comes the catch in many systems: we find the system is in need of optimization when replication cannot catch up, yet the optimization process we are going to use relies on replication being functional and able to catch up quickly.
So the question becomes how we can estimate replication capacity, so that we can deal with replication load before the slave is unable to catch up.
Need to replication capacity is not only needed in case you're planning to use replication to perform system optimization, it is also needed on other cases. For example in sharded environment you may need to schedule downtime or set object read only to move it to another shard.  It is much nicer if it can be planned in advance rather than done on emergency basics when slave(s) are unable to catch up and application is suffering because of stale data.    This especially applies to Software as Service providers which often have very strict SLA agreements with their customers and which can have a lot of data per customer so move can take considerable amount of time.
So what is replication capacity  I call replication capacity the ability to  replicate the master load.  If replication is able to replicate 3 times the write load from the master without falling behind I will call it replication capacity of 3.   When used with context of applying binary logs (for example point in time recovery from backup) replication capacity of 1 will mean  you can reply 1 hour worth of binary logs within 1 hour.     I will call "replication load"  the inverse of replication capacity -  this is basically what percentage of time the replication thread was busy replicating events vs staying idle. 
Note you can speak about idle replication capacity, when box does not do anything else as well as loaded replication capacity  when the box serves the normal load.  Both are important. You care about idle replication capacity when you have no load on the slave and need it to catch up or when restoring from backup, the loaded replication capacity matters during normal operation.
So we defined what replication capacity is. There is however no tools which can tell us straight what replication capacity is for the given system.  It also tends to float depending on the load similar as loadavg metrics.      Here are some of the ways to measure it:
1) Use "UserStats"  functionality from Google patches,  which is now available in Percona Server and MariaDB. This is the probably the easiest and most accurate approach but it
does not work in Oracle MySQL Server.   set  userstat_running=1  and run following query:
mysql&#62; SELECT * FROM information_schema.user_statistics WHERE user="#mysql_system#" \G
*************************** 1. row ***************************
USER: #mysql_system#
TOTAL_CONNECTIONS: 1
CONCURRENT_CONNECTIONS: 0
CONNECTED_TIME: 446
BUSY_TIME: 74
CPU_TIME: 0
BYTES_RECEIVED: 0
BYTES_SENT: 63
BINLOG_BYTES_WRITTEN: 0
ROWS_FETCHED: 0
ROWS_UPDATED: 127576
TABLE_ROWS_READ: 4085689
SELECT_COMMANDS: 0
UPDATE_COMMANDS: 119127
OTHER_COMMANDS: 89557
COMMIT_TRANSACTIONS: 90259
ROLLBACK_TRANSACTIONS: 0
DENIED_CONNECTIONS: 1
LOST_CONNECTIONS: 0
ACCESS_DENIED: 0
EMPTY_QUERIES: 0
1 row in set (0.00 sec)
In this case CONNECTED_TIME is 446 seconds, out of which the replication thread was busy (BUSY_TIME) for 74 seconds, which means replication capacity is 446/74 = 6.03, or about 6.
You normally would not want to measure it from the start but rather take the difference in these counters every 5 minutes or at another interval of your choice.
2) Use the full slow query log and mk-query-digest. This method is great for one-time execution, especially as it also gives you the list of queries which load replication the most. It works, however, only with statement-level replication. You need to set long_query_time=0 and log_slave_slow_statements=1 for this method to work.
Get the log file, which will include all queries the MySQL server ran with their times, and run mk-query-digest with a filter to check only queries from the replication thread:
mk-query-digest slow-log --filter '($event-&#62;{user} || "") =~ m/SLAVE_THREAD/' &#62; /tmp/report-slave.txt
In the report you will see something like this as a header:
# 475s user time, 1.2s system time, 80.41M rss, 170.38M vsz
# Current date: Mon Jul 19 15:12:24 2010
# Files: slow-log
# Overall: 1.22M total, 1.27k unique, 558.56 QPS, 0.37x concurrency ______
# total min max avg 95% stddev median
# Exec time 819s 1us 92s 669us 260us 120ms 93us
# Lock time 28s 0 166ms 23us 49us 192us 25us
# Rows sent 4.27k 0 325 0.00 0 1.04 0
# Rows exam 30.88M 0 1.28M 26.48 0 3.07k 0
# Time range 2010-07-19 14:35:53 to 2010-07-19 15:12:22
# bytes 350.99M 5 1022.34k 301.01 719.66 5.75k 124.25
# Bytes sen 1.94M 0 9.42k 1.67 0 110.38 0
# Killed 0 0 0 0 0 0 0
# Last errn 34.11M 0 1.55k 29.26 0 185.83 0
# Merge pas 0 0 0 0 0 0 0
# Rows affe 875.19k 0 17.55k 0.73 0.99 25.61 0.99
# Rows read 2.20M 0 14.83k 1.88 1.96 24.68 1.96
# Tmp disk 4.15k 0 1 0.00 0 0.06 0
# Tmp table 14.19k 0 2 0.01 0 0.14 0
# Tmp table 8.30G 0 2.01M 7.12k 0 117.75k 0
# 0% (5k) Filesort
# 0% (5k) Full_join
# 0% (7k) Full_scan
# 0% (10k) Tmp_table
# 0% (4k) Tmp_table_on_disk
There are a lot of interesting things you can find in this header, but in relation to replication capacity you can get the replication load, which is the same as the "concurrency" figure (0.37x). The concurrency as reported by mk-query-digest is the sum of query execution time divided by the time range the log file covers. Since we know there is only one replication thread, it is the same as replication load. This gives us a replication capacity of 1/0.37 = 2.70.
This method should work with the original MySQL Server in theory, though I have not tested it. In some versions log_slave_slow_statements was unreliable, and you may also need to adjust the regular expression to match the user name the replication thread uses.
3) Processlist Polling. This method is simple: the slave thread has a different status in SHOW PROCESSLIST depending on whether it is processing a query or simply waiting. By polling the processlist frequently (for example 10 times a second) we can compute the approximate percentage of time the thread was busy vs idle. Of course, running SHOW PROCESSLIST very aggressively can add overhead, especially on a busy system with a lot of connections.
mysql&#62; SHOW processlist;
+--------+-------------+-----------+------+---------+------+-----------------------------------------------------------------------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+--------+-------------+-----------+------+---------+------+-----------------------------------------------------------------------+------------------+
| 801812 | system user | | NULL | Connect | 2665 | Waiting for master to send event | NULL |
| 801813 | system user | | NULL | Connect | 0 | Has read all relay log; waiting for the slave I/O thread to update it | NULL |
| 802354 | root | localhost | NULL | Query | 0 | NULL | SHOW processlist |
+--------+-------------+-----------+------+---------+------+-----------------------------------------------------------------------+------------------+
3 rows in set (0.00 sec)
4) Slave Catchup/Binlog Application method. We can take a spare server with backups restored on it and apply binary logs to it. If 1 hour's worth of binary logs applies in 10 minutes, we have a replication capacity of 6. The challenge, of course, is having a spare server around, and it is quite labor intensive. At the same time it can be a good measurement to take during backup recovery trials, when you're doing this activity anyway. This way you can also measure "cold" vs "hot" replication capacity, as well as how long replication warmup takes. It is very typical for servers with a cold cache to perform a lot slower than when they are warmed up. Measuring the time for each binary log separately should give you these numbers.
A less intrusive process, which can be done in production (especially if you have a slave used for backups, reporting, etc.), is to stop replication for some time and then see how long it takes to catch up. If you paused replication for 10 minutes and it took 5 minutes to catch up, your replication capacity is 3 (not 2), because you not only had to process the events for the outstanding 10 minutes but also for the 5 minutes it took to catch up. The formula is (Time_Replication_Paused+Time_Took_To_Catchup)/Time_Took_To_Catchup.
So how much replication capacity do you need in a healthy system? It depends on many things, including how fast you need to be able to recover from backups and how much your load varies. A lot of systems also have special requirements on the time it takes to warm up (there are different things you can do about that too). First, I would measure replication capacity on 5-minute intervals (or something similar), because it tends to vary a lot. Then I would suggest ensuring the loaded replication capacity is at least 3 during peak load and 5 during normal load. This applies to normal operational load: if you push a heavy ALTER TABLE through replication, it will surely bring your replication capacity down for its duration.
One more thing about these methods: methods 1, 2 and 3 work well only if replication capacity is above 1, so that the system is caught up. If it is less than 1, meaning the master writes more binary logs than the slave can process, they will show a number close to 1. Method 4, however, will work even if replication can never catch up: if 1 hour's worth of binary logs takes 2 hours to apply, your replication capacity is 0.5.
    
    Entry posted by peter]]></description>
			<content:encoded><![CDATA[<p>It is easy for MySQL replication to become a bottleneck even when the master server is not seriously loaded, and the more cores and hard drives the servers get, the larger the difference becomes, as long as replication remains a single-threaded process. At the same time, it is a lot easier to optimize your system while replication runs normally: if you need to add or remove indexes or make other schema changes and you can't take the system down, you will probably be looking at methods that involve replication. So here is the catch in many systems: we find out the system needs optimization when replication can't catch up, yet the optimization process we are going to use relies on replication being functional and able to catch up quickly.</p>
<p>So the question becomes: how can we estimate replication capacity, so that we can deal with replication load before the slave is unable to catch up?</p>
<p>Knowing replication capacity is not only needed when you plan to use replication to perform system optimization; it is needed in other cases too. For example, in a sharded environment you may need to schedule downtime or set an object read-only to move it to another shard. It is much nicer if this can be planned in advance rather than done on an emergency basis when the slave(s) are unable to catch up and the application is suffering because of stale data. This especially applies to Software as a Service providers, which often have very strict SLAs with their customers and can have a lot of data per customer, so a move can take a considerable amount of time.</p>
<p>So what is <strong>replication capacity</strong>? I call replication capacity the ability to replicate the master load. If replication is able to replicate 3 times the write load from the master without falling behind, I will call it a replication capacity of 3. In the context of applying binary logs (for example, point-in-time recovery from backup), a replication capacity of 1 means you can replay 1 hour's worth of binary logs within 1 hour. I will call "replication load" the inverse of replication capacity: this is basically the fraction of time the replication thread was busy replicating events rather than staying idle.</p>
<p>Note that you can speak about <strong>idle replication capacity</strong>, when the box does not do anything else, as well as <strong>loaded replication capacity</strong>, when the box serves its normal load. Both are important. You care about idle replication capacity when there is no load on the slave and you need it to catch up, or when restoring from backup; loaded replication capacity matters during normal operation.</p>
<p>So we have defined what replication capacity is. There are, however, no tools that can tell us directly what the replication capacity is for a given system. It also tends to float depending on the load, similar to the loadavg metric. Here are some of the ways to measure it:</p>
<p><strong>1) Use "UserStats" functionality from Google patches</strong>, which is now available in Percona Server and MariaDB. This is probably the easiest and most accurate approach, but it does not work in Oracle MySQL Server. Set <strong>userstat_running=1</strong> and run the following query:</p>
<div><span>SQL:</span>
<div>
<div>
<ol>
<li>
<div>mysql&gt; <span>SELECT</span> * <span>FROM</span> information_schema.user_statistics <span>WHERE</span> user=<span>"#mysql_system#"</span> \G</div>
</li>
<li>
<div>*************************** <span>1</span>. row ***************************</div>
</li>
<li>
<div>USER: <span>#mysql_system#</span></div>
</li>
<li>
<div>TOTAL_CONNECTIONS: <span>1</span></div>
</li>
<li>
<div>CONCURRENT_CONNECTIONS: <span>0</span></div>
</li>
<li>
<div>CONNECTED_TIME: <span>446</span></div>
</li>
<li>
<div>BUSY_TIME: <span>74</span></div>
</li>
<li>
<div>CPU_TIME: <span>0</span></div>
</li>
<li>
<div>BYTES_RECEIVED: <span>0</span></div>
</li>
<li>
<div>BYTES_SENT: <span>63</span></div>
</li>
<li>
<div>BINLOG_BYTES_WRITTEN: <span>0</span></div>
</li>
<li>
<div>ROWS_FETCHED: <span>0</span></div>
</li>
<li>
<div>ROWS_UPDATED: <span>127576</span></div>
</li>
<li>
<div>TABLE_ROWS_READ: <span>4085689</span></div>
</li>
<li>
<div>SELECT_COMMANDS: <span>0</span></div>
</li>
<li>
<div>UPDATE_COMMANDS: <span>119127</span></div>
</li>
<li>
<div>OTHER_COMMANDS: <span>89557</span></div>
</li>
<li>
<div>COMMIT_TRANSACTIONS: <span>90259</span></div>
</li>
<li>
<div>ROLLBACK_TRANSACTIONS: <span>0</span></div>
</li>
<li>
<div>DENIED_CONNECTIONS: <span>1</span></div>
</li>
<li>
<div>LOST_CONNECTIONS: <span>0</span></div>
</li>
<li>
<div>ACCESS_DENIED: <span>0</span></div>
</li>
<li>
<div>EMPTY_QUERIES: <span>0</span></div>
</li>
<li>
<div><span>1</span> row <span>in</span> <span>set</span> <span>(</span><span>0</span>.<span>00</span> sec<span>)</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>In this case CONNECTED_TIME is 446 seconds, out of which the replication thread was busy (BUSY_TIME) for 74 seconds, which means replication capacity is 446/74 = 6.03, or about 6.<br />
You normally would not want to measure it from the start but rather take the difference in these counters every 5 minutes or at another interval of your choice.</p>
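<p>The interval-based measurement can be sketched in a few lines of Python. This is only an illustration: the function name and the idea of passing each sample as a dictionary are assumptions, and fetching the counters from information_schema.user_statistics is left out.</p>

```python
def replication_capacity(earlier, later):
    """Capacity over the interval between two user_statistics samples.

    Each sample is a dict holding the CONNECTED_TIME and BUSY_TIME
    counters (in seconds) read for the replication user.
    """
    connected = later["CONNECTED_TIME"] - earlier["CONNECTED_TIME"]
    busy = later["BUSY_TIME"] - earlier["BUSY_TIME"]
    if busy == 0:
        # The thread was idle for the whole interval; capacity is unbounded.
        return float("inf")
    return connected / busy


# Using the counters from the output above, measured from thread start.
start = {"CONNECTED_TIME": 0, "BUSY_TIME": 0}
sample = {"CONNECTED_TIME": 446, "BUSY_TIME": 74}
print(round(replication_capacity(start, sample), 2))  # 6.03
```

<p>In practice you would pass two samples taken about 5 minutes apart rather than a zeroed starting point.</p>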
<p><strong>2) Use the full slow query log and mk-query-digest.</strong> This method is great for one-time execution, especially as it also gives you the list of queries which load replication the most. It works, however, only with statement-level replication. You need to set <strong>long_query_time=0</strong> and <strong>log_slave_slow_statements=1</strong> for this method to work.<br />
Get the log file, which will include all queries the MySQL server ran with their times, and run mk-query-digest with a filter to check only queries from the replication thread:</p>
<p><strong>mk-query-digest slow-log --filter '($event-&gt;{user} || "") =~ m/SLAVE_THREAD/' &gt; /tmp/report-slave.txt</strong></p>
<p>In the report you will see something like this as a header:</p>
<div><span>SQL:</span>
<div>
<div>
<ol>
<li>
<div><span># 475s user time, 1.2s system time, 80.41M rss, 170.38M vsz</span></div>
</li>
<li>
<div><span># Current date: Mon Jul 19 15:12:24 2010</span></div>
</li>
<li>
<div><span># Files: slow-log</span></div>
</li>
<li>
<div><span># Overall: 1.22M total, 1.27k unique, 558.56 QPS, 0.37x concurrency ______</span></div>
</li>
<li>
<div><span># total min max avg 95% stddev median</span></div>
</li>
<li>
<div><span># Exec time 819s 1us 92s 669us 260us 120ms 93us</span></div>
</li>
<li>
<div><span># Lock time 28s 0 166ms 23us 49us 192us 25us</span></div>
</li>
<li>
<div><span># Rows sent 4.27k 0 325 0.00 0 1.04 0</span></div>
</li>
<li>
<div><span># Rows exam 30.88M 0 1.28M 26.48 0 3.07k 0</span></div>
</li>
<li>
<div><span># Time range 2010-07-19 14:35:53 to 2010-07-19 15:12:22</span></div>
</li>
<li>
<div><span># bytes 350.99M 5 1022.34k 301.01 719.66 5.75k 124.25</span></div>
</li>
<li>
<div><span># Bytes sen 1.94M 0 9.42k 1.67 0 110.38 0</span></div>
</li>
<li>
<div><span># Killed 0 0 0 0 0 0 0</span></div>
</li>
<li>
<div><span># Last errn 34.11M 0 1.55k 29.26 0 185.83 0</span></div>
</li>
<li>
<div><span># Merge pas 0 0 0 0 0 0 0</span></div>
</li>
<li>
<div><span># Rows affe 875.19k 0 17.55k 0.73 0.99 25.61 0.99</span></div>
</li>
<li>
<div><span># Rows read 2.20M 0 14.83k 1.88 1.96 24.68 1.96</span></div>
</li>
<li>
<div><span># Tmp disk 4.15k 0 1 0.00 0 0.06 0</span></div>
</li>
<li>
<div><span># Tmp table 14.19k 0 2 0.01 0 0.14 0</span></div>
</li>
<li>
<div><span># Tmp table 8.30G 0 2.01M 7.12k 0 117.75k 0</span></div>
</li>
<li>
<div><span># 0% (5k) Filesort</span></div>
</li>
<li>
<div><span># 0% (5k) Full_join</span></div>
</li>
<li>
<div><span># 0% (7k) Full_scan</span></div>
</li>
<li>
<div><span># 0% (10k) Tmp_table</span></div>
</li>
<li>
<div><span># 0% (4k) Tmp_table_on_disk </span></div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>There are a lot of interesting things you can find in this header, but in relation to replication capacity you can get the replication load, which is the same as the "concurrency" figure (0.37x). The concurrency as reported by mk-query-digest is the sum of query execution time divided by the time range the log file covers. Since we know there is only one replication thread, it is the same as replication load. This gives us a replication capacity of 1/0.37 = 2.70.</p>
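<p>To make the arithmetic concrete, the concurrency figure can be recomputed from the "Exec time" total and the "Time range" line of the report header. The helper below is a sketch with a made-up name; it takes those values as arguments and does not parse the report itself.</p>

```python
from datetime import datetime


def load_and_capacity(exec_seconds, range_start, range_end):
    """Return (replication load, replication capacity) for a log window."""
    fmt = "%Y-%m-%d %H:%M:%S"
    covered = (
        datetime.strptime(range_end, fmt) - datetime.strptime(range_start, fmt)
    ).total_seconds()
    load = exec_seconds / covered  # busy time divided by wall-clock time
    return load, 1 / load


load, capacity = load_and_capacity(
    819, "2010-07-19 14:35:53", "2010-07-19 15:12:22"
)
print(round(load, 2), round(capacity, 2))  # 0.37 2.67
```

<p>The result is 2.67 rather than 2.70 because the 0.37x in the report is already rounded.</p>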
<p>This method should work with the original MySQL Server in theory, though I have not tested it. In some versions log_slave_slow_statements was unreliable, and you may also need to adjust the regular expression to match the user name the replication thread uses.</p>
<p><strong>3) Processlist Polling.</strong> This method is simple: the slave thread has a different status in SHOW PROCESSLIST depending on whether it is processing a query or simply waiting. By polling the processlist frequently (for example 10 times a second) we can compute the approximate percentage of time the thread was busy vs idle. Of course, running SHOW PROCESSLIST very aggressively can add overhead, especially on a busy system with a lot of connections.</p>
<div><span>SQL:</span>
<div>
<div>
<ol>
<li>
<div>mysql&gt; <span>SHOW</span> processlist;</div>
</li>
<li>
<div>+<span>--------+-------------+-----------+------+---------+------+-----------------------------------------------------------------------+------------------+</span></div>
</li>
<li>
<div>| Id | User | Host | db | Command | Time | State | Info |</div>
</li>
<li>
<div>+<span>--------+-------------+-----------+------+---------+------+-----------------------------------------------------------------------+------------------+</span></div>
</li>
<li>
<div>| <span>801812</span> | system user | | <span>NULL</span> | Connect | <span>2665</span> | Waiting <span>for</span> master <span>to</span> send event | <span>NULL</span> |</div>
</li>
<li>
<div>| <span>801813</span> | system user | | <span>NULL</span> | Connect | <span>0</span> | Has <span>read</span> <span>all</span> relay log; waiting <span>for</span> the slave I/O thread <span>to</span> <span>update</span> it | <span>NULL</span> |</div>
</li>
<li>
<div>| <span>802354</span> | root | localhost | <span>NULL</span> | Query | <span>0</span> | <span>NULL</span> | <span>SHOW</span> processlist |</div>
</li>
<li>
<div>+<span>--------+-------------+-----------+------+---------+------+-----------------------------------------------------------------------+------------------+</span></div>
</li>
<li>
<div><span>3</span> rows <span>in</span> <span>set</span> <span>(</span><span>0</span>.<span>00</span> sec<span>)</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
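<p>Once the State column of the slave SQL thread has been sampled, turning the samples into a load figure is straightforward. The sketch below assumes the samples are already collected as strings, and it assumes that any state containing "Has read all relay log" means the thread is idle; the exact wording can differ between server versions, and the sample values are made up.</p>

```python
IDLE_MARKER = "has read all relay log"  # assumed idle-state wording


def busy_fraction(states):
    """Fraction of samples in which the SQL thread was not idle."""
    if not states:
        raise ValueError("need at least one sample")
    busy = sum(1 for state in states if IDLE_MARKER not in state.lower())
    return busy / len(states)


idle = "Has read all relay log; waiting for the slave I/O thread to update it"
samples = [idle, "Updating", idle, idle]  # hypothetical samples
load = busy_fraction(samples)
print(load)      # 0.25
print(1 / load)  # 4.0
```

<p>With only four samples the estimate is very rough; the more samples per interval, the closer it gets to the real replication load.</p>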
<p><strong>4) Slave Catchup/Binlog Application method.</strong> We can take a spare server with backups restored on it and apply binary logs to it. If 1 hour's worth of binary logs applies in 10 minutes, we have a replication capacity of 6. The challenge, of course, is having a spare server around, and it is quite labor intensive. At the same time it can be a good measurement to take during backup recovery trials, when you're doing this activity anyway. This way you can also measure "cold" vs "hot" replication capacity, as well as how long replication warmup takes. It is very typical for servers with a cold cache to perform a lot slower than when they are warmed up. Measuring the time for each binary log separately should give you these numbers.</p>
<p>A less intrusive process, which can be done in production (especially if you have a slave used for backups, reporting, etc.), is to stop replication for some time and then see how long it takes to catch up. If you paused replication for 10 minutes and it took 5 minutes to catch up, your replication capacity is 3 (not 2), because you not only had to process the events for the outstanding 10 minutes but also for the 5 minutes it took to catch up. The formula is (Time_Replication_Paused+Time_Took_To_Catchup)/Time_Took_To_Catchup.</p>
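<p>The catch-up formula above can be written as a small helper. The function name is made up for illustration, and both arguments just need to be in the same unit.</p>

```python
def catchup_capacity(paused, catchup):
    """Replication capacity from a pause-and-catch-up experiment.

    While catching up, the slave replays the events written during the
    pause plus those written during the catch-up itself.
    """
    if not catchup > 0:
        raise ValueError("catch-up time must be positive")
    return (paused + catchup) / catchup


print(catchup_capacity(10, 5))  # 3.0
```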
<p>So how much replication capacity do you need in a healthy system? It depends on many things, including how fast you need to be able to recover from backups and how much your load varies. A lot of systems also have special requirements on the time it takes to warm up (there are different things you can do about that too). First, I would measure replication capacity on 5-minute intervals (or something similar), because it tends to vary a lot. Then I would suggest ensuring the loaded replication capacity is at least 3 during peak load and 5 during normal load. This applies to normal operational load: if you push a heavy ALTER TABLE through replication, it will surely bring your replication capacity down for its duration.</p>
<p>One more thing about these methods: methods 1, 2 and 3 work well only if replication capacity is above 1, so that the system is caught up. If it is less than 1, meaning the master writes more binary logs than the slave can process, they will show a number close to 1. Method 4, however, will work even if replication can never catch up: if 1 hour's worth of binary logs takes 2 hours to apply, your replication capacity is 0.5.</p>
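<p>For the binary log application variant of method 4 the ratio is simply the time span the logs cover divided by the time it takes to apply them, which is why it still works below 1. A minimal sketch, with a made-up function name and both arguments in minutes:</p>

```python
def binlog_apply_capacity(covered_minutes, apply_minutes):
    """Capacity = time span the binary logs cover / time needed to apply them."""
    return covered_minutes / apply_minutes


print(binlog_apply_capacity(60, 10))   # 6.0
print(binlog_apply_capacity(60, 120))  # 0.5
```

<p>Running it once per binary log file gives the separate "cold" and "hot" figures mentioned above.</p>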
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by peter |
      <a href="http://www.mysqlperformanceblog.com/2010/07/20/estimating-replication-capacity/#comments">No comment</a></p>
]]></content:encoded>
			<wfw:commentRss>http://planetmysql.ru/2010/07/21/estimating-replication-capacity/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Scaling: Consider both Size and Load</title>
		<link>http://www.mysqlperformanceblog.com/2010/07/13/scaling-consider-both-size-and-load/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=scaling-consider-both-size-and-load</link>
		<comments>http://www.mysqlperformanceblog.com/2010/07/13/scaling-consider-both-size-and-load/#comments</comments>
		<pubDate>Wed, 14 Jul 2010 00:53:38 +0000</pubDate>
		<dc:creator>MySQL Performance Blog</dc:creator>
				<category><![CDATA[mysql]]></category>
		<category><![CDATA[production]]></category>

		<guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=3344</guid>
		<description><![CDATA[So let&#8217;s imagine you have a server handling 100,000 user accounts. You can see that CPU, IO and network usage is below 10% of capacity &#8211; does it mean you can count on the server being able to handle 1,000,000 accounts? Not really, and there are a few reasons why. I&#8217;ll name the most important of them:
Contention &#8211; This is probably the most obvious one. MySQL (and systems in general) does not scale perfectly with the number of CPUs and the number of concurrent requests. Reduced efficiency of the CPU cache, mutex contention and database lock contention all belong here. Some of these are preventable and can be reduced by code changes (there have been a lot of advances in MySQL scalability from improved locking code design); others, such as row-level locks, would require application changes to allow more concurrency. The scalability numbers depend a lot on the system scale, software and workload.
Data size impact &#8211; There are different types of applications out there. For some (a minority?) the load varies independently or almost independently of data size. Think about the Google search engine &#8211; the data size on which a search is performed is constant, no matter whether you&#8217;re serving 10 queries a day or a billion. True, you would probably need to hold many copies of the data to support high load, but this is scaling through copying, not growth in the amount of data you have in the system. Wikipedia is a similar case &#8211; the data size does not depend on the number of readers, though writers contribute to the data size by creating new articles and increasing the number of versions in the system. Applications such as Facebook, Flickr or Twitter have a very clear correlation between traffic and users. Each registered user will on average have N MB of data stored in the database, and the traffic the system gets is somewhat proportional (though often not linear) to the number of users.
For systems of the first type the data size grows independently of traffic, so it is fine to measure system capacity in transactions per second. If the system can handle twice the number of transactions per second, it may be able to handle double the load. For systems of the second type you had better use Transactions/Second/User or Transactions/Second/MB (which are similar measures, as users on average have a certain amount of data each). Doubling traffic for a system of this type means handling twice the number of transactions on twice the amount of data.
Increasing the amount of data has very serious implications for system performance. Some queries are affected relatively little (having LOG(N) scalability); others may have linear or even quadratic complexity, which means increasing data size puts a very serious strain on the system. What is also very important and often forgotten is caching. Having twice the amount of data means having half the cache relative to it &#8211; if you previously had 20% of the data fitting in memory, now it is only 10%. The impact of cache on performance is very application dependent as well and may vary from insignificant to dramatic.
You&#8217;re in the greatest danger if a very high portion of your database (or working set) fits in memory, giving you a CPU-bound workload. As your data grows you may find the load becoming IO bound and things becoming 10x slower (or more), sometimes with a very modest size increase. I&#8217;ve seen things slow down about 10x from less than a 50% increase in data size.
I often see the data size impact omitted in &#8220;consolidation&#8221; tests &#8211; you get a new server, see it can handle 5x the load of the old one, and conclude you can put 5 &#8220;shards&#8221; on it. 5 shards surely come with 5x more data, which you need to carefully take into account.
Design Limits &#8211; This is the brother of contention, but I decided to list it separately. There are more things than contention which can limit perfect scalability. Replication is a perfect example in the MySQL world. The slave executes the replication stream in a single thread, which means replication can&#8217;t scale to a large volume of writes. The lack of parallel query execution is a similar issue &#8211; you may have a lot of resources in terms of CPUs and disks, but they can&#8217;t help reduce the response time of a single query.
Response Time &#8211; Do not forget that you not only need to be able to handle the number of users in terms of capacity; you also need response time to be within a certain range in the majority of cases. Some may look at 99th percentile response time, some at 95th, but nevertheless you want users to get responses fast. This means you can&#8217;t plan on loading the system 100%. There is a nice paper by Cary Millsap explaining this in more detail. Depending on the system and workload you may want to keep your system loaded no more than 80% at peak times, though applications which need to accommodate larger traffic variance need a lot more spare room.
So in the end the math to scale your system may not be as straightforward as you think &#8211; you need to take a number of things into account, and I&#8217;d always suggest confirming your modeling with benchmarks/performance evaluation if you have a chance.
    
    Entry posted by peter]]></description>
			<content:encoded><![CDATA[<p>So lets imagine you have the server handling 100.000 user accounts.  You can see the CPU,IO and Network usage is below 10% of capacity &#8211; does it mean you can count on server being able to<br />
handle 1.000.000 of accounts ?  Not really, and there are few reasons why,  I&#8217;ll name most important of them:</p>
<p><strong>Contention </strong>  &#8211; This is probably the most obvious one.  MySQL (and systems in general) do not scale perfectly with numbers of CPUs and number of concurrent requests.  Reduced efficiency of CPU cache, Mutex contention and database lock contention all come here.   Some of them are preventable and can be reduced by code changes, such as there have been a lot of advanced in scalability of MySQL by improving locking code design, others, such as row level locks would require application changes to allow more concurrent process.   The scalability numbers depend a lot on the system scale, software and workload.</p>
<p><strong>Data size impact</strong>  There are different types of applications out there. For some (a minority?) the load varies independently, or almost independently, of data size.  Think about the Google search engine &#8211; the size of the data on which a search is performed is constant, no matter if you&#8217;re serving 10 queries a day or a billion.  True, you would probably need to hold many copies of the data to support high load, but this is scaling through copying, not growth in the amount of data you have in the system.  Wikipedia is a similar case &#8211; the data size does not depend on the number of readers, though writers contribute to the data size by creating new articles and increasing the number of versions in the system.  Applications such as Facebook, Flickr or Twitter have a very clear correlation between traffic and users.
Each registered user will on average have N MB of data stored in the database, and the traffic the system gets is somewhat proportional (though often not linear) to the number of users.  </p>
<p>For systems of the first type the data size grows independently of traffic, so it is fine to measure system capacity in transactions per second.   If the system can handle twice the number of transactions per second, it may be able to handle double the load.   For systems of the second type you had better use Transactions/Second/User or Transactions/Second/MB (these are similar measures, as users on average have a certain amount of data each).   Doubling traffic for a system of this type means handling twice the number of transactions on twice the amount of data. </p>
<p>An increasing amount of data has very serious implications for system performance. Some queries are affected relatively little (having LOG(N) scalability), others may have linear or even quadratic complexity, which
means increasing data size puts a very serious strain on the system.   What is also very important and often forgotten is caching.  Having twice the amount of data means effectively having half the cache &#8211; if you previously had 20% of the data fitting in memory, now it is only 10%.   The impact of the cache on performance is very application dependent as well and may vary from insignificant to dramatic.  </p>
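<p>To put numbers on this, here is a minimal sketch of how the work per query and the cached share of the data change when the data doubles. The three cost models and the function names are illustrative assumptions; real queries rarely follow one of these curves exactly.</p>

```python
import math

# Illustrative cost models; real queries rarely follow one of these exactly.
COST_MODELS = {
    "log": lambda n: math.log2(n),
    "linear": lambda n: float(n),
    "quadratic": lambda n: float(n) ** 2,
}


def cost_growth(rows_before: int, rows_after: int, model: str) -> float:
    """Return how many times more work one query does after the data grows."""
    cost = COST_MODELS[model]
    return cost(rows_after) / cost(rows_before)


def cached_share(cache_size: float, data_size: float) -> float:
    """Fraction of the data that fits in the cache (at most 1.0)."""
    return min(1.0, cache_size / data_size)


if __name__ == "__main__":
    # Hypothetical table growing from 1 million to 2 million rows.
    for model in COST_MODELS:
        print(f"{model:>9}: {cost_growth(1_000_000, 2_000_000, model):.2f}x work per query")
    # Same cache, twice the data: the cached share halves.
    print(cached_share(20, 100), cached_share(20, 200))
```

<p>Doubling the rows barely changes a logarithmic lookup, doubles a linear scan and quadruples a quadratic operation, while the same cache now covers half as much of the data.</p>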
<p>You&#8217;re in the greatest danger if a very high portion of your database (or working set) fits in memory, giving you a CPU bound workload.  As your data grows you may find the load becoming IO bound and hence things becoming 10x slower (or more), sometimes with a very modest size increase.  I&#8217;ve seen things slow down about 10x from a less than 50% increase in data size.</p>
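<p>The cliff that appears when the working set stops fitting in memory can be sketched with a simple weighted average of memory and disk access times. The latencies below (0.01 ms for memory, 5 ms for disk) and the uniform access pattern are assumptions chosen for illustration; uniform access is pessimistic, since real workloads usually have a hot subset.</p>

```python
def uniform_hit_ratio(cache_size: float, data_size: float) -> float:
    """Cache hit ratio if every piece of data is equally likely to be accessed."""
    return min(1.0, cache_size / data_size)


def average_access_ms(hit_ratio: float, memory_ms: float, disk_ms: float) -> float:
    """Average access time as a weighted mix of memory hits and disk misses."""
    return hit_ratio * memory_ms + (1.0 - hit_ratio) * disk_ms


def slowdown(cache_size: float, old_data: float, new_data: float,
             memory_ms: float = 0.01, disk_ms: float = 5.0) -> float:
    """How many times slower the average access gets after the data grows."""
    before = average_access_ms(uniform_hit_ratio(cache_size, old_data), memory_ms, disk_ms)
    after = average_access_ms(uniform_hit_ratio(cache_size, new_data), memory_ms, disk_ms)
    return after / before


if __name__ == "__main__":
    # Hypothetical: 100 GB cache, data growing from 100 GB to 150 GB.
    for new_size in (100, 101, 110, 150):
        print(f"{new_size} GB of data -> {slowdown(100, 100, new_size):.1f}x slower")
```

<p>Because a disk access is hundreds of times slower than a memory access in this model, even a small share of misses dominates the average.</p>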
<p>I often see the data size impact omitted in &#8220;consolidation&#8221; tests &#8211; when you get a new server and see it can handle 5x the load of the old one, so you conclude you can put 5 &#8220;shards&#8221; on it.
5 shards surely come with 5x more data, which you need to carefully take into account. </p>
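<p>A quick sanity check for such a consolidation plan is to compare how much of the data fits in memory before and after. The sketch below uses hypothetical server sizes (32 GB and 64 GB of RAM, 64 GB of data per shard); none of these figures come from a real test.</p>

```python
def cache_coverage(ram_gb: float, shard_data_gb: float, shards: int) -> float:
    """Fraction of the total data on a server that fits in its memory."""
    total_data_gb = shard_data_gb * shards
    return min(1.0, ram_gb / total_data_gb)


if __name__ == "__main__":
    # Hypothetical old server: 32 GB of RAM, one 64 GB shard.
    before = cache_coverage(ram_gb=32, shard_data_gb=64, shards=1)
    # Hypothetical new server: 64 GB of RAM, five 64 GB shards.
    after = cache_coverage(ram_gb=64, shard_data_gb=64, shards=5)
    print(f"cached share before: {before:.0%}, after consolidation: {after:.0%}")
```

<p>If the benchmark that showed 5x throughput ran on a single shard&#8217;s data, it was measured with a much better cache coverage than the consolidated server will actually have.</p>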
<p><strong>Design Limits </strong>  This is the brother of contention but I decided to list it separately.  There are more things than contention which can prevent perfect scalability.  Replication is the perfect example in the MySQL world.  The slave executes the replication stream in a single thread, which means replication can&#8217;t scale for a large volume of writes.    The lack of parallel query execution is a similar issue &#8211; you may have a lot of resources in terms of CPUs and disks, but they can&#8217;t help reduce the response time of a single query.</p>
<p><strong>Response Time </strong> Remember that you not only need to handle a given number of users in terms of capacity; you also need response time to stay within a certain range in the majority of cases. Some look at the 99th percentile response time, some at the 95th, but nevertheless you want users to get a response fast.  This means you can&#8217;t plan on loading the system to 100%.  There is a nice paper by Cary Millsap <a href="http://www.method-r.com/downloads/doc_download/44-thinking-clearly-about-performance">explaining</a> this in more detail.    Depending on the system and workload you may want to keep your system loaded no more than 80% at peak times, though applications which need to accommodate larger traffic variance need a lot more spare room.</p>
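<p>Why you cannot plan for 100% load can be sketched with the simplest textbook queueing model, a single-server M/M/1 queue, where mean response time is service time divided by (1 - utilization). This model and the 10 ms service time are assumptions for illustration; a real database server has many CPUs and disks, and the model gives mean response time, not the 95th or 99th percentile.</p>

```python
def mm1_response_time(service_time_ms: float, utilization: float) -> float:
    """Mean response time of an M/M/1 queue: service time / (1 - utilization)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in the range [0, 1)")
    return service_time_ms / (1.0 - utilization)


if __name__ == "__main__":
    # Hypothetical request that takes 10 ms on an idle system.
    for load in (0.10, 0.50, 0.80, 0.90, 0.95, 0.99):
        print(f"{load:.0%} load -> {mm1_response_time(10.0, load):7.1f} ms")
```

<p>In this model response time grows slowly at low load and then climbs steeply as utilization approaches 100%, which is the reason for leaving spare room at peak times.</p>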
<p>So in the end the math to scale your system may not be as straightforward as you think &#8211; you need to take a number of things into account, and I&#8217;d always suggest confirming your modeling with benchmarks/performance evaluation if you have a chance. </p>
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by peter |
      <a href="http://www.mysqlperformanceblog.com/2010/07/13/scaling-consider-both-size-and-load/#comments">No comment</a></p>
]]></content:encoded>
			<wfw:commentRss>http://planetmysql.ru/2010/07/14/scaling-consider-both-size-and-load/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

