<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>PlanetMysql.ru - information about the MySQL DBMS &#187; caching</title>
	<atom:link href="http://planetmysql.ru/category/caching/feed/" rel="self" type="application/rss+xml" />
	<link>http://planetmysql.ru</link>
	<description>A blog about the most popular DBMS, MySQL</description>
	<lastBuildDate>Thu, 24 May 2012 14:20:27 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
		<item>
		<title>Database Architectures &amp; Performance</title>
		<link>http://scaledb.blogspot.com/2010/07/database-architectures-performance.html?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=database-architectures-performance</link>
		<comments>http://scaledb.blogspot.com/2010/07/database-architectures-performance.html#comments</comments>
		<pubDate>Tue, 20 Jul 2010 19:12:00 +0000</pubDate>
		<dc:creator>Mike Hogan</dc:creator>
				<category><![CDATA[caching]]></category>
		<category><![CDATA[cloud database]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[shared-disk]]></category>
		<category><![CDATA[shared-nothing]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[For decades the debate between shared-disk and shared-nothing databases has raged. The shared-disk camp points to a laundry list of functional benefits such as improved data consistency, high availability, scalability and elimination of partitioning/replication/promotion. The shared-nothing camp shoots back with superior performance and reduced costs. Both sides have a point.
First, let’s look at the performance issue. RAM (average access time of 200 nanoseconds) is considerably faster than disk (average access time of 12,000,000 nanoseconds). Let me put this 200:12,000,000 ratio, a factor of 60,000, into perspective. A task that takes a single minute in RAM would take about 41 days on disk. So why do I bring this up?
Shared-Nothing: Since the shared-nothing database has sole ownership of its data (it doesn’t share the data with other nodes), it can operate in the machine’s local RAM, writing to disk only infrequently (flushing the data to disk). This makes shared-nothing databases very fast.
Shared-Disk: A shared-disk database cannot rely on the machine’s local RAM, because every write by one node must be instantly available to the other nodes, to ensure that they don’t use stale data and corrupt the database. So instead of relying on local RAM, all write transactions must be written to disk. This is where the 1 minute to 41 days ratio above comes into play and kills the performance of shared-disk databases.
Let’s look at some of the ways databases can use RAM instead of disk to improve performance:
Read Cache: Databases typically use RAM as a fast read cache. Upon reading data from the disk, the database stores it in the read cache so that subsequent use of that data is satisfied from RAM instead of the disk. For example, upon reading a person’s name from disk, that name is stored in the cache for fast access. The database wouldn’t need to read that name from disk again until that person’s name is changed (rare), or that RAM space is reused for a piece of data that is used more frequently. Read cache can significantly improve database performance.
BOTH shared-disk and shared-nothing databases can exploit read cache. The shared-disk database just needs a system to either invalidate or update the data in read cache when one of the nodes has made a change. This is pretty standard in shared-disk databases.
Background Writing: Writing data to the disk is by far the most time-consuming step in a write transaction. During the transaction, that portion of the data is locked, meaning it is unavailable for other functions. So, if you can move the writing of the data outside of the transaction (write the data in the background), you get faster transactions, which means less lock contention, which means higher throughput.
SHARED-NOTHING can exploit this performance enhancement, since each server owns the data in its RAM. However, shared-disk databases cannot do this because they need to share that updated data with the other database nodes in the cluster. Since the local node’s cache is not shared, in a shared-disk database the only option is to use the shared disk to share that data across the nodes.
Transactional Cache: The next step in using RAM instead of disk is to use it in a transactional manner. This means that the database can make multiple changes to data in RAM prior to writing the final results to disk. For example, if you have 100 widgets, you can store that inventory count in RAM, and then decrement it with each sale. If you sell 23 widgets, then instead of writing each transaction to disk, you update it in RAM. When you flush this data to disk, it results in a single disk write, writing the inventory number 77, instead of writing each of the 23 transactions individually to disk.
SHARED-NOTHING can perform transactions on data while it is in RAM. Once again, shared-disk databases cannot do this because you might have multiple nodes updating the inventory. Since they cannot look into each other’s local RAM, they must once again write each transaction to disk.
As you can see, shared-nothing databases have an inherent performance advantage. The next blog post will address how modern shared-disk databases address these performance challenges.]]></description>
			<content:encoded><![CDATA[For decades the debate between shared-disk and shared-nothing databases has raged. The shared-disk camp points to a laundry list of functional benefits such as improved data consistency, high availability, scalability and elimination of partitioning/replication/promotion. The shared-nothing camp shoots back with superior performance and reduced costs. Both sides have a point.<br /><br />
First, let’s look at the performance issue. RAM (average access time of 200 nanoseconds) is considerably faster than disk (average access time of 12,000,000 nanoseconds). Let me put this 200:12,000,000 ratio, a factor of 60,000, into perspective. A task that takes a single minute in RAM would take about 41 days on disk. So why do I bring this up?<br /><br />
Shared-Nothing: Since the shared-nothing database has sole ownership of its data (it doesn’t share the data with other nodes), it can operate in the machine’s local RAM, writing to disk only infrequently (flushing the data to disk). This makes shared-nothing databases very fast.<br /><br />
Shared-Disk: A shared-disk database cannot rely on the machine’s local RAM, because every write by one node must be instantly available to the other nodes, to ensure that they don’t use stale data and corrupt the database. So instead of relying on local RAM, all write transactions must be written to disk. This is where the 1 minute to 41 days ratio above comes into play and kills the performance of shared-disk databases.<br /><br />
Let’s look at some of the ways databases can use RAM instead of disk to improve performance:<br /><br />
Read Cache: Databases typically use RAM as a fast read cache. Upon reading data from the disk, the database stores it in the read cache so that subsequent use of that data is satisfied from RAM instead of the disk. For example, upon reading a person’s name from disk, that name is stored in the cache for fast access. The database wouldn’t need to read that name from disk again until that person’s name is changed (rare), or that RAM space is reused for a piece of data that is used more frequently. Read cache can significantly improve database performance.<br /><br />
BOTH shared-disk and shared-nothing databases can exploit read cache. The shared-disk database just needs a system to either invalidate or update the data in read cache when one of the nodes has made a change. This is pretty standard in shared-disk databases.<br /><br />
Background Writing: Writing data to the disk is by far the most time-consuming step in a write transaction. During the transaction, that portion of the data is locked, meaning it is unavailable for other functions. So, if you can move the writing of the data outside of the transaction (write the data in the background), you get faster transactions, which means less lock contention, which means higher throughput.<br /><br />
SHARED-NOTHING can exploit this performance enhancement, since each server owns the data in its RAM. However, shared-disk databases cannot do this because they need to share that updated data with the other database nodes in the cluster. Since the local node’s cache is not shared, in a shared-disk database the only option is to use the shared disk to share that data across the nodes.<br /><br />
Transactional Cache: The next step in using RAM instead of disk is to use it in a transactional manner. This means that the database can make multiple changes to data in RAM prior to writing the final results to disk. For example, if you have 100 widgets, you can store that inventory count in RAM, and then decrement it with each sale. If you sell 23 widgets, then instead of writing each transaction to disk, you update it in RAM. When you flush this data to disk, it results in a single disk write, writing the inventory number 77, instead of writing each of the 23 transactions individually to disk.<br /><br />
SHARED-NOTHING can perform transactions on data while it is in RAM. Once again, shared-disk databases cannot do this because you might have multiple nodes updating the inventory. Since they cannot look into each other’s local RAM, they must once again write each transaction to disk.<br /><br />
As you can see, shared-nothing databases have an inherent performance advantage. The next blog post will address how modern shared-disk databases address these performance challenges.]]></content:encoded>
			<wfw:commentRss>http://planetmysql.ru/2010/07/20/database-architectures-performance/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Advanced Squid Caching in Scribd: Cache Invalidation Techniques</title>
		<link>http://feedproxy.google.com/~r/Homo-Adminus/~3/4ywVA01ppFY/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=advanced-squid-caching-in-scribd-cache-invalidation-techniques</link>
		<comments>http://feedproxy.google.com/~r/Homo-Adminus/~3/4ywVA01ppFY/#comments</comments>
		<pubDate>Sat, 29 May 2010 17:02:17 +0000</pubDate>
		<dc:creator>Alexey Kovyrin</dc:creator>
				<category><![CDATA[Admin-tips]]></category>
		<category><![CDATA[caching]]></category>
		<category><![CDATA[Development]]></category>
		<category><![CDATA[HTCP]]></category>
		<category><![CDATA[invalidation]]></category>
		<category><![CDATA[My Projects]]></category>
		<category><![CDATA[networks]]></category>
		<category><![CDATA[Nginx]]></category>
		<category><![CDATA[plugin]]></category>
		<category><![CDATA[squid]]></category>

		<guid isPermaLink="false">http://kovyrin.net/?p=322</guid>
		<description><![CDATA[Having a reverse-proxy web cache as one of the major infrastructure elements brings many benefits for large web applications: it reduces the load on your application servers, reduces average response times on your site, etc. But there is one problem every developer experiences when working with such a cache &#8211; cached content invalidation.
It is a complex problem that usually consists of two smaller ones: individual cache element invalidation (you need to keep an eye on your data changes and invalidate cached pages when related data changes) and full cache purges (sometimes your site layout or page templates change and you need to purge all the cached pages to make sure users get the new visual elements after layout changes). In this post I&#8217;d like to look at a few techniques we use at Scribd to solve cache invalidation problems.

So, the first problem &#8211; ongoing cache invalidation when content changes. This is actually a pretty simple task in squid: you just use the HTCP protocol and send CLR requests to your caching farm (we didn&#8217;t find any HTCP protocol implementations, so we&#8217;ve implemented our own simple client that supports just one command).
Since we use haproxy to balance our traffic in the cluster, it is hard to predict where we should send a purge request. So we fan those out to all cache servers.
To make sure cache purging won&#8217;t slow the site down, especially considering we need to do more than just a simple cache purge (submit documents to search indexes, etc.), we just spool a &#8220;document changed&#8221; request to a queue and then have a set of asynchronous processes that do all the work in the background.
Next, The Hard Problem &#8211; handling full cache purges without killing our backend servers with 5x-10x traffic (our normal hit ratio is ~90-95%).
We&#8217;ve spent a lot of time thinking about this problem and the first idea we came up with was to have a loop process somewhere that would iterate over all the documents we have cached and purge them one by one&#8230; but that does not seem to be a practical solution when you have tens of millions of documents (and a few page versions per document), and obviously the solution would not scale with a constantly growing document corpus.
So we kept brainstorming and finally got one idea that works just perfectly for us: what if we could take our traffic and define a function f(t) that would return the percentage of the traffic that should be purged at any moment in time? So we did it &#8211; we&#8217;ve implemented an nginx module that versions our cache by assigning every cached page a revision (using custom HTTP headers + Vary-caching) and is able to slowly migrate the cache from one revision to another over a pre-defined period of time.
With that module we are able to do so-called &#8220;slow&#8221; cache purges that can take anywhere from a few minutes (which still helps to reduce the load spike generated by the hottest content) up to many hours (this is what we normally use) or days (we have never used this option, but it is definitely possible).
Here is an example 100% cache purge over an 8-hour interval:
[Figures: daily hit ratio graph; weekly hit ratio graph]
As you can see, during those slow purges our cached pages are slowly updated without putting too much pressure on the backend. The cache hit ratio slowly degrades and then slowly gets back to its normal level, but with our normal (6-8 hour) purges the hit ratio never gets lower than 65-70%, which saves us huge amounts of money because we no longer need 90% spare capacity just for cache purge load surges (we used to have lots of spare application cluster capacity before introducing this approach).
]]></description>
			<content:encoded><![CDATA[<p>Having a <a href="http://kovyrin.net/2008/10/25/advanced-squid-caching-for-rails-applications-preface/">reverse-proxy</a> web cache as one of the major infrastructure elements brings many benefits for large web applications: it reduces the load on your application servers, reduces average response times on your site, etc. But there is one problem every developer experiences when working with such a cache &#8211; <em>cached content invalidation</em>.</p>
<p>It is a complex problem that usually consists of two smaller ones: <em>individual cache element invalidation</em> (you need to keep an eye on your data changes and invalidate cached pages when related data changes) and <em>full cache purges</em> (sometimes your site layout or page templates change and you need to purge all the cached pages to make sure users get the new visual elements after layout changes). In this post I&#8217;d like to look at a few techniques we use at <a href="http://www.scribd.com/">Scribd</a> to solve cache invalidation problems.</p>
<hr /><p>So, the <strong>first problem &#8211; ongoing cache invalidation when content changes</strong>. This is actually a pretty simple task in squid: you just use the <a href="http://www.htcp.org/">HTCP protocol</a> and send CLR requests to your caching farm (we didn&#8217;t find any HTCP protocol implementations, so we&#8217;ve implemented <a href="http://github.com/kovyrin/htcp-ruby">our own simple client</a> that supports just one command).</p>
<p>Since we use <a href="http://haproxy.1wt.eu/">haproxy</a> to balance our traffic in the cluster, it is hard to predict where we should send a purge request. So we fan those out to all cache servers.</p>
<p>To make sure cache purging won&#8217;t slow the site down, especially considering we need to do more than just a simple cache purge (submit documents to search indexes, etc.), we just spool a &#8220;document changed&#8221; request to a queue and then have a set of <a href="http://github.com/kovyrin/loops">asynchronous processes</a> that do all the work in the background.</p>
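<p>As a rough illustration of the spool-and-fan-out idea above, here is a minimal Python sketch. It is not Scribd&#8217;s code: the server names, <code>send_clr</code>, <code>purge_worker</code> and <code>run_demo</code> are hypothetical, and <code>send_clr</code> only records the call instead of sending a real HTCP CLR packet.</p>
<pre>
```python
import queue
import threading

# Hypothetical cache farm; real hosts would come from configuration.
CACHE_SERVERS = ["cache1.example.com", "cache2.example.com", "cache3.example.com"]


def send_clr(server, url, sent):
    """Stand-in for an HTCP CLR request; it only records the call."""
    sent.append((server, url))


def purge_worker(jobs, sent):
    """Drain the queue and fan every changed URL out to all cache servers."""
    while True:
        url = jobs.get()
        if url is None:  # sentinel: stop the worker
            jobs.task_done()
            break
        for server in CACHE_SERVERS:
            send_clr(server, url, sent)
        jobs.task_done()


def run_demo():
    """Enqueue two changed documents and let a background worker purge them."""
    jobs = queue.Queue()
    sent = []
    worker = threading.Thread(target=purge_worker, args=(jobs, sent))
    worker.start()
    # The request path only enqueues, so it never waits on the caches.
    jobs.put("http://www.example.com/doc/1")
    jobs.put("http://www.example.com/doc/2")
    jobs.put(None)
    worker.join()
    return sent
```
</pre>
<p>The request path only calls <code>jobs.put()</code>; the worker thread does the slow part. In a real deployment the in-process queue would presumably be a persistent queue and the worker a separate process.</p>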
<p>Next, <strong>The Hard Problem &#8211; handling full cache purges without killing our backend servers</strong> with 5x-10x traffic (our normal hit ratio is ~90-95%).</p>
<p>We&#8217;ve spent a lot of time thinking about this problem and the first idea we came up with was to have a loop process somewhere that would iterate over all the documents we have cached and purge them one by one&#8230; but that does not seem to be a practical solution when you have tens of millions of documents (and a few page versions per document), and obviously the solution would not scale with a constantly growing document corpus.</p>
<p>So we kept brainstorming and finally got one idea that works just perfectly for us: what if we could take our traffic and define a function <em>f(t)</em> that would return the percentage of the traffic that should be purged at any moment in time? So we did it &#8211; we&#8217;ve implemented an nginx module that versions our cache by assigning every cached page a revision (<a href="http://kovyrin.net/2009/07/21/advanced-squid-caching-scribd-logged-in-users-complex-urls/">using custom HTTP headers + Vary-caching</a>) and is able to slowly migrate the cache from one revision to another over a pre-defined period of time.</p>
<p>With that module we are able to do so-called &#8220;slow&#8221; cache purges that can take anywhere from a few minutes (which still helps to reduce the load spike generated by the hottest content) up to many hours (this is what we normally use) or days (we have never used this option, but it is definitely possible).</p>
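<p>To make the idea concrete, here is a small Python sketch of one way such an <em>f(t)</em> could drive revision selection. It is an illustration under assumptions, not the actual nginx module: the linear ramp, the MD5-based bucketing and all function names are made up for this example.</p>
<pre>
```python
import hashlib


def purge_fraction(now, start, duration):
    """f(t): share of traffic (0.0 to 1.0) that should already use the new revision.

    Assumes a linear ramp over the purge window; other shapes are possible.
    """
    if duration == 0 or now >= start + duration:
        return 1.0
    if start >= now:
        return 0.0
    return (now - start) / duration


def url_bucket(url):
    """Map a URL to a stable pseudo-random position in [0, 1)."""
    digest = hashlib.md5(url.encode("utf-8")).hexdigest()
    return int(digest[:8], 16) / 2**32


def revision_for(url, now, start, duration, old_rev, new_rev):
    """Pick the cache revision for a request; buckets below f(t) have migrated."""
    if purge_fraction(now, start, duration) > url_bucket(url):
        return new_rev
    return old_rev
```
</pre>
<p>Because <em>f(t)</em> only grows and each URL keeps the same bucket, a page that has moved to the new revision never flips back, so the backend sees the regeneration load spread over the whole purge window.</p>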
<p>Here is an example 100% cache purge over an 8 hour interval:</p>
<ol>
<li> Daily hit ratio graph:<br />
<a href="http://img.skitch.com/20100529-pkx64g6the9winqcnk6sigiyns.png" rel="shadowbox[post-322];player=img;"><img rel="shadowbox" src="http://img.skitch.com/20100529-pkx64g6the9winqcnk6sigiyns.preview.jpg" alt="day" /></a>
</li>
<li> Weekly hit ratio graph:<br />
<a href="http://img.skitch.com/20100529-nk2hyafgtbw1pc1nrkgbec8st3.png" rel="shadowbox[post-322];player=img;"><img rel="shadowbox" src="http://img.skitch.com/20100529-nk2hyafgtbw1pc1nrkgbec8st3.preview.jpg" alt="week" /></a>
</li>
</ol>
<p>As you can see, during those slow purges our cached pages are slowly updated without putting too much pressure on the backend. The cache hit ratio slowly degrades and then slowly gets back to its normal level, but with our normal (6-8 hour) purges the hit ratio never gets lower than 65-70%, which saves us huge amounts of money because we no longer need 90% spare capacity just for cache purge load surges (we used to have lots of spare application cluster capacity before introducing this approach).</p>

]]></content:encoded>
			<wfw:commentRss>http://planetmysql.ru/2010/05/29/advanced-squid-caching-in-scribd-cache-invalidation-techniques/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Introduction to memcached</title>
		<link>http://www.jurriaanpersyn.com/archives/2010/05/27/introduction-to-memcached/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=introduction-to-memcached</link>
		<comments>http://www.jurriaanpersyn.com/archives/2010/05/27/introduction-to-memcached/#comments</comments>
		<pubDate>Thu, 27 May 2010 21:52:32 +0000</pubDate>
		<dc:creator>Jurriaan Persyn</dc:creator>
				<category><![CDATA[caching]]></category>
		<category><![CDATA[ikdoeict]]></category>
		<category><![CDATA[invalidation]]></category>
		<category><![CDATA[kaho st. lieven]]></category>
		<category><![CDATA[memcached]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[tech]]></category>
		<category><![CDATA[Work]]></category>

		<guid isPermaLink="false">http://www.jurriaanpersyn.com/?p=354</guid>
		<description><![CDATA[These are the slides from a talk I gave earlier this week for students of the professional bachelor in ICT course at KaHo St. Lieven. I wanted to give a clear and simple introduction to the memcached service, as I think it&#8217;s an invaluable tool in today&#8217;s web development.
]]></description>
			<content:encoded><![CDATA[<p>These are the slides from a talk I gave earlier this week for students of the <a href="http://www.ikdoeict.be/en">professional bachelor in ICT course</a> at <a href="http://www.kahosl.be">KaHo St. Lieven</a>. I wanted to give a clear and simple introduction to the memcached service, as I think it&#8217;s an invaluable tool in today&#8217;s web development.</p>
]]></content:encoded>
			<wfw:commentRss>http://planetmysql.ru/2010/05/28/introduction-to-memcached/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Using the Query Cache effectively</title>
		<link>http://ronaldbradford.com/blog/using-the-mysql-query-cache-effectively-2009-09-28/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=using-the-query-cache-effectively</link>
		<comments>http://ronaldbradford.com/blog/using-the-mysql-query-cache-effectively-2009-09-28/#comments</comments>
		<pubDate>Mon, 28 Sep 2009 21:10:53 +0000</pubDate>
		<dc:creator>Ronald Bradford</dc:creator>
				<category><![CDATA[application performance]]></category>
		<category><![CDATA[caching]]></category>
		<category><![CDATA[Databases]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[MySQL query cache]]></category>
		<category><![CDATA[Professional]]></category>
		<category><![CDATA[query-cache]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://ronaldbradford.com/blog/?p=2137</guid>
		<description><![CDATA[Maximize your strengths, minimize your weaknesses.
You can apply this approach to many things in life; I apply it to describing and using MySQL the product, and its components. The Query Cache, like many features in MySQL, and indeed features in many different RDBMS products (don&#8217;t get me started on Oracle *features*), has relative benefits. In one context it can be seen as ineffective, or even detrimental to your performance; however, its coarse-grained nature makes it both trivial to disable dynamically (SET GLOBAL query_cache_size=0;) and easy to get basic statistics on current performance (SHOW GLOBAL STATUS LIKE 'Qcache%';) to determine effectiveness and act appropriately.
The Query Cache is coarse-grained, that is, it is rather simple/dumb in nature. When you understand the path of execution of a query within the MySQL kernel, you learn a few key things.

When enabled, by default the Query Cache will cache all SELECT statements within certain defined system parameter conditions. There are of course exceptions, such as non-deterministic functions, prepared statements in earlier versions, etc.
Any DML/DDL statement for a table that has a query cached flushes all query cache results that pertain to this table.
You can use SQL_CACHE and SQL_NO_CACHE as hints; however, you can&#8217;t configure caching on a table-by-table or query basis.
The query cache works on an exact match of the query (including spaces and case) and other settings such as the client character set and protocol version. If a match is found, data is returned in preformed network packets.

The Query Cache was not good when set to large values (e.g. &#62; 128M) due to inefficient cache invalidation. I&#8217;m not certain of the original source of this condition; however, Bug #21074, fixed in 5.0.50 and 5.1.21, is likely the reason.
My advice is to disable the Query Cache by default, especially for testing. As a final stress test you can enable it to determine if there is a benefit.
I wish MySQL would spend time improving key features. For example, the Query Cache lacks sufficient instrumentation, such as which queries are in the cache and which tables are in the cache, and also lacks sufficient exposed system parameters for fine tuning. I believe there is a patch to show the queries, for example, but I was unable to find it via a Google search.
It is a powerful and easy technology if you use it well. It involves architecting your solution appropriately, and knowing when the Query Cache is ineffective.
I have a number of circumstances where the query cache is extremely effective, or could be with simple modifications. A recommendation to a recent client with a 1+TB database was to split historical and current data into two different instances. The data was already in separate tables and the application already performed dual queries, so the change was as simple as a new connection pool. The benefits were huge: not only would the backup process be more efficient (some 500GB of data now only had to be backed up once, as it was 100% static) and the scaling and recovery process improve, but the second MySQL instance could enable the query cache and the application would get a huge performance improvement with ZERO code changes for caching. That&#8217;s a quick and easy win.
On a side note, I wanted to title this &#8220;The MySQL Query Cache is not useless&#8221;. When I was a MySQL employee I got reprimanded (twice) for blogging anything about MySQL that wasn&#8217;t positive. This blog post is in direct response to Konstantin, a Sun/MySQL employee who actually works on the MySQL server code, who wrote Query cache = useless?. In my view it is not useless.]]></description>
			<content:encoded><![CDATA[<p><b>Maximize your strengths, minimize your weaknesses.</b></p>
<p>You can apply this approach to many things in life; I apply it to describing and using MySQL the product, and its components. The Query Cache, like many features in MySQL, and indeed features in many different RDBMS products (don&#8217;t get me started on Oracle *features*), has relative benefits. In one context it can be seen as ineffective, or even detrimental to your performance; however, its coarse-grained nature makes it both trivial to disable dynamically (SET GLOBAL query_cache_size=0;) and easy to get basic statistics on current performance (SHOW GLOBAL STATUS LIKE 'Qcache%';) to determine effectiveness and act appropriately.</p>
<p>The Query Cache is coarse-grained, that is, it is rather simple/dumb in nature. When you understand the path of execution of a query within the MySQL kernel, you learn a few key things.</p>
<ul>
<li>When enabled, by default the Query Cache will cache all SELECT statements within certain defined system parameter conditions. There are of course exceptions, such as non-deterministic functions, prepared statements in earlier versions, etc.</li>
<li>Any DML/DDL statement for a table that has a query cached flushes all query cache results that pertain to this table.</li>
<li>You can use SQL_CACHE and SQL_NO_CACHE as hints; however, you can&#8217;t configure caching on a table-by-table or query basis.</li>
<li>The query cache works on an exact match of the query (including spaces and case) and other settings such as the client character set and protocol version. If a match is found, data is returned in preformed network packets.</li>
</ul>
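<p>The exact-match and per-table invalidation rules above can be pictured with a toy model. This is a simplified Python sketch, not how the MySQL server implements it: <code>ToyQueryCache</code> and its methods are hypothetical, and the real cache key also includes settings such as the character set and protocol version.</p>
<pre>
```python
class ToyQueryCache:
    """Toy model of an exact-match result cache with per-table invalidation."""

    def __init__(self):
        self.results = {}    # exact query text -> cached rows
        self.by_table = {}   # table name -> set of cached query texts

    def get(self, sql):
        # No normalisation: spacing and letter case must match exactly.
        return self.results.get(sql)

    def put(self, sql, tables, rows):
        self.results[sql] = rows
        for table in tables:
            self.by_table.setdefault(table, set()).add(sql)

    def invalidate(self, table):
        # Any write to a table drops every cached result that reads from it.
        for sql in self.by_table.pop(table, set()):
            self.results.pop(sql, None)


cache = ToyQueryCache()
cache.put("SELECT name FROM users WHERE id = 1", ["users"], [("alice",)])
hit = cache.get("SELECT name FROM users WHERE id = 1")
miss = cache.get("select name from users where id = 1")  # differs only in case
cache.invalidate("users")  # simulates an UPDATE on the users table
after_write = cache.get("SELECT name FROM users WHERE id = 1")
```
</pre>
<p>Here <code>hit</code> holds the cached rows, <code>miss</code> is <code>None</code> because only the letter case differs, and <code>after_write</code> is <code>None</code> because the simulated write to <code>users</code> dropped the entry.</p>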
<p>The Query Cache was not good when set to large values (e.g. > 128M) due to inefficient cache invalidation. I&#8217;m not certain of the original source of this condition; however, <a href="http://bugs.mysql.com/21074">Bug #21074</a>, fixed in <a href="http://dev.mysql.com/doc/refman/5.0/en/news-5-0-50.html">5.0.50</a> and <a href="http://dev.mysql.com/doc/refman/5.1/en/news-5-1-21.html">5.1.21</a>, is likely the reason.</p>
<p>My advice is to disable the Query Cache by default, especially for testing. As a final stress test you can enable it to determine if there is a benefit.</p>
<p>I wish MySQL would spend time improving key features. For example, the Query Cache lacks sufficient instrumentation, such as which queries are in the cache and which tables are in the cache, and also lacks sufficient exposed system parameters for fine tuning. I believe there is a patch to show the queries, for example, but I was unable to find it via a Google search.</p>
<p>It is a powerful and easy technology if you use it well. It involves architecting your solution appropriately, and knowing when the Query Cache is ineffective.</p>
<p>I have a number of circumstances where the query cache is extremely effective, or could be with simple modifications. A recommendation to a recent client with a 1+TB database was to split historical and current data into two different instances. The data was already in separate tables and the application already performed dual queries, so the change was as simple as a new connection pool. The benefits were huge: not only would the backup process be more efficient (some 500GB of data now only had to be backed up once, as it was 100% static) and the scaling and recovery process improve, but the second MySQL instance could enable the query cache and the application would get a huge performance improvement with ZERO code changes for caching. That&#8217;s a quick and easy win.</p>
<p>On a side note, I wanted to title this &#8220;The MySQL Query Cache is not useless&#8221;. When I was a MySQL employee I got reprimanded (twice) for blogging anything about MySQL that wasn&#8217;t positive. This blog post is in direct response to Konstantin, a Sun/MySQL employee who actually works on the MySQL server code, who wrote <a href="http://kostja-osipov.livejournal.com/28914.html">Query cache = useless?</a>. In my view it is not useless.</p>
]]></content:encoded>
			<wfw:commentRss>http://planetmysql.ru/2009/09/29/using-the-query-cache-effectively/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

