Archive for the ‘caching’ Category

Database Architectures & Performance

Июль 20th, 2010
For decades the debate between shared-disk and shared-nothing databases has raged. The shared-disk camp points to the laundry list of functional benefits such as improved data consistency, high-availability, scalability and elimination of partitioning/replication/promotion. The shared-nothing camp shoots back with superior performance and reduced costs. Both sides have a point.

First, let’s look at the performance issue. RAM (average access time of 200 nanoseconds) is considerably faster than disk (average access time of 12,000,000 nanoseconds). Let me put this 200:12,000,000 ratio into perspective. A task that takes a single minute in RAM would take 41 days in disk. So why do I bring this up?

Shared-Nothing: Since the shared-nothing database has sole ownership of its data—it doesn’t share the data with other nodes—it can operate in the machine’s local RAM, only writing infrequently to disk (flushing the data to disk). This makes shared-nothing databases very fast.

Shared-Disk: Cannot rely on the machine’s local RAM, because every write by one node must be instantly available to the other nodes, to ensure that they don’t use stale data and corrupt the database. So instead of relying on local RAM, all write transactions must be written to disk. This is where the 1 minute to 41 days ratio above comes into play and kills performance of shared-disk databases.

Let’s look at some of the ways databases can utilize RAM instead of disk to improve performance:

Read Cache: Databases typically use the RAM as a fast read cache. Upon reading data from the disk, this data is stored in the read cache so that subsequent use of that data is satisfied from RAM instead of the disk. For example, upon reading a person’s name from disk, that name is stored in the cache for fast access. The database wouldn’t need to read that name from disk again until that person’s name is changed (rare), or that RAM space is reused for a piece of data that is used more frequently. Read cache can significantly improve database performance.

BOTH shared-disk and shared-nothing databases can exploit read cache. The shared-disk database just needs a system to either invalidate or update the data in read cache when one of the nodes has made a change. This is pretty standard in shared-disk databases.

Background Writing: Writing data to the disk is by far the most time consuming process in a write transaction. During the transaction, that portion of the data is locked, meaning it is unavailable for other functions. So, if you can move the writing of the data outside of the transaction—write the data in the background—you get faster transactions, which means less locking contention, which means faster throughput.

SHARED-NOTHING can exploit this performance enhancement, since each server owns the data in its RAM. However, shared-disk databases cannot do this because they need to share that updated data with the other database nodes in the cluster. Since the local node’s cache is not shared, in a shared-disk database, the only option is to use the shared disk to share that data across the nodes.

Transactional Cache: The next step in utilizing RAM instead of disk is to use it in a transactional manner. This means that the database can make multiple changes to data in RAM prior to writing the final results to disk. For example, if you have 100 widgets, you can store that inventory count in RAM, and then decrement it with each sale. If you sell 23 widgets, then instead of writing each transaction to disk, you update it in RAM. When you flush this data to disk, it results in a single disk write, writing the inventory number 77, instead of writing each of the 23 transactions individually to disk.

SHARED-NOTHING can perform transactions on data while it is in RAM. Once again, shared-disk databases cannot do this because you might have multiple nodes updating the inventory. Since they cannot look into each others local RAM, they must once again write each transaction to disk.

As you can see, shared-nothing databases have an inherent performance advantage. The next blog post will address how modern shared-disk databases address these performance challenges.

PlanetMySQL Voting: Vote UP / Vote DOWN

Advanced Squid Caching in Scribd: Cache Invalidation Techniques

Май 29th, 2010

Having a reverse-proxy web cache as one of the major infrastructure elements brings many benefits for large web applications: it reduces your application servers load, reduces average response times on your site, etc. But there is one problem every developer experiences when works with such a cache – cached content invalidation.

It is a complex problem that usually consists of two smaller ones: individual cache elements invalidation (you need to keep an eye on your data changes and invalidate cached pages when related data changes) and full cache purges (sometimes your site layout or page templates change and you need to purge all the cached pages to make sure users will get new visual elements of layout changes). In this post I’d like to look at a few techniques we use at Scribd to solve cache invalidation problems.


So, the first problem – ongoing cache invalidation when content changes. This is actually a pretty simple task in squid: you just use HTCP protocol and send CLR requests to your caching farm (we didn’t find any HTCP protocol implementations so we’ve implemented our own simple client that supports just one command).

Since we use haproxy to balance our traffic in the cluster it is hard to predict where should we send a purge request. So we fan those out to all cache servers.

To make sure cache purging won’t slow the site down, especially considering we need to do more that just a simple cache purge (submit documents to search indexes, etc, etc), we just spool a “document changed” request to a queue and then have a set of asynchronous processes that do all the work in background.

Next, The Hard Problem – handling full cache purges w/o killing our backend servers with 5x-10x traffic (our normal hit ratio is ~90-95%).

We’ve spent a lot of time thinking about this problem and the first idea we came up with was to have a loop process somewhere that would iterate all documents we have cached and purge them one by one… but that does not seem to be a practical solution when you have tens of millions documents (and few page versions per document) and obviously the solution would not scale with constantly growing documents corpus.

So we kept brainstorming and finally got one idea that works just perfectly for us: what if we’d be able to take our traffic and define a function f(t) that would return a percentage of the traffic that should be purged at any moment in time. So we did it – we’ve implemented a nginx module that would version our cache by assigning every cached page a revision (using a custom HTTP-headers + Vary-caching) and would be able to slowly migrate the cache from one revision to another over a pre-defined period of time.

Having that module we are able to do so called “slow” cache purges that could take any time from a few minutes (that still helps to reduce the load spike generated by the hottest content) up to many hours (this is what we normally use) or days (never used this option, but it is definitely possible).

Here is an example 100% cache purge over an 8 hour interval:

  1. Daily hit ratio graph:
    day
  2. Weekly hit ratio graph:
    week

As you can see, during those slow purges our cached pages would be slowly updated without putting too much pressure on the backend. Cache hit ratio would slowly degrade and then slowly get back to its normal levels, but with our normal (6-8 hours) purges hit ratio never gets lower that 65-70% which makes it possible for us to save huge amounts of money on not having 90% spare capacity just for the cache purge load surges (we used to have lots of spare application cluster capacity before introducing this approach).



PlanetMySQL Voting: Vote UP / Vote DOWN

Introduction to memcached

Май 28th, 2010

These are the slides to a talk I did earlier this week for students of the professional bachelor in ICT course at KaHo St. Lieven. I wanted to give a clear and simple introduction to the memcached service, as I think it’s an invaluable tool in today’s web development.


PlanetMySQL Voting: Vote UP / Vote DOWN

Using the Query Cache effectively

Сентябрь 29th, 2009

Maximize your strengths, minimize your weaknesses.

You can apply this approach to many things in life, I apply it to describing and using MySQL the product, and it’s components. The Query Cache like many features in MySQL, and indeed features in many different RDBMS products (don’t get me started on Oracle *features*) have relative benefits. In one context it can be seen as ineffective, or even detrimental to your performance, however it’s course grain nature makes it both trivial to disable dynamically (SET GLOBAL query_cache_size=0;), and also easy to get basic statistics on current performance (SHOW GLOBAL STATUS LIKE ‘QCache%’;) to determine effectiveness and action appropriately.

The Query Cache is course grained, that is it is rather simple/dumb in nature. When you understand the path of execution of a query within the MySQL kernel you learn a few key things.

  • When enabled, by default the Query Cache will cache all SELECT statements within certain defined system parameter conditions. There are of course exceptions such as non-deterministic functions, prepared statements in earlier versions etc.
  • Any DML/DDL statement for a table that has a query cached, flushes all query cache results that pertain to this table.
  • You can use SQL_CACHE and SQL_NO_CACHE as hints however you can’t configure on a table by table, or query basis.
  • The query cache works on an exact match of the query (including spaces and case) and other settings such the client character set, and protocol version. If a match is found, data is returned in preformed network packets.<.li>

The Query Cache was not good when set to large values (e.g. > 128M) due to in-efficient cache invalidation. I’m not certain of the original source of this condition however Bug #21074, fixed in 5.0.50 and 5.1.21 is likely the reason.

My advice is to disable the Query Cache by default, especially for testing. As a final stress test you can enable to determine if there is a benefit.

I wish MySQL would spend time in improving key features, for example the Query Cache lacks sufficient instrumentation like what queries are in the cache, what tables are in the cache, and also lack all the sufficient system parameters exposed to fine tune. I believe there is a patch to show the queries for example, but I was unable to find via a google search.

It is a powerful and easy technology if you use it well. It involves architecting your solution appropriately, and knowing when the Query Cache is ineffective.

I have a number of circumstances where the query cache is extremely effective, or could be with simple modifications. A recommendation to a recent client with a 1+TB database was to split historical and current data into two different instances. The data was already in separated tables, the application already performed dual queries, so the change was a simple as a new connection pool. The benefits were huge, not only would the backup process be more efficient, some 500GB of data now only had to be backed up once (as is was 100% static), the scaling and recovery process improved, but the second MySQL instance could enable the query cache and the application would get a huge performance improvement with ZERO code changes for caching. That’s a quick and easy win.

On a side note, I wanted to title this “The MySQL Query Cache is not useless”. When I was a MySQL employee I got reprimanded (twice) for blogging anything about MySQL that wasn’t positive. This blog post is in direct response to Konstantin, a Sun/MySQL employee who actually works on the actually MySQL server code who wrote Query cache = useless?. In my view it is not useless.


PlanetMySQL Voting: Vote UP / Vote DOWN

Advanced Squid Caching in Scribd: Logged In Users and Complex URLs Handling

Июль 22nd, 2009

It’s been a while since I’ve posted my first post about the way we do document pages caching in Scribd and this approach has definitely proven to be really effective since then. In the second post of this series I’d like to explain how we handle our complex document URLs and logged in users in the caching architecture.

First of all, let’s take a look at a typical Scribd’s document URL: http://www.scribd.com/doc/1/Improved-Statistical-Test.

As we can see, it consists of a document-specific part (/doc/1) and a non-unique human-readable slug part (/Improved-Statistical-Test). When a user comes to the site with a wrong slug in the document URL, we need to make sure we send the user to the correct URL with a permanent HTTP 301 redirect. So, obviously we can’t simply send our requests to the squid because it’d cause few problems:

  • When we change document’s title, we’d create a new cached item and would not be able to redirect users from the old URL to the new one
  • When we change a title, we’d pollute cache with additional document page copies.

One more problem that makes the situation even worse – we have 3 different kinds of users on the site:

  1. Logged in users – active web site users that are logged in and should see their name at the top of the page, should see all kinds of customized parts of the page, etc (especially when a page is their own document).
  2. Anonymous users – all users that are not logged in and visit the site with a flash-enabled browser
  3. Bots – all kinds of crawlers that can’t read flash content and need to see a plain text document version

All three kinds of users should see their own document page versions whether the page is cached or not.

So, how do we solve these two problems? Here is how.

First of all, to fix the URLs problem we’ve decided to rewrite the URL before it goes to a squid server. We change URLs to look like this: http://www.scribd.com/doc/1?__enable_docview_caching=1. This makes the document URL dependent only on a unique document ID that never change and sends an additional parameter to the backend to signal that the page could potentially be cached. The slug is sent to backend using an HTTP-header (X-Scribd-Slug) so that backend could check the slug and return a redirect if needed.

To make sure we won’t respond with a cached page to a request with an invalid URL (invalid slug basically), we use Vary: X-Scribd-Slug HTTP header which is implemented in Squid (only late 2.6 and 2.7) and makes it check specified headers in a request before responding with a cached content. If the header of the cached content is different then the header in the request, squid proxies the resuest to backend where we could handle the cache miss as we want.

Next, to resolve the users problem we’ve created a small nginx module that looking at a request headers could tell you whether the user is a bot or not and whether he’s logged in or an anonymous visitor. This module basically exposes a $scribd_user_id variable to our configs and we could use the variable to do separate configuration for different kinds of users.

At this point we do not cache document pages for logged in users so we basically have two copies of each page in the cache: flash-enabled document page and an inline document page. We do this separation by changing our cache URLs one more time: we add a scribd_user_id=$scribd_user_id variable (anonymous = 0, bot = -1) to the cache URL: http://www.scribd.com/doc/1?__enable_docview_caching=1&scribd_user_id=0.

And last, not not least, we use two really awesome Squid features called stale-while-revalidate and stale-if-error (AFAIR, they were invented in Yahoo! and then described by their squid admin).

Option stale-while-revalidate allow us to quickly serve content from the cache while doing background re-validation requests to the Rails backend. Option stale-if-error basically allows us to serve content from the squid cache when Rails backend is down/dead/slow.

All these changes allowed us to handle more traffic with less hardware and what is even more important, they helped us improve user experience with the site: response times dropped 2-3 fold and much less people see our Ouch! pages (HTTP 50x errors when backend is dead or overloaded). Here is an example of one of our servers’ hit ratio and traffic savings daily graphs:

graph_image

traffic_savings

This it with the logged in users and complex URLs handling in Scribd caching architecture. Next post in this series will explain how we do cache invalidation in Scribd. Stay tuned.