Archive for the ‘My Projects’ Category

DbCharmer 1.7.0 Release: Rails 3.0 Support and Forced Slave Reads

Сентябрь 1st, 2011

This week, after 3 months in the works, we’ve finally released version 1.7.0 of DbCharmer ruby gem – Rails plugin that significantly extends ActiveRecord’s ability to work with multiple databases and/or database servers by adding features like multiple databases support, master/slave topologies support, sharding, etc.

New features in this release:

  • Rails 3.0 support. We’ve worked really hard to bring all the features we supported in Rails 2.X to the new version of Rails and now I’m proud that we’ve implemented them all and the implementation looks much cleaner and more universal (all kinds of relations in rails 3 work in exactly the same way and we do not need to implement connection switching for all kinds of weird corner-cases in ActiveRecord).
  • Forced Slave Reads functionality. Now we could have models with slaves that are not used by default, but could be turned on globally (per-controller, per-action or in a block). This is a new feature that brings our master/slave routing capabilities to a really new level – we could now use it for a really mission-critical models on demand and not be afraid of breaking major functionality of our applications by switching them to slave reads.
  • Lots of changes were made in the structure of our code and tests to make sure it would be much easier for new developers to understand DbCharmer internals and make changes in its code.

Along with the new release we’ve got a brand new web site. You can find much better, cleaner and, most importantly, correct documentation for the library on the web site. We’ll be adding more examples, will try to add more in-depth explanation of our core functions, etc.

If you have any questions about the release, feel free to ask them in our new mailing list: DbCharmer Users Group.

For more updates in our releases, you can follow @DbCharmer on Twitter.



PlanetMySQL Voting: Vote UP / Vote DOWN

Advanced Squid Caching in Scribd: Cache Invalidation Techniques

Май 29th, 2010

Having a reverse-proxy web cache as one of the major infrastructure elements brings many benefits for large web applications: it reduces your application servers load, reduces average response times on your site, etc. But there is one problem every developer experiences when works with such a cache – cached content invalidation.

It is a complex problem that usually consists of two smaller ones: individual cache elements invalidation (you need to keep an eye on your data changes and invalidate cached pages when related data changes) and full cache purges (sometimes your site layout or page templates change and you need to purge all the cached pages to make sure users will get new visual elements of layout changes). In this post I’d like to look at a few techniques we use at Scribd to solve cache invalidation problems.


So, the first problem – ongoing cache invalidation when content changes. This is actually a pretty simple task in squid: you just use HTCP protocol and send CLR requests to your caching farm (we didn’t find any HTCP protocol implementations so we’ve implemented our own simple client that supports just one command).

Since we use haproxy to balance our traffic in the cluster it is hard to predict where should we send a purge request. So we fan those out to all cache servers.

To make sure cache purging won’t slow the site down, especially considering we need to do more that just a simple cache purge (submit documents to search indexes, etc, etc), we just spool a “document changed” request to a queue and then have a set of asynchronous processes that do all the work in background.

Next, The Hard Problem – handling full cache purges w/o killing our backend servers with 5x-10x traffic (our normal hit ratio is ~90-95%).

We’ve spent a lot of time thinking about this problem and the first idea we came up with was to have a loop process somewhere that would iterate all documents we have cached and purge them one by one… but that does not seem to be a practical solution when you have tens of millions documents (and few page versions per document) and obviously the solution would not scale with constantly growing documents corpus.

So we kept brainstorming and finally got one idea that works just perfectly for us: what if we’d be able to take our traffic and define a function f(t) that would return a percentage of the traffic that should be purged at any moment in time. So we did it – we’ve implemented a nginx module that would version our cache by assigning every cached page a revision (using a custom HTTP-headers + Vary-caching) and would be able to slowly migrate the cache from one revision to another over a pre-defined period of time.

Having that module we are able to do so called “slow” cache purges that could take any time from a few minutes (that still helps to reduce the load spike generated by the hottest content) up to many hours (this is what we normally use) or days (never used this option, but it is definitely possible).

Here is an example 100% cache purge over an 8 hour interval:

  1. Daily hit ratio graph:
    day
  2. Weekly hit ratio graph:
    week

As you can see, during those slow purges our cached pages would be slowly updated without putting too much pressure on the backend. Cache hit ratio would slowly degrade and then slowly get back to its normal levels, but with our normal (6-8 hours) purges hit ratio never gets lower that 65-70% which makes it possible for us to save huge amounts of money on not having 90% spare capacity just for the cache purge load surges (we used to have lots of spare application cluster capacity before introducing this approach).



PlanetMySQL Voting: Vote UP / Vote DOWN

DbCharmer – Rails Can Scale!

Апрель 17th, 2010

Back in November 2009 I was working on a project to port Scribd.com code base to Rails 2.2 and noticed that some old plugins we were using in 2.1 were abandoned by their authors. Some of them were just removed from the code base, but one needed a replacement – that was an old plugin called acts_as_readonlyable that helped us to distribute our queries among a cluster of MySQL slaves. There were some alternatives but we didn’t like them for one or another reasons so we’ve decided to go with creating our own ActiveRecord plugin, that would help us scale our databases out. That’s the story behind the first release of DbCharmer.

Today, six months after the first release of the gem and we’ve moved it to gemcutter (which is now the official gems hosting) and we’re already at version 1.6.11. The gem was downloaded more than 2000 times. There are (at least) 10+ large users that rely on this gem to scale their products out. And (this is the most exciting) we’ve added tons of new features to the product.

Here are the main features added since the first release:

  • Much better multi-database migrations support including default migrations connection changing.
  • We’ve added ActiveRecord associations preload support that makes it possible to move eager loading queries to the same connection where your finder queries go to.
  • We’ve improved ActiveRecord’s query logging feature and now you can see what connections your queries executed on (and yes, all those improvements are colorized :-) ).
  • We’ve added an ability to temporary remap any ActiveRecord connections to any other connections for a block of code (really useful when you need to make sure all your queries would go to some non-default slave and you do not want to mess with all your models).
  • The most interesting change: we’ve implemented some basic sharding functionality in ActiveRecord which currently is being used in production in our application.

As you can see now DbCharmer helps you to do three major scalability tasks in your Rails projects:

  1. Master-Slave clusters to scale out your Rails models reads.
  2. Vertical sharding by moving some of your models to a separate (maybe even dedicated) servers and still keep using AR associations
  3. Horizontal sharding by slicing your models data to pieces and placing those pieces into different databases and/or servers.

So, If you didn’t check DbCharmer out yet and you’re working on some large rails project that is (or going to be) facing scalability problems, go read the docs, download/install the gem and prove them that Rails CAN scale!



PlanetMySQL Voting: Vote UP / Vote DOWN