As Brian mentioned, a number of us traveled up to Seattle last week to discuss the road map for Drizzle, Gearman, and memcached. Thanks to everyone who was able to make it! It was great to see folks again (Northscale guys, Robert Hodges, Padraig), and meet a couple new people like Nathan, one of the Google Summer of Code students for Drizzle. I thought I’d take a moment to mention some of the discussions related to the tasks I’m working on.
For Drizzle, we talked about the new configuration and plugin system I’ve been digging into lately. Monty Taylor has been doing a great job refactoring the plugin loading, but there are still some steps to take to get things where we want them. One of the big goals with all this is to make the plugin and config system independent of Drizzle so we can use it in other projects as well (one being Gearman). There are a bunch of blueprints on Launchpad that describe the steps we’ll be taking. I’ll be writing up more details about this soon.
My other big Drizzle task this milestone is to solidify the protocol work. This means finishing the new libdrizzle integration, continuing and finishing the first phase of the new protocol, adding SASL/TLS support, and adding some new libdrizzle features like sharding and a client-side prepared statement API (nothing being pushed down to the server yet). Again, you can follow these tasks in the libdrizzle blueprints on Launchpad.
We also had a great discussion on Drizzle replication with Robert Hodges from Continuent, covering issues like log consistency, snapshots, and event stream format. One particular topic was how Drizzle will handle taking an atomic table snapshot together with the replication log position (in MySQL this is usually a LOCK TABLES, a snapshot, and a grab of the binlog file/position). One suggestion was to not take snapshots at all and instead provide a replication stream compression tool that can act as the backup (by compression, I mean combining entries for the same rows/fields to make replay of the log faster). This may still be unacceptable for some folks, but it may be enough for many, and it requires virtually zero overhead on the live server (just another replication stream reader).
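To make the "compression" idea a bit more concrete, here is a minimal sketch in Python of combining log entries for the same row. Everything in it is an assumption for illustration: the event shape (`table`, `key`, `op`, `fields`), the `compact_log` name, and the rule that a key is not reused after a delete within the window. It is not Drizzle's event format, and it ignores transaction boundaries and cross-row ordering, which a real tool would have to respect.

```python
def compact_log(events):
    """Collapse row events into at most one event per (table, key).

    Hypothetical event shape: {"table", "key", "op", "fields"} where op is
    "insert", "update", or "delete". Assumes a key is never reused after a
    delete inside the window being compacted.
    """
    merged = {}  # (table, key) -> combined event, in first-seen order
    for ev in events:
        op = ev["op"]
        if op not in ("insert", "update", "delete"):
            raise ValueError("unknown op: %r" % (op,))
        row_id = (ev["table"], ev["key"])
        fields = dict(ev.get("fields", {}))
        prev = merged.get(row_id)
        if prev is None:
            # First event seen for this row: keep a private copy.
            merged[row_id] = {"table": ev["table"], "key": ev["key"],
                              "op": op, "fields": fields}
        elif prev["op"] == "delete" or op == "insert":
            # Key reuse is outside the scope of this sketch.
            raise ValueError("unsupported sequence for row %r" % (row_id,))
        elif op == "update":
            # Later field values win; an earlier insert stays an insert.
            prev["fields"].update(fields)
        elif prev["op"] == "insert":
            # Inserted and deleted inside the window: nothing to replay.
            del merged[row_id]
        else:
            # Update(s) followed by delete: only the delete matters.
            prev["op"] = "delete"
            prev["fields"] = {}
    return list(merged.values())
```

Replaying the compacted list touches each row at most once, which is where the faster replay would come from; the trade-off is that intermediate states are lost, so this only works as a backup of the final state.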
On the Gearman front, we talked about the current status and which new features are most important. Queue replay (for multiple workers) was discussed, along with a pluggable job result cache. The pluggable job result cache would allow the Gearman job server to basically act like a memcached server as well. The difference is that when you have a cache miss, a worker is fired off to populate that cache entry, and all clients requesting that key can wait for the single job (removing the stampeding problem). We are going to be using new features in libmemcached, such as the embedded cache and server-side protocol, to complete this. You can follow the Gearman tasks in the Gearman blueprints on Launchpad.
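The "everyone waits for the single job" behavior can be sketched in a few lines of Python. This is only an in-process illustration using threads, not the Gearman job server or libmemcached API; the `CoalescingCache` class and the `run_job` callback are hypothetical names standing in for the cache and for submitting a job to a worker.

```python
import threading


class CoalescingCache:
    """Cache where concurrent misses on one key share a single job."""

    def __init__(self, run_job):
        self._run_job = run_job   # hypothetical: runs the worker for a key
        self._lock = threading.Lock()
        self._values = {}         # key -> cached result
        self._pending = {}        # key -> Event set when the job finishes

    def get(self, key):
        with self._lock:
            if key in self._values:
                return self._values[key]      # cache hit
            done = self._pending.get(key)
            owner = done is None
            if owner:
                # First requester on a miss: it will run the job.
                done = threading.Event()
                self._pending[key] = done
        if not owner:
            # Another requester is already populating this key.
            done.wait()
            return self.get(key)  # hit now, or retry if the job failed
        try:
            value = self._run_job(key)
            with self._lock:
                self._values[key] = value
            return value
        finally:
            # Wake the waiters whether the job succeeded or raised.
            with self._lock:
                del self._pending[key]
            done.set()
```

If the job raises, the error goes to the requester that ran it and one of the waiters retries; whether that is the right policy for a job server is an open design question, and a real implementation would also need expiry and eviction.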
We also discussed the upcoming rewrite of the Gearman job server in C++ for modularity and future maintenance. I was still going back and forth a bit on the C vs C++ thing, but after people suggested I send some of my C macros to the Obfuscated C Code Contest I think I’m a bit more on the C++ side (these were the “polymorphism for C” macros).
I’m not really involved in memcached development (yet?), but I follow the mailing list and other discussions closely. I’m really interested in the embedded libmemcached changes coming down the line, mostly because of integration opportunities with Drizzle and Gearman. Perhaps someday I’ll start hacking on some of this or the mainline memcached server for fun.
More detailed blog posts coming soon on some of these topics. Please let me know if you have any thoughts!