Archive for the ‘Drizzle’ Category

Taxonomy of database tools

May 12th, 2012

In the MySQL ecosphere there is an ecosystem of tools.  Like real-world ecosystems, the “creatures” in the MySQL tools ecosystem can be classified and organized by a taxonomy.  There are already multiple taxonomies of software bugs (e.g. A Taxonomy of Bugs), but as far as I know this is the first Taxonomy of Database Tools.  A taxonomy of database tools serves useful purposes, as discussed in the previously linked page.  For me, the most useful purpose is the high-level ecosystem view which I use to compare MySQL tools to Drizzle tools.  In so doing, one sees clearly how the MySQL tools ecosystem is thriving whereas the Drizzle tools ecosystem is just budding, so to speak.  For other people, I imagine two overarching interests in a taxonomy of database tools.

First, by laying out the ecosphere in a simple, organized, and comprehensible fashion, a taxonomy of database tools lets a user (DBA, sysadmin, etc.) see how well they are “tooled”.  For example, when I gave a presentation on pt-table-checksum at PLMCE 2012, I was surprised to learn how many people had never used a tool to verify replication data integrity.  I did not bother to ask why, but I suspect it is because they were not aware that such tools existed.  By looking at this taxonomy of database tools, some users might discover a new type of tool of which there are already many examples.
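For readers who have never seen this type of tool, here is a minimal sketch of the general idea behind checksum-based verification: split a table into chunks, checksum each chunk on both servers, and compare.  This is only an illustration under my own assumptions, not how pt-table-checksum is implemented; the function names and the in-memory sample rows are hypothetical stand-ins for real tables.

```python
import hashlib
from itertools import zip_longest


def chunk_checksums(rows, chunk_size):
    """Return one SHA-1 digest per chunk of rows, ordered by primary key.

    Each row is a tuple whose first element is the primary key.
    """
    ordered = sorted(rows, key=lambda row: row[0])
    digests = []
    for start in range(0, len(ordered), chunk_size):
        chunk = ordered[start:start + chunk_size]
        digests.append(hashlib.sha1(repr(chunk).encode("utf-8")).hexdigest())
    return digests


def differing_chunks(master_rows, replica_rows, chunk_size=2):
    """Return the indexes of chunks whose checksums do not match."""
    master = chunk_checksums(master_rows, chunk_size)
    replica = chunk_checksums(replica_rows, chunk_size)
    # zip_longest pads the shorter list with None, so missing chunks count as differences.
    return [i for i, (m, r) in enumerate(zip_longest(master, replica)) if m != r]


# Hypothetical sample data: the replica has silently drifted on row 3.
master = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
replica = [(1, "a"), (2, "b"), (3, "X"), (4, "d")]
print(differing_chunks(master, replica))
```

Comparing per-chunk digests rather than whole rows means only the chunks that differ need a closer look afterwards.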

Second, a taxonomy of database tools is interesting for developers because it reveals where a database server has missing capabilities that users compensate for with tools.  Case in point: pt-table-checksum is used to verify replication data integrity because, until MySQL 5.6, this capability did not exist in the database server.  It is debatable whether all types of tools could be implemented natively in a database server; in theory, they probably could.  This debate becomes a practical concern for modularly-designed database servers like Drizzle because, in my humble opinion, it is far easier to write plugins, and thus tools-as-plugins, for Drizzle than for MySQL.

This Taxonomy of Database Tools is still a work in progress.  A lot of the descriptions need to be expanded, traits refined, and more examples added.  If you do not agree with its organization, you can suggest a change, or develop your own taxonomy.  In any case, I will continue to refine this Taxonomy of Database Tools to see where it leads and what it reveals.


Designing a HTTP JSON database api

April 24th, 2012

A few weeks ago I blogged about the HTTP JSON api in Drizzle. (See also a small demo app using it.) In this post I want to elaborate a little on the design decisions taken. (One reason to do this is to provide a foundation for future work, especially in the form of a GSoC project.)

Looking around: MongoDB, CouchDB, Metabase


Notes from MySQL Conference 2012 — Part 2, the hard part

April 23rd, 2012

This is the second and final part of my notes from the MySQL conference. In this part I'll focus on the technical substance of talks I saw, and didn't see.

More than ever before, I was a contributor rather than an attendee at this conference. Looking back, this meant I saw fewer talks than I would have wanted to, since I was speaking or preparing to speak myself. Sometimes it was worse than speaking: for instance, I spent half a day picking up pewter goblets from an engraving shop... (congratulations to all the winners again :-) Luckily, I can make up for some of that by going back and browsing the speakers' slides. This is especially important whenever two good talks were scheduled in the same slot, or in a slot when I was speaking myself. So I have categorized topics here along various axes, but also along the "things I did see" versus "things I missed" axis.

My own talks

Tutorial: Evaluating MySQL High Availability alternatives
Using and Benchmarking Galera in Different Architectures


Notes from MySQL Conference 2012 — Part 1, the soft part

April 22nd, 2012

I have finally recovered from my trip to Santa Clara enough that I can scribble down some notes from this year's MySQL Conference. Writing a travel report is part of the deal where my employer covers the travel expense, so even if many people have written about the conference, I need to do it too. And judging from the many posts for instance from Pythian's direction, Nokia is perhaps not the only company with such a policy?

Baron's keynote

There has usually been something that can be called a "soft keynote". Pirate Party founder Rick Falkvinge speaking at a database conference is a memorable example (I still keep in touch with him, having met him at the Hyatt Santa Clara). This year there was one day less, and therefore fewer keynotes. The soft keynote was therefore taken care of by Baron, using some time out of Peter's opening keynote. Baron's talk was an ode to the conference itself, underscoring the meaning of the conference beyond just learning about technology. Sharing his own journey from a numb ASP.NET coder ("a good day at the office was when I changed a table based layout to pure CSS ...but nobody else seemed to care.") to his role today, he challenged people to network, make new friends and come up with new revolutionary ideas. To me, it was a great opening keynote (and quite obviously would have made less sense on the last day of the conference). The talk, including Peter's part, is available on Percona.TV.


Disproving the CAP Theorem

April 1st, 2012

Since the famous conjecture by Eric Brewer and proof by Nancy Lynch et al., CAP has given the world countless learned discussions about distributed systems and many a well-funded start-up.  Yet who truly understands what CAP means?  Even a cursory survey of the blogosphere shows profound disagreement about the meaning of terms like CP, AP, and CA in real systems.  Those who disagree on CAP include some of the most illustrious personages of the database community.

We can therefore state with some confidence that CAP is confusing. Yet this observation itself raises deeper questions.  Is CAP merely confusing?  Or is it the case that, as with other initially accepted but now doubtful ideas like the Copernican model, evolution, and continental drift, CAP is actually not correct?  Thoughtful readers will agree this question has not received anywhere near the level of scientific scrutiny it deserves.

Fortunately for science, private citizens like me have been forging ahead without regard to the opinions of so-called experts or even common sense.  My work on CAP relies on two trusted analytic tools of database engineers over the legal drinking age: formal logic and beer.  Given the nature of the problem we should obviously use a minimum of the former and a maximum of the latter.  We have established that CAP is confusing.  To understand why, we must now deepen our confusion and study its habits carefully.  Other investigators have used this approach with great success.

Let us begin by translating the terms of CAP into the propositional calculus.  The terms C (consistency), A (availability) and P (partition tolerance) can be used to state the famous "two out of three" of CAP using logical implication as shown below.

(1) A and P => not C
(2) P and C => not A
(3) C and A => not P
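For the sober reader, the three implications can be checked mechanically.  The sketch below is my own illustration rather than part of the original argument; the helper names `implies` and `cap_rules` are made up.  It enumerates all eight truth assignments and confirms that each of (1), (2) and (3) says exactly the same thing: C, A and P cannot all hold at once.

```python
from itertools import product


def implies(p, q):
    """Material implication: p => q."""
    return (not p) or q


def cap_rules(c, a, p):
    """Evaluate statements (1), (2) and (3) for one truth assignment."""
    rule1 = implies(a and p, not c)
    rule2 = implies(p and c, not a)
    rule3 = implies(c and a, not p)
    return rule1, rule2, rule3


# Each rule should agree with "not (C and A and P)" on every row of the truth table.
all_equivalent = all(
    cap_rules(c, a, p) == (not (c and a and p),) * 3
    for c, a, p in product([False, True], repeat=3)
)
print(all_equivalent)  # prints True
```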

So far so good.  We can now dispense briefly with logic and turn to confusion.  It seems there is difficulty distinguishing the difference between CA and CP systems, i.e., that they are therefore equivalent. This is a key insight, which we can express formally as follows:

(4) C and A <=> C and P

which further reduces to

(5) A <=> P

In short our confusion has led us directly to the invaluable result that A and P, hence availability and partition tolerance, are exactly equivalent!  I am sure you share my excitement at the direction this work is taking.  We can now through a trivial substitution of A for P in equation 2 above reveal the following:

(6) A and C => not A
(7) C => (A => not A)

We have just shown that consistency implies that any system that is available is also unavailable simultaneously.  This is an obvious contradiction, which means the vast logical edifice on which CAP relies crumbles like a soggy nacho.  Considering the amount of beer consumed at the average database conference it is surprising nobody thought of this before.

At this point we can now raise the conversation up a level from looking for spare change under the table and comment on the greater meaning of our results in the real world.  Which is the following: Given the way most of us programmers write software it's a wonder CAP is an issue at all. Honestly, I can't even get calendar programs to send invitations to each other across time zones.  I plan to bring the combustible analytic capabilities of logic and beer to bear on the mystery of time at a later date.  For now we can just speculate it is due to a mistaken design based on CAP.

Drizzle Differences from MySQL

March 30th, 2012

I decided to take a look at Drizzle today and was encouraged by what I saw. Here’s my favorite part:

* There is no UNSIGNED (as per the standard).
* There are no spatial data types GEOMETRY, POINT, LINESTRING & POLYGON (go use Postgres).
* No YEAR field type.
* There are no FULLTEXT indexes for the MyISAM storage engine (the only engine FULLTEXT was supported in). Look at either Lucene, Sphinx, or Solr.
* No “dual” table.
* The “LOCAL” keyword in “LOAD DATA LOCAL INFILE” is not supported.

GO USE POSTGRES. Awesome.

List of differences from MySQL.


Simple GUI to edit JSON records in Drizzle

March 29th, 2012

So yesterday I introduced the newly committed HTTP JSON key-value interface in Drizzle. The next step, of course, is to create a simple application that uses it to store data. This serves both as an example use case and as a way for me to get a feeling for whether this makes sense as a programming paradigm.

Personally, I have been a fan of the schemaless key-value approach ever since I graduated from university and started doing projects with dozens of tables and hundreds of columns in total. Especially in small projects, I always found the array structures in languages like PHP, Perl and Python to be very flexible to develop with. When I was developing and realized I needed a new variable or new data field somewhere, it was straightforward to just toss a new key-value pair into the array and continue writing code. No need to go back and edit some class definition. If I ever needed to find out what was available in some struct, I could always do var_dump($obj) to find out. Even large projects like Drupal get along with this model very well.
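As a small illustration of that workflow, here is a minimal sketch in Python, where a plain dict plays the role of the flexible array. The record contents and the `describe` helper are made up for the example and have nothing to do with the Drizzle interface itself.

```python
from pprint import pformat

# A record is just a dict: there is no class definition to go back and edit.
record = {"name": "Alice", "email": "alice@example.com"}

# Halfway through development we realize we need one more field: just add it.
record["twitter"] = "@alice"


def describe(obj):
    """Return a readable dump of whatever keys the record currently has."""
    return pformat(obj)


print(describe(record))
```

The dump at the end is the inspect-it-when-you-need-it habit: the record documents itself, at the cost of having no schema to catch a mistyped key.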


Sessions I want to see at the MySQL User Conference

March 20th, 2012

Oh boy, I'm starting to feel the stress of having to prepare a little bit of this and a little bit of that for the upcoming MySQL User Conference (Santa Clara, April 10 to 13). But I also wanted to jump on this meme and list a few sessions I definitely want to attend:

I'm speaking, so I suppose I need to attend:


SkySQL is Coming to a City Near You!

March 19th, 2012

Now that the snow is melting and spring is in the air, the SkySQL Team is hitting the road and making the rounds of key industry events, trade shows, and meetups around the globe.  Come meet the team, pick up a few tips and tricks for using the MySQL database, network with your peers, and learn more about SkySQL’s products and services.  Here are some of the events we’ll be at this spring:

BIG Data, A New Horizon for Data Analysis
March 20 - 21, 2012
Cité Internationale Universitaire de Paris, Paris, France

POSSCON 2012
March 28-29, 2012
Columbia Metropolitan Convention Center, Columbia, South Carolina

Houston PHP Users Group Meeting
April 5, 2012
BrandExtract, Houston TX

Percona Live:  MySQL Conference & Expo 2012
April 10-April, 2012
Hyatt Regency Santa Clara, Santa Clara, CA

SkySQL & MariaDB:  Solutions Day for the MySQL Database
April 13, 2012
Hyatt Regency Santa Clara, Santa Clara, CA

Drizzle Day 2012
April 13, 2012
Hyatt Regency Santa Clara, Santa Clara, CA

Sphinx Search Day 2012
April 13, 2012
Hyatt Regency Santa Clara, Santa Clara, CA

For more detailed information on these events, visit http://www.skysql.com/news-and-events/events.

We hope to see you when we’re in a city near you!


IPv6 in Drizzle, soon in MySQL?

March 19th, 2012

Earlier today I posted a Drizzle white paper we've been working on: Drizzle and IPv6.

