Archive for the ‘code’ Category

Storage Engine API state graph

Октябрь 25th, 2010

Drizzle still has a number of quirks inherited from the MySQL Storage Engine API (e.g. BLOBs, row buffer, CREATE SELECT and lack of DDL transaction boundaries, key tuple format). One of the things we fixed a long time ago was to have proper methods for StorageEngines to be called for: startTransaction, startStatement, endStatement, commit and rollback.

If you’ve had to implement a transactional storage engine in MySQL you will be well aware of the pattern of “in every Storage Engine/handler call: if transaction doesn’t exist, begin.” We’ve tried to fix this in the Drizzle API for a number of reasons. I think having this obvious set of calls will make the API a lot easier to understand. I am also very interested in making things much easier to prove correct.

A while ago I spotted Bug 587772, which was the READ COMMITTED isolation level not working correctly with InnoDB. It turns out that the most basic example for READ COMMITTED failed. Hrrm… this is no good. It worked on MySQL, so this was certainly something that we broke. What was more worrying is that there wasn’t a test for this in the test suite (and at the time I couldn’t find one in the MySQL test suite either, so I think we inherited the missing test).

I recently started delving in, actually going to solve this. I noticed something worrying, endStatement wasn’t being called, which is where the innobase plugin would release the read view that it used for the statement. You’d think that it would grab a new one on startStatement, but because of the previous design of the API (remember “if txn isn’t started, start it!”) this also happened for getting the read view for the statement… so we instead got a REPEATABLE READ isolation level.

I wanted a test.

Previously, I’ve created a dummy storage engine (tableprototester) and used it to test the server code for reading the table protobuf message. I thought about doing a Storage Engine for this problem too, basically looking at the calls to the Storage Engine as transitions between states in a state machine.

A basic view of a transaction could be:

State transitions for a transaction. Transaction can be empty OR have one or more statementsThat is, a transaction starts and has zero or more statements before it commits or gets rolled back.

By coding up a data structure of allowable state transitions, a small function to assert() on invalid transitions and enough of the boilerplate to make the engine “work”, I was able to hit an assert() exactly where I’d expected it: at an invalid transition from START STATEMENT to COMMIT.

To fix the initial bug (READ COMMITTED not working), I filled in a few state transitions for the system as a whole that aren’t quite correct. From the diagram below, you can quite obviously see where the obvious bugs are (it helps that I’ve coloured them red):

There is absolutely no sense in going BEGIN -> END STATEMENT or immediately to COMMIT. These should be relatively easy to solve too, but are separate bugs.

I wish to expand this in the future to cover Cursor as well. It will also be useful to ensure that DDL can be wrapped in transactions. Not to mention the last few HTON flags that exist (and should likely go away).

To generate the diagrams, I just wrote a little utility to dump out the state transitions in dot, using it to generate the diagrams.


PlanetMySQL Voting: Vote UP / Vote DOWN

HailDB being built by default in Drizzle

Октябрь 21st, 2010

It just it trunk – if you have HailDB installed when you build Drizzle, you will now get the HailDB plugin built. You can even run Drizzle with it (remove innobase plugin, load HailDB plugin). Previously, we had problems building both due to symbol conflicts between innobase and HailDB. We’ve fixed this thanks to the linker.

So, enjoy HailDB… welll, test it and report bugs that I can fix :)


PlanetMySQL Voting: Vote UP / Vote DOWN

Second Drizzle Beta (and InnoDB update)

Октябрь 14th, 2010

We just released the latest Drizzle tarball (2010-10-11 milestone). There are a whole bunch of bug fixes, but there are two things that are interesting from a storage engine point of view:

  • The Innobase plugin is now based on innodb_plugin 1.0.6
  • The embedded_innodb engine is now named HailDB and requires HailDB, it can no longer be built with embedded_innodb.

Those of you following Drizzle fairly closely have probably noticed that we’ve lagged behind in InnoDB versions. I’m actively working on fixing that – both for the innobase plugin and for the HailDB library.

If building the HailDB plugin (which is planned to replace the innobase plugin), you’ll need the latest HailDB release (which as of writing is 2.3.1). We’re making good additions to the HailDB API to enable the storage engine to have the same features as the Innobase plugin.


PlanetMySQL Voting: Vote UP / Vote DOWN

Database speed tests (mysql and postgresql) — part 3 — code

Октябрь 1st, 2010
Here is the code structure dbfuncs.php : is the file which contains classes and functions for firing queries on mysql and pgsql mysqlinsert.php : creates and fires inserts on mysql mysqlselect.php : creates and fires selects on mysql pgsqlinsert.php : creates and fires inserts on pgsql pgsqlselect.php : creates and fires selects on pgsql benchmark.php : script used to control concurrency and
PlanetMySQL Voting: Vote UP / Vote DOWN

Drizzle7 Beta!

Сентябрь 30th, 2010

Just in case you missed it, I’m rather thrilled that our latest tarball of Drizzle is named Beta. Specifically, we’re calling it Drizzle7. Seven is a very nice number, and it seems rather appropriate.

This release is for a stand alone database server. A lot of the infrastructure for replication is there (with testing), but the big thing we want to hammer on and get perfect here is Drizzle7 as a stand alone database server.

Can I trust it? If you trust InnoDB to store your data, then yes, you can trust Drizzle (it uses InnoDB too)


PlanetMySQL Voting: Vote UP / Vote DOWN

Warnings are now actual problems

Сентябрь 23rd, 2010

Yesterday, I reached a happy milestone in HailDB development. All compiler warnings left in the api/ directory (the public interface to the database engine) are now either probable/possible bugs (that we need to look at closely) or are warnings due to unfinished code (that we should finish).

There’s still a bunch of compiler warnings that we’ve inherited (HailDB compiles with lots of warnings enabled) that we have to get through, but a lot will wait until after we update the core to be based on InnoDB 1.1.


PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL Function to Convert Date To Words

Сентябрь 16th, 2010
Recently I saw a MySQL Stored Function requirement on Experts-Exchange for converting date into some specific words format. You may find MySQL function for date to words conversion online; even udfs might be ready, but I decided to write my own. I wrote this simple function mainly based on SELECT CASE to convert dates in to words [...] Related posts:
  1. bat – batch file to create formatted date time (ddmmyyyy) directory
  2. Stored procedure to Find database objects
  3. MySQL Stored procedure to Generate-Extract Insert Statement

PlanetMySQL Voting: Vote UP / Vote DOWN

Yet another DDL dump script

Август 24th, 2010

Inspired by some recent posts about scripts for parsing mysqldump files, I thought I’d put my own little dump program out there for those looking for alternatives to mysqldump. It’s written in perl and only actually uses the mysqldump utility to dump the data portion of its file dumps. The rest is done via mysql’s internal show commands. Each database is dumped to it’s own directory, where each component in that database is dumped into five subdirs: schemas, routines, views, triggers & data. Inside these subdirs is a file per table and file per proc, etc. The data section is slightly different: as I mentioned it uses mysqldump to dump bulk insert statements into a sql file, one per table. So at any time, any or all components (including data) can be reloaded with a simple redirect or pipe back to mysql client:

[user@svr123]$ mysql -D dbname < /path/to/files/dbname/schemas/*.sql
[user@svr123]$ mysql -D dbname < /path/to/files/dbname/routines/*.sql
[user@svr123]$ mysql -D dbname < /path/to/files/dbname/views/*.sql
[user@svr123]$ mysql -D dbname < /path/to/files/dbname/triggers/*.sql
[user@svr123]$ gunzip -c /path/to/files/dbname/data/*.sql.gz|mysql -D dbname
(data files are zipped)

The script is self-explanatory and since I’m pretty much a newb at perl code, I’d love to know if there’s a better way to do any of this stuff. You can find the code here.


Filed under: code, MySQL
PlanetMySQL Voting: Vote UP / Vote DOWN

HOWTO screw up launching a free software project

Июль 27th, 2010

Josh Berkus gave a great talk at linux.conf.au 2010 (the CFP for linux.conf.au 2011 is open until August 7th) entitled “How to destroy your community” (lwn coverage). It was a simple, patented, 10 step program, finely homed over time to have maximum effect. Each step is simple and we can all name a dozen companies that have done at least three of them.

Simon Phipps this past week at OSCON talked about Open Source Continuity in practice – specifically mentioning some open source software projects that were at Sun but have since been abandoned by Oracle and different strategies you can put in place to ensure your software survives, and check lists for software you use to see if it will survive.

So what can you do to not destroy your community, but ensure you never get one to begin with?

Similar to destroying your community, you can just make it hard: “#1 is to make the project depend as much as possible on difficult tools.

#1 A Contributor License Agreement and Copyright Assignment.

If you happen to be in the unfortunate situation of being employed, this means you get to talk to lawyers. While your employer may well have an excellent Open Source Contribution Policy that lets you hack on GPL software on nights and weekends without a problem – if you’re handing over all the rights to another company – there gets to be lawyer time.

Your 1hr of contribution has now just ballooned. You’re going to use up resources of your employer (hey, lawyers are not cheap), it’s going to suck up your work time talking to them, and if you can get away from this in under several hours over a few weeks, you’re doing amazingly well – especially if you work for a large company.

If you are the kind of person with strong moral convictions, this is a non-starter. It is completely valid to not want to waste your employers’ time and money for a weekend project.

People scratching their own itch, however small is how free software gets to be so awesome.

I think we got this almost right with OpenStack. If you compare the agreement to the Apache License, there’s so much common wording it ends up pretty much saying that you agree you are able to submit things to the project under the Apache license.  This (of course) makes the entire thing pretty redundant as if people are going to be dishonest about submitting things under the Apache licnese there’s no reason they’re not going to be dishonest and sign this too.

You could also never make it about people – just make it about your company.

#2 Make it all about the company, and never about the project

People are not going to show up, do free work for you to make your company big, huge and yourself rich.

People are self serving. They see software they want only a few patches away, they see software that serves their company only a few patches away. They see software that is an excellent starting point for something totally different.

I’m not sure why this is down at number three… it’s possibly the biggest one for danger signs that you’re going to destroy something that doesn’t even yet exist…

#3 Open Core

This pretty much automatically means that you’re not going to accept certain patches for reasons of increasing your own company’s short term profit. i.e. software is no longer judged on technical merits, but rather political ones.

There is enough politics in free software as it is, creating more is not a feature.

So when people ask me about how I think the OpenStack launch went, I really want people to know how amazing it can be to just not fuck it up to begin with. Initial damage is very, very hard to ever undo. The number of Open Source software projects originally coming out of a company that are long running, have a wide variety of contributors and survive the original company are much smaller than you think.

PostgreSQL has survived many companies coming and going around it, and is stronger than ever. MySQL only has a developer community around it almost in spite of the companies that have shepherded the project. With Drizzle I think we’ve been doing okay – I think we need to work on some things, but they’re more generic to teams of people working on software in general rather than anything to do with a company.


PlanetMySQL Voting: Vote UP / Vote DOWN

A tale of a bug…

Июль 22nd, 2010

So I sometimes get asked if we funnel back bug reports or patches back to MySQL from Drizzle. Also, MariaDB adds some interest here as they are a lot closer (and indeed compatible with) to MySQL. With Drizzle, we have deviated really quite heavily from the MySQL codebase. There are still some common areas, but they’re getting rarer (especially to just directly apply a patch).

Back in June 2009, while working on Drizzle at Sun, I found a bug that I knew would affect both. The patch would even directly apply (well… close, but I made one anyway).

So the typical process of me filing a MySQL bug these days is:

  • Stewart files bug
  • In the next window of Sveta being awake, it’s verified.

This happened within a really short time.

Unfortunately, what happens next isn’t nearly as awesome.

Namely, nothing. For a year.

So a year later, I filed it in launchpad for MariaDB.

So, MariaDB is gearing up for a release, it’s a relatively low priority bug (but it does have a working, correct and obvious patch), within 2 months, Monty applied it and improved the error checking around it.

So MariaDB bug 588599 is Fix Committed (June 2nd 2010 – July 20th 2010), MySQL Bug 45377 is still Verified (July 20th 2009 – ….).

(and yes, this tends to be a general pattern I find)

But Mark says he gets things through… so yay for him.2


PlanetMySQL Voting: Vote UP / Vote DOWN