Archive for the ‘archive’ Category

Setting up slave, stripping indexes and changing engines, on the fly

Октябрь 7th, 2010

Warning, the following is quite ugly, but does the job :)

A while back I needed to create an archive slave database from a half a terabyte myisam master and had space restrictions. I could not dump the db, load it, then drop keys (archive doesn’t support keys apart from a primary key on one column as of 5.1), alter engine etc (would take even longer than it took either way). So an ugly single liner came to mind and worked nicely too.

mysqldump -uuser -ppassword -h127.0.0.1 -P3306 dbname --master-data=1 | sed 's/ENGINE=MyISAM/ENGINE=archive/g' | grep -v '^ UNIQUE KEY' | grep -v '^ KEY' | perl -p0777i -e 's/,\n^\)/\n\)/mg' | mysql -uuser -ppassword -h127.0.0.1 -P3307 dbname

So what is it doing?
Broken down:
mysqldump -uuser -ppassword -h127.0.0.1 -P3306 dbname --master-data=1 –> extract the database with the master position
sed 's/ENGINE=MyISAM/ENGINE=archive/g' –> change myisam with archive so as to spare an alter table later
grep -v '^ UNIQUE KEY' –> remove unique keys
grep -v '^ KEY' –> remove other indexes
perl -p0777i -e 's/,\n^\)/\n\)/mg' –> after removing the indexes, any commas before the closing bracket need to be removed
mysql -uuser -ppassword -h127.0.0.1 -P3307 dbname –> inserting it directly into the slave

Let me know if you have any other ideas, again, the above is definitely not the neatest way, but it does work.

Because Sharing Is Caring


PlanetMySQL Voting: Vote UP / Vote DOWN

A tale of a bug…

Июль 22nd, 2010

So I sometimes get asked if we funnel back bug reports or patches back to MySQL from Drizzle. Also, MariaDB adds some interest here as they are a lot closer (and indeed compatible with) to MySQL. With Drizzle, we have deviated really quite heavily from the MySQL codebase. There are still some common areas, but they’re getting rarer (especially to just directly apply a patch).

Back in June 2009, while working on Drizzle at Sun, I found a bug that I knew would affect both. The patch would even directly apply (well… close, but I made one anyway).

So the typical process of me filing a MySQL bug these days is:

  • Stewart files bug
  • In the next window of Sveta being awake, it’s verified.

This happened within a really short time.

Unfortunately, what happens next isn’t nearly as awesome.

Namely, nothing. For a year.

So a year later, I filed it in launchpad for MariaDB.

So, MariaDB is gearing up for a release, it’s a relatively low priority bug (but it does have a working, correct and obvious patch), within 2 months, Monty applied it and improved the error checking around it.

So MariaDB bug 588599 is Fix Committed (June 2nd 2010 – July 20th 2010), MySQL Bug 45377 is still Verified (July 20th 2009 – ….).

(and yes, this tends to be a general pattern I find)

But Mark says he gets things through… so yay for him.2


PlanetMySQL Voting: Vote UP / Vote DOWN

There is more than one way to do it….

Сентябрь 29th, 2009

I spent Friday examining the systems for a more traditional consulting gig (in case you did not know, Pythian has offered more traditional consulting and emergency services in addition to our remote DBA offering for our entire 12 year history). It is a familiar story to many people — the database performance was unacceptable.

The company had a few log tables that had huge amounts of inserts and foreign keys, so they used InnoDB. Unfortunately, they also used a GUID as a primary key (which is getting more and more common these days, but long primary keys with InnoDB slow down INSERT, UPDATE and DELETE commands a great deal) — varchar(32) and utf8.

That’s right — their primary key for many of these tables was 96 bytes long (32 characters * 3 bytes per character), and as an InnoDB table, the primary key is clustered with every other index. One of the more frequently used tables had 8 indexes other than the PRIMARY KEY — 7 were indexes on varchar(32) fields, the other was on an int(11) field (signed integer).

The query load on the server was 99% writes, 1% reads. The log tables only had rows inserted, not deleted or updated. Thus, we offered a few different solutions (in no particular order):

1) Use a BLACKHOLE table on the master and replicate to a slave server. This would make the INSERTs to the master fast, though the slave would lag behind. The slave could be changed to MyISAM or ARCHIVE or some other storage engine.

2) Use MyISAM on the master and slave. This was the solution we went to, because the log tables could be re-created with no data and moved into place, and the loss of foreign keys was not a big deal, since the application checked for them anyway. ARCHIVE could also be used, it is faster than MyISAM on inserts because it only allows one index, the PRIMARY KEY. concurrent_insert was set at the default of 1, but we changed it to 2 just in case there was an errant UPDATE or DELETE causing a gap in the MyISAM table.

3) Add a PRIMARY KEY that is AUTO_INCREMENT, keeping the GUID a UNIQUE KEY. This would speed up the writes without sacrificing any data. This can be combined with dropping unnecessary foreign keys to make the inserts faster.

4) Move the log tables to a different machine entirely. This would require the application to have two different database handlers.

5) Use INSERT DELAYED into the logging tables.

We discussed these scenarios and the pros and cons, and eventually we chose to do #2, which we could do right away, and indeed, we had an immediate positive impact. Almost all the unnacceptable performance went away.

We were also able to audit the system and provide several good recommendations for ways they can improve their system. It is exciting for me to be able to help people out in a very tangible way — often we do not see some of the really crazy problems that affect few people (such as slave lag getting further and further behind) because our remote DBA clients have the benefit of ongoing help from us, and we generally find the smaller degradations in performance before they become a large issue.

Being able to spend a few hours with a new environment and really help them out was extremely fulfilling for me. I love that MySQL has many different options (and there are likely more that I did not think of) that all would work to achieve better performance. I think people like to have a choice, too, so they do not feel they were forced to do something they did not want to do. Especially when there are solutions on both the DBA and developer side.


PlanetMySQL Voting: Vote UP / Vote DOWN