Archive for the ‘mmm’ Category

MySQL University session Oct 22: Dual Master Setups With MMM

Октябрь 22nd, 2009

This Thursday (October 22nd, 13:00 UTC), Walter Heck (of Open Query) will present Dual Master Setups With MMM. MMM (Multi-Master Replication Manager for MySQL) is a set of flexible scripts to perform monitoring/failover and management of MySQL master-master replication configurations (with only one node writable at any time). Session slides (PDF).

The toolset also has the ability to read balance standard master/slave configurations with any number of slaves, so you can use it to move virtual IP addresses around a group of servers depending on whether they are behind in replication. For more
information, see mysql-mmm.org.

For MySQL University sessions you point your browser here. You need a browser with a working Flash plugin. You may register for a Dimdim account, but you don’t have to.


PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL University: Dual Master Setups With MMM

Октябрь 21st, 2009

This Thursday (October 22nd, 13:00 UTC), Walter Heck will present Dual Master Setups With MMM. MMM (Multi-Master Replication Manager for MySQL) is a set of flexible scripts to perform monitoring/failover and management of MySQL master-master replication configurations (with only one node writable at any time). The toolset also has the ability to read balance standard master/slave configurations with any number of slaves, so you can use it to move virtual IP addresses around a group of servers depending on whether they are behind in replication. For more information, see http://mysql-mmm.org/.

For MySQL University sessions, point your browser to this page. You need a browser with a working Flash plugin. You may register for a Dimdim account, but you don't have to. (Dimdim is the conferencing system we're using for MySQL University sessions. It provides integrated voice streaming, chat, whiteboard, session recording, and more.) All MySQL University sessions are recorded, that is, slides and voice can be viewed as a Flash movie (.flv). You can find those recordings on the respective MySQL University session pages which are listed on the MySQL University home page.

MySQL University is a free educational online program for engineers/developers. MySQL University sessions are open to anyone, not just Sun employees. Sessions are recorded (slides and audio), so if you can't attend the live session you can look at the recording anytime after the session.

Here's the schedule for the rest of this year:

  • October 29: MySQL scalability on SPARC & INTEL X5500 (Nehalem) (Benoit Chaffanjon)
  • November 12: Gearman for MySQL (Giuseppe Maxia)
  • November 19: memcached Functions for MySQL (UDFs) (Patrick Galbraith)
  • December 3: Practical Full-Text Search in MySQL (Bill Karwin)

The schedule is not engraved in stone at this point. Please visit http://forge.mysql.com/wiki/MySQL_University#Upcoming_Sessions for the up-to-date list. On that page, you can also find the starting time for many time zones.


PlanetMySQL Voting: Vote UP / Vote DOWN

Dogfood: making our systems more resilient

Сентябрь 18th, 2009

This is a “dogfood” type story (see below for explanation of the term)… Open Query has ideas on resilient architecture which it teaches (training) and recommends (consulting, support) to clients and the general public (blog, conferences, user group talks). Like many other businesses, when we first started we set up our infrastructure quickly and on the cheap, and it’s grown since. That’s how things grow naturally, and is as always a trade-off between keeping your business running and developing while also improving infrastructure (business processes and technical).

Quite a few months ago we also started investing (mostly time) in the technical infrastructure, and slowly moving the various systems across to new servers and splitting things up along the way. Around the same time, the main webserver frequently became unresponsive. I’ll spare you the details, we know what the problem was and it was predictable, but since it wasn’t our system there was only so much we could do. However, systems get dependencies over time and thus it was actually quite complicated to move. In fact, apart from our mail, the public website was the last thing we moved, and that was through necessity not desire.

Of course it’s best for a company when their public website works, it’s quite likely you have noticed some glitches in ours over time. Now running on the new infra, I happened to take a quick peek at our Google Analytics data, and noticed an increase in average traffic numbers of about 40%. Great big auch.

And I’m telling this, because I think it’s educational and the world is generally not served by companies keeping problems and mishaps secret. Nasties grow organically and without malicious intent, improvements are a step-wise process, all that… but in the end, the net results of improvements can be more amazing than just general peace of mind! And of course it’s very important to not just see things happen, but to actively work on those incremental improvements, ongoing.

Our new infra has dual master MySQL servers (no surprise there ;-) but based in separate data centres so that makes the setup a bit more complicated (MMM doesn’t deal with that setup). Other “new” components we use are lighttpd, haproxy, and Zimbra (new in the sense that our old external infra used different tech). Most systems (not all, yet) are redundant/expendable and run on a mix of Linode instances and our own machines. Doing these things for your own infra is particularly educational, it provides extra perspective. The result is, I believe, pretty decent. Failures generally won’t cause major disruption any more, if at all. Of course, it’s still work in progress.

Running costs of this “farm”? I’ll tell later, as I think it’s a good topic for a poll and I’m curious: how much do you spend on server infrastructure per month?

Background for non-Anglophones: “eating your own dogfood” refers to a company doing themselves what they’re recommending to their clients and in general. Also known as “leading by example”, but I think it’s also about trust and credibility. On the other hand, there’s the “dentist’s tooth-ache” which refers to the fact that doctors are their own worst patients ;-)


PlanetMySQL Voting: Vote UP / Vote DOWN

Verify master-master[||-slave] data consistency without locking or downtime

Август 31st, 2009

We all knew that we are risking with MMM. Risking, and placing availability as a more important like consistency.  But non of us can risk loosing data forever but we show using it, regarding to our conversations think:  "I can fix my data later on, but I can’t turn back time and prevent the downtime. (Pascal Hoffman@xaprb.com)".

As I wrote before about staying online, now let me write about how to stay consistent.

We all know, mmm is not like a key of salvation, but its getting close to it :) . While MySQL doesn't support multi-master-slave environments from it's source code, we will sleep badly wondering on the safety of our precious databases.

But its not just about MMM, a few days ago we ran in to a well known InnoDB "feature". Its about the auto increment counter determination on restart. InnoDB try to count the next auto increment value on MySQL restart what can screw up things in the replication as in your data integrity too (http://dev.mysql.com/doc/refman/5.1/en/innodb-auto-increment-handling.html) what could be risky when you are about to use mmm, because you can restart your masters whenever you just want to.

In this post I'm using maatkit to verify and restore the rows without locking or downtime.

Maatkit:

This is not a success of MMM, but success of maatkit.
I don't want to write about maatkit. Everybody who is able to use MMM were met with maatkit. Great stuff and if you didn't met with it yet, this is the right time. http://www.maatkit.org

Verification between your masters:

This is the hardest part, you CAN'T and/or you SHOULD NOT to avoid manual overview.

I'm using mk-table-sync for this scenerio. As the about say: mk-table-sync finds and fixes data differences between MySQL tables.
This is what we are going to do, but lets consider the possibilities. If we goes trough the help menu, we can see there is a lots of options. Worth going trough of them.

Here is the command I used to execute.

mk-table-sync --chunksize 100 --databases mytestdb --lock 1 --transaction --skipbinlog --print  db02 db01

And here is the explain:

--chunksize 100 : Chunksize means the size of the chunks maatkit compare to each other. Mean rows. In this command maatkit compare the tables by chunks which are contain 100 rows. Its an affordable size, Its wont be faster and as I benchmarked neider become slower, but as you will see later, its start a transaction and lock the table. More about locking and transactions later but to understand why I recommend a small chunksize, you have to know that this means that the locking of the table is it in will be a short lock.

--database mytestdb: Allows you to give a comma-separated list of databases should be taken care of. In this example I use only one database but as I mentioned you can pass a list of databases you want to scan or what you don't want to scan.

--lock 1: As I mentioned before, maatkit provide you two kind of "locking". The first one is a real table lock and the second one is the way of using transactions. With innodb tables using transaction is a good choice, more about later, but with MyIsam tables, I think the best way is lock 1 regarding to what I want to do here, syncing online masters to each other. mk-table-sync provide 3 different ways of locking here: "0=none, 1=per sync cycle, 2=per table, or 3=globally" as far as I know, 0 means no locks, thats not a good way if you are about to sync online masters under traffic. Locking each table could be a wrong decision too, in case of a larger table, and globally is the same under usage. So this is why I just picked lock 1 what will lock while it's take care of my chunk (this is why I lowered the chunk size to 100).

--transaction: This is what i was talking about above. Instead of locking, mk-table-sync able to use transactions . Starting a transaction with a larger chunk size could cause lock-wait-timeout messages as locking can. This is why there is a small chunk size again. You have to listen to your isolation level too. As you know, its could be unsafe under read-committed.

--skipbinlog: This is the most important part of it. In a multi-master environment syncing without this option is like a suicide. With this, mk-table-sync will execute the fixing queries without logging in to binlog. I should not to mention, that logging to binlog on the slave could really mess up your databases as the other master will replicate and execute it.

--print: mk-table-sync provide three different (and some of them able to combine) ways to run. First I would recommend is --print. Print will just print to the stdout (what you can forward to file with 1>/your/file/path) but when you redirect the stdout, don't forget when using with --ask-pass that this message won't show up for your :) . You can use --execute instead. That will execute the queries immediately. I'm always scared by that a little bit so I rather get the output, go over it and execute what I think I should.

db02/db01: In my example these two hosts are the writable master (at first) and the passive master as a second one.You can use whatever hosts you want. Commands will be executed on the second host.

What to do with your output: When its done and its find differences, you can find where ever your stdout goes. In thet file or on your screen you can see the queries what mk-table-sync would execute to make your passive master consistent. Now its up to you what to do with this information. You can execute each of the or re-run the sync with --execute instead of --print.

Syncing slaves

If you done with your masters, you can sync the slaves on the same way, but you can use the already consistent passive master of your to sync from. Both maatkit and mmm provide some very useful scripts to rebuild slaves and you always got a passive master to do from. With this you are free to rebuild your cluster whenever you want.


PlanetMySQL Voting: Vote UP / Vote DOWN

Failure scenarios and solutions in master-master replication

Август 31st, 2009

I’ve been thinking recently about the failure scenarios of MySQL replication clusters, such as master-master pairs or master-master-with-slaves. There are a few tools that are designed to help manage failover and load balancing in such clusters, by moving virtual IP addresses around. The ones I’m familiar with don’t always do the right thing when an irregularity is detected. I’ve been debating what the best way to do replication clustering with automatic failover really is.

I’d like to hear your thoughts on the following question: what types of scenarios require what kind of response from such a tool?

I can think of a number of failures. Let me give just a few simple examples in a master-master pair:

Problem: Query overload on the writable master makes mysqld unresponsive
Do nothing. Moving the queries to another server will cause cascading failures.
Problem: The writable master is completely unreachable
Fence the writable master and promote the standby master.
Problem: The writable master is reachable but unresponsive due to overload-induced swapping
Do nothing. Moving the load to another server will cause cascading failures.

I don’t want to bias the jury, so I’ll stop there and ask you to contribute your failure scenarios and what you think the correct action should be.

Related posts:

  1. MySQL replication breaks single-threaded limitation? It’s
  2. How to check MySQL replication integrity continually I have rec
  3. Pop quiz: how can one slave break another slave Suppose yo

Related posts brought to you by Yet Another Related Posts Plugin.


PlanetMySQL Voting: Vote UP / Vote DOWN

Some use-cases for MMM for MySQL

Август 28th, 2009

In this blog, I would like to mark myself as a power-user of MMM.  For more then 3 years now I'm engineering high-traffic websites and none of them was a small project. In the beginning I worked behind the livejasmin.com project (warning: NSFW!) which is an Alexa top100 site and now I'm working at ustream.tv which is in the top500 (so far). In this area, your greatest enemy is downtime. Thousands of users want to see your dynamic content by executing thousand of queries against your MySQL servers and what they don't have is the patience and they don't even tolerate critical security-upgrades. In both sites I mentioned above there is no point in time during the day at which less then 10k visitors are hitting the sites.  Now one question comes up: How can you do an upgrade on your MySQL servers or how can you alter a larger table?
So you have a new feature and that means another index on a table or you just want to upgrade your application, there's no chance that your downtime will be hidden. We all know, even if you run your site with no downtime for months, 1 minute can make your bosses forget about that and ask you THE question: "What has happened?". If you would say: "There was no problem, I just took down the site to upgrade MySQL", you won't be the employee of the month.

So let's see  how can MMM help:

Modifications in your database, backup or maintenance:

For a lot of people these words could mean different things, so what I would like to mention is how MMM can help with that.

Just imagine a typical issue: a new column should be added to a large table.

If you have DRBD, NDB or actually nothing at all, this will make you cry if that table is large enough. This could mean different size for everyone, but lets say 25GB with 1,5 million rows. You can see each row size is not small at all. Adding a column even with a pretty fast, but standalone (not clustered like ndb or different sharding techniques) this could take up 4-5 minutes. Taking your site down for minutes is just not a possibility.

What would you or your developers do to avoid this scenario? A few times, I have seen them create a new table and later execute joins or multiple queries on that. That's really not good for us. (here you can find more answers on performance impacts). So lets say you have a chance to alter the table without downtime. MMM can do it for you.

All you have to do is:

  • Define the new column (at least in the beginning) with a possible NULL (but that isn't really good for us too) or with a default value to make it possible that your table will work with the current application and replication together without failed replication or failed queries. What I mean is: let's say you have a table with the following columns: id int(8) auto_increment, sometext varchar(32) NOT NULL and you want to add a column to the end say insertdate datetime default NOW() at the end.

So lets see our possibilities: At first, if the developers would handle it, you would have another table, lets say date_of_join which would contain the same id that you can find in the table above and every time when they want to insert in to the first table, they would write the first and our new table too. And when they want to select something that would mean at least two queries which is really not so good for us. So the first thing you have to take care is: Always made changes what would work perfectly with your current architecture/application.

  • Start executing alter command on each server of yours one-by-one (except your current active master) and on those servers which serve as a master (too), use sql_log_bin = 0 for this query! And this is important.  With this option, you can alter each server and finally move the writer role with MMM to another server and alter the last one too.

I already went trough this for a few times. Pretty smooth.

Upgrade:

Another problem with MySQL that is its a single-point of failure. In the most cases (except ndb) you cant  afford to loose your master. So every single modification will cost a lot. You can't upgrade. Now this problem is solved with MMM. All you have to take care is the correct, planned and exact ORDER of upgrading.

What I mean is you have to start upgrading from the bottom. First upgrade your slaves and when they done, upgrade the passive master and finally, move the writer role with MMM by the command set_offline. And set_offline is (could be) very important. Because if you move the writer rule to your passive, but upgraded master, maybe its execute queries in a different way what your lower version MySQL don't understand. Moving roles with set_offline will stop replication and you have time to upgrade safely. But I have to mention that you are risking downtime again but on a different way. Moving to a newer version (5.0>5.1) could be VERY risky without proper testing. Take care about that.

I hope I gave you some good advices.


PlanetMySQL Voting: Vote UP / Vote DOWN

Welcome to the MMM Community Blog

Август 26th, 2009

This is the first post on the brand spanking new MMM for MySQL Community Blog. Expect posts on new features for this great project. Expect user experiences.

So, your question is of course: what is MMM for MySQL and why haven't I heard of this before?

Well, MMM for MySQL is a project that strives to provide a HA Solution for MySQL, with automatic failover between multiple masters and slaves. If set up properly, you don't have to worry anymore over downtime for upgrading to a new version of MySQL, or a new version of your application (=> large alter tables, addition of indexes, etc.)

The project has recently reached it's version 2, which is a complete rewrite from scratch. We are now slowly seeing this being put into production, with some success stories making us happy already.

I won't make it too long here, but look forward to hear more from us!


PlanetMySQL Voting: Vote UP / Vote DOWN

Getting ready for FrOScon 2009

Август 20th, 2009

I arrived yesterday in St. Augustin, near Bonn in Germany. After a good day of hitchhiking (weather is beautiful here) I stayed with my Pakistani Couchsurfing host and we had an extremely interesting evening talking about the gigantic cultural differences between western civilization and Pakistani civilization. It beats staying in a hotel by about a million points :)

This morning I headed to the FrOScon HQ at the fachhochschule to help out with whatever was needed. Turns out that was a bit premature (misunderstanding on my part), so I have had some time to catch up on mail and give some more attention to my talk on Saturday. I’ll be helping out throughout the and the whole day tomorrow with things though.

I’ll be talking about MySQL MMM, a project that I have invested quite a bit of time in getting to know. My talk will outline what MMM is, what it’s not and an example of our setup at Open Query. It’s a full hour long, so it should be very interesting to be able to go into that much detail.

If you are near St. Augustin, make sure to come by for Froscon, as it’s schedule has some very interesting talks and you’ll also have a good chance to meet fellow MySQL-geeks in the OpenSQLCamp dev-room.


PlanetMySQL Voting: Vote UP / Vote DOWN