Archive for the ‘HA’ Category

MySQL HA with DRDB and Heartbeat on CentOS 5.5

Июль 21st, 2010

This is one of a few MySQL High Availability strategies.  I have used this for years and found it work great.  If you don’t know about DRBD and MySQL you should read Peter’s comments.

These are step by step instructions for Redhat 5 or CentOS.

If you need more details please refer to:
http://www.drbd.org/users-guide/

Configuring MySQL for DRBD
http://dev.mysql.com/doc/refman/5.1/en/ha-drbd-install-mysql.html

Getting started:

The OS in this example is CentOS 5.5.  I added a new disk (/dev/sde) to the four disk RAID-5 and RAID-1 I was already using.   I’m only creating an 8 gig disk (vmware). You should start with a partition (LVM and or RAID) partition big enough for your data.

# uname -a
Linux db1.grennan.com 2.6.18-194.8.1.el5 #1 SMP Thu Jul 1 19:04:48 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

# df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/md1              24065660   2826564  19996896  13% /
/dev/md0                101018     20988     74814  22% /boot
tmpfs                   513476         0    513476   0% /dev/shm

# fdisk -l /dev/sde

Disk /dev/sde: 8589 MB, 8589934592 bytes
255 heads, 63 sectors/track, 1044 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot      Start         End      Blocks   Id  System
/dev/sde1               1        1044     8385898+  83  Linux

DRBD:

Installation:
On machine1 and machine2 install DRBD and its kernel module.  You may need to review the packages you have available using ‘yum list | grep drbd’.  These are for CentOS 5.5.  You may also need to reboot after this step.

 # yum -y install drbd
 # yum –y install kmod-drbd82.x86_64
 # modprobe drbd  

Configuration:
On both machines edit this configuration file.  I have highlighted parts you will need to edit in red.

# vi /etc/drbd.conf
#
# please have a a look at the example configuration file in
# /usr/share/doc/drbd82/drbd.conf
#
# Our MySQL share
resource db
{
 protocol C;

 startup { wfc-timeout 0; degr-wfc-timeout 120; }
 disk { on-io-error detach; } # or panic, ...
 syncer {
 rate 6M;
 }

 on db1.grennan.com {
 device /dev/drbd1;}
 disk /dev/sde1;
 address 192.168.2.13:7789;
 meta-disk internal;
 }

 on db2.grennan.com {
 device /dev/drbd1;
 disk /dev/sde1;
 address 192.168.2.14:7789;
 meta-disk internal;
 }
}

Manage DRDB processes:

On both machines run

 # drbdadm adjust db

On machine1

 # drbdsetup /dev/drbd1 primary –o
 # service drbd start 

On machine2

 # service drbd start

On both machines(see status):

 # service drbd status

On machine1

# mkfs -j /dev/drbd1
# tune2fs -c -1 -i 0 /dev/drbd1
# mkdir /data
# mount -o rw /dev/drbd1 /data

On machine2

# mkdir /data

Test failover:
This is how you perform a manual fail over. You will use HA to do this for you in the next sections.

On primary (server1)

# umount /data
# drbdadm secondary db

On secondary (server2)

# drbdadm primary db
# service drbd status
# mount -o rw /dev/drbd1 /data
# df

Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/md1              24065660   1898696  20924764   9% /
/dev/md0                101018     14886     80916  16% /boot
tmpfs                   513472         0    513472   0% /dev/shm
/dev/drbd1             8253948    149628   7685040   2% /data


Note we never formatted (mkfs) the disk on machine2! Here it is, ready to go, DRDB has copied all the data.

MySQL:

Here are a few notes for you to think about.

  • The default location for MySQL data is /var/lib/mysql.  You will be moving this to /data/mysql.
  • MySQL configuration is in /etc/my.cnf.  So that changes to the configuration move with failover, you should put my.cnf in /data/mysql and create a sym-link of /etc/my.cnf to this file.

Now comes the hurdle.

  • Install MySQL as you wish.
  • Move your data directory to a /data/mysql

On machine1

# mkdir /data/mysql
# chown  mysql.mysql /data/mysql
# cp –prv /var/lib/mysql/* /data/mysql

Start MySQL on machine1.
Create some sample database and table. Stop MySQL. Do a manual switchover of DRBD. Start MySQL on machine2 and query for that table. It should work. But, this is of no use if you have to switchover manually every time. When you have this working you are ready to move to Heartbeat.

Here are a couple of scripts to make this easy.

drdb-secondary

# service mysql stop
# umount /data
# drbdadm secondary db
# drdb-primary:
# drbdadm primary db
# mount -o rw /dev/drbd1 /data
# service mysql start


HA:

  • IMPORTANT: Heartbeat uses either Linux Services (LSB) Resource Agents or Heartbeat Resource Agents (HRA) to start and stop heartbeat resources. You will be adding MySQL (LSB), drbddisk (HRA) and IPaddr2 (HRA) are our heartbeat resources.
  • Refer this page on Resource Agent
  • As you are aware of it many *nix services are started using LSB Resource Agents. They are found in /etc/init.d

Installation:

On machine1 and machine2 install Heartbeat and needed utilities.  You may need to review the packages you have available using ‘yum list | grep drbd’.  These are for CentOS 5.5.  You may also need to reboot after this step.

# yum -y install gnutls*
# yum -y install ipvsadm*
# yum -y install heartbeat*
# yum -y install heartbeat.x86_64

Configuration:

Edit /etc/sysctl.conf and set net.ipv4.ip_forward = 1

# vi /etc/sysctl.conf

Controls IP packet forwarding net.ipv4.ip_forward = 1

# /sbin/chkconfig –level 2345 heartbeat on

# /sbin/chkconfig –del ldirectord


Configure HA:

You need to setup the following configuration files on both machines:

# vi /etc/ha.d/ha.cf

#/etc/ha.d/ha.cf content
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 30
warntime 10
initdead 120
udpport 694 # If you have multiple HA setup in same network.. use different ports
bcast eth0 # Linux
auto_failback on # This will failback to machine1 after it comes back
ping 192.168.2.1 # The gateway
apiauth ipfail gid=haclient uid=hacluster
node db1.grennan.com
node db2.grennan.com  

On both machines

NOTE: Assuming 192.168.2.15 is virtual IP for your MySQL resource and mysqld is the LSB resource agent. The host name (db2) should be the secondary server’s name.
# vi /etc/ha.d haresources

# /etc/ha.d/haresources content
db2.grennan.com LVSSyncDaemonSwap::master Paddr2::192.168.2.15/24/eth0  rbddisk::db Filesystem::/dev/drbd1::/data::ext3 mysqld

# vi /etc/ha.d/authkeys

#/etc/ha.d/authkeys content
auth 2
2 sha1 BigSecretKeyks9wjwlf9gskg905snvl

Now, make your authkeys secure:

# chmod 600 /etc/ha.d/authkeys


Check your work:

On both machines, one at a time, stop MySQL and make sure MySQL does not start when the system reboots (init 6).

If it does, you may need to remove it from the init process with:

# /sbin/chkconfig –level 2345 MySQL off

Start Heartbeat.

# service heartbeat start

These commands will give you status about this LVS setup:
# /etc/ha.d/resource.d/LVSSyncDaemonSwap master status
# ip addr sh
# service heartbeat status
# df
# service mysqld status

Access your HA-MySQL server like:
# mysql –h 192.168.2.15

Shutdown machine1 to see MySQL up on machine2. ‘shutdown now’

Start machine1 to see MySQL back on machine1.


PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL HA , an alternative approach

Апрель 29th, 2010

For those who've seen my presentation on MySQL HA, you already know that I often use a multimaster setup with a meta OCF resource that groups my favoured MySQL instance with the service ip , using a meta resource means that pacemaker monitors mysql, but it doesn't actually manage it. It's an approach that works for us.

One of the other approaches I will be looking at soon is the freshly released OCF resource that Florian announced last week.

Back in the days our approach meant we didn't have to use clone resources, which you might remember being pretty buggy in the v2 era, not wanting to use clons resources isn't really a valid reason anymore these days . I've also frequently mentioned the combination of using DRBD and MultiMaster replication, using this set of OCF resource makes that a lot more easy ..

Now all I need to do is find me some time to validate this setup.

Technorati Tags:Technorati Tags:

Trackback URL for this post:

http://www.krisbuytaert.be/blog/trackback/1001

PlanetMySQL Voting: Vote UP / Vote DOWN

Linux Open Administration Days 2010

Апрель 20th, 2010

So about 4 monts ago there was the crazy idea to start a new FOSS event in Belgium targeted at sysadmins.

What started out as an event for local people to meet local people with some local speakers actually ended up being a small local event with some top international speakers on onfiguration mananagement and system administration mixed with a bunch of good local ones !

I had the honour to open the conference with an extremely short version of the Devops talk I gave earlier last year.. extremely short as I knew that over the course of the weekend the topic would reoccur a lot.

We had the first european talk on Chef, by Joshua Timberman, and we had Puppet talks amongst by Dan Bode from Puppetlabs and CFengine talks , devops was a frequently dropped word,

We had a book raffle where we handed out O'Reilly's .. we had a great free pizza party (got the idea from the saturday pizza event at LCA 2005) , and we had some free beer. Sounds like a good combination for a geeky weekend.

Apart from the regular talks there were plenty of Open Spaces where interesting topics were discussed ... we had spaces on Open Source vs Open Core , strong voices were heard when we discussed what we should do with the Open Core companies that claim to value Open Source , some people think we should actually list the fauxpensource ones somewhere and make sure the world knows about them

We had an awesome configuration management discussion session discussing Chef vs Puppet vs CFengine . And much much more ...

Some people owe me plenty of Sushi as I had to do my MySQL HA talk before their Managing MySQL talk , but other than that .. things just went fine..

Trackback URL for this post:

http://www.krisbuytaert.be/blog/trackback/998

PlanetMySQL Voting: Vote UP / Vote DOWN

UKUUG Spring Conference 2010

Апрель 7th, 2010

Last week I was in Manchester for the 2010 UKUUG Spring Conference, right .. make that 2 weeks ago , :)

The UKUUG usually hosts the more interesting conferences around ... , it's not just the schedule that attrackts me , yes there's the strong focus towards Larger Scale Unix (and mostly Linux) deployments and how to manage them, but there's also the opportunity to chat in real life with the Devops from across the chunnel.

Spending time with R.I.Pienaar, Julian Simpson, Simon Wilkinson , Alex Davies , Simon Riggs , Josette, and many others is always fun .

As I was in town early I went to the preconference beer meetup and met with a lot of people and chatted about config management, virtualization and lots of other stuff ... after the pub the plan was to go for curries nearby .. and while walking to the , ahem Bus stop, I managed to recognise Ben Martin from meeting him back ages ago in Hamburg for LinuxKongress , always fun ..

Apart from having to jump on a bus and our group being split at the curry place , rather than being able to tell the latecomers where to walk to and being seeted upstairs with the whole group , the curries were interesting and fun.

As I had been pushing Simon Wardley on Twitter to submit a talk for the conference it was really great to finally see him present .. His talk was the perfect soft introduction to the conference ...

Simon's talk was followed by a talk on Security for the virtual datacenters, after I questionned the speaker if anyone actualy uses TPM outside an academic lab the talk suddenly changed into a commercial presentation for a Quack, nuff said.

The Ever energetic Matt S Trout talked about 21st century perl before Simon "Life is to short for SELinux" Wilkinson talked about his experiences in getting the openAFS crowd on Git.

Bummer Thierry Carrez didn't show us the real juice of UEC and just the installations of a Cloud Controller and a Node Controller , but he managed to do so in approx 30 minutes as promised .

A talk titled Coherent and Integrated Configuration of Virtual Infrastructures always cathces my eye.. however when that talk turns out to be a Coherent and Integrated configuration only within the Univerity of Edinborough (aka lcfg2) talk I`m dissapointed, specially since it pretty much didn't introduce any new concepts from the ones I introduced back in my Durham UKUUG presentation

Luckily Andrew Stribblehill gave a very interesting talk on MySQL scalability, in which I promised him some answers to his questions for the next day :)

The Conference dinner was without a doubt the best UKUUG dinner so far , no typical english "food", no weird location (Old Trafford, an abandoned warship) , but just a big chinese place and plenty of food !

I started thurday morning in the wrong track, I assumed to be in the Virtualization track, but I ended up in the Sun thinclient and Abusing Linux to serve weird desktops under the Green computing umbrella track, not my favourites ..

When Patrick and Julian started their Hudson hit my Puppet with a Cucumber talk (which featured some aweseom #devops content) I was a afraid that we'd had to look for a replacment PostgreSQL talk as Simon hadn't arrived yet .. Luckily he arrived in time for his presentation and he explained us about the new replication features that are slowly making it into PostgreSQL, one way ... log shipping ... not really up to par with other alternatives yet :(

So with no further ado .. here's the presentation I gave

PS. If at a Ukuug event and not sure about a person's name ... try Simon.. pretty good chance you're correct :)

Trackback URL for this post:

http://www.krisbuytaert.be/blog/trackback/997

PlanetMySQL Voting: Vote UP / Vote DOWN

Solutions Linux / Open Source 2010

Март 23rd, 2010
Last week Solutions Linux / Open Source event was held in Paris.

Kuassi MENSAH (Head of Product Management Database Technologies, Oracle Corporation) presented the open source Oracle strategy. Linux, MySQL, virtualization, GlassFish, Eclipse, dynamic scripting languages ,... etc . It was well received by the audience. Knowing that MySQL organization will be kept safe in Oracle is perceived as a nice move.

Florian Haas(LINBIT) gave a tutorial on DRBD and did some demos with NFS and video streaming. And of course he reminded people that now since Linux 2.6.33, DRBD is officially integrated into the Linux kernel source. DRBD making the push for mainline Linux kernel is going to make HA easier.

...


PlanetMySQL Voting: Vote UP / Vote DOWN

Solutions Linux / Open Source 2010

Март 23rd, 2010
Last week Solutions Linux / Open Source event was held in Paris.

Kuassi MENSAH (Head of Product Management Database Technologies, Oracle Corporation) presented the open source Oracle strategy. Linux, MySQL, virtualization, GlassFish, Eclipse, dynamic scripting languages ,... etc . It was well received by the audience. Knowing that MySQL organization will be kept safe in Oracle is perceived as a nice move.

Florian Haas(LINBIT) gave a tutorial on DRBD and did some demos with NFS and video streaming. And of course he reminded people that now since Linux 2.6.33, DRBD is officially integrated into the Linux kernel source. DRBD making the push for mainline Linux kernel is going to make HA easier.

...


PlanetMySQL Voting: Vote UP / Vote DOWN

Solutions Linux / Open Source 2010

Март 23rd, 2010
Last week Solutions Linux / Open Source event was held in Paris.

Kuassi MENSAH (Head of Product Management Database Technologies, Oracle Corporation) presented the open source Oracle strategy. Linux, MySQL, virtualization, GlassFish, Eclipse, dynamic scripting languages ,... etc . It was well received by the audience. Knowing that MySQL organization will be kept safe in Oracle is perceived as a nice move.

Florian Haas(LINBIT) gave a tutorial on DRBD and did some demos with NFS and video streaming. And of course he reminded people that now since Linux 2.6.33, DRBD is officially integrated into the Linux kernel source. DRBD making the push for mainline Linux kernel is going to make HA easier.

...


PlanetMySQL Voting: Vote UP / Vote DOWN

PBXT Engine Level replication, works!

Март 17th, 2010
I have been talking about this for a while, now at last I have found the time to get started! Below is a picture from my 2008 MySQL User Conference presentation. It illustrates how engine level replication works, and also shows how this can be ramped up to provide a multi-master HA setup.


What I now have running is the first phase: asynchronous replication, in a master/slave configuration. The way it works is simple. For every slave in the configuration the master PBXT engine starts a thread which reads the transaction log, and transfers modifications to a thread which applies the changes to PBXT tables on the slave.

Where to get it

I have pushed the changes that do this trick to PBXT 2.0 on Launchpad. The branch to try out is lp:pbxt/2.0.

Getting started

Setup of the replication is dead easy. Assuming you already have a PBXT database, what you need to do is the following:

1. Copy the Master data: Shutdown the MySQL server and make a complete copy of the data directory.

2. Setup a Slave server: Setup a second MySQL server using the copy of the data directory.

3. Declare the Slave: Create a text file called slaves, in the data/pbxt directory of the master server, with the following entry:
[slave]
name=slave-thread-name
host=host-name-of-slave
port=37656
slave-process-name is any name you like, and is used to identify the replication thread running on the master. host-name-of-slave is the host name or IP address of the slave MySQL server. 37656 is the default port used by the PBXT slave engine to receive replication changes.

4. Enable replication: On the master server set pbxt_enable_replication=1, and on the slave server set pbxt_enable_replication=2. Also make sure that both servers have different server IDs (system parameter: server_id).

5. Start both servers: Replication will begin immediately if the slave server is started before master server, otherwise replication will begin after a minute (see below).

How it works


PBXT engine level replication, unlike MySQL replication, pushes changes to the slave. For every entry in the data/pbxt/slaves file, PBXT starts a thread (the supplier thread). The thread connects to the slave on the given address, and pushes the changes to an applier thread run by the PBXT engine on the slave side. If any error occurs, the supplier thread on the master will pause, and then try again in a minute.

On connect the supplier thread requests the global transaction ID (GID) of the last transaction committed on the slave. The applier determines the GID of the last transaction by searching backwards through its own transaction logs.

Replication is row-based, and fairly low level. Changes refer to the PBXT internal row and table IDs. The row data is transferred in the same format used to store the information on disk. This makes the replication extremely efficient. The supplier thread does not even have to read the log from disk if it is fairly up-to-date, because PBXT already caches the last changes to the transaction log for use by the writer and the sweeper threads.

Probably the most important thing about this type of replication is that it (theoretically) has almost no affect on the "foreground" activity on the master machine. I am interested to find out if this really is the case.

What's next?

Replication of DDL changes are not implemented yet. So if you do ALTER TABLE or any other such operation, replication will stop, and have to be restarted by copying over the data directory to the slave again.

After DDL changes the next step is to add synchronous replication, as illustrated above. This requires waiting for a commit from the slave before continuing. Latency in this case can be kept to a minimum by sending transactions to the slave before they have been committed on the master.

I believe this would then provide the basis for an extremely simple (and efficient) HA solution based on MySQL.

PlanetMySQL Voting: Vote UP / Vote DOWN

Nines , Damn Nines and More Nines

Октябрь 20th, 2009

Funny how different experiences lead to different evaluations of tools. The MySQL HA solutions the MySQL Performanceblog list, are almost listed in the complete opposited order of what my impressions are.

Ok agreed, I should probably not put my MySQL NDB experiences from 2-3 years ago with multiple Query of deaths and more problems than you into account anymore , but back then went in the list Less stable than a single node. I've had NDB POC setups going down for much more than 05:16 minutes
Ndb comes with a lot of restrictions, there are

As for MySQL on DRBD, I've said this before , I love DRBD, but having to wait for a long InnoDB recovery after a failover just kills your uptime ,
I remember being called by a customer during Fred last holiday who was waiting over 20 minutes for recovery , twice, so putting the DRBD/San setup second would not be my preference. But agreed .. it's only listed at 99.9% meaning almost 9 hours of downtime per year are allowed.

On the other hand we've seen database uptime of MySQL MultiMaster setups with Heartbeat reaching better figures than 99.99% Heck I've seen single nodes achieve better than 99.99% :)

So what does this teach us ... there is no golden rule for HA, lots of situations are different, it's the preferences of the customer, the size of the database, the kind of application , and much
more .. you always need to think and evaluate the environment ...

Technorati Tags:Technorati Tags:

Trackback URL for this post:

http://www.krisbuytaert.be/blog/trackback/951

PlanetMySQL Voting: Vote UP / Vote DOWN

Spider and vertical partition engines with new goodies

Октябрь 15th, 2009

sharding for the masses

The Spider storage engine should be already known to the community. Its version 2.5 has recently been released, with new features, the most important of which is that you can execute remote SQL statements in the backend servers. The method is quite simple. Together with Spider, you also get an UDF that executes SQL code in a remote server. You send a query with parameters saying how to connect to the server, and check the result (1 for success, 0 for failure). If the SQL involves a SELECT, the result can be sent to a temporary table. Simple and effective.

In addition to the Spider engine, Kentoku SHIBA has also created the vertical partitioning engine. Instead of splitting tables by record, you split them by columns. You can define a table with column A and column B, with primary key K, and another table with column C and column D, with primary key K. The vertical partition engine allows you to define a table with columns K, A, B, C, D, which looks to the user like a regular column. The backend tables can be of any engine.
There is a MySQL University session about the Spider and VP engines today at 15:00 CEST. Free attendance!

PlanetMySQL Voting: Vote UP / Vote DOWN