Archive for the ‘HA’ Category

MySQL Cluster training Jan 24 — 26 in DC

Январь 13th, 2012

We still have a few seats left for our MySQL Cluster training in Washington DC January 24 - 26. If you're interested in learning about MySQL Cluster, the architecture, how to install it, administer it and troubleshoot it this is the course for you. The course will also cover replication and optimization and we will also discuss the exciting new features coming in version 7.2 of MySQL Cluster. For more information about the course contents visit http://www.skysql.com/services/training/courses/administering-mysql-cluster and to sign up to the course go to http://www.skysql.com/services/training/schedule/administering-mysql-cluster-1.

See you there!

 


PlanetMySQL Voting: Vote UP / Vote DOWN

What’s new in MySQL 5.6 Replication – free webinar

Декабрь 12th, 2011

Global Transaction IDs - simplifying replication management

There will be a webinar this Wednesday where you can get the latest information on all of the great new content that has been included in the MySQL 5.6 Development Releases as well as some features that are still being developed. As always, the webinar is free but you need to register here in advance. Even if you can’t attend the live event it’s worth registering so that you get sent the replay.

Some of the topics we’ll be discussing are:

  • Enhanced data integrity: Global Transactions Identifiers, Crash-Safe Slaves and Replication Event Checksums;
  • High performance: Multi-Threaded Slaves, Binlog Group Commit and Optimized Row-Based Replication;
  • Improved flexibility: Time Delayed Replication, Multi-Source Replication, Binlog API and Informational Log Events
The event starts on Wednesday, December 14, 2011: 09:00 Pacific time; 17:00 UK; 18:00 CET.

PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL HA Solutions – webinar replay

Декабрь 7th, 2011

If you were unable to attend the live webinar (or you want to go back and listen to it again) then it’s now available to view on-line here.

Databases are the center of today’s web and enterprise applications, storing and protecting an organization’s most valuable assets and supporting business-critical applications. Just minutes of downtime can result in significant lost revenue and dissatisfied customers. Ensuring database highly availability is therefore a top priority for any organization. Tune into this webcast to learn more.

The session discusses:

  1. Causes, effect and impact of downtime
  2. Methodologies to map applications to HA solution
  3. Overview of MySQL HA solutions
  4. Operational best practices to ensure business continuity

PlanetMySQL Voting: Vote UP / Vote DOWN

Replication stars

Ноябрь 4th, 2011
Working with replication, you come across many topologies, some of them sound and established, some of them less so, and some of them still in the realm of the hopeless wishes. I have been working with replication for almost 10 years now, and my wish list grew quite big during this time. In the last 12 months, though, while working at Continuent, some of the topologies that I wanted to work with have moved from the cloud of wishful thinking to the firm land of things that happen. My quest for star replication starts with the most common topology. One master, many slaves.
Replication 1 master slave

Fig 1. Master/Slave topology

Replication legend

Legend

It looks like a star, with the rays extending from the master to the slaves. This is the basis of most of the replication going on mostly everywhere nowadays, and it has few surprises. Setting aside the problems related to failing over and switching between nodes, which I will examine in another post, let's move to another star.
Replication 2 fan in slave

Fig 2. Fan-in slave, or multiple sources

The multiple source replication, also known as fan-in topology, has several masters that replicate to the same slave. For years, this has been forbidden territory for me. But Tungsten Replicator allows you to create multiple source topologies easily. This is kind of uni-directional, though. I am also interested in topologies where I have more than one master, and I can retrieve data from multiple points.
Replication 3 all to all three nodes

Fig 3. all-to-all three nodes

Replication 4 all to all four nodes

Fig 4. All-to-all four nodes

Tungsten Multi-Master Installation solves this problem. It allows me to create topologies where every node replicates to every other node. Looking at the three-node scheme, it appears a straightforward solution. When we add one node, though, we see that the amount of network traffic grows quite a lot. The double sided arrows mean that there is a replication service at each end of the line, and two open data channels. When we move from three nodes to four, we double the replication services and the channels needed to sustain the scheme. For several months, I was content with this. I thought: it is heavy, but it works, and it's way more than what you can do with native replication, especially if you consider that you can have a practical way of preventing conflicts using Shard Filters. But that was not enough. Something kept gnawing at me, and from time to time I experimented with Tungsten Replicator huge flexibility to create new topologies. But the star kept eluding me. Until … Until, guess what? a customer asked for it. The problem suddenly ceased to be a personal whim, and it became a business opportunity. Instead of looking at the issue in the idle way I often think about technology, I went at it with practical determination. What failed when I was experimenting in my free time was that either the pieces did not glue together the way I wanted, or I got an endless loop. Tungsten Replicator has a set of components that are conceptually simple. You deploy a pipeline between two points, open the tap, and data starts flowing in one direction. Even with multiple masters replication, the principle is the same. You deploy many pipes, and each one has one purpose only.
Replication 5 star topology 3 rays

Fig 5. All-masters star topology

In the star topology, however, you need to open more taps, but not too many, as you need to avoid the data looping around. The recipe, as it turned out, is to create a set of bi-directional replication systems, where you enable the central node slave services to get changes only from a specific master, and the slave services on the peripheral nodes to accept changes from any master. It was as simple as that. There are, of course, benefits and drawbacks with a star topology, compared to a all-replicate-to-all design. In the star topology, we create a single point of failure. If the central node fails, replication stops, and the central node needs to be replaced. Instead, the all-to-all design has no weaknesses. Its abundance of connections makes sure that, if a node fails, the system continues working without any intervention. There is no need for fail-over.
Replication 6 all to all extending png

Fig 6. extending an all-to-all topology

Replication 7 star extending

Fig 7. Extending a star topology

However, there is a huge benefit in the node management. If you need to add a new node, it costs two services and two connections, while the same operation in the all-to-all replication costs 8 services and 8 connections. With the implementation of this topology, a new challenge has arisen. While conflict prevention by sharding is still possible, this is not the kind of scenario where you want to apply it. We have another conflict prevention mechanism in mind, and this new topology is a good occasion make it happen. YMMV. I like the additional choice. There are cases where a all-replicate-to-all topology is still the best option, and there are cases where a star topology is more advisable.

PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL Cluster 7.2 (DMR2): NoSQL, Key/Value, Memcached

Октябрь 7th, 2011

70x Higher Performance, Cross Data Center Scalability and New NoSQL Interface

Its been an exciting week for all involved with MySQL Cluster, with the announcement of the second Development Milestone Release (7.2.1) at Oracle Open World. Highlights include:

- Enabling next generation web services: 70x higher complex query performance, native memcached API and integration with the latest MySQL 5.5 server

- Enhancing cross data scalability: new multi-site clustering and enhanced active/active replication

- Simplified provisioning: consolidated user privileges.

You can download the DMR for evaluation now from: http://dev.mysql.com/downloads/cluster/ (select Development Milestone Release tab).

You can also read up on the detail of each of these features in the new article posted at the MySQL Developer Zone. In this blog, I’ll summarize the main parts of the announcement.

70x Higher Performance with Adaptive Query Localization (AQL)

Previewed as part of the first MySQL Cluster DMR, AQL is enabled by a new Index Statistics function that allows the SQL optimizer to build a better execution plan for each query.

As a result, JOIN operations are pushed down to the data nodes where the query executes in parallel on local copies of the data. A merged result set is then sent back to the MySQL Server, significantly enhancing performance by reducing network trips.

Take a look at how this is used by a web-based content management to increase performance by 70x

Adaptive Query Localization enables MySQL Cluster to better serve those use-cases that have the need to run real-time analytics across live data sets, along with high throughput OLTP operations. Examples include recommendations engines and clickstream analysis in web applications, pre-pay billing promotions in mobile telecoms networks or fraud detection in payment systems.

New NoSQL Interface and Schema-less Storage with the memcached API

The memcached interface released as an Early Access project with the first MySQL Cluster DMR is now integrated directly into the MySQL Cluster 7.2.1 trunk, enabling simpler evaluation.

The popularity of Key/Value stores has increased dramatically. With MySQL Cluster and the new memcached API, you have all the benefits of an ACID RDBMS, combined with the performance capabilities of Key/Value store.

By default, every Key / Value is written to the same table with each Key / Value pair stored in a single row – thus allowing schema-less data storage. Alternatively, the developer can define a key-prefix so that each value is linked to a pre-defined column in a specific table.

Of course if the application needs to access the same data through SQL then developers can map key prefixes to existing table columns, enabling Memcached access to schema-structured data already stored in MySQL Cluster.

You can read more about the design goals and implementation of the memcached API for MySQL Cluster here.

Integration with MySQL 5.5

MySQL Cluster 7.2.1 is integrated with MySQL Server 5.5, providing binary compatibility to existing MySQL Server deployments. Users can now fully exploit the latest capabilities of both the InnoDB and MySQL Cluster storage engines within a single application.

Users simply install the new MySQL Cluster binary including the MySQL 5.5 release, restart the server and immediate have access to both InnoDB and MySQL Cluster!

Enhancing Cross Data Center Scalability: Simplified Active / Active Replication

MySQL Cluster has long offered Geographic Replication, distributing clusters to remote data centers to reduce the affects of geographic latency by pushing data closer to the user, as well as providing a capability for disaster recovery.

Geographic replication has always been designed around an Active / Active technology, so if applications are attempting to update the same row on different clusters at the same time, the conflict can be detected and resolved. With the release of MySQL Cluster 7.2.1, implementing Active / Active replication has become a whole lot simpler. Developers no longer need to implement and manage timestamp columns within their applications. Also rollbacks can be made to whole transactions rather than just individual operations.

You can learn more here.

Enhancing Cross Data Center Scalability: Multi-Site Clustering

MySQL Cluster 7.2.1 DMR provides a new option for cross data center scalability – multi-site clustering. For the first time splitting data nodes across data centers is a supported deployment option.

Improvements to MySQL Cluster’s heartbeating mechanism with a new “ConnectivityCheckPeriod” parameter enables greater resilience to temporary latency spikes on a WAN, thereby maintaining operation of the cluster.

With this deployment model, users can synchronously replicate updates between data centers without needing conflict detection and resolution, and automatically failover between those sites in the event of a node failure.

Users need to characterize their network bandwidth and latencies, and observe best practices in configuring both their network environment and Cluster. More guidance is available here.

User Privilege Consolidation

User privilege tables are now consolidated into the data nodes and centrally accessible by all MySQL servers accessing the cluster.

Previously the privilege tables were local to each MySQL server, meaning users and their associated privileges had to be managed separately on each server. By consolidating privilege data, users need only be defined once and managed centrally, saving Systems Administrators significant effort and reducing cost of operations.

Summary

The MySQL Cluster 7.2.1 DMR enables new classes of use-cases to benefit from web-scale performance with carrier-grade availability.

You can download the DMR for evaluation now from: http://dev.mysql.com/downloads/cluster/ (select Development Milestone Release tab).

You can learn more about the MySQL Cluster architecture from our Guide to scaling web databases

Let us know what you think of these enhancements directly in comments of this or the associated blogs. We look forward to working with the community to perfect these new features.


PlanetMySQL Voting: Vote UP / Vote DOWN

Synchronously Replicating Databases Across Data Centers – Are you Insane?

Октябрь 4th, 2011
 

Well actually….no. The second Development Milestone Release of MySQL Cluster 7.2 introduces support for what we call “Multi-Site Clustering”. In this post, I’ll provide an overview of this new capability, and considerations you need to make when considering it as a deployment option to scale geographically dispersed database services.

You can read more about MySQL Cluster 7.2.1 in the article posted on the MySQL Developer Zone

MySQL Cluster has long offered Geographic Replication, distributing clusters to remote data centers to reduce the affects of geographic latency by pushing data closer to the user, as well as providing a capability for disaster recovery.

Multi-Site Clustering provides a new option for cross data center scalability. For the first time splitting data nodes across data centers is a supported deployment option. With this deployment model, users can synchronously replicate updates between data centers without needing to modify their application or schema for conflict handling, and automatically failover between those sites in the event of a node failure.

MySQL Cluster offers high availability by maintaining a configurable number of data replicas.  All replicas are synchronously maintained by a built-in 2 phase commit protocol.  Data node and communication failures are detected and handled automatically.  On recovery, data nodes automatically rejoin the cluster, synchronize with running nodes, and resume service.

All replicas of a given row are stored in a set of data nodes known as a nodegroup.  To provide service, a cluster must have at least one data node from each nodegroup available at all times.  When the cluster detects that the last node in a nodegroup has failed, the remaining cluster nodes will be gracefully shutdown, to ensure the consistency of the stored databases on recovery.

Improvements to the heartbeating mechanism used by MySQL Cluster enables greater resilience to temporary latency spikes on a WAN, thereby maintaining operation of the cluster. A new “ConnectivityCheck” mechanism is introduced, which must be explicitly configured. This extra mechanism adds messaging overheads and failure handling latency, and so is not switched on by default.

When configuring Multi-Site clustering, the following factors must be considered:

Bandwidth
Low bandwidth between data nodes can slow data node recovery.  In normal operation, the available bandwidth can limit the maximum system throughput.  If link saturation causes latency on individual links to increase, then node failures, and potentially cluster failure could occur.

Latency and performance
Synchronously committing transactions over a wide area increases the latency of operation execution and commit, therefore individual operations are slowed. To maintain the same overall throughput, higher client concurrency is required.  With the same client concurrency level, throughput will decrease relative to a lower latency configuration.

Latency and stability
Synchronous operation implies that clients wait to hear of the success or failure of each operation before continuing. Loss of communication to a node, and high latency communication to a node are indistinguishable in some cases.  To ensure availability, the Cluster monitors inter-node communication.  If a node experiences high communication latency, then it may be killed by another node, to prevent its high latency causing service loss.

Where inter-node latencies fluctuate, and are in the same range as the node-latency-monitoring trigger levels, node failures can result.  Node failures are expensive to recover from, and endanger Cluster availability. 

To avoid node failures, either the latency should be reduced, or the trigger levels should be raised.  Raising trigger levels can result in a longer time-to-detection of communication problems.

WAN latencies
Latency on an IP WAN may be a function of physical distance, routing hops, protocol layering, link failover times and rerouting times. The maximum expected latency on a link should be characterized as input to the cluster configuration.

Survivability of node failures
MySQL Cluster uses a fail fast mechanism to minimize time-to-recovery. Nodes that are suspected of being unreachable or dead are quickly excluded from the Cluster.  This mechanism is simple and fast, but sometimes takes steps that result in unnecessary cluster failure.  For this reason, latency trigger levels should be configured a safe margin
above the maximum latency variation on inter-data node links.

Users can configure various MySQL Cluster parameters including heartbeats, Connectivity_Check, GCP timeouts and transaction deadlock timeouts. You can read more about these parameters in the documentation

Recommendations for Multi-Site Clustering
- Ensure minimal, stable latency;
- Provision the network with sufficient bandwidth for the expected peak load - test with node recovery and system recovery;
- Configure the heartbeat period to ensure a safe margin above latency fluctuations;

- Configure the ConnectivtyCheckPeriod to avoid unnecessary node failures;

- Configure other timeouts accordingly including the GCP timeout, transaction deadlock timeout, and transaction inactivity timeout.

Example
The following is a recommendation of latency and bandwidth requirements for applications with high throughput and fast failure detection requirements:
- latency between remote data nodes must not exceed 20 milliseconds;
- bandwidth of the network link must be more than 1 Gigabit per Second.

For applications that do not require this type of stringent operating environment, latency and bandwidth can be relaxed, subject to the testing recommended above.

As the recommendations demonstrate, there are a number of factors that need to be considered before deploying multi-site clustering. For geo-redundancy, Oracle recommends Geographic Replication, but multi-site clustering does present an alternative deployment, subject to the considerations and constraints discussed above.

You can learn more about scaling web databases with MySQL Cluster from our new Guide.  We look forward to hearing your experiences with the new MySQL Cluster 7.2.1 DMR!


PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL HA Solutions: New Guide Available

Сентябрь 29th, 2011

Databases are the center of today’s web, enterprise and embedded applications, storing and protecting an organization’s most valuable assets and supporting business-critical applications. Just minutes of downtime can result in significant lost revenue and dissatisfied customers. Ensuring database highly availability is therefore a top priority for any organization.

The new MySQL Guide to High Availability solutions is designed to navigate users through the HA maze, discussing:

- The causes, effects and impacts of downtime;

- Methodologies to select the right HA solution;

- Different approaches to delivering highly available MySQL services;

- Operational best practices to meet Service Level Agreements (SLAs).

As discussed in the new Guide, selecting the high availability solution that is appropriate for your application depends upon 3 core principles:

- The level of availability required to meet business objectives, within budgetary constraints;

- The profile of application being deployed (i.e. concurrent users, requests per second, etc.);

- Operational standards within each data center.

Recognizing that each application or service has different operational and availability requirements, the guide discusses the range of certified and supported High Availability (HA) solutions – from internal departmental applications all the way through to geographically redundant, multi-data center systems delivering 99.999% availability (i.e. less than 5 ½ minutes of downtime per year) supporting transactional web services, communications networks, cloud and hosting environments, etc.

By combining the right technology with the right skills and processes, users can achieve business continuity, while developers and DBAs can sleep tight at night! Download the guide to learn more.


PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL HA Solutions: New Guide Available

Сентябрь 29th, 2011

Databases are the center of today’s web, enterprise and embedded applications, storing and protecting an organization’s most valuable assets and supporting business-critical applications. Just minutes of downtime can result in significant lost revenue and dissatisfied customers. Ensuring database highly availability is therefore a top priority for any organization.

The new MySQL Guide to High Availability solutions is designed to navigate users through the HA maze, discussing:

- The causes, effects and impacts of downtime;

- Methodologies to select the right HA solution;

- Different approaches to delivering highly available MySQL services;

- Operational best practices to meet Service Level Agreements (SLAs).

As discussed in the new Guide, selecting the high availability solution that is appropriate for your application depends upon 3 core principles:

- The level of availability required to meet business objectives, within budgetary constraints;

- The profile of application being deployed (i.e. concurrent users, requests per second, etc.);

- Operational standards within each data center.

Recognizing that each application or service has different operational and availability requirements, the guide discusses the range of certified and supported High Availability (HA) solutions – from internal departmental applications all the way through to geographically redundant, multi-data center systems delivering 99.999% availability (i.e. less than 5 ½ minutes of downtime per year) supporting transactional web services, communications networks, cloud and hosting environments, etc.

By combining the right technology with the right skills and processes, users can achieve business continuity, while developers and DBAs can sleep tight at night! Download the guide to learn more.


PlanetMySQL Voting: Vote UP / Vote DOWN

SkySQL adds MySQL Cluster training to the offering

Сентябрь 28th, 2011
Very soon after SkySQL Ab was created, we added an Advanced MySQL Cluster course to our training offering. This addition was made as a customer who was already quite experienced with MySQL cluster wanted training. This course was, however, deemed to be too complex and with too much pre-requisites to add to our public offering so it was left as a special treat for customers who had deep cluster knowledge. We decided that we would instead create a more general purpose MySQL Cluster course with less prerequisite knowledge required as soon as possible.

Many months have passed and a lot of other courses have been created since, but we now finally release our version of a general purpose MySQL Cluster training course, called Administering MySQL Cluster. As the name suggests this 3-day course is for administering MySQL Cluster and it does require some a priori operating system and MySQL knowledge. The course teaches the basic concepts behind MySQL cluster; how high availability is achieved in a shared -nothing architecture etc, as well as more complex concepts. Practical details are also included, such as installation, configuration, backups, restoration etc. The course will also go through how to make upgrades and schema changes without downtime as well as how crash recovery and fail-over is handled. Geographical replication and query optimization for MySQL Cluster follows as cherries on the cake. If you want more information about the course and dates, check out our schedule.

PlanetMySQL Voting: Vote UP / Vote DOWN

Oracle Enhances MySQL Manageability on Windows

Сентябрь 6th, 2011

Windows is a major development and deployment platform for MySQL. A few months ago, we held our first article presenting the then beta installer last April, and you can now download it here.

2. Certification for Windows Failover Clustering

This certification expands the range of High-Availability solutions available for MySQL on Windows, which previously included replication and MySQL Cluster. Users can now power business critical applications with MySQL Enterprise Edition using native Windows clustering services.

Andrew Morgan provided more detailed information in his blog earlier today about the solution and how to get going.

We will be holding a webinar on September 15th at 9.00 am PT to present the two new Windows solutions in more details, Register now!


PlanetMySQL Voting: Vote UP / Vote DOWN