Archive for the ‘scalability’ Category

Surge 2011 slides, recap

October 7th, 2011

This year’s Surge conference was a great sophomore event to follow up last year’s inaugural conference. A lot of very smart people were there, and the hallway track was great.

I did three things: I gave a lightning talk about causes of MySQL downtime, chaired a panel on Big Data and the Cloud, and showed how to derive scalability and performance metrics from TCP traffic. I’ve sent my slides to the Surge organizers, and I understand that they will be posting them as well as integrating them into the video of my session. In the meantime you can download my slides from Percona’s presentations page.


When Clever Goes Wrong & How Etsy Overcame – Arstechnica

October 5th, 2011

In 2007, Etsy made a big bet on homegrown middleware to help with the site’s scalability. A half-year after it was taken live, the company decided to abandon it. As a senior software engineer at Etsy put it, “if you’re doing something ‘clever,’ you’re probably doing it wrong.”

Read the full article at Arstechnica.com

I want to focus on the important lessons from this article: putting middleware and stored procedures in front of a public web application in this fashion creates unscalable design complexity (however smart and “proper” it looks according to the old enterprise design teachings), which in turn causes infrastructure, development and maintenance hassles.

In the process they did replace PostgreSQL with MySQL, but that is not the critical change that made all the difference. PostgreSQL is a fine database system too.



New Commercial Extensions for MySQL Enterprise Edition

September 15th, 2011

MySQL 5.5 GA and MySQL 5.6 Development Milestone Releases have delivered many new compelling features to the MySQL users and community for testing, feedback and use.

In addition, commercial customers have access to a number of commercial extensions already included in MySQL Enterprise Edition:

•    MySQL Enterprise Monitor
•    MySQL Enterprise Backup

Continuing the business model of MySQL, we are adding three new commercial extensions to MySQL Enterprise Edition:

  • MySQL Enterprise Scalability
    • Thread Pool
  • MySQL Enterprise High Availability
    • Oracle VM Template for MySQL Enterprise Edition
    • Windows Clustering for MySQL Enterprise Edition
  • MySQL Enterprise Security
    • External Authentication for PAM
    • External Authentication for Windows


MySQL Enterprise Scalability: Thread Pool

To sustain performance and scalability under ever-increasing user, query and data loads, MySQL Enterprise Edition provides thread pooling.  Thread Pool is a user-configurable option that provides an efficient, alternate thread-handling model designed to reduce the overhead of managing client connections and statement execution threads, and to fully exploit the processing power of today’s multi-core systems.  The result is improved scalability and sustained performance for high-traffic online applications that service ever-growing numbers of client connections.  MySQL internal SysBench OLTP benchmarks show that Thread Pool provides a significant improvement in sustained performance and scalability for applications that service a high number of concurrent connections, specifically on systems with 16 or more cores.
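
To illustrate the general thread-pooling idea, here is a generic Python sketch. It is not MySQL's Thread Pool implementation or its configuration; it only shows a small, fixed set of worker threads serving many more client requests than there are threads, instead of one thread per connection. The function names `handle_statement` and `serve`, and the pool size of 4, are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def handle_statement(conn_id: int) -> str:
    """Stand-in for executing one client statement (hypothetical)."""
    return f"conn {conn_id} served by {threading.current_thread().name}"


def serve(num_connections: int, pool_size: int) -> list:
    """Run every connection's statement on a bounded pool of worker threads."""
    with ThreadPoolExecutor(max_workers=pool_size,
                            thread_name_prefix="worker") as pool:
        # map() queues all the work; at most pool_size threads ever run it.
        return list(pool.map(handle_statement, range(num_connections)))


results = serve(num_connections=100, pool_size=4)
workers = {line.split(" served by ")[1] for line in results}
print(len(results), "statements handled by", len(workers), "threads")
```

The number of threads stays bounded no matter how many connections arrive, which is the property the paragraph above attributes to thread pooling.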

Learn more and review the MySQL internal SysBench OLTP benchmarks.

MySQL Enterprise High Availability

MySQL Enterprise Edition offers a range of solutions for database high availability, to automatically detect and recover from failures, as well as minimize downtime resulting from scheduled maintenance activities.

Oracle VM Template for MySQL Enterprise Edition - Ensures rapid deployment and helps eliminate configuration efforts and risks by providing a pre-installed and pre-configured virtualized software image, taking advantage of Oracle VM’s features to deliver high availability.

Windows Server Failover Clustering for MySQL Enterprise Edition - With the certification and support of MySQL with Windows Server Failover Clustering (WSFC), organizations can safely deploy business-critical applications demanding high levels of availability, powered by MySQL Enterprise Edition and using native Windows clustering services.

Learn more about these new MySQL Enterprise Edition features.

MySQL Enterprise Security

MySQL 5.5 introduced a pluggable authentication API that allows users to be authenticated using external libraries, directories, etc.  Developers can use this API to build their own custom modules that integrate MySQL into an existing security infrastructure.  MySQL Enterprise Edition leverages this same API to provide ready to use external authentication modules that authenticate users via Pluggable Authentication Modules (“PAM”) or native Windows OS services.  Each is described below:

External Authentication for PAM - Enables you to configure MySQL to use PAM to authenticate users on LDAP, Unix/Linux, Kerberos, and other systems.

External Authentication for Windows – Enables you to configure MySQL to use native Windows services to authenticate client connections.  Users who have logged in to Windows can connect from MySQL client programs to the server based on the token information in their environment without specifying an additional password.

Learn more about these new MySQL Enterprise Edition features and the technical details of the MySQL 5.5 pluggable authentication API.


Download

Existing commercial customers who are entitled to a MySQL Enterprise Edition subscription can log into MyOracleSupport and download these immediately.
For others who want to try these new capabilities, we will make them available shortly, via the 30 day free trial of MySQL Enterprise Edition.

As always, thanks for your continued support of MySQL!



MySQL Cluster is a brilliant NoSQL database

September 1st, 2011

MySQL Cluster* is one of the most advanced and scalable databases available today and, despite what its name might suggest, it is also a brilliant NoSQL database**.

Let me discuss this statement!

First, let’s discuss the high level issues that NoSQL databases try to address:
-     Scalability. Traditional RDBMS technology was designed four decades ago and is not well suited to today’s Big Data requirements. Database systems today need to be able to scale horizontally over multiple machines to handle millions of users. As the CAP theorem states, a distributed system cannot guarantee consistency, availability and partition tolerance all at the same time. Several NoSQL databases sacrifice consistency for availability and scalability.
-     Rigid data model. Once a normalized data model has been defined in an RDBMS, it is difficult to change. Schema operations on a database are often not online and, what’s worse, they can be slow. This makes them hard or impossible to carry out on production systems. NoSQL databases do not require fixed table schemas; they are ‘schema-free’ and can more easily store unstructured data.
-     Simple APIs to access simple data structures, rather than SQL. Joins are not used because they introduce complexity and performance bottlenecks, especially in distributed scale-out systems.

Now, let us have a look at how MySQL Cluster stacks up against these requirements.
-       In terms of scalability, MySQL Cluster scales in a horizontal fashion across a number of nodes. Currently, up to 255 nodes are supported in one cluster. Data is automatically partitioned across a number of Data Nodes, and it is possible to configure a number of replicas for each data partition. Synchronous replication ensures that replicas are consistent, and an XA transaction model also makes transactions consistent across partitions.
-       MySQL Cluster has a relational data model, and programmers do have to normalize their data into tables. But the great thing is that schema updates, such as adding or removing tables or indexes or changing existing tables (e.g. adding a column to a table), are online.
-       MySQL Cluster has an Access Layer that can consist of MySQL Servers (SQL), LDAP (NoSQL), native C++ API and Java API (both NoSQL). The upcoming 7.2 release also adds a Memcached API (NoSQL). So, there are plenty of NoSQL APIs to choose from in order to access NDB data.
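
The partitioning and replication described in the first point can be sketched roughly as follows. This is an illustrative hash-partitioning toy in Python, not NDB's actual distribution algorithm. The node names, the `replicas` parameter and the `place` function are assumptions made for the example.

```python
import hashlib
from typing import List


def place(key: str, nodes: List[str], replicas: int = 2) -> List[str]:
    """Return the data nodes that hold the partition for a primary key."""
    if not 1 <= replicas <= len(nodes):
        raise ValueError("replicas must be between 1 and the number of nodes")
    # A stable hash keeps placement identical across processes and restarts.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    primary = int.from_bytes(digest[:8], "big") % len(nodes)
    # Replicas go on the following nodes, wrapping around the list.
    return [nodes[(primary + i) % len(nodes)] for i in range(replicas)]


nodes = ["node1", "node2", "node3", "node4"]  # hypothetical data nodes
for key in ("user:1", "user:2", "user:3"):
    print(key, "->", place(key, nodes))
```

Each key always maps to the same set of nodes, and every partition lives on `replicas` distinct nodes, so losing one node does not lose the data.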

Looking at the above, the one drawback of MySQL Cluster seems to be its relational data model: programmers still have to normalize their data.  A schema-free database sounds great from a developer’s point of view; however, I wonder what it would be like to operate one. How would users do ad-hoc queries and analysis?



                                            “Just throw it in the database!”



If a relational schema is something that a user needs or is at least willing to accept, then MySQL Cluster is a brilliant NoSQL database. It solves the same problems as the new NoSQL databases do, yet you also get the traditional benefits of the RDBMS world: atomic transactions even for complex transactions, consistency, integrity and durability, and the ability to do complex analytics using multiple columns and indexes. See also this excellent comment by Henrik Ingo about how some of the world's largest telco companies are using MySQL Cluster.

For the majority of NoSQL database products out there, these fundamental data management concerns of an application are being moved out of the database layer to deliver very high performance and scalability.  This means that we are moving these fundamental concerns up the stack into the application layer. I wonder how many out there would feel comfortable with that. What do you think?

If you have any questions or comments, feel free to reply to this blog below or reach out to us on Facebook, LinkedIn, Twitter or directly via these contact details.


* Maybe NDB Cluster is a more appropriate name? MySQL server is an SQL connector.


** Although with a relational model at the back.


I’ll be presenting at Postgres Open 2011

July 20th, 2011

I’ve been accepted to present at the brand-new and very exciting Postgres Open 2011 about system scaling, TCP traffic, and mathematical modeling. I’m really looking forward to it — it will be my first PostgreSQL conference in a couple of years! See you there.



I’m speaking at Surge 2011

July 7th, 2011

I’ll be speaking at Surge again this year. This time, unlike last year’s talk, I’m tackling a very concrete topic: extracting scalability and performance metrics from TCP network traffic. It turns out that most things that communicate over TCP can be analyzed very elegantly just by capturing arrival and departure timestamps of packets, nothing more. I’ll show examples where different views on the same data pull out completely different insights about the application, even though we have no information about the application itself (okay, I actually know that it’s a MySQL database, and a lot about the actual database and workload, but I don’t need that in order to do what I’ll show you). It’s an amazingly powerful technique that I continue to find new ways to apply to real systems.
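
As a rough illustration of the idea (a sketch of my own, not the method from the talk): given only an arrival and a departure timestamp for each request, Little's Law relates throughput, average response time and average concurrency. The function name `summarize` and the sample timestamps are hypothetical.

```python
from typing import Dict, List, Tuple


def summarize(requests: List[Tuple[float, float]]) -> Dict[str, float]:
    """Derive basic metrics from (arrival, departure) timestamps in seconds."""
    if not requests:
        raise ValueError("need at least one request")
    start = min(arrival for arrival, _ in requests)
    end = max(departure for _, departure in requests)
    window = end - start                      # observation interval
    if window <= 0:
        raise ValueError("observation interval must be positive")
    busy = sum(d - a for a, d in requests)    # total time spent in the system
    count = len(requests)
    return {
        "throughput": count / window,         # completed requests per second
        "avg_response_time": busy / count,    # mean residence time per request
        "avg_concurrency": busy / window,     # Little's Law: N = X * R
    }


# Hypothetical sample: four requests observed on the wire over two seconds.
sample = [(0.0, 0.5), (0.2, 1.0), (1.0, 1.5), (1.5, 2.0)]
print(summarize(sample))
```

Nothing here depends on knowing what the application is; the timestamps alone are enough to compute all three numbers.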

Take a look at the other speakers too — it is an impressive lineup. I hope you can attend. Last year’s show was a great event.



How VoltDB does transaction ordering and replication

February 16th, 2011

I’m not sure how many of this blog’s readers are likely to be aware of VoltDB. It is one of the systems that I think could be poised to dispel the notion that SQL (or the relational model) is somehow inherently unscalable. Here’s a blog post explaining how VoltDB does transaction ordering and replication.



MySQL At Scale – Zynga Games

December 3rd, 2010

I recently joined Zynga’s database team, as I was quite impressed with the company’s database usage. Everyone knows how popular Zynga games such as Farmville, Cafe World, Mafia Wars, Poker, FrontierVille, FishVille, PetVille and Treasure Island are. Zynga launched yet another new game today, called CityVille, along with a series of acquisitions (latest today [...]

Big Data: Freedom or Something Else?

December 1st, 2010

Googling around, I came across Bradford Cross' article, Big Data Is Less About Size, And More About Freedom. Bradford writes, "The scale of data and computations is an important issue, but the data age is less about the raw size of your data, and more about the cool stuff you can do with it."

Even though the article makes some good points, I'm not sure I can agree with Bradford's point of view here. As an architect, when I think in terms of Big Data, the ability to do "cool stuff" is probably the last thing that crosses my mind. Big Data, to me, is about ensuring constant response time as the data grows in size without sacrificing functionality.

What do you think Big Data is about? Is it merely about being able to do 'cool stuff' with your data? Is it about ensuring constant access/response times? Or is it about something else? I'm eager to hear your thoughts.
