Archive for the ‘schema’ Category

Can MySQL use primary key values from a secondary index?

Май 17th, 2012

In the article about the role of a primary key, I mentioned that a secondary index in an InnoDB table consists not only of the values of its member columns, but also values of the table’s primary key are concatenated to the index. I.e. the primary key contents is part of every other index.

Assuming the following table structure:

CREATE TABLE `bets` (
  `id` int(10) unsigned NOT NULL,
  `user_id` int(10) unsigned NOT NULL,
  `game_id` int(10) unsigned NOT NULL,
...
  PRIMARY KEY (`id`),
  KEY `user_id` (`user_id`)
) ENGINE=InnoDB

Here is the visualization:

If MySQL could use in queries these implicitly added values, it would maybe allow to save some space on listing the primary key columns at the end of an index explicitly. Let’s check various cases.

Row filtering
mysql> EXPLAIN
    -> SELECT *
    -> FROM   bets
    ->        JOIN games
    ->        ON     games.id = bets.id
    -> WHERE  bets.user_id = 111
    ->        AND bets.id > 3476\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: bets
         type: ref
possible_keys: user_id
          key: user_id
      key_len: 4
          ref: const
         rows: 22
        Extra: Using where
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: games
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: game.bets.id
         rows: 1
        Extra:

Both key_len and ref fields indicate that only one four bytes long column is used from the user_id index. MySQL cannot use the primary key values in a secondary index for filtering in WHERE clause.

Sorting with ORDER BY
mysql> EXPLAIN
    -> SELECT   *
    -> FROM     bets
    ->          JOIN games
    ->          ON       games.id = bets.game_id
    -> WHERE    bets.user_id = 111
    -> ORDER BY bets.id DESC\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: bets
         type: ref
possible_keys: user_id
          key: user_id
      key_len: 4
          ref: const
         rows: 22
        Extra: Using where
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: games
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: game.bets.game_id
         rows: 1
        Extra:

Extra only returns Using where, but there is no Using filesort. It means ORDER BY will be optimized using the hidden primary key data from the secondary index.

Aggregating with GROUP BY
mysql> EXPLAIN
    -> SELECT   *
    -> FROM     bets
    ->          JOIN games
    ->          ON       games.id = bets.game_id
    -> WHERE    bets.user_id = 111
    -> GROUP BY bets.id\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: bets
         type: ref
possible_keys: user_id
          key: user_id
      key_len: 4
          ref: const
         rows: 22
        Extra: Using where
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: games
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: game.bets.game_id
         rows: 1
        Extra:

Also in this case Extra neither shows Using filesort nor Using temporary, which would indicate no index is used for grouping. Therefore MySQL can optimize GROUP BY on the concatenated primary key values.

Covering index
mysql> EXPLAIN
    -> SELECT bets.id
    -> FROM   bets
    -> WHERE  bets.user_id = 111\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: bets
         type: ref
possible_keys: user_id
          key: user_id
      key_len: 4
          ref: const
         rows: 22
        Extra: Using index

The query execution plan confirms through Using index that it will only need index contents to return result. MySQL can read and return the hidden primary key values to avoid the additional data lookup.

Summary

In InnoDB tables each entry of a secondary index always contains the copy of the corresponding primary key row. These values may in some cases be used to the benefit of query execution plan:

  • for ORDER BY on the primary key column(s)
  • for GROUP BY on the primary key column(s)
  • when returning the primary key column(s) values in the SELECT list

MySQL cannot use them, however, to optimize filtering in WHERE.


PlanetMySQL Voting: Vote UP / Vote DOWN

Optimizing MySQL performance with accurate keys

Апрель 6th, 2012

MySQL performance is largely defined by keys and how efficiently queries can use them. As you scale, at certain point it is not enough anymore to just have any indexes and still get a good performance in return. You have to really figure them out and allow your queries to do less work, as little work as possible.

The approach presented in this article can sometimes help designing such good, efficient indexes. As a consultant, I have to rely on it myself from time to time, having to optimize a query that works in a database I know nothing about.

Let’s assume there is an application, which collects user activity in various places. The application uses a poorly indexed database, so there are plenty of examples to choose from. Our example query performs a full table scan, which means it reads all rows from the table it uses. It is also among the most popular statements executed by application.

mysql> EXPLAIN SELECT * FROM `checkins` WHERE user_id = 1410 AND checkin_source = 3 AND checkin_type IN (3,5)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: checkins
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1039425
        Extra: Using where

A full table scan is indicated by type field, which simply says ALL. Another relevant information is the estimation of how many rows the query will have to read. MySQL thinks it may be 1039425, but in reality, with InnoDB tables, this number is usually be a bit skewed. It should still give the idea of the magnitude of the effort even without the fine accuracy. It is clear that the query needs an index to become more efficient, but what index exactly?

It is typically expected from a key to offer high cardinality, or in other words, to contain many distinct values. It allows an index lookup to hit fewer, more relevant rows. A new key could be created on only one of the columns referenced in the WHERE clause. Or it could be created on a combination of these columns – it would be so called composite index.

Often even only by looking at a query, it may be easy to conclude, which columns have good cardinality and which don’t. The table from our example query holds information on users activity, so there will likely be a good mix of many different values in user_id column. But can there be equally many sources where user check-ins originate from? A reasonably good guess could be that the options are limited to a web page, a few named social networks, a mobile application and perhaps some more. In other words, not a lot.

This can always be verified:

mysql> SELECT COUNT(1) FROM `checkins` WHERE user_id = 1410;
+----------+
| COUNT(1) |
+----------+
|     6360 |
+----------+

mysql> SELECT COUNT(1) FROM `checkins` WHERE checkin_source = 3;
+----------+
| COUNT(1) |
+----------+
|   108623 |
+----------+

It becomes evident that user_id is a better candidate for indexing than checkin_source as it finds less rows. A query executing against a key on the former column would match only 6360 at first, while one executing against a key on the latter would have to start with 108623 rows. Why start? Because filtering happens in two stages. If MySQL has an index that can be applied, it uses it to find matches, but only using the columns that exist in both WHERE clause and in the index (with some limitations too). Every matching row is then read in full length and any remaining conditions are applied. Starting with one hundred thousand rather than six thousand something would push nearly twenty times more rows for post-filtering in the second step. Looking at a bigger picture, it is not just twenty times more work for CPU, but also possibility many more I/O requests.

A cross-check on cardinality could also be done:

mysql> SELECT COUNT(DISTINCT user_id) AS cardinality FROM `checkins`;
+-------------+
| cardinality |
+-------------+
|       26360 |
+-------------+

mysql> SELECT COUNT(DISTINCT checkin_source) AS cardinality FROM `checkins` WHERE checkin_source = 3;
+-------------+
| cardinality |
+-------------+
|          18 |
+-------------+

But is adding an index on user_id really the best of what can be done? Why not try examining the data even further? How many rows does the original query actually match?

mysql> SELECT COUNT(1) FROM `checkins` WHERE user_id = 1410 AND checkin_source = 3 AND checkin_type IN (3,5);
+----------+
| COUNT(1) |
+----------+
|      832 |
+----------+

That is even better than 6360 matched by the user alone. It means that using all three columns together can more significantly improve selectivity, i.e. reduce number of rows matched by a set of conditions. If this is somewhat consistent behavior, what can be verified through query statistics from MySQL slow log, the information can be used to improve the new index design. A composite key on (user_id, checkin_source, checkin_type) seems good candidate now, better than user_id alone, but now another question emerges… could it be shorter?

Indexes shouldn’t be needlessly long, because they use disk space and memory. They also require a lot of maintenance work from database to keep them up-to-date, so if something does not really, to a measurable extent, benefit MySQL performance elsewhere, should not be part of any index.

What if only two out of the three columns where used:

mysql> SELECT COUNT(1) FROM `checkins` WHERE user_id = 1410 AND checkin_source = 3;
+----------+
| COUNT(1) |
+----------+
|      869 |
+----------+

Thirty rows more is not going to turn into a measurable difference in most cases, so it should be okay to use a shorter index covering just (user_id, checking_source).

Conclusion

We considered four different keys for optimizing performance of the example query:

  1. (user_id)
  2. (checkin_status)
  3. (user_id, checkin_status, checkin_type)
  4. (user_id, checkin_status)

The last one offered the best factor of efficiency to cost being nine times more efficient than index number one and slightly less expensive than comparable index number three. Therefore we found the accurate index, which should help MySQL performance the most considering this particular query.


PlanetMySQL Voting: Vote UP / Vote DOWN

Scaling Web Databases, Part 2: Adding Nodes, Evolving Schema with Zero Downtime

Июль 21st, 2011

In my previous post, I discussed scaling web database performance in MySQL Cluster using auto-sharding and active/active geographic replication - enabling users to scale both within and across data centers.  

I also mentioned that while scaling write-performance of any web service is critical, it is only 1 of multiple dimensions to scalability, which include:

- The need to scale operational agility to keep pace with demand. This means being able to add capacity and performance to the database, and to evolve the schema – all without downtime;

- The need to scale queries by having flexibility in the APIs used to access the database – including SQL and NoSQL interfaces;

- The need to scale the database while maintaining continuous availability.

All of these subjects are discussed in more detail in our new Scaling Web Databases guide.

In this posting, we look at scaling operational agility. 

As a web service gains in popularity it is important to be able to evolve the underlying infrastructure seamlessly, without incurring downtime and without having to add lots of additional DBA or developer resource.

Users may need to increase the capacity and performance of the database; enhance their application (and therefore their database schema) to deliver new capabilities and upgrade their underlying platforms.

MySQL Cluster can perform all of these operations and more on-line – without interrupting service to the application or clients.  

On-Line, On-Demand Scaling

MySQL Cluster allows users to scale both database performance and capacity by adding Application and Data Nodes on-line, enabling users to start with small clusters and then scale them on-demand, without downtime, as a service grows. Scaling could be the result of more users, new application functionality or more applications needing to share the database.

In the following example, the cluster on the left is configured with two application and data nodes and a single management server.  As the service grows, the users are able to scale the database and add management redundancy – all of which can be performed as an online operation.  An added advantage of scaling the Application Nodes is that they provide elasticity in scaling, so can be scaled back down if demand to the database decreases.

When new data nodes and node groups are added, the existing nodes in the cluster initiate a rolling restart to reconfigure for the new resource.  This rolling restart ensures that the cluster remains operational during the addition of new nodes.  Tables are then repartitioned and redundant rows are deleted with the OPTIMIZE TABLE command.  All of these operations are transactional, ensuring that a node failure during the add-node process will not corrupt the database.

The operations can be performed manually from the command line or automated with MySQL Cluster Manager , part of the commercial MySQL Cluster Carrier Grade Edition.

On-Line Cluster Maintenance

With its shared-nothing architecture, it is possible to avoid database outages by using rolling restarts to not only add but also upgrade nodes within the cluster.  Using this approach, users can:

- Upgrade or patch the underlying hardware and operating system;

- Upgrade or patch MySQL Cluster, with full online upgrades between releases.

MySQL Cluster supports on-line, non-blocking backups, ensuring service interruptions are again avoided during this critical database maintenance task.  Users are able to exercise fine-grained control when restoring a MySQL Cluster from backup using ndb_restore. Users can restore only specified tables or databases, or exclude specific tables or databases from being restored, using ndb_restore options --include-tables, --include-databases, --exclude-tables, and --exclude-databases.

On-Line Schema Evolution

As services evolve, developers often want to add new functionality, which in many instances may demand updating the database schema.  

This operation can be very disruptive for many databases, with ALTER TABLE commands taking the database offline for the duration of the operation.  When users have large tables with many millions of rows, downtime can stretch into hours or even days.

MySQL Cluster supports on-line schema changes, enabling users to add new columns and tables and add and remove indexes – all while continuing to serve read and write requests, and without affecting response times.

Unlike other on-line schema update solutions, MySQL Cluster does not need to create temporary tables, therefore avoiding the user having to provision double the usual memory or disk space in order to complete the operation.

Summary

So in addition to scaling write performance, MySQL Cluster can also scale operational agility.  I'll post more on scaling of data access methods and availability levels over the next few weeks.

You can read more about all of these capabilities in the new Scaling Web Databases guide.  

And of course, you can try MySQL Cluster out for yourself - its available under the GPL:

The GA release is 7.1 which can be downloaded here, but I'd recommend taking a look at the latest Development Milestone Release for MySQL Cluster 7.2 which has some great new capabilities (localized JOIN operations, simpler provisioning, etc) which can be downloaded from here (select the Development Releases tab).

As ever, let me know if there are other dimensions of scalability that I should be discussing 


PlanetMySQL Voting: Vote UP / Vote DOWN

Announcing common_schema: common views & routines for MySQL

Июль 13th, 2011

Today I have released common_schema, a utility schema for MySQL which includes many views and functions, and is aimed to be installed on any MySQL server.

What does it do?

There are views answering for all sorts of useful information: stuff related to schema analysis, data dimensions, monitoring, processes & transactions, security, internals… There are basic functions answering for common needs.

Some of the views/routines simply formalize those queries we tend to write over and over again. Others take the place of external tools, answering complex questions via SQL and metadata. Still others help out with SQL generation.

Here are a few highlights:

There’s more. Take a look at the common_schema documentation for full listing. And it’s evolving: I’ve got quite a few ideas already for future components.

Some of these views rely on heavyweight INFORMATION_SCHEMA tables. You should be aware of the impact and risks.

What do I need to install?

There’s no script or executable file. It’s just a schema. The distribution in an SQL file which generates common_schema. Much like a dump file.

What are the system requirements?

It’s just between you and your MySQL. There are currently three distribution files, dedicated for different versions of MySQL (and allowing for increased functionality):

  • common_schema_mysql_51: fits all MySQL >= 5.1 distributions
  • common_schema_innodb_plugin: fits MySQL >= 5.1, with InnoDB plugin + INFORMATION_SCHEMA tables enabled
  • common_schema_percona_server: fits Percona Server >= 5.1

Refer to the documentation for more details.

What are the terms of use?

common_schema is released under the BSD license.

Where can I download it?

On the common_schema project page. Enjoy it!


PlanetMySQL Voting: Vote UP / Vote DOWN

Announcing common_schema: common views & routines for MySQL

Июль 13th, 2011

Today I have released common_schema, a utility schema for MySQL which includes many views and functions, and is aimed to be installed on any MySQL server.

What does it do?

There are views answering for all sorts of useful information: stuff related to schema analysis, data dimensions, monitoring, processes & transactions, security, internals… There are basic functions answering for common needs.

Some of the views/routines simply formalize those queries we tend to write over and over again. Others take the place of external tools, answering complex questions via SQL and metadata. Still others help out with SQL generation.

Here are a few highlights:

There’s more. Take a look at the common_schema documentation for full listing. And it’s evolving: I’ve got quite a few ideas already for future components.

Some of these views rely on heavyweight INFORMATION_SCHEMA tables. You should be aware of the impact and risks.

What do I need to install?

There’s no script or executable file. It’s just a schema. The distribution in an SQL file which generates common_schema. Much like a dump file.

What are the system requirements?

It’s just between you and your MySQL. There are currently three distribution files, dedicated for different versions of MySQL (and allowing for increased functionality):

  • common_schema_mysql_51: fits all MySQL >= 5.1 distributions
  • common_schema_innodb_plugin: fits MySQL >= 5.1, with InnoDB plugin + INFORMATION_SCHEMA tables enabled
  • common_schema_percona_server: fits Percona Server >= 5.1

Refer to the documentation for more details.

What are the terms of use?

common_schema is released under the BSD license.

Where can I download it?

On the common_schema project page. Enjoy it!


PlanetMySQL Voting: Vote UP / Vote DOWN

Thoughts and ideas for Online Schema Change

Октябрь 7th, 2010

Here’s a few thoughts on current status and further possibilities for Facebook’s Online Schema Change (OSC) tool. I’ve had these thoughts for months now, pondering over improving oak-online-alter-table but haven’t got around to implement them nor even write them down. Better late than ever.

The tool has some limitations. Some cannot be lifted, some could. Quoting from the announcement and looking at the code, I add a few comments. I conclude with a general opinion on the tool’s abilities.

“The original table must have PK. Otherwise an error is returned.”

This restriction could be lifted: it’s enough that the table has a UNIQUE KEY. My original oak-online-alter-table handled that particular case. As far as I see from their code, the Facebook code would work just as well with any unique key.

However, this restriction is of no real interest. As we’re mostly interested in InnoDB tables, and since any InnoDB table should have a PRIMARY KEY, we shouldn’t care too much.

“No foreign keys should exist. Otherwise an error is returned.”

Tricky stuff. With oak-online-alter-table, changes to the original table were immediately reflected in the ghost table. With InnoDB tables, that meant same transaction. And although I never got to update the text and code, there shouldn’t be a reason for not using child-side foreign keys (the child-side is the table on which the FK constraint is defined).

The Facebook patch works differently: it captures changes and writes them to a delta table,  to be later (asynchronously) analyzed and make for a replay of actions on the ghost table.

So in the Facebook code, some cases will lead to undesired behavior. Consider two tables, country and city, with city holding a RESTRICT/NO ACTION foreign key on country‘s id. Now consider the scenario:

  1. Rows from city are DELETEd, where the country Id is Spain’s.
    • city‘s ghost table is still unaffected, Spain’s cities are still there.
    • A change is written to the delta table to mark these rows for deletion.
  2. A DELETE is issued on country‘s Spain record.
    • The DELETE should work, from the user’s perspective
    • But it will fail: city’s ghost table has not received the changes yet. There’s still matching rows. The NO ACTION constraint will fail the DELETE statement.

Now, this does not lead to corruption, just to seemingly unreasonable behavior on the database part. This behavior is probably undesired. NO ACTION constraint won’t do.

However, with CASCADE or SET NULL options, there is less of an issue: operations on the parent table (e.g. country) cannot fail. We must make sure operations on the ghost table make it consistent with the original table (e.g. city).

Consider the following scenario:

  1. A new country is created, called “Sleepyland”. An INSERT is made to country.
    • Both city and city‘s ghost are immediately aware of it.
  2. A new town is created and INSERTed to city. The town is called “Naphaven”.
    • The change takes time to propagate to city‘s ghost table.
  3. Meanwhile, we realized we made a mistake. We’ve been had. There’s no such city nor country.
    1. We DELETE “Naphaven” from city.
    2. We DELETE “Sleepyland” from country.
    • Note that city‘s ghost table still hasn’t caught up with the changes.
  4. Eventually, the INSERT statement for “Naphaven” reaches city‘s ghost table.
    • What should happen now? The INSERT cannot succeed.
    • Will this fail the entire process?

Looking at the PHP code, I see that changes written on the delta table are blindly replayed on the ghost table.

Since the process is asynchronous, this should not be the case. We can solve the above if we use INSERT IGNORE instead of INSERT. The statement will fail without failing anything else. The row cannot exist, and that’s because the original row does not exist anymore.

Unlike a replication corruption, this does not lead to accumulation mistakes. The replay is static, somewhat like in binary log format. Changes are just written, regardless of existing data.

I have given this considerable thought, and I can’t say I’ve covered all the possible scenario. However I believe that with proper use of INSERT IGNORE and REPLACE INTO (two statements I heavily relied on with oak-online-alter-table), correctness can be achieved.

There’s the small pain of re-generating the foreign key definition on the “ghost” table (CREATE TABLE LIKE … does not copy FK definitions). And since foreign key names are unique, a new name must be picked up. Not pretty, but perfectly doable.

“No AFTER_{INSERT/UPDATE/DELETE} triggers must exist.”

It would be nicer if MySQL had an ALTER TRIGGER statement. There isn’t such statement. If there were such an atomic statement, then we would be able to rewrite the trigger, so as to add our own code to the end of the trigger’s code. Yuck. Would be even nicer if we were allowed to have multiple triggers of same event.

So, we are left with DROP and CREATE triggers. Alas, this makes for a short period where the trigger does not exist. Bad. The easy solution would be to LOCK WRITE the table, but apparently you can’t DROP the trigger (*) when the table is locked. Sigh.

(*) Happened to me, apparently to Facebook too; With latest 5.1 (5.1.51) version this actually works. With 5.0 it didn’t use to; this needs more checking.

Use of INFORMATION_SCHEMA

As with oak-online-alter-table, the OSC checks for triggers, indexes, column by searching on the INFORMATION_SCHEMA tables. This makes for nice SQL for getting the exact listing and types of PRIMARY KEY columns, whether or not AFTER triggers exist, and so on.

I’ve always considered this to be the weak part of openark-kit, that it relies on INFORMATION_SCHEMA so much. It easier, it’s cleaner, it’s even more correct to work that way — but it just puts too much locks. I think Baron Schwartz (and now Daniel Nichter) did amazing work on analyzing table schemata by parsing the SHOW CREATE TABLE and other SHOW commands regex-wise with Maatkit. It’s a crazy work! Had I written openark-kit in Perl, I would have just import their code. But I’m too lazy busy to do the conversion from Perl to Python, and rewrite that code, what with all the debugging.

OSC is written in PHP. Again, much conversion work. I think performance-wise this is an important step to make.

A word for the critics

Finally, a word for the critics. I’ve read some Facebook/MySQL bashing comments and wish to relate.

In his interview to The Register, Mark Callaghan gave the example that “Open Schema Change lets the company update indexes without user downtime, according to Callaghan”.

PostgeSQL was mentioned for being able to add index with only read locks taken, or being able to do the work with no locks using CREATE INDEX CONCURRENTLY. I wish MySQL had that feature! Yes, MySQL has a lot to improve upon, and the latest PostgreSQL 9.0 brings valuable new features. (Did I make it clear I have no intention of bashing PostgreSQL? If not, please re-read this paragraph until convinced).

Bashing related to the notion of MySQL being so poor that Facebook used an even poorer mechanism to work out the ALTER TABLE.

Well, allow me to add a few words: the CREATE INDEX is by far not the only thing you can achieve with OSC (although it may be Facebook’s major concern). You should be able to:

  • Add columns
  • Drop columns
  • Convert character sets
  • Modify column types
  • Add partitioning
  • Reorganize partitioning
  • Compress the table
  • Otherwise changing table format
  • Heck, you could even modify the storage engine! (To other transactional engine)

These are giant steps. How easy would it be to write these down into the database? It only takes a few weeks time to work out a working solution with reasonable limitations, just using the resources the MySQL server provides you with. The MySQL@Facebook team should be given credit for that.


PlanetMySQL Voting: Vote UP / Vote DOWN

Data Warehousing Best Practices: Comparing Oracle to MySQL, part 1 (introduction and power)

Июль 30th, 2010

At Kscope this year, I attended a half day in-depth session entitled Data Warehousing Performance Best Practices, given by Maria Colgan of Oracle. My impression, which was confirmed by folks in the Oracle world, is that she knows her way around the Oracle optimizer.

These are my notes from the session, which include comparisons of how Oracle works (which Maria gave) and how MySQL works (which I researched to figure out the difference, which is why this blog post took a month after the conference to write). Note that I am not an expert on data warehousing in either Oracle or MySQL, so these are more concepts to think about than hard-and-fast advice. In some places, I still have questions, and I am happy to have folks comment and contribute what they know.

One interesting point brought up:
Maria quoted someone (she said the name but I did not grab it) from Forrester saying, “3NF is typically a selfless model used by Enterprise data warehouse, which is used by the whole company. A star schema is a selfish model, used by a department, because it’s already got aggregation in it.”

I thought that was an interesting way of pointing that out — most people do not understand why 3NF is not good enough for data warehousing, and I have had a hard time explaining why a star or snowflake schema should be used. Another schema-related topic I had a hard time putting into words before this workshop was the difference between a star and a snowflake schema: compared to a star schema, in a snowflake schema, you have more than one fact table and maybe some dimensions that are not used often.

From Maria and the slides:
“Oracle says model what will suit your business best. Don’t get lost in academia. Most schemas are not 100% according to the theoretical
models. Some examples: 3NF schema with denormalized attributes to avoid costly joins, Star schema with multiple hierarchies in same fact table.”

Data warehousing has a 3-step approach —

1) data sources -> staging layer (temp loading layer)
2) staging layer (temp loading layer)-> foundation (logical, data store) layer
3) foundation (logical, data store) layer -> access and performance layer

The foundation layer is usually 3NF the access layer is usually a star or snowflake schema. As for the data sources, they can be varied, you would hope that they are in 3NF (and if they are you can skip the first 2 steps) but they are not always that way.

The 3 P’s of best practice for data warehousing (on Oracle) are power, partitioning, parallelism. The goal of the data warehousing environment is to minimize the amount of data accessed and use the most efficient joins – so it is not so index focused. This may be based on Oracle’s way of doing joins, I am not so sure if it applies to MySQL as well.

Power The weakest link in the chain (the 3 steps above) will define the throughput, so make sure your hardware configuration is balanced. Maria mentioned that as DBAs, “most of the time we don’t have control over this, but we’re still bound to the SLAs.”

This includes hardware that immediately comes to mind such as # of CPUs/cores, speed of CPU, amount of RAM, speed of disk as well as what we may not think of immediately: speed of network switches, speed of disk controllers, number and speed of host BUS adapters. Notes on host BUS adapters (HBAs): Know the # of HBA ports you have. 4 Gb HBA does 400 Mb/sec. 2 Gb HBA does 200 Mb/sec. Make sure there’s enough HBA capacity to sustain the CPU throughput (ie, make sure HBA isn’t the bottleneck). Also the speed at which it all talks. If you have a 4 Gb machine but a 2 Gb switch, you end up having 2 Gb throughput. Upgrade the network at the same time you upgrade machines.

Because we are talking about data warehousing, it is often not possible to eliminate disk I/O, so the goal is to have the fastest I/O throughput possible. Data warehouses need to be sized on I/O throughput not number of I/O’s.

I made a post earlier about how to determine I/O throughput for a system, which used information from this session. Justin Swanhart already pointed out that this is based on the fact that Oracle can do hash joins and MySQL can only do nested loop joins. I wonder, though, if there is indeed no case when using MySQL for which I/O throughput is a more useful metric than iops.

Disk arrays that are expensive are usually sized for iops, not throughput, and because they’re expensive the disk array is shared throughout the company. A DBA needs to ask ‘how many connections into the storage array do I have? How many disk controllers do I have? Where are my physical disks, and which controllers are they hanging off of?’

Typical 15k rpm disk can do about 25-35 Mb/sec (per disk) random i/o’s. Disk manufacturers will throw out numbers like 200-300 Mb/sec but that’s sequential I/O and leading edge of the drive. Make sure all your LUNs are not coming off the same set of disks, so that you’re not conflicting on disk seeks.

Continue to part 2, partitioning.


PlanetMySQL Voting: Vote UP / Vote DOWN

mk-schema-change? Check out ideas from oak-online-alter-table

Март 10th, 2010

In response to Mark Callaghan’s post mk-schema-change.

I apologize for not commenting on the post itself, I do not hold a Facebook account. Anyway this is a long write, so it may as well deserve a post of its own.

Some of the work Mark is describing already exists under openark kit’s oak-online-alter-table. Allow me to explain what I have gained there, and how the issue can be further pursued. There is relevance to Mark’s suggestion.

oak-online-alter-table uses a combination of locks, chunks and triggers to achieve an almost non-blocking ALTER TABLE effect. I had a very short opportunity to speak with Mark on last year’s conference, in between bites. Mark stated that anything involving triggers was irrelevant in his case.

The triggers are a pain, but I believe a few other insights from oak-online-alter-table can be of interest.

The first attempt

My first attempt with the script assumed:

  • Table has an AUTO_INCREMENT PRIMARY KEY column
  • New rows always gain ascending PRIMARY KEY values
  • PRIMARY KEY never changes for an existing row
  • PRIMARY KEY values are never reused
  • Rows may be deleted at will
  • No triggers exist on the table
  • No FOREIGN KEYs exist on the table.

So the idea was: when one wants to do an ALTER TABLE:

  1. Create a ghost table with the new structure.
  2. Read the minimum and maximum PK values.
  3. Create AFTER INSERT, AFTER UPDATE, AFTER DELETE triggers on the original table. These triggers will propagate the changes onto the ghost table.
  4. Working out slowly, and in small chunks, copy rows within recorded min-max values range into the ghost table. The interesting part is where the script makes sure there’s no contradiction between these actions and those of the triggers, (whichever came first!). This is largely solved using INSERT IGNORE and REPLACE INTO in the proper context.
  5. Working out slowly and in chunks again, we remove rows from the ghost table, which are no longer existent in the original table.
  6. Once all chunking is complete, RENAME original table to *_old, and ghost table in place of the original table.

Steps 4 & 5 are similar in concept to transactional recovery through redo logs and undo logs.

The next attempt

Next phase removed the AUTO_INCREMENT requirement, as well as the “no reuse of PK”. In fact, the only remaining constraints were:

  • There is some UNIQUE KEY on the table which is unaffected by the ALTER operation
  • No triggers exist on the table
  • No FOREIGN KEYs exist on the table.

The steps are in general very similar to those listed previously, only now a more elaborate chunking method is used with possible non-integer, possible multi-column chunking algorithm. Also, the triggers take care of changes in UNIQUE KEY values themselves.

mk-schema-change?

Have a look at the wiki pages for OnlineAlterTable*. There is some discussion on concurrency issues; on transactional behavior, which explains why oak-online-alter-table performs correctly. Some of these are very relvant, I believe, to Mark’s suggestion. In particular, making the chunks copy; retaining transactional integrity, etc.

To remove any doubt, oak-online-alter-table is not production ready or anywhere near. Use at your own risk. I’ve seen it work, and I’ve seen it crash. I got little feedback and thus little chance to fix things. I also didn’t touch the code for quite a few months now, so I’m a little rusty myself.


PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL University: Performance Schema: Instrumenting Code

Февраль 15th, 2010

This Thursday (February 18th, 17:00 UTC - note the different starting time!), Marc Alff will present the brand new Performance Schema: Instrumenting Code. Marc is the implementor of the Performance Schema feature. His presentation will show how developers maintaining

  • the MySQL server code,
  • a storage engine,
  • or any type of plugin

can instrument the code to collect data for the new performance schema instrumentation.

For MySQL University sessions, point your browser to this page. You need a browser with a working Flash plugin. You may register for a Dimdim account, but you don't have to. (Dimdim is the conferencing system we're using for MySQL University sessions. It provides integrated voice streaming, chat, whiteboard, session recording, and more.

MySQL University is a free educational online program for engineers/developers. MySQL University sessions are open to anyone. All sessions (slides & audio) are recorded; the links will be on the respective MySQL University session pages which are listed on the MySQL University home page.

Here's the tentative list of upcoming sessions:

  • February 25: Securich - Security Plugin for MySQL (Darren Cassar)
  • March 4: MySQL Column Databases (Robin Schumacher)
  • March 11: Improving MySQL Full-Text Search (Kristofer Pettersson)

By the way, did I mention that we need more speakers to fill up the 2010 schedule? If you'd like to be a speaker, have a look at this blog article!


PlanetMySQL Voting: Vote UP / Vote DOWN