Archive for the ‘partition’ Category

On LVM: How to setup Volume Groups and Logical Volumes.

Май 28th, 2011
LVM (Logical Volume Management) is a very important tool to have in the toolkit of a MySQL DBA. It allows you to create and extend logical volumes on the fly. This allows me to, say, add another disk and extend a partition effortlessly. The other very important feature is the ability to take snapshots, that you can then use for backups. All in all its a must have tool. Hence, this guide will allow you to understand various terminologies associated with LVM, together with setting up LVM volumes and in a later part will also show you how to extend...
PlanetMySQL Voting: Vote UP / Vote DOWN

Data Warehousing Best Practices: Comparing Oracle to MySQL, part 2 (partitioning)

Июль 30th, 2010

At Kscope this year, I attended a half day in-depth session entitled Data Warehousing Performance Best Practices, given by Maria Colgan of Oracle. My impression, which was confirmed by folks in the Oracle world, is that she knows her way around the Oracle optimizer.

See part 1 for the introduction and talking about power and hardware. This part will go over the 2nd “P”, partitioning. Learning about Oracle’s partitioning has gotten me more interested in how MySQL’s partitioning works, and I do hope that MySQL partitioning will develop to the level that Oracle partitioning does, because Oracle’s partitioning looks very nice (then again, that’s why it costs so much I guess).

Partition – Larger tables or fact tables can benefit from partitioning because it makes data load easier and can increase join performance and use data elimination. Parallel execution can be done with partitioning due to partition pruning. The degree of parallelism should be a power of 2, because of hash-based algorithm in hash partitioning. To translate this to the MySQL world, if you are using LINEAR HASH partitioning, then you should use a degree of parallelism that is a power of 2 (I checked, and indeed. Otherwise, use a degree of parallelism that makes sense given the number of partitions you have.

One important note that during Pythian’s testing of MySQL partitioning, we found that all partitions were locked when an INSERT occurs, for the duration of the INSERT. Bulk-loading with MySQL partitioning is not as fast as it would be if MySQL allowed partition pruning for INSERTs.

So, what should be partitioned? For the first level of partitioning, the goal is to enable partitioning pruning and simplify data management. The most typical partitioning is range or interval partitioning on a date column. Interval partitioning is you say what the partition is (date, month) and partition is automatically created. MySQL does not have interval partitioning, and I have seen typical first-level partitioning be range or list based on a date or timestamp column. Note that if you use a timestamp field, the partitioning expression is optimized if you use TO_DAYS(timestamp_field) or YEAR(timestamp_field). In my experience, using anything else (such as DATE(timestamp_field)) actually makes partitioning slower than not using partitioning at all. Note that this is based on tests I did a few months ago, and your mileage may vary.

So — how do you decide partitioning strategy? Ask yourself:

  • What range of data do the queries touch – a quarter, a year?
  • What is the data loading frequency?
  • Is an incremental load required?
  • How much data is involved, a day, a week, a month?

The answers to the above questions will tell you about how big your interval needs to be. The best scenario is that all answers are the same, “we load every day, and people query by day.” If the answers are different weight access a higher priority than loading, because most people care more about query performance than performance of ETL.

This is true even if your intervals have different sizes — ie sales per day are much bigger in Dec but that’s OK. However, Maria recommends that the subpartition be as evenly divided as possible.

Easier to look at more partitions than to look at a partition that’s too big. But you don’t want too many partitions, max Oracle allows partitions is 1 million partitions, prior to 11g it was 64,000. “Stick closer to 64,000 than 1 million”. MySQL’s limitation is 1024 per table.

For the second level of partitioning, also called subpartitioning, the goal is to allow for multi-level pruning and improve join performance. In Oracle, the most typical subpartition is hash or list – in MySQL, you can only subpartition by hash or key.

How do you decide subpartitioning strategy?

  • Select the dimension queried most frequently on the fact table OR
  • Pick the common join column

For example, if you want to look at sales per day, per store, you would choose “per day” as the partition and “per store” as the subpartition.

If you do not have a good partition on logical elements (like grouping), then you can subpartition using hash partitioning on common joins — perhaps surrogate keys, or using join key of the largest table involved in the join.

For example, if the sales table is partitioned and another big table is product, you can hash subpartition product_id.

Because there’s overhead in partitions (loading metadata, reading metadata), make sure size of partitions and subpartitions is >20 Mb. So better to have a 30 Mb subpartition than a 15 Mb subpartition. [I have no idea if this is true in MySQL or not -- I think the general concept is true, because there is some overhead, but I have no idea about the 20 Mb figure and why that's true for Oracle, nor do I know what is true in MySQL.]

One easy calculation is double the # of CPUs, round up to nearest power of 2. If you’re executing in parallel, Oracle will use 2x CPUs. (all this advice, by the way, follows 80/20 rule, this is probably good for about 80% of the environments out there). Of course, MySQL does not do parallel execution very well, so this probably does not apply.

Oracle knows it can get partition elimination while it does a join.

If 2 tables have the same degree of parallelism (same # of buckets) and are partitioned in the same way on the join column (say, customer_id in a subpartition of sales and a partition of customer), Oracle will match the partitions when joining:

sales table joined with customer table can change into 4 small joins:
sales sub part 1 joins with customer part 1
sales sub part 2 joins with customer part 2
sales sub part 3 joins with customer part 3
sales sub part 4 joins with customer part 4

And with parallelism, the total time is now reduced to the time it takes to do one of those smaller joins.

This is also why you want to have a power of 2 for buckets – because cores/processors come in powers of 2. Partition-wise joins like this can also be done with range or list, assuming both tables in the join have the same buckets.

I have no idea if MySQL partitioning works this way, but it’s certainly a functionality that makes sense to me.


PlanetMySQL Voting: Vote UP / Vote DOWN

Using MySQL Partitioning Instead of MERGE Tables

Июль 24th, 2010

One common question I get is how to use partitioning instead of MERGE tables. The process I use involves using stored procedures to create and drop partitions. This article will go over the stored procedures I use; special thanks to Roland Bouman for taking a look and giving great feedback to optimize this process.

First, a simple table, not partitioned (yet):

use test;
DROP TABLE IF EXISTS my_part;
CREATE TABLE IF NOT EXISTS my_part (
  id int NOT NULL,
  creationDate datetime NOT NULL,
  PRIMARY KEY (id,creationDate)
) ENGINE=InnoDB;

In real, life there is more to the table than just id and creationDate. The most important part is that the partitioned field(s) need to be part of the primary key.

Now, add the partition definition. This can be done in the CREATE statement, but I have found that it is easier for me to think about what fields and indexes I want first, and then worry about partitioning, so I naturally gravitate towards defining the fields and indexes first and then altering the table to add partitioning. This also helps me think about how to modify an existing table to add partitioning.

ALTER TABLE my_part PARTITION BY RANGE (TO_DAYS(creationDate)) (
partition 2010_07_01 values less than (to_days('2010-07-02')),
partition 2010_07_02 values less than (to_days('2010-07-03')),
partition 2010_07_03 values less than (to_days('2010-07-04')),
partition 2010_07_04 values less than (to_days('2010-07-05'))
);

This makes it pretty clear what is happening — the idea is to give the partition names actual dates that they hold, so that it is easy to see what partitions need to be added and deleted.

Deleting partitions

I find that making stored procedures makes things easy….so I will define a procedure called partition_drop to drop partitions. The partition_drop stored procedure takes in a table name and schema name to drop partitions from, and a date to delete up through and including that date. Here’s the procedure:

DELIMITER ||
DROP PROCEDURE IF EXISTS partition_drop ||

CREATE PROCEDURE partition_drop (IN through_date date, IN tbl varchar(64), IN db varchar(64))
BEGIN
DECLARE delete_me varchar(64);
DECLARE notfound BOOL DEFAULT FALSE;
DECLARE pname CURSOR FOR SELECT PARTITION_NAME
FROM INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_NAME=tbl AND TABLE_SCHEMA=db
AND DATE(PARTITION_NAME)!= 0
AND DATE(PARTITION_NAME) IS NOT NULL
AND DATE(PARTITION_NAME)<=through_date;

DECLARE CONTINUE HANDLER FOR NOT FOUND SET notfound:=TRUE;
OPEN pname;

cursor_loop: LOOP
FETCH pname INTO delete_me;
IF notfound THEN LEAVE cursor_loop; END IF;
SET @alter_stmt:=CONCAT("ALTER TABLE ",db,".",tbl," DROP PARTITION ",delete_me);

# sanity check commented out for production use
# SELECT @alter_stmt;
PREPARE stmt_alter FROM @alter_stmt; EXECUTE stmt_alter; DEALLOCATE PREPARE stmt_alter;

END LOOP;
CLOSE pname;
END ||
DELIMITER ;

Go ahead and run CALL partition_drop('2010-07-02','my_part','test'); to verify that SHOW CREATE TABLE my_part; shows that the desired partitions have been dropped.

Adding partitions

Adding partitions is similar to deleting partitions — using a stored procedure that takes in the date to add partitions up to that date. It will not try to add so many partitions that the table will have more than 1024, and it won’t add any partitions that already exist.

DELIMITER ||
DROP PROCEDURE IF EXISTS partition_add ||

CREATE PROCEDURE partition_add (IN through_date date, IN tbl varchar(64), IN db varchar(64))
BEGIN
DECLARE add_me char(10);
DECLARE max_new_parts,add_cnt smallint unsigned default 0;
SELECT 1024-COUNT(*) AS max_new_parts,
SUM(CASE WHEN
 DATE(PARTITION_NAME)>=through_date then 1 else 0
 END)
INTO max_new_parts, add_cnt
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE  TABLE_SCHEMA = db
 AND TABLE_NAME = tbl;

IF add_cnt=0 THEN
BEGIN
SELECT MAX(DATE(PARTITION_NAME)) INTO add_me
FROM INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_NAME=tbl and TABLE_SCHEMA=db
AND DATE(PARTITION_NAME)<through_date;

# to do: declare handler for exceptions here
IF DATEDIFF(through_date,add_me)+1 < max_new_parts THEN
BEGIN
WHILE add_me<through_date do BEGIN
SET add_me:=DATE_FORMAT(add_me + INTERVAL 1 DAY,"%Y_%m_%d");
SET @alter_stmt:=CONCAT("ALTER TABLE ",db,".",tbl," ADD PARTITION (PARTITION ",add_me," VALUES LESS THAN (TO_DAYS('",add_me+INTERVAL 1 DAY, "')))" );

PREPARE stmt_alter FROM @alter_stmt; EXECUTE stmt_alter; DEALLOCATE PREPARE stmt_alter;

END;
END WHILE;
END;
END IF;
END;
END IF;
END ||
DELIMITER ;

Here’s how to call that stored procedure:
CALL partition_add ('2010_07_10', 'my_part','test');

Caveat: This stored procedure will only add in partitions from the maximum partition name/date until the date you add it in, it will not fill in any gaps. However, it is possible to add in another input parameter to be the “start date” and change the stored procedure to add the partition if it does not exist, from the start date through the end date.

Note: It was pointed out by Roland Bouman that it would be better to change both the add and drop stored procedures to do one ALTER TABLE instead of sequential ones. He wrote:

It’s probably better to generate a single statement to drop / add all partitions.

So the general pattern would be:

- generate and concatenate lines for each partition
- one single sequence of prepare, execute, deallocate to execute one DDL statement.

For the drop partition procedure, this approach would allow you to do away with the cursor. You can simply use GROUP_CONCAT in a single SELECT…INTO statement to generate the entire DDL statement.

Note: Roland also mentioned that these stored procedures could be generalized to use any interval, for example if each table partition held 7 days. The only limitation is that to get the optimal partition performance with dates, use either TO_DAYS(date_field) or YEAR(date_field) as the partitionining function.

Putting it all together

The procedure: daily I would run the following in MySQL, to keep only the previous “x” days:

CALL partition_drop('CURRENT_DATE()-INTERVAL x DAY', 'tbl_name','schema_name');

To add new partitions, I would do:
CALL partition_add('CURRENT_DATE()-INTERVAL 32 DAY', 'tbl_name','schema_name');

It is run daily just in case; the side benefit is that it will catch any gaps if you do not modify the stored procedure to include a start date for when to add. Adding a partition does not take a long time, because there’s no data to reorganize.

I would make a daily MySQL event, and then everything is stored nicely in the database, and backed up with the database. Others may choose to run the stored procedures in a cron script or scheduled task (Windows), but that requires a password to be stored somewhere — either in the script, somewhere the script can read, or in a config file such as .my.cnf.


PlanetMySQL Voting: Vote UP / Vote DOWN

Kontrollkit – new backup script is partition space aware

Март 31st, 2010
I’ve been wanting to write a backup script for a while now that does the following: reads the partition information for the directory that you are backing up into and computes the used/available/percentage-available space. Then it reads the total data size from the MySQL tables and ensures that you have enough space on disk (for [...]
PlanetMySQL Voting: Vote UP / Vote DOWN

InnoDB’s tablespace ids and Partitions

Март 22nd, 2010
There are times when what you have is a partially running database and a bunch of backup innodb tablespace files (the .ibd files). If you're using innodb_file_per_table, then you have a separate .ibd file for each InnoDB table.

Now, you have your running database with a bunch of tables, and you want to replace some of them with the backup .ibd files. According to the MySQL docs, you'd do this:
  1. ALTER TABLE foo DISCARD TABLESPACE; (this deletes the current .ibd file)
  2. copy the old .ibd file into your database directory
  3. ALTER TABLE foo IMPORT TABLESPACE;
Assuming your .ibd file was from the same database and you did not drop the table and recreate it sometime between when you made the backup .ibd and now, this should work.Except... if you use partitions. If your table foo uses partitions, ie, its create statement was something like this:
CREATE TABLE foo (
...
) PARTITION BY ... (
PARTITION p0 ...,
);
In this case, you cannot discard the tablespace, and the first alter command throws an error:
mysql> ALTER TABLE foo DISCARD TABLESPACE;

ERROR 1031 (HY000): Table storage engine for 'foo' doesn't have this option
I have not investigated if there are workarounds for this, but I do have a little more information on what's happening.Remember that each .ibd file is a tablespace. For a partitioned table, there are multiple .ibd files, one for each partition. The table's files look like this:
foo.frm
foo.par
foo#P#p0.ibd
foo#P#p1.ibd
...
Where p0, p1, etc. are the partition names that you specified in the create statement. Each partition is a different tablespace and has its own tablespace id. When you create an InnoDB table without partitioning, the internal tablespace id counter is incremented by 1. When you create an InnoDB table with paritions, the internal tablespace id counter is incremented by the number of partitions. The actual tablespace id is stored in each partition's .ibd file somewhere within the first 100 bytes. I have not attempted to find out where exactly though.

PlanetMySQL Voting: Vote UP / Vote DOWN