Archive for the ‘BLOB’ Category

Information on Bug#12704861 (which doesn’t exist in any public bug tracker)

November 21st, 2011

Some of you may be aware that MySQL is increasingly using an Oracle-internal bug tracker. You can see these large bug numbers mentioned alongside smaller public bug numbers in recent MySQL release notes. If you’re particularly unlucky, you just get a big Oracle-internal bug number. For one recently fixed bug, I dug further and posted what I found on the Percona blog: http://www.mysqlperformanceblog.com/2011/11/20/bug12704861/

Possibly interesting reading for those of you who are interested in InnoDB, MySQL, BLOBs and crash recovery.



Why a new MEMORY engine may change everything

September 26th, 2011

I’m sure you are aware that the latest Percona Server release includes a new, improved MEMORY storage engine for MySQL.
This new engine is based on Dynamic Row Format and offers some great features, especially for VARCHAR, VARBINARY, TEXT and BLOB fields in MEMORY tables.

But this new MEMORY engine by Percona has some limitations, and Percona Server doesn’t use it for its internal temporary tables yet. So I would like to talk about the real benefits of having a brand new MEMORY engine based on Dynamic Row Format, especially for internal memory tables.

First, recall (or discover) how MySQL uses internal memory tables, and the characteristics and limitations of the MEMORY storage engine.

So, the MEMORY storage engine converts all VARCHAR fields into CHAR fields, both for internal temporary tables and for user-created memory tables.

1. Let me explain the problem with a simple example:

I’ve created an InnoDB table (without indexes) with two VARCHAR fields, VARCHAR(50) and VARCHAR(100):

mysql> show table status like 'test_memory5'\G
*************************** 1. row ***************************
           Name: test_memory5
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 621089
 Avg_row_length: 66
    Data_length: 41484288
Max_data_length: 0
   Index_length: 0
      Data_free: 5242880
 Auto_increment: NULL
    Create_time: 2011-09-21 13:10:17
    Update_time: NULL
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options:
        Comment:

This table takes about 48 MB on disk and holds more than 600,000 rows:

-rw-rw---- 1 mysql mysql  48M 2011-09-21 13:11 test_memory5.ibd

Now I create a new memory table, test_memory6, with exactly the same structure, and I set the parameters for memory tables like this:

  • set tmp_table_size=50*1024*1024;
  • set max_heap_table_size=50*1024*1024;

That means I can create a memory table of at most 50 MB.
So let me insert my 600,000 rows into this table:

mysql> insert into test_memory6 select * from test_memory5;
ERROR 1114 (HY000): The table 'test_memory6' is full

My 50 MB memory table can’t hold the 48 MB of the InnoDB table!
Let’s try with 80 MB:

mysql> set tmp_table_size=80*1024*1024;
Query OK, 0 rows affected (0.00 sec)
mysql> set max_heap_table_size=80*1024*1024;
Query OK, 0 rows affected (0.00 sec)
mysql> insert into test_memory6 select * from test_memory5;
ERROR 1114 (HY000): The table 'test_memory6' is full

And this error keeps occurring until the maximum memory table size is raised to 110 MB!

Why? Because the two VARCHAR fields of the InnoDB table are converted into CHAR fields by the MEMORY storage engine, so every row takes the full declared width.
Let’s look at this example from the MySQL documentation:

Value       CHAR(4)  Storage Required  VARCHAR(4)  Storage Required
''          '    '   4 bytes           ''          1 byte
'ab'        'ab  '   4 bytes           'ab'        3 bytes
'abcd'      'abcd'   4 bytes           'abcd'      5 bytes
'abcdefgh'  'abcd'   4 bytes           'abcd'      5 bytes
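The storage figures in that table follow a simple rule, which the sketch below reproduces. It assumes a single-byte character set such as latin1 (the collation of the test table) and non-strict truncation of over-long values, as in the last row of the table. The function names are made up for illustration and are not MySQL functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// CHAR(n) with a single-byte character set: always n bytes, because the
// value is right-padded with spaces up to the declared length.
std::size_t char_storage_bytes(std::size_t declared_len) {
    return declared_len;
}

// VARCHAR(n) with a single-byte character set: the stored bytes plus a
// length prefix (1 byte when the column needs at most 255 bytes, else 2).
std::size_t varchar_storage_bytes(std::size_t declared_len, const std::string& value) {
    assert(declared_len > 0);
    // Assumption: over-long values are truncated to the declared length.
    const std::size_t stored = std::min(value.size(), declared_len);
    const std::size_t prefix = declared_len <= 255 ? 1 : 2;
    return stored + prefix;
}
```

For a VARCHAR(100) holding a 20-character value this gives 21 bytes, while the same column converted to CHAR(100) always costs 100 bytes.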

By the way, this is a very good reason to size your VARCHAR fields carefully.

Conclusion: a memory table can be much bigger than the equivalent InnoDB table.
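A back-of-the-envelope estimate shows why the 50 MB and 80 MB limits were never going to be enough. The sketch below only counts column data at full declared width (50 + 100 = 150 bytes per row, assuming one byte per character with latin1) and ignores whatever per-row bookkeeping the engine adds, so it is a lower bound, not the real footprint. The row count is the estimate reported by SHOW TABLE STATUS, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Lower bound for the column data of a table whose rows are all fixed width.
// Per-row engine overhead is deliberately ignored, so real usage is higher.
std::uint64_t fixed_width_data_bytes(std::uint64_t rows, std::uint64_t row_width) {
    return rows * row_width;
}

// True when that lower bound fits under a max_heap_table_size-style limit.
bool fits_under_limit(std::uint64_t rows, std::uint64_t row_width,
                      std::uint64_t limit_bytes) {
    assert(row_width > 0);
    return fixed_width_data_bytes(rows, row_width) <= limit_bytes;
}
```

With 621,089 rows of 150 bytes, the column data alone comes to 93,163,350 bytes (about 89 MiB), which already exceeds both the 50 MB and the 80 MB settings, whereas InnoDB reports an average row length of only 66 bytes for the same data.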

2. Let me explain why a new memory engine may change everything:

Changing the rules for memory tables may change everything for at least two reasons:

  • VARCHAR, VARBINARY, TEXT and BLOB fields will be supported by this new engine for user-created memory tables (Percona Server can already do this, with restrictions)
  • Internal memory tables could be more efficient with Dynamic Row Format

The real benefits will come with internal memory tables. How often do you see this when you EXPLAIN your queries?
Extra: Using where; Using temporary; Using filesort

So, for each query that uses a temporary table, MySQL could use less memory (RAM)!
I don’t know how large the real benefit would be, but I am convinced it can be significant.

I look forward to seeing more benchmarks on this with the latest Percona Server release, and I hope that Percona Server or MariaDB will soon support Dynamic Row Format for internal memory tables.

Please let us know if you have already tested this new Percona MEMORY engine.



PBMS in Drizzle

July 8th, 2010

Some of you may have noticed that blob streaming has been merged into the main Drizzle tree recently. There are a few hooks inside the Drizzle kernel that PBMS uses, and everything else is just in the plugin.

For those not familiar with PBMS, it does two things: it provides a place (not in the table) for BLOBs to be stored (locally on disk or even out to S3), and it provides an HTTP interface to get and store BLOBs.

This means you can do really neat things such as have your BLOBs replicated, consistent and all those nice databasey things as well as easily access them in a scalable way (everybody knows how to cache HTTP).

This is a great addition to the AlsoSQL arsenal of Drizzle. I’m looking forward to it advancing and being adopted (much easier now that it’s in the main repository).



BLOBS in the Drizzle/MySQL Storage Engine API

May 26th, 2010

Another (AFAIK) undocumented part of the Storage Engine API:

We all know what a normal row looks like in Drizzle/MySQL row format (a NULL bitmap and then column data):

Nothing that special. It’s a fixed-size buffer: Field objects reference into it, and you read out of it and write the values into your engine. However, when you get to BLOBs, we can’t use a fixed-size buffer, as BLOBs may be quite large. So, for a BLOB, the in-row part has two pieces. The first is the length of the BLOB (1, 2, 3 or 4 bytes; in Drizzle it’s only 3 or 4 bytes now, and soon only 4 bytes once we fix a bug that isn’t interesting to discuss here). The second is a pointer to a location in memory where the BLOB is stored. So a row that has a BLOB in it looks something like this:

The size of the pointer is (of course) platform dependent. On 32bit machines it’s 4 bytes and on 64bit machines it’s 8 bytes.
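To make the in-row layout concrete, here is a minimal sketch of packing a length and a pointer into a row slot and reading them back. It assumes a 4-byte length stored in the machine’s native byte order, which is a simplification: the real Field code handles the 1 to 4 byte variants and its own byte-order rules. BlobRef, store_blob_ref and load_blob_ref are hypothetical names, not part of the Drizzle or MySQL API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumption for this sketch: the length prefix is always 4 bytes.
constexpr std::size_t kLengthBytes = 4;
// In-row footprint of one BLOB column: the length, then a pointer.
constexpr std::size_t kBlobSlotBytes = kLengthBytes + sizeof(void*);

// Hypothetical view of what the in-row part of a BLOB column describes.
struct BlobRef {
    std::uint32_t length;
    const unsigned char* data;
};

// Write the length, then the pointer, into the row buffer slot.
void store_blob_ref(unsigned char* slot, const BlobRef& ref) {
    assert(slot != nullptr);
    std::memcpy(slot, &ref.length, kLengthBytes);
    std::memcpy(slot + kLengthBytes, &ref.data, sizeof(ref.data));
}

// Read the length and pointer back out; the BLOB bytes themselves are
// never copied, only referenced.
BlobRef load_blob_ref(const unsigned char* slot) {
    assert(slot != nullptr);
    BlobRef ref{};
    std::memcpy(&ref.length, slot, kLengthBytes);
    std::memcpy(&ref.data, slot + kLengthBytes, sizeof(ref.data));
    return ref;
}
```

Because only the pointer lives in the row, the slot is 8 bytes on a 32-bit build and 12 bytes on a 64-bit build under these assumptions, however large the BLOB is.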

Now, if I were any other source of documentation, I’d stop right here.

But I’m not. I’m a programmer writing a Storage Engine who now has the crucial question of memory management.

When your engine is given the row from the upper layer (such as in doInsertRecord()/write_row()), you don’t have to worry: for the duration of the call, the memory will be there. Don’t count on it being there afterwards, though, so if you’re not going to immediately splat it somewhere, make your own copy.

For reading, you are expected to provide a pointer to a location in memory that is valid until the next call to your Cursor. For example, an rnd_next() call reads a BLOB field and your engine provides a pointer. At the subsequent rnd_next() call (or at doStopTableScan()/rnd_end()), the engine can free that pointer.

HOWEVER, this is true except for index_read_idx_map(), which in the default implementation in the Cursor (handler) base class ends up doing a doStartIndexScan(), index_read(), doEndIndexScan(). This means that if a BLOB was read, the engine could have (quite rightly) freed that memory already. In this case, you must keep the memory around until either a reset() or extra(HA_EXTRA_FLUSH) call.
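One conservative way to satisfy both rules is to keep every BLOB buffer handed to the upper layer alive until the engine is told to reset. The sketch below shows that policy with a hypothetical helper class: BlobBufferKeeper is not part of the Drizzle or MySQL API, and a real engine would wire release_all() into its own reset() and extra(HA_EXTRA_FLUSH) handling. It trades memory for safety, since buffers from ordinary scans are held longer than strictly required.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical helper, not part of the Drizzle or MySQL API.
// Owns copies of BLOB data so the pointers given to the upper layer stay
// valid until the engine explicitly releases them.
class BlobBufferKeeper {
public:
    // Copy the BLOB bytes into engine-owned memory and return a pointer
    // that remains valid until release_all() is called.
    const unsigned char* keep(const unsigned char* data, std::size_t len) {
        assert(data != nullptr || len == 0);
        std::unique_ptr<unsigned char[]> copy(new unsigned char[len]);
        if (len > 0) {
            std::memcpy(copy.get(), data, len);
        }
        buffers_.push_back(std::move(copy));
        return buffers_.back().get();
    }

    // Intended to be called from the engine's reset()-style hook.
    void release_all() { buffers_.clear(); }

    std::size_t held() const { return buffers_.size(); }

private:
    // unique_ptr keeps each allocation at a stable address even when the
    // vector itself reallocates.
    std::vector<std::unique_ptr<unsigned char[]>> buffers_;
};
```

An engine that wants to free memory earlier during plain table scans could release on the next rnd_next() instead, but it would then need a separate path for the index_read_idx_map() case described above.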

This exception is tested (by accident) by a whole single query in type_blob.test – a monster of a query that’s about a seven way join with a group by and an order by. It would be quite possible to write a fairly functional engine and completely miss this.

Good luck.

This blog post (but not the whole blog) is published under the Creative Commons Attribution-Share Alike License. Attribution is by linking back to this post and mentioning my name (Stewart Smith).

