Archive for the ‘ENGINE’ Category

Why a new memory engine may change everything ?

Сентябрь 26th, 2011

I’m sure you are aware that the last Percona server release includes a new improved MEMORY storage engine for MySQL.
This new engine is based on Dynamic Row Format and offers some of great features, specialy for VARCHAR, VARBINARY, TEXT and BLOB fields in MEMORY tables.

But because this new MEMORY engine by Percona has some limitations and because Percona server hasn’t used it for its internal temporary tables yet, I would like to talk about what can be the real benefits to have a brand new MEMORY engine based on Dynamic row format specialy for internal memory tables.

Just remember or discover how MySQL uses internal memory tables

And the MEMORY storage engine characteristics and its limitations

So, the memory storage engine transforms all varchar fields in char fields for internal temporary tables or for user created memory tables. 

1. Let me explain what is the problem with a simple exemple :

I’ve created an InnoDB table (without index) with two varchar fields (50 & 100) :

mysql> show table status like 'test_memory5'\G
*************************** 1. row ***************************
           Name: test_memory5
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 621089
 Avg_row_length: 66
    Data_length: 41484288
Max_data_length: 0
   Index_length: 0
      Data_free: 5242880
 Auto_increment: NULL
    Create_time: 2011-09-21 13:10:17
    Update_time: NULL
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options:
        Comment:

The size of this table is about 48Mb (and more than 600.000 rows) :

-rw-rw---- 1 mysql mysql  48M 2011-09-21 13:11 test_memory5.ibd

Now, I’m creating a new memory table with exactly the same structure and I’m setting paramters for memory tables like this :

  • set tmp_table_size=50*1024*1024;
  • set max_heap_table_size=50*1024*1024;

That’s mean I can create a 50Mb max memory table.
Thus, let me insert my 600.000 rows in this table :

mysql> insert into test_memory6 select * from test_memory5;
ERROR 1114 (HY000): The table 'test_memory6' is full

My 50Mb memory table can’t  contain the 48Mb of the InnoDB table !
Let’s try with 80Mb :

mysql> set tmp_table_size=80*1024*1024;
Query OK, 0 rows affected (0.00 sec)
mysql> set max_heap_table_size=80*1024*1024;
Query OK, 0 rows affected (0.00 sec)
mysql> insert into test_memory6 select * from test_memory5;
ERROR 1114 (HY000): The table 'test_memory6' is full

And this error occurs until the memory tables can have a 110Mb maximum size !

Why ? Because the two varchar fields of the InnoDB table are converted in char fields with the memory storage engine.
Let’s see this example from MySQL documentation :

Value CHAR(4) Storage Required VARCHAR(4) Storage Required
'' '    ' 4 bytes '' 1 byte
'ab' 'ab  ' 4 bytes 'ab' 3 bytes
'abcd' 'abcd' 4 bytes 'abcd' 5 bytes
'abcdefgh' 'abcd' 4 bytes 'abcd' 5 bytes

By the way, this is a very good reason to take care of your varchar fields.

Conclusion : A memory table can be really bigger than an InnoDB table

2. Let  me explain why a new memory engine may change everything :

Changing the rules for memory tables may change everything for, at least, two reasons :

  • VARCHARVARBINARYTEXT and BLOB fields will be supported by this new engine for user created memory tables (Percona server can do it with restrictions)
  • Internal memory tables could be more efficient with Dynamic Row Format
The real benefits will come with internal memory tables, how often do you see that when you “explain” your queries :
Extra: Using where; Using temporary; Using filesort

Thus, for each query using a temporary table, MySQL could use less of memory (RAM) !
I don’t know what would be the real benefit but I am convinced that it can be really significant.

I look forward to see more benchmark about that with last percona server release and I hope that Percona server or MariaDB will support dynamic row format for internal memory tables soon.

Please, let us know if you have already tested this new Percona Memory engine.


PlanetMySQL Voting: Vote UP / Vote DOWN

Federated Tables

Июль 7th, 2010

Your searching for how to create a join across two databases on two different servers and it can’t be done directly.   select  d1.a, d2.b from db1@server1 join db2@server2 where db1.c = db2.c; does not work.

You learn about federated databases.  The federated storage engine allows accesses data in tables of remote databases.  Now how do you make it work?

1) Check if the federated storage engine is supported.  Federation is OFF by default!

mysql> show engines;
+------------+---------+----------------------------------------------------------------+
| Engine     | Support | Comment                                                        |
+------------+---------+----------------------------------------------------------------+
| InnoDB     | YES     | Supports transactions, row-level locking, and foreign keys     |
| MyISAM     | DEFAULT | Default engine as of MySQL 3.23 with great performance         |
| BLACKHOLE  | YES     | /dev/null storage engine (anything you write to it disappears) |
| CSV        | YES     | CSV storage engine                                             |
| MEMORY     | YES     | Hash based, stored in memory, useful for temporary tables      |
| FEDERATED  | YES     | Federated MySQL storage engine                                 |
| ARCHIVE    | YES     | Archive storage engine                                         |
| MRG_MYISAM | YES     | Collection of identical MyISAM tables                          |
+------------+---------+----------------------------------------------------------------+

If it is not “Support”ed (on) you need to add ‘federated=ON‘ to the [mysqld] section of your /etc/my.cnf file.  I found this section to be a bit troublesome.  It must be ‘=ON’ not ‘=YES” or even ‘=on’.   Most options allow these but the federated options is picky.  I’m running MySQL Enterprise 5.1.37.sp1.

2) If you don’t already have the database created, create the database on the storage server.  By ‘storage server’ I mean the one where the data will be written to disk.

I like to create a user just for the purpose of connection the federated copy of the database to the true database.  This way, if the password gets changed or the user deleted, the federated system can continue to connect.

mysql> CREATE DATABASE xfiles;
mysql> USE xfiles;
mysql> CREATE TABLE cases(
 Name VARCHAR(20),
 case TINYINT(3),
) ENGINE = INNODB;

3) Now you can create the federated version of your data on the remote system.

mysql> CREATE DATABASE xfiles;
mysql> USE xfiles;
mysql> CREATE TABLE cases(
 Name VARCHAR(20),
 case TINYINT(3),
) ENGINE = FEDERATED
CONNECTION = 'mysql://skiner:c0nsper@fbi/xfiles/cases';

4) Check your work. The table status should show Engine: FEDERATED.

mysql> use xfiles;
mysql> show table status\G

Now you can add records to the table and the data should show up in select on either server.

Enjoy.


PlanetMySQL Voting: Vote UP / Vote DOWN

GRAPH engine – Mk.II

Октябрь 16th, 2009

The GRAPH engine allows you to deal with hierarchies and graphs in a purely relational way. So, we can find all children of an item, path from an item to a root node, shortest path between two items, and so on, each with a simple basic query structure using standard SQL grammar.

The engine is implemented as a MySQL/MariaDB 5.1 plugin (we’re working on a 5.0 backport for some clients) and thus runs with an unmodified server.

Demo time! I’ll simplify/strip a little bit here for space reasons, but what’s here is plain cut/paste from a running server, no edits

-- insert a few entries with connections (and multiple paths)
insert into foo (origid, destid) values (1,2), (2,3), (2,4), (4,5), (3,6), (5,6);
-- a regular table to join on to
insert into people values (1,"pearce"),(2,"hunnicut"),(3,"potter"),
                          (4,"hoolihan"),(5,"winchester"),(6,"mulcahy");
-- find us the shortest path from pearce (1) to mulcahy (6) please
select group_concat(people.name order by seq) as path
  from foo join people on (foo.linkid=people.id)
  where latch=1 and origid=1 and destid=6;
+--------+--------+--------------------------------+
| origid | destid | path                           |
+--------+--------+--------------------------------+
|      1 |      6 | pearce,hunnicut,potter,mulcahy |
+--------+--------+--------------------------------+
-- find us all people we can get to from potter (3)
select origid,group_concat(people.name order by seq) as destinations
  from foo join people on (foo.linkid=people.id)
  where latch=1 and origid=3;
+--------+----------------+
| origid | destinations   |
+--------+----------------+
|      3 | mulcahy,potter |
+--------+----------------+

-- find us all people from where we can get to hoolihan (4)
select origid,group_concat(people.name order by seq) as origins
  from foo join people on (foo.linkid=people.id)
  where latch=1 and destid=4;
+--------+--------------------------+
| origid | origins                  |
+--------+--------------------------+
|      4 | hoolihan,hunnicut,pearce |
+--------+--------------------------+

So, there you have it. A graph (in this case a simple unidirectional tree, aka hierarchy) that looks like a table to us, as do the resultsets that have been computed.

This is still a early implementation, we’re still enhancing the storage efficiency (in memory) and speed, and adding persistence. We’re also looking for a suitable large dataset that would allow us to seriously test the system, find bugs and assess speed. If you happen to have a large hierarchical structure, but especially a social graph you could obfuscate and give to us, that would be great!

Also, if you’re interested in deploying the GRAPH engine or have questions or additional needs, we’d be happy to talk with you.

select origid,group_concat(people.name order by seq) as destinations from foo join people on (foo.linkid=people.id) where latch=1 and origid=4;
+——–+—————————–+
| origid | destinations                |
+——–+—————————–+
|      4 | mulcahy,winchester,hoolihan |
+——–+—————————–+

PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL University: The Spider Storage Engine

Октябрь 12th, 2009

This Thursday (October 15th, 13:00 UTC), Giuseppe Maxia will present the Spider Storage Engine. Here's from the abstract: Everybody needs sharding. Which is not easy to maintain. Being tied to the application layer, sharding is hard to export and to interact with. The Spider storage engine, a plugin for MySQL 5.1 and later, solves the problem in a transparent way. It is an extension of partitioning. Using this engine, the user can deal transparently with multiple backends in the server layer. This means that the data is accessible from any application without code changes. This lecture will briefly introduce MySQL partitioning, and then show how to create and use the Spider engine, with some practical examples. The talk covers the latest version of the Spider engine, which includes a condition pushdown feature that increases performance significantly.

For MySQL University sessions, point your browser to this page. You need a browser with a working Flash plugin. You may register for a Dimdim account, but you don't have to. (Dimdim is the conferencing system we're using for MySQL University sessions. It provides integrated voice streaming, chat, whiteboard, session recording, and more.) All MySQL University sessions are recorded, that is, slides and voice can be viewed as a Flash movie (.flv). You can find those recordings on the respective MySQL University session pages which are listed on the MySQL University home page.

MySQL University is a free educational online program for engineers/developers. MySQL University sessions are open to anyone, not just Sun employees. Sessions are recorded (slides and audio), so if you can't attend the live session you can look at the recording anytime after the session.

Here's the schedule for the upcoming weeks:

  • October 22: Dual Master Setups With MMM (Arjen Lentz)
  • October 29: MySQL scalability on SPARC & INTEL X5500 (Nehalem) (Benoit Chaffanjon)
  • November 12: Gearman for MySQL (Giuseppe Maxia)
  • November 19: memcached Functions for MySQL (UDFs) (Patrick Galbraith)
  • December 3: Practical Full-Text Search in MySQL (Bill Karwin)

The schedule is not engraved in stone at this point. Please visit http://forge.mysql.com/wiki/MySQL_University#Upcoming_Sessions for the up-to-date list. On that page, you can also find the starting time for many time zones.


PlanetMySQL Voting: Vote UP / Vote DOWN

Comparison Between Solr And Sphinx Search Servers (Solr Vs Sphinx – Fight!)

Сентябрь 3rd, 2009

In the past few weeks I've been implementing advanced search at Plaxo, working quite closely with Solr enterprise search server. Today, I saw this relatively detailed comparison between Solr and its main competitor Sphinx (full credit goes to StackOverflow user mausch who had been using Solr for the past 2 years). For those still confused, Solr and Sphinx are similar to MySQL FULLTEXT search, or for those even more confused, think Google (yeah, this is a bit of a stretch, I know).

Similarities

  • Both Solr and Sphinx satisfy all of your requirements. They're fast and designed to index and search large bodies of data efficiently.
  • Both have a long list of high-traffic sites using them (Solr, Sphinx)
  • Both offer commercial support. (Solr, Sphinx)
  • Both offer client API bindings for several platforms/languages (Sphinx, Solr)
  • Both can be distributed to increase speed and capacity (Sphinx, Solr)

Here are some differences

Related questions

Conclusion

In my experience, Solr is very-very fast on the query side. It is also very powerful. The indexing side is very CPU and memory intensive and is an unfortunate side effect of having such a feature-rich, fast application. Nevertheless, I highly recommend Solr.

For disclaimer purposes, I have not had much experience with Sphinx and, again, all credit for this comparison goes to mausch.

 
Similar Posts:Share/Bookmark
PlanetMySQL Voting: Vote UP / Vote DOWN