<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>PlanetMysql.ru - information about the MySQL DBMS &#187; hacking</title>
	<atom:link href="http://planetmysql.ru/category/hacking/feed/" rel="self" type="application/rss+xml" />
	<link>http://planetmysql.ru</link>
	<description>A blog about MySQL, the most popular DBMS</description>
	<lastBuildDate>Thu, 24 May 2012 17:22:10 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
		<item>
		<title>BlitzDB Crash Safety and Auto Recovery</title>
		<link>http://torum.net/2010/07/blitzdb-crash-safety-and-auto-recovery/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=blitzdb-crash-safety-and-auto-recovery</link>
		<comments>http://torum.net/2010/07/blitzdb-crash-safety-and-auto-recovery/#comments</comments>
		<pubDate>Thu, 22 Jul 2010 09:43:14 +0000</pubDate>
		<dc:creator>Toru Maesaka</dc:creator>
				<category><![CDATA[blitzdb]]></category>
		<category><![CDATA[Drizzle]]></category>
		<category><![CDATA[hacking]]></category>
		<category><![CDATA[OSS]]></category>
		<category><![CDATA[recovery]]></category>
		<category><![CDATA[storage]]></category>

		<guid isPermaLink="false">http://torum.net/?p=2369</guid>
		<description><![CDATA[Crash safety is a big deal in the database league. Lack of durability can lead to all sorts of terrible things upon a catastrophic event. Many projects, especially in the so-called NoSQL world, compromise crash safety in return for higher QPS. The argument there is that the availability of the overall system should be accomplished by replication, since a database server can&#8217;t be rescued if the physical disk breaks. I happen to agree with this philosophy, but I am also aware that this isn&#8217;t the right answer for everyone. So, what will I do with BlitzDB?
Several relational database hackers have pointed out that BlitzDB isn&#8217;t any safer than MyISAM since it doesn&#8217;t guarantee crash safety. This is currently true, but I plan on making BlitzDB much safer than MyISAM by providing the following features.

Auto Recovery Routine (startup option)
Tokyo Cabinet&#8217;s Transaction API (table-specific option)

The second feature above would actually guarantee that BlitzDB is crash safe (especially combined with auto recovery), but I won&#8217;t go into depth in this post since this topic deserves a blog post of its own. Let me just state that this feature will be provided in a form like this:

CREATE TABLE t1 &#40;
  a int PRIMARY KEY,
  b varchar&#40;256&#41;
&#41; ENGINE = BLITZDB, CRASH_SAFE;

From here on, I&#8217;ll cover how I plan on hacking auto recovery in BlitzDB.
Auto Recovery Challenges
As I blogged a while back, recovering Tokyo Cabinet is relatively simple. However, this is not a sufficient solution in BlitzDB since the data file (the hash database that actually holds the rows) and the index file(s) are independent of each other. That is, the data file and the index file(s) are very likely to be inconsistent after a crash. So, how can we hack on this? Pretty simple.
Indexes aren&#8217;t Important at Recovery Phase
Because BlitzDB logically separates the data file and its indexes, index files aren&#8217;t that important. If a server crash has occurred, BlitzDB could delete the index file(s) and recompute them from the data file. Needless to say, this process would involve a lot of random access and computation, but it would not dominate the system&#8217;s overall running time since it&#8217;s a one-time cost. This approach, however, has one flaw: the index files can&#8217;t be recomputed if the data file is broken or unrecoverable.
Therefore to guarantee crash safety, BlitzDB must ensure that the data file is unbreakable. This is precisely where Tokyo Cabinet&#8217;s Transaction API comes in. I&#8217;m planning on using it to protect the data file from breaking. If the data file is protected, the table can be rescued. Simple!
So, that&#8217;s what I have in mind for making BlitzDB a safer engine. Unfortunately, I can&#8217;t start hacking on it immediately since I have several bugs to fix first. Nevertheless, I&#8217;m looking forward to hacking on it. This challenge should be quite fun to tackle.]]></description>
			<content:encoded><![CDATA[<p>Crash safety is a big deal in the database league. Lack of durability can lead to all sorts of terrible things upon a catastrophic event. Many projects, especially in the so-called NoSQL world, compromise crash safety in return for higher QPS. The argument there is that the availability of the overall system should be accomplished by replication, since a database server can&#8217;t be rescued if the physical disk breaks. I happen to agree with this philosophy, but I am also aware that this isn&#8217;t the right answer for everyone. So, what will I do with BlitzDB?</p>
<p>Several relational database hackers have pointed out that BlitzDB isn&#8217;t any safer than MyISAM since it doesn&#8217;t guarantee crash safety. This is currently true, but I plan on making BlitzDB much safer than MyISAM by providing the following features.</p>
<ol>
<li>Auto Recovery Routine (startup option)</li>
<li>Tokyo Cabinet&#8217;s Transaction API (table-specific option)</li>
</ol>
<p>The second feature above would actually guarantee that BlitzDB is crash safe (especially combined with auto recovery), but I won&#8217;t go into depth in this post since this topic deserves a blog post of its own. Let me just state that this feature will be provided in a form like this:</p>

<div><div><pre><span>CREATE</span> <span>TABLE</span> t1 <span>&#40;</span>
  a int <span>PRIMARY</span> <span>KEY</span><span>,</span>
  b varchar<span>&#40;</span><span>256</span><span>&#41;</span>
<span>&#41;</span> ENGINE <span>=</span> BLITZDB<span>,</span> CRASH_SAFE;</pre></div></div>

<p>From here on, I&#8217;ll cover how I plan on hacking auto recovery in BlitzDB.</p>
<h3>Auto Recovery Challenges</h3>
<p>As I blogged a while back, <a href="http://torum.net/2010/01/how-to-recover-a-tokyo-cabinet-database-file/">recovering Tokyo Cabinet</a> is relatively simple. However, this is not a sufficient solution in BlitzDB since the data file (the hash database that actually holds the rows) and the index file(s) are independent of each other. That is, the data file and the index file(s) are very likely to be inconsistent after a crash. So, how can we hack on this? Pretty simple.</p>
<h3>Indexes aren&#8217;t Important at Recovery Phase</h3>
<p>Because BlitzDB logically separates the data file and its indexes, index files aren&#8217;t that important. If a server crash has occurred, BlitzDB could delete the index file(s) and recompute them from the data file. Needless to say, this process would involve a lot of random access and computation, but it would not dominate the system&#8217;s overall running time since it&#8217;s a one-time cost. This approach, however, has one flaw: the index files can&#8217;t be recomputed if the data file is broken or unrecoverable.</p>
<p>Therefore to guarantee crash safety, BlitzDB must ensure that the data file is unbreakable. This is precisely where Tokyo Cabinet&#8217;s Transaction API comes in. I&#8217;m planning on using it to protect the data file from breaking. If the data file is protected, the table can be rescued. Simple!</p>
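<p>The recovery idea above can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not BlitzDB code: a plain dict stands in for the data file, an index is a mapping from a column value to the primary keys that hold it, and the names rebuild_indexes, data_file and indexed_columns are hypothetical.</p>

```python
# Toy model of index recovery (assumption: not BlitzDB's real on-disk format).
# The "data file" is a dict of primary key to row; each secondary index maps
# a column value to the list of primary keys whose row holds that value.
from collections import defaultdict


def rebuild_indexes(data_file, indexed_columns):
    """Recompute every secondary index from scratch by scanning all rows."""
    indexes = {column: defaultdict(list) for column in indexed_columns}
    for primary_key, row in data_file.items():
        for column in indexed_columns:
            indexes[column][row[column]].append(primary_key)
    return indexes


# Rows that survived the crash (hypothetical sample data).
data_file = {
    1: {"a": 1, "b": "tokyo"},
    2: {"a": 2, "b": "osaka"},
    3: {"a": 3, "b": "tokyo"},
}

# Pretend the old index files were deleted; rebuild them from the rows alone.
indexes = rebuild_indexes(data_file, ["b"])
print(sorted(indexes["b"]["tokyo"]))
```

<p>The sketch only works because data_file is intact: the rebuild reads nothing else, which is why protecting the data file is the part that matters.</p>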
<p>So, that&#8217;s what I have in mind for making BlitzDB a safer engine. Unfortunately, I can&#8217;t start hacking on it immediately since I have several bugs to fix first. Nevertheless, I&#8217;m looking forward to hacking on it. This challenge should be quite fun to tackle.</p>]]></content:encoded>
			<wfw:commentRss>http://planetmysql.ru/2010/07/22/blitzdb-crash-safety-and-auto-recovery/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>MySQL client tool and how to output newlines</title>
		<link>http://geert.vanderkelen.org/2010/05/mysql-client-tool-and-how-to-output.html?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=mysql-client-tool-and-how-to-output-newlines</link>
		<comments>http://geert.vanderkelen.org/2010/05/mysql-client-tool-and-how-to-output.html#comments</comments>
		<pubDate>Thu, 13 May 2010 11:38:00 +0000</pubDate>
		<dc:creator>Geert Vanderkelen</dc:creator>
				<category><![CDATA[hacking]]></category>
		<category><![CDATA[mysql]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[This blog post explains how to add a newline to strings in MySQL Stored Procedures and how to output the result using the MySQL client tool.
Today I was fooling around with some stored procedure, making it more fancy and stuff. What I wanted was the OUT variable to contain a newline. Easy of course, using CONCAT:
mysql&#62; SELECT CONCAT('foo','\n','bar');
+--------------------------+
&#124; CONCAT('foo','\n','bar') &#124;
+--------------------------+
&#124; foo
bar                  &#124;
+--------------------------+
Now, if you concat strings in a stored procedure, it doesn't work as expected when you run it through the MySQL client tool mysql:
DELIMITER //
CREATE PROCEDURE sp1(OUT pres VARCHAR(6000))
BEGIN
  SET pres = CONCAT('foo','\n','bar');
END;
//
DELIMITER ;

SET @res = 'foo ';
CALL sp1(@res);
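SELECT @res;
When we execute it, we get this:
shell&#62; mysql -N test &#60; foo.sql
foo\nbar
What on earth is wrong? After some looking, we found a rarely used option called --raw. This produces the desired effect:
shell&#62; mysql -Nr test &#60; foo.sql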
foo
bar
But that&#039;s not all! Use \G when selecting the OUT-variable and it also works. The output is not so useful though.
Ah.. The things you find out while having the day off..]]></description>
			<content:encoded><![CDATA[<p><i>This blog post explains how to add a newline to strings in <a href="http://dev.mysql.com">MySQL</a> <a href="http://dev.mysql.com/doc/refman/5.1/en/stored-programs-views.html">Stored Procedures</a> and how to output the result using the MySQL client tool.</i></p><p>Today I was fooling around with <a href="http://geert.vanderkelen.org/2010/02/stuffing-gaps-in-collations-table-using.html">some stored procedure</a>, making it more fancy and stuff. What I wanted was the <tt>OUT</tt> variable to contain a newline. Easy of course, using <a href="http://dev.mysql.com/doc/refman/5.1/en/string-functions.html#function_concat"><tt>CONCAT</tt></a>:</p><pre>mysql> SELECT CONCAT('foo','\n','bar');
+--------------------------+
| CONCAT('foo','\n','bar') |
+--------------------------+
| foo
bar                  |
+--------------------------+</pre><p>Now, if you <b>concat strings in a stored procedure</b>, it doesn't work as expected when you run it through the MySQL client tool <tt>mysql</tt>:</p><pre>DELIMITER //
CREATE PROCEDURE sp1(OUT pres VARCHAR(6000))
BEGIN
  SET pres = CONCAT('foo','\n','bar');
END;
//
DELIMITER ;

SET @res = 'foo ';
CALL sp1(@res);
SELECT @res;</pre><p>When we execute it, we get this:</p><pre>shell> <b>mysql -N test < foo.sql</b>
foo\nbar</pre><p><b>What on earth is wrong?</b> After some looking, we found a rarely used option called <tt>--raw</tt>. This produces the desired effect:</p><pre>shell> <b>mysql -Nr test < foo.sql</b>
foo
bar
</pre><p>But that's not all! Use <tt>\G</tt> when selecting the <tt>OUT</tt>-variable and it also works. The output is not so useful though.</p><p>Ah.. The things you find out while having the day off..</p>]]></content:encoded>
			<wfw:commentRss>http://planetmysql.ru/2010/05/13/mysql-client-tool-and-how-to-output-newlines/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Simulating server-side cursors with MySQL Connector/Python</title>
		<link>http://geert.vanderkelen.org/2010/04/simulating-server-side-cursors-with.html?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=simulating-server-side-cursors-with-mysql-connectorpython</link>
		<comments>http://geert.vanderkelen.org/2010/04/simulating-server-side-cursors-with.html#comments</comments>
		<pubDate>Mon, 26 Apr 2010 13:09:24 +0000</pubDate>
		<dc:creator>Geert Vanderkelen</dc:creator>
				<category><![CDATA[hacking]]></category>
		<category><![CDATA[myconnpy]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Last week, my colleague Massimo and I discussed how to handle big result sets coming from MySQL in Python. The problem is that MySQL doesn't support server-side cursors, so you need to select everything and then read it. You can do it either buffered or not. MySQL Connector/Python defaults to non-buffered, meaning that after issuing a SELECT statement you need to fetch all rows before you can run another statement on the same connection. You can also turn on the buffering, mimicking what MySQL for Python (MySQLdb) does.
For big result sets, it's better to limit your search. You can do this using an integer primary key or some temporal field, for example. Or you can use the LIMIT keyword. The latter solution is what is used in the MySQLCursorServerSide cursor class. Using the SELECT, it creates a temporary table from which the fetch methods get the rows. It is something people have probably implemented in their applications, but I hope this new class will make it easier since it's done transparently.
The code is not pushed yet, but expect it to be available in the next release. Here is an example of how you could use it. This code selects cities starting with Z and loops over the result, getting the country (yes, this is a simple join made difficult):
import mysql.connector as db  # assumed alias; the post does not show its import

cnx = db.connect(user='root', db='world')
cur = cnx.cursor()
# MySQLCursorServerSide is the new cursor class (not pushed yet)
curCity = cnx.cursor(db.cursor.MySQLCursorServerSide)

curCity.execute("SELECT ID,Name,CountryCode FROM City "
    "WHERE NAME LIKE 'Z%' ORDER BY ID")

# Look up each city's country with a second cursor on the same connection
for city in curCity:
    cur.execute("SELECT Code,Name FROM Country WHERE CODE = %s",
        (city[2],))
    country = cur.fetchone()
    print("%s (%s)" % (city[1], country[1]))

cur.close()
cnx.close()
I guess the main advantage is that you can use two or more cursor objects with the same connection without the need to buffer everything in Python. On the MySQL side, the temporary table could go to disk when too big. It's maybe slower, but keeping big result sets in memory ain't good either.
Comments are welcome!]]></description>
			<content:encoded><![CDATA[<p>Last week, my colleague Massimo and I discussed <strong>how to handle big result sets</strong> coming from <a href="http://dev.mysql.com">MySQL</a> in <a href="http://www.python.org">Python</a>. The problem is that MySQL doesn't support server-side cursors, so you need to select everything and then read it. You can do it either buffered or not. <a href="http://launchpad.net/myconnpy">MySQL Connector/Python</a> defaults to non-buffered, meaning that after issuing a <tt>SELECT</tt> statement you need to fetch all rows before you can run another statement on the same connection. You can also turn on the buffering, mimicking what <a href="http://sourceforge.net/projects/mysql-python/files/">MySQL for Python</a> (MySQLdb) does.</p><p>For big result sets, <strong>it's better to limit your search</strong>. You can do this using an integer primary key or some temporal field, for example. Or you can use the <tt>LIMIT</tt> keyword. The latter solution is what is used in the <tt>MySQLCursorServerSide</tt> cursor class. Using the <tt>SELECT</tt>, it creates a temporary table from which the fetch methods get the rows. It is something people have probably implemented in their applications, but I hope this new class will make it easier since it's done transparently.</p><p><strong>The code is not pushed yet</strong>, but expect it to be available in the next release. Here is an example of how you could use it. This code selects cities starting with <tt>Z</tt> and loops over the result, getting the country (yes, this is a simple join made difficult):</p><pre>import mysql.connector as db  # assumed alias; the post does not show its import

cnx = db.connect(user='root', db='world')
cur = cnx.cursor()
# MySQLCursorServerSide is the new cursor class (not pushed yet)
curCity = cnx.cursor(db.cursor.MySQLCursorServerSide)

curCity.execute("SELECT ID,Name,CountryCode FROM City "
    "WHERE NAME LIKE 'Z%' ORDER BY ID")

# Look up each city's country with a second cursor on the same connection
for city in curCity:
    cur.execute("SELECT Code,Name FROM Country WHERE CODE = %s",
        (city[2],))
    country = cur.fetchone()
    print("%s (%s)" % (city[1], country[1]))

cur.close()
cnx.close()
</pre><p>I guess <strong>the main advantage</strong> is that you can use two or more cursor objects with the same connection without the need to buffer everything in Python. On the MySQL side, the temporary table could go to disk when too big. It's maybe slower, but keeping big result sets in memory ain't good either.</p><p>Comments are welcome!</p>]]></content:encoded>
			<wfw:commentRss>http://planetmysql.ru/2010/04/26/simulating-server-side-cursors-with-mysql-connectorpython/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

