Archive for the ‘Opinions’ Category

MySQL command line vs. visual editors — reflections

Февраль 1st, 2012

My previous post drew some attention, and in particular two comments I wish to relate to. I also wish to make a more fine-grained observation on visual editors.

One comment is by Peter Laursen, who rejected the generalization in my post, and another by wlad, who was harsher (but to the point), suggesting my post translated to "I don't know it, therefore it sucks".

I must have delivered the wrong message, since apparently people read my post as "don't use visual editors, they are bad for you", which is not what I intended to say, nor is it what I actually think. I took a very specific aspect of visual editors and commented on that. My comment should not extrapolate to "anything about visual editors is bad".

I don't think people should only be using the command line, and am not in the least interested in vi/emacs/eclipse/netbeans kind of wars. Of course everyone should be using whatever works best for them in terms of productivity and ease of use. And of course visual editors have great advantages over command line.

I'm sorry, it would be childish to assume otherwise, and extra-extra-extrapolation to suggest that was my hidden meaning.

Reflections on Peter's comments

I can see that my post was harsh on visual editors, and agree one cannot just generalize on the dozens of editors available. I have used a few tools, and only to some extent, since I personally feel more comfortable with command line, and so I cannot possibly generalize on all tools. Nor are they all equal.

My point is, there is one de-facto tool, which is the command-line, for which there are no hidden assumptions. For which I don't need to investigate whether a new connection is being opened, or if my query is somehow modified, or... I guess one can say "hey, my tool does nothing of the sort and acts just like the command line". I'm not arguing. The visual editor is an additional layer between you and the server (the mysql command line client is in itself a layer, but a very transparent one, a very low level one), one which needs to take care of visual issues, such as display space, GUI components, widgets not running out of memory, responsiveness, good looks. These, one way or the other, add complexity to the discussion between yourself and your database.

One can say "it's all configurable, there's good documentation, you should know the tool you are using". To some extent the same can be said of the command line client. Here is an observation of mine, please consider it as user input: many (I don't know if most) users will not bother modifying the settings. Many will feel uncomfortable with turning checkboxes on and off, or modify thresholds. It is similar in essence to users not modifying the default InnoDB settings for the MySQL server, working with 8MB of buffer pool.

Anyway I agree my post was over-generalizing on the criticism of visual editors. Sorry for that. It should have been more specific on the advantages of command line.

Reflections on wlad's comments

Since I wanted to concentrate on the command line issue, I didn't elaborate on other stuff. For example, I didn't specify on what type of trainings I ask students to close down their visual editors and open up a command line. Perhaps one could read as "from this moment on, and for the entire duration of the training, no one should be using visual editors, and anyone who does will find himself/herself out of the class". I really didn't think to elaborate on that, and even now - wow, it will take me ages to tell the full story. So I will be extra-brief:

I do make that suggestion. I do that when I start discussion of keys & execution plans. I don't enforce that. I don't mind which tool a developer uses. I actually tell them about various GUI tools they can use (naming the ones I see as most popular among other students).

But I also do not turn my training to "I'm going to do this session using the same tools you are using just because you are using them and you are in this class".

There is a multi-layered reasoning for that:

  • Many times there is a mixture of tools in the classroom. I can't just pick up on one. The command line is the most common tool everyone can agree upon. I sometimes demonstrate on MySQL Workbench since many are familiar with it being an "official" GUI tool by the same company which makes the database.
  • If I dwell on the particulars of a GUI tool, this consumes precious time that must be used to explain the EXPLAIN and talk about INDEX in depth. I just don't have that time on training. The command line consumes very little time for me.
  • I also don't see that I should demonstrate using commercial products. I have nothing against commercial, and on occasion I buy commercial, and I certainly do not slight a commercial product just because it's commercial, but I prefer to sponsor open source. If students have a commercial product, I don't see this is my duty to walk them through the product.
  • There are actually a few more reasons; operating systems, installations, assumptions on installed software, strict system administrators who would not installed anything I would tell them, projectors, font sizes, contrast of colors on projected image, ... Believe me, I've been doing this for years, there are many factors to any small decision in class.

Last, I wish to relate to wlad's "I don't know it. It sucks" attribution to my post.

It is true there is not one single SQL GUI tool I know bottoms up. And the status is deteriorating, since I've long since decided my personal preference is command line. This makes me a non-expert, no-authority on any GUI tool. I actually present myself as such in class. This does not lower the quality of the training; it's an aspect outside the training (for the reasons I mentioned above). I therefore agree to "I don't know it".

Now, I strongly believe I do not make the connection to "therefore it sucks". I think that's an exaggerated extrapolation of what I said. But it is such an overwhelming statement about myself for me to read, that I will keep it in mind. It's a good thing to keep in mind, not only on GUI tools, but also in real life.


PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL command line vs. visual editors

Январь 30th, 2012

Students in my training classes usually prefer to use some kind of visual editor for MySQL. Typically this would be the software they're using at work. Sometimes they just bring over their laptops with the software installed. Or they would use MySQL Workbench, which is what I usually have pre-installed on their desktops.

I see MySQL Workbench, SQLyog, Toad for MySQL, or several more.

I always humbly suggest they close down their software and open up a command line.

It isn't fancy. It may not even be convenient (especially on Windows, in my opinion). And repeating your last command with a minor modification requires a lot of key stroking. Or you would copy+paste from some text editor. Most students will give it a shot, then go back to their favorite editor.

Well, again and again I reach the same conclusion:

Visual editors are not as trustworthy as the command line.

Time and again students show me something on their editor. Behavior seems strange to me. Opening up a console shows a completely different picture.

Things like:

  • The visual editor would open a new connection for every new query (oh, so the @user_defined_variable I've just assigned turns NULL, or the TEMPORARY TABLE disappears).
  • The visual editor will only show 1,000 results, via LIMIT 0,1000. "But the same query runs so much faster on my machine!". Well, sure, a filesort of 1,000,000 rows that can satisfy the first 1,000 will quit early!
  • The visual editor shows table definition graphically. "I didn't realize the index did(n't) cover this and that columns. I didn't realize it only covered first n characters of my VARCHAR.". That's because you can't beat SHOW CREATE TABLE, the definite table structure description.
  • The visual editor allows for export/import/copy/transfer of tables and rows with just one click! "Why is it so complicated in the command line to purge 1,000,000 rows from a table?". Ummm, did you realize the visual editor would typically use a naive approach of doing everything in one huge transaction?
  • The visual editor is smart. But sometimes you don't want smart. You just assume simple. I personally take great precaution with smart solutions. Luckily, with scripts you have so much greater control (i.e. command line options, "dry-run" mode, etc.) that I have greater confidence in them.

I do like it when a visual editor plays it both smart and safe, in such way that before doing its smart work it actually presents you with the query it's going to issue. Which is why I always considered MySQL Query Browser (now replaced by Workbench) to be the visual editor of choice in my classes.

But, at the end of the day, I strongly believe: if you don't know how to do it with command line, you can't really know how it's done.


PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL command line vs. visual editors

Январь 30th, 2012

Students in my training classes usually prefer to use some kind of visual editor for MySQL. Typically this would be the software they're using at work. Sometimes they just bring over their laptops with the software installed. Or they would use MySQL Workbench, which is what I usually have pre-installed on their desktops.

I see MySQL Workbench, SQLyog, Toad for MySQL, or several more.

I always humbly suggest they close down their software and open up a command line.

It isn't fancy. It may not even be convenient (especially on Windows, in my opinion). And repeating your last command with a minor modification requires a lot of key stroking. Or you would copy+paste from some text editor. Most students will give it a shot, then go back to their favorite editor.

Well, again and again I reach the same conclusion:

Visual editors are not as trustworthy as the command line.

Time and again students show me something on their editor. Behavior seems strange to me. Opening up a console shows a completely different picture.

Things like:

  • The visual editor would open a new connection for every new query (oh, so the @user_defined_variable I've just assigned turns NULL, or the TEMPORARY TABLE disappears).
  • The visual editor will only show 1,000 results, via LIMIT 0,1000. "But the same query runs so much faster on my machine!". Well, sure, a filesort of 1,000,000 rows that can satisfy the first 1,000 will quit early!
  • The visual editor shows table definition graphically. "I didn't realize the index did(n't) cover this and that columns. I didn't realize it only covered first n characters of my VARCHAR.". That's because you can't beat SHOW CREATE TABLE, the definite table structure description.
  • The visual editor allows for export/import/copy/transfer of tables and rows with just one click! "Why is it so complicated in the command line to purge 1,000,000 rows from a table?". Ummm, did you realize the visual editor would typically use a naive approach of doing everything in one huge transaction?
  • The visual editor is smart. But sometimes you don't want smart. You just assume simple. I personally take great precaution with smart solutions. Luckily, with scripts you have so much greater control (i.e. command line options, "dry-run" mode, etc.) that I have greater confidence in them.

I do like it when a visual editor plays it both smart and safe, in such way that before doing its smart work it actually presents you with the query it's going to issue. Which is why I always considered MySQL Query Browser (now replaced by Workbench) to be the visual editor of choice in my classes.

But, at the end of the day, I strongly believe: if you don't know how to do it with command line, you can't really know how it's done.


PlanetMySQL Voting: Vote UP / Vote DOWN

MySQL command line vs. visual editors

Январь 30th, 2012

Students in my training classes usually prefer to use some kind of visual editor for MySQL. Typically this would be the software they're using at work. Sometimes they just bring over their laptops with the software installed. Or they would use MySQL Workbench, which is what I usually have pre-installed on their desktops.

I see MySQL Workbench, SQLyog, Toad for MySQL, or several more.

I always humbly suggest they close down their software and open up a command line.

It isn't fancy. It may not even be convenient (especially on Windows, in my opinion). And repeating your last command with a minor modification requires a lot of key stroking. Or you would copy+paste from some text editor. Most students will give it a shot, then go back to their favorite editor.

Well, again and again I reach the same conclusion:

Visual editors are not as trustworthy as the command line.

Time and again students show me something on their editor. Behavior seems strange to me. Opening up a console shows a completely different picture.

Things like:

  • The visual editor would open a new connection for every new query (oh, so the @user_defined_variable I've just assigned turns NULL, or the TEMPORARY TABLE disappears).
  • The visual editor will only show 1,000 results, via LIMIT 0,1000. "But the same query runs so much faster on my machine!". Well, sure, a filesort of 1,000,000 rows that can satisfy the first 1,000 will quit early!
  • The visual editor shows table definition graphically. "I didn't realize the index did(n't) cover this and that columns. I didn't realize it only covered first n characters of my VARCHAR.". That's because you can't beat SHOW CREATE TABLE, the definite table structure description.
  • The visual editor allows for export/import/copy/transfer of tables and rows with just one click! "Why is it so complicated in the command line to purge 1,000,000 rows from a table?". Ummm, did you realize the visual editor would typically use a naive approach of doing everything in one huge transaction?
  • The visual editor is smart. But sometimes you don't want smart. You just assume simple. I personally take great precaution with smart solutions. Luckily, with scripts you have so much greater control (i.e. command line options, "dry-run" mode, etc.) that I have greater confidence in them.

I do like it when a visual editor plays it both smart and safe, in such way that before doing its smart work it actually presents you with the query it's going to issue. Which is why I always considered MySQL Query Browser (now replaced by Workbench) to be the visual editor of choice in my classes.

But, at the end of the day, I strongly believe: if you don't know how to do it with command line, you can't really know how it's done.


PlanetMySQL Voting: Vote UP / Vote DOWN

Speaking at “August Penguin 2011″

Август 3rd, 2011

I will be speaking at August Penguin 2011 (אוגוסט פינגווין), on August 12th in Ramat-Gan, Israel.

August Penguin is the annual meeting of Hamakor society: an Israeli society for Free Software and Open-Source Code (read more here).

I’ll be holding a non-technical talk about MySQL, titled “MySQL and the Open Source Sphere”. In this talk I will be presenting my impressions of the nature of open source development of MySQL and surroundings: the core server, the various forks, patches, 3rd party tools, companies involved, etc. So this is a general “get to know who’s who & what’s what in the MySQL world”.

This is a 30 minutes talk. I will surely not cover every open source development in this field, so my apologies in advance to those I leave out. The truth is, there’s so much going on lately I can hardly keep up with reading the announcements. Truly, this is a wonderful time for open source development with MySQL.

Talk will be held in Hebrew.

See you there!


PlanetMySQL Voting: Vote UP / Vote DOWN

Thoughts and ideas for Online Schema Change

Октябрь 7th, 2010

Here’s a few thoughts on current status and further possibilities for Facebook’s Online Schema Change (OSC) tool. I’ve had these thoughts for months now, pondering over improving oak-online-alter-table but haven’t got around to implement them nor even write them down. Better late than ever.

The tool has some limitations. Some cannot be lifted, some could. Quoting from the announcement and looking at the code, I add a few comments. I conclude with a general opinion on the tool’s abilities.

“The original table must have PK. Otherwise an error is returned.”

This restriction could be lifted: it’s enough that the table has a UNIQUE KEY. My original oak-online-alter-table handled that particular case. As far as I see from their code, the Facebook code would work just as well with any unique key.

However, this restriction is of no real interest. As we’re mostly interested in InnoDB tables, and since any InnoDB table should have a PRIMARY KEY, we shouldn’t care too much.

“No foreign keys should exist. Otherwise an error is returned.”

Tricky stuff. With oak-online-alter-table, changes to the original table were immediately reflected in the ghost table. With InnoDB tables, that meant same transaction. And although I never got to update the text and code, there shouldn’t be a reason for not using child-side foreign keys (the child-side is the table on which the FK constraint is defined).

The Facebook patch works differently: it captures changes and writes them to a delta table,  to be later (asynchronously) analyzed and make for a replay of actions on the ghost table.

So in the Facebook code, some cases will lead to undesired behavior. Consider two tables, country and city, with city holding a RESTRICT/NO ACTION foreign key on country‘s id. Now consider the scenario:

  1. Rows from city are DELETEd, where the country Id is Spain’s.
    • city‘s ghost table is still unaffected, Spain’s cities are still there.
    • A change is written to the delta table to mark these rows for deletion.
  2. A DELETE is issued on country‘s Spain record.
    • The DELETE should work, from the user’s perspective
    • But it will fail: city’s ghost table has not received the changes yet. There’s still matching rows. The NO ACTION constraint will fail the DELETE statement.

Now, this does not lead to corruption, just to seemingly unreasonable behavior on the database part. This behavior is probably undesired. NO ACTION constraint won’t do.

However, with CASCADE or SET NULL options, there is less of an issue: operations on the parent table (e.g. country) cannot fail. We must make sure operations on the ghost table make it consistent with the original table (e.g. city).

Consider the following scenario:

  1. A new country is created, called “Sleepyland”. An INSERT is made to country.
    • Both city and city‘s ghost are immediately aware of it.
  2. A new town is created and INSERTed to city. The town is called “Naphaven”.
    • The change takes time to propagate to city‘s ghost table.
  3. Meanwhile, we realized we made a mistake. We’ve been had. There’s no such city nor country.
    1. We DELETE “Naphaven” from city.
    2. We DELETE “Sleepyland” from country.
    • Note that city‘s ghost table still hasn’t caught up with the changes.
  4. Eventually, the INSERT statement for “Naphaven” reaches city‘s ghost table.
    • What should happen now? The INSERT cannot succeed.
    • Will this fail the entire process?

Looking at the PHP code, I see that changes written on the delta table are blindly replayed on the ghost table.

Since the process is asynchronous, this should not be the case. We can solve the above if we use INSERT IGNORE instead of INSERT. The statement will fail without failing anything else. The row cannot exist, and that’s because the original row does not exist anymore.

Unlike a replication corruption, this does not lead to accumulation mistakes. The replay is static, somewhat like in binary log format. Changes are just written, regardless of existing data.

I have given this considerable thought, and I can’t say I’ve covered all the possible scenario. However I believe that with proper use of INSERT IGNORE and REPLACE INTO (two statements I heavily relied on with oak-online-alter-table), correctness can be achieved.

There’s the small pain of re-generating the foreign key definition on the “ghost” table (CREATE TABLE LIKE … does not copy FK definitions). And since foreign key names are unique, a new name must be picked up. Not pretty, but perfectly doable.

“No AFTER_{INSERT/UPDATE/DELETE} triggers must exist.”

It would be nicer if MySQL had an ALTER TRIGGER statement. There isn’t such statement. If there were such an atomic statement, then we would be able to rewrite the trigger, so as to add our own code to the end of the trigger’s code. Yuck. Would be even nicer if we were allowed to have multiple triggers of same event.

So, we are left with DROP and CREATE triggers. Alas, this makes for a short period where the trigger does not exist. Bad. The easy solution would be to LOCK WRITE the table, but apparently you can’t DROP the trigger (*) when the table is locked. Sigh.

(*) Happened to me, apparently to Facebook too; With latest 5.1 (5.1.51) version this actually works. With 5.0 it didn’t use to; this needs more checking.

Use of INFORMATION_SCHEMA

As with oak-online-alter-table, the OSC checks for triggers, indexes, column by searching on the INFORMATION_SCHEMA tables. This makes for nice SQL for getting the exact listing and types of PRIMARY KEY columns, whether or not AFTER triggers exist, and so on.

I’ve always considered this to be the weak part of openark-kit, that it relies on INFORMATION_SCHEMA so much. It easier, it’s cleaner, it’s even more correct to work that way — but it just puts too much locks. I think Baron Schwartz (and now Daniel Nichter) did amazing work on analyzing table schemata by parsing the SHOW CREATE TABLE and other SHOW commands regex-wise with Maatkit. It’s a crazy work! Had I written openark-kit in Perl, I would have just import their code. But I’m too lazy busy to do the conversion from Perl to Python, and rewrite that code, what with all the debugging.

OSC is written in PHP. Again, much conversion work. I think performance-wise this is an important step to make.

A word for the critics

Finally, a word for the critics. I’ve read some Facebook/MySQL bashing comments and wish to relate.

In his interview to The Register, Mark Callaghan gave the example that “Open Schema Change lets the company update indexes without user downtime, according to Callaghan”.

PostgeSQL was mentioned for being able to add index with only read locks taken, or being able to do the work with no locks using CREATE INDEX CONCURRENTLY. I wish MySQL had that feature! Yes, MySQL has a lot to improve upon, and the latest PostgreSQL 9.0 brings valuable new features. (Did I make it clear I have no intention of bashing PostgreSQL? If not, please re-read this paragraph until convinced).

Bashing related to the notion of MySQL being so poor that Facebook used an even poorer mechanism to work out the ALTER TABLE.

Well, allow me to add a few words: the CREATE INDEX is by far not the only thing you can achieve with OSC (although it may be Facebook’s major concern). You should be able to:

  • Add columns
  • Drop columns
  • Convert character sets
  • Modify column types
  • Add partitioning
  • Reorganize partitioning
  • Compress the table
  • Otherwise changing table format
  • Heck, you could even modify the storage engine! (To other transactional engine)

These are giant steps. How easy would it be to write these down into the database? It only takes a few weeks time to work out a working solution with reasonable limitations, just using the resources the MySQL server provides you with. The MySQL@Facebook team should be given credit for that.


PlanetMySQL Voting: Vote UP / Vote DOWN

Oracle week 2010 in Israel: not a single MySQL session

Сентябрь 19th, 2010

Development, Middleware, [Oracle] Database, BI, ERP, CRM, SCM, EPM, SOA & BPM, Java, Security — all these and more are on the schedule. No MySQL track, not a single MySQL session.

Lack of speakers? Hardly. Lack of public interest? I can’t imagine.

Than what is it? I have no information, and don’t want to throw around suggestions.

Reading the convention’s objectives, I see nothing to suggest why MySQL would not be in. Heck, there’s even two new special seminars: “Family Economy” and “Career planning“.

Sorry, not providing a link this time (besides, it’s all Hebrew). If you’re eager to look it up, google for “Oracle week Israel”.

It’s a shame, and makes it harder to answer the “so what do you think Oracle will do to MySQL?” question I get to be asked ever so often.


PlanetMySQL Voting: Vote UP / Vote DOWN

openark-kit, Facebook Online Schema Change, and thoughts on open source licenses

Сентябрь 16th, 2010

MySQL@Facebook team have recently published an Online Schema Change code for non blocking ALTER TABLE operations. Thumbs Up!

The code is derived from oak-online-alter-table, part of openark-kit, a toolkit I’m authoring. Looking at the documentation I can see many ideas were incorporated as well. And of course many things are different, a lot of work has been put to it by MySQL@Facebook.

openark-kit is currently released under the new BSD license, and, as far as I can tell (I’m not a lawyer), Facebook’s work has followed the license to the letter. It is a strange thing to see your code incorporated into another project. While I knew work has begun on the tool by Facebook, I wasn’t in on it except for a few preliminary email exchanges.

And this is the beauty

You release code under open source license, and anyone can pick it up and continue working on it. One doesn’t have to ask or even let you know. Eventually one may release back to the community improved code, more tested (not many comments on oak-online-alter-table in the past 18 months).

It is a beauty, that you can freely use one’s patches, and he can then use yours.

And here is my concern

When I created both openark-kit and mycheckpoint, I licensed them under the BSD license. A very permissive license. Let anyone do what they want with it, I thought. However Facebook’s announcement suddenly hit me: what license would other people use for their derived work?

The OSC has been release under permissive license back to the community (again, I am not a lawyer). But, someone else could have made it less friendly. Perhaps not release the code at all: just sell it, closed-source, embedded in their product. And I found out that I do not want anyone to do whatever they want with my code.

I want all derived work to remain open!

Which is why in next releases of code I’m authoring the license will change to less permissive and more open license, such as GPL or LGPL. (Of course, all code released so far remains under the BSD license).


PlanetMySQL Voting: Vote UP / Vote DOWN

Personal observation: more migrations from MyISAM to InnoDB

Июнь 16th, 2010

I’m evidencing an increase in the planning, confidence & execution for MyISAM to InnoDB migration.

How much can a single consultant observe? I agree Oracle should not go to PR based on my experience. But I find that:

  • More companies are now familiar with InnoDB than there used to.
  • More companies are interested in migration to InnoDB than there used to.
  • More companies feel such migration to be safe.
  • More companies start up with an InnoDB based solution than with a MyISAM based solution.

This is the way I see it. No doubt, the Oracle/Sun deal made its impact. The fact that InnoDB is no longer a 3rd party; the fact Oracle invests in InnoDB and no other engine (Falcon is down, no real development on MyISAM); the fact InnoDB is to be the default engine: all these put companies at ease with migration.

I am happy with this change. I believe for most installations InnoDB provides with a clear advantage over MyISAM (though MyISAM has its uses), and this makes for more robust, correct and manageable MySQL instances; the kind that make a DBA’s life easier and quieter. And it is easier to make customers see the advantages.

I am not inclined to say “You should migrate your entire database to InnoDB”. I don’t do that a lot. But recently, more customers approach and say “We were thinking about migrating our entire database to InnoDB, what do you think?”. What a change of approach.

And, yes: there are still a lot of companies using MyISAM based databases, who still live happily.


PlanetMySQL Voting: Vote UP / Vote DOWN

Poll: what (minor) versions of MySQL should be supported by an open source MySQL related project?

Апрель 2nd, 2010

I would like to get the community’s opinion about supporting older (minor) versions of MySQL in open source projects (other than MySQL itself).

If I were to develop some open source project, and a bug report came which only applied to MySQL 5.0.51 (but more recent versions worked fine), would I need to fix the code so as to support this older version?

How about supporting 5.0.22 (released almost 4 years ago, with almost 70 revisions since)? Would you expect an open source project to support this MySQL version because, say, this is the default version in your yum repository?

I would like to concentrate on the currently stable MySQL versions: 5.0 and 5.1. Versions 4.x are out of the question for me, and 5.5 is not yet GA.

Sure, it would be great to support everything. But also time and effort consuming. So, I would greatly appreciate your feedback!

Note: There is a poll embedded within this post, please visit the site to participate in this post's poll.
PlanetMySQL Voting: Vote UP / Vote DOWN