Archive for the ‘Google’ Category

Four short links: 21 June 2010

June 21st, 2010

  1. Law of Success 2.0 -- a blog of interviews with famous and/or interesting people, from Brad Feld to Uri Geller.
  2. Pioneer One -- crowdsourced funding for a TV show, perhaps a hint of the future. The pilot was shot for $6,000, which was raised through Kickstarter. Distributed via BitTorrent.
  3. DrasticTools -- PHP/MySQL visualisation tools, including TreeMap, tag cloud, hierarchical bar chart, and animated list. (via TomC on Delicious)
  4. GoogleCL -- command-line interface to Google services. At the moment the services are Picasa, Blogger, YouTube, Contacts, Docs, and Calendar.


PlanetMySQL Voting: Vote UP / Vote DOWN

Welcome googleCL

June 19th, 2010
I am writing this blog post with Vim, my favorite editor, instead of using the online editor offered by Blogger. And I am uploading this post to my Blogger account using GoogleCL, a tool that lets you use Google services from the command line.
I am a command line geek, and as soon as I saw the announcement, I installed it on my laptop. The mere fact that you are reading this blog post shows that it works.

GoogleCL is an apparently simple application. If you install it on a Mac using MacPorts, you realize how many dependencies it has and how much complexity it hides under the hood.
Using an easy to understand syntax, it puts your blog, pictures, calendar, contacts, videos, and online documents at your fingertips.
For example, let's query my blog for partitioning:

$ google blogger --blog="The Data Charmer" --title=partitioning list "title,url"

Hmm. No results. The manual doesn't help much, but something happened during this query. The first thing is that I was asked to authorize the script to access my blog, and that was done by activating a key that I got on the command line. So far, so good. The second thing was a message informing me that a default configuration file had been created in my home directory. Looking at that file, I saw an option saying "regex = True". Aha! So the title supports regular expressions. Let's try:

$ google blogger --blog="The Data Charmer" --title=".*partitioning" list "title"
Holiday gift - A deep look at MySQL 5.5 partitioning enhancements
The partition helper - Improving usability with MySQL 5.1 partitioning
A quick usability hack with partitioning
MySQL 5.1 Improving ARCHIVE performance with partitioning

OK. This gives me everything with the word "partitioning" in the title. But I know that some titles are missing. Comparing with the results that I get online, I see that the titles where "partitioning" is capitalized are not reported. So the search is case sensitive. What I need to do is to tell the regular expression that I want a case insensitive search. Fortunately, I know how to speak regular expressions. Let's try again.

$ google blogger --blog="The Data Charmer" --title="(?i).*partitioning.*" list "title"
Holiday gift - A deep look at MySQL 5.5 partitioning enhancements
Partitioning with non integer values using triggers
Tutorial on Partitioning at the MySQL Users Conference 2009
The partition helper - Improving usability with MySQL 5.1 partitioning
A quick usability hack with partitioning
MySQL 5.1 Improving ARCHIVE performance with partitioning
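
The three queries above can be replayed offline with a small sketch. It assumes the title filter behaves like Python's `re.match`, which anchors at the start of the string. That assumption fits the outputs shown here, but it has not been checked against the GoogleCL source. The function name `filter_titles` is made up for this illustration.

```python
import re

# Titles taken from the query output above.
titles = [
    "Holiday gift - A deep look at MySQL 5.5 partitioning enhancements",
    "Partitioning with non integer values using triggers",
    "Tutorial on Partitioning at the MySQL Users Conference 2009",
    "The partition helper - Improving usability with MySQL 5.1 partitioning",
    "A quick usability hack with partitioning",
    "MySQL 5.1 Improving ARCHIVE performance with partitioning",
]


def filter_titles(pattern, candidates):
    """Keep the titles that the pattern matches from the start of the string.

    Assumption: the real tool applies the pattern in a comparable way.
    """
    return [t for t in candidates if re.match(pattern, t)]


# Bare word: no title begins with lowercase "partitioning".
plain = filter_titles("partitioning", titles)

# Leading ".*" lets the word appear anywhere, but the match is case sensitive.
case_sensitive = filter_titles(".*partitioning", titles)

# "(?i)" turns on case-insensitive matching for the whole pattern.
case_insensitive = filter_titles("(?i).*partitioning.*", titles)

print(len(plain), len(case_sensitive), len(case_insensitive))
```

Under that assumption the bare word finds nothing, the second pattern skips the two titles with a capital "Partitioning", and the third one picks them up.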

Now I feel confident enough to make some changes to my online content.
To create this blog post, I used some of GoogleCL's capabilities. After I created an image, I uploaded it to my Picasa album using this command:

$ google picasa post -n "Blogger Pictures" -t googlecl ~/Desktop/google_cl.png

Then I asked Picasa to give me the URL of the image:

$ google picasa list -n "Blogger Pictures" --query googlecl title,url_direct
google_cl.png,http://lh6.ggpht.com/_gVfZHGgf5LA/TBzjaKiJJvI/AAAAAAAAA74/dthDDhybsmc/google_cl.jpg

And then I inserted that URL in this blog post. Finally, I uploaded the blog post with this command:

$ google blogger --blog="The Data Charmer" --draft --title "Welcome googleCL" --tags="google,mysql,partitioning,command line,blogging" post ~/blog/welcome_googlecl.html


(Now writing online) And after I checked that the post looked the way I wanted, I hit the "PUBLISH POST" button.
Welcome, GoogleCL!


My MySQL keynote slides and video

April 16th, 2010

Been asked a few times in the last few days about where my slides are from my MySQL keynote from *last* year.

Ooops.

Um, yeah.  Sorry about that.  Here’s a link to ‘The SmugMug Tale’ slides, and you can watch the video below:

Sorry for the extreme lag.  I suck.

The important highlights go something like this:

  • Use transactional replication.  Without it, you’re dead in the water. You have no idea where a crashed slave was.
  • Use a filesystem that lets you do snapshots.  Easily the best way to do backups, spin up new slaves, etc. I love ZFS.  You’ll need transactional replication to really make this painless.
  • Use SSDs if you can. We can’t afford to be fully deployed on SSDs (terabytes are expensive), but putting them in the write path to lower latency is awesome.  The read path might help, too, depending on how much caching you’re already doing.  Love hybrid storage pools.
  • Use Fishworks (aka Open Storage) if you can.  The analytics are unbeatable, plus you get SSDs, snapshots, ZFS, and tons of other goodies.
  • Use transactional replication. This is so important I’m repeating it.  Patch it into MySQL (Google, Facebook, and Percona have patches) or use XtraDB if you use replication.  We use the Percona patch.

Holler in the comments if something in the presentation isn’t clear, I’ll answer.  Apologies again.

Shameless plug - we’re hiring. And it’s a blast.




CAOS Theory Podcast 2010.02.19

February 20th, 2010

Topics for this podcast:

*Jacobsen v. Katzer and open source impact
*Intel, Nokia team up for MeeGo open source OS
*Open source continues in embedded space
*MongoDB and the advent of the NoSQL databases
*Copyrights, complexities, control and conflict

iTunes or direct download (21:48, 6.07 MB)



CAOS Theory Podcast 2010.02.05

February 5th, 2010

Topics for this podcast:

*Matt Asay moves from Alfresco to Canonical
*GPL fade fuels heated discussion
*Apple’s iPad and its enterprise and open source impact
*Open source in data warehousing and storage
*Our perspective on Oracle’s plans for Sun open source

iTunes or direct download (32:50, 9.2 MB)



A guide to The 451 Group’s open source software coverage

January 13th, 2010

Regular visitors to the 451 CAOS Theory blog will be well aware of The 451 Group’s CAOS (Commercial Adoption of Open Source) research service and our CAOS long-form reports.

They are probably less aware of the open source coverage that The 451 Group provides on a day-to-day and week-to-week basis, however, and I thought it would be worthwhile to provide some examples of The 451 Group’s ongoing open source coverage by highlighting a few recent reports.

The company’s core services are 451 Market Insight Service, which delivers daily insight into emerging enterprise IT markets, and 451 TechDealmaker, a forward-looking weekly analysis service focused on M&A activity within the enterprise IT business.

Here are some examples of how our coverage fits into those two services. Needless to say, these reports are only available to clients, although you can apply for trial access. Vendors - open source or otherwise - do not have to be clients in order to be covered by our analysts.

451 Market Insight Service
The 451’s CAOS analysts - Jay and I - are responsible for much of the coverage of open source specialist vendors. Recent examples include:

Meanwhile The 451 Group’s team of analysts also cover open source related vendors in their respective coverage areas, often in conjunction with CAOS analysts. For example:

Additionally, we also provide reports assessing the strategies of proprietary/mixed source vendors towards open source. Examples include:

In addition to our vendor-centric MIS output, open source also regularly makes an appearance in our reports assessing wider industry trends. For example:

451 TechDealmaker
451 Group analysts follow open source-related M&A in their coverage areas, again often working with the CAOS analysts. Examples include:

We also provide reports assessing the prospects of potential acquirers and targets alike. For example:

And again, open source makes an appearance in our reports assessing wider industry trends. For example:

For those with an interest in M&A, it is also worth mentioning the 451 M&A KnowledgeBase – the company’s merger and acquisition database, which contains details of all M&A deals tracked by The 451 Group, and offers the ability to filter search results to contain deals that are themed “open source”.



A year in review; new direction.

October 3rd, 2009
It has been more than a year since my self-imposed hiatus from serious MySQL development started, and I think it is about time that I get back into the saddle. I have a handful of working prototypes, but I should get the code out there, back into the community. I learned a bunch of stuff during the past year at Google but in the end, working on JavaScript, HTML/CSS and Google proprietary languages

MySQL2GoogleSpreadsheets

September 22nd, 2009
I've managed to find a way to connect MySQL directly to Google Spreadsheets, and although it's not yet perfect, it does show a lot of potential.

You will have a MySQL table fed directly into Google Spreadsheets. From there, you could do some charts, highlight some trends, or simply share the data as is with people in or out of your organization in a secure way.

The end result should look something like this:


What you will need:
A Linux Server with Apache and MySQL
A Google Apps Premium account
Google Secure Data Connector installed on your Linux server


SDC
Installing the SDC is a bit tricky but not too difficult and there is a lot of documentation as to how to do it. A lot of it is giving things the right permissions and configuring 2-3 XML files.

You can read more about how SDC works, but my impression is that it runs on your Linux server and gets data from Apache. It then sends that data in a secure way to Google Spreadsheets. So when you query your data, the URL you use is "localhost", as if you were on the server.

Apache
Again, it's important to note that everything that SDC gets has to go through Apache, which gives you the option of static data (in the form of CSV or XML files) or dynamic data (from PHP, for example). It also means that you should secure Apache so that it does not give the whole world this data. You can do this with OAuth (there are a lot of instructions on how to secure Apache with SDC in the SDC documentation), or you could try restricting Apache so that it only serves 127.0.0.1, since SDC runs on your server. Either way, you need to factor that in.

MySQL
The question you must be asking now is: since SDC connects to Apache, how do I get data out of MySQL directly? (Hopefully, in the future SDC might be able to query MySQL directly.)

To achieve that, I have used the MySQL CSV storage engine, which basically means that MySQL stores all the data from the table in a CSV file on the hard disk (I hope that you are already imagining the possibilities here).
All you need to do is change that table with an INSERT or UPDATE statement, and the CSV file gets updated.

Symlink
Now we need to connect that CSV file to Apache. For that, we can use a symbolic link in Linux and connect the CSV to Apache by placing the link in the directory Apache reads files from (wherever your index.html file is, usually /var/www/html).
Another thing to remember is that you need to give Apache rights to see the MySQL directory, and you should be careful about doing that. One option is to create a new database, which creates a new directory on the hard disk, and to let Apache see only that directory. Try not to let someone who might hack Apache also hack MySQL, but I will leave that kind of thinking to the security analysts.
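
The symlink step can be sketched as follows. This is only an illustration: it runs in a throwaway directory, the directory and file names (`reports_db`, `sample.CSV`, `sample.csv`) are made up, and the helper `publish_csv` is hypothetical. On a real server the source would be the table's CSV file inside the MySQL data directory and the target would be the Apache document root.

```python
import os
import tempfile


def publish_csv(csv_path, docroot, link_name):
    """Create or refresh a symlink in docroot that points at csv_path."""
    link_path = os.path.join(docroot, link_name)
    if os.path.lexists(link_path):
        os.remove(link_path)  # replace a stale link from an earlier run
    os.symlink(csv_path, link_path)
    return link_path


# Demo in a temporary directory; real paths depend on your installation.
with tempfile.TemporaryDirectory() as base:
    data_dir = os.path.join(base, "reports_db")  # stand-in for the database directory
    docroot = os.path.join(base, "html")         # stand-in for /var/www/html
    os.makedirs(data_dir)
    os.makedirs(docroot)

    # Stand-in for the file that the CSV table writes (made-up content).
    table_file = os.path.join(data_dir, "sample.CSV")
    with open(table_file, "w") as fh:
        fh.write("north,10\nsouth,5\n")

    link = publish_csv(table_file, docroot, "sample.csv")
    is_link = os.path.islink(link)
    with open(link) as fh:
        served = fh.read()  # what a web server following the link would read

print(is_link, repr(served))
```

Because the link points at the live file, nothing needs to be copied when the table changes. The web server still needs read permission on the target directory, which is the caveat raised above.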

SDC XML configuration file
You need to add this symlink as a new rule in the SDC configuration file. You can see some examples there inside the file. When adding a new rule, you need to remember that SDC accesses Apache from the same machine, meaning your URLs should start with http://localhost/

Google Spreadsheets
Go into your Google Spreadsheets, go to the first cell (A1), and type:
=importData("http://localhost/[Your File Here]")
Hopefully, you should see the data from the MySQL CSV table.
If not, you might need to debug it some more.


Problems with SDC
The installation process wasn't as smooth as we would have liked but we did try something new at the time with MySQL. But apart from that, there are 2 problems we have noticed with SDC:
  1. Refresh rates
  2. High number of rows
SDC caches extremely aggressively, and when we made changes to our CSV file, we still saw the old data, which confused us throughout the process.
A trick that helps: go to Google Spreadsheets and, in the importData function, change "http://localhost/sample.csv" to "http://localhost/sample.csv?a".
This will force Google to get the new data, and you can change the "a" to whatever you like.
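
The cache-busting trick amounts to changing the query string each time you want fresh data. Here is a minimal sketch of generating such URLs. The helper name `cache_bust` and the use of a counter as the token are assumptions made for illustration; the post itself simply appends "?a" by hand.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cache_bust(url, token, param="a"):
    """Return url with a throwaway query parameter set to token.

    Any earlier value of the same parameter is replaced, so calling this
    repeatedly does not make the URL grow.
    """
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    query.append((param, str(token)))
    return urlunsplit(parts._replace(query=urlencode(query)))


first = cache_bust("http://localhost/sample.csv", 1)
second = cache_bust(first, 2)
print(first)
print(second)
```

Each new token yields a URL the cache has not seen, and that is the whole effect of the trick.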

A high number of rows (I tried 31k) will get Google Spreadsheets stuck. By stuck, I don't mean that their server crashes or anything, just that you would wait a long, long time to get any results. So you should only expose the summarized data you want to show.
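
One way to keep the row count down is to aggregate before the spreadsheet ever sees the data. The sketch below does this in Python on a headerless CSV text. The column names (`region`, `amount`) and the sample rows are invented, and in practice the same summary could equally be produced by a query that fills a small summary table.

```python
import csv
import io
from collections import defaultdict


def summarize(csv_text, columns, key_col, value_col):
    """Sum value_col per key_col for headerless CSV text.

    columns names the fields, since the file itself is assumed to carry
    no header row.
    """
    totals = defaultdict(float)
    reader = csv.DictReader(io.StringIO(csv_text), fieldnames=columns)
    for row in reader:
        totals[row[key_col]] += float(row[value_col])
    return dict(totals)


raw = "north,10\nsouth,5\nnorth,2.5\n"  # made-up detail rows
summary = summarize(raw, ["region", "amount"], "region", "amount")
print(summary)
```

Three detail rows collapse into two summary rows here. With tens of thousands of rows the reduction is what keeps the import responsive.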


Conclusion

I am under the impression that this is a relatively new technology for Google and it's not 100% tweaked and finalized (it was launched around the end of April 2009). I would assume that it is worth investing time in structuring it now (meaning you would be an early adopter), while Google catches up.



The Open Source Events Calendar

September 10th, 2009


Kudos to Lenz, who has put together a comprehensive calendar of open source events. Most of them are somehow related to the MySQL ecosystem, but there is no limitation to what the calendar contains.
Here is the announcement, with the instructions to use and contribute to the calendar.
In addition to informing you about the events, this calendar also tells you when a deadline is approaching. Using this tool, you won't miss a call for participation anymore.
You can simply subscribe to the iCal feed (it's a Google calendar) or see it online.
And of course, we want to improve the calendar. Feel free to submit new events using the event submission form.

We're looking into ways of improving the service. It would be nice to have a widget to show on your blog. Using Google APIs, it's easy to create such a widget, but the events are shown in insertion order, rather than chronological order. If anyone knows how to fix this issue, please contact me or Lenz.
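
One possible workaround, sketched here without reference to any particular Google API, is to sort the events on the client side before rendering the widget. The event records, their `title` and `start` fields, and the ISO date format are all assumptions made for this illustration.

```python
from datetime import date


def chronological(events):
    """Return the events ordered by start date, earliest first."""
    return sorted(events, key=lambda event: date.fromisoformat(event["start"]))


# Hypothetical entries, listed in insertion order.
events = [
    {"title": "Conference B", "start": "2009-11-05"},
    {"title": "CfP deadline A", "start": "2009-09-30"},
    {"title": "Meetup C", "start": "2009-10-12"},
]

ordered_titles = [event["title"] for event in chronological(events)]
print(ordered_titles)
```

Sorting in the widget code avoids depending on the order in which the feed returns entries.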


A Remote-Attendee’s Look at OSCON

July 29th, 2009

Another year, and another successful OSCON has concluded. While I didn’t attend this year’s conference, let me offer some reflections based on reading blogs and talking to attendees, both in person and over Twitter (I’m glad to see both the @MySQL and @MySQL_Community Twitter accounts have a large and quickly growing list of followers).

Let me start by highlighting the 2009 Google O’Reilly Open Source Awards. First on the list (albeit probably for alphabetical reasons) is Brian Aker, who is recognised as the Best Open Source Database Hacker. He joined MySQL many years ago, having not just worked on Apache but also been a major developer behind Slashdot. He gets his award for his past contributions to MySQL and his current work on Drizzle. Congratulations to Brian, and I’m sorry I won’t be attending Burning Man with you this year!

I also want to highlight some of the other winners. Evan Prodromou won the award for Best Social Networking Hacker, and Clay Johnson won the Best Community Builder award. Evan Prodromou wrote and runs the open-source microblogging tool Laconica, which powers Identi.ca. The Laconica platform runs on MySQL as the database. The same can be said for Sunlight Labs, of which Clay Johnson is the Director. Sunlight Labs produces technology to make government in the United States more transparent. Their platform also uses MySQL as a database.

Let me also grab the opportunity to congratulate Bruce Momjian, who was named Database Jedi Master for his work on PostgreSQL!

From what I sensed, the highlighted topics of this year’s OSCON were web applications and cloud computing, in addition to what could be labeled “regular applications”. In all of them, data, and the web as a data-driven operating system (to use Tim O’Reilly’s words from the keynote), is a self-evident component, a fact of life. And MySQL continues to be one of the prime movers in this space.