Archive for the ‘mongodb’ Category

Tungsten on the Beach—LA MySQL Meetup on Jan 11, 2012

January 7th, 2012
It is my pleasure to announce that I will be presenting on Tungsten Replicator next Wednesday, January 11th at the Los Angeles MySQL Meetup. The presentation title is Fast, Flexible, and Fun--The Tungsten Replicator Magical Mystery Tour. This talk is going to be fun for two reasons.

First, it's a great opportunity to meet people in the LA MySQL community and talk about my favorite replication software. Tungsten is like a Swiss Army Knife for data replication. It solves a wide range of problems involving HA, scaling, and data movement. The presentation gives a quick intro to the replicator, then surveys how to use the most interesting features, including parallel slave apply, multi-master replication, transaction filtering, and replicating to MongoDB, Oracle, or data warehouses. I'll even show you how to grab the GPL V2 sources from code.google.com and code up your own replicator extensions using Java or JavaScript.

Second, the talk venue is in Santa Monica about 10 blocks up from the ocean.  Who doesn't like beaches?    I certainly do.  See you next week!

p.s.,  Thanks to Joe Devon and the other LA MySQL Meetup folks for the kind invitation.


[RELOADED] Vote for MySQL+ community awards 2011 !

January 5th, 2012

[UPDATE 2011/01/11] : New poll added, vote for the best GUI client tool ! (And continue to vote for other polls)
And thanks again for your involvement. It’s time to vote again… 

First of all, I wish you a happy new year.
Many things happened last year, and it was really exciting to be involved in the MySQL ecosystem.
I hope this enthusiasm will grow this year. It's up to you!

To start the year, I propose the MySQL+ Community Awards 2011
It will only take 5 minutes to fill out these polls.
Answer with your heart first and then with your experience with some of these tools or services.

Polls will be closed January 31, so, vote now !
For “other” answers, please leave me a comment with details.

Don’t hesitate to submit proposals for tools or services in the comments.
And, please, share these polls !

 

Note: There is a poll embedded within this post, please visit the site to participate in this post's poll.

Happy 2012 !
Cédric

This article is obviously not sponsored !
(MySQL is a trademark of Oracle Corporation and/or its affiliates)




Fake O’Reilly Covers

December 19th, 2011


Here are some of the fake O'Reilly book covers I mentioned in a prior post.  These have been optimized for use as black & white Kindle screensaver wallpaper images.  If you haven't done so already, you can install a Kindle screensaver hack with a couple of downloads. 

Update: I've embedded a slideshow from PicasaWeb, but it requires Flash.  If you don't see it you can click on the links below to go directly to PicasaWeb.



451 CAOS Links 2011.12.14

December 14th, 2011

Jive goes public. webOS goes open source. Cloud Foundry goes .NET. And more.

# Jive Software started IPO at $12 a share, closing the day up nearly 30%.

# HP announced that it plans to release webOS under an open source license. Details are thin on the ground, although Fedora is reportedly an inspiration. Joel West’s post pretty much summed up my thoughts.

# Tier 3 announced that it has created Iron Foundry, an open source .NET Framework implementation of Cloud Foundry.

# Xeround raised $9m funding for its MySQL-as-a-service cloud database.

# Microsoft released the Windows Azure SDK for Node.js as open source and made available a preview of Apache Hadoop on Windows Azure, amongst a slew of other open source-related announcements.

# Red Hat, Canonical, Cisco, IBM, Intel, NetApp, and SUSE created the oVirt project, based around Red Hat’s Enterprise Virtualization technology for managing KVM environments.

# Nuxeo announced the availability of Nuxeo Platform 5.5.

# Joyent launched its SmartMachine Appliance for MongoDB.

# Red Hat announced JBoss Enterprise Portal Platform 5.2 and JBoss Operations Network 3.0.

# Novell announced the availability of Novell Open Enterprise Server 11.

# Couchbase claimed thousands of open source deployments and 150 commercial deployments, but has rethought its product line-up for 2012, having “confused the heck” out of potential users in 2011.

# Univention released Univention Corporate Server 3.0.

# SuccessBricks announced that its ClearDB distributed MySQL-based database service is now available through Heroku.

# Ember.js is the new name for the SproutCore 2.0 JavaScript framework.

# Henrik Ingo examined the recent spate of MySQL authentication plug-ins.



451 CAOS Links 2011.12.09

December 9th, 2011

Funding for BlazeMeter and Digital Reasoning. Red Hat goes unstructured. And more.

# BlazeMeter announced $1.2m in Series A funding and launched a cloud service for load and performance testing.

# Digital Reasoning announced a second round of funding to help develop its Hadoop-based analytics offering.

# Red Hat announced the availability of Red Hat Storage Software Appliance, based on its recent acquisition of Gluster.

# Red Hat also announced the general availability of Red Hat Enterprise Linux 6.2.

# Jaspersoft released Jaspersoft 4.5, delivering drag-and-drop analytics and reporting on Apache Hadoop, NoSQL and analytic databases.

# Jaspersoft also delivered a second-generation native connector to MongoDB.

# CloudBees announced the availability of Jenkins Enterprise by CloudBees providing support and enhanced capabilities for the Jenkins Continuous Integration platform.

# Diaspora* is back in action, and outlined its plans.

# Talend announced that Bi3 Solutions has embedded Talend Integration Suite inside its Software-as-a-Service platform.

# DataStax announced new versions of Apache Cassandra, DataStax Community, and DataStax Enterprise.

# The H reported that Microsoft’s Windows Store agreement has an open source exception.

# Black Duck Software announced the release of Export 6.0.

# Antelink launched SourceSquare, a free open source scanning engine.



451 CAOS Links 2011.11.18

November 18th, 2011

Rapid7 secures new funding. Microsoft drops Dryad. And more.

# Rapid7 secured $50m in series C funding.

# Microsoft confirmed that it is ditching its Dryad project in favour of Apache Hadoop.

# Arun Murthy provided more details of Apache Hadoop 0.23.

# The Google Plugin for Eclipse and GWT Designer projects are now fully open source.

# openSUSE released version 12.1.

# Amazon released the source code of the Kindle Fire.

# Black Duck Software joined the GENIVI Alliance.

# dotCloud announced the availability of the top three databases MySQL, MongoDB and Redis on its PaaS.



VC funding for Hadoop and NoSQL tops $350m

November 15th, 2011

451 Research has today published a report looking at the funding being invested in Apache Hadoop- and NoSQL database-related vendors. The full report is available to clients, but non-clients can find a snapshot of the report, along with a graphic representation of the recent up-tick in funding, over at our Too Much Information blog.



New England Database Summit

November 15th, 2011

The New England Database Summit is an all-day conference-style event where participants from the research community and industry in the New England area can come together to present ideas and discuss their research and experiences working on data-related problems. It is an academic conference with applications to real life, and includes any type of database.

The 5th annual NEDB will be held at MIT in Cambridge, MA (in 32-123) on Friday, February 3, 2012. Anyone who would like to is welcome to present a poster (registration required), or submit a short paper for review. We plan to accept 8--10 papers for presentation (15 minutes) at the meeting. All posters will be accepted.

For more details, and to register and / or upload a paper, see:

http://db.csail.mit.edu/nedbday12/



My take on the "warning" against using MongoDB…

November 14th, 2011
We have seen the "warning" against using MongoDB a few times now, and I have to say that this reminds me of other such warnings:
In a sense, most of them were right. If you had asked the moviegoing public in the 1920s whether they wanted "talkies", chances are most of them would have said no. If you had told my mom and dad in the late 1970s that within 20 to 30 years everyone would have a computer at home, with some resemblance to what their nerdy near-20-year-old boy was tinkering with in the basement of their house, they would have laughed, at best.

But that's not the thing here. True innovation moves things forward. It introduces new things and new ways of doing things that we have not heard of before, and that the rest of the world does not yet have a good view of. Look at virtual computing. Some 10 years ago this was considered so slow that it was close to useless, but today it just cannot be ignored and is put to good use all over the place (I am disregarding the fact that this technology is way older than that; I am talking about virtual computing in the field where I spend most of my time).

That something which provides new and unique features, and new ways of doing things, is still slow compared to traditional means of achieving similar results isn't strange:
  • New means not fully developed. What you want to demonstrate with something completely new isn't that it performs as well as existing technologies; if that were the point, why would anyone change? No, you want to show the new features and demonstrate how unique this new thing is.
  • The way we measure performance, or whatever else we measure in existing technologies, is usually tied to just that: the performance (or whatever) of existing technologies, not that of a new technology and a new way of doing things.
In the early 20th century, steam cars were much faster and more reliable than internal combustion powered cars; the Stanley Steamer outperformed most competitors and was more reliable too.

Getting back to MongoDB then: My main gripe with it is not that it isn't mature in all aspects (it's not; face it, if you use MongoDB you are using leading-edge stuff. It will break, live with it!). Neither do I have any issues with many of the other attributes of MongoDB, nor with the fact that it really isn't even innovative (it's not, live with it). No, my main gripe is this: MongoDB and NoSQL aren't really that new, and this means they are probably a stop-gap solution. In the 1980s running a SQL database on a PC was possible, but slow (I was working for Oracle at the time, so I know); on a single PC, DBase was easier to use, faster and more developer friendly. And the way you used Ashton-Tate DBase, by the way, wasn't that much different from how MongoDB is used today.

But the SQL-based relational databases, like Oracle, Informix etc., had more features and were more flexible and standardized, and once PCs got more powerful, DBase was history.

What really wins then, in my mind, is features, flexibility, scalability and a broad spectrum of use cases. SQL-based relational databases have this, to an extent, but what most of them lack is scalability across servers in a cloud. They have some of this scalability, but they don't scale nearly as much or as easily as, say, MongoDB or the other NoSQL databases (yes, I hate that term. Find a better one for me that is broadly accepted and I'll start using it).

So what am I saying here? Let me summarize it:
  • No. MongoDB doesn't suck, no way, but in terms of maturity it has some way to go.
  • Yes, an RDBMS usually has more flexibility and a broader set of use cases than a NoSQL solution.
  • And Yes: There are places where the RDBMS software industry got caught with their pants down: Cloud environment scalability for example. Also, licensing: if Oracle or MySQL or whoever could figure out a proper means of pricing cloud services, I'd be happy (instead of using the pricing models for software that were introduced with Auto-Flow in the early 1960s. This is insane. The world has changed since then, guys!)
  • Would I rather use MySQL than MongoDB here at Recorded Future? Yes, probably, but I don't insist, and it just wouldn't work, as MySQL will not scale in the way you can scale a MongoDB solution, far from it.
  • Will MongoDB or the other NoSQL solutions mean that the era of SQL based databases is reaching an end? Nope, no way, José. Eventually some RDBMS vendor will get it and understand the issues and build a viable solution (like ScaleDB or NuoDB or something like that).
  • What is my main issue with MongoDB? That it sacrifices features and functionality for performance. And it cannot add the features and flexibility of an RDBMS without sacrificing performance (look at this: access method: B+-Tree. Come on, how innovative is THAT?). To be honest, running MongoDB without sharding seems like a useless exercise to me. If you don't need the scalability that this setup can provide you with, you have better options. (But this is just me talking here.)
  • Will MongoDB hang around for long? Yes, probably, but it will not be that hot for as long as SQL-based databases. The reason is that compared to a traditional RDBMS it provides just one big advantage (a big advantage, yes, but just one): Performance. Which is why we use it. Remember the Stanley Steamer? Old hat today, but it was the hottest thing you could drive some 100 years ago or so. And this is what cloud computing is all about: a constant change of technology to get the best value for money right now, and being on constant lookout for new technologies that drive features and performance.
/Karlsson


MongoDB for MySQL folks part 3 — More on queries and indexes

November 2nd, 2011
Last time I wrote about MongoDB for MySQL DBAs I described some of the basics of MongoDB querying, and this time I'll follow that up with some more on querying.

As we saw last time, the basic format of a MongoDB query is:
db.<collection>.find(<query>, <attributes>)
Note that you do NOT replace db with the name of the database you want to query here, you just make the database you want to use the current one and issue the query, such as:
> use test
> db.mycoll.find()
The example above will find all objects in the mycoll collection, and will include all the object attributes and also the key (_id), like this:
{ "_id" : ObjectId("4eb0634807b16556bf46b214"), "c1" : 1 }
{ "_id" : ObjectId("4eb0634a07b16556bf46b215"), "c2" : 1 }
{ "_id" : ObjectId("4eb0635607b16556bf46b216"), "c1" : 2, "c2" : 2 }
{ "_id" : ObjectId("4eb0635e07b16556bf46b217"), "c3" : 3 }
The ObjectId is generated by MongoDB itself here, although you can set it yourself if you want to, as long as it's unique. The insert method is used to insert data:
> db.mycoll.insert({c3: 4, c4: 'some string'})
> db.mycoll.find()
results in:
{ "_id" : ObjectId("4eb0634807b16556bf46b214"), "c1" : 1 }
{ "_id" : ObjectId("4eb0634a07b16556bf46b215"), "c2" : 1 }
{ "_id" : ObjectId("4eb0635607b16556bf46b216"), "c1" : 2, "c2" : 2 }
{ "_id" : ObjectId("4eb0635e07b16556bf46b217"), "c3" : 3 }
{ "_id" : ObjectId("4eb063d307b16556bf46b218"), "c3" : 4, "c4" : "some string" }
And as you can see, typing is automatic, or you can look at it as being type agnostic. Now, this wasn't much more than we saw last time. What we want is to select some specific objects and possibly get some specific columns from them; this is done by specifying one or two arguments to the find() method. For example, if I only want to get back the object that I inserted last above, I'd do this:
> db.mycoll.find({c3: 4})
{ "_id" : ObjectId("4eb063d307b16556bf46b218"), "c3" : 4, "c4" : "some string" }
And this wasn't really complicated, right? The condition is passed as a JavaScript object, and that is fairly uncomplicated. But what happens with something slightly more involved than this really simple example, like a range search? To get all objects where the c3 member is greater than 3 (which results in the same object as above, by the way), you would write something like this:
> db.mycoll.find({c3: {$gt: 3}})
{ "_id" : ObjectId("4eb063d307b16556bf46b218"), "c3" : 4, "c4" : "some string" }
I will show some more $-operations beyond $gt in a later post, for now just accept that they exist and are documented here: Advanced Queries
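To make the query-document idea a bit more concrete, here is a small sketch in plain JavaScript (runnable in Node.js, no MongoDB needed) of how a condition such as {c3: {$gt: 3}} can be evaluated against objects. The matches() helper and the docs array are made up for illustration and only handle plain equality and $gt; this is a toy model, not how MongoDB implements it.

```javascript
// Hypothetical helper: does doc satisfy the query document?
// Supports only plain equality and the $gt operator.
function matches(doc, query) {
  return Object.keys(query).every(function (field) {
    var cond = query[field];
    var value = doc[field];
    if (cond !== null && typeof cond === "object") {
      // Operator form, e.g. {$gt: 3}
      return Object.keys(cond).every(function (op) {
        if (op === "$gt") {
          return typeof value === "number" && value > cond[op];
        }
        throw new Error("operator not supported in this sketch: " + op);
      });
    }
    // Plain form, e.g. {c3: 4}
    return value === cond;
  });
}

// Same shape as the objects in mycoll above, without the _id values.
var docs = [
  { c1: 1 },
  { c2: 1 },
  { c1: 2, c2: 2 },
  { c3: 3 },
  { c3: 4, c4: "some string" }
];

var equalToFour = docs.filter(function (d) { return matches(d, { c3: 4 }); });
var greaterThanThree = docs.filter(function (d) { return matches(d, { c3: { $gt: 3 } }); });

console.log(JSON.stringify(equalToFour));
console.log(JSON.stringify(greaterThanThree));
```

Both filters pick out the single object with c3 set to 4, and an empty query object matches everything, just like find() without arguments.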

The _id column is just annoying right now, but it is always shown by default, as are all the other attributes of the object. To get rid of it for now, this will do the trick:
> db.mycoll.find({c3: {$gt: 3}}, {_id: 0})
{ "c3" : 4, "c4" : "some string" }
Not too bad, right, and kinda easy to understand. The flags you pass for each field in the second argument work like this:
  • 1 - Include this field. As soon as any field is flagged for inclusion, only the flagged fields (plus the ObjectId) are returned.
  • 0 - Do not include this field. All other fields are still returned.
  • -1 - Any non-zero value is treated like 1, so this also means: include no fields except this one and the ObjectId. You may have more of these, in which case all the flagged fields will be included.
Let's try a more advanced version. I want to get the c1 and c2 attributes, and nothing else, so I do this:
> db.mycoll.find({},{c1: -1, c2: -1, _id: 0})
{ "c1" : 1 }
{ "c2" : 1 }
{ "c1" : 2, "c2" : 2 }
{ }
{ }
As you can see, I have to explicitly exclude the _id field.
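As a rough model of what the second argument to find() does, here is a sketch in plain JavaScript (Node.js, no MongoDB needed). The project() helper is hypothetical and only mimics the behaviour visible in the output above: a non-zero flag selects a field, a 0 flag drops it, and _id is kept unless it is flagged 0. The _id values are plain strings here for simplicity.

```javascript
// Hypothetical model of the field flags passed to find().
function project(doc, flags) {
  var names = Object.keys(flags);
  var included = names.filter(function (f) { return flags[f] !== 0; });
  var excluded = names.filter(function (f) { return flags[f] === 0; });
  var out = {};
  Object.keys(doc).forEach(function (field) {
    if (excluded.indexOf(field) !== -1) {
      return; // explicitly flagged 0
    }
    // With no inclusion flags at all, every remaining field is kept.
    var keep = included.length === 0 ||
               field === "_id" ||
               included.indexOf(field) !== -1;
    if (keep) {
      out[field] = doc[field];
    }
  });
  return out;
}

var sample = [
  { _id: "4eb0634807b16556bf46b214", c1: 1 },
  { _id: "4eb0634a07b16556bf46b215", c2: 1 },
  { _id: "4eb0635607b16556bf46b216", c1: 2, c2: 2 },
  { _id: "4eb0635e07b16556bf46b217", c3: 3 },
  { _id: "4eb063d307b16556bf46b218", c3: 4, c4: "some string" }
];

var onlyC1C2 = sample.map(function (d) {
  return project(d, { c1: -1, c2: -1, _id: 0 });
});
var withoutId = project(sample[4], { _id: 0 });

console.log(JSON.stringify(onlyC1C2));
console.log(JSON.stringify(withoutId));
```

The first line printed has the same shape as the shell output above, including the two empty objects for the documents that have neither c1 nor c2.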

Online help
The mongo commandline tool for once has decent online help. Typing just help will show the options. For help on database-specific operations, type db.help(), and for collection-specific operations, type db.<collection>.help(), such as db.mycoll.help(). In JavaScript, a function is just another object, and adding parentheses and arguments after the function name will execute the function, but maybe you want to see how the function is implemented? Then just type the name of the function, like this:
> db.mycoll.find
function (query, fields, limit, skip) {
    return new DBQuery(this._mongo, this._db, this, this._fullName, this._massageObject(query), fields, limit, skip);
}
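The same trick works in any JavaScript environment, not just the mongo shell. A tiny sketch (the greet function is made up for illustration):

```javascript
// A made-up function, just for illustration.
function greet(name) {
  return "Hello, " + name;
}

// With parentheses and arguments, the function is called.
var result = greet("mongo");
console.log(result);

// Without parentheses, the name refers to the function object itself.
// Converting it to a string yields its source text, which is roughly
// what the shell shows when you type a function name on its own.
var source = String(greet);
console.log(source);
```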

DBA Work - Indexing data and explain
What would a mongo DBA want to do? Let's try creating an index. Let's say we want an index on the c1 attribute in the mycoll collection as above, then we must use the ensureIndex() method on the collection in question, telling what columns I want to index, like this:
> db.mycoll.ensureIndex({c1: 1})
And that's it. Let's try to query that collection again, this time using the c1 column as an argument, and hopefully the index will be used:
> db.mycoll.find({c1: 1})
{ "_id" : ObjectId("4eb0634807b16556bf46b214"), "c1" : 1 }
Right. But is the index used? I want to know for a fact whether it is or isn't, so I have something to complain to my developers about. In MySQL, you would use the EXPLAIN command to figure out which indexes are being used, but with mongo? Easy. Use the explain method, like this:
> db.mycoll.find({c1: 1}).explain()
{
    "cursor" : "BtreeCursor c1_1",
    "nscanned" : 1,
    "nscannedObjects" : 1,
    "n" : 1,
    "millis" : 0,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "isMultiKey" : false,
    "indexOnly" : false,
    "indexBounds" : {
        "c1" : [
            [
                1,
                1
            ]
        ]
    }
}
Hey, that's pretty cool, right! The index is a standard B-tree index (the only index type available in MongoDB). An index can also be unique, like this:
> db.mycoll.ensureIndex({c2: 1}, {unique: true})
Which will create a unique index on the c2 attribute, but in our case it will not work:
E11000 duplicate key error index: test.mycoll.$c2_1 dup key: { : null }
What's going on here? Well, the c2 attribute isn't included in all objects, but the index will include all objects, and MongoDB treats the missing values as duplicate NULLs here (unlike SQL NULLs, which are not considered equal to each other). So the real question here is, what do you want? As MongoDB is schema-free, and you can have any kind of attributes, and also looking at the data above, what I would probably want is an index on the c2 attribute that makes sure that c2 is unique WHEN INCLUDED; if the c2 attribute isn't part of the object, then please, Mr. Indexer, ignore it. This is called a sparse index in MongoDB, and what it means is an index that just indexes the objects where the attribute is included.

Note that this may not always be what you want with non-unique indexes, but often it is, and it makes searching and inserting faster (as the index is smaller). In case you have an attribute that is only rarely part of the object, and you want to find the objects where it IS included, this is just what you want.

In our case, the index is created like this:
> db.mycoll.ensureIndex({c2: 1}, {unique: true, sparse: true})
And this time we had no errors. Let's see how it works, first get some data:
> db.mycoll.find({}, {c2: 1, _id:0})
{ }
{ "c2" : 1 }
{ "c2" : 2 }
{ }
{ }
Now, let's see if the unique index on c2 will guarantee uniqueness by inserting a new row with an existing value for c2:
> db.mycoll.insert({c2: 1})
E11000 duplicate key error index: test.mycoll.$c2_1 dup key: { : 1.0 }
Yo! That worked as expected! As does this (which gives no errors):
> db.mycoll.insert({c2: 3})
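
To summarize the difference between the plain unique index and the sparse unique one, here is a toy model in plain JavaScript (Node.js, no MongoDB needed). The buildUniqueIndex() helper is hypothetical and only mimics the behaviour described above: a missing attribute is indexed as null unless the index is sparse. It says nothing about how MongoDB actually stores its B-trees.

```javascript
// Toy model of a unique index on one attribute.
// Returns the list of indexed keys, or throws on a duplicate.
function buildUniqueIndex(docs, field, sparse) {
  var seen = [];
  docs.forEach(function (doc) {
    var present = Object.prototype.hasOwnProperty.call(doc, field);
    if (!present && sparse) {
      return; // sparse: objects without the attribute are not indexed
    }
    var key = present ? doc[field] : null; // missing attribute counts as null
    if (seen.indexOf(key) !== -1) {
      throw new Error("duplicate key: " + JSON.stringify(key));
    }
    seen.push(key);
  });
  return seen;
}

// Same shape as the objects in mycoll before the last two inserts.
var mycoll = [
  { c1: 1 },
  { c2: 1 },
  { c1: 2, c2: 2 },
  { c3: 3 },
  { c3: 4, c4: "some string" }
];

// Plain unique index: the objects without c2 collide on null.
var plainError = null;
try {
  buildUniqueIndex(mycoll, "c2", false);
} catch (e) {
  plainError = e.message;
}
console.log(plainError);

// Sparse unique index: only the objects that have c2 are indexed.
var sparseKeys = buildUniqueIndex(mycoll, "c2", true);
console.log(JSON.stringify(sparseKeys));

// Adding {c2: 1} breaks uniqueness, adding {c2: 3} does not.
var insertError = null;
try {
  buildUniqueIndex(mycoll.concat([{ c2: 1 }]), "c2", true);
} catch (e) {
  insertError = e.message;
}
console.log(insertError);
var afterInsert = buildUniqueIndex(mycoll.concat([{ c2: 3 }]), "c2", true);
console.log(JSON.stringify(afterInsert));
```

The non-sparse build fails on a null key, while the sparse build indexes only the values 1 and 2, rejects a second 1 and accepts 3, which is the same sequence of results as in the shell session above.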

That's it for now, I'll be back soon with some more MongoDB DBA stuff: Sharding!
/Karlsson
