Archive for the ‘Key/Value Stores’ Category

The NoSQL community needs to engage the DBAs

July 30th, 2009

The NoSQL movement has been gaining steam lately, with discussion forums and mailing lists popping up all around the web.  Despite a career centered on the RDBMS, I have made no secret that I think we have gone too far with our RDBMS-for-everything mindset.  I think we need to add a few more tools back into our data toolbox.

Today, 99.5% of new data-centric projects will use an RDBMS by default.  Maybe 0.5% will consider something as obscure as a NoSQL platform.  From experience, I know that the majority of people discussing NoSQL platforms today are web developers.  In fact, there is almost a sense of trying to keep this under the radar of DBAs: if we don’t talk to the DBAs about this stuff, then they won’t bother us with all that jabber about consistency, data integrity, robustness and recovery.

In fact, many NoSQL projects tout, as one of their key benefits, that you can do big data without the need for a costly DBA.

Baloney.

This tells me that the people making those comments have no idea what DBAs do, or what happens to critical data applications after deployment.

A NoSQL data platform may take a different approach to operational management than an RDBMS, but a large part of the requirement is the same.  It doesn’t matter whether you have 10, 100 or 1000 GB of data, or whether it is deployed on a NoSQL platform or an RDBMS.  Someone still needs to think about backup and recovery, availability, capacity planning, performance monitoring, import/export, data integration, tuning and optimization, replication latency and so on.  Also, I have never come across any technology that works perfectly 100% of the time, so when things don’t work as expected, and nodes are out of sync or partial data corruption occurs at 2am, someone will still need to fix it.  Guess who that is going to be.

DBAs are critical to any wide-scale success with NoSQL platforms.  They need to be engaged and educated.  Sure, they are going to be really annoying for quite a while, ripping into common NoSQL limitations such as lack of transaction support, eventual consistency, data duplication and application-controlled data integrity.  However, over time they will start to see the positive aspects as well, and learn that sometimes a mallet isn’t the only tool required.


HamsterDB

July 29th, 2009

This post was a bit of a test to see if I could write a serious post about a database platform called Hamster.  I think I just made it :)

With all the noise over key/value stores recently, we should keep in mind that this technology isn’t exactly new.  It is being applied to new problems, but many of the foundations have been around for decades.  Probably the oldest of them all, Berkeley DB came into existence during the mid-’80s and now has over 200 million deployments (according to the Oracle web site).

HamsterDB, while not having the same pedigree as Berkeley DB, has been steadily worked on by Christoph Rupp for the last 5 years.  I spoke to Christoph yesterday about the release of a new edition of Hamster.

Hamster has primarily been a single-threaded data store for embedded use; however, Christoph has expanded Hamster into two editions, the second being a multi-user transactional platform.  This month sees the release of the beta build of this edition.

Key features of the NoSQL HamsterDB transactional key/value store:

  • ACID Compliance
  • Lock Free Architecture (transactions fail on conflict rather than block)
  • Transaction logging & failure recovery (redo logs)
  • In Memory support – can be used as a non-persisted cache
  • B+ Trees – supported but additional indexes are user maintained (see below)
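
To make "transactions fail on conflict rather than block" concrete, here is a minimal toy sketch in Python.  It is not HamsterDB's API or implementation; the names ToyStore, ToyTxn and ConflictError are hypothetical, and the sketch only assumes that a second transaction writing a key already claimed by an open transaction gets an immediate error instead of waiting.

```python
class ConflictError(Exception):
    """Raised when a key is already claimed by another open transaction."""


class ToyStore:
    """Hypothetical in-memory key/value store (illustration only)."""

    def __init__(self):
        self._data = {}    # committed key -> value
        self._owners = {}  # key -> open transaction that has written it

    def begin(self):
        return ToyTxn(self)

    def get(self, key, default=None):
        # Reads see committed data only.
        return self._data.get(key, default)


class ToyTxn:
    def __init__(self, store):
        self._store = store
        self._writes = {}  # uncommitted writes of this transaction

    def put(self, key, value):
        owner = self._store._owners.get(key)
        if owner is not None and owner is not self:
            # Fail straight away instead of blocking until the owner finishes.
            raise ConflictError(f"key {key!r} is held by another transaction")
        self._store._owners[key] = self
        self._writes[key] = value

    def commit(self):
        self._store._data.update(self._writes)
        self._release()

    def abort(self):
        self._release()

    def _release(self):
        # Every key in _writes is owned by this transaction.
        for key in self._writes:
            self._store._owners.pop(key, None)
        self._writes = {}


store = ToyStore()
t1 = store.begin()
t2 = store.begin()
t1.put("a", 1)
try:
    t2.put("a", 2)      # t1 still holds "a"
    conflicted = False
except ConflictError:
    conflicted = True   # the caller decides whether to retry or give up
t1.commit()
```

The design choice this illustrates is that the caller, not the store, handles contention: the losing transaction can abort and retry once the winner has committed.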


Performance details are still sparse, but the embedded edition has done very well when compared with Berkeley DB.  HamsterDB is dual-licensed, under the GPL and under commercial licenses.  It provides native C++ support, and wrappers exist for .NET, Java and Python.

Hamster is a pure key/value store, so it doesn’t have the feature set of hybrids such as MongoDB or Wistla: lookups are by key only, and B-Tree indexes are supported but must be maintained manually by the application.  HamsterDB also has no distribution layer, so it is for single-node use.
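
As a rough illustration of what "manually maintained by the application" means in practice, the sketch below keeps a secondary index next to the primary data using two plain Python dictionaries.  It does not use HamsterDB or its wrappers; the UserStore class, its method names and the assumption that email addresses are unique are all hypothetical.

```python
class UserStore:
    """Hypothetical wrapper over two key/value maps (illustration only)."""

    def __init__(self):
        self.primary = {}   # user_id -> record (the only built-in lookup)
        self.by_email = {}  # email -> user_id (index kept in sync by hand)

    def put(self, user_id, record):
        old = self.primary.get(user_id)
        if old is not None:
            # Drop the stale index entry before writing the new one.
            self.by_email.pop(old["email"], None)
        self.primary[user_id] = record
        self.by_email[record["email"]] = user_id

    def delete(self, user_id):
        old = self.primary.pop(user_id, None)
        if old is not None:
            self.by_email.pop(old["email"], None)

    def find_by_email(self, email):
        # Two key lookups: index first, then the primary store.
        user_id = self.by_email.get(email)
        return None if user_id is None else self.primary[user_id]


users = UserStore()
users.put(1, {"name": "Ann", "email": "ann@example.com"})
users.put(1, {"name": "Ann", "email": "ann@example.org"})  # email changed
found = users.find_by_email("ann@example.org")
stale = users.find_by_email("ann@example.com")
```

Every write path has to update both maps, and in a real store both updates would need to sit inside one transaction, which is exactly the bookkeeping a hybrid store does for you.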

But as a simple, lightweight, high-performance key/value alternative, HamsterDB looks very interesting.
