Archive for the ‘lucene’ Category

Disrupting IT with Open Source & Cloud

June 2nd, 2010

A couple of weeks ago I gave a presentation at the Apache Lucene Eurocon in Prague. It was a good conference focused on Lucene/Solr open source search technology and sponsored by Lucid Imagination.  

I've posted the bulk of the presentation below.  (I omitted a couple of slides that were MySQL specific.) Even though it was a technical conference, I got positive feedback from the attendees and organizers that the information was useful in helping folks think about where to focus their efforts.  

The slides have been posted to Box.net and are shown using their new "embedded preview" feature which is pretty cool. You can also use the short URL www.tinyurl.com/box-disr

Thanks to the folks at Lucid Imagination as well as those who gave input and feedback on the presentation.



NoSQL options

October 6th, 2009

The NoSQL event in New York had a number of presentations on non-relational technologies, including Hadoop, MongoDB and CouchDB.

Coming from a 20-year relational background with Ingres, Oracle and MySQL, I have been moving my focus towards non-relational data stores. The most obvious and widely used today is memcached, a non-persistent distributed key/value store. There are also a number of persistent key/value stores in the marketplace: Tokyo Cabinet, Project Voldemort and Redis, to name a few.

My list of data store products helps to identify the complex name space of varying products that now exist. The trend is towards schemaless solutions that can better manage dynamically typed or formatted information; an Agile release approach is simply not achievable with a statically typed relational table/column structure. The impact of constant ALTER TABLE commands on a MySQL database can make your production system unusable.

In highly distributed online, and increasingly offline, operations, fault tolerance, data synchronization and eventual consistency are required features in complex topologies such as multi-master.

I advise and promote a technology-agnostic solution when possible. With the use of an API this is actually achievable; however, in order to use a variety of backend data store products, one must consider the design patterns for optimal management. Two factors that support a highly distributed data set are no joins and minimal transactional semantics. The Facebook API is a great example, where there are no joins against their MySQL relational backend. Moving back to a logical, non-normalized schema, or towards a totally schemaless solution, does require great thought about the architectural concepts of your application.
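As a rough illustration of the API idea above, here is a minimal Python sketch. The names `KeyValueStore`, `InMemoryStore` and `friends_of` are hypothetical and invented for this example, as are the `user:<id>` key layout and the sample data; this is not how Facebook or any particular product does it. The application codes against a two-method interface, and a "join" becomes a series of key lookups over denormalized records.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class KeyValueStore(ABC):
    """Hypothetical minimal interface the application codes against."""

    @abstractmethod
    def get(self, key: str) -> Optional[dict]:
        ...

    @abstractmethod
    def put(self, key: str, value: dict) -> None:
        ...


class InMemoryStore(KeyValueStore):
    """Stand-in backend; an adapter for a real product would expose the same two methods."""

    def __init__(self) -> None:
        self._data: Dict[str, dict] = {}

    def get(self, key: str) -> Optional[dict]:
        return self._data.get(key)

    def put(self, key: str, value: dict) -> None:
        self._data[key] = dict(value)


def friends_of(store: KeyValueStore, user_id: str) -> List[str]:
    """Resolve friend names with key lookups instead of a SQL join."""
    user = store.get(f"user:{user_id}")
    if user is None:
        return []
    names = []
    for friend_id in user.get("friend_ids", []):
        friend = store.get(f"user:{friend_id}")
        if friend is not None:
            names.append(friend["name"])
    return names


store = InMemoryStore()
store.put("user:1", {"name": "Ann", "friend_ids": ["2", "3"]})
store.put("user:2", {"name": "Bob", "friend_ids": ["1"]})
store.put("user:3", {"name": "Cy", "friend_ids": []})
print(friends_of(store, "1"))
```

Because `friends_of` only depends on `get`, swapping the backend means writing another `KeyValueStore` subclass rather than touching application code. The cost is that the application, not the database, now owns the relationship logic.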

Ultimately, feature requirements will dictate the relative strengths and weaknesses of products. Full text search is a good example: CouchDB supports it via Lucene. Another CouchDB feature I like is its append-only data mode. This makes durability easy and automatic recovery after a crash a non-issue, something a transactional relational database cannot achieve.
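To show why append-only storage makes crash recovery simple, here is a toy Python sketch. It is not CouchDB's actual file format; the `AppendOnlyStore` class and its one-JSON-record-per-line log are assumptions made purely for illustration. Writes only ever add to the end of the file, so recovery is just a replay that stops at the first incomplete record.

```python
import json
import os
import tempfile


class AppendOnlyStore:
    """Toy key/value store backed by an append-only log (illustrative only)."""

    def __init__(self, path):
        self.path = path
        self.index = {}
        self._replay()
        self._fh = open(self.path, "ab")

    def _replay(self):
        """Rebuild the in-memory index, discarding a torn record at the tail."""
        if not os.path.exists(self.path):
            return
        good = 0  # byte offset of the end of the last complete record
        with open(self.path, "rb") as fh:
            for raw in fh:
                if not raw.endswith(b"\n"):
                    break  # final write was cut short by a crash
                try:
                    record = json.loads(raw.decode("utf-8"))
                except ValueError:
                    break  # corrupt record; ignore it and anything after it
                self.index[record["key"]] = record["value"]
                good += len(raw)
        with open(self.path, "r+b") as fh:
            fh.truncate(good)  # drop the incomplete tail before appending again

    def put(self, key, value):
        line = json.dumps({"key": key, "value": value}) + "\n"
        self._fh.write(line.encode("utf-8"))
        self._fh.flush()
        os.fsync(self._fh.fileno())  # ask the OS to persist before acknowledging
        self.index[key] = value

    def get(self, key):
        return self.index.get(key)

    def close(self):
        self._fh.close()


path = os.path.join(tempfile.mkdtemp(), "data.log")
db = AppendOnlyStore(path)
db.put("a", 1)
db.put("a", 2)  # an update is a new record, never an overwrite
db.close()

# Simulate a crash in the middle of a write.
with open(path, "ab") as fh:
    fh.write(b'{"key": "b", "val')

recovered = AppendOnlyStore(path)
print(recovered.get("a"), recovered.get("b"))
```

Old versions of a value stay in the file until something compacts it, which is the usual trade-off of this design: simple recovery in exchange for extra disk space.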

With a two-day no:sql(east) conference this month, there is definitely greater interest in this space.

