Archive for the ‘hbase’ Category

MariaDB: the new MySQL? Interview with Michael Monty Widenius.

September 29th, 2011
“I want to ensure that the MySQL code base (under the name of MariaDB) will survive as open source, in spite of what Oracle may do.” -- Michael “Monty” Widenius. Michael “Monty” Widenius is the main author of the original version of the open-source MySQL database and a founding member of the MySQL AB company. [...]

451 CAOS Links 2010.10.15

October 15th, 2010

The future of the JCP. A new Mozilla CEO. And more.

Follow 451 CAOS Links live @caostheory on Twitter and Identi.ca, and daily at Paper.li/caostheory
“Tracking the open source news wires, so you don’t have to.”

# Mike Milinkovich explained why the Eclipse Foundation will support Oracle’s plans for Java 7, and outlined its concerns about the Java 8 JSR.

# Stephen Colebourne outlined the choices facing Java Community Process executive committee voters: pragmatism or bust, before later proposing a third option: a split in the Java Community Process between core and ecosystem projects.

# Gary Kovacs was named the new CEO of the Mozilla Corporation.

# New Relic raised $10m in series C funding.

# Oracle maintained its commitment to OpenOffice.org and released OpenOffice.org 3.2.1 and OpenOffice.org 3.3 Beta.

# SkySQL formally launched its services and support for the MySQL database with the release of SkySQL Enterprise.

# Android drove $1bn ad revenue for Google.

# Ross Gardler described the Apache Software Foundation’s open development methodology.

# Red Hat updated its messaging, realtime and grid technologies with the release of Red Hat Enterprise MRG 1.3.

# Actuate’s Nobby Akiha offered some advice for closed source companies transitioning to open source.

# OSSCube released OSSCube Voice – an open source integration of Asterisk and SugarCRM.

# StumbleUpon confirmed plans to open source OpenTSDB: a scalable time series database built on top of HBase.

# SugarCRM claimed 60% revenue growth in Q3.

# Civic Commons asked What’s the return on investment for open?

# The Free Software Foundation announced the criteria for its hardware endorsement program.

# Adobe’s Dave McAllister discussed why it, and other software vendors, release open source code.

# Engine Yard formalized its support for fog, the cloud computing library for Ruby applications.

# The Linux Foundation’s survey suggested Linux adoption over the next five years will outpace Windows.

# Datameer announced the general availability of its Datameer Analytics Solution for Hadoop.

# SGI announced support and benchmarks for VoltDB’s Database.

# Ingres announced the availability of Ingres Database 10.

# Vyatta integrated Sourcefire Intrusion Prevention System rules.




LCA Miniconf Call for Papers: Data Storage: Databases, Filesystems, Cloud Storage, SQL and NoSQL

September 29th, 2010

This miniconf aims to cover many of the current methods of data storage and retrieval and attempt to bring order to the universe. We’re aiming to cover what various systems do, what the latest developments are and what you should use for various applications.

We aim for talks from developers of and developers using the software in question.

Aiming for some combination of: PostgreSQL, Drizzle, MySQL, XFS, ext[34], Swift (open source cloud storage, part of OpenStack), memcached, TokyoCabinet, TDB/CTDB, CouchDB, MongoDB, Cassandra, HBase….. and more!

Call for Papers open NOW (Until 22nd October).



Liveblogging at Confoo: Blending NoSQL and SQL

March 11th, 2010

Persistence Smoothie: Blending NoSQL and SQL – see user feedback and comments at http://joind.in/talk/view/1332.

Michael Bleigh is from Intridea, high-end Ruby and Ruby on Rails consultants who build apps from start to finish and make them scalable. He’s written a lot of stuff, available at http://github.com/intridea. He is @mbleigh on Twitter.

NoSQL is a new way to think about persistence. Most NoSQL systems are not ACID compliant (Atomicity, Consistency, Isolation, Durability).

Generally, most NoSQL systems have:

  • Denormalization
  • Eventual Consistency
  • Schema-Free
  • Horizontal Scale

NoSQL tries to scale (more) simply, and it is starting to go mainstream – NY Times, BBC, SourceForge, Digg, Sony, ShopWiki, Meebo, and more. But it’s not *entirely* mainstream; it’s still hard to sell due to compliance and other reasons.

NoSQL has gotten very popular, with lots of blog posts about it, but it has reached a hype peak and obviously it can’t do everything.

“NoSQL is a (growing) collection of tools, not a new way of life.”

What is NoSQL? Can be several things:

  • Key-Value Stores
  • Document Databases
  • Column-oriented data stores
  • Graph Databases

Key-Value Stores


memcached is a “big hash in the sky” – it is a key value store. Similarly, NoSQL key-value stores “add to that big hash in the sky” and store to disk.

Speaker’s favorite is Redis because it’s similar to memcached.

  • key-value store + datatypes (lists, sets, sorted sets with scores, soon hashes will be there)
  • cache-like functions (like expiration)
  • (Mostly) in-memory
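
Not from the talk: a minimal sketch of the “key-value store plus cache-like expiration” idea in plain Python. The class `TinyKV`, its methods and the injectable `clock` are hypothetical names for illustration; this is not Redis’s API or how Redis is implemented.

```python
import time


class TinyKV:
    """Toy in-memory key-value store with optional per-key expiration.

    Hypothetical illustration only; not how Redis works internally.
    """

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable so expiry can be exercised in tests
        self._data = {}       # key -> (value, expires_at or None)

    def set(self, key, value, ttl=None):
        """Store a value; ttl is a lifetime in seconds, or None for no expiry."""
        expires_at = None if ttl is None else self._clock() + ttl
        self._data[key] = (value, expires_at)

    def get(self, key, default=None):
        """Return the stored value, or default if missing or expired."""
        item = self._data.get(key)
        if item is None:
            return default
        value, expires_at = item
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]   # lazily drop the expired key on read
            return default
        return value
```

Values can be any Python object (a list, a set), which loosely mirrors the “datatypes” point above; expired keys are only dropped when they are next read.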

Another interesting key-value store is Riak

  • Combination of key-value store and document database
  • heavy into HTTP REST
  • You can create links between documents, and do “link walking” that you don’t normally get out of a key-value store
  • built-in Map Reduce

Map Reduce:


  • Massively parallel way to process large datasets
  • First you scour data and “map” a new set of data
  • Then you “reduce” the data down to a salient result. For example, a map reduce function to make a tag cloud: the map function makes an array with a tag name and a count of 1 for each instance of that tag, and the reduce function goes through that array and counts them…
  • http://en.wikipedia.org/wiki/MapReduce
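
Not from the talk: the tag cloud example can be sketched in a few lines of Python. The function names and the sample `posts` are made up for illustration, and everything runs in one process, whereas a real MapReduce system spreads the map calls across many machines.

```python
from collections import defaultdict


def map_tags(post):
    """Map step: emit a (tag, 1) pair for every tag on one post."""
    return [(tag, 1) for tag in post["tags"]]


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each tag."""
    totals = defaultdict(int)
    for tag, count in pairs:
        totals[tag] += count
    return dict(totals)


def tag_cloud(posts):
    # The map calls are independent of each other, which is what makes
    # the approach easy to parallelize; here they simply run in sequence.
    emitted = [pair for post in posts for pair in map_tags(post)]
    return reduce_counts(emitted)


# Hypothetical sample data.
posts = [
    {"title": "a", "tags": ["nosql", "redis"]},
    {"title": "b", "tags": ["nosql", "mongodb"]},
    {"title": "c", "tags": ["nosql"]},
]
cloud = tag_cloud(posts)
```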

Other key-value stores:

Document Databases


Some say that it’s the “closest” thing to real SQL.
  • MongoDB – Document store that speaks BSON (Binary JSON, which is compact). This is the speaker’s favorite because it has a rich query syntax that makes it close to SQL. Can’t do joins, but can embed objects in other objects, so it’s a tradeoff.
    • Also has GridFS that can store large files efficiently, can scale to petabytes of data
    • Does have MapReduce, but it’s a deliberate step that you run every so often.

  • CouchDB
    • Pure JSON Document Store – can query directly with nearly pure JavaScript (there are auth issues) but it’s an interesting paradigm to be able to run your app almost entirely through JavaScript.
    • HTTP REST interface
    • MapReduce is the only way to see items in CouchDB. It is incremental MapReduce: every time you add or modify a document, the results of the functions you’ve written are updated dynamically. You can do really powerful queries as easily as you can do simple queries. However, some things are really complex, e.g., pagination is almost impossible to do.
    • Intelligent Replication – CouchDB is designed to work with offline integration. Could be used instead of SQLite as the HTML5 data store, but you need CouchDB running locally to be doing offline stuff with CouchDB

Column-oriented store


Columns are stored together (e.g., all the names) instead of rows. This lets you be schema-less because you don’t care about a row’s consistency; you can just add a column to a table very easily.
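
Not from the talk: a toy Python sketch of the difference between the two layouts. The sample records and the helper `to_columns` are invented for illustration; real column stores add compression, on-disk formats and distribution on top of this idea.

```python
# The same three records, first laid out row by row.
rows = [
    {"id": 1, "name": "Ann", "city": "Oslo"},
    {"id": 2, "name": "Bob", "city": "Lima"},
    {"id": 3, "name": "Cy", "city": "Oslo"},
]


def to_columns(rows):
    """Pivot row dicts into one list per column, padding gaps with None."""
    names = []
    for row in rows:
        for key in row:
            if key not in names:
                names.append(key)
    return {name: [row.get(name) for row in rows] for name in names}


columns = to_columns(rows)

# Reading one attribute touches a single list, not every record.
all_names = columns["name"]

# Adding a column is one new list; the existing columns are left untouched.
columns["email"] = [None, "bob@example.com", None]
```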

Graph Databases


Speaker’s opinion – there aren’t enough of these.
Neo4J – can handle modeling complex relationships such as “friends of friends of cousins”, but it requires a license.
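
Not from the talk: a small Python sketch of the kind of multi-step relationship query a graph database is built for, here “everyone within two hops of a person”. The adjacency dict and the function `within_hops` are hypothetical; this is a plain breadth-first search, not Neo4J’s API.

```python
from collections import deque


def within_hops(graph, start, max_hops):
    """Return {person: hops} for everyone reachable in at most max_hops steps."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if seen[person] == max_hops:
            continue   # do not expand past the hop limit
        for friend in graph.get(person, ()):
            if friend not in seen:
                seen[friend] = seen[person] + 1
                queue.append(friend)
    del seen[start]    # the starting person is not their own friend
    return seen


# Hypothetical friendship graph.
friends = {
    "ann": ["bob", "cy"],
    "bob": ["ann", "dee"],
    "cy": ["ann"],
    "dee": ["bob", "eve"],
    "eve": ["dee"],
}
friends_of_friends = within_hops(friends, "ann", 2)
```

In a relational schema each extra hop is typically another self-join on a friendship table, which is the pain point the graph model is meant to remove.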

When should I use this stuff?





If you have: | Use:
Complex, slow joins for an “activity stream” | Denormalize, use a key-value store.
Variable schema, vertical interaction | Document database or column store
Modeling multi-step relationships (linkedin, friends of friends, etc) | Graph

Don’t look for a single tool that does every job. Use more than one if it’s appropriate, but weigh the tradeoffs (i.e., don’t have 7 different data stores either!)

NoSQL solves real scalability and data design issues. But financial transactions HAVE to be atomic, so don’t use NoSQL for those.

A good presentation is http://www.slideshare.net/bscofield/the-state-of-nosql.

Using SQL and NoSQL together


Why? Well, your data is already in an SQL database (most likely).

You can blend by hand, but the easy way is DataMapper:

  • Generic, relational ORM (adapters for many SQL dbs and many NoSQL stores)
  • Implements Identity Map
  • Module-based inclusion (instead of extending from a class, you just include into a class).
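
Not from the talk: the Identity Map pattern mentioned above, sketched in Python rather than Ruby. The class, the `loader` callback and `fake_loader` are hypothetical; this shows the general pattern, not DataMapper’s implementation.

```python
class IdentityMap:
    """Hands back the same in-memory object for the same (kind, key) pair."""

    def __init__(self, loader):
        self._loader = loader   # callable(kind, key) -> record (hypothetical)
        self._cache = {}
        self.loads = 0          # how many times the backing store was hit

    def get(self, kind, key):
        ident = (kind, key)
        if ident not in self._cache:
            self._cache[ident] = self._loader(kind, key)
            self.loads += 1
        return self._cache[ident]


def fake_loader(kind, key):
    # Stand-in for a real database read.
    return {"kind": kind, "id": key}


session = IdentityMap(fake_loader)
first = session.get("user", 1)
second = session.get("user", 1)   # served from the map, no second load
```

Because both lookups return the very same object, a change made through `first` is visible through `second`, and the store is only read once.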

You can set up multiple data targets (default is MySQL, example sets up MongoDB too).
DataMapper is:

  • Ultimate Polyglot ORM
  • simple relationships between persistence engines are easy
  • jack of all trades, master of none
  • Sometimes perpetuates false assumptions –
  • If you’re in Ruby, your legacy stuff is in ActiveRecord, so you’re going to have to rewrite your code anyway.

Speaker’s idea, to be less generic and make better use of the features of each data store, is Gloo: “Gloo glues together different ORMs by providing relationship proxies.” This software is ALPHA ALPHA ALPHA.

The goal is to be able to define relationships on the terms of any ORM from any class, ORM or not
Right now – partially working ActiveRecord relationships
Is he doing it wrong? Is it a crazy/stupid idea? Maybe.

Example:





Need | Use
Assume you already have an auth system | It’s already in SQL, so leave it there.
Need users to be able to purchase items from the storefront. Can’t lose transactions, need full ACID compliance | Use MySQL.
Social graph: want to have activity streams and 1-way and 2-way relationships. Need speed, but not consistency | Use Redis.
Product listings: selling movies and books, both have different properties, products are pretty much non-relational | Use MongoDB.

He wrote the example in about 3 hours, so integration of multiple data stores can be done quickly and still work.
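
Not the speaker’s example (which was in Ruby and is not reproduced here): a rough Python sketch of the same division of labor, under loud assumptions. The standard-library `sqlite3` module stands in for MySQL, and plain dicts stand in for MongoDB and Redis; the table names, sample products and the `purchase` function are all invented for illustration.

```python
import sqlite3

# Relational side: orders must not be lost, so they go through a transaction.
# An in-memory SQLite database stands in for MySQL.
sql = sqlite3.connect(":memory:")
sql.execute(
    "CREATE TABLE stock (sku TEXT PRIMARY KEY, "
    "qty INTEGER NOT NULL CHECK (qty >= 0))"
)
sql.execute("CREATE TABLE orders (user_id INTEGER NOT NULL, sku TEXT NOT NULL)")
sql.execute("INSERT INTO stock VALUES ('book-1', 1)")
sql.commit()

# Document side: products with different shapes (a dict stands in for MongoDB).
products = {
    "book-1": {"type": "book", "title": "Example Book", "pages": 300},
    "movie-1": {"type": "movie", "title": "Example Movie", "minutes": 90},
}

# Key-value side: per-user activity lists (a dict stands in for Redis).
activity = {}


def purchase(user_id, sku):
    """Record an order atomically, then append to the activity stream."""
    try:
        with sql:  # commits on success, rolls back if an exception is raised
            sql.execute("INSERT INTO orders VALUES (?, ?)", (user_id, sku))
            sql.execute("UPDATE stock SET qty = qty - 1 WHERE sku = ?", (sku,))
    except sqlite3.IntegrityError:
        return False  # out of stock: the order insert is rolled back too
    title = products[sku]["title"]
    activity.setdefault(user_id, []).append(f"bought {title}")
    return True


first_ok = purchase(1, "book-1")    # takes the last copy
second_ok = purchase(2, "book-1")   # violates the CHECK, so nothing is saved
```

The design choice mirrors the table above: only the step that cannot lose data pays for a transaction, while the activity stream is updated afterwards and could lag or be rebuilt.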

