Archive for the ‘Startups’ Category

Who/What to acquire next

March 18th, 2011

Well, as predicted, with Aster Data recently being picked up by Teradata, most of the key new-generation MPP distributed analytics vendors have now been acquired (Aster Data, Vertica, Netezza & Greenplum). This had to happen and was expected to happen. The MPP analytics startup “revolution” is over and these technologies will now be integrated into the mainstream.

So what’s next? As we know, if you are a massive multinational software company it is a lot less risky to innovate incrementally and leave the development of “game changing” technologies to startups that can be acquired after they prove both the tech and the market. So what follows MPP?

NoSQL technologies seem the only likely candidate at the moment, although I think it is a few years too early for any major acquisitions to occur. A key issue that would need to be worked through is what exactly is being acquired, as most NoSQL platforms are open source / free (most MPP platforms were proprietary). Nonetheless, as the market grows and starts to take a noticeable share of the existing RDBMS market, the major vendors will want a piece of that action and the frenzy will start again. But this is still quite a while away.

 



The problem with a full box of big data tools

October 7th, 2010

“NoSQL”, for lack of a better name, is a generic term that describes any data management system that does not use SQL as a query interface. Generally this means any data management system that is non-relational, but the term has also been stretched to include systems at the boundary of what constitutes a data management system at all (such as Hadoop).

Early on (a couple of years back in NoSQL time), when the term was coined, I think the positioning was much more aggressive. More recently this has been softened, so NoSQL is now commonly quoted as meaning “Not only SQL” or “next generation databases” (whatever that means). The common message you get now is something along the lines of: NoSQL systems are more “specialized”, each being designed to solve a smaller number of problems than the generic RDBMS sets out to. NoSQL is another tool in your toolbox. A better option in certain cases where the RDBMS doesn’t fit well. A different hammer for a different type of nail. It all makes sense in theory, but in reality this brings its own set of troubles.

There are now dozens of NoSQL systems for a developer to choose from: MongoDB, Cassandra, Voldemort, HBase, CouchDB, Riak, Neo4j, HamsterDB and so on. And there are several different orientations of NoSQL system, including document, key/value and graph. It seems the same energy we saw open-source hackers putting into MySQL 10 years ago has now been transferred into a myriad of NoSQL systems. Again the argument goes: more choice, better for everyone.
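
To make the three orientations concrete, here is a minimal sketch using nothing but plain Python data structures. It is not the API of any of the systems named above; the records, field names and helper functions are all hypothetical and chosen purely for illustration.

```python
# Illustrative only: plain Python structures standing in for three NoSQL
# orientations. Nothing here is the API of a real NoSQL product.

# Key/value: an opaque value looked up by a single key.
kv_store = {"user:42": b'{"name": "Ada", "city": "London"}'}

# Document: the value is a structured record whose fields can be queried.
doc_store = [
    {"_id": 42, "name": "Ada", "city": "London", "follows": [7]},
    {"_id": 7, "name": "Grace", "city": "New York", "follows": []},
]

# Graph: nodes plus explicit edges; queries walk relationships.
graph_edges = {42: [7], 7: []}


def kv_get(key):
    """Key/value access: the store cannot look inside the value."""
    return kv_store.get(key)


def find_documents(field, value):
    """Document access: filter on a field inside each record."""
    return [doc for doc in doc_store if doc.get(field) == value]


def reachable(start):
    """Graph access: every node reachable from `start` by following edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for neighbor in graph_edges.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen
```

The same user data answers different questions cheaply depending on the orientation (lookup by key, filter by field, walk relationships), which is roughly why picking among these systems means knowing your access pattern up front.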

The problem, and I am putting it out there as a problem so we can think of ways to fix it, is that while more choice is fine in theory, in practice many choices also create difficulties. Real-world development projects have certain skill bases they draw on, with the experience and ability to “make things work” built on years of hard slog cobbling things together. And there are very few surprises left when deploying an application on a mainstream RDBMS (of course it will, like any software, still have issues from time to time).

One of the key reasons the RDBMS has been so dominant is the fact that you could use it for pretty much any requirement. And using it for any requirement meant that your developers had lots of experience building applications on it and your DBAs had lots of experience running it. But you also knew that you could almost always make any requirement work “good enough” by buying extra hardware and/or indexing the heck out of it etc. Regardless of whether it was technically the best fit or not, when all things were considered the RDBMS was a stable constant given short project timeframes and limited development budgets. It was exactly its generic nature, its ability to do most things well enough, that has led the RDBMS to become the default option for any new development project (with the choice between the various flavors, MySQL, Oracle, DB2, SQL Server, being less relevant).

As humans, we all have limited brain capacity and most of us can only be experts in a small number of things. And our expertise typically comes from our history: making mistakes and learning what works and what doesn’t through the hard yards of experience. So given a buffet of specialized NoSQL systems, how on earth do we choose the most appropriate tool for the job, while at the same time dealing with the lack of expertise we will invariably have? And what will be the impact on development projects of choosing the wrong tool for the job? The RDBMS is very forgiving of poor design, poor implementation and the subsequent addition of unforeseen application requirements (you want to run OLAP now that we have built you a busy OLTP database? Sure, but do it overnight). Will a specialist NoSQL system have the same tolerance for our incompetence?

So now I return to the point that is really the keystone of the NoSQL motivation: “there are requirements which a RDBMS doesn’t work at all well for”. I agree with this, but I have yet to see any quantification of what this actually means. Is it 5% or 10% of current development projects? And should the question really be “what percentage of development projects is the RDBMS unusable for”? Technical purity, and even reduced license costs, need to be balanced against one of the largest costs: re-skilling development and production teams to understand a new data platform.

There are some clear cases (the Googles, Twitters, Facebooks etc.) where scale alone is clearly outside the boundaries of what is possible on today’s RDBMS platforms. But what percentage of today’s development projects have these scalability requirements? 1%? Less? Sure, we are going through somewhat of a data explosion and by all accounts the volume of data we collect and manage in our databases is growing at an alarming rate. So the demand for scale will continue, but let’s also not forget that the big RDBMS vendors are very market driven, and as the market changes their products will continue to change with it. It is very unlikely they will be asleep at the wheel and lose their dominant share of the ~$30b market without a fight.

Contrary to how it may appear, I am actually supportive of a number of NoSQL initiatives and I am even hands-on with a few. But I do have concerns about how we quantify the market and how we ensure that people are making the right decisions in choosing a NoSQL platform. And also how do we bridge the gap in skill sets and experience for developers who will have years upon years of RDBMS experience but, by nature, only have exposure to NoSQL systems periodically based on certain application requirements?

 



Big Data innovation marches on

September 21st, 2010


With IBM intending to acquire Netezza, the predicted consolidation in the distributed analytics market is well underway. Recent deals include EMC/Greenplum, Teradata/Kickfire and now IBM/Netezza. A good breakdown of this deal is on Curt’s blog. There is still more to go of course, with one of the crown jewels, Vertica, still ripe for the picking.

What this indicates is that MPP analytics has moved from the innovative edge into the mainstream market, and the more risk-averse large caps are now willing to invest substantially in growing this market. Interestingly, Microsoft made this move early with the acquisition of DATAllegro in 2008. I doubt this has paid dividends yet, but 5 years out this might be a different story as the explosive growth of machine-generated data continues.

While it is probably a bad time to start building another MPP query processor, innovation in core big data technology continues to be strong. Key areas of innovation relate to Flash/SSD optimization & caching, graph databases, stream processing & CEP, Hadoop optimization, massive shared-nothing (cloud) scalability & SQL/NoSQL convergence. These technologies will come to market in a variety of different product forms, some of which will later be picked up by the large caps. Rinse, repeat.



VLDB 2010

September 6th, 2010


I will be at VLDB 2010 next week. If anyone reading this blog is attending and wants to catch up to discuss startups and innovation in DB, NoSQL, Big Data etc., drop me a line and I will try to meet up.



Why software startups decide to patent … or not

July 21st, 2010

Guest blogger Pamela Samuelson is the Richard M. Sherman Distinguished Professor of Law and Information at the University of California, Berkeley. She teaches courses on intellectual property, cyberlaw, and information privacy, and she has written and spoken extensively about the challenges that new information technologies pose for traditional legal regimes. The following column will also appear in the November 2010 issue of Communications of the ACM.

Two-thirds of the approximately 700 software entrepreneurs who participated in the 2008 Berkeley Patent Survey report that they neither have nor are seeking patents for innovations embodied in their products and services. These entrepreneurs rate patents as the least important mechanism among seven options for attaining competitive advantage in the marketplace. Even software startups that hold patents regard them as providing only a slight incentive to invest in innovation.

These are three of the most striking findings from our recently published article, "High Technology Entrepreneurs and the Patent System: Results of the 2008 Berkeley Patent Survey."

After providing some background about the survey, this column will discuss some key findings about how software startup firms perceive, use and are affected by the patent system.

While the three findings highlighted above might seem to support a software patent abolitionist position, it is significant that a third of the software entrepreneurs reported having or seeking patents, and that they perceive patents to be important to persons or firms from whom they hope to obtain financing.

Survey background

More than 1,300 high technology entrepreneurs in the software, biotechnology, medical devices, and computer hardware fields filled out the Berkeley Patent Survey. All of these firms had been started no more than ten years before the survey was conducted. We drew our sample from a general population of software firms registered with Dun & Bradstreet (D&B) and from the VentureXpert (VX) database that has a rich data set on venture-backed startups. (Just over 500 of the survey respondents were D&B firms; just under 200 were VX firms.)

Eighty percent of the software respondents were either the CEOs or CTOs of their firms, and most had experience in previous startups. The average software firm had 58 employees, half of whom were engineers. Between 10 and 15 percent of the software startup respondents among the D&B respondents were venture-backed firms. Among the software respondents, only 2 percent had experienced an initial public offering (IPO), while 9 percent had been acquired by another firm.

Our interest in conducting this survey arose because high technology entrepreneurs have contributed significantly to economic growth in recent decades. They build firms that create new products, services, organizations, and opportunities for complementary economic activities. We were curious to know the extent to which high tech startups were utilizing the patent system, as well as to learn their reasons for choosing to avail themselves of the patent system -- or not.

The basic economic principle underlying the patent system is that technology innovations are often expensive, time-consuming, and risky to develop, although once developed, these innovations are often cheap and easy to copy. In the absence of intellectual property rights (IPRs), innovative high tech firms may have insufficient incentives to invest in innovation insofar as they cannot recoup their research and development (R&D) expenses and justify further investments in innovation because of cheap copies that undermine the firms' recoupment strategy.

Although this economic principle applies to all companies, early-stage technology firms might, we conjectured, be more sensitive to IPRs than more mature firms. The former often lack various kinds of complementary assets (such as well-defined marketing channels and access to cheap credit) that the latter are more likely to enjoy. We decided it would be worthwhile to test this conjecture empirically. With generous funding from the Ewing Marion Kauffman Foundation, we and two other colleagues designed and carried out the survey and analyzed the results.

Why startups decide to patent -- or not to

The most important reasons for seeking patents, as reported by the software executives who responded to the Berkeley Patent Survey, were these:

  1. to prevent competitors from copying the innovation (2.3 on a 4 point scale, where 2 was moderately important)
  2. to enhance the firms’ reputation (2.2)
  3. and to secure investment and improve the likelihood of an IPO (1.96 and 1.97 respectively)

The importance of patents to investors was also evident from survey data showing striking differences in the rate of patenting among the VX and the D&B software companies.

Three-quarters of the D&B firms had no patents and were not seeking them. Because the D&B firms are, we believe, typical of the population of software startup firms in the U.S., their responses may be representative of patenting rates among software startups generally. It is, in fact, possible that the overall percentage of software startup patenting is lower than this, insofar as patent holders may have been more likely than other software entrepreneurs to take time to fill out a Berkeley Patent Survey.

In striking contrast to the D&B respondents, over two-thirds of the VX software startup respondents in the sample, all venture-backed, had or were seeking patents. We cannot say why these VC-backed firms were more likely to seek patents than other firms. Perhaps VCs are urging the firms they fund to seek patents; or VCs may be choosing to fund the development of software technologies that VCs think are more amenable to patenting.

Interestingly, the rate of patenting did not vary by the age of the firm (that is, older firms did not patent at rates that differed to a statistically significant degree from those of younger firms).

Why forgo patenting?

The survey asked two sets of questions about decisions to forgo patenting: For the last innovation for which the firm chose not to seek a patent, what factors influenced this decision, and then what was the most important factor in the decision?

The costs of obtaining and of enforcing patents emerged as the first and second most frequent explanations. Twenty-eight percent of the software startups reported that the costs of obtaining patents had been the most important factor in this decision, and 12 percent said that the costs of enforcing patents were the most important factor. (They reported that the average cost of getting a software patent was just under $30,000.)

Ease of inventing around the innovation and satisfaction with trade secrecy also influenced software startup decisions not to seek patents, although only rarely were these factors considered the most important.

Intriguingly, more than 40 percent of the software executive respondents cited the unpatentability of the invention as a factor in decisions to forgo patenting, and almost a quarter of them rated this as the most important factor. Indeed, unpatentability ranked just behind costs of obtaining patents as the most frequently cited "most important factor" for not seeking patents.

It is difficult to know what to make of the unpatentability finding. One explanation might be that the software entrepreneur respondents believed that patent standards of novelty, non-obviousness, and the like are so rigorous that their innovation might not have satisfied patent requirements. Yet, because the patentability of software innovations has been contentious for decades, it may also be that a significant number of these entrepreneurs have philosophical or practical objections to patents in their field.



How important are patents to competitive advantage?


One of the most striking findings of our study is that software firms ranked patents dead last among seven strategies for attaining competitive advantage identified by the survey, as Figure 1 below shows. (The relative unimportance of patents for competitive advantage in the software field contrasts sharply with the perceived importance of patents in the biotech industry, where patents are ranked the most important means of attaining such advantage.)



Figure 1: Measures of Capturing "Competitive Advantage" from Inventions




As Figure 1 shows, software startups regard first-mover advantage as the single most important strategy for attaining competitive advantage. Next most important was complementary assets (e.g., providing services for licensed software or offering a proprietary complement to an open source program).

Interestingly, these two strategies for getting ahead in the market outstrip the IPRs about which we inquired for software firms. Among IPRs, though, copyrights and trademarks, closely followed by secrecy and difficulties of reverse engineering, outranked patents as means of attaining competitive advantage among software respondents by a statistically significant margin.



What incentive effects do patents have?



The Berkeley Patent survey asked startup executives to rate the incentive effects of patents on a scale, where 0 = no incentive, 1 = weak incentive, 2 = moderate incentive, and 3 = strong incentive, for engaging in four types of innovation: (1) inventing new products, processes, or services, (2) conducting initial R&D, (3) creating internal tools or processes, and (4) undertaking the risks and costs of commercializing the innovation.

We were surprised to discover that the software respondents reported that patents provide only weak incentives for engaging in core activities, such as invention of new products (.96) and commercialization (.93). By contrast, biotech and medical device firms reported just above 2 (moderate incentives) for these same questions.

Interestingly, the results did not change significantly even when focusing only on responses from software entrepreneurs whose firms hold at least one patent or application. Even patent-holding software entrepreneurs reported that patents provide just above a weak incentive for engaging in these innovation-related activities.



Resolving a paradox


If patents provide only weak incentives for investing in innovation among software startups, why are two-thirds of the VX firms and at least one-quarter of the D&B firms seeking patents?

The answer may lie in the perception among software entrepreneurs that patents may be important to potential funders, such as venture capitalists (VCs), angel investors, other firms, commercial banks, and friends and family. Sixty percent of software startups that had negotiated with VCs reported that they perceived patents to be an important factor in VC decisions about whether to make the investments. Between 40 and 50 percent of the software respondents reported that patents were important to other types of investors, such as angels, investment banks, and other companies.



How well is the patent system working?


While most of the Berkeley Patent Survey questions focused on what firms had actually been doing vis-à-vis patents, we decided to ask a few questions to gauge the perception of high tech entrepreneurs about the patent system. We asked, for example, how well the entrepreneurs perceive the patent system to be working for them and for their industry. The scale for responses ranged from 0 = very poorly to 4 = very well, with 2 = neither poorly nor well.

The software entrepreneurs' for-my-industry rating was 1.6 and their for-my-firm rating was 1.7. Both results tend toward the poorly end of the scale (in contrast to the biotech and medical device firms that reported above 2 ratings on both questions).

It is interesting that the VX firms were slightly less positive about the patent system than the D&B firms, although the difference was not statistically significant. We also tested to see if the responses were bipolar (that is, did some software firms rate the patent system very poorly, with their ratings canceled out by some positive responses?), but discovered that the ratings fell into a normal distribution, suggesting that we had drawn a sample from a cross-section of the population.



Conclusion


Over the next several years, we expect to engage in further analysis of the results of the 2008 Berkeley Patent Survey and to report new findings about the roles that patents play in the software industry. The initial findings reported here and in the larger article suggest that software entrepreneurs do not find persuasive the canonical story that patents provide strong incentives to invest in technology innovation. These executives regard first-mover advantage and complementary assets as more important than IPRs in conferring competitive advantage upon their firms. Moreover, among IPRs, copyrights and trademarks are perceived to be more important than patents. Still, about one-third of our software entrepreneur respondents reported having or seeking patents, and their perception that their investors care about patents seems to be a key factor in decisions to obtain patents.






References:

Stuart J.H. Graham, Robert P. Merges, Pam Samuelson, & Ted Sichelman, High Technology Entrepreneurs and the Patent System: Results of the 2008 Berkeley Patent Survey, Berkeley Technology Law Journal, 25:4, pp. 1255-1327 (2010), available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1429049.


About the Authors:

Pamela Samuelson is the Richard M. Sherman Distinguished Professor of Law & Information, University of California, Berkeley.

Stuart J.H. Graham is on leave from his position as an Assistant Professor at the Georgia Institute of Technology, College of Management, to serve as the Chief Economist for the U.S. Patent & Trademark Office (USPTO). The views expressed in this article are his own, and are not the views of the USPTO.



Four short links: 25 June 2010

June 25th, 2010

  1. Membase -- an open-source (Apache 2.0 license) distributed, key-value database management system optimized for storing data behind interactive web applications. These applications must service many concurrent users; creating, storing, retrieving, aggregating, manipulating and presenting data in real-time. Supporting these requirements, membase processes data operations with quasi-deterministic low latency and high sustained throughput. (via Hacker News)
  2. Sergey's Search (Wired) -- Sergey Brin, one of the Google founders, learned he had a gene allele that gave him much higher odds of getting Parkinson's. His response has been to help medical research, both with money and through 23andme. Langston decided to see whether the 23andMe Research Initiative might be able to shed some insight on the correlation, so he rang up 23andMe’s Eriksson, and asked him to run a search. In a few minutes, Eriksson was able to identify 350 people who had the mutation responsible for Gaucher’s. A few clicks more and he was able to calculate that they were five times more likely to have Parkinson’s disease, a result practically identical to the NEJM study. All told, it took about 20 minutes. “It would’ve taken years to learn that in traditional epidemiology,” Langston says. “Even though we’re in the Wright brothers early days with this stuff, to get a result so strongly and so quickly is remarkable.”
  3. Startup.gov (YouTube) -- Anil Dash talk at Personal Democracy Forum on applying insights from startups to government. I hope the more people say this, the greater the odds it'll be acted on.
  4. Open Core Software -- Marten Mickos (ex-MySQL) talks up "open core" (open source base, proprietary extensions) as a way to resolve the conflict of "change the world with open source" and "make money". Brian Aker disagrees: There has been no successful launch of an open core company that has reached any significant size, especially of the size that Marten hints at in the article. My take: there are three reasons for open source (freedoms, price, and development scale) and if you close the source to part of your product then the whole product loses those benefits. If you open source enough that the open source bit has massive momentum, then you probably don't have enough left proprietary to gain huge financial benefit.



Guy Kawasaki’s "Reality Check"

June 17th, 2010


In case you haven't figured it out, I'm a fan of Guy Kawasaki and his "How to Change the World" blog. If you like his blog, you should check out his book "Reality Check." Yes, you can read most of the content for free on the web, but sometimes a printed copy is more convenient. Like if you're on an airplane. Or on the toilet. Or if you want to underline it. Or if you want to underline it while you're on the toilet on an airplane. Ok, you get the idea.

The book covers some of the best items from his blog, categorized into themes like starting a company, raising money, business planning, innovation, marketing, schmoozing, management, hiring and firing and more. It's not a bunch of high-falutin' theories either. It's hard lessons learned by working with hundreds of entrepreneurs. Guy is also a player coach, having built four startup companies and served on the boards of ten companies. So it's practical advice rather than academic theory. And even though it's practical, it's still entertaining. You won't find the top 10 lies of VCs or the top 11 lies of entrepreneurs in any other book. And I doubt you'll read either of those pieces and not learn something either about yourself or about how you conduct your business. With more than 90 essays (including some great Q&A pieces) if you heed just a fraction of the advice in this book, it will pay for itself tens if not hundreds of times over. How's that for a compelling ROI?

While a lot of his advice is oriented towards startups, in my experience, it's equally applicable to large companies.  It's all about focus and execution.  Guy was one of the top rated speakers we ever had at the MySQL Users Conference, and he's well worth whatever he charges for these events. 



Riptano for Cassandra

May 3rd, 2010


Cassandra is one of the most interesting NoSQL platforms at the moment. And by most interesting what I really mean is the most clearly justifiable. Some NoSQL platforms offer new data models, improved query interfaces and/or good single-node performance through relaxed consistency models. As a database guy, however, I still find it difficult to justify throwing out the RDBMS baby with the bathwater, as NoSQL platforms tend to be highly focused on one aspect of data management and very immature in all other areas. Cassandra is somewhat different, as it is more mature in a number of key areas (albeit still immature in others). These are areas that can make Cassandra more justifiable for the right project when compared with a more traditional RDBMS-based solution, because Cassandra’s primary capabilities can’t easily be replicated on those traditional mainstream platforms.

Cassandra’s primary focus is on scalability. More specifically, that is scalability combined with reasonable functionality, performance and availability at scale. While some other platforms are trying to bolt scalability/availability onto their functionality-rich data engines, Cassandra already has proven real-life examples running 150-node clusters. Notable users of Cassandra include Digg, Facebook, Twitter, Reddit & Rackspace. And the feedback from these sites is very good; Cassandra is commonly described as the hands-down winner for transaction processing performance at scale.

One of the key contributors to Cassandra has been Jonathan Ellis, who until recently worked on Cassandra while employed by Rackspace. I was pleased to hear that Jonathan and business partner Matt Pfeil have taken the step of setting up their own Cassandra-focused company, Riptano.

Riptano is providing the commercial support services around the open source Cassandra that are necessary for the platform to survive and grow. While such services may be less important for adoption by the techie-rich Web 2.0 crowd, for any platform to become mainstream there needs to be an escalation path for companies uninterested in, or unable to, tinker with the code themselves. Riptano provides those services, which can allow Cassandra usage to grow further.

Just as importantly, this move gives representation to Cassandra and provides an entity whose best interests will be served through advocacy of the platform.  While Jonathan and others had been doing a fine job of this to date personally, another corporation investing commercial dollars into advocacy will be important to ensure Cassandra’s message isn’t drowned out by more highly funded alternatives.

Riptano has received some early funding from Rackspace and I believe already has a few customers signed for their support services. Best of luck, Jonathan & Matt.


NoSQL Buzz

April 14th, 2010

I have noticed a definite increase in NoSQL buzz over the last few months. This is partly confirmed by Google Trends, a service that shows how search topics rank:

[Chart: Google Trends results for “NoSQL”]

The last couple of months have seen a dramatic rise in both the number of searches and the number of news items relating to NoSQL.

But the traditionalists need not fret yet: interest in NoSQL is still but a blip on the data management radar, as demonstrated by this comparison between NoSQL and MySQL search rankings:

[Chart: Google Trends comparison of “NoSQL” and “MySQL”]

It will be interesting to see how these dynamics change throughout 2010, though.


What is Big Data?

January 31st, 2010


One of my favorite terms at the moment is “Big Data”.  While all terms are by nature subjective, in this post I will try and explain what Big Data means to me.

So what is Big Data?

Big Data is the “modern scale” at which we are defining our data usage challenges. Big Data begins at the point where we need to seriously start thinking about the technologies used to drive our information needs.

While Big Data as a term seems to refer to volume, this isn’t the case. Many existing technologies have little problem physically handling large volumes (TB or PB) of data. Instead, the Big Data challenges result from the combination of volume and our usage demands on that data. And those usage demands are nearly always tied to timeliness.

Big Data is therefore the push to utilize “modern” volumes of data within “modern” timeframes. The exact definitions are of course relative & constantly changing; right now we are somewhere along the path towards the end goal, which is of course the ability to handle an unlimited volume of data, processing all requests in real time.

So what are Big Data technologies?

More than at any point in the past, data-related technologies are the focus of research & innovation. But Big Data challenges won’t be solved anytime soon by a single approach. Keeping in mind all the different platforms that Big Data is having an impact on (web, cloud, enterprise, mobile), combined with all the Big Data domain challenges (transaction processing, analytics, data mining, visualization) as well as many of the Big Data characteristic requirements (volume, timeliness, availability, consistency), it is easy to see how no single technology will provide a cover-all solution for the eclectic mix of needs. Instead, a broad set of technologies, each focused on meeting a specific set of needs, is improving our ability to manage data at scale.

A few common areas of innovation that I describe as Big Data technologies include: MPP Analytics, Cloud Data Services, Hadoop & Map/Reduce (and associated technologies such as HBase, Pig & Hive), In-Memory Databases and Distributed Transaction Processing.

So what is the point of Big Data?

Someone asked me if Big Data was just tools to “try and sell them more relevant crap they don’t want”. While up-sell & targeted advertising are two major uses of Big Data technologies, I hope that my work and that of others in this field does result in achievements more significant than just these.

When describing the point of Big Data I like to think about how the Internet has changed my life in general.  By having unlimited & timely access to information we are now better informed in all areas of our existence than ever before.  However, we are now facing the problem that there is fast becoming too much data for us to digest in its raw form.  To move forward in our understanding we will need to rely on technology to provide timely, summarized & relevant data across all aspects of our lives.  This is what those working in Big Data are setting out to achieve.

