<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>PlanetMysql.ru - information about the MySQL DBMS &#187; data warehouse</title>
	<atom:link href="http://planetmysql.ru/category/data-warehouse/feed/" rel="self" type="application/rss+xml" />
	<link>http://planetmysql.ru</link>
	<description>A blog about MySQL, the most popular DBMS</description>
	<lastBuildDate>Thu, 24 May 2012 17:22:10 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
		<item>
		<title>Scale differences between OLTP and Analytics</title>
		<link>http://database-scalability.blogspot.com/2012/05/scale-differences-between-oltp-and.html?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=scale-differences-between-oltp-and-analytics</link>
		<comments>http://database-scalability.blogspot.com/2012/05/scale-differences-between-oltp-and.html#comments</comments>
		<pubDate>Tue, 15 May 2012 04:08:46 +0000</pubDate>
		<dc:creator>Doron Levari</dc:creator>
				<category><![CDATA[analytics]]></category>
		<category><![CDATA[data warehouse]]></category>
		<category><![CDATA[database scalability]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[OLTP]]></category>
		<category><![CDATA[parallelism]]></category>
		<category><![CDATA[scale-out]]></category>

		<guid isPermaLink="false">http://planetmysql.ru/?guid=333f90bde8698e50db16223084afd70c</guid>
		<description><![CDATA[In my previous post, http://database-scalability.blogspot.com/2012/05/oltp-vs-analytics.html, I reviewed the differences between OLTP and Analytics databases. The scale challenges also differ between those two worlds. In the Analytics world, the scale challenge is the growing amount of data. Most solutions rely on three main techniques: columnar storage, RAM and parallelism. Columnar storage makes scans and data filtering more precise and focused. After that, it all comes down to I/O: the faster the I/O, the sooner the query finishes and returns results. Faster disks and SSDs can help, but above all: RAM! Specialized Analytics databases (such as Oracle Exadata and Netezza) have TBs of RAM. Then, to return query results, data needs to be scanned and filtered, which is a great fit for parallelism. A big data range is divided into many smaller ranges, each handed to a worker thread; because the workers run in parallel, the entire scan finishes in a fraction of the time. In OLTP, the scale challenges are growing transaction concurrency and throughput and… growing amounts of data. Again? Didn't we just say growing data is the problem of Analytics? Well, today’s OLTP apps are required to hold more data so they can offer online functionality over a longer time span. In the last couple of years OLTP data archiving has changed dramatically. OLTP data now covers years, not just days or weeks. Facebook recently launched its “timeline” feature (http://www.facebook.com/about/timeline); can you imagine your timeline ending after 1 week? Facebook’s database, probably the world’s largest OLTP database, holds years of data for a billion users. Today all data is required anywhere, anytime, right here, right now, online. Many of today’s OLTP databases go well beyond the 1TB line. And what about transaction concurrency and throughput? Applications today are bombarded by millions of users shooting transactions from browsers, smartphones, tablets… I personally checked my bank account 3 times today. Why? Because I can… What can be done to solve OLTP scale challenges? In my next post, let's start answering this question by understanding why the solutions proposed for Analytics are limited in OLTP, and start reviewing relevant approaches. Stay tuned, subscribe, get involved!]]></description>
			<content:encoded><![CDATA[<br />In my previous post, <a href="http://database-scalability.blogspot.com/2012/05/oltp-vs-analytics.html">http://database-scalability.blogspot.com/2012/05/oltp-vs-analytics.html</a>, I reviewed the differences between OLTP and Analytics databases.<br /><br />The scale challenges also differ between those two worlds.<br /><br /><div><a href="http://1.bp.blogspot.com/-g3T73XdTnZo/T7HM6HI3muI/AAAAAAAAFpY/lZJkxgMQsnM/s1600/OLTP-DW.png" imageanchor="1"><img border="0" height="315" src="http://1.bp.blogspot.com/-g3T73XdTnZo/T7HM6HI3muI/AAAAAAAAFpY/lZJkxgMQsnM/s320/OLTP-DW.png" width="320" /></a></div><br /><br />In the Analytics world, the scale challenge is the growing amount of data. Most solutions rely on three main techniques: <b><u>columnar storage, RAM and parallelism</u></b>.<br />Columnar storage makes scans and data filtering more precise and focused. After that, it all comes down to I/O: the faster the I/O, the sooner the query finishes and returns results. Faster disks and SSDs can help, but above all: RAM! Specialized Analytics databases (such as Oracle Exadata and Netezza) have TBs of RAM. Then, to return query results, data needs to be scanned and filtered, which is a great fit for parallelism. A big data range is divided into many smaller ranges, each handed to a worker thread; because the workers run in parallel, the entire scan finishes in a fraction of the time.<br /><br />In OLTP, the scale challenges are growing transaction concurrency and throughput and… growing amounts of data. Again? Didn't we just say growing data is the problem of Analytics? Well, today’s OLTP apps are required to hold more data so they can offer online functionality over a longer time span. In the last couple of years OLTP data archiving has changed dramatically. OLTP data now covers years, not just days or weeks. Facebook recently launched its “timeline” feature (<a href="http://www.facebook.com/about/timeline">http://www.facebook.com/about/timeline</a>); can you imagine your timeline ending after 1 week? Facebook’s database, probably the world’s largest OLTP database, holds years of data for a billion users. Today all data is required anywhere, anytime, right here, right now, online. Many of today’s OLTP databases go well beyond the 1TB line. And what about transaction concurrency and throughput? Applications today are bombarded by millions of users shooting transactions from browsers, smartphones, tablets… I personally checked my bank account 3 times today. Why? Because I can…<br /><br />What can be done to solve OLTP scale challenges?<br /><br />In my next post, let's start answering this question by understanding why the solutions proposed for Analytics are limited in OLTP, and start reviewing relevant approaches.<br /><br />Stay tuned, subscribe, get involved!]]></content:encoded>
			<wfw:commentRss>http://planetmysql.ru/2012/05/15/scale-differences-between-oltp-and-analytics/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Can’t Travel to Collaborate 12?  Plug-in Virtually Instead! (revised schedule)</title>
		<link>http://blogs.ioug.org/2012/04/16/cant-travel-to-collaborate-12-plug-in-virtually-instead-revised-schedule/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=cant-travel-to-collaborate-12-plug-in-virtually-instead-revised-schedule</link>
		<comments>http://blogs.ioug.org/2012/04/16/cant-travel-to-collaborate-12-plug-in-virtually-instead-revised-schedule/#comments</comments>
		<pubDate>Mon, 16 Apr 2012 20:00:30 +0000</pubDate>
		<dc:creator>IOUG Blogs</dc:creator>
				<category><![CDATA[cloud]]></category>
		<category><![CDATA[COLLABORATE]]></category>
		<category><![CDATA[COLLABORATE 12]]></category>
		<category><![CDATA[data warehouse]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[developer]]></category>
		<category><![CDATA[discount]]></category>
		<category><![CDATA[education]]></category>
		<category><![CDATA[exadata]]></category>
		<category><![CDATA[internet]]></category>
		<category><![CDATA[IOUG]]></category>
		<category><![CDATA[IOUG General]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[OEM]]></category>
		<category><![CDATA[online]]></category>
		<category><![CDATA[oracle]]></category>
		<category><![CDATA[training]]></category>
		<category><![CDATA[virtualization]]></category>

		<guid isPermaLink="false">http://blogs.ioug.org/?p=347</guid>
		<description><![CDATA[  Plug-in to Vegas The program focuses on key topics such as high availability, virtualization, security, business intelligence, Exadata, Cloud Computing and internals.  Recently added, we switched around the schedule to include the Thursday Deep Dive, Avoiding Downtime through the Maximum &#8230; Continue reading &#8594;]]></description>
			<content:encoded><![CDATA[  Plug-in to Vegas The program focuses on key topics such as high availability, virtualization, security, business intelligence, Exadata, Cloud Computing and internals.  Recently added, we switched around the schedule to include the Thursday Deep Dive, Avoiding Downtime through the Maximum &#8230; <a href="http://blogs.ioug.org/2012/04/16/cant-travel-to-collaborate-12-plug-in-virtually-instead-revised-schedule/">Continue reading <span>&#8594;</span></a>]]></content:encoded>
			<wfw:commentRss>http://planetmysql.ru/2012/04/17/cant-travel-to-collaborate-12-plug-in-virtually-instead-revised-schedule/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How analysing your binlogs can be quite informative</title>
		<link>http://blog.wl0.org/2010/10/how-analysing-your-binlogs-can-be-quite-informative/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=how-analysing-your-binlogs-can-be-quite-informative</link>
		<comments>http://blog.wl0.org/2010/10/how-analysing-your-binlogs-can-be-quite-informative/#comments</comments>
		<pubDate>Sat, 23 Oct 2010 18:11:34 +0000</pubDate>
		<dc:creator>Simon Mudd</dc:creator>
				<category><![CDATA[binlog]]></category>
		<category><![CDATA[data warehouse]]></category>
		<category><![CDATA[Databases]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[mysqlbinlog]]></category>
		<category><![CDATA[Replication]]></category>

		<guid isPermaLink="false">http://blog.wl0.org/?p=357</guid>
		<description><![CDATA[If you have used MySQL for some time you know that mysqld can write binlogs. This is usually used for backup purposes and JITR or for replication purposes so a slave can collect the changes made on the master and apply them locally. Most of the time apart from configuring how long you keep these [...]]]></description>
			<content:encoded><![CDATA[If you have used MySQL for some time you know that mysqld can write binlogs. This is usually used for backup purposes and JITR or for replication purposes so a slave can collect the changes made on the master and apply them locally. Most of the time apart from configuring how long you keep these [...]]]></content:encoded>
			<wfw:commentRss>http://planetmysql.ru/2010/10/23/how-analysing-your-binlogs-can-be-quite-informative/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Why Kickfire is a fail in MySQL Data warehouse</title>
		<link>http://venublog.com/2010/08/04/why-kickfire-is-a-fail-in-mysql-data-warehouse/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=why-kickfire-is-a-fail-in-mysql-data-warehouse</link>
		<comments>http://venublog.com/2010/08/04/why-kickfire-is-a-fail-in-mysql-data-warehouse/#comments</comments>
		<pubDate>Wed, 04 Aug 2010 23:28:08 +0000</pubDate>
		<dc:creator>Venu Anuganti</dc:creator>
				<category><![CDATA[data warehouse]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[Kickfire]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[MySQL Data warehouse]]></category>
		<category><![CDATA[NoSQL]]></category>
		<category><![CDATA[OLAP appliance]]></category>
		<category><![CDATA[scalability]]></category>

		<guid isPermaLink="false">http://venublog.com/?p=684</guid>
		<description><![CDATA[Even though data warehousing has been picking up very rapidly in the last year or so, a few companies that had already made their mark at the right time could not take market share that easily, for a number of reasons. Even though I am not a marketing guy, some of the [...]]]></description>
			<content:encoded><![CDATA[Even though data warehousing has been picking up very rapidly in the last year or so, a few companies that had already made their mark at the right time could not take market share that easily, for a number of reasons. Even though I am not a marketing guy, some of the [...]]]></content:encoded>
			<wfw:commentRss>http://planetmysql.ru/2010/08/05/why-kickfire-is-a-fail-in-mysql-data-warehouse/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Data Warehousing Best Practices:  Comparing Oracle to MySQL, part 2 (partitioning)</title>
		<link>http://www.pythian.com/news/15167/data-warehousing-best-practices-comparing-oracle-to-mysql-part-2-partitioning/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=data-warehousing-best-practices-comparing-oracle-to-mysql-part-2-partitioning</link>
		<comments>http://www.pythian.com/news/15167/data-warehousing-best-practices-comparing-oracle-to-mysql-part-2-partitioning/#comments</comments>
		<pubDate>Thu, 29 Jul 2010 21:00:54 +0000</pubDate>
		<dc:creator>Sheeri K. Cabral</dc:creator>
				<category><![CDATA[conferences]]></category>
		<category><![CDATA[data warehouse]]></category>
		<category><![CDATA[data warehousing]]></category>
		<category><![CDATA[dw]]></category>
		<category><![CDATA[hash]]></category>
		<category><![CDATA[Kaleidoscope]]></category>
		<category><![CDATA[kscope]]></category>
		<category><![CDATA[linear hash partitioning]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[odtug]]></category>
		<category><![CDATA[oracle]]></category>
		<category><![CDATA[partition]]></category>
		<category><![CDATA[partitioning]]></category>
		<category><![CDATA[Pythian]]></category>
		<category><![CDATA[range]]></category>
		<category><![CDATA[subpartition]]></category>
		<category><![CDATA[Technical Blog]]></category>

		<guid isPermaLink="false">http://www.pythian.com/news/?p=15167</guid>
		<description><![CDATA[At Kscope this year, I attended a half day in-depth session entitled Data Warehousing Performance Best Practices, given by Maria Colgan of Oracle.  My impression, which was confirmed by folks in the Oracle world, is that she knows her way around the Oracle optimizer.
See part 1 for the introduction and talking about power and hardware.  This part will go over the 2nd &#8220;P&#8221;, partitioning.  Learning about Oracle&#8217;s partitioning has gotten me more interested in how MySQL&#8217;s partitioning works, and I do hope that MySQL partitioning will develop to the level that Oracle partitioning does, because Oracle&#8217;s partitioning looks very nice (then again, that&#8217;s why it costs so much I guess).

Partition &#8211; Larger tables or fact tables can benefit from partitioning because it makes data loading easier, can improve join performance and enables data elimination.  Partitioning also lends itself to parallel execution, thanks to partition pruning.  The degree of parallelism should be a power of 2, because of the hash-based algorithm used in hash partitioning.  To translate this to the MySQL world, if you are using LINEAR HASH partitioning, then you should use a degree of parallelism that is a power of 2 (I checked, and this is indeed the case).  Otherwise, use a degree of parallelism that makes sense given the number of partitions you have.
One important note: during Pythian&#8217;s testing of MySQL partitioning, we found that all partitions are locked when an INSERT occurs, for the duration of the INSERT.  As a result, bulk-loading with MySQL partitioning is not as fast as it would be if MySQL allowed partition pruning for INSERTs.
So, what should be partitioned?  For the first level of partitioning, the goal is to enable partition pruning and simplify data management.  The most typical partitioning is range or interval partitioning on a date column.  With interval partitioning, you specify the partition interval (for example by day or by month) and the partitions are created automatically.  MySQL does not have interval partitioning, and the typical first-level partitioning I have seen is range or list partitioning based on a date or timestamp column.  Note that if you use a timestamp field, the partitioning expression is optimized if you use TO_DAYS(timestamp_field) or YEAR(timestamp_field).  In my experience, using anything else (such as DATE(timestamp_field)) actually makes partitioning slower than not using partitioning at all.  Note that this is based on tests I did a few months ago, and your mileage may vary.
So &#8212; how do you decide partitioning strategy?  Ask yourself:
What range of data do the queries touch &#8211; a quarter, a year?
What is the data loading frequency?
Is an incremental load required?
How much data is involved, a day, a week, a month?
The answers to the above questions will tell you how big your interval needs to be.  The best scenario is that all answers are the same: &#8220;we load every day, and people query by day.&#8221;  If the answers differ, give access a higher priority than loading, because most people care more about query performance than about ETL performance.
This is true even if your intervals have different sizes (i.e. sales per day are much bigger in December, but that&#8217;s OK).  However, Maria recommends that the subpartitions be as evenly divided as possible.
It is easier to look at more partitions than to look at one partition that&#8217;s too big.  But you don&#8217;t want too many partitions either: the maximum Oracle allows is 1 million partitions (prior to 11g it was 64,000).  &#8220;Stick closer to 64,000 than 1 million&#8221;.  MySQL&#8217;s limit is 1024 partitions per table.
For the second level of partitioning, also called subpartitioning, the goal is to allow for multi-level pruning and improve join performance.  In Oracle, the most typical subpartitioning is by hash or list; in MySQL, you can only subpartition by hash or key.
How do you decide subpartitioning strategy?
Select the dimension queried most frequently on the fact table OR
Pick the common join column
For example, if you want to look at sales per day, per store, you would choose &#8220;per day&#8221; as the partition and &#8220;per store&#8221; as the subpartition.
If you do not have a good partition on logical elements (like grouping), then you can subpartition using hash partitioning on common join columns: perhaps surrogate keys, or the join key of the largest table involved in the join.
For example, if the sales table is partitioned and another big table is product, you can hash subpartition on product_id.
Because there&#8217;s overhead in partitions (loading metadata, reading metadata), make sure the size of each partition and subpartition is larger than 20 MB; it is better to have a 30 MB subpartition than a 15 MB one.  [I have no idea if this is true in MySQL or not -- I think the general concept is true, because there is some overhead, but I have no idea about the 20 MB figure and why that's true for Oracle, nor do I know what is true in MySQL.]
One easy calculation is to double the # of CPUs and round up to the nearest power of 2.  If you&#8217;re executing in parallel, Oracle will use 2x CPUs.  (All this advice, by the way, follows the 80/20 rule: it is probably good for about 80% of the environments out there.)  Of course, MySQL does not do parallel execution very well, so this probably does not apply.
Oracle knows it can get partition elimination while it does a join.
If 2 tables have the same degree of parallelism (same # of buckets) and are partitioned in the same way on the join column (say, customer_id in a subpartition of sales and a partition of customer), Oracle will match the partitions when joining:
For example, a join of the sales table with the customer table can be broken into 4 smaller joins:
sales sub part 1 joins with customer part 1
sales sub part 2 joins with customer part 2
sales sub part 3 joins with customer part 3
sales sub part 4 joins with customer part 4
And with parallelism, the total time is now reduced to the time it takes to do one of those smaller joins.
This is also why you want to have a power of 2 for buckets &#8211; because cores/processors come in powers of 2.  Partition-wise joins like this can also be done with range or list, assuming both tables in the join have the same buckets.
I have no idea if MySQL partitioning works this way, but it&#8217;s certainly a functionality that makes sense to me.]]></description>
			<content:encoded><![CDATA[<p>At <a href="http://www.odtugkaleidoscope.com/agenda.html">Kscope</a> this year, I attended a half day in-depth session entitled <a href="http://www.odtugkaleidoscope.com/oraclebusinessintelligence.html#colgan">Data Warehousing Performance Best Practices</a>, given by <a href="http://blogs.oracle.com/optimizer/">Maria Colgan</a> of Oracle.  My impression, which was confirmed by folks in the Oracle world, is that she knows her way around the Oracle optimizer.</p>
<p>See <a href="http://www.pythian.com/news/15157/data-warehousing-best-practices-comparing-oracle-to-mysql-part-1-introduction-and-power/">part 1</a> for the introduction and talking about power and hardware.  This part will go over the 2nd &#8220;P&#8221;, partitioning.  Learning about Oracle&#8217;s partitioning has gotten me more interested in how MySQL&#8217;s partitioning works, and I do hope that MySQL partitioning will develop to the level that Oracle partitioning does, because Oracle&#8217;s partitioning looks very nice (then again, that&#8217;s why it costs so much I guess).<br />
<span></span></p>
<p><strong>Partition</strong> &#8211; Larger tables or fact tables can benefit from partitioning because it makes data loading easier, can improve join performance and enables data elimination.  Partitioning also lends itself to parallel execution, thanks to partition pruning.  The degree of parallelism should be a power of 2, because of the hash-based algorithm used in hash partitioning.  To translate this to the MySQL world, if you are using LINEAR HASH partitioning, then you should use a degree of parallelism that is a power of 2 (I checked, and this is indeed the case).  Otherwise, use a degree of parallelism that makes sense given the number of partitions you have.</p>
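<p>As a minimal sketch of the LINEAR HASH case in MySQL (the <code>customer</code> table and its columns are hypothetical, chosen only for illustration), a power-of-2 partition count might look like this:</p>
<pre><code>-- Hypothetical table, for illustration only.
-- 8 = 2^3 partitions: with a power-of-2 count, MySQL's linear hashing
-- can map the hashed value to a partition with a single bit mask.
CREATE TABLE customer (
  customer_id INT NOT NULL,
  name        VARCHAR(100) NOT NULL
)
PARTITION BY LINEAR HASH (customer_id)
PARTITIONS 8;
</code></pre>
<p>The count of 8 is arbitrary; the only point is that it is a power of 2.</p>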
<p>One important note: during Pythian&#8217;s testing of MySQL partitioning, we found that <strong>all</strong> partitions are locked when an INSERT occurs, for the duration of the INSERT.  As a result, bulk-loading with MySQL partitioning is not as fast as it would be if MySQL allowed partition pruning for INSERTs.</p>
<p>So, what should be partitioned?  For the first level of partitioning, the goal is to enable partition pruning and simplify data management.  The most typical partitioning is range or interval partitioning on a date column.  With interval partitioning, you specify the partition interval (for example by day or by month) and the partitions are created automatically.  MySQL does not have interval partitioning, and the typical first-level partitioning I have seen is range or list partitioning based on a date or timestamp column.  Note that if you use a timestamp field, the partitioning expression is optimized if you use <code>TO_DAYS(timestamp_field)</code> or <code>YEAR(timestamp_field)</code>.  In my experience, using anything else (such as <code>DATE(timestamp_field)</code>) actually makes partitioning slower than not using partitioning at all.  Note that this is based on tests I did a few months ago, and your mileage may vary.</p>
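<p>A rough sketch of range partitioning on a date expression in MySQL follows.  The <code>sales</code> table, its columns and the partition names are hypothetical, and the column is assumed to be a <code>DATETIME</code>, because MySQL restricts which functions may be applied to <code>TIMESTAMP</code> columns in a partitioning expression:</p>
<pre><code>-- Hypothetical fact table, for illustration only.
CREATE TABLE sales (
  sale_id  BIGINT NOT NULL,
  store_id INT NOT NULL,
  sold_at  DATETIME NOT NULL,
  amount   DECIMAL(10,2) NOT NULL
)
PARTITION BY RANGE (TO_DAYS(sold_at)) (
  PARTITION p2010_06 VALUES LESS THAN (TO_DAYS('2010-07-01')),
  PARTITION p2010_07 VALUES LESS THAN (TO_DAYS('2010-08-01')),
  PARTITION pmax     VALUES LESS THAN MAXVALUE
);

-- Check which partitions a date-range query touches.
-- EXPLAIN PARTITIONS is the MySQL 5.1-era syntax; newer versions
-- report partitions in plain EXPLAIN output instead.
EXPLAIN PARTITIONS
SELECT SUM(amount)
FROM sales
WHERE sold_at &gt;= '2010-07-01' AND sold_at &lt; '2010-08-01';
</code></pre>
<p>If pruning works as hoped, the <code>partitions</code> column of the EXPLAIN output should list only the partitions that can contain July rows, not every partition in the table.</p>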
<p>So &#8212; how do you decide partitioning strategy?  Ask yourself:<br />
<UL><LI>What range of data do the queries touch &#8211; a quarter, a year?<br />
</LI><LI>What is the data loading frequency?<br />
</LI><LI>Is an incremental load required?<br />
</LI><LI>How much data is involved, a day, a week, a month?</LI></UL></p>
<p>The answers to the above questions will tell you how big your interval needs to be.  The best scenario is that all the answers are the same: &#8220;we load every day, and people query by day.&#8221;  If the answers differ, give access a higher priority than loading, because most people care more about query performance than about ETL performance.</p>
<p>This is true even if your intervals have different sizes; for example, sales per day are much bigger in December, but that&#8217;s OK.  However, Maria recommends that the subpartitions be as evenly divided as possible.</p>
<p>It is easier to look at more partitions than to look at one partition that&#8217;s too big.  But you don&#8217;t want too many partitions: the maximum Oracle allows is 1 million partitions (prior to 11g it was 64,000).  &#8220;Stick closer to 64,000 than 1 million&#8221;.  MySQL&#8217;s limit is 1024 per table.</p>
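<p>The 1024 figure is easy to hit with daily partitions, so a quick back-of-the-envelope check helps.  This Python sketch is only an illustration: the helper names are hypothetical, and it assumes that subpartitions count toward the per-table limit.</p>

```python
from datetime import date

MYSQL_PARTITION_LIMIT = 1024  # per-table limit quoted in the text above


def count_partitions(start: date, end: date, interval: str) -> int:
    """Number of day or month partitions covering start..end inclusive."""
    if interval == "day":
        return (end - start).days + 1
    if interval == "month":
        return (end.year - start.year) * 12 + (end.month - start.month) + 1
    raise ValueError("interval must be 'day' or 'month'")


def fits_limit(partitions: int, subpartitions_per_partition: int = 1) -> bool:
    """True if the total stays within the limit.

    Assumption: subpartitions count toward the same per-table limit.
    """
    return partitions * subpartitions_per_partition <= MYSQL_PARTITION_LIMIT


if __name__ == "__main__":
    daily = count_partitions(date(2008, 1, 1), date(2010, 12, 31), "day")
    monthly = count_partitions(date(2008, 1, 1), date(2010, 12, 31), "month")
    print(daily, fits_limit(daily))
    print(monthly, fits_limit(monthly, subpartitions_per_partition=8))
```

<p>Three years of daily partitions already exceed the limit on their own, while monthly partitions leave plenty of room for hash subpartitions.</p>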
<p>For the second level of partitioning, also called subpartitioning, the goal is to allow for multi-level pruning and improve join performance.  In Oracle, the most typical subpartition is hash or list &#8211; in MySQL, you can only subpartition by hash or key.</p>
<p>How do you decide subpartitioning strategy?<br />
<UL><LI>Select the dimension queried most frequently on the fact table OR<br />
</LI><LI>Pick the common join column</LI></UL></p>
<p>For example, if you want to look at sales per day, per store, you would choose &#8220;per day&#8221; as the partition and &#8220;per store&#8221; as the subpartition.</p>
<p>If you do not have a good logical element to partition on (like a grouping), then you can subpartition using hash partitioning on common join columns, perhaps surrogate keys, or the join key of the largest table involved in the join.</p>
<p>For example, if the sales table is partitioned and another big table is product, you can hash subpartition on product_id.</p>
<p>Because there&#8217;s overhead in partitions (loading metadata, reading metadata), make sure the size of each partition and subpartition is more than 20 MB.  So it is better to have a 30 MB subpartition than a 15 MB subpartition.  [I have no idea if this is true in MySQL or not -- I think the general concept is true, because there is some overhead, but I have no idea about the 20 MB figure and why that's true for Oracle, nor do I know what is true in MySQL.]</p>
<p>One easy calculation is to double the # of CPUs and round up to the nearest power of 2.  If you&#8217;re executing in parallel, Oracle will use 2x CPUs.  (All this advice, by the way, follows the 80/20 rule: it is probably good for about 80% of the environments out there.)  Of course, MySQL does not do parallel execution very well, so this probably does not apply.</p>
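<p>That rule of thumb is simple enough to write down.  Here is a small Python sketch; the function names are mine (hypothetical), and I am reading the rule as a way to pick the number of hash buckets or the degree of parallelism.</p>

```python
def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n (for n >= 1)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1 << (n - 1).bit_length()


def suggested_buckets(cpu_count: int) -> int:
    """Double the CPU count, then round up to the nearest power of two."""
    return next_power_of_two(2 * cpu_count)


if __name__ == "__main__":
    for cpus in (4, 6, 12):
        print(cpus, suggested_buckets(cpus))
```

<p>A machine with 6 CPUs gives 12 after doubling, which rounds up to 16 buckets.</p>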
<p>Oracle knows it can get partition elimination while it does a join.</p>
<p>If 2 tables have the same degree of parallelism (same # of buckets) and are partitioned in the same way on the join column (say, customer_id in a subpartition of sales and a partition of customer), Oracle will match the partitions when joining:</p>
<p>sales table joined with customer table can change into 4 small joins:<br />
sales sub part 1 joins with customer part 1<br />
sales sub part 2 joins with customer part 2<br />
sales sub part 3 joins with customer part 3<br />
sales sub part 4 joins with customer part 4</p>
<p>And with parallelism, the total time is now reduced to the time it takes to do one of those smaller joins.</p>
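<p>To make the idea concrete, here is a toy Python sketch of a partition-wise join.  It is not how Oracle implements it: the modulo hash, the 4 buckets, and the tuple layouts are all assumptions for illustration, and Python threads only show the structure of the work rather than real CPU parallelism.</p>

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

BUCKETS = 4  # same number of hash buckets on both tables


def bucket_of(customer_id: int) -> int:
    """Toy hash function; both tables must use the same one."""
    return customer_id % BUCKETS


def partition_rows(rows, key_index):
    """Split rows into buckets by hashing the join column."""
    parts = defaultdict(list)
    for row in rows:
        bucket = bucket_of(row[key_index])
        parts[bucket].append(row)
    return parts


def join_one_bucket(sales_part, customer_part):
    """Join one sales bucket with the matching customer bucket only."""
    names = {cust_id: name for cust_id, name in customer_part}
    return [(sale_id, cust_id, names[cust_id])
            for sale_id, cust_id in sales_part if cust_id in names]


def partition_wise_join(sales, customers):
    """Run one small join per bucket pair and merge the results."""
    sales_parts = partition_rows(sales, key_index=1)
    customer_parts = partition_rows(customers, key_index=0)
    with ThreadPoolExecutor(max_workers=BUCKETS) as pool:
        futures = [
            pool.submit(join_one_bucket, sales_parts[b], customer_parts[b])
            for b in range(BUCKETS)
        ]
        return sorted(row for f in futures for row in f.result())


if __name__ == "__main__":
    sales = [(1, 10), (2, 11), (3, 12), (4, 13), (5, 10)]
    customers = [(10, "Ann"), (11, "Bob"), (12, "Cy"), (13, "Di")]
    print(partition_wise_join(sales, customers))
```

<p>Because both tables are bucketed by the same function on the join column, a row in sales bucket 1 can only match rows in customer bucket 1, so no bucket ever needs to look at another.</p>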
<p>This is also why you want to have a power of 2 for buckets &#8211; because cores/processors come in powers of 2.  Partition-wise joins like this can also be done with range or list, assuming both tables in the join have the same buckets.</p>
<p>I have no idea if MySQL partitioning works this way, but it&#8217;s certainly a functionality that makes sense to me.</p>]]></content:encoded>
			<wfw:commentRss>http://planetmysql.ru/2010/07/30/data-warehousing-best-practices-comparing-oracle-to-mysql-part-2-partitioning/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Data Warehousing Best Practices:  Comparing Oracle to MySQL, part 1 (introduction and power)</title>
		<link>http://www.pythian.com/news/15157/data-warehousing-best-practices-comparing-oracle-to-mysql-part-1-introduction-and-power/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=data-warehousing-best-practices-comparing-oracle-to-mysql-part-1-introduction-and-power</link>
		<comments>http://www.pythian.com/news/15157/data-warehousing-best-practices-comparing-oracle-to-mysql-part-1-introduction-and-power/#comments</comments>
		<pubDate>Thu, 29 Jul 2010 20:53:33 +0000</pubDate>
		<dc:creator>Sheeri K. Cabral</dc:creator>
				<category><![CDATA[3nf]]></category>
		<category><![CDATA[conferences]]></category>
		<category><![CDATA[data warehouse]]></category>
		<category><![CDATA[data warehousing]]></category>
		<category><![CDATA[disk array]]></category>
		<category><![CDATA[disk speed]]></category>
		<category><![CDATA[dw]]></category>
		<category><![CDATA[HBA]]></category>
		<category><![CDATA[Kaleidoscope]]></category>
		<category><![CDATA[kscope]]></category>
		<category><![CDATA[LUN]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[normalization]]></category>
		<category><![CDATA[normalize]]></category>
		<category><![CDATA[odtug]]></category>
		<category><![CDATA[oracle]]></category>
		<category><![CDATA[orion]]></category>
		<category><![CDATA[parallelism]]></category>
		<category><![CDATA[Pythian]]></category>
		<category><![CDATA[SAN]]></category>
		<category><![CDATA[schema]]></category>
		<category><![CDATA[snowflake schema]]></category>
		<category><![CDATA[star schema]]></category>
		<category><![CDATA[Technical Blog]]></category>
		<category><![CDATA[throughput]]></category>

		<guid isPermaLink="false">http://www.pythian.com/news/?p=15157</guid>
		<description><![CDATA[At Kscope this year, I attended a half day in-depth session entitled Data Warehousing Performance Best Practices, given by Maria Colgan of Oracle.  My impression, which was confirmed by folks in the Oracle world, is that she knows her way around the Oracle optimizer.
These are my notes from the session, which include comparisons of how Oracle works (which Maria gave) and how MySQL works (which I researched to figure out the difference, which is why this blog post took a month after the conference to write).  Note that I am not an expert on data warehousing in either Oracle or MySQL, so these are more concepts to think about than hard-and-fast advice.  In some places, I still have questions, and I am happy to have folks comment and contribute what they know.

One interesting point brought up:
Maria quoted someone (she said the name but I did not grab it) from Forrester saying, &#8220;3NF is typically a selfless model used by Enterprise data warehouse, which is used by the whole company.  A star schema is a selfish model, used by a department, because it&#8217;s already got aggregation in it.&#8221;
I thought that was an interesting way of pointing that out &#8212; most people do not understand why 3NF is not good enough for data warehousing, and I have had a hard time explaining why a star or snowflake schema should be used.  Another schema-related topic I had a hard time putting into words before this workshop was the difference between a star and a snowflake schema:  compared to a star schema, a snowflake schema normalizes its dimension tables into additional related tables, some of which may not be used often.
From Maria and the slides:
&#8220;Oracle says model what will suit your business best.  Don&#8217;t get lost in academia.  Most schemas are not 100% according to the theoretical
models.  Some examples: 3NF schema with denormalized attributes to avoid costly joins, Star schema with multiple hierarchies in same fact table.&#8221;
Data warehousing has a 3-step approach &#8212; 
1) data sources -&#62; staging layer (temp loading layer)
2) staging layer (temp loading layer)-&#62; foundation (logical, data store) layer
3) foundation (logical, data store) layer -&#62; access and performance layer
The foundation layer is usually 3NF; the access layer is usually a star or snowflake schema.  The data sources can vary: you would hope that they are in 3NF (and if they are you can skip the first 2 steps), but they are not always that way.
The 3 P&#8217;s of best practice for data warehousing (on Oracle) are power, partitioning, parallelism.  The goal of the data warehousing environment is to minimize the amount of data accessed and use the most efficient joins &#8211; so it is not so index-focused.  This may be based on Oracle&#8217;s way of doing joins; I am not sure whether it applies to MySQL as well.
Power &#8211; The weakest link in the chain (the 3 steps above) will define the throughput, so make sure your hardware configuration is balanced.  Maria mentioned that as DBAs, &#8220;most of the time we don&#8217;t have control over this, but we&#8217;re still bound to the SLAs.&#8221;
This includes hardware that immediately comes to mind, such as # of CPUs/cores, speed of CPU, amount of RAM, and speed of disk, as well as what we may not think of immediately:  speed of network switches, speed of disk controllers, and number and speed of host bus adapters.  Notes on host bus adapters (HBAs):  Know the # of HBA ports you have.  A 4 Gb HBA does 400 MB/sec.  A 2 Gb HBA does 200 MB/sec.  Make sure there&#8217;s enough HBA capacity to sustain the CPU throughput (i.e., make sure the HBA isn&#8217;t the bottleneck).  Also consider the speed at which everything talks to everything else.  If you have a 4 Gb machine but a 2 Gb switch, you end up having 2 Gb throughput.  Upgrade the network at the same time you upgrade machines.
Because we are talking about data warehousing, it is often not possible to eliminate disk I/O, so the goal is to have the fastest I/O throughput possible.  Data warehouses need to be sized on I/O throughput, not number of I/Os.
I made a post earlier about how to determine I/O throughput for a system, which used information from this session.  Justin Swanhart already pointed out that this is based on the fact that Oracle can do hash joins and MySQL can only do nested loop joins.  I wonder, though, if there is indeed no case when using MySQL for which I/O throughput is a more useful metric than iops.
Disk arrays that are expensive are usually sized for iops, not throughput, and because they&#8217;re expensive the disk array is shared throughout the company.  A DBA needs to ask &#8216;how many connections into the storage array do I have?  How many disk controllers do I have?  Where are my physical disks, and which controllers are they hanging off of?&#8217;
A typical 15k rpm disk can do about 25-35 MB/sec (per disk) of random I/O.  Disk manufacturers will throw out numbers like 200-300 MB/sec, but that&#8217;s sequential I/O on the leading edge of the drive.  Make sure all your LUNs are not coming off the same set of disks, so that you&#8217;re not conflicting on disk seeks.
Continue to part 2, partitioning.]]></description>
			<content:encoded><![CDATA[<p>At <a href="http://www.odtugkaleidoscope.com/agenda.html">Kscope</a> this year, I attended a half day in-depth session entitled <a href="http://www.odtugkaleidoscope.com/oraclebusinessintelligence.html#colgan">Data Warehousing Performance Best Practices</a>, given by <a href="http://blogs.oracle.com/optimizer/">Maria Colgan</a> of Oracle.  My impression, which was confirmed by folks in the Oracle world, is that she knows her way around the Oracle optimizer.</p>
<p>These are my notes from the session, which include comparisons of how Oracle works (which Maria gave) and how MySQL works (which I researched to figure out the difference, which is why this blog post took a month after the conference to write).  Note that I am not an expert on data warehousing in either Oracle or MySQL, so these are more concepts to think about than hard-and-fast advice.  In some places, I still have questions, and I am happy to have folks comment and contribute what they know.<br />
</p>
<p>One interesting point brought up:<br />
Maria quoted someone (she said the name but I did not grab it) from Forrester saying, &#8220;3NF is typically a selfless model used by Enterprise data warehouse, which is used by the whole company.  A star schema is a selfish model, used by a department, because it&#8217;s already got aggregation in it.&#8221;</p>
<p>I thought that was an interesting way of pointing that out &#8212; most people do not understand why 3NF is not good enough for data warehousing, and I have had a hard time explaining why a star or snowflake schema should be used.  Another schema-related topic I had a hard time putting into words before this workshop was the difference between a star and a snowflake schema:  compared to a star schema, a snowflake schema normalizes its dimension tables into additional related tables, some of which may not be used often.</p>
<p>From Maria and the slides:<br />
&#8220;Oracle says model what will suit your business best.  Don&#8217;t get lost in academia.  Most schemas are not 100% according to the theoretical<br />
models.  Some examples: 3NF schema with denormalized attributes to avoid costly joins, Star schema with multiple hierarchies in same fact table.&#8221;</p>
<p>Data warehousing has a 3-step approach &#8212; </p>
<p>1) data sources -> staging layer (temp loading layer)<br />
2) staging layer (temp loading layer)-> foundation (logical, data store) layer<br />
3) foundation (logical, data store) layer -> access and performance layer</p>
<p>The foundation layer is usually 3NF; the access layer is usually a star or snowflake schema.  The data sources can vary: you would hope that they are in 3NF (and if they are you can skip the first 2 steps), but they are not always that way.</p>
<p>The 3 P&#8217;s of best practice for data warehousing (on Oracle) are power, partitioning, parallelism.  The goal of the data warehousing environment is to minimize the amount of data accessed and use the most efficient joins &#8211; so it is not so index-focused.  This may be based on Oracle&#8217;s way of doing joins; I am not sure whether it applies to MySQL as well.</p>
<p><strong>Power</strong> &#8211; The weakest link in the chain (the 3 steps above) will define the throughput, so make sure your hardware configuration is balanced.  Maria mentioned that as DBAs, &#8220;most of the time we don&#8217;t have control over this, but we&#8217;re still bound to the SLAs.&#8221;</p>
<p>This includes hardware that immediately comes to mind, such as # of CPUs/cores, speed of CPU, amount of RAM, and speed of disk, as well as what we may not think of immediately:  speed of network switches, speed of disk controllers, and number and speed of host bus adapters.  Notes on host bus adapters (HBAs):  Know the # of HBA ports you have.  A 4 Gb HBA does 400 MB/sec.  A 2 Gb HBA does 200 MB/sec.  Make sure there&#8217;s enough HBA capacity to sustain the CPU throughput (i.e., make sure the HBA isn&#8217;t the bottleneck).  Also consider the speed at which everything talks to everything else.  If you have a 4 Gb machine but a 2 Gb switch, you end up having 2 Gb throughput.  Upgrade the network at the same time you upgrade machines.</p>
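<p>The &#8220;weakest link&#8221; idea can be sketched in a few lines of Python.  The component names and sample figures below are made up for illustration, and the HBA helper just applies the session&#8217;s rule of thumb of roughly 100 megabytes per second for each Gb of port speed.</p>

```python
def hba_capacity_mb_s(ports: int, gbit_per_port: int) -> int:
    """Rule of thumb from the session: a 4 Gb port carries about 400 MB/sec."""
    return ports * gbit_per_port * 100


def effective_throughput_mb_s(component_mb_s: dict) -> tuple:
    """Return (bottleneck name, throughput) for a chain of components."""
    name = min(component_mb_s, key=component_mb_s.get)
    return name, component_mb_s[name]


if __name__ == "__main__":
    # Hypothetical configuration: every number here is an assumption.
    chain = {
        "hba": hba_capacity_mb_s(ports=2, gbit_per_port=4),
        "switch": 200,
        "disks": 600,
    }
    print(effective_throughput_mb_s(chain))
```

<p>In this made-up configuration the two 4 Gb HBA ports could carry 800 MB/sec, but the switch caps the whole chain at 200 MB/sec.</p>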
<p>Because we are talking about data warehousing, it is often not possible to eliminate disk I/O, so the goal is to have the fastest I/O throughput possible.  Data warehouses need to be sized on <strong>I/O throughput</strong>, not number of I/Os.</p>
<p>I made a post earlier about <a href="http://www.pythian.com/news/15161/determining-io-throughput-for-a-system/">how to determine I/O throughput for a system</a>, which used information from this session.  <a href="http://swanhart.livejournal.com/">Justin Swanhart</a> already pointed out that this is based on the fact that Oracle can do hash joins and MySQL can only do nested loop joins.  I wonder, though, if there is indeed no case when using MySQL for which I/O throughput is a more useful metric than iops.</p>
<p>Disk arrays that are expensive are usually sized for iops, not throughput, and because they&#8217;re expensive the disk array is shared throughout the company.  A DBA needs to ask &#8216;how many connections into the storage array do I have?  How many disk controllers do I have?  Where are my physical disks, and which controllers are they hanging off of?&#8217;</p>
<p>A typical 15k rpm disk can do about 25-35 MB/sec (per disk) of random I/O.  Disk manufacturers will throw out numbers like 200-300 MB/sec, but that&#8217;s sequential I/O on the leading edge of the drive.  Make sure all your LUNs are not coming off the same set of disks, so that you&#8217;re not conflicting on disk seeks.</p>
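<p>To show how much the per-disk figure matters when sizing for throughput, here is a tiny Python sketch.  The 800 MB/sec target is an arbitrary example and the function name is hypothetical; the per-disk rates are the ones quoted above.</p>

```python
import math


def disks_needed(target_mb_s: float, per_disk_mb_s: float = 30.0) -> int:
    """Disks required to sustain a target throughput at a given per-disk rate."""
    if per_disk_mb_s <= 0:
        raise ValueError("per_disk_mb_s must be positive")
    return math.ceil(target_mb_s / per_disk_mb_s)


if __name__ == "__main__":
    target = 800  # MB/sec, an arbitrary example target
    print(disks_needed(target, 25))   # pessimistic random I/O rate
    print(disks_needed(target, 250))  # vendor-style sequential figure
```

<p>Sizing with the vendor-style sequential number suggests a handful of disks, while the random I/O rate calls for dozens.</p>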
<p>Continue to <a href="http://www.pythian.com/news/15167">part 2, partitioning</a>.</p>]]></content:encoded>
			<wfw:commentRss>http://planetmysql.ru/2010/07/30/data-warehousing-best-practices-comparing-oracle-to-mysql-part-1-introduction-and-power/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>High-Performance, Affordable, Open Data Marts</title>
		<link>http://www.kickfire.com/blog/?p=417&#038;utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=high-performance-affordable-open-data-marts</link>
		<comments>http://www.kickfire.com/blog/?p=417#comments</comments>
		<pubDate>Mon, 03 Aug 2009 18:25:14 +0000</pubDate>
		<dc:creator>Kickfire Team Blog</dc:creator>
				<category><![CDATA[data mart]]></category>
		<category><![CDATA[data warehouse]]></category>
		<category><![CDATA[Kickfire]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[SQL chip]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.kickfire.com/blog/?p=417</guid>
		<description><![CDATA[Departmental or subject-specific data warehouses - known as &#8220;data marts&#8221; in the industry - seem to be gaining in popularity.  Fueled partly by companies wanting to start small with focused projects in today&#8217;s economy, and partly by advances in data warehousing technology improving affordability and deployability, data marts seem to be popping up everywhere.
In most cases, data mart projects are driven by the head of a business unit or a functional group (like Sales) needing to analyze their own slice of data in order to run their department more efficiently and effectively.  The data may come directly from an operational system or a combination of source systems, resulting in what&#8217;s called an &#8220;independent data mart&#8221;, or it may come directly from a larger, enterprise data warehouse in a hub-and-spoke or &#8220;dependent data mart&#8221; configuration.
In either case today, according to industry analysts, companies are looking for data mart products that provide compelling price-performance and plug-and-play simplicity based on open architectures.
With our Kickfire Data Mart Appliance, we believe we have done just that.  By dramatically reducing the cost of high-performance data warehousing with our SQL Chip and ultra-modern column-store database, and by packing our technology in a true appliance, we have been able to achieve the industry&#8217;s leading price-performance and very compelling time-to-value. 
Furthermore, by leveraging the de facto standard open source database MySQL, our customers are able to design, develop, and deploy their data marts quickly and flexibly with the tools of their choice.  In this way, we&#8217;re able to provide high-performance, affordable, open data marts to allow businesses to respond to a market opportunity or competitive threat quickly and effectively.]]></description>
			<content:encoded><![CDATA[<p>Departmental or subject-specific data warehouses - known as &#8220;data marts&#8221; in the industry - seem to be gaining in popularity.  Fueled partly by companies wanting to start small with focused projects in today&#8217;s economy, and partly by advances in data warehousing technology improving affordability and deployability, data marts seem to be popping up everywhere.</p>
<p>In most cases, data mart projects are driven by the head of a business unit or a functional group (like Sales) needing to analyze their own slice of data in order to run their department more efficiently and effectively.  The data may come directly from an operational system or a combination of source systems, resulting in what&#8217;s called an &#8220;independent data mart&#8221;, or it may come directly from a larger, enterprise data warehouse in a hub-and-spoke or &#8220;dependent data mart&#8221; configuration.</p>
<p>In either case today, according to industry analysts, companies are looking for data mart products that provide compelling price-performance and plug-and-play simplicity based on open architectures.</p>
<p>With our Kickfire Data Mart Appliance, we believe we have done just that.  By dramatically reducing the cost of high-performance data warehousing with our SQL Chip and ultra-modern column-store database, and by packing our technology in a true appliance, we have been able to achieve the industry&#8217;s leading price-performance and very compelling time-to-value. </p>
<p>Furthermore, by leveraging the de facto standard open source database MySQL, our customers are able to design, develop, and deploy their data marts quickly and flexibly with the tools of their choice.  In this way, we&#8217;re able to provide high-performance, affordable, open data marts to allow businesses to respond to a market opportunity or competitive threat quickly and effectively.</p>]]></content:encoded>
			<wfw:commentRss>http://planetmysql.ru/2009/08/03/high-performance-affordable-open-data-marts/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

