Archive for the ‘KFDB’ Category

Kickfire Basics — The KFDB columnar storage engine

Июль 29th, 2009

This is the first post in a new series of “Kickfire Basics” blog posts by myself and others here at Kickfire.  This series will review the basics of the Kickfire appliance starting from this post describing how data is stored on disk, to future posts on topics such as loading data into the appliance and writing queries which best leverage the capabilities of the SQL chip.

The Kickfire Equation
Column store + Compression + SQL Chip = performance

The Kickfire Analytic Appliance features the new KFDB storage engine which was built from scratch to handle queries over vast amounts of data.  KFDB is a column store in contrast to most MySQL storage engines which are row stores.  What follows is a description of our column oriented storage engine and how it improves performance over typical row stores.

This post concerns itself with the first part of the equation, the KFDB column store.

Column stores provide significant IO benefits over row stores

In general row stores are optimized for the quick storage and retrieval of many columns from a table for a small number of rows. Performance may suffer when a large number of rows must be accessed, particularly if only a small subset of columns must be accessed by the query.

In contrast, column stores perform very well when querying over a large number of rows, particularly if a small number of columns must be accessed, but they may struggle if asked to return only a small number of rows.  This is because instead of having to access entire rows of data, the column store can quickly retrieve data for only the subset of columns included in the query.

The column store uses 1/20th the I/O of a row store in this example diagram.

The column store uses 1/20th the I/O of a row store in this example diagram.

As you can see from the diagram, much less I/O is necessary to count the number of rows which match the WHERE  ‘Age < 15′ filter condition.

  • In order to determine which column values match the WHERE clause, a row store much read 5 entire rows, which reads 100 bytes from the table, assuming with overhead that 20 bytes of storage are used per row.
  • The column store needs only access the ‘Age’ column in order to answer the query, which reduces the amount of I/O significantly.  Column stores such as Kickfire also support column compression, which could reduce I/O even further.

In summary, column stores perform extremely well when a subset of the available columns are selected from the table because this reduces the amount of IO necessary to retrieve the values.

Beyond this point is a more technical description of how the KFDB storage engine stores rows on disk.

Columnar storage in KFDB in depth

The KFDB storage engine stores the data for each column into a segment.  Each segment is stored as a fixed-width structure on disk.  An individual segment contains data from only one column and one segment may belong to only a single table.  More than one segment can be grouped together into a column-group so that often retrieved groups of columns can be retrieved with reduced I/O costs.

In the example diagram above, each column (Pet Name, Pet Type, Age) will be stored in a separate segment and the “row #” column represents the row id (see below) for each row.

Column values are stored as fixed width on disk

The on disk width of a column is usually referred to as the “significant width” of the column.  The significant width is determined based on the data type, compression attributes and the values stored in the column.  The Kickfire loader chooses the best compression and storage attributes automatically during the loading process, but these values can be provided manually as well.  Later, when additional data is loaded, data may be “re-organized” into a different optimum format automatically.  We call this called data restructuring a “reorg”.

Each column is an array of values indexed by ROW_ID.

Each segment may be conceptualized as an array of values.  Each row in the database is identified by a ROW_ID or RID which represents the row number in the table.  To find any one column value, Kickfire multiplies the ROW_ID* significant_width and use this as an offset into the column.  This allows Kickfire to address rows in hardware and software very quickly using virtual addresses or VAs.  Each column has a base VA which is mapped into SQL chip memory for fast access.  The appliance smartly fetches appropriate ranges of columns based on what are called VA ranges, or VARs.

Columnar storage lends itself to sequential IO

When reading in an entire column, or a VA range, the database uses sequential IO to read the values into memory.  Sequential IO is much faster than the random IO.  Once the data is in memory, it may be addressed via VA lookups at extreme speeds.