Archive for the ‘crosstab’ Category

Crosstabs cell-shifting

Май 6th, 2010

A crosstab query is a specific query used to create aggregate reports on two or more fields, it’s  a handy way to display summary information. At Open Query we have customers using that trick to display production schedules.

The summary table is generated from the database to extract the manufacturing date (mand), unit number (unitn), product area (pro_area), and ranking (rnk). Then we can start using that summary table to process our crosstab. The summary table looks like this:

mand unitn pro_area rnk
2009-12-15 587-MWI2-PP49 1 11
2009-12-15 587-MWI2-PP50 1 10
2009-12-16 670-W1101 1 11
2009-12-17 670-W1102 1 7
2009-12-16 670-W1103 1 10
2009-12-16 683-BR107-S2-01 12 9
2009-12-18 683-BR107-S4-05a 12 11
2009-12-18 683-BR107-S5-06 12 10
2009-12-15 691-02-105 1 9
2009-12-16 691-02-107 1 8
2009-12-17 691-03-101 1 6
2009-12-17 691-03-105 1 5

Starting from that table we are going to generate the final report, who will display the production schedule by factory and by production order (rank).  Our crosstab query will look like this:

select
pro_area as ‘factory’,
max( case when mand = ‘2009-12-15′ then unitn else ” end ) as ‘15Dec09′,
max( case when mand = ‘2009-12-16′ then unitn else ” end ) as ‘16Dec09′,
max( case when mand = ‘2009-12-17′ then unitn else ” end ) as ‘17Dec09′,
max( case when mand = ‘2009-12-18′ then unitn else ” end ) as ‘18Dec09′
from pro_sched_sorted
group by pro_area, rnk
order by pro_area, rnk desc

SELECT pro_area as ‘factory’,

max( case when mand = ‘2009-12-15′ then unitn else ” end ) as ‘15Dec09′,

max( case when mand = ‘2009-12-16′ then unitn else ” end ) as ‘16Dec09′,

max( case when mand = ‘2009-12-17′ then unitn else ” end ) as ‘17Dec09′,

max( case when mand = ‘2009-12-18′ then unitn else ” end ) as ‘18Dec09′

FROM pro_sched_sorted

GROUP BY pro_area, rnk

ORDER BY pro_area, rnk desc

And the result of our query will give the following report:

factory 15Dec09 16Dec09 17Dec09 18Dec09
1 587-MWI2-PP49 670-W1101
1 587-MWI2-PP50 670-W1103
1 691-02-105 691-03-106
1 691-02-107 691-04-103
1 670-W1102
1 691-03-101
1 691-03-105
5 698-078
5 698-079
5 698-080
5 698-081
5 698-082 698-087
5 698-083 698-088
5 698-084 698-089
5 698-085 698-090
5 698-086 698-091
9 692-1930-073 692-1930-081
9 692-1930-074 692-1930-082
9 692-1930-075 692-1930-083
9 692-1930-076 692-1930-084
10 692-1930-077 692-1930-085
10 692-1930-078 692-1930-086
10 692-1930-079 692-1930-087
10 692-1930-080 692-1930-088
12 683-BR107-S4-05a
12 683-BR107-S5-06
12 683-BR107-S2-01

Now the result of our crosstab query is statistically correct, but the result layout is not satisfying because of the empty cells that are generated in the case a given factory does not have a production order for a specific date.

I have been looking into a solution for removing those empty cells on the server side. That would involve either filtering, or doing cell-shifting in SQL.

The filtering solution is not really an option since the date columns are aggregated. You cannot just add a WHERE … != ”clause because of the GROUP BY process.

For the second solution, I have thought of writing a special stored procedure as a cursor on the result set that would reorder the report by replacing empty cells by the value in the next row. Now probably MySQL stored procedure language would be somewhat limited for doing previous row/next row operations.

Any thoughts on doing cell-shifting in SQL?


PlanetMySQL Voting: Vote UP / Vote DOWN

Multi dimensional cubes in MySQL through Gearman

Январь 20th, 2010

MySQL cubes with Gearman

I gave two presentations about Gearman at the Linux.conf.au. As part of the preparation for these talks, I created several sample applications. One of them, about remote replication administration, I will cover in a separate post. The most amazing one, which I cover here, is a quick and painless solution for multiple level crosstabs in MySQL.

Some background is needed. Crosstabs (also called data cubes or pivot tables, have been one of my favorite hacks for long time. In 2001 I wrote an article about a simple way of doing single level crosstabs. A few years later, I developed a Perl module that generates multiple levels of data cubes in most any database systems. Since then, I have received countless requests to convert this module to PHP, Python, Java, and I have always declined, for lack of time or abilities.
In the coming years, I tackled the same problem using MySQL Proxy and some SQL hacks. Both attempts were not completely satisfactory. The options offered by the Perl module are simply too hard to replicate to any other system.
When I started using Gearman, I realized that I could use the original Perl module through a Gearman worker, without converting to any other language. The idea is to write a simple worker that accepts some parameters and runs the Perl module to return a crosstab query to the client. The query being the most complicated thing to generate, the architecture could look like the image below.

To take the idea one step further, I used the Gearman UDF for MySQL, which makes the crosstab function available at the SQL level, thus being transparent no matter which programming language the client uses, and without need of using the Gearman API.

In this scenario, what you need to do is just querying the worker (through the UDF), with a simple string of parameters.

mysql> set @q = (select gman_do('crosstab',
'from=all_personnel;op=sum salary;rows=country;cols=gender'));

mysql> prepare q from @q; execute q;
+---------+-------+-------+-------+
| country | m | f | total |
+---------+-------+-------+-------+
| Germany | 16000 | 11000 | 27000 |
| Italy | 6000 | 6000 | 12000 |
| UK | 10500 | NULL | 10500 |
| zzzz | 32500 | 17000 | 49500 |
+---------+-------+-------+-------+

To make this work, what's missing is the worker. You can try the sample crosstab worker from MySQL Forge.

PlanetMySQL Voting: Vote UP / Vote DOWN

Multi dimensional cubes in MySQL through Gearman

Январь 20th, 2010

MySQL cubes with Gearman

I gave two presentations about Gearman at the Linux.conf.au. As part of the preparation for these talks, I created several sample applications. One of them, about remote replication administration, I will cover in a separate post. The most amazing one, which I cover here, is a quick and painless solution for multiple level crosstabs in MySQL.

Some background is needed. Crosstabs (also called data cubes or pivot tables, have been one of my favorite hacks for long time. In 2001 I wrote an article about a simple way of doing single level crosstabs. A few years later, I developed a Perl module that generates multiple levels of data cubes in most any database systems. Since then, I have received countless requests to convert this module to PHP, Python, Java, and I have always declined, for lack of time or abilities.
In the coming years, I tackled the same problem using MySQL Proxy and some SQL hacks. Both attempts were not completely satisfactory. The options offered by the Perl module are simply too hard to replicate to any other system.
When I started using Gearman, I realized that I could use the original Perl module through a Gearman worker, without converting to any other language. The idea is to write a simple worker that accepts some parameters and runs the Perl module to return a crosstab query to the client. The query being the most complicated thing to generate, the architecture could look like the image below.

To take the idea one step further, I used the Gearman UDF for MySQL, which makes the crosstab function available at the SQL level, thus being transparent no matter which programming language the client uses, and without need of using the Gearman API.

In this scenario, what you need to do is just querying the worker (through the UDF), with a simple string of parameters.

mysql> set @q = (select gman_do('crosstab',
'from=all_personnel;op=sum salary;rows=country;cols=gender'));

mysql> prepare q from @q; execute q;
+---------+-------+-------+-------+
| country | m | f | total |
+---------+-------+-------+-------+
| Germany | 16000 | 11000 | 27000 |
| Italy | 6000 | 6000 | 12000 |
| UK | 10500 | NULL | 10500 |
| zzzz | 32500 | 17000 | 49500 |
+---------+-------+-------+-------+

To make this work, what's missing is the worker. You can try the sample crosstab worker from MySQL Forge.

PlanetMySQL Voting: Vote UP / Vote DOWN