Archive for the ‘Tips & Tricks’ Category

MySQL GIS – Part 1

Август 24th, 2010

Tweet

In my business (weather) we use lots map based (Geo) information.  Almost every table has latitude and longitude. Working with this kind of data can be exciting and frustrating.  This should give you a quick start into GIS with MySQL.

“A geographic information system (GIS), or geographical information system, is any system that captures, stores, analyzes, manages, and presents data that are linked to location. In the simplest terms, GIS is the merging of cartography, statistical analysis, and database technology. GIS systems are used in cartography, remote sensing, land surveying, utility management, natural resource management, photogrammetry, geography, urban planning, emergency management, navigation, and localized search engines.”Wikipedia

GIS / Mapping Systems work with both text data and graphical data.  Applications and utilities often blur the lines between the two types and make understanding difficult.  Map servers blend raster images, with point or polygon data, and bitmap images to make complete images to display in the user’s client application.  For this post I will concentrate on the text type “data”.  The type we can index in a MySQL database.

THE SEARCH

After months of reading, [1]  I’m writing this post to describes what I have learned about how to get started using GEO coding data as quickly as possible.  I found very little piratical information on GIS and MySQL.  The MySQL manual covers the functions but doesn’t supply much practical information on GEO.  Anders Karlsson wrote a nice and short story about GIS that give me a good start.

The best information has be written by Florin Duroiu in his post titled “Political boundaries overlay in Google maps”. A good part of my post is based on his work.

STEP BY STEP

Below are the detailed needed to to produce a MySQL database with the Points of Interest (POI).  This is based on CentOS 5.5 with MySQL 5.1.

yum install gdal
mkdir geo
mkdir data
mkdir data/Oklahoma
cd geo/data/Oklahoma
wget http://downloads.cloudmade.com/north_america/united_states/oklahoma/oklahoma.shapefiles.zip
unzip oklahoma.shapefiles.zip
mysql -e 'create database geo'
ogr2ogr -f "MySQL" MySQL:"geo,user=root,host=localhost,password=" -nln oklahoma_poi -lco engine=MYISAM oklahoma_poi.shp
mysql geo -e 'desc oklahoma_poi'

+----------+--------------+------+-----+---------+----------------+
| Field    | Type         | Null | Key | Default | Extra          |
+----------+--------------+------+-----+---------+----------------+
| OGR_FID  | int(11)      | NO   | PRI | NULL    | auto_increment |
| SHAPE    | geometry     | NO   | MUL | NULL    |                |
| category | varchar(30)  | YES  |     | NULL    |                |
| name     | varchar(113) | YES  |     | NULL    |                |
+----------+--------------+------+-----+---------+----------------+

mysql geo -e "select name, category, Y(SHAPE) as lat, X(SHAPE) as lng  from oklahoma_poi where NAME like 'School:Putnam%'"

+-------------------------------------+--------------------------------+------------+-------------+
| name                                | category                       | lat        | lng         |
+-------------------------------------+--------------------------------+------------+-------------+
| School:Putnam City West High School | Government and Public Services |  35.492557 | -97.6605975 |
| School:Putnam City North School     | Government and Public Services | 35.5892209 | -97.6372648 |
| School:Putnam City School           | Government and Public Services | 35.5122794 | -97.6142079 |
| School:Putnam High School           | Government and Public Services | 35.5214459 | -97.6086523 |
| School:Putnam Heights Academy       | Government and Public Services | 35.5081143 | -97.5397619 |
+-------------------------------------+--------------------------------+------------+-------------+

In a coming set of post I’ll go over:

  • The “Data” types your will find and how to convert between them.
  • What data is available and where can you find it?
  • More examples on what you can do with GIS data.
  • Viewing our GIS data.
  • How to collect your own GIS data.
  • Good and bad examples of searching GIS data.
  • Optimizing MySQL GIS.  Is it really worth using?

[1] Books: GIS for DummiesAuthor: Michael N. DeMers – John Wiley & Sons (2009) – ISBN: 0470236825
Open Source GIS: A GRASS GIS Approach. Third Edition.
Author: Markus Neteler and Helena Mitasova – ISBN: 978-0-38735767-6
Web Mapping Illustrated: Using Open Source GIS Toolkits
Author: Tyler Mitchell – ISBN: 9780596008659




PlanetMySQL Voting: Vote UP / Vote DOWN

Does Size or Type Matter?

Июль 27th, 2010

MySQL seems to be happy to convert types for you. Developers are rushed to complete their project and if the function works they just move on. But what is the costs of mixing your types? Does it matter if your are running across a million rows or more? Lets find out.

Here is what the programmers see.

mysql> select 1+1;
+-----+
| 1+1 |
+-----+
|   2 |
+-----+
1 row in set (0.00 sec)

mysql> select "1"+"1";
+---------+
| "1"+"1" |
+---------+
|       2 |
+---------+
1 row in set (0.00 sec)

Benchmark

What if we do a thousand simple loops?  How long does the looping itself take?

The BENCHMARK() function executes the expression expr repeatedly count times. It may be used to time how quickly MySQL processes the expression. The result value is always 0.

mysql> select benchmark(1000000000, 1);
+--------------------------+
| benchmark(1000000000, 1) |
+--------------------------+
|                        0 |
+--------------------------+
1 row in set (5.42 sec)

mysql> select benchmark(1000000000, "1" );
+-----------------------------+
| benchmark(1000000000, "1" ) |
+-----------------------------+
|                           0 |
+-----------------------------+
1 row in set (5.40 sec)

So maybe type doesn’t matter? About five seconds just to loop but the type didn’t change it.   What if we add 1+”1″?

mysql> select benchmark(1000000000, 1+1);
+----------------------------+
| benchmark(1000000000, 1+1) |
+----------------------------+
|                          0 |
+----------------------------+
1 row in set (12.65 sec)

mysql> select benchmark(1000000000, 1+"1");
+------------------------------+
| benchmark(1000000000, 1+"1") |
+------------------------------+
|                            0 |
+------------------------------+
1 row in set (35.58 sec)
mysql> select benchmark(1000000000, "1"+"1");
+--------------------------------+
| benchmark(1000000000, "1"+"1") |
+--------------------------------+
|                              0 |
+--------------------------------+
1 row in set (51.59 sec)

It looks like type does matter.  But does it always matter?

mysql> select benchmark(1000000000, sum(1+1));
+---------------------------------+
| benchmark(1000000000, sum(1+1)) |
+---------------------------------+
|                               0 |
+---------------------------------+
1 row in set (9.69 sec)

mysql> select benchmark(1000000000, sum("1"+"1"));
+-------------------------------------+
| benchmark(1000000000, sum("1"+"1")) |
+-------------------------------------+
|                                   0 |
+-------------------------------------+
1 row in set (9.94 sec)

mysql> select benchmark(1000000000, sum("1.23456789"+"1.23456789"));
+-------------------------------------------------------+
| benchmark(1000000000, sum("1.23456789"+"1.23456789")) |
+-------------------------------------------------------+
|                                                     0 |
+-------------------------------------------------------+
1 row in set (10.32 sec)

So, not all functions are the same.  But it looks like size might matter!

mysql> select benchmark(1000000000, 1.1+1.1);
+--------------------------------+
| benchmark(1000000000, 1.1+1.1) |
+--------------------------------+
|                              0 |
+--------------------------------+
1 row in set (34.90 sec)

mysql> select benchmark(1000000000, "1.1"+"1.1");
+------------------------------------+
| benchmark(1000000000, "1.1"+"1.1") |
+------------------------------------+
|                                  0 |
+------------------------------------+
1 row in set (1 min 15.32 sec)

mysql> select  benchmark(1000000000, "1.123456789"+"1.123456789");
+----------------------------------------------------+
| benchmark(1000000000, "1.123456789"+"1.123456789") |
+----------------------------------------------------+
|                                                  0 |
+----------------------------------------------------+
1 row in set (1 min 53.32 sec)

Sorry.  Looks like size does matter.
This doesn't seem logical.

mysql> select benchmark(1000000000, 1=1);
+----------------------------+
| benchmark(1000000000, 1=1) |
+----------------------------+
|                          0 |
+----------------------------+
1 row in set (12.75 sec)

mysql> select benchmark(1000000000, 1="1");
+------------------------------+
| benchmark(1000000000, 1="1") |
+------------------------------+
|                            0 |
+------------------------------+
1 row in set (40.78 sec)
mysql> select benchmark(1000000000, 1=true);
+-------------------------------+
| benchmark(1000000000, 1=true) |
+-------------------------------+
|                             0 |
+-------------------------------+
1 row in set (12.73 sec)

mysql> select benchmark(1000000000, 1="true");
+---------------------------------+
| benchmark(1000000000, 1="true") |
+---------------------------------+
|                               0 |
+---------------------------------+
1 row in set, 65535 warnings (3 min 5.72 sec)
mysql> select benchmark(1000000000, "true"="true");
+--------------------------------------+
| benchmark(1000000000, "true"="true") |
+--------------------------------------+
|                                    0 |
+--------------------------------------+
1 row in set (57.25 sec)

Maybe we should CAST our work?

mysql> select benchmark(1000000000, cast("1" as unsigned));
+----------------------------------------------+
| benchmark(1000000000, cast("1" as unsigned)) |
+----------------------------------------------+
|                                            0 |
+----------------------------------------------+
1 row in set (32.27 sec)

mysql> select benchmark(1000000000, cast("1" as unsigned) + cast("1" as unsigned));
+----------------------------------------------------------------------+
| benchmark(1000000000, cast("1" as unsigned) + cast("1" as unsigned)) |
+----------------------------------------------------------------------+
|                                                                    0 |
+----------------------------------------------------------------------+
1 row in set (1 min 7.24 sec)

Maybe not!
Conclusion:  Be careful with your data types.  If you are taking user input, do the type conversion ONCE in your program.  Don’t let MySQL do the type conversions for you.
query = “SELECT * FROM table where $INPUT = 1″;   could be doing your wrong.

References:

http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_benchmark

http://dev.mysql.com/doc/refman/5.0/en/numeric-type-overview.html

http://dev.mysql.com/doc/refman/5.0/en/cast-functions.html

PlanetMySQL Voting: Vote UP / Vote DOWN

Federated Tables

Июль 7th, 2010

Your searching for how to create a join across two databases on two different servers and it can’t be done directly.   select  d1.a, d2.b from db1@server1 join db2@server2 where db1.c = db2.c; does not work.

You learn about federated databases.  The federated storage engine allows accesses data in tables of remote databases.  Now how do you make it work?

1) Check if the federated storage engine is supported.  Federation is OFF by default!

mysql> show engines;
+------------+---------+----------------------------------------------------------------+
| Engine     | Support | Comment                                                        |
+------------+---------+----------------------------------------------------------------+
| InnoDB     | YES     | Supports transactions, row-level locking, and foreign keys     |
| MyISAM     | DEFAULT | Default engine as of MySQL 3.23 with great performance         |
| BLACKHOLE  | YES     | /dev/null storage engine (anything you write to it disappears) |
| CSV        | YES     | CSV storage engine                                             |
| MEMORY     | YES     | Hash based, stored in memory, useful for temporary tables      |
| FEDERATED  | YES     | Federated MySQL storage engine                                 |
| ARCHIVE    | YES     | Archive storage engine                                         |
| MRG_MYISAM | YES     | Collection of identical MyISAM tables                          |
+------------+---------+----------------------------------------------------------------+

If it is not “Support”ed (on) you need to add ‘federated=ON‘ to the [mysqld] section of your /etc/my.cnf file.  I found this section to be a bit troublesome.  It must be ‘=ON’ not ‘=YES” or even ‘=on’.   Most options allow these but the federated options is picky.  I’m running MySQL Enterprise 5.1.37.sp1.

2) If you don’t already have the database created, create the database on the storage server.  By ‘storage server’ I mean the one where the data will be written to disk.

I like to create a user just for the purpose of connection the federated copy of the database to the true database.  This way, if the password gets changed or the user deleted, the federated system can continue to connect.

mysql> CREATE DATABASE xfiles;
mysql> USE xfiles;
mysql> CREATE TABLE cases(
 Name VARCHAR(20),
 case TINYINT(3),
) ENGINE = INNODB;

3) Now you can create the federated version of your data on the remote system.

mysql> CREATE DATABASE xfiles;
mysql> USE xfiles;
mysql> CREATE TABLE cases(
 Name VARCHAR(20),
 case TINYINT(3),
) ENGINE = FEDERATED
CONNECTION = 'mysql://skiner:c0nsper@fbi/xfiles/cases';

4) Check your work. The table status should show Engine: FEDERATED.

mysql> use xfiles;
mysql> show table status\G

Now you can add records to the table and the data should show up in select on either server.

Enjoy.


PlanetMySQL Voting: Vote UP / Vote DOWN