Archive for the ‘tech’ Category

SQL JOINing a Table to Itself

Март 23rd, 2012

Getting two sets of information from one table in a select statement often leads people to write subselects, but it really doesn't matter that this is the same table twice, we can just give it a new alias and treat it as if it were a different table. This is one of those techniques where, once you've seen it, it's really obvious, but until that point it can be very confusing. I explained this to someone else recently, so I thought I'd capture it here in case it's helpful to anyone else.

Consider that tried-and-tested example: employees and managers. Here's the staff table from the database (today's imaginary data isn't particularly imaginative, sorry):

mysql> select * from staff;
+----+------------+-----------+------------+
| id | first_name | last_name | manager_id |
+----+------------+-----------+------------+
|  1 | Hattie     | Hopkins   |          4 |
|  2 | Henry      | Hopkins   |          4 |
|  3 | Harry      | Hopkins   |          5 |
|  4 | Helen      | Hopkins   |       NULL |
|  5 | Heidi      | Hopkins   |          4 |
|  6 | Hazel      | Hopkins   |          1 |
+----+------------+-----------+------------+
6 rows in set (0.00 sec)


In order to see who reports to whom, we need to query the table to get the employees, and again to get the managers' names. MySQL will get horribly confused if we use the same table twice, so we alias it each time to indicate which one we mean. I do this by drawing the picture of which tables I need and labelling them - here I've used "e" for employees and "m" for managers:

mysql> select e.first_name as employee, m.first_name as reports_to
    -> from staff e
    -> inner join staff m on e.manager_id = m.id;
+----------+------------+
| employee | reports_to |
+----------+------------+
| Hattie   | Helen      |
| Henry    | Helen      |
| Harry    | Heidi      |
| Heidi    | Helen      |
| Hazel    | Hattie     |
+----------+------------+
5 rows in set (0.02 sec)

Did you spot that there's one row less in this output? That's because we did an inner join, but Helen has no manager since she owns the company, so she doesn't appear in the second set of results. We could easily have left joined instead to include her if we wanted to.

You can use this technique regardless of how many other tables are needed for the query and whether you need to join the table to itself or onto other places. You might have a table of users, and a table of user_friends, linking one user to another. You can start from the users table, join to user_friends, then join to user again with a different alias to pull back the information you need - the possiblities are almost endless.

Lorna is an independent web development consultant, writer and trainer, open source project lead and community evangelist. This post was originally published at LornaJane


PlanetMySQL Voting: Vote UP / Vote DOWN

Linux Documentation Writer Wanted!

Февраль 9th, 2012

The Oracle Linux and Virtualization Documentation Team is seeking an experienced Technical Writer
with a focus on writing documentation for the Oracle Linux product. (The MySQL Documentation Team is part of that group as well.)

Applicants should be located in either Ireland, the UK, Sweden, Norway, Denmark, or Finland (click on the links for a detailed job description).

We're a vastly distributed team, with writers in Australia, North America, and Europe. Our infrastructure is based on DocBook XML, and we're not just writing docs, but also maintain the whole processing and publication work chain.

Key competencies you should have include:

  • 3 or more years previous experience in writing software documentation (please provide URLs of your writings I can look at!)
  • Experience with writing documentation for system level software and operating systems
  • Strong knowledge of the Linux operating system
  • Strong knowledge of XML, DocBook XML, and XSL style sheets (and motivation to help maintain and expand our tools and infrastructure)
  • Ability to administer own workstation and test environment
  • Good experience with distributed working environments and versioning systems such as SVN

If this sounds like something for you, follow the links above and send in your application!


PlanetMySQL Voting: Vote UP / Vote DOWN

SQL Joins with On or Using

Январь 17th, 2012

I recently wrote a post about inner and outer joins, and a couple of people asked what the difference is between USING and ON.

In a nutshell, you use ON for most things, but USING is a handy shorthand for the situation where the column names are the same.

Consider this example dataset:

mysql> select * from pets;
+---------+---------+--------+-----------+
| pets_id | animal  | name   | owners_id |
+---------+---------+--------+-----------+
|       1 | fox     | Rusty  |         2 |
|       2 | cat     | Fluffy |         2 |
|       3 | cat     | Smudge |         3 |
|       4 | cat     | Toffee |         3 |
|       5 | dog     | Pig    |         3 |
|       6 | hamster | Henry  |         1 |
|       7 | dog     | Honey  |         1 |
+---------+---------+--------+-----------+
7 rows in set (0.00 sec)

mysql> select * from owners;
+-----------+-------+
| owners_id | name  |
+-----------+-------+
|         1 | Susie |
|         2 | Sally |
|         3 | Sarah |
+-----------+-------+
3 rows in set (0.00 sec)

To find out who has which pets, we would join the two tables together like this:

mysql> select owners.name as owner, pets.name as pet, pets.animal
    -> from owners join pets on (pets.owners_id = owners.owners_id);
+-------+--------+---------+
| owner | pet    | animal  |
+-------+--------+---------+
| Sally | Rusty  | fox     |
| Sally | Fluffy | cat     |
| Sarah | Smudge | cat     |
| Sarah | Toffee | cat     |
| Sarah | Pig    | dog     |
| Susie | Henry  | hamster |
| Susie | Honey  | dog     |
+-------+--------+---------+
7 rows in set (0.00 sec)

The example above uses the ON keyword, but since the columns we use to join are called owners_id in both tables, then we can instead put in USING as a shorthand.

mysql> select owners.name as owner, pets.name as pet, pets.animal
    -> from owners join pets using (owners_id);
+-------+--------+---------+
| owner | pet    | animal  |
+-------+--------+---------+
| Sally | Rusty  | fox     |
| Sally | Fluffy | cat     |
| Sarah | Smudge | cat     |
| Sarah | Toffee | cat     |
| Sarah | Pig    | dog     |
| Susie | Henry  | hamster |
| Susie | Honey  | dog     |
+-------+--------+---------+
7 rows in set (0.00 sec)

OK so it's a super-simple tip but until you see the different approaches laid out side-by-side, it can be confusing. This USING trick is why you will often see fields named, for example, "user_id" when they are in the "users" table - then the shorthand can be used any time you join this user_id to any other user_id column.

Lorna is an independent web development consultant, writer and trainer, open source project lead and community evangelist. This post was originally published at LornaJane


PlanetMySQL Voting: Vote UP / Vote DOWN

Inner vs Outer Joins on a Many-To-Many Relationship

Декабрь 20th, 2011

Someone will probably tell me that this is an elementary-level topic, but I got some good questions regarding joins from my most recent ZCE class students, so I thought I'd put down the examples that I used to explain this to them. Being able to join with confidence is a key skill, because it means that you can refactor and normalise your data, without worrying about how hard something will be to retrieve.

The database structure I'm using here looks like this (the same example that I used when I wrote the Databases chapter for PHP Master):

diagram showing table relationships

If you want to grab the actual data I'm using for my examples, then here's the zipped mysqldump file: recipes-db.sql.

The database includes a "many-to-many" relationship; each recipe is linked to the ingredients needed for this particular dish, while each ingredient can of course appear in many dishes. To represent this, there is a recipes table, an ingredients table, and a recipe_ingredients table to show the combinations which apply. To join twice and understand whether you're getting all the rows or just the ones with matches in all the tables can be confusing, so this post is all about showing you some examples.

Fetching Matching Records From All the Tables

Let's say we want to see which ingredients are in which recipe. We'll do this in a number of steps, which I will show below, along with the query and result that goes with that step.

Get all the recipes

mysql> SELECT id, name FROM recipes;
+----+-------------------+
| id | name              |
+----+-------------------+
|  1 | Apple Crumble     |
|  4 | Beans Chili       |
|  5 | Chicken Casserole |
|  2 | Fruit Salad       |
|  3 | Weekday Risotto   |
+----+-------------------+
5 ROWS IN SET (0.00 sec)

OK, so we have some recipes, looking at the relationship diagram, we can see we need to link across to recipe_ingredients using the ID of each recipe.

All recipes, linked with recipe_ingredients

mysql> SELECT r.id AS recipe_id, r.name, ri.ingredient_id 
FROM recipes r 
JOIN recipe_ingredients ri ON (r.id = ri.recipe_id);
+-----------+---------------+---------------+
| recipe_id | name          | ingredient_id |
+-----------+---------------+---------------+
|         1 | Apple Crumble |             1 |
|         1 | Apple Crumble |             5 |
|         1 | Apple Crumble |             7 |
|         1 | Apple Crumble |             8 |
|         2 | Fruit Salad   |             6 |
|         2 | Fruit Salad   |             2 |
|         2 | Fruit Salad   |             1 |
|         2 | Fruit Salad   |             3 |
|         2 | Fruit Salad   |             4 |
+-----------+---------------+---------------+
9 ROWS IN SET (0.00 sec)

There are a couple of different things going on here. Firstly, I have aliased the table names to r and ri respectively. This is so that they are less in the way of us reading the important bits! Since there are multiple tables in the query now, it is important to "qualify" the table names - to say which table this column belongs to so that if there is more than one id column, it's obvious which one you meant.

Secondly, there is now a join between two tables. The query doesn't specify what kind of join it is, but the default join type is INNER JOIN. You might notice that not all of the recipes appear in this result set - only the ones where there are matching records in the recipes_ingredients table will match this query. To get all of the records, we would use an outer join - more on that later.

All the recipes, their ingredients, and the actual display information for those ingredients

To get more than the ingredient_id, we need to join on the ingredients table to get the details. This is a very common type of join, we'll use it when we're linking to pretty much any kind of detail record by its ID. Once again, it's an inner join, and it looks like this:

mysql> SELECT r.id AS recipe_id, r.name, ri.ingredient_id, i.item
FROM recipes r
JOIN recipe_ingredients ri ON (r.id = ri.recipe_id)
JOIN ingredients i ON (ri.ingredient_id = i.id);
+-----------+---------------+---------------+--------------+
| recipe_id | name          | ingredient_id | item         |
+-----------+---------------+---------------+--------------+
|         1 | Apple Crumble |             1 | apple        |
|         1 | Apple Crumble |             5 | flour        |
|         1 | Apple Crumble |             7 | butter       |
|         1 | Apple Crumble |             8 | sugar        |
|         2 | Fruit Salad   |             6 | fruit juice  |
|         2 | Fruit Salad   |             2 | banana       |
|         2 | Fruit Salad   |             1 | apple        |
|         2 | Fruit Salad   |             3 | kiwi fruit   |
|         2 | Fruit Salad   |             4 | strawberries |
+-----------+---------------+---------------+--------------+
9 ROWS IN SET (0.00 sec)

And there you have it - a many-to-many relationship join across three tables. Now you have this result, you can start adding whatever "where" clauses you need to get exactly the information you're after. This dataset includes chef and category, so you could search by all pudding recipes for example (my favourite!)

All the Records, Even Those Without Matches (Outer Join)

How about those "lost" recipes, the ones without details of their ingredients? You might want to include them in your results even if the chef didn't include ingredient details yet. To do this, we'll use an outer join.

The outer join brings in all the records, plus any matching ones. So in our example, we'll see all recipes, plus any ingredients records that match. If there are no ingredients that match, then we'll still get those records, but with a NULL in the ingredients fields.

Outer joins come in two flavours: LEFT and RIGHT. The way that I think of this is that the first table is the one on the left; if you want all the records from that first table, plus any matches from the second, you'll choose LEFT. If you want all the records from the second table, plus information from the first one if there is any, you'll choose RIGHT. Take a look at the diagram at the top of this post, and you'll see that we have the recipes table on the left of the diagram, and first in our query, so we'll be using a LEFT join here. (as a complete aside, I almost always have left joins, because I build up my queries that way around: getting the records I want, then the additional information to go with them, and so on. Maybe that's just the way my brain works?). Here's our recipes and ingredients again, but including the recipes without ingredients:

mysql> SELECT r.id AS recipe_id, r.name, ri.ingredient_id, i.item 
FROM recipes r 
LEFT JOIN recipe_ingredients ri ON (r.id = ri.recipe_id)
LEFT JOIN ingredients i ON (ri.ingredient_id = i.id);
+-----------+-------------------+---------------+--------------+
| recipe_id | name              | ingredient_id | item         |
+-----------+-------------------+---------------+--------------+
|         1 | Apple Crumble     |             1 | apple        |
|         1 | Apple Crumble     |             5 | flour        |
|         1 | Apple Crumble     |             7 | butter       |
|         1 | Apple Crumble     |             8 | sugar        |
|         4 | Beans Chili       |          NULL | NULL         |
|         5 | Chicken Casserole |          NULL | NULL         |
|         2 | Fruit Salad       |             6 | fruit juice  |
|         2 | Fruit Salad       |             2 | banana       |
|         2 | Fruit Salad       |             1 | apple        |
|         2 | Fruit Salad       |             3 | kiwi fruit   |
|         2 | Fruit Salad       |             4 | strawberries |
|         3 | Weekday Risotto   |          NULL | NULL         |
+-----------+-------------------+---------------+--------------+
12 ROWS IN SET (0.00 sec)

Since we're working on a many-to-many relationship here, we're joining three tables in total. This means two joins: one between the first table and the linking table, and another from the result of that query to the other end of the relationship. In this example, this means we had two left joins. Each join operates between two data sets, which are the resultset you have so far, and the new table that is being joined. Since we still want to have records from the recipe/recipe_ingredients join even if there isn't a matching record in ingredients, then the second join here also needs to be an outer join.

Joining All Tables

My initial diagram showed five tables in all, so here's the query and result for pulling all that data together:

mysql> SELECT r.name AS recipe, c.name AS category, ch.name AS chef, i.item AS ingredient
FROM recipes r 
LEFT JOIN recipe_ingredients ri ON (r.id = ri.recipe_id)
LEFT JOIN ingredients i ON (ri.ingredient_id = i.id)
INNER JOIN chefs ch ON (r.chef_id = ch.id)
INNER JOIN categories c ON (r.category_id = c.id);
+-------------------+----------+-------+--------------+
| recipe            | category | chef  | ingredient   |
+-------------------+----------+-------+--------------+
| Weekday Risotto   | Main     | Lorna | NULL         |
| Beans Chili       | Main     | Lorna | NULL         |
| Chicken Casserole | Main     | Lorna | NULL         |
| Apple Crumble     | Pudding  | Lorna | apple        |
| Apple Crumble     | Pudding  | Lorna | flour        |
| Apple Crumble     | Pudding  | Lorna | butter       |
| Apple Crumble     | Pudding  | Lorna | sugar        |
| Fruit Salad       | Pudding  | Lorna | fruit juice  |
| Fruit Salad       | Pudding  | Lorna | banana       |
| Fruit Salad       | Pudding  | Lorna | apple        |
| Fruit Salad       | Pudding  | Lorna | kiwi fruit   |
| Fruit Salad       | Pudding  | Lorna | strawberries |
+-------------------+----------+-------+--------------+
12 ROWS IN SET (0.00 sec)

Once you have this dataset, you can filter it down any way you want to. Since the chef and category elements are required fields, I used inner joins for those; left joins would have given the same results though on this dataset.

If you were looking for examples of inner and outer joins across a real (if simple!) database with multiple joins, I hope this helped. All this talking about food is making me hungry, I'm off to the kitchen :)

Lorna is an independent web development consultant, writer and trainer, open source project lead and community evangelist. This post was originally published at LornaJane


PlanetMySQL Voting: Vote UP / Vote DOWN

Introduction to memcached

Май 28th, 2010

These are the slides to a talk I did earlier this week for students of the professional bachelor in ICT course at KaHo St. Lieven. I wanted to give a clear and simple introduction to the memcached service, as I think it’s an invaluable tool in today’s web development.


PlanetMySQL Voting: Vote UP / Vote DOWN