Archive for the ‘data integrity’ Category

Impact of foreign keys absence on replicating slaves

Май 14th, 2012

In this post I describe what happens when a slave's Foreign Key setup is different from that of the master. I'm in particular interested in a setup where the slave has a subset of the master's foreign keys, or no foreign keys at all. I wish to observe whether integrity holds.

Making the changes

Which foreign keys do we have and how do we drop them? If you want to do this by hand, well, good luck! Fortunately, common_schema provides with quite a few handy views and routines to assist us. Consider viewing the existing foreign keys on sakila:

master> SELECT create_statement FROM common_schema.sql_foreign_keys WHERE TABLE_SCHEMA='sakila';
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| create_statement                                                                                                                                                                                |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ALTER TABLE `sakila`.`address` ADD CONSTRAINT `fk_address_city` FOREIGN KEY (`city_id`) REFERENCES `sakila`.`city` (`city_id`) ON DELETE RESTRICT ON UPDATE CASCADE                             |
| ALTER TABLE `sakila`.`city` ADD CONSTRAINT `fk_city_country` FOREIGN KEY (`country_id`) REFERENCES `sakila`.`country` (`country_id`) ON DELETE RESTRICT ON UPDATE CASCADE                       |
| ALTER TABLE `sakila`.`customer` ADD CONSTRAINT `fk_customer_address` FOREIGN KEY (`address_id`) REFERENCES `sakila`.`address` (`address_id`) ON DELETE RESTRICT ON UPDATE CASCADE               |
| ALTER TABLE `sakila`.`customer` ADD CONSTRAINT `fk_customer_store` FOREIGN KEY (`store_id`) REFERENCES `sakila`.`store` (`store_id`) ON DELETE RESTRICT ON UPDATE CASCADE                       |
| ALTER TABLE `sakila`.`film` ADD CONSTRAINT `fk_film_language` FOREIGN KEY (`language_id`) REFERENCES `sakila`.`language` (`language_id`) ON DELETE RESTRICT ON UPDATE CASCADE                   |
| ALTER TABLE `sakila`.`film` ADD CONSTRAINT `fk_film_language_original` FOREIGN KEY (`original_language_id`) REFERENCES `sakila`.`language` (`language_id`) ON DELETE RESTRICT ON UPDATE CASCADE |
| ALTER TABLE `sakila`.`film_actor` ADD CONSTRAINT `fk_film_actor_actor` FOREIGN KEY (`actor_id`) REFERENCES `sakila`.`actor` (`actor_id`) ON DELETE RESTRICT ON UPDATE CASCADE                   |
| ALTER TABLE `sakila`.`film_actor` ADD CONSTRAINT `fk_film_actor_film` FOREIGN KEY (`film_id`) REFERENCES `sakila`.`film` (`film_id`) ON DELETE RESTRICT ON UPDATE CASCADE                       |
| ALTER TABLE `sakila`.`film_category` ADD CONSTRAINT `fk_film_category_category` FOREIGN KEY (`category_id`) REFERENCES `sakila`.`category` (`category_id`) ON DELETE RESTRICT ON UPDATE CASCADE |
| ALTER TABLE `sakila`.`film_category` ADD CONSTRAINT `fk_film_category_film` FOREIGN KEY (`film_id`) REFERENCES `sakila`.`film` (`film_id`) ON DELETE RESTRICT ON UPDATE CASCADE                 |
| ALTER TABLE `sakila`.`inventory` ADD CONSTRAINT `fk_inventory_film` FOREIGN KEY (`film_id`) REFERENCES `sakila`.`film` (`film_id`) ON DELETE RESTRICT ON UPDATE CASCADE                         |
| ALTER TABLE `sakila`.`inventory` ADD CONSTRAINT `fk_inventory_store` FOREIGN KEY (`store_id`) REFERENCES `sakila`.`store` (`store_id`) ON DELETE RESTRICT ON UPDATE CASCADE                     |
| ALTER TABLE `sakila`.`payment` ADD CONSTRAINT `fk_payment_customer` FOREIGN KEY (`customer_id`) REFERENCES `sakila`.`customer` (`customer_id`) ON DELETE RESTRICT ON UPDATE CASCADE             |
| ALTER TABLE `sakila`.`payment` ADD CONSTRAINT `fk_payment_rental` FOREIGN KEY (`rental_id`) REFERENCES `sakila`.`rental` (`rental_id`) ON DELETE SET NULL ON UPDATE CASCADE                     |
| ALTER TABLE `sakila`.`payment` ADD CONSTRAINT `fk_payment_staff` FOREIGN KEY (`staff_id`) REFERENCES `sakila`.`staff` (`staff_id`) ON DELETE RESTRICT ON UPDATE CASCADE                         |
| ALTER TABLE `sakila`.`rental` ADD CONSTRAINT `fk_rental_customer` FOREIGN KEY (`customer_id`) REFERENCES `sakila`.`customer` (`customer_id`) ON DELETE RESTRICT ON UPDATE CASCADE               |
| ALTER TABLE `sakila`.`rental` ADD CONSTRAINT `fk_rental_inventory` FOREIGN KEY (`inventory_id`) REFERENCES `sakila`.`inventory` (`inventory_id`) ON DELETE RESTRICT ON UPDATE CASCADE           |
| ALTER TABLE `sakila`.`rental` ADD CONSTRAINT `fk_rental_staff` FOREIGN KEY (`staff_id`) REFERENCES `sakila`.`staff` (`staff_id`) ON DELETE RESTRICT ON UPDATE CASCADE                           |
| ALTER TABLE `sakila`.`staff` ADD CONSTRAINT `fk_staff_address` FOREIGN KEY (`address_id`) REFERENCES `sakila`.`address` (`address_id`) ON DELETE RESTRICT ON UPDATE CASCADE                     |
| ALTER TABLE `sakila`.`staff` ADD CONSTRAINT `fk_staff_store` FOREIGN KEY (`store_id`) REFERENCES `sakila`.`store` (`store_id`) ON DELETE RESTRICT ON UPDATE CASCADE                             |
| ALTER TABLE `sakila`.`store` ADD CONSTRAINT `fk_store_address` FOREIGN KEY (`address_id`) REFERENCES `sakila`.`address` (`address_id`) ON DELETE RESTRICT ON UPDATE CASCADE                     |
| ALTER TABLE `sakila`.`store` ADD CONSTRAINT `fk_store_staff` FOREIGN KEY (`manager_staff_id`) REFERENCES `sakila`.`staff` (`staff_id`) ON DELETE RESTRICT ON UPDATE CASCADE                     |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Most of the foreign key constraints use RESTRICT for DELETE (meaning you are not allowed to delete a parent row when children exist), and CASCADE for UPDATE (meaning changes to parent will propagate to children). This is good, since I want to test behavior of both RESTRICT and CASCADE.

OK, we wish to remove these constraints from the slave. To see what we are going to do, consider:

slave1> select drop_statement from common_schema.sql_foreign_keys where table_schema='sakila';
+-----------------------------------------------------------------------------------+
| drop_statement                                                                    |
+-----------------------------------------------------------------------------------+
| ALTER TABLE `sakila`.`address` DROP FOREIGN KEY `fk_address_city`                 |
| ALTER TABLE `sakila`.`city` DROP FOREIGN KEY `fk_city_country`                    |
| ALTER TABLE `sakila`.`customer` DROP FOREIGN KEY `fk_customer_address`            |
| ALTER TABLE `sakila`.`customer` DROP FOREIGN KEY `fk_customer_store`              |
| ALTER TABLE `sakila`.`film` DROP FOREIGN KEY `fk_film_language`                   |
| ALTER TABLE `sakila`.`film` DROP FOREIGN KEY `fk_film_language_original`          |
| ALTER TABLE `sakila`.`film_actor` DROP FOREIGN KEY `fk_film_actor_actor`          |
| ALTER TABLE `sakila`.`film_actor` DROP FOREIGN KEY `fk_film_actor_film`           |
| ALTER TABLE `sakila`.`film_category` DROP FOREIGN KEY `fk_film_category_category` |
| ALTER TABLE `sakila`.`film_category` DROP FOREIGN KEY `fk_film_category_film`     |
| ALTER TABLE `sakila`.`inventory` DROP FOREIGN KEY `fk_inventory_film`             |
| ALTER TABLE `sakila`.`inventory` DROP FOREIGN KEY `fk_inventory_store`            |
| ALTER TABLE `sakila`.`payment` DROP FOREIGN KEY `fk_payment_customer`             |
| ALTER TABLE `sakila`.`payment` DROP FOREIGN KEY `fk_payment_rental`               |
| ALTER TABLE `sakila`.`payment` DROP FOREIGN KEY `fk_payment_staff`                |
| ALTER TABLE `sakila`.`rental` DROP FOREIGN KEY `fk_rental_customer`               |
| ALTER TABLE `sakila`.`rental` DROP FOREIGN KEY `fk_rental_inventory`              |
| ALTER TABLE `sakila`.`rental` DROP FOREIGN KEY `fk_rental_staff`                  |
| ALTER TABLE `sakila`.`staff` DROP FOREIGN KEY `fk_staff_address`                  |
| ALTER TABLE `sakila`.`staff` DROP FOREIGN KEY `fk_staff_store`                    |
| ALTER TABLE `sakila`.`store` DROP FOREIGN KEY `fk_store_address`                  |
| ALTER TABLE `sakila`.`store` DROP FOREIGN KEY `fk_store_staff`                    |
+-----------------------------------------------------------------------------------+

To actually make the DROP, we use common_schema's eval():

slave1> call common_schema.eval("select drop_statement from common_schema.sql_foreign_keys where table_schema='sakila'");

eval() is a handy routine which invokes statements generated by the given query.

This concludes the setup part.

Tests will include:

  1. Attempting to delete a parent row
  2. Attempting to add an invalid child row
  3. Attempting to update parent row

I was thinking there would be a difference between the two binary log file formats: STATEMENT and ROW. But the tests I produced showed no difference.

Tests

Attempting to delete parent row:

master> delete from actor where actor_id=1;
ERROR 1451 (23000): Cannot delete or update a parent row: a foreign key constraint fails (`sakila`.`film_actor`, CONSTRAINT `fk_film_actor_actor` FOREIGN KEY (`actor_id`) REFERENCES `actor` (`actor_id`) ON UPDATE CASCADE)

slave1> select * from actor where actor_id=1;
+----------+------------+-----------+---------------------+
| actor_id | first_name | last_name | last_update         |
+----------+------------+-----------+---------------------+
|        1 | PENELOPE   | GUINESS   | 2006-02-15 04:34:33 |
+----------+------------+-----------+---------------------+

Good: the master refused the DELETE, and no DELETE occurred on slave. Integrity is intact.

Attempting to add an invalid child row:

master> insert into film_actor (actor_id, film_id, last_update) values (9999, 1, NOW());
ERROR 1452 (23000): Cannot add or update a child row: a foreign key constraint fails (`sakila`.`film_actor`, CONSTRAINT `fk_film_actor_actor` FOREIGN KEY (`actor_id`) REFERENCES `actor` (`actor_id`) ON UPDATE CASCADE)

slave> select * from film_actor where actor_id=9999;
Empty set (0.00 sec)

Integrity is still intact.

Attempting to update parent row: there is nothing invalid about this operation. I'm wondering whether changes are CASCADEd on slave as well as on master:

master> update actor set actor_id=999 where actor_id=199;

master> select count(*) from film_actor where actor_id=999;
+----------+
| count(*) |
+----------+
|       15 |
+----------+

The 999 value wasn't there before on the master, so this verifies the CASCADE works on master. As for slave:

slave> select count(*) from actor where actor_id=999;
+----------+
| count(*) |
+----------+
|        1 |
+----------+

slave> select count(*) from film_actor where actor_id=999;
+----------+
| count(*) |
+----------+
|        0 |
+----------+

Bummer! The actor's row was updated, but cascading did not work on slave.

This is actually documented. However, the documentation only relates to the issue of slave tables being MyISAM. The problem occurs even when the slave tables are InnoDB, and have no foreign key constraints.

Conclusion

My personal interest in the scenario is due to something I'm working on, I'll elaborate on a future post. People sometime hope to get rid of foreign keys, and might wonder whether replication performance would boost having constraints removed on slaves.

When slave does not enforce foreign keys, you cannot rely on integrity with cascading constraints. An ugly patch might be to use triggers so as to simulate their behavior. Performance wise this is very bad.


PlanetMySQL Voting: Vote UP / Vote DOWN

Why SQL_MODE is essential even when not perfect

Февраль 16th, 2012

In a recent rant on Why I think SQL_MODE is useless…, I wanted to counteract this statement with why we MUST all use SQL_MODE, even with the inherit flaws.

The fundamental principle of a database is to restore and retrieve data. When I can insert data into the database and then I select this data it is different, this is fundamentally wrong. This is a loss of essential data integrity, something a database should NEVER do.

SQL_MODE solves the problem of “silent truncation” in most instances, and produces an all important error. As pointed out, the SQL_MODE has several limitations, however the benefits do out way the risks. Quality control on source code can reduce the limitations, but no amount of coding can stop the CRUD that comes out of the database without some SQL_MODE settings.

I would ask two more important questions.

  1. How in the first place can such a critical feature of silent data truncation ever be permitted in MySQL? Who made that decision and why?
  2. When is the owner of MySQL codebase realize this is rather ridiculous and enforce essential minimual data integrity that can be obtain with options including STRICT_ALL_TABLES, NO_ZERO_DATE, NO_ZERO_IN_DATE and NO_ENGINE_SUBSTITUTION.

References

MySQL Idiosyncrasies That Bite

Июнь 28th, 2010

The following are my slides that I presented at ODTUG Kaleidoscope 2010. This presentation talks about the MySQL defaults including a non-transactional state, silent data truncations, date management, and transaction isolation options. These are all critical for data integrity and consistency. I also cover in-depth topics including SQL_MODE, character sets and collations.


PlanetMySQL Voting: Vote UP / Vote DOWN