Monday, September 14, 2009

Delete duplicates in a MYSQL database

* Create a new table identical in structure to the first, but with no data.
* Use the query: INSERT INTO `name_of_new_table` SELECT DISTINCT * FROM `name_of_old_table`
* Drop the old table
* Rename the new table to whatever the old table was called.

This has the advantage of being pretty fast, and only requiring 4 queries, no matter how many rows are duplicated. On the other hand though, if there are any database queries on the old table between the drop and the rename, they will fail (since the table won't exist during that period). That being said, the period for which this is a problem should be pretty short.