Deleting Duplicate Posts in WordPress
Mar 15th, 2011 | By admin | Category: Internet, Technology
I ran into a little problem today: one of the programs I wrote to create blog posts was creating duplicates during one step of the process. I haven’t figured out why yet, but in the meantime I have used it to create and import about 25,000 blog posts, and the rub is that about 30% of those are duplicates. So my dilemma was whether to delay the launch of a network of sites by spending another few days recreating, re-importing and setting up all these posts again, or to leave the script as is, duplication bug and all, and just find some way to remove the dupes. Luckily I managed to make the latter option work, and I will debug the script after I have a few dozen sites up and running.
I’ve tried both major duplicate post remover plugins and they are both crap. Neither works, and both are broken in different ways. I’m not sure if it’s because of the large volume of posts or because they are simply buggy, but either way they are useless to me.
So if you find yourself with a ton of duplicate posts in WordPress and want what is probably the quickest, most efficient way to get rid of them, just log onto your server, open MySQL, select the database in question and execute the following query, which treats rows with an identical post_title as duplicates and deletes all but the one with the lowest id:
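-- delete every row whose title also exists on a row with a lower id
delete bad_rows.*
from wp_posts as bad_rows
inner join (
-- for each title that occurs more than once, find the lowest id (the row to keep)
select post_title, MIN(id) as min_id
from wp_posts
group by post_title
having count(*) > 1
) as good_rows on good_rows.post_title = bad_rows.post_title
and good_rows.min_id <> bad_rows.id;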
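Since a delete like this can’t be undone, it may be worth previewing what it would hit first. The sketch below reuses the same join as a plain select, so it changes nothing. It assumes the default wp_ table prefix, and the post_type and post_status columns are shown only because matching on post_title alone can also catch things like revisions or drafts that happen to share a title. The limit of 50 and the kept_id alias are arbitrary choices of mine, not part of the original solution.

```sql
-- Dry run: list rows the delete would remove, without modifying anything.
-- Assumes the default wp_ table prefix; adjust if your install uses another.
select bad_rows.id,
       bad_rows.post_title,
       bad_rows.post_type,
       bad_rows.post_status,
       good_rows.min_id as kept_id   -- the row that would survive
from wp_posts as bad_rows
inner join (
    -- lowest id for every title that appears more than once
    select post_title, MIN(id) as min_id
    from wp_posts
    group by post_title
    having count(*) > 1
) as good_rows on good_rows.post_title = bad_rows.post_title
and good_rows.min_id <> bad_rows.id
order by bad_rows.post_title, bad_rows.id
limit 50;                            -- arbitrary sample size
```

If the sample looks like genuine duplicates, go ahead with the delete. If it is pulling in rows you want to keep, adding a condition such as post_type = 'post' to both the inner and outer query is one way to narrow it down.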
Now give it a while: my quad-core Xeon took more than a few minutes to go through 7,500 posts and remove the dupes, so grab a cup of joe and come back to find all of your problems magically gone 🙂 I can confirm this works with WordPress 3.1.
A big thanks to pf69.com over at the WordPress forums, who originally posted this solution here: http://wordpress.org/support/topic/plugin-to-remove-duplicate-posts