Importing 1000’s of posts, publishing 1000’s of drafts and mysql funMar 10th, 2011 | By admin | Category: Internet, Technology
Alright so for one of my side businesses I am using wordpress as a platform to create large directories for businesses to use to find other businesses. It’s like a big ass yellow pages but for specific businesses. Anyways part of the fun has been figuring out how to get 5000-6000 listings per city into a wordpress blog as painlessly as possible and it has been anything but painless figuring it out. So through my blood sweat and tears I have learned a few things and figured I’d share 3 or 4 of them to make someone elses life a little less dreadful.
*PS: You need to be root on your server for this to work, if you aren’t don’t waste your time cause none of this will work*
*PPS: Read this article all the way through before doing anything or you’re going to spend time redoing shit*
1.) Importing a shit-ton of posts
So after you’ve put together the 1000’s of posts you need to get it into wordpress. A decent tool for the job is CVS-importer [Found HERE]. I have it importing upwards of 6000 posts in under 5 minutes, but only after I’ve tweaked mysql some. If you are running on a share server where you don’t have root access to mysql you’re going to have a hell of a time getting this to work with large numbers of posts. Under default mysql configurations I’ve gotten it to import 400 posts in a go but anything over that and it started to peter out. In section 3 I’ll go over some mysql tweaks.
Now the one key element to this working out is that you have to have CVS-Importer publish all of your posts as DRAFT. Make bloody well sure that is checked or else the thing will croak on you.. For whatever reason I can import 2000 posts in 150 seconds as drafts but doing 500 as published posts hangs the system. No clue why, but that’s the rub on that
2.) Publishing 1000’s of draft posts at once
So at first glance this looks to be a pretty f’ing daunting task. You’ve got 5000 posts sitting as drafts and publishing them 10 or 20 at a time is giong to take forever. Well it still will take a while but a hell of a lot less than you think.
Login to your box, go into mysql, use the wordpress database your are importing all these posts into and do the following
UPDATE wp_posts SET post_status = 'publish' WHERE post_status != 'static';
replace PUBLISH with with either draft, private, or publish and viola.
Alternatively if you want to just do this on a per author basis do the following from inside mysql:
UPDATE wp_posts SET post_status='publish' WHERE post_author='NUMBER';
Again replacing PUBLISH and also replacing NUMBER with the ID # of whatever user you are importing these posts under.
Viola, 10000 posts published in less than 20 seconds 🙂
*Alternatively this sort of thing can be done from phpMyadmin if you dont’ want to go through shell, in which case I’d also recommend picking up some pampers at Wal-Mart right now as they are on sale :p *
A huge thanks to WPMU.org for the above tip, man that girl just saved me hours of work: http://wpmu.org/daily-tip-bulk-edit-of-the-status-field-for-all-posts-in-wordpress-database/
Something I figured out when first trying to import large numbers was that the default mysql install was not setup with this kind of thing in mind, so there’s a bit of tweaking you’re going to have to do to help it handle the amount of data coming in and I’m guessing also the way that this plugin imports things.
Mainly you need to increase key_buffer_size from 128 up to somewhere around 384M. I didn’t test it with a ton of posts at 256M but it did work for a couple thousand. The key here is to tweak it so it is using just enough memory to do what you want but not a drop more. For me with big imports of 6000-7000 posts 384M did the trick, but there are a couple of new blogs coming up with 20,000 + posts in them so chances are I might have to temporarily increase it for those and then probably scale it back down once that is done.
I am running FreeBSD and for me I have to edit my /etc/my.cnf and add the following line:
That should just about do it, was yet another big day of learning today so I thought I’d share a little bit as there isn’t a lot of resources out there for this sort of thing.. Also I’ll freely admit that wordpress probably isn’t the best platform to be doing this sort of thing on in general, if you can get away with a specific piece of software for directories you’d be better off. Hell I imagine joomla or drupal would probably be more efficient but for how I am structuring everything wordpress was the name of the game.