21 May, 2010
The Blogger to WordPress move from hell!
Posted by: Jennifer In: WordPress|WordPress: Lessons Learned
One of the projects I've been working on recently is to move a client's Blogger blog into WordPress. The blog is a few years old, but it has an ENORMOUS amount of comments. Doing the standard import from within WordPress wasn't bringing over all the posts and it barely scratched the surface of bringing over the comments.
I found this post on ArtLung: Migrating an old Blogger Blog To WordPress – which was a HUGE help. The gist of that was you need to download an export file from Blogger, then get this script you run from the terminal (on Mac) which will convert the Blogger xml into a xml that WordPress will understand. (Just to give you an idea what we're talking about – the Blogger XML file I downloaded was over 40MB.)
THEN the fun began. After the conversion to WordPress XML – the file was still about 30MB. This meant that I had to run the import over a dozen times before it could chew through the whole file. When the import times out and you run it again, it recognizes that some posts are already there, so it doesn't import them a second time – but for some reason, it seemed to have trouble with a bunch of comments. Some of the comments DID get imported more than once. As well – the comment counts for each post weren't getting updated. SO! Here's the fix I pieced together for those issues.
First, to remove the duplicate comments – in PHPMyAdmin – I ran the following SQL command: (*DISCLAIMER: make sure you backup your database before attempting any of this!!)
CREATE TABLE temp_table AS SELECT * FROM wp_comments WHERE 1 GROUP BY `comment_post_ID`,`comment_content`,`comment_date`;
What that does is create a duplicate of the wp_comments table – but it's only grabbing the unique entries. If a comment is posted to the same post, has the same content and the same date – we'll just be grabbing one of those and ignoring the rest.
UPDATED TO ADD: One important thing I need to add in here is that you need to set the primary keys and a few indexes for this new table – This is the structure of a normal wp_comments table:
CREATE TABLE `wp_comments` (
`comment_ID` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`comment_post_ID` bigint(20) unsigned NOT NULL DEFAULT '0',
`comment_author` tinytext NOT NULL,
`comment_author_email` varchar(100) NOT NULL DEFAULT '',
`comment_author_url` varchar(200) NOT NULL DEFAULT '',
`comment_author_IP` varchar(100) NOT NULL DEFAULT '',
`comment_date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`comment_date_gmt` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`comment_content` text NOT NULL,
`comment_karma` int(11) NOT NULL DEFAULT '0',
`comment_approved` varchar(20) NOT NULL DEFAULT '1',
`comment_agent` varchar(255) NOT NULL DEFAULT '',
`comment_type` varchar(20) NOT NULL DEFAULT '',
`comment_parent` bigint(20) unsigned NOT NULL DEFAULT '0',
`user_id` bigint(20) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`comment_ID`),
KEY `comment_approved` (`comment_approved`),
KEY `comment_post_ID` (`comment_post_ID`),
KEY `comment_approved_date_gmt` (`comment_approved`,`comment_date_gmt`),
KEY `comment_date_gmt` (`comment_date_gmt`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=2 ;
THEN, I ran this SQL command;
DROP TABLE wp_comments;
RENAME TABLE temp_table TO wp_comments;
What that does is delete the wp_comments table, then we rename our new and improved "temp_table" so that it is now the official wp_comments table.
NEXT – we want to update the comment counts. To save having to recount the comments each time – there is a field in the wp_posts table that stores the total number of comments. However, with the way this import worked, those numbers did not get updated so most were saying there were 0 comments on posts that did have comments. To fix that issue I ran the following SQL command: (which I found here on Planet MySQL)
UPDATE wp_posts wpp
LEFT JOIN
(SELECT comment_post_id AS c_post_id, count(*) AS cnt FROM wp_comments
WHERE comment_approved = 1 GROUP BY comment_post_id) wpc
ON wpp.id=wpc.c_post_id
SET wpp.comment_count=wpc.cnt
WHERE wpp.post_type IN ('post', 'page')
AND (wpp.comment_count!=wpc.cnt OR (wpp.comment_count != 0 AND wpc.cnt IS NULL));
Which as far as I can tell gets the comment counts for each post in wp_posts. (I can't tell you more than that because once you start getting into the "LEFT JOIN" stuff in sql, my brain shuts off)
The other piece advice I'll give is if you have a ginormous blog that you have to move like the one I was working with – set the blog up locally using MAMP, do the imports, all the database correcting stuff – then just export the database and import it onto your server. The database file ended up only being 14MB uncompressed – way better than trying to chew through a 30MB import file. Once the database is up on your server, a few more SQL commands (just like if you're moving your blog to a new domain) to change the local version (probably http://localhost:8888 ) to your actual domain name, and you're done!
Another tidbit to add: if you need to redirect blogger formatted url (that may include ".html" in the end) – add this to your htaccess file and it should redirect to the new (non extension ending) URL:
RewriteEngine On
RewriteRule (.*)\.html $1 [R=301,L]
(I tooke off the first line and put the second line under the "RewriteEngine On" that WordPress has in it's htaccess block and it worked great)