Another migration from Movabletype to WordPress

I had to move a very old Movabletype site over to a WordPress install. This is the migration log, for future reference.

The original site was an MT 2.5 install from 2002/2003 that had been running ever since, literally untouched after a major server move in 2006, with new posts added daily and a grand total of 5000 posts, mostly medium- and long-form essays and fiction.

The site performed very poorly even under moderate load, both because of its outdated server and because MT’s code could not cope with a modern web environment. Comments were disabled but trackbacks were not, resulting in huge amounts of pingback/trackback spam. There were no plugins and no customizations other than the site’s own graphic template.

First, I refreshed my memories from 2005, when I migrated a similar site. Then I discovered this post by David B. Bitton, which shows a new way of importing data while preserving old permalinks and SEO. I followed David’s steps; here’s how:

Back up the database and files (cgi-bin and document root) and replicate the install on my laptop, so that I can set config/memory limits as I please. Log in to the local MT site (let’s call it “local1”), change the “blog config” parameters to reflect the local settings, then rebuild the site in order to visually check the installation and use it for reference (original style, color palette, exact content).

After that, clone “local1” into “local2” (mysqldump, new db, restore db, copy files, change mt.cfg settings from local1 to local2). “local1” will be the reference old installation; “local2” will be the actual upgrade installation.

The goal here is to upgrade to a Movabletype version where I can do an XML “backup” of the blog, as opposed to an MT “export” in text format. The key difference is that the XML backup keeps the entry ID of every post, which will be used later in the WordPress import. This feature was first introduced in MT 4.x (thanks to Mihai Bocsaru in the MT forums for the information). I decided to upgrade straight to 4.x without hopping through the intermediate upgrades (2.x, 3.x, 4.x) and it worked, but if you have a more complex configuration with comments, plugins and customizations you may prefer the long upgrade path.

Download MT 4.37 (the earliest 4.x available) from http://www.movabletype.org/downloads/archives/. Read the documentation on upgrades from the tarball’s docs/mtupgrade.html. The two key issues are that mt-db-pass.cgi is deprecated and mt.cfg has moved to mt-config.cgi.

Go to the “local2” cgi-bin directory, copy mt.cfg to mt-config.cgi, and add a DBPassword “yourdbpassword” statement. Then set execute permissions on mt-config.cgi. Copy the MT 4.37 files over the old installation. Go to localhost/cgi-bin/mt.cgi (or wherever you installed local2): an upgrade page will walk you through the upgrade process, with a nice progress bar and a helpful upgrade log. I had to check the apache2 error.log several times because of missing static files (javascript and such) that I forgot to move to the proper place.
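The config step above can be sketched in a few lines of Python. This is only a minimal sketch, assuming mt.cfg holds one "Directive value" pair per line and that mt-config.cgi takes the same directives plus DBPassword, as described above; the function name build_mt_config and the sample values are hypothetical.

```python
def build_mt_config(mt_cfg_text, db_password):
    """Return mt-config.cgi content: the old mt.cfg lines plus a DBPassword line."""
    lines = mt_cfg_text.splitlines()
    # Drop any existing DBPassword directive so it does not end up duplicated.
    kept = [ln for ln in lines if not ln.strip().lower().startswith("dbpassword")]
    kept.append(f"DBPassword {db_password}")
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    # Hypothetical mt.cfg content, for illustration only.
    old_cfg = "CGIPath http://localhost/cgi-bin/\nDatabase local2\n"
    print(build_mt_config(old_cfg, "yourdbpassword"))
```

The result still has to be saved as mt-config.cgi in the cgi-bin directory with the permissions mentioned above.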

When the upgrade is complete, log in to the new “local2” site and check that everything is fine. It is not, actually, because somewhere between MT 2.5 and 4.3 the character encoding switched from Latin1 to UTF-8 and all accented letters are garbled. Ignore this issue for the time being (it’s not well documented; MT is old software with a long, complicated proprietary/open history).

Go to Tools>Backup in your MT dashboard. Do an uncompressed undivided backup. The output is an XML file of all your blog contents. The original MT 2.5 database was 100 MB and the XML is just a little above that. Open the XML file with a text editor: in my case half of the contents were pingback/trackback spam, which I deleted; the final size was 50 MB.

(The pingback spam could have been deleted from the original database via SQL in the first place, but I was not familiar with the MT 2.5 database. Deleting spam from the db before upgrading would make the rest of the process faster.)
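Instead of deleting the spam by hand in a text editor, the same cleanup could be scripted. What follows is a minimal sketch, not what I actually ran: it assumes the spam lives in top-level elements of the backup whose name is tbping (an assumption, check your own file for the real element name), and the function name strip_elements is hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def strip_elements(xml_text, names=("tbping",)):
    """Remove top-level elements whose local name is in `names`.

    The default name is an assumption about the backup format.
    Returns the cleaned XML as a string and the number of removed elements.
    """
    root = ET.fromstring(xml_text)  # loads the whole document in memory
    removed = 0
    for child in list(root):  # copy the list: we mutate root while iterating
        if local_name(child.tag) in names:
            root.remove(child)
            removed += 1
    return ET.tostring(root, encoding="unicode"), removed
```

If the backup declares a default namespace, ElementTree writes it back with a generated prefix unless the namespace is registered first with ET.register_namespace. The whole file is parsed in memory, which should be acceptable for a backup of about 100 MB on a laptop.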

Then convert the character encoding of the XML backup file with iconv:
iconv -f ISO-8859-1 -t UTF-8 file > newfile
The converted file still contained illegal characters that had to be replaced with their UTF-8 equivalents or their XML escape entities, especially apostrophes in the entry titles. I used Firefox to check and validate the XML file every time. (A possible explanation for this character havoc is the accumulation of thousands of posts by dozens of authors over a decade: different word processors, operating systems, writing habits: nice mess!)
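One possible cause of the leftover characters (an assumption, I did not verify it on this site) is text typed as Windows-1252: its curly apostrophes and quotes sit in the byte range 0x80-0x9F, which ISO-8859-1 treats as control codes. Here is a minimal Python sketch of a conversion that takes this into account and also drops the control characters that XML 1.0 forbids; the function name clean_latin1 is hypothetical.

```python
def clean_latin1(raw):
    """Decode bytes as ISO-8859-1, repairing Windows-1252 punctuation."""
    out = []
    for ch in raw.decode("latin-1"):
        cp = ord(ch)
        if 0x80 <= cp <= 0x9F:
            # C1 control range: reinterpret the byte as Windows-1252 when defined.
            try:
                ch = bytes([cp]).decode("cp1252")
            except UnicodeDecodeError:
                ch = ""  # byte has no meaning in Windows-1252 either: drop it
        elif cp < 0x20 and ch not in "\t\n\r":
            ch = ""  # control character not allowed in XML 1.0
        out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    # 0xE9 is e-acute in Latin1, 0x92 is a curly apostrophe in Windows-1252.
    print(clean_latin1(b"caf\xe9 dell\x92anima"))
```

The result can then be written out with .encode("utf-8"). This does not escape stray & or < characters in the titles; those still need a validation pass like the one I did with Firefox.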

From here on, follow David B. Bitton’s post. I made a local WordPress installation, added the Movable Type Backup Importer plugin and added this just after line 407 of class-mt-backup-import.php:
$post->import_id = $id;

Import the XML file and wait: it takes some time for 5000 posts.

Then set the permalink structure; I use the year/month/day/post-title style and slightly different rewrite rules. The key is to redirect incoming links that use the old permalink structure to WordPress’s default short (numeric) URLs, and to redirect the feed subscribers. Here’s the .htaccess (snippet):


RewriteEngine On
# Old MT permalinks and archives -> WordPress URLs (literal dots escaped)
RewriteRule ^archives/[0-9]{4}/[0-9]{2}/0*(\d+)\.html$ /?p=$1 [R=301,NC,L]
RewriteRule ^archives/[0-9]{4}/[0-9]{2}/0*(\d+)print\.html$ /?p=$1 [R=301,NC,L]
RewriteRule ^archives/([0-9]{4})_([0-9]{2})\.html$ /$1/$2 [R=301,NC,L]
RewriteRule ^archives/cat_([a-z_]*)\.html$ /categorie/$1 [R=301,NC,L]
RewriteRule ^archives\.html$ / [R=301,NC,L]
# Old feed URLs -> WordPress feed
RewriteRule ^index\.rdf$ /feed [R=301,NC,L]
RewriteRule ^index\.rss$ /feed [R=301,NC,L]
RewriteRule ^index\.xml$ /feed [R=301,NC,L]
RewriteRule ^atom\.xml$ /feed [R=301,NC,L]

# BEGIN WordPress

Edit: I added a couple more rules, depending on the original permalink structure of the old site. I found them by checking the Apache logs for 404 Not Found errors.
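Rules like the ones in the snippet can be checked offline before touching the server. The following is a minimal Python sketch that mirrors the patterns with the re module (dots escaped, the three index feed rules merged into one); it only imitates the matching and is no substitute for testing against Apache itself. The function name redirect_target and the sample paths are hypothetical.

```python
import re

# (pattern, target) pairs mirroring the RewriteRule lines; \1 stands for $1.
RULES = [
    (r"^archives/[0-9]{4}/[0-9]{2}/0*(\d+)\.html$", r"/?p=\1"),
    (r"^archives/[0-9]{4}/[0-9]{2}/0*(\d+)print\.html$", r"/?p=\1"),
    (r"^archives/([0-9]{4})_([0-9]{2})\.html$", r"/\1/\2"),
    (r"^archives/cat_([a-z_]*)\.html$", r"/categorie/\1"),
    (r"^archives\.html$", "/"),
    (r"^index\.(?:rdf|rss|xml)$", "/feed"),
    (r"^atom\.xml$", "/feed"),
]


def redirect_target(path):
    """Return the redirect target for an old URL path, or None if no rule matches."""
    path = path.lstrip("/")  # .htaccess rules see the path without the leading slash
    for pattern, target in RULES:
        match = re.match(pattern, path, flags=re.IGNORECASE)  # NC flag
        if match:
            return match.expand(target)
    return None


if __name__ == "__main__":
    for old in ("/archives/2004/05/000123.html", "/index.rdf", "/about.html"):
        print(old, "->", redirect_target(old))
```

Feeding it the 404 paths found in the logs is a quick way to see which old URLs still need a rule.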

Now I have a working localhost site with all the content of the old site. Check that nothing is missing and verify the users/passwords. Check the old MT media directory (usually /archives) and delete all .xml files (mostly trackback spam), .html files (single posts, category archives, date archives) and date directories.
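Before deleting anything from the old archives directory, a dry run that only lists the candidates is safer. A minimal Python sketch, assuming that everything ending in .xml or .html is stale and everything else (images, documents) must stay; the function name select_stale and the sample paths are hypothetical.

```python
from pathlib import PurePosixPath

# Extensions of the files MT generated and WordPress no longer needs.
STALE_SUFFIXES = {".xml", ".html"}


def select_stale(paths):
    """Return the paths whose extension marks them as old MT output."""
    return [p for p in paths if PurePosixPath(p).suffix.lower() in STALE_SUFFIXES]


if __name__ == "__main__":
    # Hypothetical directory listing, for illustration only.
    listing = ["2004/05/000123.html", "images/logo.png", "000123.xml"]
    for path in select_stale(listing):
        print("would delete:", path)
```

The listing can come from pathlib.Path("archives").rglob("*") on the real directory; the date directories that end up empty still have to be removed separately.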

Finally, move it to a production server (I prefer editing a database dump for this purpose) and enjoy. Keep an eye on the web server logs for 404s and adjust the .htaccess rules as needed.
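Watching the logs for 404s is easier with a small script that counts the missing paths. This is a minimal Python sketch, assuming the access log uses Apache's common or combined format (request line in double quotes followed by the status code); the function name count_404s and the sample log lines are hypothetical.

```python
import re
from collections import Counter

# Matches: "METHOD /path PROTOCOL" STATUS  (as in common/combined log format)
REQUEST_RE = re.compile(r'"(?:GET|HEAD|POST) (\S+) [^"]*" (\d{3}) ')


def count_404s(log_lines):
    """Count how many times each requested path returned a 404."""
    hits = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match and match.group(2) == "404":
            hits[match.group(1)] += 1
    return hits


if __name__ == "__main__":
    # Hypothetical log lines, for illustration only.
    sample = [
        '1.2.3.4 - - [10/Oct/2012:13:55:36 +0200] "GET /archives/000123.html HTTP/1.1" 404 512 "-" "Mozilla"',
        '1.2.3.4 - - [10/Oct/2012:13:55:40 +0200] "GET /feed HTTP/1.1" 200 2048 "-" "Mozilla"',
    ]
    for path, count in count_404s(sample).most_common():
        print(count, path)
```

On the real server the lines would come from open() on the access log; the most frequent paths are the ones that deserve a new rewrite rule first.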
