Migrate posts from WordPress to Hugo for blogdown
Once I found out how to build websites in R with the blogdown package, I decided to migrate another blog of mine from WordPress to Hugo.
Here are the steps I followed to migrate the old posts.
I used a MacBook with Python installed.
1) Export XML from WordPress
Go to WordPress Admin -> Tools -> Export and select Posts.
2) In Terminal:
- cd to the directory where you want to save the files
- git clone https://github.com/thomasf/exitwp.git
- sudo easy_install pip
- sudo pip install pyyaml
- sudo pip install beautifulsoup4
- sudo pip install html2text
In the directory created by the exitwp git clone:
- Move the WordPress export XML into the wordpress-xml directory.
Then, back in Terminal:
- Run xmllint YOUR_WORDPRESS_EXPORT.xml to check that the export is well-formed XML, and fix any errors it reports.
- From the exitwp folder, run python2 exitwp.py
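The xmllint check above can be made a little quieter. The following is a minimal sketch, not part of exitwp: check_export is a hypothetical helper name, and it relies on xmllint's --noout option so that only parse errors are printed instead of the whole document.

```shell
# check_export: hypothetical helper that reports whether a WordPress
# export file is well-formed XML. The file path is passed as $1.
check_export() {
    # Fail early with a distinct status if xmllint is not installed
    if ! command -v xmllint >/dev/null 2>&1; then
        echo "xmllint not found" >&2
        return 2
    fi
    # --noout suppresses the parsed document, so only errors are shown
    if xmllint --noout "$1"; then
        echo "well-formed: $1"
    else
        echo "not well-formed: $1" >&2
        return 1
    fi
}
```

It could be used from the exitwp folder as check_export wordpress-xml/YOUR_WORDPRESS_EXPORT.xml && python2 exitwp.py, so the conversion only starts when the export parses cleanly.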
3) Clean the markdown files created in the previous step
The previous step creates a folder containing one markdown file for each old post.
I then needed to adapt those posts for the new Hugo blog, using the R snippets below.
Change the file extension from .markdown to .md
folder_path <- "~/Desktop/posts"
files = list.files(folder_path, full.names = TRUE)
file.rename(files, sub('[.]markdown$', '.md', files))
Remove the date at the start of each file name
files = list.files(folder_path, full.names = TRUE)
file.rename(files, paste0(folder_path, "/", substring(files, 40))) # 40 is the position of the first character to keep; adjust it to the length of your folder path
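Because the character offset depends on the length of the folder path, the same rename can also be done from Terminal by matching the date itself. This is only a sketch under an assumption: it supposes the file names start with a YYYY-MM-DD- prefix, and strip_date_prefix is a hypothetical helper name.

```shell
# strip_date_prefix: hypothetical helper; renames the files in the folder
# given as $1, dropping a leading YYYY-MM-DD- (assumed naming scheme).
strip_date_prefix() {
    dir=$1
    for path in "$dir"/*; do
        [ -f "$path" ] || continue      # skip sub-folders and empty globs
        name=$(basename "$path")
        # Remove the date prefix if present; otherwise $new equals $name
        new=${name#[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]-}
        [ "$new" = "$name" ] && continue
        if [ -e "$dir/$new" ]; then
            # Two posts with the same title would collide; leave both alone
            echo "skipping $name: $new already exists" >&2
            continue
        fi
        mv "$path" "$dir/$new"
    done
}
```

For example, strip_date_prefix ~/Desktop/posts would leave files without a date prefix untouched and would not overwrite an existing file.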
Remove and replace text in the YAML of each post
This step depends on your own posts. For mine, I only needed to make a few replacements in each file.
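files = list.files(folder_path, full.names = TRUE)
for (i in seq_along(files)) {
  tx <- readLines(files[i])
  # replace the exported layout key with the type key used in the new blog
  tx <- gsub("layout: post", "type: post", x = tx)
  # blank out lines that start with "[Amazon" and end with "NoScript)", with or without two trailing tabs
  tx <- gsub("^\\[Amazon.*NoScript\\)\\t\\t$", "", tx)
  tx <- gsub("^\\[Amazon.*NoScript\\)$", "", tx)
  writeLines(tx, con = files[i])
}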
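After the replacements, a quick check from Terminal can confirm that no file was missed. This is a minimal sketch with a hypothetical helper name, files_with_old_layout; it assumes the posts already have the .md extension and simply lists the files that still contain the text "layout: post" anywhere, including in the body.

```shell
# files_with_old_layout: hypothetical helper; prints the .md files in the
# folder given as $1 that still contain the text "layout: post".
files_with_old_layout() {
    # -l prints only file names, -F treats the pattern as a fixed string
    grep -lF 'layout: post' "$1"/*.md 2>/dev/null
}
```

Running files_with_old_layout ~/Desktop/posts should print nothing once every post has been updated.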
4) Final manual step
Copy the markdown files created above into the new blogdown project in the folder content/post.
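The copy can also be done from Terminal. The sketch below uses a hypothetical helper, copy_posts, and both of its arguments are example paths: the folder with the cleaned posts and the root folder of the blogdown project. It skips files that already exist in content/post rather than overwriting them.

```shell
# copy_posts: hypothetical helper; copies the .md files from the folder
# given as $1 into content/post of the blogdown project rooted at $2.
copy_posts() {
    src=$1
    dest=$2/content/post
    mkdir -p "$dest"                    # create content/post if needed
    for f in "$src"/*.md; do
        [ -f "$f" ] || continue         # nothing to copy
        target=$dest/$(basename "$f")
        if [ -e "$target" ]; then
            # Do not overwrite a post that is already in the project
            echo "skipping existing $target" >&2
        else
            cp "$f" "$target"
        fi
    done
}
```

For example: copy_posts ~/Desktop/posts ~/path/to/my-blogdown-site, where the second path is a placeholder for your own project folder.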