Migrating from phpBB to Google Groups

For many years I’ve run a tiny web site for the village we live and work in. 8 years ago (or maybe more) I added a forum to the site using phpBB, as they say about themselves ‘THE #1 FREE, OPEN SOURCE BULLETIN BOARD SOFTWARE’.

It’s been very good software, regularly updated and very easy to maintain. However, the most interaction I have with the forum now is blocking spam registrations and migrating it to new servers every couple of years. There are only a couple of posts a year now, so I wanted to find a way of reducing my administration workload.

I decided to migrate it to a “google groups” group. Which is just like a forum with less customisation options. I couldn’t find any guides to migrate away from phpBB so I worked out my own method and here’s how I did it, in case you’re trying to do the same.

Steps in short form:
1) Get data from phpBB database tables as CSV file
2) Write script to process CSV file into multiple emails to the group

1) Get data from phpBB database tables as CSV file
I only needed to migrate each topic and all it’s replies. None of the other database content was important to me.
To do this, I wrote a SQL query:

SELECT po.post_subject, po.post_text, po.post_id, po.topic_id, po.post_time, us.username_clean, top.topic_title, top.topic_time
FROM phpbb_users as us, phpbb_posts as po, phpbb_topics as top
WHERE us.user_id = po.poster_id and po.topic_id = top.topic_id
ORDER BY po.topic_id ASC, post_time ASC

Essentially, this takes selected columns from the tables ‘phpbb_users’, ‘phpbb_posts’ and ‘phpbb_topics’. I’m not sure using ‘WHERE’ is very efficient and perhaps ‘INNER JOIN’/’OUTER JOIN’ would be technically better, but mine was a small database and this was more than fast enough for me (58ms for 114 rows).

Then, I saved the result as a CSV file. Opened it in LibreOffice to check. Several of the fields needed some hand editing, remove first line (headers), replacing some html characters, escaping speech marks, etc. I may have been able to fix those when saving the result of the query as CSV but I didn’t have many to do, so hand fix and move on was fastest.

2) Write script to process CSV file into multiple emails to the group

My script language of choice is ruby. Not because it’s any better than anything else, just what I happen to be using lately. I could have done the same in PHP if I spent a little more time on it.

This is the script:

# I saved file as: process.rb
# to run, "ruby process.rb" ... assuming you have ruby installed ;-)
# I had to install Pony from github, which i did using the specific install gem
# gem install specific_install
# gem specific_install -l https://github.com/benprew/pony
# If you're reading this later and forget where it came from,
# https://www.steveroot.co.uk/2015/11/migrating-from-phpbb-to-google-groups/
# Share any tips and fixes in the comments there to help others please!

require 'csv'
require 'date'
require 'Pony'

#initialise the topic counters
#some default text for the first email
#you will need to delete this manually in the google groups!
currenttopic = 0
lasttopic = 0
body = "initialise"
subject = "initialise"

CSV.foreach('phpbb_data.csv') do |row|

#get current topic
currenttopic = row[3]

if currenttopic == lasttopic
#This is a reply to the topic, add to the existing body
body = body+""+"n"
body = body+"-----------------------------------------------------"+"n"
body = body+"reply_by_username: "+row[5]+"n"
body = body+"reply_date: "+DateTime.strptime(row[7],'%s').strftime("%d/%^b/%Y")+"n"
body = body+""+"n"
body = body+row[1]+"n"
#This is a new topic. SEND the last group of messages
:to => 'YOUR-FORUM-NAME@googlegroups.com',
:subject => subject,
:via => :smtp,
:body => body,
:via_options => {
:address => 'smtp.gmail.com',
:port => '587',
:enable_starttls_auto => true,
:user_name => 'YOUR-EMAIL-ADDRESS',
:password => 'YOUR-PASSWORD',
:authentication => :plain, # :plain, :login, :cram_md5, no auth by default
:domain => "YOUR-SENDING-DOMAIN" # the HELO domain provided by the client to the server

#A message to terminal on every send, nice to know that something is happening!
puts "Sent "+subject

#Reset the body (subject is set only once, no need to clear)
body = ""
#Set subject, create standard header text and set subject for email.

#Set the subject as the topic name
subject = row[6]

#Put some generic header text in place
body = body+"-----------------------------------------------------"+"n"
body = body+"This post was transfered to the google group when the phpbb based forum was shutdown"+"n"
body = body+"You might find relevant information at YOUR-DOMAIN"+"n"
body = body+"This entry includes all replies to the original topic"+"n"
body = body+"-----------------------------------------------------"+"n"
body = body+""+"n"

body = body+"Topic: "+row[6]+"n"

body = body+"created_by_username: "+row[5]+"n"
body = body+"topic_date: "+DateTime.strptime(row[7],'%s').strftime("%d/%^b/%Y")+"n"
body = body+""+"n"
body = body+row[1]+"n"
#set the value of last topic ready for the next loop.
lasttopic = currenttopic


# These are the fields in order in the CSV. Here for easy reference whilst I coded
# numbers start from zero (so post_subject = row[0])
# "post_subject", "post_text", "post_id", "topic_id", "post_time", "username_clean", "topic_title", "topic_time"

Being very lazy, I didn’t write the code to understand the first pass should *NOT* be emailed to the group, so the first email to the group titled ‘initialise’ will need to be deleted manually.

You will need to enter your own values for: Forum name, your email address, your sending domain. You’ll need a password, but be aware that if you use 2 factor authentication you’ll need to get an app specific password from your apps account.

You will want to customise the text that is added to every email, perhaps correct the spelling of ‘transfered’ too 😉

The script isn’t particularly fast as it connects and sends each email individually. We use google apps and as there weren’t many topics to send it was well within my daily limit of gmail usage. However, if it was higher then I could have sent them directly via smtp. There are instructions for using different email methods on the ‘Pony’ github pages. The other problem I had was errors in the CSV causing the script to stop. For example some replies had no topic name and that made the script error when it encountered them. For me, I had fixed the CSV, deleted the posts already made to the forum, and run the whole script again. For others, you might like to set up a dummy group to send your messages too first to make sure everything works, then delete the dummy group and re-run the script to send messages to the new group.

To test the email messages, I suggest you take a few rows of your CSV file and send them to your own email to check formatting and content.

If you’re wondering what my results looked like, here’s one of the topics with a reply once posted to the google group