Royalty free music and a time lapse video for work

Here’s a great way to start 2016: win an award for “Best Domestic Bathroom Installer 2015”.
My brother David entered the Geberit Awards (Geberit being a large multinational manufacturer of bathroom products), and out of all the entries from across the UK, he won. We’re very proud :-)

That prompted us to finish editing a timelapse video of the winning bathroom. Rather than leave it a silent movie, I went searching for suitable music to accompany it and found the track “Pamgaea” by Kevin MacLeod. Best of all, the licence for this soundtrack was ‘Royalty Free’ as well as being free of cost, on condition it was clearly attributed to the author. That’s very much like the software code I’ve written and shared, although Kevin is a master of his craft, whereas I’m just an amateur coding for fun.

As well as free when attributed, the music can also be licensed for a fee when an attribution is not possible or wanted (example: background music while you’re on hold). In my mind I’d always thought licensing that type of music was expensive; it turns out to be a lot less than I expected.

Migrating from phpBB to Google Groups

For many years I’ve run a tiny web site for the village we live and work in. Eight years ago (or maybe more) I added a forum to the site using phpBB, which describes itself as ‘THE #1 FREE, OPEN SOURCE BULLETIN BOARD SOFTWARE’.

It’s been very good software, regularly updated and very easy to maintain. However, the main interaction I have with the forum now is blocking spam registrations and migrating it to a new server every couple of years. There are only a couple of posts a year now, so I wanted to find a way of reducing my administration workload.

I decided to migrate it to a Google Groups group, which is just like a forum with fewer customisation options. I couldn’t find any guides to migrating away from phpBB, so I worked out my own method. Here’s how I did it, in case you’re trying to do the same.

Steps in short form:
1) Get data from phpBB database tables as CSV file
2) Write script to process CSV file into multiple emails to the group

1) Get data from phpBB database tables as CSV file
I only needed to migrate each topic and all its replies. None of the other database content was important to me.
To do this, I wrote a SQL query:

SELECT po.post_subject, po.post_text, po.post_id, po.topic_id, po.post_time, us.username_clean, top.topic_title, top.topic_time
FROM phpbb_users as us, phpbb_posts as po, phpbb_topics as top
WHERE us.user_id = po.poster_id and po.topic_id = top.topic_id
ORDER BY po.topic_id ASC, post_time ASC

Essentially, this takes selected columns from the tables ‘phpbb_users’, ‘phpbb_posts’ and ‘phpbb_topics’. I’m not sure joining via ‘WHERE’ is very efficient, and explicit ‘INNER JOIN’s would perhaps be technically better, but mine was a small database and this was more than fast enough for me (58ms for 114 rows).
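If you prefer the explicit joins, the equivalent query should look something like this (a sketch; I only ever ran the WHERE form above):

SELECT po.post_subject, po.post_text, po.post_id, po.topic_id, po.post_time, us.username_clean, top.topic_title, top.topic_time
FROM phpbb_posts AS po
INNER JOIN phpbb_users AS us ON us.user_id = po.poster_id
INNER JOIN phpbb_topics AS top ON top.topic_id = po.topic_id
ORDER BY po.topic_id ASC, po.post_time ASC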

Then I saved the result as a CSV file and opened it in LibreOffice to check. Several of the fields needed some hand editing: removing the first line (headers), replacing some HTML character entities, escaping quote marks, etc. I may have been able to fix those when saving the result of the query as CSV, but I didn’t have many to do, so hand-fixing and moving on was fastest.

2) Write script to process CSV file into multiple emails to the group

My scripting language of choice is Ruby. Not because it’s any better than anything else, it’s just what I happen to be using lately. I could have done the same in PHP if I’d spent a little more time on it.

This is the script:


# I saved the file as: process.rb
# to run, "ruby process.rb" ... assuming you have ruby installed ;-)
# I had to install Pony from github, which I did using the specific_install gem:
# gem install specific_install
# gem specific_install -l https://github.com/benprew/pony
#
# If you're reading this later and forget where it came from,
# https://www.steveroot.co.uk/2015/11/migrating-from-phpbb-to-google-groups/
# Share any tips and fixes in the comments there to help others please!

require 'csv'
require 'date'
require 'pony'

# Send one topic (with all of its replies) to the group as a single email
def send_topic(subject, body)
  Pony.mail({
    :to => '[email protected]',
    :from => 'YOUR-EMAIL-ADDRESS',
    :subject => subject,
    :body => body,
    :via => :smtp,
    :via_options => {
      :address => 'smtp.gmail.com',
      :port => '587',
      :enable_starttls_auto => true,
      :user_name => 'YOUR-EMAIL-ADDRESS',
      :password => 'YOUR-PASSWORD',
      :authentication => :plain, # :plain, :login, :cram_md5, no auth by default
      :domain => "YOUR-SENDING-DOMAIN" # the HELO domain provided by the client to the server
    }
  })
  # A message to the terminal on every send, nice to know that something is happening!
  puts "Sent " + subject
end

# Initialise the topic counters and some default text for the first email.
# You will need to delete this first 'initialise' email manually in the google group!
currenttopic = 0
lasttopic = 0
body = "initialise"
subject = "initialise"

# Fields in order in the CSV (numbered from zero, so post_subject = row[0]):
# "post_subject", "post_text", "post_id", "topic_id", "post_time", "username_clean", "topic_title", "topic_time"
CSV.foreach('phpbb_data.csv') do |row|
  # get the current topic
  currenttopic = row[3]

  if currenttopic == lasttopic
    # This is a reply to the current topic, so add it to the existing body.
    # Note the reply date comes from row[4] (post_time), not row[7] (the topic's time).
    body = body + "\n"
    body = body + "-----------------------------------------------------" + "\n"
    body = body + "reply_by_username: " + row[5] + "\n"
    body = body + "reply_date: " + DateTime.strptime(row[4], '%s').strftime("%d/%^b/%Y") + "\n"
    body = body + "\n"
    body = body + row[1] + "\n"
  else
    # This is a new topic: SEND the last group of messages...
    send_topic(subject, body)

    # ...then set the subject to the new topic's name and reset the body
    subject = row[6]
    body = ""

    # Put some generic header text in place
    body = body + "-----------------------------------------------------" + "\n"
    body = body + "This post was transfered to the google group when the phpbb based forum was shutdown" + "\n"
    body = body + "You might find relevant information at YOUR-DOMAIN" + "\n"
    body = body + "This entry includes all replies to the original topic" + "\n"
    body = body + "-----------------------------------------------------" + "\n"
    body = body + "\n"
    body = body + "Topic: " + row[6] + "\n"
    body = body + "created_by_username: " + row[5] + "\n"
    body = body + "topic_date: " + DateTime.strptime(row[7], '%s').strftime("%d/%^b/%Y") + "\n"
    body = body + "\n"
    body = body + row[1] + "\n"
  end

  # set the value of lasttopic ready for the next loop
  lasttopic = currenttopic
end

# The loop above only sends when a new topic starts, so send the final topic too
send_topic(subject, body)

Being very lazy, I didn’t write the code to understand that the first pass should *NOT* be emailed to the group, so the first email to the group, titled ‘initialise’, will need to be deleted manually.

You will need to enter your own values for the group address, your email address and your sending domain. You’ll need a password too, but be aware that if you use two-factor authentication you’ll need to get an app-specific password from your Google account.

You will want to customise the text that is added to every email, perhaps correct the spelling of ‘transfered’ too 😉

The script isn’t particularly fast, as it connects and sends each email individually. We use Google Apps, and as there weren’t many topics to send it was well within my daily Gmail sending limit. However, if the volume had been higher I could have sent them directly via SMTP instead; there are instructions for the various email methods Pony supports on its GitHub pages.

The other problem I had was errors in the CSV causing the script to stop. For example, some replies had no topic name, and that made the script error when it encountered them. In my case I fixed the CSV, deleted the posts already made to the group, and ran the whole script again. You might prefer to set up a dummy group to send your messages to first to make sure everything works, then delete the dummy group and re-run the script against the real group.
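For instance, if Gmail’s limits were a problem, Pony can hand delivery to a local mail server instead of Google’s SMTP. A minimal sketch (assuming a working sendmail on the machine; see the Pony README for the full option list):

Pony.mail(
  :to => '[email protected]',
  :from => 'YOUR-EMAIL-ADDRESS',
  :subject => subject,
  :body => body,
  :via => :sendmail # uses the local sendmail binary instead of a remote SMTP server
)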

To test the email messages, I suggest you take a few rows of your CSV file and send them to your own email address to check the formatting and content.

If you’re wondering what my results looked like, here’s one of the topics, with a reply, once posted to the Google group.

Birthday Calculator – in case you don't want to wait a whole year to celebrate being alive

We have a tradition where I live. We celebrate being alive with a party, and that party generally coincides with being alive for another 31,557,600 seconds. 31,557,600 seconds happens to be just about equal to a solar year, which is a happy coincidence as it’s not so easy to remember otherwise.

I decided I could really do with a good excuse to party before that arbitrary unit of time though.  The solution? Write a web application where I can put in my date of birth and it will tell me other dates that I can celebrate on.

Try it for yourself at http://birthday.sroot.eu and it will tell you amazing things like:

  • How old you would be if you were born on Mercury, Venus, Mars and the other planets in our solar system
  • When your next MegaSecond birthday is (so you can have a party when you survive another 1 million seconds of existence)
  • Or, for a really big bash, celebrate the GigaSecond birthdays that come around very infrequently in a lifetime.
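The underlying arithmetic is just seconds-since-birth division. A minimal Ruby sketch of the MegaSecond and GigaSecond calculations (not the site’s actual code, and the date of birth here is made up; the planet ages work the same way using each planet’s orbital period):

require 'time'

birth = Time.parse('1975-06-01 09:30') # hypothetical date of birth
seconds_alive = Time.now - birth

# Next MegaSecond birthday: round up to the next whole million seconds
next_mega = (seconds_alive / 1_000_000).floor + 1
puts "MegaSecond birthday number #{next_mega}: #{birth + next_mega * 1_000_000}"

# GigaSecond birthdays (1,000,000,000 seconds) are roughly 31.7 years apart
next_giga = (seconds_alive / 1_000_000_000).floor + 1
puts "Next GigaSecond birthday: #{birth + next_giga * 1_000_000_000}"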

If you’d like me to add another arbitrary repeating unit of time, post a comment.

Virtual PDF Printer for our small office network – a step by step how to

Alternative title: How I got multiple cups-pdf printers on the same server. (I didn’t, but postprocessing let me work around the problem).

Preamble:

I have a small business. For years we’ve been creating PDFs from any computer on our network through a “virtual appliance” called YAFPC (“Yet Another Free PDF Composer”).

The appliance originally ran on an old PC, then on a server that ran several other virtual machines. It had a neat web interface and would allow PDF printers to be created that would appear on the network for all of our users. It had one printer for plain A4 paper, one for A4 paper with a letterhead background, another for an obscure use of mine, and so on. If you printed to it, it would email you the PDF (for any user, without any extra setup needed per user). It could also put the PDFs on one of our file servers or make them available from its built-in file server.

If I remember correctly it cost £30, and it ran from 2006 right through until today, November 2014. One of my best software investments!

However, Windows 8 came along and it no longer worked. Getting Windows 8 to print to it directly turned out to be impossible. The program was not going to be updated or replaced with a new version. I managed a short-term workaround by having Windows 8 print to a Samba printer queue, which converted and forwarded to the YAFPC virtual appliance. There were problems, page sizes not being exact and so on, but it worked in a fashion.

Roll forward to today, when I’ve just got a new network PDF virtual printer working. It wasn’t so easy to do (some 20 hours, I guess), so here are my setup notes for others to follow. The final run-through of these notes had it installed and working in about an hour.

These steps assume you know quite a bit about setting up Linux servers. Please feel free to use the comments to point out errors or corrections, or add more complete instructions, and I’ll edit this post with the updates. Also, please suggest alternative methods that you needed to use to meet your needs.

Overview – we are going to:

  • Create a new Ubuntu-based Linux server as a virtual machine
  • Install CUPS, the Common Unix Printing System
  • Install CUPS-PDF, an extension that allows files to be created from the print queue
  • Create a postprocessing script that will run every time CUPS-PDF is used, customising our PDFs and sending them where we want them (to our users); a sketch of that last step follows below

Sounds simple, right? :-)
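To give a flavour of the postprocessing step, here’s a minimal Ruby sketch that emails the finished PDF back to whoever printed it. Assumptions: CUPS-PDF passes the PDF’s path and the job owner as its first arguments (check the comments in your /etc/cups/cups-pdf.conf for the exact argument list on your version), and the owner-to-mailbox mapping below is made up:

#!/usr/bin/env ruby
# Sketch of a cups-pdf postprocessing hook: email the finished PDF to the
# user who printed it. The argument order is an assumption; verify it in
# /etc/cups/cups-pdf.conf before relying on it.
require 'pony'

pdf_path = ARGV[0] # full path of the PDF cups-pdf just wrote (assumed)
owner    = ARGV[1] # the printing user (assumed)

# Hypothetical mapping from unix user to mailbox; replace with your own lookup
recipient = "#{owner}@YOUR-DOMAIN"

Pony.mail(
  :to          => recipient,
  :from        => 'pdfprinter@YOUR-DOMAIN',
  :subject     => "Your PDF: #{File.basename(pdf_path)}",
  :body        => 'Attached is the document you printed to the PDF queue.',
  :attachments => { File.basename(pdf_path) => File.binread(pdf_path) },
  :via         => :smtp,
  :via_options => { :address => 'localhost', :port => '25' }
)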


sunspot solr slow in production (fixed by using IP address instead of domain name)

Short version:
————–
In my sunspot.yml I used a FQDN (solr.rkbb.co.uk): Solr was slow.
When I used the server IP (10.18.6.224): Solr was fast.

Setting the scene (you can skip this bit):
——————————————-
I’ve been slowly working on some improvements to our business system at work. Whilst most of it currently runs on MS Access and MySQL, I’m slowly moving bits into Ruby on Rails. One of the most important things our current system does is store prices and descriptions for over 200,000 products. Searching that database is a crucial task.

Searching in Rails turned out to be very easy: Sunspot had it working very quickly on my development machine. I also had it running on my production server using the sunspot_solr gem, which is meant for development only (but mine’s a small business, so that’s fine). However, when the server was restarted sunspot_solr needed to be restarted manually, which was a pain. I thought I should probably get around to setting up a real Solr server and pointing my application there. So far, so good. Simply: copy the config from my Rails app to my new Solr service, set the server’s hostname in sunspot.yml, commit, deploy, it worked!

The problem – Solr was terribly slow!
——————————————-
Re-indexing was slow, and I could tell something wasn’t right: neither my Rails server nor my new Solr server was under load.
I created a new product (so that it would appear in the Solr index).
That was slow, but it worked. Displaying search results was also slow.

Check the logs – wow! Yep, Solr is the slow bit


Started GET "/short_codes?utf8=%E2%9C%93&search=test" for 10.18.6.3 at 2014-10-01 14:28:03 +0100
Processing by ShortCodesController#index as HTML
Parameters: {"utf8"=>"✓", "search"=>"test"}
Rendered short_codes/_navigation.html.erb (1.0ms)
Rendered short_codes/index.html.erb within layouts/application (6.7ms)
Rendered layouts/_navigation.html.erb (1.3ms)
Completed 200 OK in 20337ms (Views: 10.3ms | ActiveRecord: 1.7ms | Solr: 20321.1ms)

No way should Solr take 20321ms to respond.

I tried the search on the Solr admin interface and the response was instant, so I knew that Solr wasn’t the problem. It must be my code (as always!).

As Solr replies over HTTP, I tried querying it from my Rails server’s command line. Also slow. So… maybe it’s not my code… Then I tried pinging the Solr server from my Rails server:

ping solr.rkbb.co.uk

It said replies were coming back in less than 1ms… but then I realised there was a gap of about 3 or 4 seconds between each report.
I tried pinging another server… same effect…
Then I tried pinging my office router… reports every second, just as fast as I’m used to seeing. But this was the first time I’d used an IP address and not a FQDN.
Then I tried pinging my Solr server by its IP address… reports every second!

So, maybe all I have to do is configure my application to talk to solr via the server IP instead of FQDN…
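In sunspot.yml terms that’s a one-line change, something like this (a sketch; the exact layout depends on your Sunspot version):

production:
  solr:
    hostname: 10.18.6.224 # was solr.rkbb.co.uk; the name lookup was the slow part
    port: 8983
    path: /solr/production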

I tried…


Started GET "/short_codes?utf8=%E2%9C%93&search=test" for 10.18.6.3 at 2014-10-02 11:51:49 +0100
Processing by ShortCodesController#index as HTML
Parameters: {"utf8"=>"✓", "search"=>"test"}
Rendered short_codes/_navigation.html.erb (0.9ms)
Rendered short_codes/index.html.erb within layouts/application (8.4ms)
Rendered layouts/_navigation.html.erb (0.8ms)
Completed 200 OK in 27ms (Views: 12.2ms | ActiveRecord: 1.1ms | Solr: 8.3ms)

… and I fixed it :-)

Well, Solr is working great. Now I need to figure out what’s wrong with using FQDNs in my network.

How to change a folder icon to a picture in Mac OS X

In an earlier post I showed how I set my screenshots to save in a custom folder rather than onto my desktop (I seem to take a lot of screenshots). I also shared a little camera icon that I made for it. One of the comments asked how I changed the folder icon, so I’ve made a 30-second screen recording to show how.

1. Go to a web page that has the image you want
2. Right click the image (or Ctrl + left click)
3. Choose Copy Image
4. Select the folder you want to change the icon for (single left click the folder)
5. Press Cmd + I together (opens the Info pane)
6. Left click the folder icon shown in the top left of the Info pane (it will get a blue highlight border)
7. Press Cmd + V together (the shortcut for paste). You’ll see the image replace the folder icon.

VMware consolidated backup missing a catalogue file – fixed!

As always seems to be the case, a routine update of server software becomes a problem. This time it was updating VMware ESXi from 4 to 5. I know, I’m a little behind the times, but it was working, and it’s only a small office server… and I should have left it alone, sigh.

So, shut down the virtual machines and overnight copy them all over the network to my laptop and a handy external disk. Done.
Note: I probably should have used the VMware standalone converter to copy them, rather than copying them directly from the datastore.
This morning, in at 8am, install the new ESXi (having lost two hours ’cause the DVD drive on the server was playing up).

Start restoring the virtual machines. First an unimportant one… all good.

Second, the most important one, our file server… uh oh.

"The VMware Consolidated Backup source ... has a missing catalog file."

After several hours of trying to fix it, editing files and trying different versions of the VMware standalone converter (which may have helped, I’m not sure), I solved it by opening the virtual machine in VMware Player. Player spotted the problem (I had the VM disks split across two datastores but had saved them into one folder), asked me to tell it where they were, and that fixed it for VMware Player, which also meant the importer was happy again.

PS – I also realised why I never upgraded from VMware ESXi 4: version 5 takes away a lot of the essential functionality from the vSphere software, which makes it not a lot of use for me. Still, it was free. So, having fixed the import, I’m now waiting to import it back to a fresh install of version 4. At least I finally set up the 3+1 RAID 5 (instead of the 2 sets of RAID 1 left over from the original disks and an upgrade 2 years ago).

A quick note on my first steps using stripe.com

I’m building a web site for a charity that needs to take credit cards for tickets being sold. I’ve chosen to use stripe.com as:

  • It’s simple to implement
  • They take care of all the security and PCI DSS (I never get any card details to store, which is a good thing).
  • It’s not expensive (compared to other options like having a merchant account for the charity).
  • I couldn’t get away with using existing services (eg: eventbrite, picatic, etc)

Users don’t have to register on this charity site (essentially it’s selling a one-off event ticket), so my process is:
1) Visitor completes a form and submits [let’s call it ‘Registration’]
2) Server validates the form (email address present, other information entered, etc)
3) Server sends a page with a Stripe pay-now button. That button contains code to pre-fill some of the Stripe form (eg: the email address).
4) Visitor clicks the Stripe button and enters card details, which are sent direct to the Stripe server (ie: not through my server)
5) Stripe returns a ‘token’ that can be used to charge the card, and the visitor is directed to my ‘charge’ page with their token (sent as an HTTPS POST request).
6) When my /charge page is requested, my server can request the card be charged using the single-use Stripe token, then thank the customer for paying.

I wanted to record the payment as processed against my Registration_ID, and thought I would be able to use the browser session to link the Stripe request with a specific registration. It didn’t work: every test transaction came back with nothing in the session. It was as if the session was being refreshed every time a Stripe transaction occurred.
After several hours of frustration, I tracked it down to Rails’ built-in CSRF protection: as the POST is coming via Stripe, it doesn’t carry a valid CSRF token, so Rails resets the session.

All I have to match the registration record with the Stripe transaction is the visitor’s email address. This obviously causes problems if:

  • A visitor wants to buy more than one registration on the same email address
  • A visitor changes their email address during the stripe process (not easy for them to do, but possible).

However, it’s the best I’ve got, so I’ll have to write some backup code to prevent two registrations on one email address (they’ll have to get in touch and pay another way) and to raise an error if the email address Stripe got is different from the address in our records (the charity will have to match the records manually, which isn’t difficult for such a small event).

Here’s the part of my dev log that helped me find the problem, along with this blog post on kalzumeus.com.

Started POST "/registers/charge" for 127.0.0.1 at 2014-03-14 16:12:34 +0000
Processing by RegistersController#charge as HTML
Parameters: {"stripeToken"=>"tok_1234sometokendata", "stripeEmail"=>"asdf@asdf", "stripeBillingName"=>"CARDNAME", "stripeBillingAddressLine1"=>"asdf", "stripeBillingAddressZip"=>"ME13 9AB", "stripeBillingAddressCity"=>"Faversham", "stripeBillingAddressState"=>"Kent", "stripeBillingAddressCountry"=>"United Kingdom"}
WARNING: Can't verify CSRF token authenticity
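For what it’s worth, the CSRF fix plus the charge itself ends up as only a few lines of controller code. A sketch only (the amount, key handling and the exact skip_* call are assumptions that depend on your Rails and Stripe versions; the stripeToken/stripeEmail parameter names come from the log above):

class RegistersController < ApplicationController
  # Stripe's POST arrives without our CSRF token, so skip verification for
  # this one action (older Rails versions use skip_before_filter instead)
  skip_before_action :verify_authenticity_token, :only => :charge

  def charge
    Stripe.api_key = 'YOUR-SECRET-KEY'
    Stripe::Charge.create(
      :amount      => 1500, # hypothetical ticket price, in pence
      :currency    => 'gbp',
      :card        => params[:stripeToken], # the single-use token
      :description => "Ticket for #{params[:stripeEmail]}"
    )
    # The email address is the only link back to the registration (see above)
    @registration = Register.find_by_email(params[:stripeEmail])
  end
end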

A new form of Comment spam? – url shorteners and redirection?

This is interesting. This blog just had a comment which, at first glance, looked normal.

URL redirection can hide the destination, not always a good thing

The link first runs through the URL shortening service tinyurl.com.

That in turn redirects to Adfly (http://adf.ly), which is where it becomes interesting.
Example of an Adfly landing page
Adfly is an advertising system. Instead of linking directly to the destination, you link with a custom link from them. Before the visitor can go to the new page, they see an advert.
They can interact with that advert or click the big “Skip Ad” button at the top of the page.
If people click on the advert, whoever created the link gets a commission.

I don’t have a problem with Adfly. I’ve seen my son skip the adverts lots of times when he’s getting plugins for Minecraft. What I hadn’t seen before was this method of hiding the Adfly link, and as far as I know it’s the first one posted on my blog.

Is it a problem?
I don’t think so, just an observation. It means I’m going to be less trusting of any url shortening from now on.

Is it an opportunity?
Not for me, at least not yet.
It would not be difficult to write some code that redirected all my off-site links via Adfly, including those posted in comments. But it would mean anyone visiting and following a link would have an extra step to go through, and I’d rather not do that to readers.
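For the curious, the rewriting itself would only be a few lines. A sketch only, using Nokogiri; the adf.ly link format here is a placeholder, so check their documentation for the real one:

require 'nokogiri'

ADFLY_PREFIX = 'http://adf.ly/YOUR-ID/' # hypothetical link format
MY_HOST = 'steveroot.co.uk'

# Rewrite every off-site link in a chunk of post or comment HTML
def monetise_links(html)
  doc = Nokogiri::HTML.fragment(html)
  doc.css('a[href^="http"]').each do |link|
    next if link['href'].include?(MY_HOST) # leave on-site links alone
    link['href'] = ADFLY_PREFIX + link['href']
  end
  doc.to_html
end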

I used to have Google adverts on the blog, but when I came to update WordPress I didn’t bother rewriting the templates or installing any plugins; the revenue they were generating was trivial.
I suspect Adfly revenue from this site would also be too small to be worth the effort.