A new form of comment spam? – URL shorteners and redirection

This is interesting. This blog just had a comment which, at first glance, looked normal.

URL redirection can hide the destination, not always a good thing

The link first runs through the URL shortening service tinyurl.com.

That in turn redirects it to Adfly (http://adf.ly), which is where it becomes interesting.
Example of an Adfly landing page
Adfly is an advertising system. Instead of linking directly to the destination, you use a custom link from them. Before the visitor can go on to the destination page, they see an advert.
They can interact with that advert or click the big “Skip Ad” button at the top of the page.
If people click on the advert, whoever created the link gets a commission.

I don’t have a problem with Adfly. I’ve seen my son skip the adverts lots of times when he’s getting plugins for Minecraft. What I hadn’t seen before was this method of hiding the Adfly link and, as far as I know, it’s the first one posted on my blog.

Is it a problem?
I don’t think so, just an observation. It means I’m going to be less trusting of any shortened URL from now on.
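One way to be less trusting is to look at where a shortened link leads before clicking it. The Python below is a rough sketch of that idea: it asks each server for its headers only and records every HTTP redirect it is handed. The function names, the ten-hop limit and the use of HEAD requests are assumptions made for illustration, and some servers answer HEAD requests differently from normal ones. An advert page with a skip button is probably an ordinary web page rather than an HTTP redirect, so the trail would likely stop there, but the first hop would at least be out in the open.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def fetch_location(url, timeout=10):
    """Send one HEAD request; return the Location header if the
    response is an HTTP redirect, otherwise None."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        connection = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        connection = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        connection.request("HEAD", path)
        response = connection.getresponse()
        if response.status in REDIRECT_CODES:
            return response.getheader("Location")
        return None
    finally:
        connection.close()


def redirect_chain(url, fetch=fetch_location, max_hops=10):
    """Return every URL visited, starting with the one supplied.

    `fetch` is any function that maps a URL to the next Location
    (or None), which makes the logic easy to try without a network.
    """
    chain = [url]
    for _ in range(max_hops):
        location = fetch(chain[-1])
        if not location:
            break
        # Location may be relative, so resolve it against the current URL.
        chain.append(urljoin(chain[-1], location))
    return chain
```

Calling redirect_chain() with a shortened URL returns the list of addresses visited, in order, without loading any of the pages in a browser.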

Is it an opportunity?
Not for me, at least not yet.
It would not be difficult for me to write some code that redirected all my off-site links via Adfly, including those posted in comments. It does mean anyone visiting and following a link would have an extra step to go through, and I’d rather not do that to readers.
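To show how little code that would take, here is a minimal sketch in Python (not what this blog runs). The site host and the redirector address are made-up placeholders, and the real format of Adfly’s links has not been checked here, so treat REDIRECTOR as hypothetical. It relies on a simple regular expression, which copes with tidy double-quoted links but is no substitute for a proper HTML parser.

```python
import re
from html import escape, unescape
from urllib.parse import quote, urlsplit

SITE_HOST = "www.example.com"  # hypothetical: the blog's own host name
REDIRECTOR = "https://redirector.example/go?url="  # hypothetical link format

# Matches absolute http/https links written as href="...".
HREF = re.compile(r'href="(https?://[^"]+)"', re.IGNORECASE)


def wrap_offsite_links(page, site_host=SITE_HOST, redirector=REDIRECTOR):
    """Rewrite links that leave the site so they pass through the
    redirector. Links to site_host are left untouched."""

    def replace(match):
        url = unescape(match.group(1))  # turn &amp; back into &
        if urlsplit(url).hostname == site_host:
            return match.group(0)
        wrapped = redirector + quote(url, safe="")
        return 'href="' + escape(wrapped, quote=True) + '"'

    return HREF.sub(replace, page)
```

Every off-site link then costs the reader one extra page, which is exactly the trade-off described above.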

I used to have Google adverts on the blog, but when I came to update WordPress I didn’t bother rewriting the templates or installing any plugins. The revenue it was generating was trivial.
I suspect Adfly revenue from this site would also be too small to be worth the effort.

Peer to Peer downloading (Torrents) and network problems

Recently my parents had some friends visit. They had a laptop with them and asked to use the internet. We’ve no problem with that, so we let them connect... except that when they connected they were running a peer-to-peer file sharing program.

If you’re here, you probably know what that is. It’s a way of sharing a large file by turning it into lots of small pieces and allowing all the people who have that file to share a little piece with you, until you have the whole file. Then you can share your little pieces with other users.
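For anyone who isn’t sure, the idea can be sketched in a few lines of Python. This is a toy illustration only, not the real BitTorrent protocol: the function names and piece size are invented for the example, and the hash check simply stands in for the way a download can confirm each piece arrived intact.

```python
import hashlib


def split_into_pieces(data, piece_size):
    """Cut the file's bytes into fixed-size pieces (the last may be shorter)."""
    return [data[start:start + piece_size]
            for start in range(0, len(data), piece_size)]


def piece_hashes(pieces):
    """One SHA-1 fingerprint per piece, published alongside the file."""
    return [hashlib.sha1(piece).hexdigest() for piece in pieces]


def reassemble(received, expected_hashes):
    """Rebuild the file from pieces that arrived in any order.

    `received` maps piece index -> bytes. Raises ValueError if a piece
    is missing or does not match its published fingerprint.
    """
    parts = []
    for index, expected in enumerate(expected_hashes):
        piece = received.get(index)
        if piece is None:
            raise ValueError(f"piece {index} has not arrived yet")
        if hashlib.sha1(piece).hexdigest() != expected:
            raise ValueError(f"piece {index} is corrupt")
        parts.append(piece)
    return b"".join(parts)
```

Because each piece can be checked on its own, it doesn’t matter which peer supplied it or in what order the pieces turn up.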

The advantage of peer-to-peer file sharing is the originators don’t need to have and pay for a server with lots of capacity. Used properly it’s a great idea. In the past I’ve downloaded and shared Ubuntu and OpenOffice files this way.

[Screenshot: torrent effect]

My parents’ house is connected to the internet via our office. I was trying to set up a server and getting very confused as to why my connection kept dropping. It was making a difficult task impossible. I noticed other things, like web pages taking longer than usual, or being sometimes fast and sometimes slow. I checked the router and realised what was happening, so I blocked my parents’ guest completely.

That wasn’t enough to solve my problems though. I eventually gave up setting up the server and did it from home in the evening. The problem with torrents is that the computers looking to you for little pieces keep asking for them hours after you’ve stopped advertising that they can have them. This screenshot shows my problem clearly – it was taken AFTER I’d blocked the computer at 10.18.6.56. Essentially they’d invited a DoS attack. 11,000 incoming connections was more than enough to ruin our ADSL connection for everyone else on the network.

Sure, we could still email and see web pages, slowly, but everything was so much harder than it needed to be.

Moral of the story: If you’re a guest with us, please turn off any torrent software before you connect.

Bad Bot go away!

Sigh. Here I am at work on Tuesday morning. List of jobs to do being interrupted by our web server triggering overload alarms. Actually, it’s been doing it for quite a while, but I’ve never sat down to analyse the logs to find out what’s triggering the alarm (our gandi.net virtual server is more than powerful enough to cope, so fault finding has been low on my to-do list). This morning as I walked to work I saw an overload message arrive in my email. The sun is up, the sky is blue, it’s 8am. It feels a good day to fault find…

It didn’t take long to find the problem. I used grep to pull out today’s log entries from the Apache log and put them into a temporary file:

me@server4:/path_to_logs/rkbb.co.uk$ grep '06/Apr/2010' apache-log > check.txt
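With the day’s entries in check.txt, one way to spot the noisy visitor is to count requests per user agent. The Python below is a sketch of that idea, not necessarily how the log was checked here. It assumes the log is in Apache’s “combined” format, where the user agent is the last quoted field on each line; the function name is invented for the example.

```python
import re
from collections import Counter

# In the combined log format the user agent is the final "quoted" field.
LAST_QUOTED_FIELD = re.compile(r'"([^"]*)"\s*$')


def count_user_agents(lines):
    """Tally how many log lines each user agent is responsible for."""
    counts = Counter()
    for line in lines:
        match = LAST_QUOTED_FIELD.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Passing it an open file, for example count_user_agents(open("check.txt")).most_common(5), lists the five busiest user agents with their request counts.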

The bot causing the problem has a user agent of “Mozilla/5.0 (compatible; Purebot/1.1; +http://www.puritysearch.net/)”. Going to puritysearch.net, I find a ‘search engine’ that doesn’t appear to do anything but display adverts disguised as search results.

So, how to stop this bot? Nice bots read a file called robots.txt which tells them where they’re allowed to go. Purebot didn’t read the robots.txt so I couldn’t exclude it there.
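For bots that do behave, the rule would look like the lines held in ROBOTS_TXT in the sketch below. The Python uses the standard library’s robots.txt parser to check how a polite crawler would read them. The www.example.com address and the allowed() helper are placeholders for illustration, and none of this helps against a bot that never asks for the file.

```python
from urllib.robotparser import RobotFileParser

# What the robots.txt file itself would contain: shut out Purebot,
# leave every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: Purebot
Disallow: /

User-agent: *
Disallow:
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Would a well-behaved crawler with this name fetch this URL?"""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Here allowed("Purebot", "http://www.example.com/") comes back false while any other crawler name comes back true, which is the behaviour a nice bot would follow.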

My next thought was to use Apache to exclude the user agent. After an hour or so of trying I gave up on that (it is possible, I just didn’t figure it out and took the easy-for-me approach). The site is running ColdFusion (actually BlueDragon) so in the Application.cfm I can check the user agent and stop processing requests from Purebot there.

<!--- Stop processing any request whose user agent contains "Purebot" --->
<cfset useragenttest = find("Purebot", cgi.http_user_agent)>

<cfif useragenttest GT 0>
  <p>Purebot banned</p>
  <cfabort>
</cfif>

The code isn’t my most elegant but it works. Next time I come across a bad bot (or Purebot changes its name) I’ll just update this piece of code to ignore their requests.
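The same check stretches to more than one bot if the names are kept in a list. The sketch below is in Python rather than CFML, purely to show the logic. The list holds only Purebot because that is the only bot named here, and matching without regard to upper or lower case is a design choice of the sketch, not something the Application.cfm code does.

```python
# Substrings that mark a request as coming from an unwanted bot.
BLOCKED_AGENTS = ["Purebot"]


def is_blocked(user_agent, blocked=BLOCKED_AGENTS):
    """True if the user agent contains any blocked name, ignoring case."""
    agent = (user_agent or "").lower()
    return any(name.lower() in agent for name in blocked)
```

Adding the next bad bot is then a one-line change to the list instead of another block of conditions.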