Comments now back – and the battle to get SpamFirewall running

Published on 16 February 2008

Well you’re a right bunch of miserable people. Not one email to keep me company whilst the comments were down! Tsk! What do you think you’re playing at? Call yourself blog readers…

Anyway, they’re back up now following some intensive hunting to find the cause of spam comments somehow making it onto the site, despite all comments being pre-moderated.

How that happened, I still don’t know – I can find no similar occurrences online, but in the meantime, more barriers needed to be put in place in the battle against spam.

To that end I really wanted to get SpamFirewall running for Movable Type – it’s a reasonably simple PHP script that the author reckons will block about 80% of all comment spam attempts. It scans through the comment looking for certain things, then passes the data over to the standard Movable Type comment script. I got it working for The F-Word which recently started using comments, and have been suitably impressed.
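For the curious, the filter-then-forward idea can be sketched in a few lines. This is only an illustration in Python, not SpamFirewall's actual PHP code: the link limit, the blocked pattern, the `text` field name and the `forward` callable are all assumptions made up for the example.

```python
import re

# Hypothetical threshold and pattern, chosen for illustration only;
# they are NOT SpamFirewall's actual rules.
MAX_LINKS = 3
BLOCKED_PATTERNS = [re.compile(r"\[url=", re.IGNORECASE)]


def looks_like_spam(fields):
    """Return True if the submitted comment fields trip any simple check."""
    text = fields.get("text", "")
    # A comment stuffed with links is a common sign of spam.
    if len(re.findall(r"https?://", text, re.IGNORECASE)) > MAX_LINKS:
        return True
    # BBCode-style link markup rarely turns up in genuine blog comments.
    return any(pattern.search(text) for pattern in BLOCKED_PATTERNS)


def handle_comment(fields, forward):
    """Filter a comment, then hand it on to the real comment script.

    `forward` is any callable that submits the fields onwards
    (an assumption standing in for the HTTP POST to the comment script).
    """
    if looks_like_spam(fields):
        return "rejected"
    forward(fields)
    return "forwarded"
```

The point of the design is that the real comment script only ever sees what gets past the first function, which is also why the filter has to be able to reach that script over HTTP.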

But on planetbods.org, SpamFirewall just wouldn’t work. It would just 404 on the comment script. Which confused me no end as I knew full well that the comment script was exactly where it should have been. Put it manually in the browser, and there it was.

At the time, the script was looking at the URL of http://www.planetbods.org/management/hys.cgi . I wondered if the version of PHP was locked down on the webserver in some way, and scoured through the php.ini to see if there was anything that gave the game away, but to no avail.

Despite not seeing anything, I wondered what would happen if I just removed the domain name – put in /management/hys.cgi. Maybe there was some hidden setting I couldn’t find.

This had a different result – SpamFirewall just presented some error text instead.

Hmm, I thought, and started looking deeper into SpamFirewall’s code to try and find out how it works.

It uses the HttpClient webclass, and when SpamFirewall tried to use it to contact the comment script, it wasn’t getting through.

Hmm, I thought again.

After more pondering and hmm-ing, I tried writing my own HttpClient-based script. Well okay, I copied the demo. And my version of the demo worked perfectly – it contacted Amazon.com as designed and gave me the right feedback. Well, one thing down, I thought, as I hmm-ed again.

Next I tried adapting the demo to pick up from the Planet Bods main homepage. Again, a success. I stroked my chin in a pondering kind of way.

Maybe it can’t reach the CGI script, I thought – maybe there is some obscure PHP or Apache setting that is stopping it from happening. So I put the CGI script in. 404.

I put in the Blog homepage URL. 404.

I put in a BBC URL. All worked fine.

I put back in the Blog homepage URL. 404.

I put in the Planet Bods homepage URL. All fine.

Blog homepage. 404.

The head was scratched, the chin was stroked repeatedly and the hmm-chorus went on for hours. Some strange bug this…
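All that swapping of URLs in and out could have been done in one go. Here is a rough sketch in Python rather than the PHP HttpClient class I was actually using; `fetch_status` and `survey` are names invented for the example, and the list of URLs is whatever you want to test.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for a GET request to `url`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the code is still what we want.
        return err.code


def survey(urls, fetch=fetch_status):
    """Fetch each URL in turn and collect (url, status) pairs."""
    return [(url, fetch(url)) for url in urls]
```

Run from the webserver itself with the homepage, the blog homepage, the comment script and an outside site in the list, something like this would have shown the whole pattern of successes and 404s side by side. The `fetch` argument is only there so the loop can be tried out without touching the network.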

It was at that point that I did something random.

Before I owned planetbods.org, this website sat at the catchy URL of http://www.durge.org/~bods/ and whilst that no longer works, there is a similar, “hidden” URL that does. I decided to put in access to the comments script via that URL.

Bingo. Suddenly my test HttpClient script worked perfectly. And at that point, I remembered a nice little oddity to do with the server that hosts this site. When you SSH in and try to access www.planetbods.org (or any other domain hosted on it), the server configuration is wrong, and you always get a Mandrake/Apache “Well done you’ve installed Apache” screen instead of the website you’re expecting.

Normally this isn’t a problem, but obviously HttpClient was getting that Apache page too. One change of domain later and everything was suddenly working fine. Well except for CGI not working properly, and the fact that everything on the “hidden” URL didn’t work properly because of the prepended “/~bods”.
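With hindsight, my guess is that the homepage test only looked like a success because the placeholder page answers happily at the top level while anything deeper gets a 404. A status code on its own can't tell you whether you reached the right site. A minimal sketch of a better check, again in Python, with `classify_response` and the `marker` text both invented for the example:

```python
def classify_response(status, body, marker):
    """Say whether a response really came from the site we expected.

    `marker` is any text known to appear only on the real site
    (a hypothetical choice made by whoever runs the check).
    """
    if status != 200:
        return "error %d" % status
    if marker not in body:
        # A 200 without the marker suggests some other page answered,
        # for example a server's default placeholder.
        return "wrong site"
    return "ok"
```

Checking for a bit of text you know is on your own homepage would have flagged the placeholder page straight away, rather than several hours of chin-stroking later.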

Some hours later, and SpamFirewall is now installed and hopefully it will do its stuff and prevent spam from even getting too near Movable Type, let alone published live without my say-so. We will see – comments are back now, but I’ll be keeping my eye on them.

Oh and if you’re a comment spammer, don’t for one minute think that you’ll be able to get through SpamFirewall by using the URLs above – I’m afraid I’ve now moved my copy of Movable Type to a completely different directory, and it is… all nicely hidden.