Since Jem’s blog ate my full comment (and I need to test the appearance of codeblocks anyway), here it is again.
The Problem: Shitty spam referrers from Live.com.
The Solution: Block the fucking things.
Sure, you can block the bot, but the bot isn’t the problem per se; it’s the referrers. I figure no-one important actually uses Live.com, so it’s not like I’m missing out on anything by banning people from coming to here from there. Besides, if someone is really keen to visit, they can always type the URL in manually.
Note that this method requires mod_security to be enabled. If your host does not have said Apache module… get a new host.
So, in your .htaccess you want:
<IfModule mod_security.c>
# turn ModSecurity On
SecFilterEngine On
# set up rejection to throw a 501
# you can pretty much set this to whatever you want, but I like 501 because it's unusual
SecFilterDefaultAction "deny,log,status:501"
# search.live.com requests; these are invariably spam
SecFilterSelective HTTP_Referer "search\.live\.com"
</IfModule>
Beautiful.
Edit (1 Jan ‘09)
Okay, it’s been brought to my attention that people are having trouble with mod_security. Firstly, not all hosts implement it, and secondly it has a couple of functions that will break well-known scripts (including some versions of Mint and WordPress) if you don’t know what you’re doing.
But lo! All is not lost! There’s an almost-identical thing that can be done with “straight” Apache. Watch in awe:
# bot blacklist
Order Allow,Deny
Allow from all
# live.com
SetEnvIf Referer "search\.live\.com" bad_bot
Deny from env=bad_bot
Given the colour of the code in your comment, I think I should be doing the same :p
Thank you!
I don’t understand the code, but on a totally unrelated point, new JournalPress release
squee!
I’d already tried something identical to the edited version before I resorted to my ranty entry, and that didn’t work for me either. I can’t be arsed to fiddle now either.
Although fwiw, the first snippet works on rev.iew.me - both blocking the referrals and without disrupting mint. That’s not on site5 though, which is probably related :p
It’s interesting; I’ve actually got both in my .htaccess file now since I noticed that the mod_security one was failing too; albeit after I’d made this post. No idea why, but I stuck the deny from in there yesterday and it’s cleaned it out.
The other thing you can try is blocking by IP. The only problem is I think the subnet is, like, really large; you’d need to block at least 65\.55\.(109|110|165|232), if not more.
The other option is to tighten up the regexps in these directives; they’re a bit loose at the moment. I’m… not sure why that would help. But hey, it probably can’t hurt. /shrug
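For what it's worth, here's a rough sketch of what those two ideas might look like together in "straight" Apache. The bad_bot variable and the Order/Allow/Deny lines are the same as in the post; the anchored patterns and the use of Remote_Addr are my own assumptions about how you'd tighten things up, and the IP ranges are only the ones listed above, so they probably don't cover the whole subnet.

```apache
# Sketch only, untested on any particular host.

# Referrer check, anchored to the start of the URL so it matches only
# when search.live.com is the actual referring host (assumed pattern)
SetEnvIfNoCase Referer "^https?://search\.live\.com(/|$)" bad_bot

# IP check against the ranges mentioned above (likely incomplete)
SetEnvIf Remote_Addr "^65\.55\.(109|110|165|232)\." bad_bot

# bot blacklist, same as in the post
Order Allow,Deny
Allow from all
Deny from env=bad_bot
```

Both checks set the same variable, so a request gets denied if either the referrer or the IP matches. If the IP rule turns out to be too broad, just drop that one line and leave the rest alone.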