Subject: 404 spam

Posted on: 16/11/2014 09:22am
By: remy

I get hammered by several IPs requesting certain URLs that result in a 404. The intention is obviously not to retrieve any content. The goal is probably to check for the presence of certain software, or to poison the stats my provider generates. The latter is bad.

Suggestion: would flood control on IPs that request a non-existent URL be feasible?


Re: 404 spam

Posted on: 16/11/2014 01:54pm
By: Laugh

The updated ban plugin can help with that. The final version hasn't been released but you can find the latest code here:

http://code.google.com/p/geeklog/source/list?repo=ban

It also needs the GUS plugin, which can be found on the same site.

I haven't released a final version since I want to clean up the config options and incorporate them into the Geeklog Config admin interface (right now they are in the config.php file). I also wanted to set up an automated way to download the stopforumspam database, which I haven't done yet (right now you must update it manually).

The latest code in the repository is rock solid and I use it with my sites.

In the config you can set up auto banning for:

- Ban IP - Uses the stopforumspam list of banned IPs (the best list of spammy IPs out there: http://www.stopforumspam.com/downloads/)
- Auto Ban - User Agents - If an IP exceeds X number of different user agents in X number of seconds, then ban. Based on GUS data.
- Auto Ban - Hits - If an IP exceeds X number of hits in X number of seconds, then ban. Based on GUS data.
- Auto Ban - Referrer - Ban an IP that matches a referrer X times in X number of seconds. Based on GUS data.
- Auto Ban - URL - Ban an IP that requests a matching URL X times in X number of seconds. Based on GUS data.
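
For readers who want to see the idea behind the "Hits" and "User Agents" rules, here is a minimal sketch in Python. It is not the ban plugin's code (the plugin is PHP and reads GUS data); the record layout, the function name `ips_to_ban` and all thresholds are assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical log record: (ip, unix_timestamp, user_agent).
# This is NOT the ban plugin's code or the GUS table layout.


def ips_to_ban(records, now, window_seconds, max_hits, max_user_agents):
    """Return the set of IPs that break either rule inside the window."""
    hits = defaultdict(int)
    agents = defaultdict(set)
    for ip, timestamp, user_agent in records:
        # Only look at requests made during the last `window_seconds`.
        if timestamp <= now and now - timestamp <= window_seconds:
            hits[ip] += 1
            agents[ip].add(user_agent)
    return {
        ip
        for ip in hits
        if hits[ip] > max_hits or len(agents[ip]) > max_user_agents
    }


records = [
    ("203.0.113.5", 100, "bot/1"),
    ("203.0.113.5", 101, "bot/2"),
    ("203.0.113.5", 102, "bot/3"),
    ("198.51.100.7", 100, "Mozilla"),
    ("198.51.100.7", 40, "Mozilla"),
]
banned = ips_to_ban(records, now=105, window_seconds=10,
                    max_hits=5, max_user_agents=2)
print(banned)
```

In this sample only the first IP is flagged, because it shows three different user agents inside the 10-second window.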

You can set it up to ban IPs for life or only for a certain time period (3 selectable options). If you do end up using the plugin, make sure it is first in the plugin order list (I usually follow it with the spam-x plugin); this way a banned IP will not chew up too many resources.

Re: 404 spam

Posted on: 17/11/2014 07:45am
By: remy

Thanks for the information, Laugh. The ban plugin is very useful and I will consider using it for a few sites. However, it does not eliminate the 404s.
Just as the login limits user/password guessing by introducing a delay after a failed attempt, I am thinking of a similar technique: catch the 404 and answer further requests (all of them!) with a message saying that the user has to wait for some time before issuing the next request and that the attempt is logged.
Remember this is only for anonymous users 'typing' the URL in the address bar of the browser: there is no cookie and no referer, and it is the third attempt within 10 seconds.
Note that some attacks already fake the referer, with either invalid URLs or valid ones.
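
A rough sketch of that idea, in Python rather than Geeklog's PHP, and not taken from the login-flood code: count 404s per IP for requests that carry neither a cookie nor a referer, and start a waiting period on the third miss within 10 seconds. The class name `NotFoundThrottle`, its methods and the 30-second wait are all assumptions for illustration. Since the referer can be faked, the referer check can only be a weak filter.

```python
import time
from collections import defaultdict, deque


class NotFoundThrottle:
    """Make an anonymous client wait after repeated 404s (illustrative only)."""

    def __init__(self, max_misses=3, window=10.0, wait=30.0, clock=time.monotonic):
        self.max_misses = max_misses  # the third miss triggers the wait
        self.window = window          # seconds in which the misses must fall
        self.wait = wait              # assumed length of the forced pause
        self.clock = clock
        self.misses = defaultdict(deque)  # ip -> timestamps of recent 404s
        self.blocked_until = {}           # ip -> time when the wait ends

    def record_404(self, ip, has_cookie, has_referer):
        """Call when a request ended in a 404. Returns True if a wait starts."""
        if has_cookie or has_referer:
            return False  # only cookieless, refererless requests are counted
        now = self.clock()
        recent = self.misses[ip]
        recent.append(now)
        # Forget misses that fell out of the window.
        while recent and now - recent[0] > self.window:
            recent.popleft()
        if len(recent) >= self.max_misses:
            self.blocked_until[ip] = now + self.wait
            recent.clear()
            return True
        return False

    def seconds_to_wait(self, ip):
        """Seconds this IP still has to wait before any request is served."""
        remaining = self.blocked_until.get(ip, 0.0) - self.clock()
        return max(0.0, remaining)


# Usage with a fake clock so the example does not have to sleep.
fake_now = [0.0]
throttle = NotFoundThrottle(clock=lambda: fake_now[0])
for second in (0.0, 4.0, 8.0):
    fake_now[0] = second
    started = throttle.record_404("203.0.113.5", has_cookie=False, has_referer=False)
print(started, throttle.seconds_to_wait("203.0.113.5"))
```

While `seconds_to_wait` returns more than zero, every request from that IP would get the "please wait, this attempt is logged" message instead of the page.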

Would the login-flood protection code be a suitable example to copy?

Another way to introduce a delay could be a captcha in the 404 response, saying 'please ignore this error'. Any other suggestions?


Re: 404 spam

Posted on: 17/11/2014 07:54pm
By: Laugh

I realize the referer is one of the easiest things to spoof, but I had a few spam bots that used the same one over and over.

I am not sure why you would want to impose that type of limit on 404 pages. It would just use up resources and could confuse search engine spider bots like Google (which sometimes find links to sites that do not exist).

You are going to get 404s no matter what you do. It is either spam bots or other bots looking for ways into your site. If you don't serve a 404 page, you could give them the impression they have found something, and they will continue to hit that non-existent page.


Re: 404 spam

Posted on: 17/11/2014 08:14pm
By: remy


I would serve an error page for every 404 caught and serve it with HTTP status 410 (Gone).
Well, if the user is anonymous. And now I think a captcha on this error page is the right method.

Then, if these 404s continue to happen (not necessarily on the same URL) for the same IP during, say, 2 minutes, a ban is suitable.
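
To make this concrete, here is a small Python sketch under my own assumptions, not existing Geeklog code. It reads "continue to happen" as: the same IP keeps producing misses without a quiet pause longer than 30 seconds, and the streak has lasted at least 120 seconds. The names `status_for_missing_page` and `MissStreakTracker` and the 30-second gap are made up for illustration; only the 410 for anonymous users and the 2 minutes come from the text above.

```python
from dataclasses import dataclass
from typing import Dict

BAN_AFTER_SECONDS = 120.0  # "say, 2 minutes" from the post
MAX_QUIET_GAP = 30.0       # assumption: a longer pause ends the streak


def status_for_missing_page(is_anonymous: bool) -> int:
    """410 (Gone) for anonymous visitors, the usual 404 for logged-in users."""
    return 410 if is_anonymous else 404


@dataclass
class MissStreak:
    first_seen: float
    last_seen: float


class MissStreakTracker:
    """Track, per IP, how long an unbroken run of missing-page hits has lasted."""

    def __init__(self) -> None:
        self.streaks: Dict[str, MissStreak] = {}

    def record_miss(self, ip: str, now: float) -> bool:
        """Record a missing-page hit; return True when the IP should be banned."""
        streak = self.streaks.get(ip)
        if streak is None or now - streak.last_seen > MAX_QUIET_GAP:
            # First miss, or the IP was quiet long enough: start a new streak.
            self.streaks[ip] = MissStreak(first_seen=now, last_seen=now)
            return False
        streak.last_seen = now
        return now - streak.first_seen >= BAN_AFTER_SECONDS


# One miss every 10 seconds, on any URL, from the same IP.
tracker = MissStreakTracker()
decisions = [tracker.record_miss("203.0.113.5", float(t)) for t in range(0, 130, 10)]
print(decisions[-1], status_for_missing_page(is_anonymous=True))
```

The URL is deliberately not part of the key, so misses on different URLs from the same IP count toward the same streak.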

Btw, handling a 404 and serving the error page costs 0.17 secs on geeklog.net.

Geeklog - Forum
https://www.geeklog.net/forum/viewtopic.php?showtopic=95870