Form Spam Bot Blocker: Generate forms that prevent submission by robots

Last updated: 2007-05-03
Version: fsbb 0.1
License: GNU Lesser Genera...
PHP version: 4.0
Categories: HTML, Validation, Security
Description

This class can generate forms that prevent submission by spam robots without requiring human users to enter special values.

It generates hidden inputs with special values that are verified on the server after the form is submitted, to detect whether the form was sent by a spam robot.

The class can generate a hidden input that contains an encoded value of the user's browser, the user's IP address and the current time.

The class verifies whether the browser name and IP address are still the same, and also whether the form was submitted within a normal time interval after it was created, as it would be when submitted by a real human user.

The class also generates a text input that is invisible to the user. A human user would not fill in this input. If the input is submitted with a value, the form was most likely sent by a robot.

When the class detects a situation that demonstrates the form was submitted by a robot, the application should not accept the form submission.

Innovation Award: PHP Programming Innovation award nominee, April 2007, Number 3
Prize: One downloadable copy of Komodo Pro

More and more people have been using robots to make abusive use of Web sites. Usually robots pretend to be real users, make unauthorized copies of whole sites, or even cause excessive load on the site servers.

Many sites have implemented measures to halt robots, like using CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) validation forms. However, this kind of validation is usually a little annoying to the users.

This class implements other measures that may help detect when robots are using Web forms, without being too intrusive to the users.

This way real human users are usually not even aware of the measures that have been implemented to halt robots.

Manuel Lemos
Author: Giorgos <contact>
Classes: 7 packages
Country: Greece
Innovation award nominee: 2x
Details
formSpamBotBlocker doc file
Class version: 0.3
Date: 7 Apr 2007

A. Introduction

Most webmasters know the problem of automatic form submissions by spambots all too well! There are some solutions out there, and the use of CAPTCHA is one of the best known. Unfortunately, solutions such as CAPTCHA require active human interaction, something that decreases accessibility.

This php class follows another way, one that does not require any extra human interaction at all. It is based on human behaviour patterns rather than on human intelligence. It creates <input type="hidden"> tags with encrypted values or visually hidden tags (CSS) to identify a spambot. The combination of multiple methods can really confuse the spambots, even if only html code is created. Please note that no CAPTCHA-, Cookie- or Javascript-based methods are used here by default. Just plain (x)html and optionally a little CSS code or Session variables. Most human users will not even realize that the form is spam protected in some way. That's the point...

B. The basic ideas:

1. A user (human or robot) must have the same IP and the same HTTP user agent ID on both pages: the page that sends the request (the html form) and the page that receives it (the action of the html form, i.e. the target page, which may be the same page or another one), whether by POST or GET. Humans normally do; robots sometimes do not, as they often call only the target page with the required parameters. In other words: a page containing a html form must be loaded before its target page (the page that accepts the parameters) is loaded, and the IP and browser of the user must be the same on both pages.
-> A spambot is forced to use the same IP and agent ID when scanning and attacking

2. A human user will not be affected by hidden tags whose names change daily, depending on the current date, as they simply do not see them. Strictly speaking, humans could be affected if, e.g., they load a web form at 23:57 and send the request at 00:06 (next day), but there is a simple solution for that too (see below). Robots, on the other hand, tend to prescan a html page containing a form and then call the target page with the scanned parameters. A daily changing hidden input name requires the prescan to happen on the current day.
-> A spambot is forced to prescan the form at the current day when attacking

3. A form should be submitted within a specific time window. If this time window is too short or too long, then the user is more likely to be a robot than a human. For example a human cannot submit a form, that has 6 required text inputs in just 2 seconds...
-> A spambot is forced to submit the form within a specific time window when scanning and attacking

4. A spambot will try to populate every form element with some value, so as to best ensure that its submission succeeds. If a standard text input tag that is visually hidden from the user is used in the form, a human will not enter anything into this field. It is quite likely, though, that a spambot will still post some value for this form element.
-> A spambot is forced to identify visually hidden trap form elements and ignore them when attacking

5. There is no need for a user to call the target page of a web form using the same parameters more than once, without filling out the form again. Humans may do that by clicking on the reload button of the browser, but they should not. Spambots do that when trying to find a way to submit the form.
-> A spambot is forced to pass the protection of the form at once when attacking
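
Ideas 3 and 4 are simple enough to sketch in a few lines. The following is only an illustration and not the code of the class: the function names, the field name "email2" in the usage note and the use of PHP 7 syntax are all assumptions made for this example.

```php
<?php
// Hypothetical helpers, not part of formSpamBotBlocker. Assumes PHP 7 or later.

/**
 * Idea 3: true when the time between rendering the form and submitting it
 * lies within [$minTime, $maxTime] seconds.
 */
function submittedInTimeWindow(int $renderedAt, int $submittedAt, int $minTime, int $maxTime): bool
{
    $elapsed = $submittedAt - $renderedAt;
    return $elapsed >= $minTime && $elapsed <= $maxTime;
}

/**
 * Idea 4: true when the visually hidden trap field is missing or empty,
 * which is what a human user would normally submit.
 */
function trapIsEmpty(array $request, string $trapName): bool
{
    $value = $request[$trapName] ?? '';
    // Anything that is not an empty string (including an array) counts as filled in.
    return is_string($value) && trim($value) === '';
}
```

With a minimum of 2 and a maximum of 600 seconds, a form sent 1 second after it was rendered would be rejected, and so would any request that carries a value in the trap field, e.g. trapIsEmpty($_POST, 'email2') returning false.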

C. Implementation of the above ideas

1. [userID] An <input type="hidden"> tag, generated dynamically, has name and value attributes that depend on the current user IP and browser ID. The name and value are encrypted and their length can be easily changed. This input tag will be checked on the target page for validity. Of course, these are static values for a given IP and browser, so robots would still have to use the same IP and browser ID when scanning and calling a form to simulate a human user. Many spambots cannot do that though...

2. [dynID] Partially static values are not enough; we need some dynamically changing name/value to prevent automatic form submissions. Another <input type="hidden"> tag must be generated dynamically. This tag has an encrypted name attribute that depends on the current date. This daily changing name prevents prescanning older than the current day. To avoid the midnight problem mentioned above, a class variable $minutesAfterMidnight has been added. It sets the number of minutes after midnight during which the submission of a form generated on the previous day is still allowed.

3. To set a time window for a form to be submitted, the class uses 2 variables $minTime and $maxTime. When a form is generated the current time will be encoded and set as value of the previous dynID input tag. On the target page, this time value will be compared using the variables $minTime and $maxTime. 

4. [trapID] A standard <input type="text"> tag with a tempting name will be generated. This element will be visually hidden from a human user with CSS. However, if CSS is disabled, the input will still be displayed. For this reason, an explanatory label is provided that tells the user not to enter anything into the trap tag. A spambot that has scanned the web form will probably submit the form with some value in the trap tag. The class checks if there is a value and identifies the spambot (or a human user who has disabled CSS and has ignored the label instructions). This method can be disabled with the function setTrap().

5. A session based method is used to prevent submitting a form more than once, before loading the form again. This method is enabled by default but it can be disabled by setting the public variable $hasSession=false. A session variable contains the number of the form submissions made after loading the form (and calling makeTags()). If the number is larger than 1, the function checkTags() returns false.
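
To make points 1 and 2 more concrete, here is a minimal sketch of how such names could be derived. It does not reproduce the encoding of the class: the function names, the use of hash_hmac(), the $secret parameter and the UTC day boundary are assumptions for illustration only, and the code needs PHP 7 or later, while the class itself targets PHP 4.

```php
<?php
// Hypothetical helpers, not the encoding used by formSpamBotBlocker.

/** Point 1 (userID): a field name and value tied to the client's IP and user agent. */
function userIdField(string $ip, string $userAgent, string $secret): array
{
    $digest = hash_hmac('sha256', $ip . '|' . $userAgent, $secret);
    return [
        'name'  => 'u' . substr($digest, 0, 12),
        'value' => substr($digest, 12, 20),
    ];
}

/**
 * Point 2 (dynID): the field names accepted at time $now. Today's name is
 * always accepted; yesterday's name is accepted as well during the first
 * $minutesAfterMidnight minutes of the day (days are counted in UTC here).
 */
function acceptedDynIdNames(int $now, int $minutesAfterMidnight, string $secret): array
{
    $nameFor = function (string $date) use ($secret): string {
        return 'd' . substr(hash_hmac('sha256', $date, $secret), 0, 12);
    };

    $names = [$nameFor(gmdate('Y-m-d', $now))];
    $minutesIntoDay = intdiv($now % 86400, 60);
    if ($minutesIntoDay < $minutesAfterMidnight) {
        $names[] = $nameFor(gmdate('Y-m-d', $now - 86400));
    }
    return $names;
}
```

On the target page, a request would pass these two checks only if it contains the userID field computed from the current IP and user agent, and a field whose name is one of the accepted dynID names.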

The generated <input type="hidden"> tags contain encoded names and values. To make it even more difficult for a spambot to guess the encoded names and values of the <input type="hidden"> tags, the class uses an encryption method based on a unique key passed through the form. These names and values (and the key) change dynamically each time a form is loaded. You can even make the names and values almost impossible to guess (even if the spambot knows the source code of the class) by setting your own unique string as the value of the public variable $initKey.
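
The combination of a key that travels with the form and a private $initKey can be sketched as follows. This is not the encryption method of the class, only an assumed equivalent built on PHP's hash_hmac(), random_bytes() and hash_equals() (PHP 7 or later); makeToken() and verifyToken() are hypothetical names.

```php
<?php
// Hypothetical sketch: a per-form random key combined with a private $initKey.

/** Create a fresh form key and a token for $payload that depends on both keys. */
function makeToken(string $payload, string $initKey): array
{
    $formKey = bin2hex(random_bytes(8)); // sent along in the form, changes on every load
    return [
        'key'   => $formKey,
        'token' => hash_hmac('sha256', $payload, $initKey . $formKey),
    ];
}

/** Recompute the token on the target page and compare it in constant time. */
function verifyToken(string $payload, string $formKey, string $token, string $initKey): bool
{
    $expected = hash_hmac('sha256', $payload, $initKey . $formKey);
    return hash_equals($expected, $token);
}
```

Because $initKey never appears in the html output, somebody who reads both the form and this source code still cannot compute a valid token for a new form key.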

D. Can spambots still successfully submit a web form protected by this class?

I can currently think of 2 methods a spambot could use to submit a web form protected by this class:
1. If the spambot (its developer) knows the source code of the class and a unique $initKey value has not been set. A script could dynamically generate all the required encoded parameters and pass them to the target url. However, if the $initKey value of the class (not shown in plain html), which is used to encode/decode the parameters, has been set, an external script would not be of any use...
2. If a spambot is able to simulate human behaviour really well. To achieve that though, it should load a web form, scan its elements and call the target url with the required parameters using the same IP/agent ID (all within a limited time window on the same day). Moreover, it should be able to identify the trap <input type="text"> tag (by analysing the CSS?) and leave its value="". If the session based method to prevent submitting a form more than once is enabled, the spambot will not be able to repeatedly call the target page with some previously scanned valid parameters until it finds a valid time window. It will have to get it right at once.
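
The session based protection against repeated submissions can be pictured as a simple counter. The sketch below is an assumption about how such a counter could look and not the code of the class: the function names and the slot name are made up, a plain array stands in for $_SESSION so that the example runs without session_start(), and rejecting a submission when the form was never loaded is a choice made for this example. It needs PHP 7.1 or later.

```php
<?php
// Hypothetical sketch of a per-form submission counter.

/** Called when the form is generated: start counting from zero again. */
function resetSubmissionCount(array &$session, string $formId): void
{
    $session['fsbb_count_' . $formId] = 0;
}

/** Called on the target page: only the first submission after a load is accepted. */
function registerSubmission(array &$session, string $formId): bool
{
    $slot = 'fsbb_count_' . $formId;
    if (!isset($session[$slot])) {
        return false; // assumption: the form was never loaded in this session
    }
    $session[$slot]++;
    return $session[$slot] <= 1;
}
```

A spambot that replays the same valid parameters gets exactly one attempt per form load; the second call to registerSubmission() for the same load returns false.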

E. How to use the class

1. Create the required <input> tags on the page containing the web form
a. Optionally set your defaults in the class source file (public variables), and set your own unique $initKey!
b. Include the class in your script
c. Create an object: $blocker=new formSpamBotBlocker();
d. Optionally call public functions or set public variables to adapt your defaults to the current web form
e. Either print the tags directly within your html form: print $blocker->makeTags();
f. or get the xhtml string first: $hiddentags=$blocker->makeTags(); (if $hasSession=true, make sure you call makeTags() before the output of any html code, or you will get an error message!)
g. and then print it within your web form: print $hiddentags;

2. Check whether the $_POST or $_GET array contains valid parameters on the target page
if ($_POST) { // or $_GET
	$blocker = new formSpamBotBlocker();
	$nospam = $blocker->checkTags($_POST); // or $_GET; false means the checks failed
	if ($nospam) print "Valid Submission"; // handle valid request
	else print "Invalid Submission"; // handle invalid request
}

F. Changelog
v0.3 - 3 May 2007:
New methods added:
- By setting the public class variable $hasSession=true, 2 session variables will be generated in order to prevent submitting a form more than once before loading the form again.
- A new public method getTagNames() has been added. This method returns an array with the names of the generated form elements.

v0.2 - 5 Apr 2007:
Initial release 

Files (File, Role, Description)
fsbb.php, Class, the class source
readme.txt, Doc., the doc file
example.php, Example, an example of a protected web form
action.php, Example, an example of a protected form submission