Author: Vivek moyal
Categories: PHP Tutorials
Read this article to learn how to use a PHP OCR class to read text written in images like those that appear in CAPTCHA images.
Contents
CAPTCHA Solver Free PHP OCR Solution
Why a CAPTCHA Solver
How PHP Captcha Solver will work
Training the System to Recognize Characters
Recognizing Characters in Images
Complete CAPTCHA Solver Solution
Downloading the PHP OCR Class
CAPTCHA Solver Free PHP OCR Solution
CAPTCHA is a common method for preventing bots from accessing resources in an automated way. There are many types of CAPTCHA, but the most common consists of an image that displays text, usually distorted in some way to make it more difficult for bots to discover what is written there.
One way to determine what is written in a CAPTCHA image is to use Optical Character Recognition (OCR) methods. These methods recognize text characters that appear in images.
This article presents what can be the beginning of a CAPTCHA solver based on a free OCR class written in PHP.
Why a CAPTCHA Solver
Suppose you are working on a project and you have to download a file. The file link is provided only after you log in and enter the text of a CAPTCHA challenge.
What will you do now? Either you log in every time with the correct login details and enter the correct CAPTCHA value by hand to get the link, or you take the smarter route and use a CAPTCHA solver together with an HTTP request library to get the link every time. The second option is an automated solution for your project.
But how will the CAPTCHA solver work, and how can we use it? The following sections explain it.
How PHP Captcha Solver will work
The solution is split into two parts. First we make the system learn by creating a character set with all the letters to be recognized. In the second part we test images with text against the character set we built, to evaluate whether it produces the correct recognition results.
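To make the two parts concrete, here is a minimal, self-contained sketch of a learn phase and a recognition phase. It is only an illustration and does not show how the OCR class works internally: the function names learnGlyph and recognizeGlyph, the 5x5 text bitmaps and the pixel-by-pixel comparison are all assumptions made for this example.

```php
<?php
// Hypothetical illustration only: this is NOT the OCR class's internal algorithm.
// Glyphs are tiny 5x5 bitmaps written as strings ('#' = ink, '.' = background).

/** Phase 1: store a labelled example in the character set. */
function learnGlyph(array &$charset, string $label, array $rows): void
{
    $charset[$label] = implode('', $rows);
}

/** Phase 2: return the label whose stored bitmap differs in the fewest pixels. */
function recognizeGlyph(array $charset, array $rows): ?string
{
    $pixels = implode('', $rows);
    $best = null;
    $bestDistance = PHP_INT_MAX;
    foreach ($charset as $label => $template) {
        if (strlen($template) !== strlen($pixels)) {
            continue; // only compare bitmaps of the same size
        }
        // Hamming distance: count positions where the two bitmaps disagree
        $distance = 0;
        for ($i = 0, $n = strlen($pixels); $i < $n; $i++) {
            if ($pixels[$i] !== $template[$i]) {
                $distance++;
            }
        }
        if ($distance < $bestDistance) {
            $bestDistance = $distance;
            $best = $label;
        }
    }
    return $best;
}

$charset = [];
learnGlyph($charset, 'L', ['#....', '#....', '#....', '#....', '#####']);
learnGlyph($charset, 'T', ['#####', '..#..', '..#..', '..#..', '..#..']);

// A noisy "T" with one flipped pixel is still closest to the stored "T".
$noisyT = ['#####', '..#..', '..#..', '..#.#', '..#..'];
$guess = recognizeGlyph($charset, $noisyT);
echo $guess, PHP_EOL;
```

The more examples the character set holds, the more variations of each letter the second phase can match, which is why the learning step comes first.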
Training the System to Recognize Characters: example.php
Looking at the example.php script you can see that it includes the PHP OCR class and the Config class. The OCR class contains functions that learn to recognize characters from images, create character objects, recognize characters, and save and test the learning results.
include_once("config.php");
include_once("OCR.class.php");
$char = new OCR();
// learn about characters to be recognized
$char->Learn("W");
// save the learned object to file
$char->saveResult();
echo 'Saved information about the "W" letter<br>';
This script is the starting point of our PHP CAPTCHA solver. It includes the OCR class and creates an object of that class. It calls the Learn function once for each character to recognize. The Learn function generates information that is serialized so that the saveResult function can save it to a file in our storage folder.
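The idea of serializing learned information and saving it to a file can be sketched with PHP's built-in serialize and unserialize functions. This is only a sketch under assumptions: the helper names saveLearned and loadLearned, the file name and the array layout are hypothetical, and they are not the format that saveResult actually writes.

```php
<?php
// Sketch under assumptions: the file name and array layout are hypothetical,
// not the format that saveResult() actually writes.

/** Serialize learned data and write it to a file; returns true on success. */
function saveLearned(string $file, array $learned): bool
{
    return file_put_contents($file, serialize($learned), LOCK_EX) !== false;
}

/** Read a file written by saveLearned(); returns null if missing or corrupt. */
function loadLearned(string $file): ?array
{
    if (!is_readable($file)) {
        return null;
    }
    // allowed_classes => false refuses to instantiate objects from the file
    $data = @unserialize(file_get_contents($file), ['allowed_classes' => false]);
    return is_array($data) ? $data : null;
}

$storageFile = sys_get_temp_dir() . '/ocr_sketch_W.dat';
$learned = ['name' => 'W', 'pixels' => '#...##...##.#.###.###...#'];

saveLearned($storageFile, $learned);
$restored = loadLearned($storageFile);
echo $restored['name'], PHP_EOL;
unlink($storageFile); // clean up the temporary file
```

Saving one file per character keeps the learned data available between requests, so the training step does not have to be repeated each time a recognition script runs.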
Recognizing Characters in Images: example_1.php
After the training results are saved to files in the storage folder, the class's Recognition function is able to recognize the characters that appear in an image. If the function returns false, it did not recognize the image. Otherwise it returns a character object whose getName function gives the recognized character.
include_once './config.php';
include_once './OCR.class.php';
$char = new OCR();
// Recognition process, check images from files
echo "<hr/><img src='M.png'/><br/>";
$res = $char->Recognition('M.png');
if ($res !== false) {
    echo "<b>" . $res->getName() . "</b>";
} else {
    echo "Not yet recognised.<br/>";
}
Complete CAPTCHA Solver Solution
Learning how to recognize individual characters in an image is just part of the challenge of identifying the text in a CAPTCHA validation image.
Often CAPTCHA images display distorted text along with obfuscating graphics. With enough training on examples of distorted text images, the system can get better at recognizing distorted text.
But first a complete solution needs to be able to separate each letter in the image, so that the letters can be recognized one at a time.
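The letter-separation step could be sketched as follows. This is a simplified illustration and not part of the OCR class: the hypothetical splitGlyphs function works on a text bitmap, cuts the image wherever a column contains no ink, and assumes the letters never touch or overlap, which distorted CAPTCHA images often violate.

```php
<?php
// Sketch only: splits a binarized image into glyphs at fully blank columns.
// Assumes letters do not touch or overlap, which distorted CAPTCHAs often violate.

/**
 * @param string[] $rows equal-length rows, '#' = ink, '.' = background
 * @return string[][] one array of rows per detected glyph, left to right
 */
function splitGlyphs(array $rows): array
{
    $width = strlen($rows[0] ?? '');
    $glyphs = [];
    $start = null; // first column of the glyph currently being scanned

    for ($x = 0; $x <= $width; $x++) {
        // Column $width is a virtual blank column that closes a trailing glyph.
        $hasInk = false;
        if ($x < $width) {
            foreach ($rows as $row) {
                if ($row[$x] === '#') {
                    $hasInk = true;
                    break;
                }
            }
        }
        if ($hasInk && $start === null) {
            $start = $x; // a glyph begins at this column
        } elseif (!$hasInk && $start !== null) {
            // a glyph ended at the previous column: cut it out of every row
            $glyph = [];
            foreach ($rows as $row) {
                $glyph[] = substr($row, $start, $x - $start);
            }
            $glyphs[] = $glyph;
            $start = null;
        }
    }
    return $glyphs;
}

$image = [
    '#...#####',
    '#.....#..',
    '###...#..',
];
$glyphs = splitGlyphs($image);
echo count($glyphs), PHP_EOL;
```

Each returned glyph could then be passed to a single-character recognizer, one at a time.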
As you can see, a complete CAPTCHA solver needs to do more than the current OCR class does.
If you liked the challenge, maybe you can evolve the current class or create additional classes to provide a complete CAPTCHA solver.
Alternatively, you can use existing CAPTCHA solving Web services that rely on real human users who decode the CAPTCHA images for you in a short period of time.
It is a different approach, but existing PHP classes let you call these anti-CAPTCHA services from your PHP application.
Downloading the PHP OCR Class
The PHP OCR class presented in this article can be downloaded and installed from a ZIP archive, or installed with the PHP Composer tool, using the instructions available on the download page.
If you liked this article, share it with other developers. Post a comment here if you have questions or want to provide your opinion.