Author: Samuel Adeshina
Posted on: 2015-08-24
Package: PHP Sentiment Analyzer
Fortunately we already have the Machine Learning technology necessary to implement sentiment analysis even using pure PHP code.
Read this article to learn more about how you can implement your own sentiment analysis tools in your own PHP applications.
Contents
Introduction to Sentiment Analysis
Applications of Sentiment Analysis
How Does it Work?
N-grams as the Building Blocks for Sentiment Analysis
The Sentiment Analysis Algorithm
The PHP Sentiment Analyzer Class
Conclusion
Introduction to Sentiment Analysis
The word "sentiment" is formally defined as an exhibition or manifestation of the feeling, sensibility or tender emotions toward something, a regard or an opinion.
Generally sentiments are expressions, mental reactions or refined emotional responses towards anything. They are a way of deciphering the bad from the good and sorting the negatives from the positives.
Marketers, product and service providers, and similar decision makers base most of their decisions on their users' points of view, which usually arrive in the form of comments and criticism. In other words, they base their decisions on the sentiment of their users or customers.
In today's socially dynamic world, where nearly every idea, criticism and conversation is digital, it is becoming increasingly difficult to read through every user's review, criticism or comment within a reasonable amount of time.
There is a growing need to automatically sort the sentiments of thousands of users into categories. This gave rise to a special branch of knowledge called "Sentiment Analysis".
As the name implies, it is the process of automatically determining the sentiment (attitude, mental response) of a speaker (usually a user, a reviewer or an author).
Sentiment Analysis is also referred to as opinion mining and is actually a branch of artificial intelligence.
Applications of Sentiment Analysis
As noted above, marketers are among the people who benefit most from sentiment analysis, because most of their decisions are based on their users' reviews. A program that sorts these reviews automatically and derives conclusions from them makes it much easier to deliver well-targeted products and services in a reasonable amount of time.
Sentiment analysis would also be very useful in a customer care center, where it can automatically route negative reviews or criticism to one agent and positive ones to another.
Information retrieval, information extraction and question answering systems could use sentiment analysis algorithms to base their decisions on statements and opinions rather than facts.
These are only a few instances where sentiment analysis can be applied. The most important thing to note is that it is usually a process of automatically sorting statements, chats, comments, criticisms and so on into categories, usually negative and positive. With this in mind, the areas where it can be used are very diverse.
How Does it Work?
The main objective of sentiment analysis is to identify the viewpoints underlying a text span. For instance, on a site such as IMDb or Rotten Tomatoes, movies need to be categorized automatically based on users' reviews.
The database of reviews, which are usually in plain text, is referred to as the "text span" that contains every reviewer's viewpoint. It is the mission of a sentiment analysis program to identify these viewpoints and sort them into categories, usually negative and positive, or good and bad. In other words, sentiment analysis is all about determining the polarity ratio of the different sentiments in a body of text.
Sentiment Analysis is a branch of Machine Learning, which is itself a subset of Artificial Intelligence. Technically, Sentiment Analysis relies on text classification techniques and algorithms to determine the document-level or sentence-level polarity of sentiments.
As a sub-branch of machine learning, it can be implemented with any standard polarity classifier, such as a Support Vector Machine (SVM) or the Naïve Bayes algorithm.
In simpler terms, sentiment polarity can be determined through the use of n-grams, whose items can be phonemes, syllables, letters or words.
N-grams as the Building Blocks for Sentiment Analysis
N-grams are commonly used as features for text classification in polarity classifier algorithms (Naïve Bayes, SVMs and so on), and in language models for predicting the next item or word in a sequence. An n-gram where n equals one is called a unigram, an n-gram where n equals two is called a bigram, and an n-gram where n equals three is called a trigram.
Informally, n-grams are used in statistical analysis, probability theory, Bayesian inference and so on for modelling text by splitting it into groups of n consecutive items (characters or words) each.
For example, the n-grams of the word "Samuel" where n = 2 (bigrams) are
"Sa", "am", "mu", "ue" and "el".
Likewise, the n-grams of the same word when n = 3 (trigrams) are
"Sam", "amu", "mue", "uel".
N-grams are used in various Machine Learning areas such as Natural Language Processing, Speech Recognition, Voice Synthesis, Language Identification and so on.
Sentiment Analysis is one of these areas, and n-grams form the major building block of any sentiment analysis algorithm.
Below is a simple PHP implementation of an n-gram extraction algorithm.
<?php
function extractNgram($word, $n)
{
    $ngramArray = array();
    // $index is the last position at which a full n-gram cannot yet end.
    $index = ($n - 2);
    for ($counter = 0; $counter < strlen($word); $counter++) {
        if ($counter > $index) {
            $ngram = "";
            // Append the $n characters that end at position $counter.
            for ($loop = $index + 1; $loop >= 0; $loop--) {
                $ngram .= $word[$counter - $loop];
            }
            $ngramArray[] = $ngram;
        }
    }
    return $ngramArray;
}
?>
The above function returns an array that contains the n-grams of a word that's passed in as a parameter.
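The same sliding-window idea can be applied to whole words instead of characters. The following is only a minimal sketch: the helper name extractWordNgrams and its tokenization rule (lowercase, split on anything that is not a letter, digit or apostrophe) are assumptions made for illustration and are not part of the package.

```php
<?php
// Hypothetical helper (an assumption, not part of the original code):
// returns the word-level n-grams of a sentence.
function extractWordNgrams($sentence, $n)
{
    if ($n < 1) {
        return array();
    }
    // Lowercase, then split on anything that is not a letter, digit or apostrophe.
    $words = preg_split("/[^a-z0-9']+/", strtolower($sentence), -1, PREG_SPLIT_NO_EMPTY);
    $ngrams = array();
    // Slide a window of $n words across the sentence.
    for ($start = 0; $start + $n <= count($words); $start++) {
        $ngrams[] = implode(" ", array_slice($words, $start, $n));
    }
    return $ngrams;
}

// Word bigrams: "its amazing", "amazing plot", "plot was", "was terrific"
print_r(extractWordNgrams("its amazing plot was terrific", 2));
```

Word-level n-grams keep short phrases together, which a single-word split would lose.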
The Sentiment Analysis Algorithm
Like all other branches of machine learning, sentiment analysis needs training: the algorithm must be given data sets that show it what a positive sentiment, a negative sentiment and a neutral sentiment look like.
These training data sets can be stored in a plain text file, a binary data file or a database, as long as the algorithm can read and understand the format.
In PHP, this can easily be done by reading a text file, splitting the sentences into words and temporarily storing them in an array.
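The file-reading step could be sketched roughly as follows. The helper name loadTrainingWords, the one-file-per-category layout and the tokenization rule are all assumptions made for illustration. A temporary file stands in for a real training file so the example runs on its own.

```php
<?php
// Hypothetical helper (an assumption for illustration): reads a plain text
// training file and returns its words as an array.
function loadTrainingWords($path)
{
    if (!is_readable($path)) {
        return array();
    }
    $text = file_get_contents($path);
    if ($text === false) {
        return array();
    }
    // Lowercase, then split on anything that is not a letter, digit or apostrophe.
    return preg_split("/[^a-z0-9']+/", strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
}

// Demo: a temporary file plays the role of a negative training set.
$path = tempnam(sys_get_temp_dir(), "neg");
file_put_contents($path, "Terrible film. A waste of time!");
$negativeWords = loadTrainingWords($path);
unlink($path);

// Words found: terrible, film, a, waste, of, time
print_r($negativeWords);
```

Duplicates are kept on purpose here, since a frequency-based classifier needs to know how often each word occurs.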
After the training sets are provided in distinct arrays, sentiment analysis can be performed using an SVM or the Naïve Bayes Algorithm to score a sentence based on the distribution of the words in it.
For instance, let's assume we have arrays of sentiment words that look like this:
$negativeData = array("evil", "terrible", "bad", "hate", "waste", "unsatisfied");
$positiveData = array("great", "lovely", "amazing", "melodramatic", "idealistic");
If the sentence "Occasionally melodramatic, it's also extremely effective. Although it contains pints of evil, its amazing plot was terrific" were to be analyzed, it would be split into words, and each word would be associated with the group whose training array contains it.
Then, depending on the probability of positive and negative words, the whole sentence would be scored as positive or negative.
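To make the idea concrete, here is a deliberately simplified sketch that counts matches against the two arrays instead of computing real probabilities. It is not an SVM or a Naïve Bayes classifier. The helper name scoreSentence and the "neutral" result for ties are assumptions made for illustration.

```php
<?php
// Training lists taken from the example above.
$negativeData = array("evil", "terrible", "bad", "hate", "waste", "unsatisfied");
$positiveData = array("great", "lovely", "amazing", "melodramatic", "idealistic");

// Hypothetical helper (an assumption): labels a sentence by counting how many
// of its words appear in each training list.
function scoreSentence($sentence, array $positiveData, array $negativeData)
{
    $words = preg_split("/[^a-z0-9']+/", strtolower($sentence), -1, PREG_SPLIT_NO_EMPTY);
    $positive = 0;
    $negative = 0;
    foreach ($words as $word) {
        if (in_array($word, $positiveData, true)) {
            $positive++;
        } elseif (in_array($word, $negativeData, true)) {
            $negative++;
        }
    }
    if ($positive === $negative) {
        return "neutral";
    }
    return ($positive > $negative) ? "positive" : "negative";
}

$sentence = "Occasionally melodramatic, it's also extremely effective. "
    . "Although it contains pints of evil, its amazing plot was terrific";

// "melodramatic" and "amazing" are positive, "evil" is negative: prints "positive"
echo scoreSentence($sentence, $positiveData, $negativeData), "\n";
```

Words that appear in neither list, such as "terrific", are simply ignored, which is one reason a larger training set matters.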
The PHP Sentiment Analyzer Class
I have made it easy for you to get a working sentiment analyzer by writing this Sentiment Analyzer class.
The class has built-in methods for accepting your training data in a variety of formats, and another for performing sentiment analysis on sentences or a whole document.
It is a complete sentiment analyzer written in PHP that saves you the stress of having to write your own. It uses the Naïve Bayes polarity classifier algorithm and comes with a very large predefined training set.
It is also fully customizable, and you can add your own training data to it. The algorithm also learns interactively, so it does not depend only on the supplied training data set.
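For readers curious about what a Naïve Bayes polarity classifier does in general, here is a minimal sketch. It is not the code of the Sentiment Analyzer class and does not use its API. The function names (tokenize, trainNaiveBayes, classifyNaiveBayes), the model layout and the tiny training sentences are all assumptions made for illustration.

```php
<?php
// Hypothetical sketch of a multinomial Naive Bayes polarity classifier.
// This is NOT the PHP Sentiment Analyzer class; all names are assumptions.
function tokenize($text)
{
    return preg_split("/[^a-z0-9']+/", strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
}

// $samples maps a label (e.g. "positive") to a list of training sentences.
function trainNaiveBayes(array $samples)
{
    $model = array("wordCounts" => array(), "totals" => array(), "docs" => array(), "vocab" => array());
    foreach ($samples as $label => $sentences) {
        $model["wordCounts"][$label] = array();
        $model["totals"][$label] = 0;
        $model["docs"][$label] = count($sentences);
        foreach ($sentences as $sentence) {
            foreach (tokenize($sentence) as $word) {
                if (!isset($model["wordCounts"][$label][$word])) {
                    $model["wordCounts"][$label][$word] = 0;
                }
                $model["wordCounts"][$label][$word]++;
                $model["totals"][$label]++;
                $model["vocab"][$word] = true;
            }
        }
    }
    return $model;
}

function classifyNaiveBayes(array $model, $text)
{
    $vocabSize = count($model["vocab"]);
    $docTotal = array_sum($model["docs"]);
    $words = tokenize($text);
    $bestLabel = null;
    $bestScore = -INF;
    foreach ($model["docs"] as $label => $docCount) {
        // Log prior: share of training sentences that carry this label.
        $score = log($docCount / $docTotal);
        foreach ($words as $word) {
            $count = isset($model["wordCounts"][$label][$word]) ? $model["wordCounts"][$label][$word] : 0;
            // Add-one (Laplace) smoothing so unseen words do not zero out the score.
            $score += log(($count + 1) / ($model["totals"][$label] + $vocabSize));
        }
        if ($score > $bestScore) {
            $bestScore = $score;
            $bestLabel = $label;
        }
    }
    return $bestLabel;
}

$model = trainNaiveBayes(array(
    "positive" => array("a great and lovely film", "amazing plot"),
    "negative" => array("a terrible waste of time", "I hate this bad film"),
));

// Prints "positive" for this toy training set.
echo classifyNaiveBayes($model, "what a lovely, amazing film"), "\n";
```

Working with sums of logarithms rather than products of probabilities avoids numeric underflow on longer texts.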
Conclusion
In this article we learned how sentiment analysis works and looked at a PHP class that you can use to implement it in your PHP projects.
In the next part of the article you can learn how to implement it in practice with real code samples.
If you liked this article or you have a question, post a comment here.
Comments:
5. Can you use weighting on the words? - martin barker (2016-05-26 10:15)
would it be possible to weight the words in this method... - 0 replies
4. Sentiment Analysis is Eye Opening! - derek (2015-09-14 01:27)
a new form of thinking on uses and logic... - 2 replies
3. Nice intro, but missing something - Stephan (2015-09-01 07:29)
Nice intro, but missing something... - 0 replies
2. Great for PHP - Israel Vázquez Morales (2015-08-27 16:58)
Quick question... - 3 replies
1. very usefull - tinashe zulu (2015-08-25 16:39)
a great article... - 1 reply