PHP Classes

PHP NGram Comparator: Compare strings to find the level of similarity

  Author  
Name: JImmy Bo
Classes: 12 packages
Country: United States
Innovation award
Nominee: 6x
Winner: 1x


  Detailed description  
This package can compare strings to find the level of similarity.

It can take a string and parse it to get the shingles and n-gram words in an array.

The package can also compare the respective ngram word arrays of two strings and return the level of similarity as a percentage.

It can also compare two strings and return the number of ngram words that match.
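
The two comparisons described above can be sketched in a few lines. The page does not say which formula the package uses for its percentage, so this sketch assumes a Jaccard ratio (shared n-grams divided by all distinct n-grams) over lowercased word n-grams. The function names `sketch_ngram_set` and `sketch_ngram_similarity` are hypothetical and are not part of the package.

```php
<?php
// Hypothetical helper (not the package's API): the set of word n-grams in a text.
function sketch_ngram_set(string $text, int $n): array
{
    if ($n < 1) {
        return [];
    }
    // Lowercase, then split on runs of non-word characters (ASCII-oriented).
    $words = preg_split('/\W+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    $set = [];
    for ($i = 0; $i + $n <= count($words); $i++) {
        // Store each n-gram as an array key so duplicates collapse.
        $set[implode(' ', array_slice($words, $i, $n))] = true;
    }
    return $set;
}

// Hypothetical helper: number of shared n-grams plus an ASSUMED Jaccard percentage.
function sketch_ngram_similarity(string $a, string $b, int $n = 2): array
{
    $setA = sketch_ngram_set($a, $n);
    $setB = sketch_ngram_set($b, $n);
    $shared = count(array_intersect_key($setA, $setB)); // n-grams present in both
    $union = count($setA + $setB);                      // distinct n-grams overall
    $percent = $union === 0 ? 0.0 : 100.0 * $shared / $union;
    return ['shared' => $shared, 'percent' => $percent];
}

$result = sketch_ngram_similarity(
    'the quick brown fox jumps',
    'the quick brown dog jumps',
    2
);
printf("%d shared bigrams, %.1f%% similar\n", $result['shared'], $result['percent']);
// prints "2 shared bigrams, 33.3% similar"
```

Changing a single word removes every n-gram that contains it, which is why one substitution in a five-word sentence already drops the bigram score to a third. A smaller n is more forgiving; a larger n rewards longer identical runs.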

The package can also take the word arrays of two phrases and generate arrays suitable for training language models.

N-grams are contiguous sequences of n items from a given sample of text.

Shingles are overlapping sequences of words.
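
To make the two definitions concrete, here is a minimal sketch that builds word shingles and character n-grams. It is an illustration only: `sketch_word_shingles` and `sketch_char_ngrams` are hypothetical names, and the package's own get_shingles and get_ngrams methods may tokenize differently.

```php
<?php
// Hypothetical helper: overlapping sequences of $size consecutive words.
function sketch_word_shingles(string $text, int $size): array
{
    if ($size < 1) {
        return [];
    }
    $words = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    $shingles = [];
    // Slide a window of $size words across the text, one word at a time.
    for ($i = 0; $i + $size <= count($words); $i++) {
        $shingles[] = implode(' ', array_slice($words, $i, $size));
    }
    return $shingles;
}

// Hypothetical helper: contiguous sequences of $n characters (single-byte text only).
function sketch_char_ngrams(string $text, int $n): array
{
    if ($n < 1) {
        return [];
    }
    $grams = [];
    for ($i = 0; $i + $n <= strlen($text); $i++) {
        $grams[] = substr($text, $i, $n);
    }
    return $grams;
}

echo implode(' | ', sketch_word_shingles('to be or not to be', 2)), "\n";
// prints "to be | be or | or not | not to | to be"
echo implode(' | ', sketch_char_ngrams('ngram', 3)), "\n";
// prints "ngr | gra | ram"
```

Each shingle shares all but one word with its neighbour, and a repeated phrase such as "to be" appears once per occurrence.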

Instructions:

The class includes the following methods:

- get_ngrams($text, $n): Takes a string of text and an integer n, splits the text into n-grams, and returns them as an array.

- compare_strings_ngram_pct($string1, $string2, $n): Takes two strings and an integer n, splits both strings into n-grams, and returns the percentage of n-grams that match.

- compare_strings_ngram_max_size($string1, $string2): Takes two strings, splits both into n-grams of varying lengths, and returns the size of the largest n-gram that the two strings share.

- get_shingles($text, $shingle_size): Takes a string of text and an integer shingle_size, splits the text into shingles of that size, and returns them as an array.

- train_ngram_model($tokenized_text=[], $n=[]): Takes an array of tokenized text and an integer n. It loops through each sentence in the tokenized text, creates n-grams of length n, counts the frequency of each n-gram, and returns an array of n-gram counts.
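
Two of the ideas in this list, counting n-gram frequencies over tokenized sentences and finding the largest shared n-gram size, can be sketched as follows. This is not the package's code: `sketch_count_ngrams` and `sketch_max_shared_ngram_size` are hypothetical names, and the input shape (an array of sentences, each an array of tokens) is an assumption based on the description above.

```php
<?php
// Hypothetical helper: frequency of each n-gram across tokenized sentences.
// ASSUMED input shape: [['the', 'cat', 'sat'], ['the', 'cat', 'ran'], ...]
function sketch_count_ngrams(array $tokenizedSentences, int $n): array
{
    $counts = [];
    if ($n < 1) {
        return $counts;
    }
    foreach ($tokenizedSentences as $tokens) {
        // N-grams never span two sentences because each sentence is windowed separately.
        for ($i = 0; $i + $n <= count($tokens); $i++) {
            $key = implode(' ', array_slice($tokens, $i, $n));
            $counts[$key] = ($counts[$key] ?? 0) + 1;
        }
    }
    return $counts;
}

// Hypothetical helper: largest n for which both strings share a word n-gram.
function sketch_max_shared_ngram_size(string $a, string $b): int
{
    $wordsA = preg_split('/\s+/', strtolower(trim($a)), -1, PREG_SPLIT_NO_EMPTY);
    $wordsB = preg_split('/\s+/', strtolower(trim($b)), -1, PREG_SPLIT_NO_EMPTY);
    $best = 0;
    $limit = min(count($wordsA), count($wordsB));
    for ($n = 1; $n <= $limit; $n++) {
        // Index the n-grams of the first string for constant-time lookup.
        $seen = [];
        for ($i = 0; $i + $n <= count($wordsA); $i++) {
            $seen[implode(' ', array_slice($wordsA, $i, $n))] = true;
        }
        $found = false;
        for ($i = 0; $i + $n <= count($wordsB); $i++) {
            if (isset($seen[implode(' ', array_slice($wordsB, $i, $n))])) {
                $found = true;
                break;
            }
        }
        if (!$found) {
            break; // no shared n-gram of size n means none of size n + 1 either
        }
        $best = $n;
    }
    return $best;
}

$counts = sketch_count_ngrams(
    [['the', 'cat', 'sat'], ['the', 'cat', 'ran'], ['a', 'dog', 'sat']],
    2
);
echo $counts['the cat'], "\n"; // prints "2"

echo sketch_max_shared_ngram_size(
    'the quick brown fox jumps',
    'a quick brown fox sleeps'
), "\n"; // prints "3"
```

The early exit in the second function relies on the fact that any shared n-gram of size n + 1 contains a shared n-gram of size n, so the search can stop at the first size with no match.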

Name: PHP NGram Comparator
Base name: phpngram
Description: Compare strings to find the level of similarity
Version: 1.0.0
PHP version: 7
License: BSD License
 
  Groups  
Algorithms: Numerical and statistical algorithms
Statistics: Collecting, processing and reporting statistical data
Searching: Search engines, crawling and indexing
Text processing: Manipulating and validating text data
Artificial intelligence: Automation of tasks using human-like intelligence
Parsers: Programming language interpreters and format parsers
PHP 7: Classes using PHP 7 specific features


  Innovation Award  
PHP Programming Innovation award nominee
May 2023
Number 3
Humans use languages to talk to each other. They often form sentences that have the same meaning even though the sentences use different words.

When people ask a software application questions, the software needs to recognize the different ways people express the same question.

This package can parse sentences so that an application can determine that one question is very similar to another that asks about the same problem.

This way, the package can serve as the base of artificial intelligence applications that understand what humans are asking in specific languages.

Manuel Lemos

  Applications that use this package  
No pages of applications that use this class were specified.

  Files  
File                 Role      Description
class.ngram.php      Class     N-Gram Comparison and Shingling in PHP Class
example.ngram.php    Example   Example usage
