Guaranix Full Text: Index text documents for full text searching

Last Updated: 2006-10-17 (12 years ago)
Ratings: 69%
Unique User Downloads: Total: 2,641
Download Rankings: All time: 1,433, This week: 506 (up)
Version: guaranix 1.0.0
License: GNU General Publi...
Categories: Databases, Searching, Text processing
Description Author

This package can be used to index texts for full text searching.

It can build indexes of documents with support for word stemming (in Spanish or English), document language categorization, phrase search, etc.

The class can use a database as a repository to store the index. Currently it supports SQLite and MySQL.
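
As a rough illustration of what indexing texts for full text searching involves, the sketch below builds a small in-memory inverted index (a map from each word to the documents that contain it) and looks up documents that contain every query word. It is not the code used by this package: the function names tokenize, buildIndex and searchAll are hypothetical, plain PHP arrays stand in for the SQLite or MySQL repository, and stemming is left out.

```php
<?php
// Hypothetical helper: lowercases text and splits it into word tokens.
function tokenize(string $text): array
{
    return preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
}

// Builds a word => list of document ids map (an inverted index)
// from an array of id => text pairs.
function buildIndex(array $documents): array
{
    $index = array();
    foreach ($documents as $id => $text) {
        // array_unique() keeps each document at most once per word.
        foreach (array_unique(tokenize($text)) as $word) {
            $index[$word][] = $id;
        }
    }
    return $index;
}

// Returns the ids of the documents that contain every word of the query.
function searchAll(array $index, string $query): array
{
    $result = null;
    foreach (tokenize($query) as $word) {
        $ids = isset($index[$word]) ? $index[$word] : array();
        $result = ($result === null) ? $ids : array_values(array_intersect($result, $ids));
    }
    return ($result === null) ? array() : $result;
}

$docs = array(
    'title 1' => 'This is a text for index',
    'title 2' => 'This is a text number 2 for index',
);
$index = buildIndex($docs);
print_r(searchAll($index, 'text number'));
```

A database-backed index would typically store the same word-to-document pairs as rows in a table, so that lookups do not require loading every document.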

Innovation Award
PHP Programming Innovation award nominee, August 2006, Number 5
Prize: One subscription to the PDF edition of the magazine by PHP Architect

Full text search is a capability of applications that can search databases of documents using keywords that may be contained in those documents.

Full text search is a feature already built into some relational database servers. However, full text search may also be performed on documents that are not stored in a relational database.

This package provides a solution to index and search documents that may not be stored in a relational database, or to search documents stored in database servers that do not provide built-in full text search capabilities.

Manuel Lemos
Name: Cesar D. Rodas
Classes: 39 packages
Country: Paraguay
Innovation award
Nominee: 25x
Winner: 5x

Details
"PHPGnix is a beta project; it is not for production use!"

PHPGnix is a class that lets you index texts for full text search without worrying about the database handler. The goal of PHPGnix is to be a simple and fast search system. To index something, you just need to do this:

include "gnix.php";
// Connection settings for the MySQL handler
$gnix = new Gnix(array("db" => "phpgnix", "host" => "localhost", "user" => "root", "pass" => ""), MYSQL);
$gnix->install(); /* Run this only once, the first time */
// Documents to index, keyed by title
$texto = array();
$texto['title 1'] = "This is a text for index";
$texto['title 2'] = "This is a text number 2 for index";
$gnix->Index($texto);

The databases currently supported are MySQL and SQLite (please contribute handlers for other databases such as PostgreSQL).

To search, you have to do something like this:

include "gnix.php";
// Same connection settings that were used for indexing
$connstring = array("db" => "phpgnix", "host" => "localhost", "user" => "root", "pass" => "");
$query = new Gnix($connstring, MYSQL);
// $salida receives the search results
$salida = $query->search('"web search" google OR altavista NOT msn');

Editing indexed documents is not supported yet, but it will be soon.

Queries that contain OR perform slowly, and I don't know why. All the words between double quotes (" ") are treated as a phrase.
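
To make the query syntax concrete, here is a minimal sketch of how such a query string could be split into phrases, operators and plain words. It is only an illustration and not the code in tokenizer.php: the function name tokenizeQuery and the token array layout are hypothetical, and it assumes that OR and NOT are recognized only when written in upper case, as in the example query.

```php
<?php
// Hypothetical sketch: splits a query into phrase, operator and word tokens.
function tokenizeQuery(string $query): array
{
    $tokens = array();
    // Match either a double-quoted phrase or a run of non-space characters.
    preg_match_all('/"([^"]*)"|(\S+)/', $query, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        if (isset($m[2]) && $m[2] !== '') {
            // Unquoted term: either an operator or a plain word.
            if ($m[2] === 'OR' || $m[2] === 'NOT') {
                $tokens[] = array('type' => 'operator', 'value' => $m[2]);
            } else {
                $tokens[] = array('type' => 'word', 'value' => strtolower($m[2]));
            }
        } elseif (isset($m[1]) && $m[1] !== '') {
            // Quoted text: kept together as a single phrase token.
            $tokens[] = array('type' => 'phrase', 'value' => strtolower($m[1]));
        }
    }
    return $tokens;
}

$tokens = tokenizeQuery('"web search" google OR altavista NOT msn');
foreach ($tokens as $token) {
    echo $token['type'], ': ', $token['value'], "\n";
}
```

Keeping the phrase as one token lets a later stage check that its words appear next to each other, instead of matching them independently.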

Please, people, contribute to make this a better project! Much is still missing, such as a NEAR command, a better ranking algorithm, a stopword list, stemming for other languages (http://snowball.tartarus.org/index.php), and more.
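
For anyone who wants to contribute the stopword list, the sketch below shows one simple way it could work: very common words are dropped from the token list before the text is indexed. Everything here is an assumption made for illustration, including the function name removeStopwords and the tiny English word list; a real list would be much longer and would exist per language.

```php
<?php
// Hypothetical stopword list; a real one would be longer and per language.
$stopwords = array('a', 'is', 'for', 'the', 'this');

// Removes stopwords from a list of lowercase tokens before they are indexed.
function removeStopwords(array $tokens, array $stopwords): array
{
    // array_diff() keeps the original keys, so reindex with array_values().
    return array_values(array_diff($tokens, $stopwords));
}

// Split a sample text into lowercase word tokens.
$tokens = preg_split('/[^a-z0-9]+/', strtolower('This is a text for index'), -1, PREG_SPLIT_NO_EMPTY);
print_r(removeStopwords($tokens, $stopwords));
```

The same filter would have to be applied to the words of a search query, otherwise a query containing a stopword could never match.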

Please send contributions to saddorNOTSPAM (AT) gmail (dot) com [remove the NOTSPAM].

Thanks.

Once this project has been optimized with your help, I will translate it to C.

Thanks once more to all.

Files
File              Role      Description
libtextcat (7 files)
stemmer (2 files)
db.mysql          Data      MySQL install file
db.sqlite         Data      SQLite install file
gnix.php          Class     The main class of this project
mysql.php         Class     MySQL handler
phpIndexer.php    Example   Test of indexing file
README            Doc.      Readme! this is important
search.php        Example   The search test
sqlite.php        Class     SQLite handler
tokenizer.php     Class     The Search Query Tokenizer

Files / libtextcat
File                    Role      Description
english.lm              Data      English
french.lm               Data      French
german.lm               Data      German
italian.lm              Data      Italian
saddorlibtextcat.php    Class     The LibTextCat
spanish.lm              Data      Spanish
test.php                Example   The test file of libtextcat

Files / stemmer
File           Role    Description
english.php    Class   English Stemmer
spanish.php    Class   Spanish Stemmer

Version Control: 0%
Unique User Downloads: Total: 2,641, This week: 0
Download Rankings: All time: 1,433, This week: 506 (up)

User Ratings (all time)
Utility: 91%
Consistency: 83%
Documentation: 75%
Examples: 83%
Tests: -
Videos: -
Overall: 69%
Rank: 416