PHP Classes

File: lib/data/src/README

Role: Documentation
Content type: text/plain
Description: Documentation
Class: PHP Language Detector
Detect the language of a text automatically
Author: Juanjo López
Date: 12 years ago
Size: 885 bytes
 

Contents

Source files directory for the trainer
--------------------------------------

Create one directory for each language you want to model, and name it with
the identifier for that language.

ISO 639-1 language codes (es, en, de, fr, etc.) are recommended, but any
name works (spanish, english, german, french, ...).

The trainer uses the directory name verbatim as the language identifier. If
the directory holding the German training data is named "alemán", the
library will identify such texts as "alemán", not "german" or "de".

Copy sample texts for the language into its directory. All of them must be
encoded in UTF-8 and stored as plain text files with a .txt extension (or
.txt.gz to save space).

After the trainer runs, the model for each language is saved in the "model"
directory.

Good luck
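
The layout rules above can be checked before running the trainer. The
following is a minimal sketch in Python, written independently of the
library: the function name check_training_dir and the wording of its
messages are assumptions made for illustration, and nothing here is part of
PHP Language Detector itself. It only tests what this README states: one
subdirectory per language, files named *.txt or *.txt.gz, and content that
decodes as UTF-8.

```python
import gzip
import zlib
from pathlib import Path


def check_training_dir(src_dir):
    """Return a list of problems found in a trainer source directory.

    Hypothetical helper: assumes one subdirectory per language, each
    holding UTF-8 plain-text samples named *.txt or *.txt.gz.
    """
    problems = []
    # Each subdirectory name is what the trainer would use as the identifier.
    for lang_dir in sorted(p for p in Path(src_dir).iterdir() if p.is_dir()):
        samples = [f for f in sorted(lang_dir.iterdir()) if f.is_file()]
        if not samples:
            problems.append(f"{lang_dir.name}: no sample files")
        for sample in samples:
            label = f"{lang_dir.name}/{sample.name}"
            if sample.name.endswith(".txt.gz"):
                opener = gzip.open
            elif sample.name.endswith(".txt"):
                opener = open
            else:
                problems.append(f"{label}: unexpected extension")
                continue
            try:
                # Read raw bytes and decode strictly to catch non-UTF-8 data.
                with opener(sample, "rb") as handle:
                    handle.read().decode("utf-8")
            except (UnicodeDecodeError, OSError, EOFError, zlib.error) as exc:
                problems.append(f"{label}: {exc.__class__.__name__}")
    return problems
```

Calling check_training_dir on the source directory returns an empty list
when every language directory follows the rules. Otherwise each entry names
the offending directory or file, so the sample can be re-encoded or renamed
before training.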