Last updated: 2016-12-20
Version: language-detection 1.3 | License: Custom (specified... | PHP version: 7.1 | Categories: Localization, Algorithms, Text proces..., A..., P...
Description: This package can detect the language of a given text string.
Detects the language of a given text.
To do that, it generates an N-gram-based language profile for every file in the etc directory.
It then generates the same kind of profile for the unknown text and compares it against the previously generated language profiles.
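The profile comparison described above can be sketched in a few lines. This is a minimal illustration, not the package's actual code: the function names `ngramProfile`, `outOfPlaceDistance` and `closestLanguage` are hypothetical, and the "out-of-place" rank distance is an assumed metric, since the text above does not say how the profiles are compared.

```php
<?php
declare(strict_types=1);

/**
 * Build a ranked N-gram profile (most frequent N-grams first).
 * Hypothetical helper, not part of the package's API.
 */
function ngramProfile(string $text, int $minN = 1, int $maxN = 3, int $limit = 300): array
{
    $counts = [];
    // Lowercase the text and split it on anything that is not a letter.
    $words = preg_split('/[^\pL]+/u', mb_strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($words as $word) {
        $padded = '_' . $word . '_'; // mark word boundaries
        $len = mb_strlen($padded);
        for ($n = $minN; $n <= $maxN; $n++) {
            for ($i = 0; $i + $n <= $len; $i++) {
                $gram = mb_substr($padded, $i, $n);
                $counts[$gram] = ($counts[$gram] ?? 0) + 1;
            }
        }
    }
    // Sort by count (descending), then by N-gram (ascending) so ties are deterministic.
    uksort($counts, function ($a, $b) use ($counts) {
        return [$counts[$b], $a] <=> [$counts[$a], $b];
    });
    return array_slice(array_keys($counts), 0, $limit);
}

/**
 * Assumed metric: sum of rank differences between two profiles.
 * N-grams missing from the reference get the maximum penalty.
 */
function outOfPlaceDistance(array $unknown, array $reference): int
{
    $rank = array_flip($reference); // N-gram => rank in the reference profile
    $max = count($reference);
    $distance = 0;
    foreach ($unknown as $i => $gram) {
        $distance += isset($rank[$gram]) ? abs($rank[$gram] - $i) : $max;
    }
    return $distance;
}

/** Return the key of the profile that is closest to the text. */
function closestLanguage(string $text, array $profiles): string
{
    $unknown = ngramProfile($text);
    $best = '';
    $bestDistance = PHP_INT_MAX;
    foreach ($profiles as $lang => $profile) {
        $distance = outOfPlaceDistance($unknown, $profile);
        if ($distance < $bestDistance) {
            $best = $lang;
            $bestDistance = $distance;
        }
    }
    return $best;
}

// Toy profiles built from one sentence each; real profiles need far more text.
$profiles = [
    'de' => ngramProfile('Das ist ein deutscher Satz und noch ein weiterer deutscher Satz.'),
    'en' => ngramProfile('This is an English sentence and yet another English sentence.'),
];
echo closestLanguage('Ein Satz ist das.', $profiles), PHP_EOL;
```

The sketch relies on the `mb_*` functions so that multi-byte characters count as single characters when N-grams are cut.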
The only requirement is PHP 7.1 or later.
> Note: language_detection requires the Multibyte String extension in order to work.
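If you want to verify these requirements at runtime, a check along the following lines may help. It is only a sketch: `unmetRequirements` is a hypothetical helper that is not shipped with the package, and its parameters exist solely to make it easy to test.

```php
<?php
declare(strict_types=1);

/**
 * Return a list of unmet requirements (empty when everything is fine).
 * Hypothetical helper, not part of the package.
 */
function unmetRequirements(string $phpVersion = PHP_VERSION, ?bool $hasMbstring = null): array
{
    // Fall back to the real environment when no value is injected.
    $hasMbstring = $hasMbstring ?? extension_loaded('mbstring');
    $problems = [];
    if (version_compare($phpVersion, '7.1.0', '<')) {
        $problems[] = "PHP >= 7.1 is required, found {$phpVersion}";
    }
    if (!$hasMbstring) {
        $problems[] = 'The Multibyte String (mbstring) extension is not loaded';
    }
    return $problems;
}

// Report anything that is missing in the current environment.
foreach (unmetRequirements() as $problem) {
    echo $problem, PHP_EOL;
}
```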
composer require patrick-schur/language-detection
Or add the following to composer.json:
{
    "require": {
        "patrick-schur/language-detection": "*"
    }
}
Before we can recognize the language of a given text, we have to generate a language profile for each language.
The package ships with a pre-trained language profile (etc/_langs.json).
You can also add new files to etc or change existing ones.
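To see which languages a training directory covers, you could list its text files. The sketch below assumes one `<code>.txt` file per language, as in the etc directory listed further down; `languageCodes` is a hypothetical helper, and the demo uses a temporary directory rather than the package's own etc folder.

```php
<?php
declare(strict_types=1);

/**
 * List the language codes found in a training directory,
 * assuming one "<code>.txt" file per language.
 * Hypothetical helper, not part of the package.
 */
function languageCodes(string $dir): array
{
    $codes = [];
    // glob() may return false on error, so fall back to an empty list.
    foreach (glob(rtrim($dir, '/') . '/*.txt') ?: [] as $file) {
        $codes[] = basename($file, '.txt');
    }
    sort($codes);
    return $codes;
}

// Demo: a temporary directory stands in for etc.
$dir = sys_get_temp_dir() . '/etc_demo_' . uniqid();
mkdir($dir);
file_put_contents($dir . '/de.txt', 'Das ist ein deutscher Satz.');
file_put_contents($dir . '/en.txt', 'This is an English sentence.');
file_put_contents($dir . '/_langs.json', '{}'); // ignored: not a .txt file
print_r(languageCodes($dir));
```

After adding or changing a file, the language profile has to be generated again.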
First we have to generate a language profile.
require_once 'vendor/autoload.php';
use LanguageDetector\Trainer;
$t = new Trainer;
$t->learn();
Once we have our language profile, we can classify texts by their language. To detect the language correctly, the input text should be at least a few sentences long.
require_once 'vendor/autoload.php';
use LanguageDetector\LanguageDetector;
$ld = new LanguageDetector;
var_dump($ld->detect('Das ist ein deutscher Satz.')); // de
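Because short inputs are detected less reliably, it can be useful to check the length of a text before classifying it. The following sketch counts sentences with a simple regular expression; `sentenceCount` and `isLongEnough` are hypothetical helpers, and the default threshold of three sentences is an assumption rather than a value taken from the package.

```php
<?php
declare(strict_types=1);

/**
 * Rough sentence count: split after ".", "!" or "?" followed by whitespace.
 * Hypothetical helper, not part of the package.
 */
function sentenceCount(string $text): int
{
    $parts = preg_split('/(?<=[.!?])\s+/u', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    return $parts === false ? 0 : count($parts);
}

/** Decide whether a text is long enough to be worth classifying. */
function isLongEnough(string $text, int $minSentences = 3): bool
{
    return sentenceCount($text) >= $minSentences;
}

$text = 'Das ist ein deutscher Satz.';
if (!isLongEnough($text)) {
    echo 'The input is short, so the detected language may be unreliable.', PHP_EOL;
}
```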
It currently supports 73 languages. If your language is not supported, feel free to add your own language files.
Files
File | Role | Description
---|---|---
etc (74 files) | |
src (1 directory) | |
tests (3 files) | |
.travis.yml | Data | Auxiliary data
composer.json | Data | Auxiliary data
LICENSE.md | Lic. | License text
phpunit.xml | Data | Auxiliary data
README.md | Doc. | Documentation
Files / etc
File | Role | Description |
---|---|---|
ab.txt | Doc. | Documentation |
af.txt | Doc. | Documentation |
am.txt | Doc. | Documentation |
ar.txt | Doc. | Documentation |
az.txt | Doc. | Documentation |
be.txt | Doc. | Documentation |
bg.txt | Doc. | Documentation |
bn.txt | Doc. | Documentation |
co.txt | Doc. | Documentation |
cs.txt | Doc. | Documentation |
cy.txt | Doc. | Documentation |
de.txt | Doc. | Documentation |
dk.txt | Doc. | Documentation |
el.txt | Doc. | Documentation |
en.txt | Doc. | Documentation |
eo.txt | Doc. | Documentation |
es.txt | Doc. | Documentation |
et.txt | Doc. | Documentation |
eu.txt | Doc. | Documentation |
fa.txt | Doc. | Documentation |
fi.txt | Doc. | Documentation |
fj.txt | Doc. | Documentation |
fo.txt | Doc. | Documentation |
fr.txt | Doc. | Documentation |
ga.txt | Doc. | Documentation |
gd.txt | Doc. | Documentation |
gl.txt | Doc. | Documentation |
gn.txt | Doc. | Documentation |
ha.txt | Doc. | Documentation |
he.txt | Doc. | Documentation |
hi.txt | Doc. | Documentation |
hr.txt | Doc. | Documentation |
hu.txt | Doc. | Documentation |
hy.txt | Doc. | Documentation |
ia.txt | Doc. | Documentation |
ig.txt | Doc. | Documentation |
io.txt | Doc. | Documentation |
is.txt | Doc. | Documentation |
it.txt | Doc. | Documentation |
iu.txt | Doc. | Documentation |
jp.txt | Doc. | Documentation |
jv.txt | Doc. | Documentation |
ka.txt | Doc. | Documentation |
ko.txt | Doc. | Documentation |
ku.txt | Doc. | Documentation |
la.txt | Doc. | Documentation |
lg.txt | Doc. | Documentation |
lo.txt | Doc. | Documentation |
lt.txt | Doc. | Documentation |
lv.txt | Doc. | Documentation |
mh.txt | Doc. | Documentation |
mn.txt | Doc. | Documentation |
ms.txt | Doc. | Documentation |
mt.txt | Doc. | Documentation |
nl.txt | Doc. | Documentation |
no.txt | Doc. | Documentation |
nv.txt | Doc. | Documentation |
pl.txt | Doc. | Documentation |
pt.txt | Doc. | Documentation |
ro.txt | Doc. | Documentation |
ru.txt | Doc. | Documentation |
sk.txt | Doc. | Documentation |
sl.txt | Doc. | Documentation |
so.txt | Doc. | Documentation |
sv.txt | Doc. | Documentation |
th.txt | Doc. | Documentation |
tr.txt | Doc. | Documentation |
ty.txt | Doc. | Documentation |
ug.txt | Doc. | Documentation |
uk.txt | Doc. | Documentation |
uz.txt | Doc. | Documentation |
vi.txt | Doc. | Documentation |
zh.txt | Doc. | Documentation |
_langs.json | Data | Auxiliary data |
Files / src / LanguageDetector
File | Role | Description
---|---|---
Tokenizer (3 files) | |
LanguageDetector.php | Class | Class source
NGramParser.php | Class | Class source
Trainer.php | Class | Class source
Files / src / LanguageDetector / Tokenizer
File | Role | Description |
---|---|---|
Tokenizer.php | Class | Class source |
TokenizerInterface.php | Class | Class source |
WordTokenizer.php | Class | Class source |
Files / tests
File | Role | Description |
---|---|---|
LanguageDetectorTest.php | Class | Class source |
NGramParserTest.php | Class | Class source |
TrainerTest.php | Class | Class source |