File: testInTextFile.php


File:	`testInTextFile.php`
Role:	Example script
Content type:	`text/plain`
Description:	Test script
Class:	PHP Word Frequency Analysis: Extract text frequent terms of two or more words
Author:	Alejandro Mitrou
Last change:
Date:	10 years ago
Size:	`1,113 bytes`



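<?php

  require_once 'frequentTermsAnalyzer.php';

  // Build a general corpus from several articles. A space is added between
  // files so the last word of one is not fused with the first word of the next.
  $generalTextFile  = file_get_contents('data/wikipedia_new_york_city.txt');
  $generalTextFile .= ' ' . file_get_contents('data/wikipedia_social_media.txt');
  $generalTextFile .= ' ' . file_get_contents('data/wikipedia_personal_finance.txt');
  $generalTextFile .= ' ' . file_get_contents('data/wikipedia_barbicue.txt');

  // The frequent words of the general corpus become the exclusion list
  // passed to the second analyzer below.
  $candidates    = explode(' ', vacuumCleaner($generalTextFile));
  $analyzer      = new frequentTermsAnalyzer($candidates);
  $excludedWords = $analyzer->getFrequentWords();
  #print_r($excludedWords);

  // Analyze one particular file, excluding the general frequent words.
  $dataFile           = 'data/wikipedia_personal_finance.txt';
  $particularTextFile = file_get_contents($dataFile);
  $candidates         = explode(' ', vacuumCleaner($particularTextFile));
  $analyzer           = new frequentTermsAnalyzer($candidates, $excludedWords);
  $compoundWords      = $analyzer->getCompoundTerms();

  print "Processing data file: ". $dataFile ."\n";
  print_r($compoundWords);

  // Lowercase the text and reduce it to letters separated by single spaces.
  function vacuumCleaner($str){
    $str = strtolower($str);
    // Keep whitespace here so that words separated only by a newline or tab
    // are not fused together; it is collapsed to single spaces below.
    $str = preg_replace('/[^a-z\s]/', '', $str);
    // trim() avoids empty tokens at either end when the result is exploded.
    return trim(preg_replace('/\s+/', ' ', $str));
  }

?>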
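
The cleaning and splitting step that feeds the analyzer can be tried on its own, without the data files or the class. The sketch below is only an illustration: `tokenizeSample` is a hypothetical helper, not part of the script or of the class, and it makes a slightly different choice from `vacuumCleaner` by turning every non-letter into a space instead of deleting it, so that words separated only by punctuation or a newline stay apart.

```php
<?php
// Hypothetical helper (an assumption for illustration, not in the original
// script): lowercases the text, replaces every run of non-letters with a
// space, and splits the result into word tokens.
function tokenizeSample(string $text): array
{
    $text = strtolower($text);
    // Anything outside a-z (punctuation, digits, newlines) becomes a space.
    $text = preg_replace('/[^a-z]+/', ' ', $text);
    // PREG_SPLIT_NO_EMPTY drops the empty strings that leading or trailing
    // spaces would otherwise produce.
    return preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
}

// The newline after the comma still separates "city" from "the".
print_r(tokenizeSample("New York City,\nthe most populous city."));
```

An array built this way could be passed to the analyzer in place of the `explode(' ', vacuumCleaner(...))` result, assuming the class only needs a flat list of lowercase words. Note that replacing punctuation with a space splits contractions such as "don't" into two tokens, whereas deleting it would produce "dont".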
