
File: htdig_setup_configuration.php

Role: Example script
Content type: text/plain
Description: Example script to set up an ht://Dig configuration file.
Class: Htdig site indexing and searching interface
Interface with the ht://Dig indexing and search engine.
Author: Manuel Lemos
Last change: - Used an external file to define ht://Dig configuration options.
- Added a setting to specify the time to wait between each page that is crawled.
- Set the path of template files.
Date: 2005-02-07 22:34
Size: 2,814 bytes
 

Contents

<?php
/*
 * htdig_setup_configuration.php
 *
 * Purpose: create a configuration file for use by Htdig programs.
 *
 * Run this script from the command line using the PHP standalone CGI
 * executable program.
 *
 * @(#) $Header: /home/mlemos/cvsroot/htdiginterface/htdig_setup_configuration.php,v 1.5 2005/02/08 06:09:48 mlemos Exp $
 *
 */

    
    require("htdig.php");
    require("configuration.php");

    $htdig=new htdig_class;

    /*
     * Where are the htsearch, htdig, htmerge and htfuzzy executables
     * located? They should all be in the same directory, which does not
     * need to be the original installation directory.
     */
    $htdig->htdig_path=$htdig_path;

    /*
     * Where should this search engine configuration file be stored? It
     * does not need to be in the original htdig installation directory.
     * If you need to index more than one site on your server, run this
     * script once per site, specifying a different configuration file
     * name each time.
     */
    $htdig->configuration=$htdig_configuration_file;

    /*
     * Where should the database files of this search engine be stored?
     * They do not need to be in the original htdig installation directory.
     * If you need to index more than one site on your server, run this
     * script once per site, specifying a different database directory
     * each time.
     */
    $htdig->database_directory=$htdig_database_directory;

    /*
     * Additional options that should be added to the configuration file.
     * Consult the htdig manual to learn about all of them.
     */
    $options=array(

        /*
         * List of one or more URLs where htdig should start digging. It
         * will follow the links contained in the pages at these URLs.
         */
        "start_url"=>$site_url,

        /*
         * List of one or more URLs to which htdig should be restricted
         * when following links.
         */
        "limit_urls"=>$site_url,

        /*
         * List of search algorithms to use and the associated weights that
         * will be used to compute the score of each match.
         */
        "search_algorithm"=>"exact:1 endings:0.5",

        /*
         * List of patterns used to exclude URLs from being indexed.
         */
        "exclude_urls"=>"? browse/ user_options.html search.html",

        /*
         * Number of seconds to wait before proceeding to the next page of
         * the site being crawled.
         */
        "server_wait_time"=>5,

        /*
         * Where the special template files htdig_header.html,
         * htdig_nomatch.html, htdig_syntaxerror.html and htdig_template.html
         * are located. The htdig_class uses these template files to parse
         * the results of the htsearch program. Do not change the template
         * files. Install them in the path specified by this option.
         */
        "template_path"=>"templates"

    );

    /*
     * Generate the configuration file and save it in the path specified
     * by the $htdig->configuration variable.
     */
    $error=$htdig->GenerateConfiguration($options);
    if(strcmp($error,""))
        echo "Error: $error\n";
?>
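
The script above requires configuration.php and reads four variables from it: $htdig_path, $htdig_configuration_file, $htdig_database_directory and $site_url. That file is not shown on this page, so the following is only a minimal sketch of what it might contain. Every path and URL in it is a hypothetical placeholder, not a value from the original package, and should be replaced with the locations used on your own server.

```php
<?php
/*
 * configuration.php (hypothetical example)
 *
 * All values below are placeholders chosen for illustration. Only the
 * variable names come from htdig_setup_configuration.php.
 */

// Directory that contains the htsearch, htdig, htmerge and htfuzzy
// executables (assumed location).
$htdig_path = "/usr/local/bin";

// Configuration file that the setup script will generate (assumed path).
$htdig_configuration_file = "/var/lib/htdig/conf/example.conf";

// Directory where the index database files for this site will be kept
// (assumed path).
$htdig_database_directory = "/var/lib/htdig/db/example";

// URL where crawling starts. The setup script also uses it to restrict
// which links are followed.
$site_url = "http://www.example.com/";
```

To index a second site, one option would be to keep a separate copy of this file with a different configuration file name, database directory and site URL, and run the setup script once against each copy.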