PHP Classes

File: README.md

Role: Documentation
Content type: text/markdown
Description: Documentation
Class: Sorcerer
Scrape Web page content using regular expressions
Author: Gavin Gordon Markowski
Last change: Updated README

Date: 7 years ago
Size: 1,573 bytes
 

Sorcerer

Description

An easy-to-use PHP class for scraping webpages' source code.

Usage

Installation

	$ composer require gavinggordon/sorcerer

Examples

Instantiation

	// Load Composer's autoloader (path assumes this script sits next to vendor/).
	require_once __DIR__ . '/vendor/autoload.php';

	use GGG\Http\Data\Collection\Sorcerer;
	
	$scraper = new Sorcerer();

Configuration

	$url = 'http://www.testurl.com/index.php';
	
	$regexes = [
		'/\<a\s?[^\>]+?\>(.+)\<\/a\>/i',
		'/\<img\s?([^\>]+?)[\s\/]*?\>/i'
	];
	
	$savefile = __DIR__ . '/testurl-scrapedata.txt';
	
	$scraper->configure( $url, $regexes, $savefile );
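
To get a feel for what the two example patterns capture, the following sketch applies them to a made-up HTML snippet with PHP's built-in preg_match_all(). It does not call Sorcerer and makes no claim about how the class works internally; the `$html` string and the `$captures` array are assumptions made purely for illustration.

```php
<?php
// Hypothetical HTML snippet for illustration; not fetched from any real page.
$html = '<p><a href="/about">About</a></p>' . "\n"
      . '<img src="logo.png" alt="Logo" />';

// The two patterns from the configuration example.
$regexes = [
    '/\<a\s?[^\>]+?\>(.+)\<\/a\>/i',
    '/\<img\s?([^\>]+?)[\s\/]*?\>/i'
];

$captures = [];

foreach ( $regexes as $regex ) {
    // PREG_SET_ORDER yields one entry per match:
    // index 0 is the full match, index 1 is the first capture group.
    preg_match_all( $regex, $html, $matches, PREG_SET_ORDER );

    foreach ( $matches as $match ) {
        $captures[] = $match[1];
    }
}

print_r( $captures );
```

The first pattern captures the text of the link and the second captures the attribute string of the image tag. Note that `(.+)` is greedy, so on a line containing several links it captures everything from the first link's text up to the last `</a>` on that line; `(.+?)` is one way to capture each link separately.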

Run

If no file path was set for "$savefile", scrape() returns the scraped data:

	$data = $scraper->scrape();
	
	print_r( $data );

If a file path was set for "$savefile", scrape() saves the scraped data to the file you specified:

	$scraper->scrape();
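
The return-or-save behaviour described in this section can be pictured with a small standalone sketch. The function `scrape_html()` below is hypothetical and is not part of Sorcerer; the shape of the returned array, the `null` return value when saving, and the print_r() file format are all assumptions chosen for illustration, and the HTML is passed in as a string rather than fetched from a URL.

```php
<?php
/**
 * Hypothetical helper, NOT Sorcerer's implementation: applies each pattern
 * to $html, then either returns the captures or writes them to $savefile.
 */
function scrape_html( string $html, array $regexes, ?string $savefile = null ): ?array
{
    $data = [];

    foreach ( $regexes as $regex ) {
        // Default PREG_PATTERN_ORDER: $matches[1] holds every first-group capture.
        preg_match_all( $regex, $html, $matches );
        $data[$regex] = $matches[1] ?? [];
    }

    if ( $savefile === null ) {
        // No file path given: hand the data back to the caller.
        return $data;
    }

    // File path given: persist the data instead of returning it (assumed format).
    file_put_contents( $savefile, print_r( $data, true ) );

    return null;
}

// Made-up input for the sketch.
$html    = '<a href="/docs">Docs</a>';
$regexes = [ '/\<a\s?[^\>]+?\>(.+)\<\/a\>/i' ];

// Without a file path, the captures come back as an array.
$returned = scrape_html( $html, $regexes );

// With a file path, the captures are written to disk.
$savefile = sys_get_temp_dir() . '/sorcerer-sketch.txt';
$saved    = scrape_html( $html, $regexes, $savefile );
```

Keeping both modes behind one call mirrors the usage shown here: the caller decides at configuration time whether the result is wanted in memory or on disk.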

Issues

If you run into any issues, please report them on the issues page at https://github.com/gavinggordon/sorcerer/issues.

License

This package is released under the MIT License.