Scraper:
The OLXscraper class has two functions:
classifiedslist($searchURL)
Takes one parameter, the URL of any listing page, and returns the URLs of all classifieds listed there.
scrap_OLX($url)
Takes the URL of a single classified and returns its details.
Usage Example:
<?php
error_reporting(0);
include_once('scraper.php');

$scraper = new OLXscraper;

// For a single item:
// $url = 'http://lahore.olx.com.pk/germany-10-pfennig-1950-old-coin-iid-508966031';
// $ret = $scraper->scrap_OLX($url);

$searchURL = "http://www.olx.com.pk/nf/search/86%2Bcorrola";
$list = $scraper->classifiedslist($searchURL);

// Loop through each URL to get its details
foreach ($list as $url) {
    // Get the classified's details
    $ret = $scraper->scrap_OLX($url);
    foreach ($ret as $k => $v) {
        echo '<strong>'.$k.' </strong>'.$v.'<br>';
    }
    print "------------------------------------------<br>";
}
?>
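The loop above echoes scraped values straight into the page. Because that text comes from a third-party site, it may be safer to escape it before output. The following is only a minimal sketch: renderClassified() is a hypothetical helper, not part of scraper.php, and it assumes scrap_OLX() returns a flat array of label => value pairs, as the example suggests.

```php
<?php
// Hypothetical helper (not part of scraper.php): turns the details of one
// classified into HTML, escaping labels and values before output.
// Assumption: $details is a flat array of label => value pairs.
function renderClassified(array $details)
{
    $out = '';
    foreach ($details as $label => $value) {
        // This sketch only handles scalar values; nested data is skipped.
        if (!is_scalar($value)) {
            continue;
        }
        $out .= '<strong>' . htmlspecialchars((string) $label, ENT_QUOTES, 'UTF-8') . ' </strong>'
              . htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8') . '<br>';
    }
    return $out;
}
```

Inside the loop, the inner foreach could then be replaced with: echo renderClassified($scraper->scrap_OLX($url));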
simple_html_dom.php
This file and the notes below are taken from the PHP Simple HTML DOM Parser project:
http://simplehtmldom.sourceforge.net/
If you want to extend the class functionality, you can check the documentation there.
Description, Requirement & Features
• An HTML DOM parser written in PHP 5+ that lets you manipulate HTML easily.
• Requires PHP 5+.
• Supports invalid HTML.
• Find tags on an HTML page with selectors just like jQuery.
• Extract contents from HTML in a single line.
Download & Documents
• Download the latest version from SourceForge.
• Read the online documentation.
Quick Start
• How to get HTML elements?
// Create DOM from URL or file
$html = file_get_html('http://www.google.com/');

// Find all images
foreach ($html->find('img') as $element) {
    echo $element->src . '<br>';
}

// Find all links
foreach ($html->find('a') as $element) {
    echo $element->href . '<br>';
}
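The two loops above differ only in the selector and the attribute they read, so they could be folded into one helper. This is a sketch under stated assumptions: collectAttribute() is a hypothetical name, not part of simple_html_dom.php, and $dom is assumed to be the object returned by file_get_html() or str_get_html(), or false if the page could not be loaded.

```php
<?php
// Hypothetical helper (not part of simple_html_dom.php): collects one
// attribute from every element that matches a selector.
// Assumption: $dom is the object returned by file_get_html() or
// str_get_html(), or false when loading failed.
function collectAttribute($dom, $selector, $attribute)
{
    $values = array();
    if (!is_object($dom)) {
        return $values; // nothing was loaded
    }
    foreach ($dom->find($selector) as $element) {
        $value = $element->$attribute;
        // Skip elements that lack the attribute or leave it empty.
        if (is_string($value) && $value !== '') {
            $values[] = $value;
        }
    }
    // Drop duplicates and reindex from zero.
    return array_values(array_unique($values));
}
```

With the library included, the image list could then be obtained with: $images = collectAttribute(file_get_html('http://www.google.com/'), 'img', 'src');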