Very simple page details: Parse and extract Web page information details

Ratings				Unique User Downloads		Download Rankings
63%				Total: 244		All time: 7,996 This week: 50

Version		License		PHP version		Categories
`php-vspd` 1.4.0		Custom (specified...		5		HTML, PHP 5, Parsers

Description

Author: zinsou A.A.E.Moïse

This class can parse and extract Web page information details.

It can retrieve a Web page from a given URL and parse it to extract details like:

- Page title
- Page head and body
- Meta tags
- Character set
- Links expanded to full path
- Images
- Page headers from H1 through H6
- Internal and external links, with checks for broken ones
- Page elements by class or id value
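
As a rough illustration of what extracting a title, character set and meta tags involves, here is a minimal sketch built on PHP's bundled DOM extension. It is not the VSPD class and does not show its internals; the function name extractPageDetails() is hypothetical, and the code assumes PHP 7 or later.

```php
<?php
// Illustration only: uses PHP's bundled DOM extension, not the VSPD class.
// extractPageDetails() is a hypothetical name chosen for this sketch.
function extractPageDetails(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them,
    // since real-world markup is rarely perfectly valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $details = ['title' => null, 'charset' => null, 'metas' => []];

    $titles = $doc->getElementsByTagName('title');
    if ($titles->length > 0) {
        $details['title'] = trim($titles->item(0)->textContent);
    }

    foreach ($doc->getElementsByTagName('meta') as $meta) {
        if ($meta->hasAttribute('charset')) {
            // <meta charset="...">
            $details['charset'] = strtolower($meta->getAttribute('charset'));
        } elseif ($meta->hasAttribute('name')) {
            // <meta name="..." content="...">
            $details['metas'][$meta->getAttribute('name')] = $meta->getAttribute('content');
        } elseif (strtolower($meta->getAttribute('http-equiv')) === 'content-type'
            && preg_match('/charset=([\w-]+)/i', $meta->getAttribute('content'), $m)) {
            // <meta http-equiv="Content-Type" content="text/html; charset=...">
            $details['charset'] = strtolower($m[1]);
        }
    }

    return $details;
}

$html = '<html><head><meta charset="UTF-8"><title> Test </title>'
      . '<meta name="description" content="A sample page"></head><body></body></html>';
$details = extractPageDetails($html);
```

With the sample markup above, $details holds the trimmed title, the lowercased character set and the description meta tag.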


Name:	zinsou A.A.E.Moïse (available for paid consulting)
Classes:	50 packages by zinsou A.A.E.Moïse
Country:	Benin
Age:	35
All time rank:	676	1 in Benin
Week rank:	22	1 in Benin

Level 5

Innovation award

Nominee: 23x

Winner: 2x


Example


<?php session_start(); ?>
<!DOCTYPE HTML>
<html lang="en">
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" >
    <title>Test</title>
    </head>
    <body>

<?php
// Remove the execution time limit: fetching pages and checking links can be slow.
set_time_limit(0);
include_once "VSPD.class.php";

 //$obj=new VSPD("https://www.phpclasses.org/");

 // Fetch the page through a stream context that sends a browser-like user agent.
 $obj=new VSPD("https://fr.investing.com/indices/major-indices",stream_context_create($opts = array(
  'http'=>array(
    'method'=>"GET",
    'user_agent'=>"Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36"
  )
)));

// Uncomment any of the blocks below to try the corresponding method.

// echo "Page title:";
// echo '<pre>'.$obj->getTitle().'</pre>';

// echo "All Images:";
// echo '<pre>'.print_r($obj->getImages(),true).'</pre>';

// echo "Internal links:";
// echo '<pre>'.print_r($obj->getInternalinks(true),true).'</pre>';

// echo "External links:";
// echo '<pre>'.print_r($obj->getExternalinks(true),true).'</pre>';

// echo "Headers:";
// echo '<pre>'.print_r($obj->getHeaders(),true).'</pre>';
// echo "Header1:";
// echo '<pre>'.print_r($obj->getH1(),true).'</pre>';
// echo "Header2:";
// echo '<pre>'.print_r($obj->getH2(),true).'</pre>';
// echo "Header3:";
// echo '<pre>'.print_r($obj->getH3(),true).'</pre>';

// echo "CHARSET:";
echo '<pre>'.print_r($obj-> getCharset(),true).'</pre>';

echo "METAS:";
echo '<pre>'.print_r($obj-> XplicitMeta(),true).'</pre>';

// echo "Specifics tag:";
// echo '<pre>'.print_r($obj-> getDTag('div'),true).'</pre>';
// echo '<pre>'.print_r($obj-> getSTag('img'),true).'</pre>';
// echo '<pre>'.var_dump($obj->getElementsByTagName('div')).'</pre>';

echo '<pre>'.print_r($obj-> getOG(),true).'</pre>';
echo '<pre>'.print_r($obj-> getTwitterTags(),true).'</pre>';
echo '<pre>'.print_r($obj-> getHttpEquiv(),true).'</pre>';

// echo "BROKEN LINKS:";
// echo '<pre>'.var_dump($obj->check_broken_externalLinks()).'</pre>';

// echo "check FAKE BROKEN LINKS:";
// $ar=array('https://www.phpclasses.org/browse/mouton.html','https://www.phpclasses.org/voleur.html','https://www.stupidthieves.com','www.phpclasses.org/');
// $brokens=array();
// foreach($ar as $k=>$v){
// if(VSPD::is_broken_link($v)) $brokens[]=$v;
// }
// echo '<pre>';
// var_dump($brokens);
// echo '</pre>';
?>

</body>
</html>
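
The commented-out broken-link check at the end of the example relies on VSPD::is_broken_link(), whose internals are not shown on this page. One plausible way to implement such a check, assuming a link counts as broken when the request fails or the final HTTP status is 400 or higher, is sketched below. The names statusLineIsBroken() and isBrokenLink() are hypothetical.

```php
<?php
// Illustration only; VSPD::is_broken_link() may work differently.
// statusLineIsBroken() and isBrokenLink() are hypothetical names.

function statusLineIsBroken(string $statusLine): bool
{
    // A status line looks like "HTTP/1.1 404 Not Found" or "HTTP/2 200".
    if (!preg_match('#^HTTP/\d(?:\.\d)?\s+(\d{3})#', $statusLine, $m)) {
        return true; // unparsable response: treat the link as broken
    }
    return (int) $m[1] >= 400;
}

function isBrokenLink(string $url): bool
{
    // get_headers() performs a real HTTP request and returns false on failure.
    $headers = @get_headers($url);
    if ($headers === false) {
        return true;
    }
    // When redirects are followed, several status lines appear in the list;
    // keep the last one, which belongs to the final response.
    $status = null;
    foreach ($headers as $line) {
        if (strpos($line, 'HTTP/') === 0) {
            $status = $line;
        }
    }
    return $status === null || statusLineIsBroken($status);
}
```

Keeping the status-line test separate from the network call makes the rule easy to check without touching the network.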

Details

		PHP VSPD is a small package for getting more details about the content of a web page.
		Currently there are methods:
		to get the title
		to get the full head
		to get the full body
		to find any HTML tag
		to get explicit meta tags
		to get the charset
		to get Open Graph tags
		to get Twitter tags
		to get App Links tags
		to get http-equiv tags
		to find and rebuild all links (as absolute paths)
		to find and rebuild all images and their sources (to avoid broken image references)
		to get all headers at once, or one header level at a time (H1, H2, etc.)
		to get an element by id
		to get elements by class
		to get elements by tag name
		to get elements by name
		to get all internal links
		to get all external links
		to check whether a link is broken
		to check all internal links for broken ones
		to check all external links for broken ones
		to check all links globally for broken ones
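
To make the link-related items in this list more concrete, here is a simplified sketch of rebuilding a relative link as an absolute URL and of telling internal links from external ones by comparing hosts. It is not the VSPD implementation; toAbsoluteUrl() and isInternalLink() are hypothetical helper names, and the sketch skips some cases of full URL resolution (for example, a fragment-only link drops the query string of the base URL).

```php
<?php
// Simplified sketch, not the VSPD implementation.
// toAbsoluteUrl() and isInternalLink() are hypothetical helper names.

function toAbsoluteUrl(string $href, string $baseUrl): string
{
    // References that already carry a scheme (http:, mailto:, ...) stay as they are.
    if (preg_match('#^[a-z][a-z0-9+.-]*:#i', $href)) {
        return $href;
    }

    $base = parse_url($baseUrl);
    $origin = $base['scheme'] . '://' . $base['host']
            . (isset($base['port']) ? ':' . $base['port'] : '');

    // Protocol-relative reference: reuse the scheme of the base URL.
    if (strpos($href, '//') === 0) {
        return $base['scheme'] . ':' . $href;
    }

    // Separate a query string or fragment so that only the path is normalized.
    $suffix = '';
    if (preg_match('/[?#].*$/s', $href, $m)) {
        $suffix = $m[0];
        $href = substr($href, 0, strlen($href) - strlen($suffix));
    }

    $basePath = isset($base['path']) ? $base['path'] : '/';
    if ($href === '') {
        // Query-only or fragment-only reference: keep the base path.
        $path = $basePath;
    } elseif ($href[0] === '/') {
        // Root-relative reference.
        $path = $href;
    } else {
        // Relative path: resolve against the directory of the base path.
        $path = substr($basePath, 0, strrpos($basePath, '/') + 1) . $href;
    }

    // Collapse "." and ".." segments.
    $segments = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '' || $segment === '.') {
            continue;
        }
        if ($segment === '..') {
            array_pop($segments);
            continue;
        }
        $segments[] = $segment;
    }
    $normalized = '/' . implode('/', $segments);
    if (count($segments) > 0 && preg_match('#/(\.\.?)?$#', $path)) {
        $normalized .= '/'; // preserve a trailing slash
    }

    return $origin . $normalized . $suffix;
}

function isInternalLink(string $absoluteUrl, string $baseUrl): bool
{
    // A link is treated as internal when its host matches the page's host.
    $host = parse_url($absoluteUrl, PHP_URL_HOST);
    $baseHost = parse_url($baseUrl, PHP_URL_HOST);
    return is_string($host) && is_string($baseHost)
        && strcasecmp($host, $baseHost) === 0;
}
```

For example, with the base URL https://example.com/docs/guide/index.html, the relative link ../img/logo.png resolves to https://example.com/docs/img/logo.png, which isInternalLink() reports as internal.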
		
		The package only parses HTML or XHTML files, and only when the URL is valid. It throws an exception
		when the URL is not valid or when the file is not HTML or XHTML.
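
The kind of validation described above could look roughly like the following sketch. The exception classes, the function name assertParsablePage() and the use of the Content-Type header to decide whether a document is HTML or XHTML are all assumptions made for illustration; the page does not say how the package itself performs these checks.

```php
<?php
// Illustration only: assertParsablePage() is a hypothetical helper, and the
// exception types are assumptions, not the ones the package necessarily throws.
function assertParsablePage(string $url, string $contentType): void
{
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        throw new InvalidArgumentException("Invalid URL: $url");
    }

    // Drop parameters such as "; charset=utf-8" and compare the bare media type.
    $mediaType = strtolower(trim(explode(';', $contentType)[0]));
    if (!in_array($mediaType, ['text/html', 'application/xhtml+xml'], true)) {
        throw new UnexpectedValueException("Not an HTML or XHTML document: $mediaType");
    }
}

// Usage: wrap the check in try/catch, as a caller of the package would.
try {
    assertParsablePage('https://www.phpclasses.org/', 'text/html; charset=utf-8');
    $accepted = true;
} catch (Exception $e) {
    $accepted = false;
}
```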
		
		
		For more details, check the class source and see the how-to-use file test.php.
		For feedback and bug reports, write to [email protected]
		or use the dedicated support forum.

Files (4)

File	Role	Description
`license.txt`	Lic.	license file
`readme.txt`	Doc.	readme
`test.php`	Example	example script
`VSPD.class.php`	Class	class source


	php-vspd-2018-04-16.zip 7KB
	php-vspd-2018-04-16.tar.gz 6KB

User Ratings (all time)

Utility:	66%
Consistency:	100%
Documentation:	100%
Examples:	100%
Tests:	-
Videos:	-
Overall:	63%
Rank:	832

User Comments (2)

I try your php code on PHPCLASSES : « Very simple page deta... 7 years ago (Dominique VARLET)	42%
Very good. 7 years ago (Alekos Psimikakis)	67%
