PHP Classes

Email Parsing

Subject: Email Parsing
Summary: Parse Email headers using PHP
Messages: 4
Author: Chitvan Seth
Date: 2007-05-16 12:21:02
Update: 2008-08-21 04:54:06
 

  1. Email Parsing
Chitvan Seth - 2007-05-16 12:21:02
Hi all,

I am using PHP classes to parse email content. In the following example I use a POP3 class.

This is the code I have used:


$email = $pop3->get_mail1($i);

$email = parse_email($email);

function get_mail1( $msg_number, $qmailer = FALSE )
{
    if(!$this->socket)
    {
        $this->error = "POP3 get_mail() - Error: No connection available.";

        return FALSE;
    }

    if(!$this->_checkstate("get_mail")) return FALSE;

    // Ask the server for the full message
    $response = "";
    $cmd = "RETR $msg_number";
    if(!$this->_logging($cmd)) return FALSE;
    if(!$this->_putline($cmd)) return FALSE;

    $response = $this->_getnextstring();

    if(!$this->_logging($response)) return FALSE;

    if ($qmailer == TRUE)
    {
        if(substr($response,0,1) != '.')
        {
            $this->error = "POP3 get_mail() - Error: ".$response;
            return FALSE;
        }
    }
    else
    {
        if(substr($response,0,3) != "+OK")
        {
            $this->error = "POP3 get_mail() - Error: ".$response;
            return FALSE;
        }
    }

    // Get MAIL !!!
    $i = 0;
    $output = array();
    $animess = "";
    $response = "<HEADER> \r\n";
    // preg_match() is used here because eregi() was removed in PHP 7
    while(!preg_match('/^\.\r\n/',$response))
    {
        // An empty line ends the header section
        if($response === "\r\n") break;
        $output[$i] = $response;
        $animess .=$response;
        $i++;
        $response = $this->_getnextstring();
    }
    $output[$i++] = "</HEADER> \r\n";

    $response = "<MESSAGE> \r\n";

    // Read the body until the terminating "." line
    while(!preg_match('/^\.\r\n/',$response))
    {
        $output[$i] = $response;
        $animess .=$response;
        $i++;
        $response = $this->_getnextstring();
    }

    $output[$i] = "</MESSAGE> \r\n";

    if(!$this->_logging("Complete.")) return FALSE;

    return $this->getmessage($animess);
    //return $output;
}



function _getnextstring( $buffer_size = 512 )
{
    $buffer = "";
    $buffer = fgets( $this->socket , $buffer_size );

    $this->socket_status = socket_get_status( $this->socket );

    if( $this->socket_status["timed_out"] )
    {
        $this->_cleanup();
        return "POP3 _getnextstring() - Socket_Timeout_reached.";
    }
    $this->socket_status = FALSE;

    return $buffer;
}


function parse_email ($email) {
    // Split header and message
    $header = array();
    $message = array();

    $is_header = true;
    foreach ($email as $line) {
        if ($line == '<HEADER> ' . "\r\n") continue;
        if ($line == '<MESSAGE> ' . "\r\n") continue;
        if ($line == '</MESSAGE> ' . "\r\n") continue;
        if ($line == '</HEADER> ' . "\r\n") { $is_header = false; continue; }

        if ($is_header == true) {
            $header[] = $line;
        } else {
            $message[] = $line;
        }
    }

    // Parse headers
    $headers = array();
    $previous = null;

    foreach ($header as $line) {
        $colon_pos = strpos($line, ':');
        $space_pos = strpos($line, ' ');

        // No colon, or a space before the colon: treat as a continuation line
        if ($colon_pos === false OR ($space_pos !== false AND $space_pos < $colon_pos)) {
            // attach to previous
            $previous .= "\r\n" . $line;
            continue;
        }

        // Get key
        $key = substr($line, 0, $colon_pos);

        // Get value: skip the colon and any whitespace that follows it
        $value = ltrim(substr($line, $colon_pos + 1));
        $headers[$key] = $value;

        $previous =& $headers[$key];
    }

    // Parse message
    $message = implode('', $message);

    // Return array
    $email = array();
    $email['message'] = $message;
    $email['headers'] = $headers;
    /* echo "<pre>";
    print_r($message);
    echo "</pre> message$id";
    */
    return $email;
}




With the above code I am able to fetch all the headers and the message part. I need to fetch the mail ID from which the mail was received. There are two cases for the From field:

1. abc@example.com
2. ABC XYZ <abc.xyz@example.com>

In the second case, when I retrieve the header I get "ABC XYZ" but not the other part, i.e., <abc.xyz@example.com>. I am also not able to fetch the [return-path] value.

Can someone please explain where I went wrong in the above code?

Thanks in advance.

  2. Re: Email Parsing
Manuel Lemos - 2007-05-16 18:16:08 - In reply to message 1 from Chitvan Seth
This is not exactly the forum for the POP3 class package.

Anyway, if you want to parse your messages, you may want to use the MIME parser class instead. It can parse and decode headers and bodies as you wish:

phpclasses.org/mimeparser

The message identifier can be obtained using the POP3 message listing by ID. Usually this is the same as in the Message-ID header.

The Return-Path header may or may not be present in a given message, since it is added by the MTA that performs final delivery rather than by the sender.
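
For the two From formats described in the original question, a minimal sketch of pulling out just the address might look like the following. The function name extract_address() is hypothetical and is not part of the POP3 class or the MIME parser class. It only covers the two simple cases shown in the question and does not attempt to handle the full address syntax (quoted display names containing angle brackets, comments, address groups).

```php
<?php
// Hypothetical helper for illustration only; not part of any class in this thread.
function extract_address(string $from): string
{
    $from = trim($from);

    // Case 2: "ABC XYZ <abc.xyz@example.com>" -> take the text between < and >
    if (preg_match('/<([^<>]+)>/', $from, $matches)) {
        return trim($matches[1]);
    }

    // Case 1: a bare address such as "abc@example.com"
    return $from;
}

echo extract_address('abc@example.com'), "\n";
echo extract_address('ABC XYZ <abc.xyz@example.com>'), "\n";
```

Both calls print only the address part, so the same code path can be used regardless of which form the From header takes.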

  3. Re: Email Parsing
tzvi mirell - 2008-08-20 09:17:15 - In reply to message 2 from Manuel Lemos
Hi,
I'm having the same problem.

Using Outlook or a webmail program, I'm able to view the Return-Path email address.
With this class, I get all the other headers, but not the Return-Path.

Other than this, it's working great.

Please let me know if you have any ideas.

Thanks a lot,
leo

  4. Re: Email Parsing
Manuel Lemos - 2008-08-21 04:54:06 - In reply to message 3 from tzvi mirell
You should not be using the Return-Path header for anything. It may not even be present in the message.

Anyway, you are probably not seeing the value of the Return-Path header because it is enclosed in < and > characters. If you display it in an HTML page without escaping it, the browser treats it as an HTML tag and does not show anything, but it is there if you make your browser show the HTML source.
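
The escaping issue described above can be illustrated with a short sketch. The sample value and variable names are assumptions made for the example; htmlspecialchars() is the standard PHP function for escaping text before it is placed in an HTML page.

```php
<?php
// Sample header value, assumed for illustration
$returnPath = '<abc.xyz@example.com>';

// Unescaped: a browser parses this as an unknown HTML tag and displays nothing
$raw = 'Return-Path: ' . $returnPath;

// Escaped: < and > become &lt; and &gt;, so the browser displays the address
$escaped = 'Return-Path: ' . htmlspecialchars($returnPath, ENT_QUOTES, 'UTF-8');

echo $raw, "\n";
echo $escaped, "\n";   // Return-Path: &lt;abc.xyz@example.com&gt;
```

The same applies to a From value such as ABC XYZ <abc.xyz@example.com>: without escaping, only "ABC XYZ" is visible in the rendered page.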