PHP Classes

extract italic text from PDF

Subject: extract italic text from PDF
Summary: how can I extract only italic texts from a PDF file?
Messages: 2
Author: Messala
Date: 2011-08-26 10:04:28
Update: 2011-08-27 23:04:22
 

  1. extract italic text from PDF
Messala - 2011-08-26 20:19:18
Hi there.

It's my first time dealing with PDF files. I skimmed the official reference, but I have not figured out exactly how the formatting of the text is specified.

From what I understood, the content is divided into blocks (objects and streams), and each of these blocks has a kind of "header" that holds the formatting of the block's content (when appropriate). Correct me if I'm wrong.
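
To make that picture concrete, here is a minimal sketch of where the formatting shows up. It relies on two facts from the PDF reference: inside a page's content stream the current font is selected with the Tf operator (for example `/F2 12 Tf`), and text is shown with Tj or TJ. Whether a font is italic then has to be read from the font itself, and a common heuristic is that its BaseFont name contains "Italic" or "Oblique". Everything else here is an assumption for illustration: the stream is taken to be already decompressed, the font table (`SAMPLE_FONTS`) is taken to be already resolved from the page resources, and the helper names are hypothetical. A real file also needs hex strings, octal escapes and font encodings handled, which this does not do.

```python
import re

# Hypothetical inputs: a real PDF needs its streams decompressed and its
# font dictionaries resolved first; both are assumed to be done already.
SAMPLE_FONTS = {"F1": "Helvetica", "F2": "Helvetica-Oblique"}
SAMPLE_STREAM = """
BT
/F1 12 Tf (Plain text ) Tj
/F2 12 Tf (italic words) Tj
/F1 12 Tf [(more ) -250 (plain)] TJ
ET
"""

# One alternative per operator this sketch understands.
TOKEN = re.compile(
    r"/(?P<font>[^\s/\[\]()<>]+)\s+[-\d.]+\s+Tf"  # font selection: /F2 12 Tf
    r"|\((?P<single>(?:\\.|[^\\()])*)\)\s*Tj"     # show one string: (text) Tj
    r"|\[(?P<array>(?:\\.|[^\\\]])*)\]\s*TJ"      # show an array: [(te) -20 (xt)] TJ
)
# A literal string inside a TJ array.
STRING = re.compile(r"\((?:\\.|[^\\()])*\)")


def is_italic(base_font):
    """Heuristic only: judge italics from the BaseFont name."""
    name = base_font.lower()
    return "italic" in name or "oblique" in name


def unescape(text):
    """Undo the escapes for parentheses and backslash (others are ignored)."""
    return re.sub(r"\\([()\\])", r"\1", text)


def italic_text(stream, fonts):
    """Collect the text shown while an italic-looking font is selected."""
    current_italic = False
    pieces = []
    for match in TOKEN.finditer(stream):
        if match.group("font") is not None:
            # A Tf operator changes the font for the text that follows.
            current_italic = is_italic(fonts.get(match.group("font"), ""))
        elif current_italic:
            if match.group("single") is not None:
                pieces.append(unescape(match.group("single")))
            else:
                for literal in STRING.findall(match.group("array")):
                    pieces.append(unescape(literal[1:-1]))
    return "".join(pieces)


if __name__ == "__main__":
    print(italic_text(SAMPLE_STREAM, SAMPLE_FONTS))
```

So the formatting is less a header per block and more a running state: the italic filter follows the font selections through the stream and keeps only the strings shown while an italic font is active. A more reliable check than the name would be the ItalicAngle entry or the Italic bit of Flags in the font descriptor, where the file provides one.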

So I tested some PDF handling classes, but almost all of them ignore the block specifications and extract only the text. One, by Thomas Chester, offers many options for extracting other separate information from a PDF file besides the text, but nothing that gives me a lead on how to filter italic text.

Could someone help me?

Thanks in advance.
Regards
