PHP Classes

Decode PDF, ODT, Word, DOC, DOCX, RTF: Need a decoder for multiple formats

By herman lapre (2016-02-20)

Need a decoder for multiple formats

The PHP file reader must be able to read PDF, ODT, DOC, DOCX and RTF documents.

  • 2 Clarification requests
  • 2. Christian Vigh (2016-02-26)

    Please clarify your request: what data do you expect from the PDF/ODT/DOC/DOCX/RTF document reader? Do you want to manipulate document elements after decoding? Do you want to modify the documents after decoding? Or do you simply want to display the document contents on a Web page?

    • 3. Manuel Lemos (2016-02-27), in reply to comment 2 by Christian Vigh

      According to the request tags he wants a file viewer for those formats. So I suppose something that converts those formats to images will be helpful.

      It seems that OpenOffice/LibreOffice can be used for that purpose. The soffice program has options that make it open a given file, convert it to another format, such as Web pages with pictures, and then exit without opening the GUI.

      So it can be run from the console using the options --headless and --convert-to.
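The one-shot conversion described above can be sketched as follows. This is a minimal Python sketch under stated assumptions, not a tested recipe: the helper names `build_soffice_command` and `convert`, the `pdf` target format and the `converted` output directory are all illustrative. Only `--headless` and `--convert-to` come from the comment; `--outdir` is an additional soffice option used here to control where the result is written.

```python
import subprocess
from pathlib import Path


def build_soffice_command(source, target_format="pdf", outdir="converted"):
    """Build the argument list for a one-shot headless conversion.

    Hypothetical helper: only the soffice options themselves are real.
    """
    return [
        "soffice",
        "--headless",                    # do not open the GUI
        "--convert-to", target_format,   # e.g. "pdf" or "html"
        "--outdir", str(outdir),         # directory for the converted file
        str(source),
    ]


def convert(source, target_format="pdf", outdir="converted"):
    """Run the conversion and return the path where the output is expected.

    Assumes soffice is on the PATH and that target_format is a plain
    extension (no ":filter" suffix), so the output keeps the input's stem.
    """
    Path(outdir).mkdir(parents=True, exist_ok=True)
    subprocess.run(
        build_soffice_command(source, target_format, outdir),
        check=True,  # raise if soffice exits with an error
    )
    return Path(outdir) / (Path(source).stem + "." + target_format)
```

Calling `convert("report.docx")` would start soffice, write `converted/report.pdf` if the conversion succeeds, and exit. A PHP script could run the same command line through its own process functions.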

    • 4. Christian Vigh (2016-02-27), in reply to comment 3 by Manuel Lemos

      I have some experience with OpenOffice/LibreOffice for converting .DOC/.DOCX documents to .PDF. I have encountered some formatting issues, especially with tables, but in general it works well.

      In addition, the unoconv script provides a command-line interface for doing the conversion.

      However, as far as I can remember, it requires the OpenOffice daemon to be up and running.

      I don't know whether this would address Herman's needs.

    • 5. Manuel Lemos (2016-02-27), in reply to comment 4 by Christian Vigh

      You do not need to have the OpenOffice daemon running. You can just start OpenOffice on demand to do the format conversion, using the soffice command with the options mentioned above. So you do not need the unoconv script either.

      Starting OpenOffice as a daemon has the advantage of keeping OpenOffice running in memory, just in case you need to convert many documents without delay. In that case you would use a script like unoconv to communicate with the daemon.
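For the daemon scenario, a batch conversion through unoconv might look like the sketch below. It assumes a listener was started separately beforehand (unoconv documents a `--listener` option for that) and that `unoconv` is on the PATH. The helper names `build_unoconv_command` and `convert_many` are hypothetical, and `-f` is unoconv's option for the output format.

```python
import subprocess


def build_unoconv_command(source, target_format="pdf"):
    """Build the unoconv client command for one document (hypothetical helper)."""
    return ["unoconv", "-f", target_format, str(source)]


def convert_many(sources, target_format="pdf"):
    """Convert several documents one after another.

    Assumes an OpenOffice/LibreOffice listener is already running, so each
    call reuses it instead of paying the office suite's startup cost.
    """
    for source in sources:
        subprocess.run(build_unoconv_command(source, target_format), check=True)
```

Whether this is worth the extra moving part depends on volume: for occasional conversions, the on-demand soffice call is simpler to operate.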

  • 1. Manuel Lemos (2016-02-26)

    There are packages that can render some of those formats as images that you can display on a Web page.

    There are no packages for all of those formats, but support for the missing ones could be added later by using external programs to render the files as images.

    That could be an innovative solution.


    1 Recommendation

    ApiLayer API Encapsulation: Send requests to APILayer REST APIs

    Christian Vigh (package author, reputation 30), 2016-02-26

    As Manuel said, there is currently no universal solution for that. The package referenced here can capture HTML content and generate either an image or a PDF document, using a third-party Web service.

