railo.runtime.search.lucene2.docs
Class PDFDocument

java.lang.Object
  extended by railo.runtime.search.lucene2.docs.PDFDocument

public final class PDFDocument
extends Object

This class creates a document for the Lucene search engine. It should plug easily into the IndexHTML or IndexFiles classes that come with the Lucene project. The class populates the following fields:

Lucene Field Name  Description
path               File system path if loaded from a file
url                URL to PDF document
contents           Entire contents of PDF document, indexed but not stored
summary            First 500 characters of content
modified           The modified date/time according to the url or path
uid                A unique identifier for the Lucene document
CreationDate       From PDF meta-data if available
Creator            From PDF meta-data if available
Keywords           From PDF meta-data if available
ModificationDate   From PDF meta-data if available
Producer           From PDF meta-data if available
Subject            From PDF meta-data if available
Trapped            From PDF meta-data if available
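
The relationship between the contents and summary fields can be illustrated with a minimal, standard-library-only sketch. It is not the Railo implementation: PdfFieldSketch, summarize, and describe are hypothetical names, a plain Map stands in for the Lucene document, and the string encoding of the modified value is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Railo or Lucene.
final class PdfFieldSketch {
    // The field table documents the summary as the first 500 characters of the content.
    static final int SUMMARY_LENGTH = 500;

    // Returns the text unchanged when it is short, otherwise its first 500 characters.
    static String summarize(String contents) {
        return contents.length() <= SUMMARY_LENGTH
                ? contents
                : contents.substring(0, SUMMARY_LENGTH);
    }

    // Builds a plain map keyed by some of the documented field names.
    static Map<String, String> describe(String path, String contents, long modifiedMillis) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("path", path);
        fields.put("contents", contents);
        fields.put("summary", summarize(contents));
        // Assumption: the real class may encode the modification time differently.
        fields.put("modified", Long.toString(modifiedMillis));
        return fields;
    }

    public static void main(String[] args) {
        // Build 600 characters of stand-in text, longer than the summary limit.
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < 60; i++) {
            text.append("0123456789");
        }
        Map<String, String> fields = describe("/tmp/example.pdf", text.toString(), 0L);
        System.out.println(fields.get("summary").length()); // prints 500
    }
}
```

In the real class the contents field is indexed but not stored, so a search hit would typically display the stored summary rather than the full text.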


Method Summary
static org.apache.lucene.document.Document getDocument(Resource res)
          Gets a Lucene document from a PDF file.
static org.apache.lucene.document.Document getDocument(StringBuffer content, InputStream is)
          Gets a Lucene document from a PDF input stream.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

getDocument

public static org.apache.lucene.document.Document getDocument(StringBuffer content,
                                                              InputStream is)
Gets a Lucene document from a PDF input stream.

Parameters:
is - The stream to read the PDF from.
Returns:
The Lucene document.
Throws:
IOException - If there is an error parsing or indexing the document.

getDocument

public static org.apache.lucene.document.Document getDocument(Resource res)
Gets a Lucene document from a PDF file.

Parameters:
res - The file to get the document for.
Returns:
The Lucene document.
Throws:
IOException - If there is an error parsing or indexing the document.


Copyright © 2012 Railo