java.lang.Object > org.pdfbox.util.PDFStreamEngine > org.pdfbox.util.PDFTextStripper > org.pdfbox.util.PDFText2HTML
PDFText2HTML | public class PDFText2HTML extends PDFTextStripper | | Wraps stripped text in simple HTML and tries to form HTML paragraphs.
Paragraphs broken by pages, columns, or figures are not mended.
author: jjb - http://www.johnjbarton.com
version: $Revision: 1.3 $ |
endParagraph | protected void endParagraph() throws IOException | | Write out the paragraph separator.
throws: IOException - If there is an error writing to the stream. |
getTitleGuess | protected String getTitleGuess() | | Returns the guessed document title.
Returns: A string that is the guessed title of this document. |
guessTitle | protected TextPosition guessTitle(Iterator textIter) | | Attempts to guess the title of the document.
Parameters: textIter - The characters on the first page.
Returns: The text position that is guessed to be the title. |
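The reference above does not say how the title is guessed. As a rough illustration only, the sketch below assumes one plausible heuristic: pick the text on the first page with the largest font size. The `Fragment` class is a hypothetical stand-in for `TextPosition`, and `TitleGuessSketch` is not part of PDFBox.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Standalone sketch; not the PDFBox implementation.
class TitleGuessSketch {

    /** Hypothetical stand-in for TextPosition: a run of text plus its font size. */
    static final class Fragment {
        final String text;
        final float fontSize;

        Fragment(String text, float fontSize) {
            this.text = text;
            this.fontSize = fontSize;
        }
    }

    /**
     * Assumed heuristic: return the first fragment with the largest font size,
     * or null when the iterator is empty.
     */
    static Fragment guessTitle(Iterator<Fragment> textIter) {
        Fragment best = null;
        while (textIter.hasNext()) {
            Fragment candidate = textIter.next();
            // Strict comparison keeps the earliest fragment when sizes tie.
            if (best == null || candidate.fontSize > best.fontSize) {
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Fragment> firstPage = Arrays.asList(
                new Fragment("Technical Report 42", 9f),
                new Fragment("Extracting Text from PDF", 18f),
                new Fragment("Abstract", 12f));
        Fragment title = guessTitle(firstPage.iterator());
        System.out.println(title.text); // prints "Extracting Text from PDF"
    }
}
```

A real heuristic would likely need to merge adjacent characters into lines and handle first pages with no text, both of which this sketch leaves out.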
isSuppressParagraphs | public boolean isSuppressParagraphs() | | Returns: The current value of suppressParagraphs. |
setSuppressParagraphs | public void setSuppressParagraphs(boolean shouldSuppressParagraphs) | | Parameters: shouldSuppressParagraphs - The new value of suppressParagraphs. |
startParagraph | protected void startParagraph() throws IOException | | Write out the start of a paragraph.
throws: IOException - If there is an error writing to the stream. |
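To make the interplay of startParagraph, endParagraph, and the suppressParagraphs flag more concrete, here is a minimal standalone sketch. It assumes that the flag simply turns off the `<p>` and `</p>` markup. The exact markup and the `ParagraphSketch` class are illustrative assumptions, not the PDFBox source.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Standalone sketch; the markup written here is an assumption.
class ParagraphSketch {
    private final Writer output;
    private boolean suppressParagraphs = false;

    ParagraphSketch(Writer output) {
        this.output = output;
    }

    boolean isSuppressParagraphs() {
        return suppressParagraphs;
    }

    void setSuppressParagraphs(boolean shouldSuppressParagraphs) {
        this.suppressParagraphs = shouldSuppressParagraphs;
    }

    /** Writes the opening tag unless paragraphs are suppressed. */
    void startParagraph() throws IOException {
        if (!suppressParagraphs) {
            output.write("<p>");
        }
    }

    /** Writes the closing tag unless paragraphs are suppressed. */
    void endParagraph() throws IOException {
        if (!suppressParagraphs) {
            output.write("</p>\n");
        }
    }

    /** Helper for the demo: renders each string as one paragraph. */
    static String render(boolean suppress, String... paragraphs) {
        StringWriter out = new StringWriter();
        ParagraphSketch sketch = new ParagraphSketch(out);
        sketch.setSuppressParagraphs(suppress);
        try {
            for (String text : paragraphs) {
                sketch.startParagraph();
                out.write(text);
                sketch.endParagraph();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(render(false, "One", "Two")); // prints "<p>One</p>" and "<p>Two</p>" on separate lines
        System.out.println(render(true, "One", "Two")); // prints "OneTwo"
    }
}
```

In this sketch the text runs together when the flag is set, because nothing else separates the paragraphs.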
writeHeader | protected void writeHeader() throws IOException | | Write the header to the output document.
throws: IOException - If there is a problem writing out the header to the document. |
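The reference does not list what the header contains. The sketch below assumes it is the opening HTML boilerplate with the guessed title placed in a `<title>` element, and shows why the title should be escaped first. `HeaderSketch`, `escape`, and `header` are hypothetical names used only for this illustration.

```java
// Standalone sketch; the header layout is an assumption, not the PDFBox output.
class HeaderSketch {

    /** Escapes the characters that are unsafe inside HTML text and attributes. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Builds an assumed document header around the guessed title. */
    static String header(String titleGuess) {
        return "<html><head><title>" + escape(titleGuess) + "</title></head>\n<body>\n";
    }

    public static void main(String[] args) {
        System.out.print(header("Tables & Figures <draft>"));
    }
}
```

Without the escaping step, a title containing `<` or `&` could produce malformed HTML in the output document.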