java.lang.Object
    org.pdfbox.util.PDFStreamEngine
        org.pdfbox.util.PDFTextStripper
            org.pdfbox.util.PDFTextStripperByArea

PDFTextStripperByArea
    public class PDFTextStripperByArea extends PDFTextStripper
    Extracts text from a specified region in the PDF.
    author: Ben Litchfield
    version: $Revision: 1.5 $

PDFTextStripperByArea
    public PDFTextStripperByArea() throws IOException
    Constructor.
    throws: IOException - If there is an error loading properties.

addRegion
    public void addRegion(String regionName, Rectangle2D rect)
    Adds a new region to group text by.
    Parameters:
        regionName - The name of the region.
        rect - The rectangular area to retrieve the text from.

extractRegions
    public void extractRegions(PDPage page) throws IOException
    Processes the page to extract the region text.
    Parameters:
        page - The page to extract the regions from.
    throws: IOException - If there is an error while extracting text.

flushText
    protected void flushText() throws IOException
    Prints the text to the output stream.
    throws: IOException - If there is an error writing the text.

getRegions
    public List getRegions()
    Gets the list of regions that have been set up.
    Returns: A list of java.lang.String objects that identify the region names.

getTextForRegion
    public String getTextForRegion(String regionName)
    Gets the text for the region. This should be called after extractRegions().
    Parameters:
        regionName - The name of the region to get the text from.
    Returns: The text that was identified in that region.
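
The call order described above (register regions, process a page, then read the text per region) can be illustrated with a small stand-in that needs only the JDK. This is a sketch under stated assumptions, not PDFBox code: the class RegionGrouper and its place(x, y, fragment) method are hypothetical, and place only approximates what page processing might do for each positioned piece of text. The method names addRegion, getRegions and getTextForRegion are borrowed from the listing above purely to mirror its shape.

```java
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in (NOT part of PDFBox): groups positioned text
// fragments by named rectangle, mirroring the addRegion -> extract ->
// getTextForRegion flow of the class documented above.
class RegionGrouper {
    // Region name -> rectangle, kept in insertion order.
    private final Map<String, Rectangle2D> regions = new LinkedHashMap<>();
    // Region name -> text collected so far.
    private final Map<String, StringBuilder> text = new LinkedHashMap<>();

    // Register a named area; re-adding a name resets its collected text.
    void addRegion(String regionName, Rectangle2D rect) {
        regions.put(regionName, rect);
        text.put(regionName, new StringBuilder());
    }

    // Names of all registered regions, in the order they were added.
    List<String> getRegions() {
        return new ArrayList<>(regions.keySet());
    }

    // Assumed per-fragment step: append the fragment to every region
    // whose rectangle contains the point (x, y).
    void place(double x, double y, String fragment) {
        for (Map.Entry<String, Rectangle2D> e : regions.entrySet()) {
            if (e.getValue().contains(x, y)) {
                text.get(e.getKey()).append(fragment);
            }
        }
    }

    // Text collected for the region, or null if the name is unknown.
    String getTextForRegion(String regionName) {
        StringBuilder sb = text.get(regionName);
        return sb == null ? null : sb.toString();
    }

    public static void main(String[] args) {
        RegionGrouper grouper = new RegionGrouper();
        grouper.addRegion("header", new Rectangle2D.Double(0, 0, 200, 50));
        grouper.addRegion("body", new Rectangle2D.Double(0, 50, 200, 700));

        // Illustrative fragments; a real extractor would read these from a page.
        grouper.place(10, 10, "Invoice ");
        grouper.place(60, 10, "42");
        grouper.place(10, 100, "Total due");

        for (String name : grouper.getRegions()) {
            System.out.println(name + ": " + grouper.getTextForRegion(name));
        }
    }
}
```

With the real class, the equivalent sequence would be addRegion(...) for each area, one extractRegions(page) call, and then getTextForRegion(name) for each name returned by getRegions(). How PDFBox decides which text falls inside a rectangle is not specified in this listing, so the point-containment test used here is only an assumption.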