| This interface defines the type for a family of extractors. An extractor extracts all relevant information from a File and returns it in
a ParsedFileInfo object. The mappings described here are used by zilverline to plug in extractors based on file extension. A plugin is a
Java class that implements the Extractor interface and must be available on the classpath.
For example, if you specify the mapping "pdf => org.zilverline.extractors.PDFExtractor", make sure
org.zilverline.extractors.PDFExtractor is on the classpath; otherwise an exception is raised and handled by zilverline.
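The classpath requirement above can be illustrated with a minimal sketch of loading a plugin by class name. This is not zilverline's actual loading code: the nested Extractor interface, its extract method, PlainTextExtractor and the load helper are hypothetical stand-ins so the example compiles on its own. Only Class.forName and the reflection calls are standard Java.

```java
import java.io.File;
import java.util.Optional;

public class ExtractorLoader {

    /** Hypothetical stand-in for zilverline's Extractor interface (assumption). */
    public interface Extractor {
        String extract(File file);
    }

    /** Trivial plugin, present only to demonstrate a successful lookup. */
    public static class PlainTextExtractor implements Extractor {
        @Override
        public String extract(File file) {
            return "contents of " + file.getName();
        }
    }

    /**
     * Tries to instantiate the named class as an Extractor.
     * Returns empty if the class is not on the classpath, cannot be
     * instantiated, or does not implement Extractor.
     */
    public static Optional<Extractor> load(String className) {
        try {
            Class<?> clazz = Class.forName(className);
            Object instance = clazz.getDeclaredConstructor().newInstance();
            if (instance instanceof Extractor) {
                return Optional.of((Extractor) instance);
            }
            return Optional.empty();
        } catch (ReflectiveOperationException e) {
            // Covers ClassNotFoundException, NoSuchMethodException and friends.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // A class that is on the classpath and implements Extractor.
        System.out.println(load(PlainTextExtractor.class.getName()).isPresent());
        // Empty unless the zilverline jar happens to be on the classpath
        // and the class implements this stand-in interface (it would not).
        System.out.println(load("org.zilverline.extractors.PDFExtractor").isPresent());
    }
}
```

Returning an empty Optional is one way to mirror the "raised and handled" behaviour described above; a real implementation might log the failure instead.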
Currently you can use the TEXT, HTML, WORD, EXCEL, POWERPOINT and PDF extractors and define the extensions you want to map to them. You
cannot use wildcards, but you can map multiple extensions to one extractor. By default, extensions are matched
case-insensitively, but you can change that. Note that you can use an empty extension as well.
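The mapping rules above (no wildcards, several extensions per extractor, optional case sensitivity, empty extension allowed) can be sketched as a small lookup table. This is an illustration under assumptions, not zilverline's implementation: ExtensionMappings, map, extractorFor and extensionOf are hypothetical names, and the example.Hypothetical* class names are placeholders. Only org.zilverline.extractors.PDFExtractor comes from the text.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class ExtensionMappings {

    // extension -> fully qualified extractor class name
    private final Map<String, String> mappings = new HashMap<>();
    private final boolean caseSensitive;

    public ExtensionMappings(boolean caseSensitive) {
        this.caseSensitive = caseSensitive;
    }

    // Lower-cases the extension unless matching is case-sensitive.
    private String normalize(String extension) {
        return caseSensitive ? extension : extension.toLowerCase(Locale.ROOT);
    }

    /** Registers an exact extension; wildcards are not interpreted. */
    public void map(String extension, String extractorClassName) {
        mappings.put(normalize(extension), extractorClassName);
    }

    /** Text after the last dot, or the empty string if there is no dot. */
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1);
    }

    /** Returns the mapped class name, or null if the extension is unmapped. */
    public String extractorFor(String fileName) {
        return mappings.get(normalize(extensionOf(fileName)));
    }

    public static void main(String[] args) {
        ExtensionMappings m = new ExtensionMappings(false);
        m.map("pdf", "org.zilverline.extractors.PDFExtractor");
        // Two extensions mapped to one (hypothetical) extractor class.
        m.map("htm", "example.HypotheticalHtmlExtractor");
        m.map("html", "example.HypotheticalHtmlExtractor");
        // Empty extension: files without a dot in their name.
        m.map("", "example.HypotheticalTextExtractor");

        System.out.println(m.extractorFor("Report.PDF"));
        System.out.println(m.extractorFor("index.html"));
        System.out.println(m.extractorFor("README"));
    }
}
```

Normalizing the key on both insert and lookup keeps the case-sensitivity switch in one place; constructing the table with `true` makes "Report.PDF" miss a mapping registered as "pdf".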
Author: Michael Franken
Version: $Revision: 1.5 $
See also: org.zilverline.core.ParsedFileInfo |