nl.hippo.slide.extractor |
|
Java Source File Name | Type | Comment |
ConfigurableXMLContentExtractor.java | Class | Extracts only the character data from an XML stream. |
ConstantExtractor.java | Class | Puts constant properties on a document (for example, cms:type = asset in files/domain.preview/binaries). |
HippoLastmodifiedExtractor.java | Class | |
HippoMultiValueXMLPropertyExtractor.java | Class | |
HippoSimpleXmlExtractor.java | Class | |
HippoUrlListXMLPropertyExtractor.java | Class | Extracts a list of URLs from an XML property, ensures they are properly encoded, and provides them as a blank-separated list. |
HippoXMLDatePropertyExtractor.java | Class | Extracts and formats a date string from an XML file. |
HippoXmlPropertyExtractor.java | Class | |
ImagePropertyExtractor.java | Class | Extracts image information. |
LanguageSpecificContentExtractor.java | Interface | |
MultiValueXMLPropertyExtractor.java | Class | |
OfficeExtractor.java | Class | Property extractor for Microsoft Office documents.
It extracts properties from the SummaryInformation and
DocumentSummaryInformation headers of Office documents.
Sample configuration:
<extractor classname="org.apache.slide.extractor.OfficeExtractor" uri="/files/docs/">
  <configuration>
    <instruction property="author" namespace="http://mycomp.com/namepsaces/webdav" summary-information="4" />
    <instruction property="application" namespace="http://mycomp.com/namepsaces/webdav" summary-information="18" />
    <instruction property="title" namespace="http://mycomp.com/namepsaces/webdav" summary-information="2" />
    <instruction property="category" namespace="http://mycomp.com/namepsaces/webdav" document-summary-information="2" />
    <instruction property="docid" namespace="http://mycomp.com/namepsaces/webdav" label="Document-ID" />
  </configuration>
</extractor>
The sample configuration maps the author info of office documents to the author property. |
OpenDocumentContentExtractor.java | Class | |
OpenDocumentPropertyExtractor.java | Class | |
PropertyExtractorTrigger.java | Class | |
UrlListXMLPropertyExtractor.java | Class | Extracts a list of URLs from an XML property, ensures they are properly encoded, and provides them as a blank-separated list. |
XMLContentExtractor.java | Class | Extracts only the character data from an XML stream. |
XMLDatePropertyExtractor.java | Class | Extracts and formats a date string from an XML file. |
XmlPropertyExtractor.java | Class | |
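
The content extractors above are described as extracting "only the character data from an XML stream". As a rough illustration of that idea, the sketch below collects text nodes with the JDK's SAX parser and ignores all markup. It is not the implementation of XMLContentExtractor or ConfigurableXMLContentExtractor: the class name CharacterDataSketch and the method extractText are hypothetical, and the real classes plug into the Slide extractor framework, which is not shown here.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration only; not part of nl.hippo.slide.extractor.
public class CharacterDataSketch {

    /** Returns the concatenated character data of the given XML, dropping all tags and attributes. */
    static String extractText(String xml) throws Exception {
        StringBuilder text = new StringBuilder();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // SAX may deliver one text node in several chunks, so append each chunk.
                text.append(ch, start, length);
            }
        });
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Attribute values are markup, not character data, so "ignored" does not appear in the result.
        String doc = "<doc><title lang=\"ignored\">Release notes</title><body>Fixed the importer.</body></doc>";
        System.out.println(extractText(doc));
    }
}
```

A streaming SAX handler is used here because it never builds a document tree, which suits large stored documents; whether the actual extractors work this way is an assumption.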