This property extractor reads properties from the SummaryInformation and DocumentSummaryInformation headers of office documents.
Sample configuration:
<extractor classname="org.apache.slide.extractor.OfficeExtractor" uri="/files/docs/">
    <configuration>
        <instruction property="author" namespace="http://mycomp.com/namepsaces/webdav" summary-information="4" />
        <instruction property="application" namespace="http://mycomp.com/namepsaces/webdav" summary-information="18" />
        <instruction property="title" namespace="http://mycomp.com/namepsaces/webdav" summary-information="2" />
        <instruction property="category" namespace="http://mycomp.com/namepsaces/webdav" document-summary-information="2" />
        <instruction property="docid" namespace="http://mycomp.com/namepsaces/webdav" label="Document-ID" />
    </configuration>
</extractor>
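The numeric values given for summary-information and document-summary-information in the sample look like property identifiers within the corresponding OLE property set. The following is a minimal sketch, assuming the standard Microsoft identifiers for those two property sets; the class OfficePropertyIds and its methods are hypothetical helpers for illustration and are not part of Slide. It only names the identifiers, it does not read any document.

```java
// Hypothetical lookup helper (not part of Slide): maps the numeric
// property IDs used in the sample configuration to readable names.
// Assumption: the IDs follow the standard OLE property set definitions.
final class OfficePropertyIds {

    private OfficePropertyIds() {
    }

    // Subset of the SummaryInformation property IDs.
    static String summaryInformationName(int id) {
        switch (id) {
            case 2:  return "Title";
            case 3:  return "Subject";
            case 4:  return "Author";
            case 5:  return "Keywords";
            case 6:  return "Comments";
            case 8:  return "Last author";
            case 18: return "Application name";
            default: return "unknown";
        }
    }

    // Subset of the DocumentSummaryInformation property IDs.
    static String documentSummaryInformationName(int id) {
        switch (id) {
            case 2:  return "Category";
            case 14: return "Manager";
            case 15: return "Company";
            default: return "unknown";
        }
    }

    public static void main(String[] args) {
        // The IDs that appear in the sample configuration.
        int[] summaryIds = {4, 18, 2};
        for (int id : summaryIds) {
            System.out.println("summary-information=" + id
                    + " -> " + summaryInformationName(id));
        }
        System.out.println("document-summary-information=2 -> "
                + documentSummaryInformationName(2));
    }
}
```

Read this way, the sample maps the document's author, creating application, and title from SummaryInformation, and its category from DocumentSummaryInformation, onto WebDAV properties in the configured namespace.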
author