Class XPathMetadataExtracter
- All Implemented Interfaces:
NamespaceContext, ContentWorker, MetadataEmbedder, MetadataExtracter, org.springframework.beans.factory.Aware, org.springframework.beans.factory.BeanNameAware, org.springframework.context.ApplicationContextAware
When an instance of this extracter is configured, XPath statements should be provided to extract all the available metadata. The implementation is sensitive to what is actually requested by the configured mapping and will only perform the queries necessary to fulfill the requirements.
To summarize, there are two configurations required for this class:
- A mapping of all reasonable document properties to XPath statements. See setXpathMappingProperties(java.util.Properties).
- A mapping of document property names to Alfresco repository model QNames. See AbstractMappingMetadataExtracter.setMappingProperties(java.util.Properties).
All values are extracted as text values and therefore all XPath statements must evaluate to a node that can be rendered as text.
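As an illustration of the two points above (values are read as text, and only the requested queries are run), here is a minimal sketch built on the standard javax.xml.xpath API. It is not the Alfresco implementation: the class name XPathTextSketch, the sample XML, and the property names are hypothetical.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Alfresco.
public class XPathTextSketch {

    /** Evaluates only the XPath statements whose property name is requested. */
    static Map<String, String> extract(String xml,
                                       Map<String, String> xpathByProperty,
                                       Set<String> requested) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        Map<String, String> raw = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : xpathByProperty.entrySet()) {
            if (!requested.contains(e.getKey())) {
                continue; // property not requested, so its query is never run
            }
            // XPath.evaluate(String, Object) returns the result as a String
            raw.put(e.getKey(), xpath.evaluate(e.getValue(), doc));
        }
        return raw;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<doc editor=\"jane\"><title>Report</title></doc>";
        Map<String, String> statements = new LinkedHashMap<>();
        statements.put("editor", "/doc/@editor");
        statements.put("title", "/doc/title/text()");
        // Only "title" is requested, so the "editor" statement is skipped.
        System.out.println(extract(xml, statements, Set.of("title")));
    }
}
```

The sample document uses no namespaces to keep the sketch short; namespaced documents additionally need a namespace-aware parser and a NamespaceContext on the XPath object.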
- Since:
- 2.1
- Author:
- Derek Hulley
Nested Class Summary
Nested classes/interfaces inherited from interface org.alfresco.repo.content.metadata.MetadataExtracter
MetadataExtracter.OverwritePolicy -
Field Summary
Fields inherited from class org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter
dictionaryService, MEGABYTE_SIZE, NAMESPACE_PROPERTY_PREFIX, PROPERTY_COMPONENT_EMBED, PROPERTY_COMPONENT_EXTRACT, PROPERTY_PREFIX_METADATA -
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method, and Description:
- protected Map<String,Serializable> extractRaw(ContentReader reader): Override to provide the raw extracted metadata values.
- getDefaultMapping(): It is not possible to have any default mappings, but something has to be returned.
- getNamespaceURI(String prefix)
- getPrefix(String namespaceURI)
- getPrefixes(String namespaceURI)
- protected void init(): Provides a hook point for implementations to perform initialization.
- protected Map<String,Serializable> processDocument(Document document): Executes all the necessary XPath statements to extract values.
- protected void readXPathMappingProperties(Properties xpathMappingProperties): A utility method to convert mapping properties to the Map form.
- void setXpathMappingProperties(Properties xpathMappingProperties): Set the properties file that maps document properties to the XPath statements necessary to retrieve them.
Methods inherited from class org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter
checkIsEmbedSupported, checkIsSupported, convertSystemPropertyValues, embed, embed, embedInternal, embedInternal, extract, extract, extract, extract, extract, extractRawInThread, extractRawThreadFinished, filterSystemProperties, getBeanName, getDefaultEmbedMapping, getEmbedMapping, getExecutorService, getLimits, getMapping, getMimetypeService, getSupportedMimetypes, isEmbeddingSupported, isEnabled, isSupported, makeDate, mapSystemToRaw, newRawMap, putRawValue, readEmbedMappingProperties, readEmbedMappingProperties, readGlobalEmbedMappingProperties, readGlobalExtractMappingProperties, readMappingProperties, readMappingProperties, register, setApplicationContext, setBeanName, setDictionaryService, setEmbedMapping, setEmbedMappingProperties, setEnableStringTagging, setExecutorService, setFailOnTypeConversion, setInheritDefaultEmbedMapping, setInheritDefaultMapping, setMapping, setMappingProperties, setMimetypeLimits, setMimetypeService, setOverwritePolicy, setProperties, setRegistry, setSupportedDateFormats, setSupportedEmbedMimetypes, setSupportedMimetypes
-
Field Details
-
SUPPORTED_MIMETYPES
-
-
Constructor Details
-
XPathMetadataExtracter
public XPathMetadataExtracter()
Default constructor
-
-
Method Details
-
getNamespaceURI
- Specified by:
getNamespaceURI in interface NamespaceContext
-
getPrefix
- Specified by:
getPrefix in interface NamespaceContext
-
getPrefixes
- Specified by:
getPrefixes in interface NamespaceContext
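The three NamespaceContext methods above resolve the prefixes used inside XPath statements. A map-backed implementation might look like the following sketch. The class name MapNamespaceContext and its register method are hypothetical, and the sketch does not reproduce the extracter's own code or every rule of the NamespaceContext contract (for example, the reserved xml and xmlns prefixes).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;

// Hypothetical, simplified NamespaceContext backed by a prefix-to-URI map.
public class MapNamespaceContext implements NamespaceContext {

    private final Map<String, String> uriByPrefix = new HashMap<>();

    /** Hypothetical helper: registers one prefix-to-URI binding. */
    public void register(String prefix, String uri) {
        uriByPrefix.put(prefix, uri);
    }

    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("prefix must not be null");
        }
        String uri = uriByPrefix.get(prefix);
        // Unbound prefixes resolve to the empty namespace URI
        return uri != null ? uri : XMLConstants.NULL_NS_URI;
    }

    @Override
    public String getPrefix(String namespaceURI) {
        Iterator<String> it = getPrefixes(namespaceURI);
        return it.hasNext() ? it.next() : null;
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        if (namespaceURI == null) {
            throw new IllegalArgumentException("namespaceURI must not be null");
        }
        List<String> prefixes = new ArrayList<>();
        for (Map.Entry<String, String> e : uriByPrefix.entrySet()) {
            if (e.getValue().equals(namespaceURI)) {
                prefixes.add(e.getKey());
            }
        }
        return prefixes.iterator();
    }
}
```

An instance like this can be passed to javax.xml.xpath.XPath.setNamespaceContext so that prefixed steps such as /my:example-element resolve against the registered URIs.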
-
setXpathMappingProperties
Set the properties file that maps document properties to the XPath statements necessary to retrieve them.
The XPath mapping is of the form:
    # Namespace prefixes
    namespace.prefix.my=http://www....com/alfresco/1.0
    # Mapping
    editor=/my:example-element/@cm:editor
    title=/my:example-element/text()
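To show how a file in this form separates into namespace declarations and property mappings, here is a minimal sketch using java.util.Properties. It assumes the namespace.prefix. key prefix seen in the example; the class XPathMappingSketch, its fields, and the urn:example:my URI are hypothetical and do not reflect the actual readXPathMappingProperties code.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical parser for the mapping format shown above.
public class XPathMappingSketch {

    // Assumed key prefix, taken from the example mapping
    static final String NAMESPACE_PREFIX = "namespace.prefix.";

    final Map<String, String> namespaces = new HashMap<>(); // prefix -> URI
    final Map<String, String> xpaths = new HashMap<>();     // property -> XPath

    static XPathMappingSketch parse(String text) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(text));
        XPathMappingSketch result = new XPathMappingSketch();
        for (String key : props.stringPropertyNames()) {
            String value = props.getProperty(key);
            if (key.startsWith(NAMESPACE_PREFIX)) {
                // "namespace.prefix.my" declares the prefix "my"
                result.namespaces.put(key.substring(NAMESPACE_PREFIX.length()), value);
            } else {
                // Every other key is a document property name
                result.xpaths.put(key, value);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        XPathMappingSketch m = parse(
                "# Namespace prefixes\n"
              + "namespace.prefix.my=urn:example:my\n"
              + "# Mapping\n"
              + "title=/my:example-element/text()\n");
        System.out.println(m.namespaces);
        System.out.println(m.xpaths);
    }
}
```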
-
init
protected void init()
Description copied from class: AbstractMappingMetadataExtracter
Provides a hook point for implementations to perform initialization. The base implementation must be invoked or the extracter will fail during extraction. The default mappings will be requested during initialization.
- Overrides:
init in class AbstractMappingMetadataExtracter
-
getDefaultMapping
It is not possible to have any default mappings, but something has to be returned.
- Overrides:
getDefaultMapping in class AbstractMappingMetadataExtracter
- Returns:
- Returns an empty map
extractRaw
Description copied from class: AbstractMappingMetadataExtracter
Override to provide the raw extracted metadata values. An extracter should extract as many of the available properties as is realistically possible. Even if the default mapping doesn't handle all properties, it is possible for each instance of the extracter to be configured differently, and more or fewer of the properties may be used in different installations.
Raw values must not be trimmed or removed for any reason. Null values, empty strings, and non-serializable values are handled as follows:
- Null: Removed
- Empty String: Passed to the OverwritePolicy
- Non Serializable: Converted to String or fails if that is not possible
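The three rules above can be read as a small normalization step applied to the raw map. The following is only a sketch of that reading: RawValueSketch and normalize are hypothetical names, and the real handling lives in AbstractMappingMetadataExtracter and the configured OverwritePolicy.

```java
import java.io.Serializable;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the raw-value rules listed above.
public class RawValueSketch {

    static Map<String, Serializable> normalize(Map<String, Object> raw) {
        Map<String, Serializable> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : raw.entrySet()) {
            Object value = e.getValue();
            if (value == null) {
                continue; // Null: removed
            }
            if (value instanceof Serializable) {
                // Includes empty strings, which are kept for the overwrite policy
                out.put(e.getKey(), (Serializable) value);
            } else {
                // Non-serializable: converted to its String form
                out.put(e.getKey(), String.valueOf(value));
            }
        }
        return out;
    }
}
```

Note that the extracter itself leaves empty strings untouched; whether they overwrite an existing property is decided later by the OverwritePolicy.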
Properties extracted and their meanings and types should be thoroughly described in the class-level javadocs of the extracter implementation, for example:
    editor: - the document editor --> cm:author
    title: - the document title --> cm:title
    user1: - the document summary
    user2: - the document description --> cm:description
    user3: -
    user4: -
- Specified by:
extractRaw in class AbstractMappingMetadataExtracter
- Parameters:
reader - the document to extract the values from. The stream provided by the reader must be closed if accessed directly.
- Returns:
- Returns a map of document property values keyed by property name.
- Throws:
Throwable - All exception conditions can be handled.
processDocument
Executes all the necessary XPath statements to extract values.
- Throws:
Throwable
-
readXPathMappingProperties
A utility method to convert mapping properties to the Map form.
-