Class AsynchronousExtractor
- All Implemented Interfaces:
ContentWorker, MetadataEmbedder, MetadataExtracter, org.springframework.beans.factory.Aware, org.springframework.beans.factory.BeanNameAware, org.springframework.context.ApplicationContextAware
Requests an extract of metadata via a remote async transform using
RenditionService2.transform(NodeRef, TransformDefinition). The properties that will be extracted are defined
by the transform. This allows out-of-process metadata extracts to be defined without the need to apply an AMP.
The actual transform is a request to go from the source mimetype to "alfresco-metadata-extract". The
result of the transform is a Map, in JSON, of properties and values to be set on the source node.
As with other subclasses of AbstractMappingMetadataExtracter, it also supports embedding of metadata in
a source node. In this case the remote async transform states that it supports a transform from a source mimetype
to "alfresco-metadata-embed". The result of the transform is a replacement for the content of the node.
- Author:
- adavis
-
Nested Class Summary
Nested classes/interfaces inherited from interface org.alfresco.repo.content.metadata.MetadataExtracter
MetadataExtracter.OverwritePolicy -
Field Summary
Fields inherited from class org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter
dictionaryService, logger, MEGABYTE_SIZE, NAMESPACE_PROPERTY_PREFIX, PROPERTY_COMPONENT_EMBED, PROPERTY_COMPONENT_EXTRACT, PROPERTY_PREFIX_METADATA -
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
protected void | checkIsEmbedSupported(ContentWriter writer) | Checks if embedding for the mimetype is supported.
protected void | checkIsSupported(ContentReader reader) | Checks if the mimetype is supported.
protected void | embedInternal(NodeRef nodeRef, Map<String, Serializable> metadata, ContentReader reader, ContentWriter writer) |
protected Map<String,Serializable> | extractRaw(ContentReader reader) | Override to provide the raw extracted metadata values.
protected Map<String,Serializable> | extractRawInThread(NodeRef nodeRef, ContentReader reader, MetadataExtracterLimits limits) |
protected Map<String,Set<QName>> | getDefaultMapping() | This method provides a best guess of where to store the values extracted from the documents.
static String | getExtension(String targetMimetype, String sourceExtension, String targetExtension) | Returns a file extension used as the target in a transform.
static String | getRenditionName(String renditionName) | Returns a rendition name used in TransformerDebug.
static String | getSourceMimetypeFromTransformName(String transformName) |
static String | getTargetMimetypeFromTransformName(String transformName) |
boolean | isEmbedderSupported(String sourceMimetype, long sourceSizeInBytes) |
static boolean | isMetadataEmbedMimetype(String targetMimetype) |
static boolean | isMetadataExtractMimetype(String targetMimetype) |
boolean | isSupported(String sourceMimetype, long sourceSizeInBytes) |
protected Map<String,Serializable> | mapSystemToRaw(Map<QName, Serializable> systemMetadata) | As T-Engines do the mapping, all this method can do is convert QNames to fully qualified Strings and the values to Strings or a Collection of Strings.
void | setContentService(ContentService contentService) |
void | setEmbeddedMetadata(NodeRef nodeRef, InputStream transformInputStream) |
void | setMetadata(NodeRef nodeRef, InputStream transformInputStream) |
void | setMetadataExtractorPropertyMappingOverrides(List<MetadataExtractorPropertyMappingOverride> metadataExtractorPropertyMappingOverrides) |
void | setNamespacePrefixResolver(NamespacePrefixResolver namespacePrefixResolver) |
void | setNodeService(NodeService nodeService) |
void | setRenditionDefinitionRegistry2(RenditionDefinitionRegistry2Impl renditionDefinitionRegistry2) |
void | setRenditionService2(RenditionService2 renditionService2) |
void | setTaggingService(TaggingService taggingService) |
void | setTransactionService(TransactionService transactionService) |
void | setTransformerDebug(TransformerDebug transformerDebug) |
void | setTransformServiceRegistry(org.alfresco.transform.registry.TransformServiceRegistry transformServiceRegistry) |
Methods inherited from class org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter
convertSystemPropertyValues, embed, embed, embedInternal, extract, extract, extract, extract, extract, extractRawThreadFinished, filterSystemProperties, getBeanName, getDefaultEmbedMapping, getEmbedMapping, getExecutorService, getLimits, getMapping, getMimetypeService, getSupportedMimetypes, init, isEmbeddingSupported, isEnabled, isSupported, makeDate, newRawMap, putRawValue, readEmbedMappingProperties, readEmbedMappingProperties, readGlobalEmbedMappingProperties, readGlobalExtractMappingProperties, readMappingProperties, readMappingProperties, register, setApplicationContext, setBeanName, setDictionaryService, setEmbedMapping, setEmbedMappingProperties, setEnableStringTagging, setExecutorService, setFailOnTypeConversion, setInheritDefaultEmbedMapping, setInheritDefaultMapping, setMapping, setMappingProperties, setMimetypeLimits, setMimetypeService, setOverwritePolicy, setProperties, setRegistry, setSupportedDateFormats, setSupportedEmbedMimetypes, setSupportedMimetypes
-
Constructor Details
-
AsynchronousExtractor
public AsynchronousExtractor()
-
-
Method Details
-
setNodeService
-
setNamespacePrefixResolver
-
setTransformerDebug
-
setRenditionService2
-
setRenditionDefinitionRegistry2
public void setRenditionDefinitionRegistry2(RenditionDefinitionRegistry2Impl renditionDefinitionRegistry2)
-
setContentService
-
setTransactionService
-
setTransformServiceRegistry
public void setTransformServiceRegistry(org.alfresco.transform.registry.TransformServiceRegistry transformServiceRegistry)
-
setTaggingService
-
setMetadataExtractorPropertyMappingOverrides
public void setMetadataExtractorPropertyMappingOverrides(List<MetadataExtractorPropertyMappingOverride> metadataExtractorPropertyMappingOverrides)
-
getDefaultMapping
Description copied from class: AbstractMappingMetadataExtracter
This method provides a best guess of where to store the values extracted from the documents. The list of properties mapped by default need not include all properties extracted from the document; just the obvious set of mappings need be supplied. Implementations must either provide the default mapping properties in the expected location or override the method to provide the default mapping.
The default implementation looks for the default mapping file in the location given by the class name and .properties. If the extracter's class is x.y.z.MyExtracter then the default properties will be picked up at classpath:/alfresco/metadata/MyExtracter.properties. The previous location of classpath:/x/y/z/MyExtracter.properties is still supported but may be removed in a future release. Inner classes are supported, but the '$' in the class name is replaced with '-', so default properties for x.y.z.MyStuff$MyExtracter will be located using classpath:/alfresco/metadata/MyStuff-MyExtracter.properties.
The default mapping implementation should include thorough Javadocs so that the system administrators can accurately determine how to best enhance or override the default mapping.
If the default mapping is declared in a properties file other than the one named after the class, then the
AbstractMappingMetadataExtracter.readMappingProperties(String) method can be used to quickly generate the return value:
    { return readMappingProperties(DEFAULT_MAPPING); }
The map can also be created in code either statically or during the call.
- Overrides:
getDefaultMapping in class AbstractMappingMetadataExtracter
- Returns:
- Returns the default, static mapping. It may not be null.
- See Also:
-
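The naming rule for the default mapping file can be illustrated with a small, self-contained sketch. The class DefaultMappingLocation and its two methods are hypothetical names introduced here for illustration; they are not part of the Alfresco API and only mirror the rule as described in the text above.

```java
/** Illustrative only: mirrors the documented naming rule, not the Alfresco implementation. */
final class DefaultMappingLocation {

    /**
     * Current location: classpath:/alfresco/metadata/<ClassName>.properties,
     * where the '$' of an inner class name is replaced with '-'.
     */
    static String currentLocation(String className) {
        // Drop the package part; lastIndexOf returns -1 when there is no package.
        String simpleName = className.substring(className.lastIndexOf('.') + 1);
        return "classpath:/alfresco/metadata/" + simpleName.replace('$', '-') + ".properties";
    }

    /**
     * Previous (still supported) location: the package path of the class.
     * The text gives no inner-class example for this form, so none is handled here.
     */
    static String previousLocation(String className) {
        return "classpath:/" + className.replace('.', '/') + ".properties";
    }

    public static void main(String[] args) {
        System.out.println(currentLocation("x.y.z.MyExtracter"));
        System.out.println(currentLocation("x.y.z.MyStuff$MyExtracter"));
        System.out.println(previousLocation("x.y.z.MyExtracter"));
    }
}
```

Running main prints the three locations quoted in the description, in the same order.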
isSupported
-
isEmbedderSupported
-
isMetadataExtractMimetype
-
isMetadataEmbedMimetype
-
getTargetMimetypeFromTransformName
-
getSourceMimetypeFromTransformName
-
getExtension
public static String getExtension(String targetMimetype, String sourceExtension, String targetExtension)
Returns a file extension used as the target in a transform. The normal extension is changed if the targetMimetype is an extraction or embedding type.
- Parameters:
targetMimetype - the target mimetype
sourceExtension - normal source extension
targetExtension - current target extension (normally "bin" for embedding and extraction)
- Returns:
- the extension to be used.
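As a rough sketch of how such a selection might work: the class TransformExtensionSketch below is hypothetical, and the choice of "json" for extraction and of the source extension for embedding is an assumption inferred from the class description (an extract produces a JSON map, an embed produces replacement content). It is not taken from the Alfresco source.

```java
/** Illustrative only: a guess at the selection logic, not the Alfresco implementation. */
final class TransformExtensionSketch {

    // The two special target mimetypes named in the class description.
    static final String MIMETYPE_METADATA_EXTRACT = "alfresco-metadata-extract";
    static final String MIMETYPE_METADATA_EMBED = "alfresco-metadata-embed";

    static boolean isMetadataExtractMimetype(String targetMimetype) {
        return MIMETYPE_METADATA_EXTRACT.equals(targetMimetype);
    }

    static boolean isMetadataEmbedMimetype(String targetMimetype) {
        return MIMETYPE_METADATA_EMBED.equals(targetMimetype);
    }

    static String getExtension(String targetMimetype, String sourceExtension, String targetExtension) {
        if (isMetadataExtractMimetype(targetMimetype)) {
            return "json"; // assumption: the extract result is a JSON map
        }
        if (isMetadataEmbedMimetype(targetMimetype)) {
            return sourceExtension; // assumption: the embed result replaces the source content
        }
        return targetExtension; // ordinary transforms keep the normal target extension
    }

    public static void main(String[] args) {
        System.out.println(getExtension(MIMETYPE_METADATA_EXTRACT, "pdf", "bin"));
        System.out.println(getExtension(MIMETYPE_METADATA_EMBED, "pdf", "bin"));
        System.out.println(getExtension("image/png", "pdf", "png"));
    }
}
```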
-
getRenditionName
Returns a rendition name used in TransformerDebug. The normal name is changed if it is a metadata extract or embed. The name in this case is actually "alfresco-metadata-extract/" or "alfresco-metadata-embed/" followed by the source mimetype.
- Parameters:
renditionName - the normal name, or a special one based on the source mimetype and a prefix.
- Returns:
- the renditionName to be used.
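The prefixed name format can be sketched as follows. MetadataTransformNameSketch and its methods are hypothetical helpers written for illustration; the real getSourceMimetypeFromTransformName and getTargetMimetypeFromTransformName methods may behave differently, particularly for names without one of the two prefixes, which this sketch answers with null.

```java
/** Illustrative only: builds and splits names of the form "<prefix>/<source mimetype>". */
final class MetadataTransformNameSketch {

    static final String EXTRACT = "alfresco-metadata-extract";
    static final String EMBED = "alfresco-metadata-embed";

    /** Joins the special target mimetype and the source mimetype with a '/'. */
    static String buildName(String targetMimetype, String sourceMimetype) {
        return targetMimetype + "/" + sourceMimetype;
    }

    /** Returns the special target mimetype, or null for an ordinary name (assumption). */
    static String targetMimetypeOf(String name) {
        if (name == null) {
            return null;
        }
        if (name.startsWith(EXTRACT + "/")) {
            return EXTRACT;
        }
        if (name.startsWith(EMBED + "/")) {
            return EMBED;
        }
        return null;
    }

    /** Returns the text after the prefix and '/', or null for an ordinary name (assumption). */
    static String sourceMimetypeOf(String name) {
        String target = targetMimetypeOf(name);
        return target == null ? null : name.substring(target.length() + 1);
    }

    public static void main(String[] args) {
        String name = buildName(EXTRACT, "application/pdf");
        System.out.println(name);
        System.out.println(targetMimetypeOf(name));
        System.out.println(sourceMimetypeOf(name));
    }
}
```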
-
checkIsSupported
Description copied from class: AbstractMappingMetadataExtracter
Checks if the mimetype is supported.
- Overrides:
checkIsSupported in class AbstractMappingMetadataExtracter
- Parameters:
reader - the reader to check
-
checkIsEmbedSupported
Description copied from class: AbstractMappingMetadataExtracter
Checks if embedding for the mimetype is supported.
- Overrides:
checkIsEmbedSupported in class AbstractMappingMetadataExtracter
- Parameters:
writer - the writer to check
-
extractRaw
Description copied from class: AbstractMappingMetadataExtracter
Override to provide the raw extracted metadata values. An extracter should extract as many of the available properties as is realistically possible. Even if the default mapping doesn't handle all properties, it is possible for each instance of the extracter to be configured differently and more or less of the properties may be used in different installations.
Raw values must not be trimmed or removed for any reason. Null values, empty strings and non-serializable values are handled as follows:
- Null: Removed
- Empty String: Passed to the OverwritePolicy
- Non Serializable: Converted to String or fails if that is not possible
Properties extracted and their meanings and types should be thoroughly described in the class-level javadocs of the extracter implementation, for example:
editor: - the document editor --> cm:author
title: - the document title --> cm:title
user1: - the document summary
user2: - the document description --> cm:description
user3: -
user4: -
- Specified by:
extractRaw in class AbstractMappingMetadataExtracter
- Parameters:
reader - the document to extract the values from. The stream provided by the reader must be closed if accessed directly.
- Returns:
- Returns a map of document property values keyed by property name.
- See Also:
-
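The three rules for raw values (null, empty string, non-serializable) can be pictured with a minimal sketch. RawValueHandlingSketch and its normalise method are hypothetical; in Alfresco this handling happens inside the extracter framework after extractRaw returns, and the real code may differ in detail.

```java
import java.io.Serializable;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: applies the three documented rules to a map of raw values. */
final class RawValueHandlingSketch {

    static Map<String, Serializable> normalise(Map<String, Object> raw) {
        Map<String, Serializable> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : raw.entrySet()) {
            Object value = entry.getValue();
            if (value == null) {
                continue; // Null: removed
            }
            if (value instanceof Serializable) {
                // Serializable values, including empty strings, are kept unchanged
                // so that a later overwrite policy can decide what to do with them.
                result.put(entry.getKey(), (Serializable) value);
            } else {
                // Non Serializable: converted to String; toString() may throw, which fails the call.
                result.put(entry.getKey(), String.valueOf(value));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> raw = new LinkedHashMap<>();
        raw.put("editor", null);
        raw.put("title", "");
        raw.put("user1", java.util.Optional.of("summary")); // Optional is not Serializable
        System.out.println(normalise(raw));
    }
}
```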
extractRawInThread
protected Map<String,Serializable> extractRawInThread(NodeRef nodeRef, ContentReader reader, MetadataExtracterLimits limits) throws Throwable
- Overrides:
extractRawInThread in class AbstractMappingMetadataExtracter
- Throws:
Throwable
-
mapSystemToRaw
As T-Engines do the mapping, all this method can do is convert QNames to fully qualified Strings and the values to Strings or a Collection of Strings.
- Overrides:
mapSystemToRaw in class AbstractMappingMetadataExtracter
- Parameters:
systemMetadata - Metadata keyed by system properties
- Returns:
- the original map but with QNames turned into Strings.
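A minimal sketch of this kind of conversion follows. It uses a hypothetical stand-in class, QNameStandIn, in place of Alfresco's QName so that the example is self-contained; the "{namespace}localName" string form and the treatment of collection items are assumptions made for illustration, not the actual implementation.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: turns qualified-name keys into Strings and values into Strings or lists of Strings. */
final class SystemToRawSketch {

    /** Hypothetical stand-in for a qualified name, printed as "{namespaceUri}localName". */
    static final class QNameStandIn {
        final String namespaceUri;
        final String localName;

        QNameStandIn(String namespaceUri, String localName) {
            this.namespaceUri = namespaceUri;
            this.localName = localName;
        }

        @Override
        public String toString() {
            return "{" + namespaceUri + "}" + localName;
        }
    }

    static Map<String, Serializable> mapSystemToRaw(Map<QNameStandIn, Serializable> systemMetadata) {
        Map<String, Serializable> raw = new LinkedHashMap<>();
        for (Map.Entry<QNameStandIn, Serializable> entry : systemMetadata.entrySet()) {
            Serializable value = entry.getValue();
            Serializable converted;
            if (value instanceof Collection) {
                // Multi-valued property: convert each item to its String form.
                ArrayList<String> strings = new ArrayList<>();
                for (Object item : (Collection<?>) value) {
                    strings.add(String.valueOf(item));
                }
                converted = strings;
            } else {
                converted = (value == null) ? null : value.toString();
            }
            raw.put(entry.getKey().toString(), converted);
        }
        return raw;
    }

    public static void main(String[] args) {
        Map<QNameStandIn, Serializable> system = new LinkedHashMap<>();
        system.put(new QNameStandIn("http://www.alfresco.org/model/content/1.0", "title"), "Report");
        System.out.println(mapSystemToRaw(system));
    }
}
```

No property mapping is applied in the sketch, which matches the description: the mapping itself is left to the T-Engine.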
-
embedInternal
protected void embedInternal(NodeRef nodeRef, Map<String, Serializable> metadata, ContentReader reader, ContentWriter writer)
- Overrides:
embedInternal in class AbstractMappingMetadataExtracter
-
setMetadata
-
setEmbeddedMetadata
-