Class AsynchronousExtractor

java.lang.Object
org.alfresco.repo.content.metadata.AbstractMappingMetadataExtracter
org.alfresco.repo.content.metadata.AsynchronousExtractor
All Implemented Interfaces:
ContentWorker, MetadataEmbedder, MetadataExtracter, org.springframework.beans.factory.Aware, org.springframework.beans.factory.BeanNameAware, org.springframework.context.ApplicationContextAware

public class AsynchronousExtractor extends AbstractMappingMetadataExtracter
Requests an extract of metadata via a remote async transform using RenditionService2.transform(NodeRef, TransformDefinition). The properties that will be extracted are defined by the transform. This allows out-of-process metadata extracts to be defined without the need to apply an AMP. The actual transform is a request to go from the source mimetype to "alfresco-metadata-extract". The result of the transform is a JSON Map of properties and values to be set on the source node.

As with other subclasses of AbstractMappingMetadataExtracter, it also supports embedding of metadata in a source node. In this case the remote async transform states that it supports a transform from a source mimetype to "alfresco-metadata-embed". The result of the transform is a replacement for the content of the node.

Author:
adavis
  • Constructor Details

    • AsynchronousExtractor

      public AsynchronousExtractor()
  • Method Details

    • setNodeService

      public void setNodeService(NodeService nodeService)
    • setNamespacePrefixResolver

      public void setNamespacePrefixResolver(NamespacePrefixResolver namespacePrefixResolver)
    • setTransformerDebug

      public void setTransformerDebug(TransformerDebug transformerDebug)
    • setRenditionService2

      public void setRenditionService2(RenditionService2 renditionService2)
    • setRenditionDefinitionRegistry2

      public void setRenditionDefinitionRegistry2(RenditionDefinitionRegistry2Impl renditionDefinitionRegistry2)
    • setContentService

      public void setContentService(ContentService contentService)
    • setTransactionService

      public void setTransactionService(TransactionService transactionService)
    • setTransformServiceRegistry

      public void setTransformServiceRegistry(org.alfresco.transform.registry.TransformServiceRegistry transformServiceRegistry)
    • setTaggingService

      public void setTaggingService(TaggingService taggingService)
    • setMetadataExtractorPropertyMappingOverrides

      public void setMetadataExtractorPropertyMappingOverrides(List<MetadataExtractorPropertyMappingOverride> metadataExtractorPropertyMappingOverrides)
    • getDefaultMapping

      protected Map<String,Set<QName>> getDefaultMapping()
      Description copied from class: AbstractMappingMetadataExtracter
      This method provides a best guess of where to store the values extracted from the documents. The list of properties mapped by default need not include all properties extracted from the document; just the obvious set of mappings need be supplied. Implementations must either provide the default mapping properties in the expected location or override the method to provide the default mapping.

      The default implementation looks for the default mapping file in the location given by the class name and .properties. If the extracter's class is x.y.z.MyExtracter then the default properties will be picked up at classpath:/alfresco/metadata/MyExtracter.properties. The previous location of classpath:/x/y/z/MyExtracter.properties is still supported but may be removed in a future release. Inner classes are supported, but the '$' in the class name is replaced with '-', so default properties for x.y.z.MyStuff$MyExtracter will be located using classpath:/alfresco/metadata/MyStuff-MyExtracter.properties.
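
      The naming rule above can be illustrated with a minimal sketch. MappingLocationSketch and defaultMappingLocation are hypothetical names, not part of the Alfresco API, and the sketch only covers the current location, not the older classpath:/x/y/z/ fallback.

```java
// Hypothetical helper (an assumption for illustration only): it mirrors the
// documented naming rule and is not the AbstractMappingMetadataExtracter code.
final class MappingLocationSketch {
    private static final String PREFIX = "classpath:/alfresco/metadata/";

    /** Builds the default mapping location from a binary class name such as x.y.z.Outer$Inner. */
    static String defaultMappingLocation(String binaryClassName) {
        // Drop the package part; lastIndexOf returns -1 when there is no package.
        String simpleName = binaryClassName.substring(binaryClassName.lastIndexOf('.') + 1);
        // Inner classes: the '$' separator is replaced with '-'.
        return PREFIX + simpleName.replace('$', '-') + ".properties";
    }

    public static void main(String[] args) {
        System.out.println(defaultMappingLocation("x.y.z.MyExtracter"));
        System.out.println(defaultMappingLocation("x.y.z.MyStuff$MyExtracter"));
    }
}
```

      Both example class names from the description resolve to the locations quoted there.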

      The default mapping implementation should include thorough Javadocs so that the system administrators can accurately determine how to best enhance or override the default mapping.

      If the default mapping is declared in a properties file other than the one named after the class, then the AbstractMappingMetadataExtracter.readMappingProperties(String) method can be used to quickly generate the return value:

       
            protected Map<String, Set<QName>> getDefaultMapping()
            {
                return readMappingProperties(DEFAULT_MAPPING);
            }
       
       
      The map can also be created in code either statically or during the call.
      Overrides:
      getDefaultMapping in class AbstractMappingMetadataExtracter
      Returns:
      Returns the default, static mapping. It may not be null.
    • isSupported

      public boolean isSupported(String sourceMimetype, long sourceSizeInBytes)
    • isEmbedderSupported

      public boolean isEmbedderSupported(String sourceMimetype, long sourceSizeInBytes)
    • isMetadataExtractMimetype

      public static boolean isMetadataExtractMimetype(String targetMimetype)
    • isMetadataEmbedMimetype

      public static boolean isMetadataEmbedMimetype(String targetMimetype)
    • getTargetMimetypeFromTransformName

      public static String getTargetMimetypeFromTransformName(String transformName)
    • getSourceMimetypeFromTransformName

      public static String getSourceMimetypeFromTransformName(String transformName)
    • getExtension

      public static String getExtension(String targetMimetype, String sourceExtension, String targetExtension)
      Returns a file extension used as the target in a transform. The normal extension is changed if the targetMimetype is an extraction or embedding type.
      Parameters:
      targetMimetype - the target mimetype
      sourceExtension - normal source extension
      targetExtension - current target extension (normally "bin" for embedding and extraction)
      Returns:
      the extension to be used.
    • getRenditionName

      public static String getRenditionName(String renditionName)
      Returns a rendition name used in TransformerDebug. The normal name is changed if it is a metadata extract or embed. The name in this case is actually "alfresco-metadata-extract/" or "alfresco-metadata-embed/" followed by the source mimetype.
      Parameters:
      renditionName - the normal name, or a special one based on the source mimetype and a prefix.
      Returns:
      the renditionName to be used.
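
      The special name format described above (a prefix followed by the source mimetype) can be sketched as follows. TransformNameSketch and its methods are hypothetical and are not the methods of this class; the real parsing in AsynchronousExtractor may differ.

```java
// Hypothetical illustration of the "<prefix>/<source mimetype>" naming
// convention. All names here are assumptions, not Alfresco API.
final class TransformNameSketch {
    static final String EXTRACT = "alfresco-metadata-extract";
    static final String EMBED = "alfresco-metadata-embed";

    /** Joins a special target mimetype and a source mimetype into one name. */
    static String buildName(String targetMimetype, String sourceMimetype) {
        return targetMimetype + "/" + sourceMimetype;
    }

    /** Returns the special target mimetype the name starts with, or null if there is none. */
    static String targetOf(String name) {
        if (name.startsWith(EXTRACT + "/")) return EXTRACT;
        if (name.startsWith(EMBED + "/")) return EMBED;
        return null;
    }

    /** Returns the source mimetype that follows the prefix, or null for a normal name. */
    static String sourceOf(String name) {
        String target = targetOf(name);
        // The source mimetype itself contains '/', so only the prefix is stripped.
        return target == null ? null : name.substring(target.length() + 1);
    }

    public static void main(String[] args) {
        String name = buildName(EXTRACT, "application/pdf");
        System.out.println(name + " -> " + targetOf(name) + " from " + sourceOf(name));
    }
}
```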
    • checkIsSupported

      protected void checkIsSupported(ContentReader reader)
      Description copied from class: AbstractMappingMetadataExtracter
      Checks if the mimetype is supported.
      Overrides:
      checkIsSupported in class AbstractMappingMetadataExtracter
      Parameters:
      reader - the reader to check
    • checkIsEmbedSupported

      protected void checkIsEmbedSupported(ContentWriter writer)
      Description copied from class: AbstractMappingMetadataExtracter
      Checks if embedding for the mimetype is supported.
      Overrides:
      checkIsEmbedSupported in class AbstractMappingMetadataExtracter
      Parameters:
      writer - the writer to check
    • extractRaw

      protected Map<String,Serializable> extractRaw(ContentReader reader)
      Description copied from class: AbstractMappingMetadataExtracter
      Override to provide the raw extracted metadata values. An extracter should extract as many of the available properties as is realistically possible. Even if the default mapping doesn't handle all properties, it is possible for each instance of the extracter to be configured differently and more or less of the properties may be used in different installations.

      Raw values must not be trimmed or removed for any reason. Null values, empty strings and non-serializable values are handled as follows:

      • Null: Removed
      • Empty String: Passed to the OverwritePolicy
      • Non Serializable: Converted to String or fails if that is not possible
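
      The three rules in this list can be sketched as a small post-processing step. RawValueSketch is a hypothetical stand-in, not the code in AbstractMappingMetadataExtracter, and it uses String.valueOf for the conversion, so the "fails if that is not possible" case is not modelled.

```java
import java.io.Serializable;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the handling rules listed above (an assumption,
// not the Alfresco implementation).
final class RawValueSketch {
    static Map<String, Serializable> normalize(Map<String, ?> raw) {
        Map<String, Serializable> result = new LinkedHashMap<>();
        for (Map.Entry<String, ?> entry : raw.entrySet()) {
            Object value = entry.getValue();
            if (value == null) {
                continue; // Null: removed
            }
            if (value instanceof Serializable) {
                // Includes empty strings, which are kept so that the
                // OverwritePolicy can decide what to do with them later.
                result.put(entry.getKey(), (Serializable) value);
            } else {
                // Non-serializable: converted to a String
                result.put(entry.getKey(), String.valueOf(value));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> raw = new LinkedHashMap<>();
        raw.put("title", "Report");
        raw.put("editor", null);
        raw.put("user1", "");
        System.out.println(normalize(raw).keySet());
    }
}
```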

      Properties extracted and their meanings and types should be thoroughly described in the class-level javadocs of the extracter implementation, for example:

       editor: - the document editor        -->  cm:author
       title:  - the document title         -->  cm:title
       user1:  - the document summary
       user2:  - the document description   -->  cm:description
       user3:  -
       user4:  -
       
      Specified by:
      extractRaw in class AbstractMappingMetadataExtracter
      Parameters:
      reader - the document to extract the values from. The stream provided by the reader must be closed if accessed directly.
      Returns:
      Returns a map of document property values keyed by property name.
    • extractRawInThread

      protected Map<String,Serializable> extractRawInThread(NodeRef nodeRef, ContentReader reader, MetadataExtracterLimits limits, MetadataExtracter.OverwritePolicy overwritePolicy) throws Throwable
      Overrides:
      extractRawInThread in class AbstractMappingMetadataExtracter
      Throws:
      Throwable
    • mapSystemToRaw

      protected Map<String,Serializable> mapSystemToRaw(Map<QName,Serializable> systemMetadata)
      As T-Engines do the mapping, all this method can do is convert QNames to fully qualified Strings and the values to Strings or a Collection of Strings.
      Overrides:
      mapSystemToRaw in class AbstractMappingMetadataExtracter
      Parameters:
      systemMetadata - Metadata keyed by system properties
      Returns:
      the original map but with QNames turned into Strings.
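
      The conversion described above can be sketched without the Alfresco classes. In this sketch Name is a hypothetical stand-in for QName, its "{namespaceURI}localName" string form is an assumption about what "fully qualified" means here, and passing null values through unchanged is also an assumption.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch only; not the AsynchronousExtractor implementation.
final class SystemToRawSketch {
    /** Stand-in for a qualified name: a namespace URI plus a local name. */
    static final class Name {
        final String namespaceUri;
        final String localName;

        Name(String namespaceUri, String localName) {
            this.namespaceUri = namespaceUri;
            this.localName = localName;
        }

        String toFullString() {
            return "{" + namespaceUri + "}" + localName;
        }
    }

    static Map<String, Serializable> toRaw(Map<Name, ? extends Serializable> systemMetadata) {
        Map<String, Serializable> raw = new LinkedHashMap<>();
        for (Map.Entry<Name, ? extends Serializable> entry : systemMetadata.entrySet()) {
            Serializable value = entry.getValue();
            Serializable converted;
            if (value == null) {
                converted = null; // assumption: nulls are passed through
            } else if (value instanceof Collection) {
                // Multi-valued property: convert each element to a String.
                ArrayList<String> strings = new ArrayList<>();
                for (Object item : (Collection<?>) value) {
                    strings.add(String.valueOf(item));
                }
                converted = strings;
            } else {
                converted = String.valueOf(value);
            }
            raw.put(entry.getKey().toFullString(), converted);
        }
        return raw;
    }

    public static void main(String[] args) {
        Map<Name, Serializable> system = new LinkedHashMap<>();
        system.put(new Name("http://www.alfresco.org/model/content/1.0", "title"), "Report");
        System.out.println(toRaw(system));
    }
}
```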
    • embedInternal

      protected void embedInternal(NodeRef nodeRef, Map<String,Serializable> metadata, ContentReader reader, ContentWriter writer)
      Overrides:
      embedInternal in class AbstractMappingMetadataExtracter
    • setMetadata

      public void setMetadata(NodeRef nodeRef, InputStream transformInputStream)
    • setEmbeddedMetadata

      public void setEmbeddedMetadata(NodeRef nodeRef, InputStream transformInputStream)