Interface MetadataExtracter

All Superinterfaces:
ContentWorker
All Known Implementing Classes:
AbstractMappingMetadataExtracter, AsynchronousExtractor, RFC822MetadataExtracter, XmlMetadataExtracter, XPathMetadataExtracter

@AlfrescoPublicApi public interface MetadataExtracter extends ContentWorker
Interface for document property extracters.

Please pardon the incorrect spelling of extractor.

Author:
Jesper Steen Møller, Derek Hulley
  • Method Details

    • isSupported

      boolean isSupported(String mimetype)
      Determines if the extracter works against the given mimetype.
      Parameters:
      mimetype - the document mimetype
      Returns:
      Returns true if the mimetype is supported, otherwise false.
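A minimal sketch of the up-front `isSupported` check, assuming simplified stand-in types rather than the Alfresco classes: `SimpleExtracter`, `PlainTextExtracter`, `extractIfSupported` and the `demo:length` key are hypothetical, and `String` stands in for both `ContentReader` and `QName`.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class IsSupportedGuardSketch {

    // Hypothetical stand-in for the interface: String replaces ContentReader and QName.
    interface SimpleExtracter {
        boolean isSupported(String mimetype);
        Map<String, Serializable> extract(String content, Map<String, Serializable> destination);
    }

    // Toy extracter that only accepts plain text and records the content length.
    static class PlainTextExtracter implements SimpleExtracter {
        @Override
        public boolean isSupported(String mimetype) {
            return "text/plain".equals(mimetype);
        }

        @Override
        public Map<String, Serializable> extract(String content, Map<String, Serializable> destination) {
            Map<String, Serializable> changed = new HashMap<>();
            changed.put("demo:length", content.length());
            destination.putAll(changed);
            return changed;
        }
    }

    // Calls extract only when the mimetype is supported; otherwise returns an empty map
    // and leaves the destination untouched.
    static Map<String, Serializable> extractIfSupported(SimpleExtracter extracter, String mimetype,
            String content, Map<String, Serializable> destination) {
        if (!extracter.isSupported(mimetype)) {
            return new HashMap<>();
        }
        return extracter.extract(content, destination);
    }

    public static void main(String[] args) {
        Map<String, Serializable> destination = new HashMap<>();
        SimpleExtracter extracter = new PlainTextExtracter();
        System.out.println(extractIfSupported(extracter, "text/plain", "hello", destination));
        System.out.println(extractIfSupported(extracter, "application/pdf", "hello", destination));
    }
}
```

With the real interface, the mimetype passed to the check would normally be the one reported by the reader itself.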
    • extract

      Map<QName,Serializable> extract(ContentReader reader, Map<QName,Serializable> destination)
      Extracts metadata values from the content provided by the reader, using the reader's source mimetype, and writes them to the supplied map. The extracter's internal mapping between document metadata and system metadata is used, along with its internal overwrite policy.

      The extraction viability can be determined by an up-front call to isSupported(String).

      The source mimetype must be available on the ContentAccessor.getMimetype() method of the reader.

      Parameters:
      reader - the source of the content
      destination - the map of properties to populate (essentially a return value)
      Returns:
      Returns a map of all properties on the destination map that were added or modified. If the return map is empty, then no properties were modified.
      Throws:
      ContentIOException - if a detectable error occurs
    • extract

      Map<QName,Serializable> extract(ContentReader reader, MetadataExtracter.OverwritePolicy overwritePolicy, Map<QName,Serializable> destination)
      Extracts metadata values from the content provided by the reader, using the reader's source mimetype, and writes them to the supplied map.

      The extraction viability can be determined by an up-front call to isSupported(String).

      The source mimetype must be available on the ContentAccessor.getMimetype() method of the reader.

      Parameters:
      reader - the source of the content
      overwritePolicy - the policy stipulating how the system properties must be overwritten if present
      destination - the map of properties to populate (essentially a return value)
      Returns:
      Returns a map of all properties on the destination map that were added or modified. If the return map is empty, then no properties were modified.
      Throws:
      ContentIOException - if a detectable error occurs
    • extract

      Map<QName,Serializable> extract(ContentReader reader, MetadataExtracter.OverwritePolicy overwritePolicy, Map<QName,Serializable> destination, Map<String,Set<QName>> mapping)
      Extracts metadata from the content provided by the reader, using the reader's source mimetype, and writes it to the supplied map. The mapping from document metadata to system metadata is explicitly provided. The overwrite policy is also explicitly set.

      The extraction viability can be determined by an up-front call to isSupported(String).

      The source mimetype must be available on the ContentAccessor.getMimetype() method of the reader.

      Parameters:
      reader - the source of the content
      overwritePolicy - the policy stipulating how the system properties must be overwritten if present
      destination - the map of properties to populate (essentially a return value)
      mapping - a mapping of document-specific properties to system properties.
      Returns:
      Returns a map of all properties on the destination map that were added or modified. If the return map is empty, then no properties were modified.
      Throws:
      ContentIOException - if a detectable error occurs
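The shape of the `mapping` argument and of the returned map can be illustrated with a small sketch. This is not the Alfresco implementation: `applyMapping` is a hypothetical helper, `String` stands in for `QName`, the property names are illustrative, and the overwrite policy is ignored (every differing value is overwritten).

```java
import java.io.Serializable;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

public class MappingSketch {

    // Copies each raw document value to every system property it is mapped to.
    // Returns only the properties that were added or modified on the destination.
    static Map<String, Serializable> applyMapping(Map<String, Serializable> rawValues,
            Map<String, Set<String>> mapping, Map<String, Serializable> destination) {
        Map<String, Serializable> changed = new HashMap<>();
        for (Map.Entry<String, Serializable> raw : rawValues.entrySet()) {
            // Document properties without a mapping entry are skipped.
            Set<String> targets = mapping.getOrDefault(raw.getKey(), Collections.emptySet());
            for (String target : targets) {
                if (!Objects.equals(destination.get(target), raw.getValue())) {
                    destination.put(target, raw.getValue());
                    changed.put(target, raw.getValue());
                }
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        Map<String, Serializable> raw = new HashMap<>();
        raw.put("author", "Jane");
        raw.put("unmapped", "ignored");

        // One document property may map to several system properties.
        Map<String, Set<String>> mapping = new HashMap<>();
        Set<String> authorTargets = new HashSet<>();
        authorTargets.add("cm:author");
        authorTargets.add("demo:creator");
        mapping.put("author", authorTargets);

        Map<String, Serializable> destination = new HashMap<>();
        destination.put("cm:author", "Jane"); // already up to date, so not reported as changed

        System.out.println(applyMapping(raw, mapping, destination));
    }
}
```

An empty result from a call like this corresponds to the documented case where no properties were modified.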
    • extract

      default Map<QName,Serializable> extract(NodeRef nodeRef, ContentReader reader, Map<QName,Serializable> destination)
      Identical to extract(ContentReader, Map) but with the addition of the NodeRef being acted on. By default, the method without the NodeRef is called.
      Parameters:
      nodeRef - the node being acted on.
      reader - the source of the content
      destination - the map of properties to populate (essentially a return value)
      Returns:
      Returns a map of all properties on the destination map that were added or modified. If the return map is empty, then no properties were modified.
      Throws:
      ContentIOException - if a detectable error occurs
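The delegation described above follows the usual Java `default` method pattern. A minimal sketch, assuming hypothetical stand-in types (`SimpleExtracter`, `TitleExtracter`, with `String` in place of `NodeRef`, `ContentReader` and `QName`):

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class DefaultDelegationSketch {

    // Hypothetical stand-in for the interface.
    interface SimpleExtracter {
        Map<String, Serializable> extract(String content, Map<String, Serializable> destination);

        // Mirrors the documented default: the node reference is ignored and the
        // overload without it is called.
        default Map<String, Serializable> extract(String nodeRef, String content,
                Map<String, Serializable> destination) {
            return extract(content, destination);
        }
    }

    // Toy extracter that implements only the overload without a node reference.
    static class TitleExtracter implements SimpleExtracter {
        @Override
        public Map<String, Serializable> extract(String content, Map<String, Serializable> destination) {
            Map<String, Serializable> changed = new HashMap<>();
            // Treat the first line of the content as the title.
            changed.put("demo:title", content.split("\n", 2)[0]);
            destination.putAll(changed);
            return changed;
        }
    }

    public static void main(String[] args) {
        SimpleExtracter extracter = new TitleExtracter();
        Map<String, Serializable> destination = new HashMap<>();
        // The node reference string below is illustrative only.
        System.out.println(extracter.extract("workspace://SpacesStore/abc", "Title\nbody", destination));
    }
}
```

An implementation that needs the node can override the default overload; existing implementations keep working without changes.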
    • extract

      default Map<QName,Serializable> extract(NodeRef nodeRef, ContentReader reader, MetadataExtracter.OverwritePolicy overwritePolicy, Map<QName,Serializable> destination, Map<String,Set<QName>> mapping)
      Identical to extract(ContentReader, OverwritePolicy, Map, Map) but with the addition of the NodeRef being acted on. By default, the method without the NodeRef is called.
      Parameters:
      nodeRef - the node being acted on.
      reader - the source of the content
      overwritePolicy - the policy stipulating how the system properties must be overwritten if present
      destination - the map of properties to populate (essentially a return value)
      mapping - a mapping of document-specific properties to system properties.
      Returns:
      Returns a map of all properties on the destination map that were added or modified. If the return map is empty, then no properties were modified.
      Throws:
      ContentIOException - if a detectable error occurs