Interface ContentAssembler<D>

    • Method Detail

      • begin

        void begin(ZipAssembler zip,
                   Counters metrics)
        Start the assembly process.
        Parameters:
        zip - Container to add the digital objects to.
        metrics - Metrics that track the total number of digital objects and their size.
      • addContentsOf

        java.util.Map<java.lang.String,ContentInfo> addContentsOf(D domainObject)
                                                                 throws java.io.IOException
        Extracts content from a domain object and adds it to the SIP.
        Parameters:
        domainObject - The domain object for which to extract content and add it to the SIP.
        Returns:
        The mapping between reference information and content info for all digital objects extracted from the domain object
        Throws:
        java.io.IOException - If an I/O error occurs while adding the content
      • noDedup

        static <D> ContentAssembler<D> noDedup(DigitalObjectsExtraction<D> contentsExtraction,
                                               HashAssembler contentHashAssembler)
        Do not deduplicate the digital objects but perform the specified hash calculations.
        Type Parameters:
        D - The type of domain objects to assemble the SIP from
        Parameters:
        contentsExtraction - Extraction of content from domain objects added to the SIP
        contentHashAssembler - Assembler that builds up an encoded hash for the extracted content
        Returns:
        The newly created content assembler
      • noDedup

        static <D> ContentAssembler<D> noDedup(DigitalObjectsExtraction<D> contentsExtraction)
        Do not deduplicate the digital objects and do not perform any hash calculations.
        Type Parameters:
        D - The type of domain objects to assemble the SIP from
        Parameters:
        contentsExtraction - Extraction of content from domain objects added to the SIP
        Returns:
        The newly created content assembler
      • ignoreContent

        static <D> ContentAssembler<D> ignoreContent()
        Ignore all digital objects; the SIP will contain structured data only.
        Type Parameters:
        D - The type of domain objects to assemble the SIP from
        Returns:
        The newly created content assembler
      • withDedupOnRi

        static <D> ContentAssembler<D> withDedupOnRi(DigitalObjectsExtraction<D> contentsExtraction)
        Deduplicate digital objects based on the reference information but do not perform any hash calculation.
        Type Parameters:
        D - The type of domain objects to assemble the SIP from
        Parameters:
        contentsExtraction - Extraction of content from domain objects added to the SIP
        Returns:
        The newly created content assembler
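        The idea of deduplicating on reference information can be sketched with plain JDK collections. This is a hypothetical stand-in, not the SDK's implementation: the class name RiDedupSketch and its methods are assumptions made for illustration only, and reference information is modelled as a simple string.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: not part of the SDK.
class RiDedupSketch {
    // Content that has been stored, keyed by its reference information.
    private final Map<String, byte[]> storedByReferenceInformation = new LinkedHashMap<>();

    // Returns true when the content was stored, false when the same
    // reference information was seen before and the content was skipped.
    boolean add(String referenceInformation, byte[] content) {
        return storedByReferenceInformation.putIfAbsent(referenceInformation, content) == null;
    }

    // Number of distinct digital objects that were stored.
    int storedCount() {
        return storedByReferenceInformation.size();
    }
}
```

        In this sketch the content is never inspected, which matches the description above: only the reference information decides whether an object counts as a duplicate.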
      • withDedupOnRi

        static <D> ContentAssembler<D> withDedupOnRi(DigitalObjectsExtraction<D> contentsExtraction,
                                                     HashAssembler contentHashAssembler)
        Deduplicate digital objects based on the reference information and perform the specified hash calculations.
        Type Parameters:
        D - The type of domain objects to assemble the SIP from
        Parameters:
        contentsExtraction - Extraction of content from domain objects added to the SIP
        contentHashAssembler - Assembler that builds up an encoded hash for the extracted content
        Returns:
        The newly created content assembler
      • withDedupOnRi

        static <D> ContentAssembler<D> withDedupOnRi(DigitalObjectsExtraction<D> contentsExtraction,
                                                     HashAssembler contentHashAssembler,
                                                     int estimatedMaxDigitalObjects)
        Deduplicate digital objects based on the reference information and perform the specified hash calculations.
        Type Parameters:
        D - The type of domain objects to assemble the SIP from
        Parameters:
        contentsExtraction - Extraction of content from domain objects added to the SIP
        contentHashAssembler - Assembler that builds up an encoded hash for the extracted content
        estimatedMaxDigitalObjects - A hint used to size the internal buffers so that they can hold the specified number of digital objects without reallocation
        Returns:
        The newly created content assembler
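        To show why a size hint such as estimatedMaxDigitalObjects can avoid reallocation, here is a minimal sketch that assumes the internal buffer behaves like a java.util.HashMap with the default load factor of 0.75. The documentation does not say which data structure is used, so that is an assumption, and PresizeSketch is a hypothetical helper.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only: the real internal buffers are not documented here.
class PresizeSketch {
    // An initial capacity at which a HashMap with the default load factor (0.75)
    // can hold `expected` entries without resizing.
    static int capacityFor(int expected) {
        return (int) Math.ceil(expected / 0.75);
    }

    // Creates an empty map that is pre-sized for the expected number of entries.
    static <K, V> Map<K, V> newMapFor(int expected) {
        return new HashMap<>(capacityFor(expected));
    }
}
```

        An estimate that is too low only costs extra reallocations later; one that is far too high reserves memory that is never used.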
      • withDedupOnRiAndValidation

        static <D> ContentAssembler<D> withDedupOnRiAndValidation(DigitalObjectsExtraction<D> contentsExtraction,
                                                                  HashAssembler contentHashAssembler,
                                                                  boolean errorWhenEqualRiAndNotEqualHash,
                                                                  boolean errorWhenEqualHashAndNotEqualRI)
        Deduplicate digital objects based on the reference information, perform the specified hash calculations, and optionally validate that the same reference information always refers to the same content.

        Note that setting errorWhenEqualRiAndNotEqualHash to true will force a read of the content when the same reference information is used twice.

        Note that setting either validation to true requires at least one hash.

        Type Parameters:
        D - The type of domain objects to assemble the SIP from
        Parameters:
        contentsExtraction - Extraction of content from domain objects added to the SIP
        contentHashAssembler - Assembler that builds up an encoded hash for the extracted content
        errorWhenEqualRiAndNotEqualHash - Throw an exception when the same reference information is used but the actual content is different.
        errorWhenEqualHashAndNotEqualRI - Throw an exception when the same content is included twice but with different reference information.
        Returns:
        The newly created content assembler
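        The two validation flags can be pictured as two lookups, one per direction. The sketch below is a hypothetical model, not the SDK's code: DedupValidationSketch, its use of IllegalStateException, and the choice to pass the hash in as a string are all assumptions. Because the hash is supplied by the caller, the sketch does not model the forced re-read of content mentioned above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only: not part of the SDK.
class DedupValidationSketch {
    private final Map<String, String> hashByRi = new HashMap<>();
    private final Map<String, String> riByHash = new HashMap<>();
    private final boolean errorWhenEqualRiAndNotEqualHash;
    private final boolean errorWhenEqualHashAndNotEqualRi;

    DedupValidationSketch(boolean errorWhenEqualRiAndNotEqualHash,
                          boolean errorWhenEqualHashAndNotEqualRi) {
        this.errorWhenEqualRiAndNotEqualHash = errorWhenEqualRiAndNotEqualHash;
        this.errorWhenEqualHashAndNotEqualRi = errorWhenEqualHashAndNotEqualRi;
    }

    // Returns true when the object is new, false when it is deduplicated on
    // its reference information. The exception type is an assumption.
    boolean add(String referenceInformation, String hash) {
        String knownHash = hashByRi.get(referenceInformation);
        if (knownHash != null) {
            // Same reference information as before: optionally check the content.
            if (errorWhenEqualRiAndNotEqualHash && !knownHash.equals(hash)) {
                throw new IllegalStateException("Same reference information, different content");
            }
            return false;
        }
        // New reference information: optionally check that the content is new too.
        if (errorWhenEqualHashAndNotEqualRi && riByHash.containsKey(hash)) {
            throw new IllegalStateException("Same content, different reference information");
        }
        hashByRi.put(referenceInformation, hash);
        riByHash.putIfAbsent(hash, referenceInformation);
        return true;
    }
}
```

        Both checks compare hashes, which is consistent with the note that enabling either validation requires at least one hash.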
      • withDedupOnRiAndValidation

        static <D> ContentAssembler<D> withDedupOnRiAndValidation(DigitalObjectsExtraction<D> contentsExtraction,
                                                                  HashAssembler contentHashAssembler,
                                                                  boolean errorWhenEqualRiAndNotEqualHash,
                                                                  boolean errorWhenEqualHashAndNotEqualRI,
                                                                  int estimatedMaxDigitalObjects)
        Deduplicate digital objects based on the reference information, perform the specified hash calculations, and optionally validate that the same reference information always refers to the same content.

        Note that setting errorWhenEqualRiAndNotEqualHash to true will force a read of the content when the same reference information is used twice.

        Note that setting either validation to true requires at least one hash.

        Type Parameters:
        D - The type of domain objects to assemble the SIP from
        Parameters:
        contentsExtraction - Extraction of content from domain objects added to the SIP
        contentHashAssembler - Assembler that builds up an encoded hash for the extracted content
        errorWhenEqualRiAndNotEqualHash - Throw an exception when the same reference information is used but the actual content is different.
        errorWhenEqualHashAndNotEqualRI - Throw an exception when the same content is included twice but with different reference information.
        estimatedMaxDigitalObjects - A hint used to size the internal buffers so that they can hold the specified number of digital objects without reallocation
        Returns:
        The newly created content assembler
      • withDedupOnHash

        static <D> ContentAssembler<D> withDedupOnHash(DigitalObjectsExtraction<D> contentsExtraction,
                                                       HashAssembler contentHashAssembler)
        Deduplicate digital objects based on their hash value.
        Type Parameters:
        D - The type of domain objects to assemble the SIP from
        Parameters:
        contentsExtraction - Extraction of content from domain objects added to the SIP
        contentHashAssembler - Assembler that builds up an encoded hash for the extracted content
        Returns:
        The newly created content assembler
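        Deduplication on hash value can be sketched as follows. This is a hypothetical model under stated assumptions: HashDedupSketch is not an SDK class, and SHA-256 with Base64 encoding merely stands in for whatever the configured HashAssembler produces.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only: not part of the SDK.
class HashDedupSketch {
    // Encoded hashes of all content stored so far.
    private final Set<String> seenHashes = new HashSet<>();

    // Computes a Base64-encoded SHA-256 digest of the content.
    static String sha256Base64(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    // Returns true when the content is new, false when identical content
    // (by hash) was already stored, regardless of how it is referenced.
    boolean add(byte[] content) {
        return seenHashes.add(sha256Base64(content));
    }

    // Number of distinct digital objects that were stored.
    int storedCount() {
        return seenHashes.size();
    }
}
```

        Unlike deduplication on reference information, this approach has to read every digital object in order to compute its hash.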
      • withDedupOnHash

        static <D> ContentAssembler<D> withDedupOnHash(DigitalObjectsExtraction<D> contentsExtraction,
                                                       HashAssembler contentHashAssembler,
                                                       int estimatedMaxDigitalObjects)
        Deduplicate digital objects based on their hash value.
        Type Parameters:
        D - The type of domain objects to assemble the SIP from
        Parameters:
        contentsExtraction - Extraction of content from domain objects added to the SIP
        contentHashAssembler - Assembler that builds up an encoded hash for the extracted content
        estimatedMaxDigitalObjects - A hint used to size the internal buffers so that they can hold the specified number of digital objects without reallocation
        Returns:
        The newly created content assembler