Class SipAssembler<D>

  • Type Parameters:
    D - The type of domain objects to assemble the SIP from
    All Implemented Interfaces:
    Assembler<D>

    public class SipAssembler<D>
    extends java.lang.Object
    implements Assembler<D>
    Assembles a Submission Information Package (SIP) from several domain objects of the same type. Each domain object is typically a Plain Old Java Object (POJO) that you create in an application-specific manner.

    A SIP is a ZIP file that contains:

    • One Packaging Information that describes the content of the SIP
    • One Preservation Description Information (PDI) that contains structured data to be archived
    • Zero or more Content Data Objects that contain unstructured data to be archived. These are referenced from the PDI

    Packaging Information is created by a factory. If you want only one SIP in a Data Submission Session (DSS), then you can use a DefaultPackagingInformationFactory to create the Packaging Information based on a prototype which contains application-specific fields.

    The PDI will be assembled from the domain objects by an Assembler and added to the ZIP by a ZipAssembler. Each domain object may also contain zero or more DigitalObjects, which are extracted from the domain object using a DigitalObjectsExtraction and added to the ZIP. The PDI is written to a DataBuffer until it is complete. For small PDIs, you can use a MemoryBuffer to hold this data, but for larger PDIs you should use a FileBuffer to prevent running out of memory.
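
    The buffer guidance above can be sketched as a supplier that picks a buffer by expected PDI size. This is only an illustration: SketchBuffer, HeapBuffer, TempFileBuffer, and the 16 MiB threshold are hypothetical stand-ins, not the SDK's DataBuffer, MemoryBuffer, or FileBuffer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Supplier;

// Hypothetical stand-in for a data buffer; only what this sketch needs.
interface SketchBuffer {
  OutputStream openForWriting() throws IOException;
}

// Keeps all data on the heap, in the spirit of a memory buffer.
class HeapBuffer implements SketchBuffer {
  private final ByteArrayOutputStream bytes = new ByteArrayOutputStream();

  @Override public OutputStream openForWriting() { return bytes; }
}

// Writes data to a temporary file, in the spirit of a file buffer.
class TempFileBuffer implements SketchBuffer {
  private final Path file;

  TempFileBuffer() {
    try {
      file = Files.createTempFile("pdi", ".tmp");
      file.toFile().deleteOnExit();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  @Override public OutputStream openForWriting() throws IOException {
    return Files.newOutputStream(file);
  }
}

class BufferChoice {
  // Assumed threshold for illustration; tune it to the available heap.
  static final long MEMORY_LIMIT_BYTES = 16L * 1024 * 1024;

  // Returns a supplier suitable for a parameter like pdiBufferSupplier.
  static Supplier<SketchBuffer> supplierFor(long expectedPdiBytes) {
    if (expectedPdiBytes <= MEMORY_LIMIT_BYTES) {
      return HeapBuffer::new;
    }
    return TempFileBuffer::new;
  }
}
```

    A supplier is used rather than a single buffer instance so that a fresh buffer can be created each time one is needed.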

    Use the following steps to assemble a SIP:

    1. Start the process by calling the start(DataBuffer) method
    2. Add zero or more domain objects by calling the add(Object) method once for each object
    3. Finish the process by calling the end() method
    You can optionally get metrics about the SIP assembly process by calling getMetrics() at any time.
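
    The three steps above can be sketched as follows. To keep the example self-contained, DataBuffer, Assembler, and RecordingAssembler are hypothetical stand-ins that mirror only the method signatures documented on this page; with the real SDK you would call the same three methods on a SipAssembler.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins that mirror only the signatures documented on this
// page. They are NOT the SDK's own types.
interface DataBuffer { }

interface Assembler<D> {
  void start(DataBuffer buffer) throws IOException;
  void add(D domainObject);
  void end() throws IOException;
}

// Toy assembler that records each call so the call order is visible.
class RecordingAssembler<D> implements Assembler<D> {
  final List<String> calls = new ArrayList<>();

  @Override public void start(DataBuffer buffer) { calls.add("start"); }
  @Override public void add(D domainObject) { calls.add("add:" + domainObject); }
  @Override public void end() { calls.add("end"); }
}

class LifecycleSketch {
  static List<String> demo() {
    RecordingAssembler<String> recorder = new RecordingAssembler<>();
    Assembler<String> assembler = recorder;
    DataBuffer buffer = new DataBuffer() { };
    try {
      assembler.start(buffer);          // step 1: start once
      for (String domainObject : List.of("invoice-1", "invoice-2")) {
        assembler.add(domainObject);    // step 2: add each domain object
      }
      assembler.end();                  // step 3: end once
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return recorder.calls;
  }

  public static void main(String[] args) {
    System.out.println(demo()); // [start, add:invoice-1, add:invoice-2, end]
  }
}
```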

    If the number of domain objects is small and each individual domain object is also small, you can wrap a SipAssembler in a Generator to reduce the above steps to a single call.

    To assemble a number of SIPs in a batch, use BatchSipAssembler.

    Warning:
    This object is not thread-safe. If you want to use multiple threads to assemble SIPs, give each thread its own instance.
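
    One way to follow this warning is to create the assembler inside each task, so that no instance is ever shared between threads. The sketch below assumes a hypothetical CountingAssembler stand-in instead of a real SipAssembler; only the confinement pattern is the point.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Supplier;

// Hypothetical stand-in for a non-thread-safe assembler; it only counts adds.
class CountingAssembler {
  private int added; // unsynchronized on purpose: one instance per task only

  void add(String domainObject) { added++; }
  int added() { return added; }
}

class PerThreadAssemblers {
  // Each batch runs as one task, and each task creates its own assembler.
  static List<Integer> assembleInParallel(List<List<String>> batches,
      Supplier<CountingAssembler> assemblerFactory) {
    int threads = Math.max(1, Math.min(4, batches.size()));
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<Integer>> pending = new ArrayList<>();
      for (List<String> batch : batches) {
        Callable<Integer> task = () -> {
          CountingAssembler assembler = assemblerFactory.get(); // never shared
          for (String domainObject : batch) {
            assembler.add(domainObject);
          }
          return assembler.added();
        };
        pending.add(pool.submit(task));
      }
      List<Integer> counts = new ArrayList<>();
      for (Future<Integer> result : pending) {
        counts.add(result.get()); // results are read in submission order
      }
      return counts;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
  }
}
```

    Passing a Supplier rather than an instance makes it hard to share one assembler across tasks by accident.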
    • Constructor Detail

      • SipAssembler

        public SipAssembler(PackagingInformationFactory packagingInformationFactory,
                            Assembler<HashedContents<D>> pdiAssembler,
                            HashAssembler pdiHashAssembler,
                            java.util.function.Supplier<? extends DataBuffer> pdiBufferSupplier,
                            ContentAssembler<D> contentAssembler)
        Create a new instance.
        Parameters:
        packagingInformationFactory - Factory for creating the Packaging Information
        pdiAssembler - Assembler that builds up the PDI
        pdiHashAssembler - Assembler that builds up an encoded hash for the PDI and the unstructured data
        pdiBufferSupplier - Supplier for a data buffer to store the PDI
        contentAssembler - ContentAssembler that adds the digital objects to the SIP
    • Method Detail

      • forPdi

        public static <D> SipAssembler<D> forPdi(PackagingInformation prototype,
                                                 Assembler<HashedContents<D>> pdiAssembler)
        Assemble a SIP that contains only structured data and is the only SIP in its DSS.
        Type Parameters:
        D - The type of domain objects to assemble the SIP from
        Parameters:
        prototype - Prototype for the Packaging Information
        pdiAssembler - Assembler that builds up the PDI
        Returns:
        The newly created SIP assembler
      • forPdi

        public static <D> SipAssembler<D> forPdi(PackagingInformationFactory factory,
                                                 Assembler<HashedContents<D>> pdiAssembler)
        Assemble a SIP that contains only structured data and is the only SIP in its DSS.
        Type Parameters:
        D - The type of domain objects to assemble the SIP from
        Parameters:
        factory - Factory for creating the Packaging Information
        pdiAssembler - Assembler that builds up the PDI
        Returns:
        The newly created SIP assembler
      • forPdiWithHashing

        public static <D> SipAssembler<D> forPdiWithHashing(PackagingInformation prototype,
                                                            Assembler<HashedContents<D>> pdiAssembler,
                                                            HashAssembler pdiHashAssembler)
        Assemble a SIP that contains only structured data, with a hash computed for the PDI, and is the only SIP in its DSS.
        Type Parameters:
        D - The type of domain objects to assemble the SIP from
        Parameters:
        prototype - Prototype for the Packaging Information
        pdiAssembler - Assembler that builds up the PDI
        pdiHashAssembler - Assembler that builds up an encoded hash for the PDI
        Returns:
        The newly created SIP assembler
      • forPdiWithHashing

        public static <D> SipAssembler<D> forPdiWithHashing(PackagingInformationFactory factory,
                                                            Assembler<HashedContents<D>> pdiAssembler,
                                                            HashAssembler pdiHashAssembler)
        Assemble a SIP that contains only structured data, with a hash computed for the PDI, and is the only SIP in its DSS.
        Type Parameters:
        D - The type of domain objects to assemble the SIP from
        Parameters:
        factory - Factory for creating the Packaging Information
        pdiAssembler - Assembler that builds up the PDI
        pdiHashAssembler - Assembler that builds up an encoded hash for the PDI
        Returns:
        The newly created SIP assembler
      • forPdiAndContent

        public static <D> SipAssembler<D> forPdiAndContent(PackagingInformation prototype,
                                                           Assembler<HashedContents<D>> pdiAssembler,
                                                           DigitalObjectsExtraction<D> contentsExtraction)
        Assemble a SIP that contains both structured data and content and is the only SIP in its DSS.
        Type Parameters:
        D - The type of domain objects to assemble the SIP from
        Parameters:
        prototype - Prototype for the Packaging Information
        pdiAssembler - Assembler that builds up the PDI
        contentsExtraction - Extraction of content from domain objects added to the SIP
        Returns:
        The newly created SIP assembler
      • forPdiAndContent

        public static <D> SipAssembler<D> forPdiAndContent(PackagingInformation prototype,
                                                           Assembler<HashedContents<D>> pdiAssembler,
                                                           ContentAssembler<D> contentAssembler)
        Assemble a SIP that contains both structured data and content and is the only SIP in its DSS.
        Type Parameters:
        D - The type of domain objects to assemble the SIP from
        Parameters:
        prototype - Prototype for the Packaging Information
        pdiAssembler - Assembler that builds up the PDI
        contentAssembler - ContentAssembler that adds the digital objects to the SIP
        Returns:
        The newly created SIP assembler
      • forPdiAndContent

        public static <D> SipAssembler<D> forPdiAndContent(PackagingInformationFactory factory,
                                                           Assembler<HashedContents<D>> pdiAssembler,
                                                           ContentAssembler<D> contentAssembler)
        Assemble a SIP that contains both structured data and content and is the only SIP in its DSS.
        Type Parameters:
        D - The type of domain objects to assemble the SIP from
        Parameters:
        factory - Factory for creating the Packaging Information
        pdiAssembler - Assembler that builds up the PDI
        contentAssembler - ContentAssembler that adds the digital objects to the SIP
        Returns:
        The newly created SIP assembler
      • forPdiAndContent

        public static <D> SipAssembler<D> forPdiAndContent(PackagingInformationFactory factory,
                                                           Assembler<HashedContents<D>> pdiAssembler,
                                                           DigitalObjectsExtraction<D> contentsExtraction)
        Assemble a SIP that contains both structured data and content and is the only SIP in its DSS.
        Type Parameters:
        D - The type of domain objects to assemble the SIP from
        Parameters:
        factory - Factory for creating the Packaging Information
        pdiAssembler - Assembler that builds up the PDI
        contentsExtraction - Extraction of content from domain objects added to the SIP
        Returns:
        The newly created SIP assembler
      • forPdiAndContentWithContentHashing

        public static <D> SipAssembler<D> forPdiAndContentWithContentHashing(PackagingInformation prototype,
                                                                             Assembler<HashedContents<D>> pdiAssembler,
                                                                             DigitalObjectsExtraction<D> contentsExtraction,
                                                                             HashAssembler contentHashAssembler)
        Assemble a SIP that contains both structured data and content, with a hash computed for the extracted content, and is the only SIP in its DSS.
        Type Parameters:
        D - The type of domain objects to assemble the SIP from
        Parameters:
        prototype - Prototype for the Packaging Information
        pdiAssembler - Assembler that builds up the PDI
        contentsExtraction - Extraction of content from domain objects added to the SIP
        contentHashAssembler - Assembler that builds up an encoded hash for the extracted content
        Returns:
        The newly created SIP assembler
      • forPdiAndContentWithContentHashing

        public static <D> SipAssembler<D> forPdiAndContentWithContentHashing(PackagingInformationFactory factory,
                                                                             Assembler<HashedContents<D>> pdiAssembler,
                                                                             DigitalObjectsExtraction<D> contentsExtraction,
                                                                             HashAssembler contentHashAssembler)
        Assemble a SIP that contains both structured data and content, with a hash computed for the extracted content, and is the only SIP in its DSS.
        Type Parameters:
        D - The type of domain objects to assemble the SIP from
        Parameters:
        factory - Factory for creating the Packaging Information
        pdiAssembler - Assembler that builds up the PDI
        contentsExtraction - Extraction of content from domain objects added to the SIP
        contentHashAssembler - Assembler that builds up an encoded hash for the extracted content
        Returns:
        The newly created SIP assembler
      • forPdiAndContentWithHashing

        public static <D> SipAssembler<D> forPdiAndContentWithHashing(PackagingInformation prototype,
                                                                      Assembler<HashedContents<D>> pdiAssembler,
                                                                      HashAssembler pdiHashAssembler,
                                                                      DigitalObjectsExtraction<D> contentsExtraction,
                                                                      HashAssembler contentHashAssembler)
        Assemble a SIP that is the only SIP in its DSS.
        Type Parameters:
        D - The type of domain objects to assemble the SIP from
        Parameters:
        prototype - Prototype for the Packaging Information
        pdiAssembler - Assembler that builds up the PDI
        pdiHashAssembler - Assembler that builds up an encoded hash for the PDI
        contentsExtraction - Extraction of content from domain objects added to the SIP
        contentHashAssembler - Assembler that builds up an encoded hash for the extracted content
        Returns:
        The newly created SIP assembler
      • forPdiAndContentWithHashing

        public static <D> SipAssembler<D> forPdiAndContentWithHashing(PackagingInformation prototype,
                                                                      Assembler<HashedContents<D>> pdiAssembler,
                                                                      HashAssembler pdiHashAssembler,
                                                                      ContentAssembler<D> contentAssembler)
        Assemble a SIP that is the only SIP in its DSS.
        Type Parameters:
        D - The type of domain objects to assemble the SIP from
        Parameters:
        prototype - Prototype for the Packaging Information
        pdiAssembler - Assembler that builds up the PDI
        pdiHashAssembler - Assembler that builds up an encoded hash for the PDI
        contentAssembler - ContentAssembler that adds the digital objects to the SIP
        Returns:
        The newly created SIP assembler
      • forPdiAndContentWithHashing

        public static <D> SipAssembler<D> forPdiAndContentWithHashing(PackagingInformationFactory factory,
                                                                      Assembler<HashedContents<D>> pdiAssembler,
                                                                      HashAssembler pdiHashAssembler,
                                                                      ContentAssembler<D> contentAssembler)
        Assemble a SIP that is the only SIP in its DSS.
        Type Parameters:
        D - The type of domain objects to assemble the SIP from
        Parameters:
        factory - Factory for creating the Packaging Information
        pdiAssembler - Assembler that builds up the PDI
        pdiHashAssembler - Assembler that builds up an encoded hash for the PDI
        contentAssembler - ContentAssembler that adds the digital objects to the SIP
        Returns:
        The newly created SIP assembler
      • start

        public void start(DataBuffer buffer)
                   throws java.io.IOException
        Description copied from interface: Assembler
        Start the assembly process.
        Specified by:
        start in interface Assembler<D>
        Parameters:
        buffer - Storage for the assembled product
        Throws:
        java.io.IOException - When an I/O error occurs
      • add

        public void add(D domainObject)
        Description copied from interface: Assembler
        Add a component to the assembly. The assembly must be started first by calling start(DataBuffer).
        Specified by:
        add in interface Assembler<D>
        Parameters:
        domainObject - The component to add
      • end

        public void end()
                 throws java.io.IOException
        Description copied from interface: Assembler
        Finish the assembly process.
        Specified by:
        end in interface Assembler<D>
        Throws:
        java.io.IOException - When an I/O error occurs
      • getMetrics

        public SipMetrics getMetrics()
        Description copied from interface: Assembler
        Return metrics about the assembly process. Implementations will generally provide dedicated classes that you should cast the result to.
        Specified by:
        getMetrics in interface Assembler<D>
        Returns:
        Metrics about the assembly process, or null if no metrics are provided
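
        When you hold only an Assembler reference, the contract above means the result may be null or may need a cast. A minimal sketch of handling both cases follows; Metrics and SketchSipMetrics (including its domainObjects() accessor) are hypothetical stand-ins, not the SDK's types. On a SipAssembler reference no cast is needed, because getMetrics() is declared to return SipMetrics.

```java
import java.util.Optional;

// Hypothetical stand-ins; the real metrics types belong to the SDK.
interface Metrics { }

final class SketchSipMetrics implements Metrics {
  private final long domainObjects; // hypothetical field for illustration

  SketchSipMetrics(long domainObjects) { this.domainObjects = domainObjects; }
  long domainObjects() { return domainObjects; }
}

class MetricsSketch {
  // Returns the dedicated metrics type, or empty when the metrics are null
  // or of some other type.
  static Optional<SketchSipMetrics> asSipMetrics(Metrics metrics) {
    if (metrics instanceof SketchSipMetrics) { // instanceof is false for null
      return Optional.of((SketchSipMetrics) metrics);
    }
    return Optional.empty();
  }
}
```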