Class ParallelInference.Builder

    • Constructor Detail

      • Builder

        public Builder(@NonNull org.deeplearning4j.nn.api.Model model)
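
        A minimal construction sketch, assuming a previously trained network restored from disk (the file name and the MultiLayerNetwork type are placeholders; any org.deeplearning4j.nn.api.Model implementation can be passed):

            import java.io.File;
            import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
            import org.deeplearning4j.parallelism.ParallelInference;
            import org.deeplearning4j.util.ModelSerializer;

            // Restore a trained network; "model.zip" is a placeholder path
            MultiLayerNetwork model = ModelSerializer.restoreMultiLayerNetwork(new File("model.zip"));
            ParallelInference.Builder builder = new ParallelInference.Builder(model);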
    • Method Detail

      • inferenceMode

        public ParallelInference.Builder inferenceMode(@NonNull InferenceMode inferenceMode)
        This method defines the mode that will be used during inference. Options are:
        SEQUENTIAL: each input is sent to the last-used worker unmodified.
        BATCHED: multiple inputs are packed into a single batch and sent to the last-used device.
        Parameters:
        inferenceMode - the inference mode to use
        Returns:
        this builder
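
        For example, to pack concurrent requests into batches (continuing the construction sketch above):

            import org.deeplearning4j.parallelism.inference.InferenceMode;

            // BATCHED mode aggregates concurrent requests into single batches
            builder.inferenceMode(InferenceMode.BATCHED);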
      • loadBalanceMode

        public ParallelInference.Builder loadBalanceMode(@NonNull LoadBalanceMode loadBalanceMode)
        This method allows you to specify the load balance mode, i.e. how incoming requests are distributed across workers.
        Parameters:
        loadBalanceMode - the load balance mode to use
        Returns:
        this builder
      • workers

        public ParallelInference.Builder workers(int workers)
        This method defines how many model copies will be used for inference.
        PLEASE NOTE: This method is primarily suited for multi-GPU systems.
        PLEASE NOTE: For INPLACE inference mode, this value means the number of models per device.
        Parameters:
        workers - number of model copies to use
        Returns:
        this builder
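
        For instance, a common starting point on a hypothetical two-GPU system is one model copy per device (the worker count is an illustrative assumption):

            // One model copy per GPU on a two-GPU system
            builder.workers(2);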
      • batchLimit

        public ParallelInference.Builder batchLimit(int limit)
        This method defines how many input samples can be batched within a given time frame.
        PLEASE NOTE: This value has no effect in SEQUENTIAL inference mode.
        Parameters:
        limit - maximum number of samples per batch
        Returns:
        this builder
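
        Since the limit only takes effect when batching occurs, it is normally paired with BATCHED mode (the value 32 is an illustrative assumption):

            // Only meaningful together with InferenceMode.BATCHED
            builder.inferenceMode(InferenceMode.BATCHED)
                   .batchLimit(32);  // pack at most 32 samples into one batch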
      • queueLimit

        public ParallelInference.Builder queueLimit(int limit)
        This method defines the buffer queue size. Default value: 64.
        Parameters:
        limit - maximum size of the buffer queue
        Returns:
        this builder
      • build

        public ParallelInference build()
        This method builds a new ParallelInference instance with the configuration specified above.
        Returns:
        a new ParallelInference instance
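
        Putting the builder methods together, a typical end-to-end sketch might look as follows (the model path, input shape, and all parameter values are assumptions for illustration):

            import java.io.File;
            import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
            import org.deeplearning4j.parallelism.ParallelInference;
            import org.deeplearning4j.parallelism.inference.InferenceMode;
            import org.deeplearning4j.parallelism.inference.LoadBalanceMode;
            import org.deeplearning4j.util.ModelSerializer;
            import org.nd4j.linalg.api.ndarray.INDArray;
            import org.nd4j.linalg.factory.Nd4j;

            MultiLayerNetwork model = ModelSerializer.restoreMultiLayerNetwork(new File("model.zip"));

            ParallelInference pi = new ParallelInference.Builder(model)
                    .inferenceMode(InferenceMode.BATCHED)          // aggregate concurrent requests
                    .loadBalanceMode(LoadBalanceMode.ROUND_ROBIN)  // rotate requests across workers
                    .workers(2)                                    // two model copies
                    .batchLimit(32)                                // at most 32 samples per batch
                    .queueLimit(64)                                // buffer queue size (the default)
                    .build();

            // Random input stands in for real features; the shape must match the network's input
            INDArray features = Nd4j.rand(1, 784);
            INDArray prediction = pi.output(features);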