Package org.deeplearning4j.parallelism
Class ParallelInference.Builder
java.lang.Object
  org.deeplearning4j.parallelism.ParallelInference.Builder

Enclosing class:
ParallelInference

public static class ParallelInference.Builder extends Object
Field Summary
Fields
Modifier and Type           Field             Description
protected LoadBalanceMode   loadBalanceMode
Constructor Summary
Constructors
Constructor                                                 Description
Builder(@NonNull org.deeplearning4j.nn.api.Model model)
Method Summary
All Methods  Instance Methods  Concrete Methods

Modifier and Type          Method                                                      Description
ParallelInference.Builder  batchLimit(int limit)                                       This method defines how many input samples can be batched within a given time frame.
ParallelInference          build()                                                     This method builds a new ParallelInference instance.
ParallelInference.Builder  inferenceMode(@NonNull InferenceMode inferenceMode)         This method allows you to define the mode that will be used during inference.
ParallelInference.Builder  loadBalanceMode(@NonNull LoadBalanceMode loadBalanceMode)   This method allows you to specify the load balance mode.
ParallelInference.Builder  queueLimit(int limit)                                       This method defines the buffer queue size.
ParallelInference.Builder  workers(int workers)                                        This method defines how many model copies will be used for inference.
Field Detail

loadBalanceMode
protected LoadBalanceMode loadBalanceMode
Method Detail
inferenceMode
public ParallelInference.Builder inferenceMode(@NonNull InferenceMode inferenceMode)
This method allows you to define the mode that will be used during inference. Options are:
SEQUENTIAL: Input will be sent to the last-used worker unmodified.
BATCHED: Multiple inputs will be packed into a single batch and sent to the last-used device.
Parameters:
inferenceMode -
Returns:
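Example: a minimal sketch of selecting an inference mode through the builder. It assumes InferenceMode is the enum in the org.deeplearning4j.parallelism.inference package, as in recent DL4J releases; the model argument is any already-loaded Model implementation.

    import org.deeplearning4j.nn.api.Model;
    import org.deeplearning4j.parallelism.ParallelInference;
    import org.deeplearning4j.parallelism.inference.InferenceMode;

    public class InferenceModeSketch {
        // Pack concurrent requests into single batches before dispatching them to a worker.
        public static ParallelInference batched(Model model) {
            return new ParallelInference.Builder(model)
                    .inferenceMode(InferenceMode.BATCHED)
                    .build();
        }

        // Send each request to the last-used worker as-is, without batching.
        public static ParallelInference sequential(Model model) {
            return new ParallelInference.Builder(model)
                    .inferenceMode(InferenceMode.SEQUENTIAL)
                    .build();
        }
    }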
loadBalanceMode
public ParallelInference.Builder loadBalanceMode(@NonNull LoadBalanceMode loadBalanceMode)
This method allows you to specify the load balance mode.
Parameters:
loadBalanceMode -
Returns:
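Example: a sketch of setting the load balance mode. The constant ROUND_ROBIN and the package org.deeplearning4j.parallelism.inference.LoadBalanceMode are assumptions based on recent DL4J releases; check the LoadBalanceMode enum in your version for the values actually available.

    import org.deeplearning4j.nn.api.Model;
    import org.deeplearning4j.parallelism.ParallelInference;
    import org.deeplearning4j.parallelism.inference.LoadBalanceMode;

    public class LoadBalanceSketch {
        // Distribute incoming requests across workers in round-robin order (assumed constant name).
        public static ParallelInference roundRobin(Model model) {
            return new ParallelInference.Builder(model)
                    .loadBalanceMode(LoadBalanceMode.ROUND_ROBIN)
                    .build();
        }
    }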
workers
public ParallelInference.Builder workers(int workers)
This method defines how many model copies will be used for inference.
PLEASE NOTE: This method is primarily suited for multi-GPU systems.
PLEASE NOTE: For INPLACE inference mode this value means the number of models per DEVICE.
Parameters:
workers -
Returns:
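Example: a sketch of sizing the worker pool for a hypothetical two-GPU host, so each device gets its own model copy. The value 2 is purely illustrative.

    import org.deeplearning4j.nn.api.Model;
    import org.deeplearning4j.parallelism.ParallelInference;

    public class WorkersSketch {
        // One model copy per GPU on an assumed two-GPU system.
        public static ParallelInference twoWorkers(Model model) {
            return new ParallelInference.Builder(model)
                    .workers(2)
                    .build();
        }
    }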
batchLimit
public ParallelInference.Builder batchLimit(int limit)
This method defines how many input samples can be batched within a given time frame.
PLEASE NOTE: This value has no effect in SEQUENTIAL inference mode.
Parameters:
limit -
Returns:
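Example: since batchLimit only takes effect outside SEQUENTIAL mode, this sketch pairs it with BATCHED mode; the limit of 32 is an arbitrary illustrative value, and InferenceMode is assumed to come from org.deeplearning4j.parallelism.inference.

    import org.deeplearning4j.nn.api.Model;
    import org.deeplearning4j.parallelism.ParallelInference;
    import org.deeplearning4j.parallelism.inference.InferenceMode;

    public class BatchLimitSketch {
        // Collect up to 32 queued samples into a single forward pass (no effect in SEQUENTIAL mode).
        public static ParallelInference batchedWithLimit(Model model) {
            return new ParallelInference.Builder(model)
                    .inferenceMode(InferenceMode.BATCHED)
                    .batchLimit(32)
                    .build();
        }
    }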
queueLimit
public ParallelInference.Builder queueLimit(int limit)
This method defines the buffer queue size. Default value: 64.
Parameters:
limit -
Returns:
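Example: a sketch that raises the buffer queue size above its default of 64; the value 256 is an arbitrary illustrative choice.

    import org.deeplearning4j.nn.api.Model;
    import org.deeplearning4j.parallelism.ParallelInference;

    public class QueueLimitSketch {
        // Allow up to 256 requests to wait in the buffer queue (default is 64).
        public static ParallelInference largerQueue(Model model) {
            return new ParallelInference.Builder(model)
                    .queueLimit(256)
                    .build();
        }
    }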
build
public ParallelInference build()
This method builds a new ParallelInference instance.
Returns:
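Example: a sketch of a complete builder chain followed by a prediction call. It assumes InferenceMode and LoadBalanceMode live in org.deeplearning4j.parallelism.inference and that ParallelInference exposes output(INDArray), as in recent DL4J releases; all numeric values are illustrative only.

    import org.deeplearning4j.nn.api.Model;
    import org.deeplearning4j.parallelism.ParallelInference;
    import org.deeplearning4j.parallelism.inference.InferenceMode;
    import org.deeplearning4j.parallelism.inference.LoadBalanceMode;
    import org.nd4j.linalg.api.ndarray.INDArray;

    public class ParallelInferenceSketch {
        // Wrap an already-trained model in ParallelInference and run a single prediction.
        public static INDArray predict(Model model, INDArray features) {
            ParallelInference pi = new ParallelInference.Builder(model)
                    .inferenceMode(InferenceMode.BATCHED)          // pack concurrent requests into batches
                    .batchLimit(32)                                // at most 32 samples per batch
                    .workers(2)                                    // two model copies, e.g. one per GPU
                    .queueLimit(64)                                // default buffer queue size
                    .loadBalanceMode(LoadBalanceMode.ROUND_ROBIN)  // assumed enum constant
                    .build();
            return pi.output(features); // output(INDArray) assumed available on ParallelInference
        }
    }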