Class DataProcessing
- java.lang.Object
  - software.amazon.awssdk.services.sagemaker.model.DataProcessing
- All Implemented Interfaces: Serializable, SdkPojo, ToCopyableBuilder<DataProcessing.Builder,DataProcessing>
@Generated("software.amazon.awssdk:codegen") public final class DataProcessing extends Object implements SdkPojo, Serializable, ToCopyableBuilder<DataProcessing.Builder,DataProcessing>
The data structure used to specify the data to be used for inference in a batch transform job and to associate the data that is relevant to the prediction results in the output. The input filter provided allows you to exclude input data that is not needed for inference in a batch transform job. The output filter provided allows you to include input data relevant to interpreting the predictions in the output from the job. For more information, see Associate Prediction Results with their Corresponding Input Records.
- See Also:
- Serialized Form
-
-
Nested Class Summary
Nested Classes (modifier and type, class):
- static interface DataProcessing.Builder
-
Method Summary
Methods (modifier and type, method, description):
- static DataProcessing.Builder builder()
- boolean equals(Object obj)
- boolean equalsBySdkFields(Object obj)
- <T> Optional<T> getValueForField(String fieldName, Class<T> clazz)
- int hashCode()
- String inputFilter(): A JSONPath expression used to select a portion of the input data to pass to the algorithm.
- JoinSource joinSource(): Specifies the source of the data to join with the transformed data.
- String joinSourceAsString(): Specifies the source of the data to join with the transformed data.
- String outputFilter(): A JSONPath expression used to select a portion of the joined dataset to save in the output file for a batch transform job.
- List<SdkField<?>> sdkFields()
- static Class<? extends DataProcessing.Builder> serializableBuilderClass()
- DataProcessing.Builder toBuilder()
- String toString(): Returns a string representation of this object.
-
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface software.amazon.awssdk.utils.builder.ToCopyableBuilder
copy
Method Detail
-
inputFilter
public final String inputFilter()
A JSONPath expression used to select a portion of the input data to pass to the algorithm. Use the InputFilter parameter to exclude fields, such as an ID column, from the input. If you want SageMaker to pass the entire input dataset to the algorithm, accept the default value $.
Examples: "$", "$[1:]", "$.features"
- Returns:
- A JSONPath expression used to select a portion of the input data to pass to the algorithm. Use the InputFilter parameter to exclude fields, such as an ID column, from the input. If you want SageMaker to pass the entire input dataset to the algorithm, accept the default value $.
Examples: "$", "$[1:]", "$.features"
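To make the effect of an expression such as "$[1:]" concrete, the following is a minimal sketch in plain Java. It does not call SageMaker or the SDK; it only mimics what dropping a leading ID column does to one CSV row. The class name InputFilterSketch, the method sliceFrom, and the sample row are hypothetical, and the bounds check is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: mimics the effect of "$[start:]" on one CSV row.
final class InputFilterSketch {

    // Treats the row as an array of fields and keeps everything from index `start` onward.
    static List<String> sliceFrom(String csvRow, int start) {
        List<String> fields = Arrays.asList(csvRow.split(",", -1));
        if (start < 0 || start > fields.size()) {
            // Assumption of this sketch: reject slice starts outside the row.
            throw new IllegalArgumentException("start " + start + " is outside a row of size " + fields.size());
        }
        return fields.subList(start, fields.size());
    }

    public static void main(String[] args) {
        // The first field plays the role of an ID column that the algorithm should not see.
        System.out.println(sliceFrom("id-42,0.5,1.7,3.1", 1)); // prints [0.5, 1.7, 3.1]
        // A start of 0 keeps the whole row, comparable to the default "$".
        System.out.println(sliceFrom("id-42,0.5,1.7,3.1", 0)); // prints [id-42, 0.5, 1.7, 3.1]
    }
}
```

In a real job the expression string itself, for example "$[1:]", is what gets passed as the InputFilter value; the slicing is performed by the service.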
-
outputFilter
public final String outputFilter()
A JSONPath expression used to select a portion of the joined dataset to save in the output file for a batch transform job. If you want SageMaker to store the entire input dataset in the output file, leave the default value, $. If you specify indexes that aren't within the dimension size of the joined dataset, you get an error.
Examples: "$", "$[0,5:]", "$['id','SageMakerOutput']"
- Returns:
- A JSONPath expression used to select a portion of the joined dataset to save in the output file for a batch transform job. If you want SageMaker to store the entire input dataset in the output file, leave the default value, $. If you specify indexes that aren't within the dimension size of the joined dataset, you get an error.
Examples: "$", "$[0,5:]", "$['id','SageMakerOutput']"
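The out-of-range rule can be illustrated with a small plain-Java sketch that mimics "$[0,5:]" on one joined row: it keeps index 0 plus everything from index 5 onward. This is not the service's implementation. The class OutputFilterSketch, the method indexAndTail, the sample row, and the use of IllegalArgumentException as a stand-in for the service error are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: mimics "$[index,sliceStart:]" on one joined row.
final class OutputFilterSketch {

    static List<String> indexAndTail(List<String> joinedRow, int index, int sliceStart) {
        int size = joinedRow.size();
        // Assumption: both positions must fall inside the joined row, otherwise fail fast.
        if (index < 0 || index >= size || sliceStart < 0 || sliceStart >= size) {
            throw new IllegalArgumentException("index outside a joined row of size " + size);
        }
        List<String> selected = new ArrayList<>();
        selected.add(joinedRow.get(index));                   // the single index, e.g. an ID column
        selected.addAll(joinedRow.subList(sliceStart, size)); // the tail slice, e.g. the prediction
        return selected;
    }

    public static void main(String[] args) {
        // Seven fields: an ID, five features, and the appended prediction.
        List<String> joined = Arrays.asList("id-42", "a", "b", "c", "d", "e", "0.93");
        System.out.println(indexAndTail(joined, 0, 5)); // prints [id-42, e, 0.93]
    }
}
```

Checking the expression against the width of the joined dataset, input columns plus appended output columns, before submitting a job avoids the error described above.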
-
joinSource
public final JoinSource joinSource()
Specifies the source of the data to join with the transformed data. The valid values are None and Input. The default value is None, which specifies not to join the input with the transformed data. If you want the batch transform job to join the original input data with the transformed data, set JoinSource to Input. You can specify OutputFilter as an additional filter to select a portion of the joined dataset and store it in the output file.
For JSON or JSONLines objects, such as a JSON array, SageMaker adds the transformed data to the input JSON object in an attribute called SageMakerOutput. The joined result for JSON must be a key-value pair object. If the input is not a key-value pair object, SageMaker creates a new JSON file. In the new JSON file, the input data is stored under the SageMakerInput key and the results are stored in SageMakerOutput.
For CSV data, SageMaker takes each row as a JSON array and joins the transformed data with the input by appending each transformed row to the end of the input. The joined data has the original input data followed by the transformed data, and the output is a CSV file.
For information on how joining is applied, see Workflow for Associating Inferences with Input Records.
If the service returns an enum value that is not available in the current SDK version, joinSource will return JoinSource.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from joinSourceAsString().
- Returns:
- Specifies the source of the data to join with the transformed data. The valid values are None and Input. The default value is None, which specifies not to join the input with the transformed data. If you want the batch transform job to join the original input data with the transformed data, set JoinSource to Input. You can specify OutputFilter as an additional filter to select a portion of the joined dataset and store it in the output file.
For JSON or JSONLines objects, such as a JSON array, SageMaker adds the transformed data to the input JSON object in an attribute called SageMakerOutput. The joined result for JSON must be a key-value pair object. If the input is not a key-value pair object, SageMaker creates a new JSON file. In the new JSON file, the input data is stored under the SageMakerInput key and the results are stored in SageMakerOutput.
For CSV data, SageMaker takes each row as a JSON array and joins the transformed data with the input by appending each transformed row to the end of the input. The joined data has the original input data followed by the transformed data, and the output is a CSV file.
For information on how joining is applied, see Workflow for Associating Inferences with Input Records.
- See Also:
JoinSource
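The join rules described for joinSource can be sketched in plain Java as follows. This is a simplified model, not SageMaker's implementation: JSON values are represented as java.util.Map and java.util.List instead of parsed JSON text, and the class JoinSketch with its methods joinCsv and joinJson is hypothetical. Only the key names SageMakerInput and SageMakerOutput come from the description above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of the join behavior when JoinSource is Input.
final class JoinSketch {

    // CSV: the transformed row is appended to the end of the input row.
    static String joinCsv(String inputRow, String transformedRow) {
        return inputRow + "," + transformedRow;
    }

    // JSON: a key-value input gains a SageMakerOutput attribute; any other input
    // is wrapped in a new object under SageMakerInput.
    static Map<String, Object> joinJson(Object input, Object prediction) {
        Map<String, Object> joined = new LinkedHashMap<>();
        if (input instanceof Map) {
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) input).entrySet()) {
                joined.put(String.valueOf(entry.getKey()), entry.getValue());
            }
        } else {
            joined.put("SageMakerInput", input);
        }
        joined.put("SageMakerOutput", prediction);
        return joined;
    }

    public static void main(String[] args) {
        System.out.println(joinCsv("id-42,0.5,1.7", "0.93")); // prints id-42,0.5,1.7,0.93

        Map<String, Object> record = new LinkedHashMap<>();
        record.put("id", 7);
        System.out.println(joinJson(record, 0.9).keySet());               // prints [id, SageMakerOutput]
        System.out.println(joinJson(Arrays.asList(1, 2), 0.9).keySet());  // prints [SageMakerInput, SageMakerOutput]
    }
}
```

An OutputFilter such as "$['id','SageMakerOutput']" would then select from the joined object produced in the key-value case.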
-
joinSourceAsString
public final String joinSourceAsString()
Specifies the source of the data to join with the transformed data. The valid values are None and Input. The default value is None, which specifies not to join the input with the transformed data. If you want the batch transform job to join the original input data with the transformed data, set JoinSource to Input. You can specify OutputFilter as an additional filter to select a portion of the joined dataset and store it in the output file.
For JSON or JSONLines objects, such as a JSON array, SageMaker adds the transformed data to the input JSON object in an attribute called SageMakerOutput. The joined result for JSON must be a key-value pair object. If the input is not a key-value pair object, SageMaker creates a new JSON file. In the new JSON file, the input data is stored under the SageMakerInput key and the results are stored in SageMakerOutput.
For CSV data, SageMaker takes each row as a JSON array and joins the transformed data with the input by appending each transformed row to the end of the input. The joined data has the original input data followed by the transformed data, and the output is a CSV file.
For information on how joining is applied, see Workflow for Associating Inferences with Input Records.
If the service returns an enum value that is not available in the current SDK version, joinSource will return JoinSource.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from joinSourceAsString().
- Returns:
- Specifies the source of the data to join with the transformed data. The valid values are None and Input. The default value is None, which specifies not to join the input with the transformed data. If you want the batch transform job to join the original input data with the transformed data, set JoinSource to Input. You can specify OutputFilter as an additional filter to select a portion of the joined dataset and store it in the output file.
For JSON or JSONLines objects, such as a JSON array, SageMaker adds the transformed data to the input JSON object in an attribute called SageMakerOutput. The joined result for JSON must be a key-value pair object. If the input is not a key-value pair object, SageMaker creates a new JSON file. In the new JSON file, the input data is stored under the SageMakerInput key and the results are stored in SageMakerOutput.
For CSV data, SageMaker takes each row as a JSON array and joins the transformed data with the input by appending each transformed row to the end of the input. The joined data has the original input data followed by the transformed data, and the output is a CSV file.
For information on how joining is applied, see Workflow for Associating Inferences with Input Records.
- See Also:
JoinSource
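The relationship between joinSource() and joinSourceAsString() follows a forward-compatibility pattern: a raw string that the enum does not recognize maps to a sentinel constant instead of failing, while the raw string stays available. The sketch below reproduces that pattern with a local stand-in enum so that it runs without the SDK. JoinSourceSketch and JoinSourceDemo are hypothetical names, and the null handling is an assumption of this sketch, not a statement about the SDK's behavior.

```java
// Hypothetical stand-in for an SDK enum with an "unknown" sentinel constant.
enum JoinSourceSketch {
    INPUT("Input"),
    NONE("None"),
    UNKNOWN_TO_SDK_VERSION(null);

    private final String value;

    JoinSourceSketch(String value) {
        this.value = value;
    }

    // Maps a raw service value to a constant; unrecognized values map to the sentinel.
    static JoinSourceSketch fromValue(String raw) {
        if (raw == null) {
            return null; // assumption: an absent value stays absent
        }
        for (JoinSourceSketch candidate : values()) {
            if (raw.equals(candidate.value)) {
                return candidate;
            }
        }
        return UNKNOWN_TO_SDK_VERSION;
    }
}

final class JoinSourceDemo {
    public static void main(String[] args) {
        String raw = "SomeFutureValue"; // made-up value standing in for one added after this SDK version
        JoinSourceSketch parsed = JoinSourceSketch.fromValue(raw);
        if (parsed == JoinSourceSketch.UNKNOWN_TO_SDK_VERSION) {
            // Fall back to the raw string, as joinSourceAsString() allows in the SDK.
            System.out.println("Unrecognized join source: " + raw);
        }
    }
}
```

Code that branches on the enum would typically treat the sentinel as its own case and log or pass through the raw string instead of assuming one of the known values.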
-
toBuilder
public DataProcessing.Builder toBuilder()
- Specified by: toBuilder in interface ToCopyableBuilder<DataProcessing.Builder,DataProcessing>
-
builder
public static DataProcessing.Builder builder()
-
serializableBuilderClass
public static Class<? extends DataProcessing.Builder> serializableBuilderClass()
-
equalsBySdkFields
public final boolean equalsBySdkFields(Object obj)
- Specified by: equalsBySdkFields in interface SdkPojo
-
toString
public final String toString()
Returns a string representation of this object. This is useful for testing and debugging. Sensitive data will be redacted from this string using a placeholder value.
-
-