Class TabularJobConfig
- java.lang.Object
-
- software.amazon.awssdk.services.sagemaker.model.TabularJobConfig
-
- All Implemented Interfaces:
Serializable, SdkPojo, ToCopyableBuilder<TabularJobConfig.Builder,TabularJobConfig>
@Generated("software.amazon.awssdk:codegen") public final class TabularJobConfig extends Object implements SdkPojo, Serializable, ToCopyableBuilder<TabularJobConfig.Builder,TabularJobConfig>
The collection of settings used by an AutoML job V2 for the tabular problem type.
- See Also:
- Serialized Form
-
-
Nested Class Summary
Modifier and Type, Class
static interface TabularJobConfig.Builder
-
Method Summary
Modifier and Type, Method, Description
static TabularJobConfig.Builder builder()
CandidateGenerationConfig candidateGenerationConfig()
The configuration information of how model candidates are generated.
AutoMLJobCompletionCriteria completionCriteria()
Returns the value of the CompletionCriteria property for this object.
boolean equals(Object obj)
boolean equalsBySdkFields(Object obj)
String featureSpecificationS3Uri()
A URL to the Amazon S3 data source containing selected features from the input data source to run an Autopilot job V2.
Boolean generateCandidateDefinitionsOnly()
Generates possible candidates without training the models.
<T> Optional<T> getValueForField(String fieldName, Class<T> clazz)
int hashCode()
AutoMLMode mode()
The method that Autopilot uses to train the data.
String modeAsString()
The method that Autopilot uses to train the data.
ProblemType problemType()
The type of supervised learning problem available for the model candidates of the AutoML job V2.
String problemTypeAsString()
The type of supervised learning problem available for the model candidates of the AutoML job V2.
String sampleWeightAttributeName()
If specified, this column name indicates which column of the dataset should be treated as sample weights for use by the objective metric during the training, evaluation, and the selection of the best model.
List<SdkField<?>> sdkFields()
static Class<? extends TabularJobConfig.Builder> serializableBuilderClass()
String targetAttributeName()
The name of the target variable in supervised learning, usually represented by 'y'.
TabularJobConfig.Builder toBuilder()
String toString()
Returns a string representation of this object.
-
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface software.amazon.awssdk.utils.builder.ToCopyableBuilder
copy
-
Method Detail
-
candidateGenerationConfig
public final CandidateGenerationConfig candidateGenerationConfig()
The configuration information of how model candidates are generated.
- Returns:
- The configuration information of how model candidates are generated.
-
completionCriteria
public final AutoMLJobCompletionCriteria completionCriteria()
Returns the value of the CompletionCriteria property for this object.
- Returns:
- The value of the CompletionCriteria property for this object.
-
featureSpecificationS3Uri
public final String featureSpecificationS3Uri()
A URL to the Amazon S3 data source containing selected features from the input data source to run an Autopilot job V2. You can input FeatureAttributeNames (optional) in JSON format as shown below:
{ "FeatureAttributeNames":["col1", "col2", ...] }
You can also specify the data type of the feature (optional) in the format shown below:
{ "FeatureDataTypes":{"col1":"numeric", "col2":"categorical" ... } }
These column keys may not include the target column.
In ensembling mode, Autopilot only supports the following data types: numeric, categorical, text, and datetime. In HPO mode, Autopilot can support numeric, categorical, text, datetime, and sequence.
If only FeatureDataTypes is provided, the column keys (col1, col2, ..) should be a subset of the column names in the input data.
If both FeatureDataTypes and FeatureAttributeNames are provided, then the column keys should be a subset of the column names provided in FeatureAttributeNames.
The key name FeatureAttributeNames is fixed. The values listed in ["col1", "col2", ...] are case sensitive and should be a list of strings containing unique values that are a subset of the column names in the input data. The list of columns provided must not include the target column.
- Returns:
- A URL to the Amazon S3 data source containing selected features from the input data source to run an Autopilot job V2.
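The constraints described for the feature specification file can be checked locally before the file is uploaded to Amazon S3. The following is a minimal sketch only: FeatureSpecCheck and its validate method are hypothetical names and are not part of the AWS SDK, and the service may apply further checks that are not covered here.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper (not part of the AWS SDK) that checks a feature specification locally. */
final class FeatureSpecCheck {

    /**
     * Returns a list of human-readable problems; an empty list means none of the
     * documented rules was violated. featureAttributeNames and featureDataTypes
     * may each be null because both keys are optional.
     */
    static List<String> validate(List<String> inputColumns,
                                 String targetColumn,
                                 List<String> featureAttributeNames,
                                 Map<String, String> featureDataTypes,
                                 boolean ensemblingMode) {
        List<String> problems = new ArrayList<>();
        Set<String> input = new HashSet<>(inputColumns);

        // Data types listed for ensembling mode; HPO mode additionally lists "sequence".
        Set<String> allowedTypes =
                new HashSet<>(List.of("numeric", "categorical", "text", "datetime"));
        if (!ensemblingMode) {
            allowedTypes.add("sequence");
        }

        if (featureAttributeNames != null) {
            Set<String> seen = new HashSet<>();
            for (String name : featureAttributeNames) {
                // String comparison is case sensitive, matching the documented behavior.
                if (!seen.add(name)) {
                    problems.add("duplicate feature name: " + name);
                }
                if (!input.contains(name)) {
                    problems.add("not an input column: " + name);
                }
                if (name.equals(targetColumn)) {
                    problems.add("target column listed as feature: " + name);
                }
            }
        }

        if (featureDataTypes != null) {
            // Keys must be a subset of FeatureAttributeNames when that list is given,
            // otherwise a subset of the input columns.
            Set<String> allowedKeys = featureAttributeNames != null
                    ? new HashSet<>(featureAttributeNames)
                    : input;
            for (Map.Entry<String, String> entry : featureDataTypes.entrySet()) {
                String key = entry.getKey();
                if (!allowedKeys.contains(key)) {
                    problems.add("unknown column key: " + key);
                }
                if (key.equals(targetColumn)) {
                    problems.add("target column used as data type key: " + key);
                }
                if (!allowedTypes.contains(entry.getValue())) {
                    problems.add("unsupported data type for " + key + ": " + entry.getValue());
                }
            }
        }
        return problems;
    }
}
```

For example, calling validate with input columns age, city, and label, target column label, and FeatureAttributeNames containing age and label reports one problem, because the target column is listed as a feature.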
-
mode
public final AutoMLMode mode()
The method that Autopilot uses to train the data. You can either specify the mode manually or let Autopilot choose for you based on the dataset size by selecting AUTO. In AUTO mode, Autopilot chooses ENSEMBLING for datasets smaller than 100 MB, and HYPERPARAMETER_TUNING for larger ones.
The ENSEMBLING mode uses a multi-stack ensemble model to predict classification and regression tasks directly from your dataset. This machine learning mode combines several base models to produce an optimal predictive model. It then uses a stacking ensemble method to combine predictions from contributing members. A multi-stack ensemble model can provide better performance over a single model by combining the predictive capabilities of multiple models. See Autopilot algorithm support for a list of algorithms supported by ENSEMBLING mode.
The HYPERPARAMETER_TUNING (HPO) mode uses the best hyperparameters to train the best version of a model. HPO automatically selects an algorithm for the type of problem you want to solve. Then HPO finds the best hyperparameters according to your objective metric. See Autopilot algorithm support for a list of algorithms supported by HYPERPARAMETER_TUNING mode.
If the service returns an enum value that is not available in the current SDK version, mode will return AutoMLMode.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from modeAsString().
- Returns:
- The method that Autopilot uses to train the data.
- See Also:
AutoMLMode
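The AUTO rule described for this property can be illustrated with a small sketch. The actual choice is made by the service, not by client code. AutoModeSketch and resolve are hypothetical names, and reading "100 MB" as 100 * 1024 * 1024 bytes is an assumption, as is sending a dataset of exactly that size to HYPERPARAMETER_TUNING.

```java
/** Hypothetical illustration of the documented AUTO rule; the real decision is made by the service. */
final class AutoModeSketch {

    // Assumption: "100 MB" is interpreted as 100 * 1024 * 1024 bytes.
    static final long THRESHOLD_BYTES = 100L * 1024 * 1024;

    /**
     * Returns the mode that would be used for training. Explicit modes pass
     * through unchanged; AUTO is resolved from the dataset size.
     */
    static String resolve(String requestedMode, long datasetSizeBytes) {
        if (!"AUTO".equals(requestedMode)) {
            return requestedMode;
        }
        // Smaller than the threshold: ENSEMBLING. Otherwise: HYPERPARAMETER_TUNING.
        return datasetSizeBytes < THRESHOLD_BYTES ? "ENSEMBLING" : "HYPERPARAMETER_TUNING";
    }

    public static void main(String[] args) {
        System.out.println(resolve("AUTO", 10L * 1024 * 1024));
        System.out.println(resolve("AUTO", 500L * 1024 * 1024));
    }
}
```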
-
modeAsString
public final String modeAsString()
The method that Autopilot uses to train the data. You can either specify the mode manually or let Autopilot choose for you based on the dataset size by selecting AUTO. In AUTO mode, Autopilot chooses ENSEMBLING for datasets smaller than 100 MB, and HYPERPARAMETER_TUNING for larger ones.
The ENSEMBLING mode uses a multi-stack ensemble model to predict classification and regression tasks directly from your dataset. This machine learning mode combines several base models to produce an optimal predictive model. It then uses a stacking ensemble method to combine predictions from contributing members. A multi-stack ensemble model can provide better performance over a single model by combining the predictive capabilities of multiple models. See Autopilot algorithm support for a list of algorithms supported by ENSEMBLING mode.
The HYPERPARAMETER_TUNING (HPO) mode uses the best hyperparameters to train the best version of a model. HPO automatically selects an algorithm for the type of problem you want to solve. Then HPO finds the best hyperparameters according to your objective metric. See Autopilot algorithm support for a list of algorithms supported by HYPERPARAMETER_TUNING mode.
If the service returns an enum value that is not available in the current SDK version, mode will return AutoMLMode.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from modeAsString().
- Returns:
- The method that Autopilot uses to train the data.
- See Also:
AutoMLMode
-
generateCandidateDefinitionsOnly
public final Boolean generateCandidateDefinitionsOnly()
Generates possible candidates without training the models. A model candidate is a combination of data preprocessors, algorithms, and algorithm parameter settings.
- Returns:
- Generates possible candidates without training the models. A model candidate is a combination of data preprocessors, algorithms, and algorithm parameter settings.
-
problemType
public final ProblemType problemType()
The type of supervised learning problem available for the model candidates of the AutoML job V2. For more information, see SageMaker Autopilot problem types.
You must either specify the type of supervised learning problem in ProblemType and provide the AutoMLJobObjective metric, or none at all.
If the service returns an enum value that is not available in the current SDK version, problemType will return ProblemType.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from problemTypeAsString().
- Returns:
- The type of supervised learning problem available for the model candidates of the AutoML job V2. For more information, see SageMaker Autopilot problem types. You must either specify the type of supervised learning problem in ProblemType and provide the AutoMLJobObjective metric, or none at all.
- See Also:
ProblemType
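The relationship between the enum accessor and the raw string accessor can be sketched without the SDK on the classpath. ProblemTypeSketch below is a local stand-in written for illustration, not the SDK's ProblemType class, and the wire values it lists are assumptions. It shows the documented pattern: an unrecognized service value maps to UNKNOWN_TO_SDK_VERSION, and the raw string is still available to the caller.

```java
/** Local stand-in for illustration only; this is not the SDK's ProblemType enum. */
enum ProblemTypeSketch {
    // Assumption: these wire values are illustrative.
    BINARY_CLASSIFICATION("BinaryClassification"),
    MULTICLASS_CLASSIFICATION("MulticlassClassification"),
    REGRESSION("Regression"),
    UNKNOWN_TO_SDK_VERSION(null);

    private final String value;

    ProblemTypeSketch(String value) {
        this.value = value;
    }

    /** Maps a raw service value to a constant; unrecognized values map to UNKNOWN_TO_SDK_VERSION. */
    static ProblemTypeSketch fromValue(String raw) {
        if (raw == null) {
            return null;
        }
        for (ProblemTypeSketch type : values()) {
            if (raw.equals(type.value)) {
                return type;
            }
        }
        return UNKNOWN_TO_SDK_VERSION;
    }
}

final class ProblemTypeDemo {

    /** Falls back to the raw string when the enum value is not recognized. */
    static String describe(String rawFromService) {
        ProblemTypeSketch type = ProblemTypeSketch.fromValue(rawFromService);
        if (type == null) {
            return "problem type not set";
        }
        if (type == ProblemTypeSketch.UNKNOWN_TO_SDK_VERSION) {
            return "unrecognized problem type: " + rawFromService;
        }
        return "problem type: " + type;
    }
}
```

With the real class, the same fallback would compare problemType() against ProblemType.UNKNOWN_TO_SDK_VERSION and read problemTypeAsString() for the raw value.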
-
problemTypeAsString
public final String problemTypeAsString()
The type of supervised learning problem available for the model candidates of the AutoML job V2. For more information, see SageMaker Autopilot problem types.
You must either specify the type of supervised learning problem in ProblemType and provide the AutoMLJobObjective metric, or none at all.
If the service returns an enum value that is not available in the current SDK version, problemType will return ProblemType.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from problemTypeAsString().
- Returns:
- The type of supervised learning problem available for the model candidates of the AutoML job V2. For more information, see SageMaker Autopilot problem types. You must either specify the type of supervised learning problem in ProblemType and provide the AutoMLJobObjective metric, or none at all.
- See Also:
ProblemType
-
targetAttributeName
public final String targetAttributeName()
The name of the target variable in supervised learning, usually represented by 'y'.
- Returns:
- The name of the target variable in supervised learning, usually represented by 'y'.
-
sampleWeightAttributeName
public final String sampleWeightAttributeName()
If specified, this column name indicates which column of the dataset should be treated as sample weights for use by the objective metric during the training, evaluation, and the selection of the best model. This column is not considered as a predictive feature. For more information on Autopilot metrics, see Metrics and validation.
Sample weights should be numeric, non-negative, with larger values indicating which rows are more important than others. Data points that have invalid or no weight value are excluded.
Support for sample weights is available in Ensembling mode only.
- Returns:
- If specified, this column name indicates which column of the dataset should be treated as sample weights for use by the objective metric during the training, evaluation, and the selection of the best model. This column is not considered as a predictive feature. For more information on Autopilot metrics, see Metrics and validation.
Sample weights should be numeric, non-negative, with larger values indicating which rows are more important than others. Data points that have invalid or no weight value are excluded.
Support for sample weights is available in Ensembling mode only.
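The rule that rows with an invalid or missing weight are excluded can be previewed on local data before a job is started. This is a sketch under stated assumptions: SampleWeightFilter is a hypothetical helper, rows are modeled as string arrays, and treating NaN and infinite values as invalid is an assumption that the source does not spell out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of the AWS SDK) that previews which rows carry a usable weight. */
final class SampleWeightFilter {

    /** True when the raw cell parses as a finite, non-negative number. */
    static boolean isValidWeight(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return false; // no weight value
        }
        try {
            double weight = Double.parseDouble(raw.trim());
            // Assumption: NaN and infinite values count as invalid weights.
            return !Double.isNaN(weight) && !Double.isInfinite(weight) && weight >= 0.0;
        } catch (NumberFormatException e) {
            return false; // not numeric
        }
    }

    /** Keeps only the rows whose weight column holds a valid weight. */
    static List<String[]> keepRowsWithValidWeight(List<String[]> rows, int weightColumnIndex) {
        List<String[]> kept = new ArrayList<>();
        for (String[] row : rows) {
            if (weightColumnIndex < row.length && isValidWeight(row[weightColumnIndex])) {
                kept.add(row);
            }
        }
        return kept;
    }
}
```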
-
toBuilder
public TabularJobConfig.Builder toBuilder()
- Specified by:
toBuilder in interface ToCopyableBuilder<TabularJobConfig.Builder,TabularJobConfig>
-
builder
public static TabularJobConfig.Builder builder()
-
serializableBuilderClass
public static Class<? extends TabularJobConfig.Builder> serializableBuilderClass()
-
equalsBySdkFields
public final boolean equalsBySdkFields(Object obj)
- Specified by:
equalsBySdkFields in interface SdkPojo
-
toString
public final String toString()
Returns a string representation of this object. This is useful for testing and debugging. Sensitive data will be redacted from this string using a placeholder value.
-
-