public class CreateDataSourceFromRedshiftRequest extends AmazonWebServiceRequest implements Serializable, Cloneable
Container for the parameters to the CreateDataSourceFromRedshift operation.
Creates a DataSource from Amazon Redshift. A DataSource references data that can be used to
perform CreateMLModel, CreateEvaluation, or
CreateBatchPrediction operations.
CreateDataSourceFromRedshift is an asynchronous
operation. In response to CreateDataSourceFromRedshift,
Amazon Machine Learning (Amazon ML) immediately returns and
sets the DataSource status to PENDING.
After the DataSource is created and ready for
use, Amazon ML sets the Status parameter to
COMPLETED.
A DataSource in COMPLETED
or PENDING status can only be used to perform
CreateMLModel, CreateEvaluation, or CreateBatchPrediction operations.
If Amazon ML cannot accept the input source, it sets the
Status parameter to FAILED and includes an
error message in the Message attribute of the
GetDataSource operation response.
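Because the operation is asynchronous, a caller typically has to wait for the status to leave PENDING. The following is a minimal sketch of that wait loop, not SDK code: the class `DataSourceStatusPoller`, the method `awaitTerminalStatus`, and the `statusSource` supplier are hypothetical names, and the supplier stands in for whatever call reads the Status field from a GetDataSource response.

```java
import java.util.function.Supplier;

/** Hypothetical helper; statusSource stands in for a GetDataSource status lookup. */
final class DataSourceStatusPoller {

    /**
     * Polls until the status is COMPLETED or FAILED, or until maxAttempts is reached.
     * Returns the last status seen, which may still be non-terminal (for example PENDING).
     */
    static String awaitTerminalStatus(Supplier<String> statusSource, int maxAttempts, long sleepMillis) {
        String status = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            status = statusSource.get();
            if ("COMPLETED".equals(status) || "FAILED".equals(status)) {
                return status;
            }
            try {
                // Wait before asking again; the interval is the caller's choice.
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop polling early.
                Thread.currentThread().interrupt();
                return status;
            }
        }
        return status;
    }
}
```

When the returned status is FAILED, the Message attribute of the GetDataSource response is the place to look for the error text.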
The observations should exist in the database hosted on an Amazon
Redshift cluster and should be specified by a
SelectSqlQuery.
Amazon ML executes the Unload command in Amazon Redshift to transfer
the result set of SelectSqlQuery to S3StagingLocation.
After the DataSource is created, it's ready for use in
evaluations and batch predictions. If you plan to use the
DataSource to train an MLModel, the
DataSource requires another item, a recipe. A recipe
describes the observation variables that participate in training an
MLModel. A recipe describes how each input variable will
be used in training. Will the variable be included or excluded from
training? Will the variable be manipulated, for example, combined with
another variable or split apart into word combinations? The recipe
provides answers to these questions. For more information, see the
Amazon Machine Learning Developer Guide.
| Constructor and Description |
|---|
| CreateDataSourceFromRedshiftRequest() |
| Modifier and Type | Method and Description |
|---|---|
| CreateDataSourceFromRedshiftRequest | clone() - Creates a shallow clone of this request. |
| boolean | equals(Object obj) |
| Boolean | getComputeStatistics() - The compute statistics for a DataSource. |
| String | getDataSourceId() - A user-supplied ID that uniquely identifies the DataSource. |
| String | getDataSourceName() - A user-supplied name or description of the DataSource. |
| RedshiftDataSpec | getDataSpec() - The data specification of an Amazon Redshift DataSource. |
| String | getRoleARN() - A fully specified role Amazon Resource Name (ARN). |
| int | hashCode() |
| Boolean | isComputeStatistics() - The compute statistics for a DataSource. |
| void | setComputeStatistics(Boolean computeStatistics) - The compute statistics for a DataSource. |
| void | setDataSourceId(String dataSourceId) - A user-supplied ID that uniquely identifies the DataSource. |
| void | setDataSourceName(String dataSourceName) - A user-supplied name or description of the DataSource. |
| void | setDataSpec(RedshiftDataSpec dataSpec) - The data specification of an Amazon Redshift DataSource. |
| void | setRoleARN(String roleARN) - A fully specified role Amazon Resource Name (ARN). |
| String | toString() - Returns a string representation of this object; useful for testing and debugging. |
| CreateDataSourceFromRedshiftRequest | withComputeStatistics(Boolean computeStatistics) - The compute statistics for a DataSource. |
| CreateDataSourceFromRedshiftRequest | withDataSourceId(String dataSourceId) - A user-supplied ID that uniquely identifies the DataSource. |
| CreateDataSourceFromRedshiftRequest | withDataSourceName(String dataSourceName) - A user-supplied name or description of the DataSource. |
| CreateDataSourceFromRedshiftRequest | withDataSpec(RedshiftDataSpec dataSpec) - The data specification of an Amazon Redshift DataSource. |
| CreateDataSourceFromRedshiftRequest | withRoleARN(String roleARN) - A fully specified role Amazon Resource Name (ARN). |
Methods inherited from class AmazonWebServiceRequest: copyBaseTo, getCustomQueryParameters, getCustomRequestHeaders, getGeneralProgressListener, getReadLimit, getRequestClientOptions, getRequestCredentials, getRequestMetricCollector, putCustomQueryParameter, putCustomRequestHeader, setGeneralProgressListener, setRequestCredentials, setRequestMetricCollector, withGeneralProgressListener, withRequestMetricCollector
public String getDataSourceId()
A user-supplied ID that uniquely identifies the DataSource.
Constraints:
Length: 1 - 64
Pattern: [a-zA-Z0-9_.-]+
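Returns: A user-supplied ID that uniquely identifies the DataSource.
public void setDataSourceId(String dataSourceId)
A user-supplied ID that uniquely identifies the DataSource.
Constraints:
Length: 1 - 64
Pattern: [a-zA-Z0-9_.-]+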
Parameters: dataSourceId - A user-supplied ID that uniquely identifies the DataSource.
public CreateDataSourceFromRedshiftRequest withDataSourceId(String dataSourceId)
A user-supplied ID that uniquely identifies the DataSource.
Returns a reference to this object so that method calls can be chained together.
Constraints:
Length: 1 - 64
Pattern: [a-zA-Z0-9_.-]+
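The ID constraints above can be checked on the client before a request is sent. This is a minimal sketch under the assumption that the documented length range and pattern are the only rules; `DataSourceIdCheck` and `isValid` are hypothetical names and are not part of the SDK.

```java
import java.util.regex.Pattern;

/** Hypothetical client-side check for the documented dataSourceId constraints. */
final class DataSourceIdCheck {
    // Pattern and length bounds copied from the Constraints section.
    private static final Pattern ID_PATTERN = Pattern.compile("[a-zA-Z0-9_.-]+");
    private static final int MIN_LENGTH = 1;
    private static final int MAX_LENGTH = 64;

    /** Returns true when the ID is non-null, 1 to 64 characters long, and matches the pattern. */
    static boolean isValid(String id) {
        return id != null
                && id.length() >= MIN_LENGTH
                && id.length() <= MAX_LENGTH
                && ID_PATTERN.matcher(id).matches();
    }
}
```

For example, `DataSourceIdCheck.isValid("redshift-ds_01.v2")` is accepted, while an ID containing a space is rejected.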
Parameters: dataSourceId - A user-supplied ID that uniquely identifies the DataSource.
public String getDataSourceName()
A user-supplied name or description of the DataSource.
Constraints:
Length: 0 - 1024
Pattern: .*\S.*|^$
Returns: A user-supplied name or description of the DataSource.
public void setDataSourceName(String dataSourceName)
A user-supplied name or description of the DataSource.
Constraints:
Length: 0 - 1024
Pattern: .*\S.*|^$
Parameters: dataSourceName - A user-supplied name or description of the DataSource.
public CreateDataSourceFromRedshiftRequest withDataSourceName(String dataSourceName)
A user-supplied name or description of the DataSource.
Returns a reference to this object so that method calls can be chained together.
Constraints:
Length: 0 - 1024
Pattern: .*\S.*|^$
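The name pattern is easy to misread: it allows an empty name, but a non-empty name must contain at least one non-whitespace character. The sketch below illustrates that reading with `java.util.regex`; `DataSourceNameCheck` and `isValid` are hypothetical names, and the service's own validation may differ in details such as line terminators.

```java
import java.util.regex.Pattern;

/** Hypothetical client-side check for the documented dataSourceName constraints. */
final class DataSourceNameCheck {
    // Either some text with at least one non-whitespace character, or the empty string.
    private static final Pattern NAME_PATTERN = Pattern.compile(".*\\S.*|^$");
    private static final int MAX_LENGTH = 1024;

    /** Returns true when the name is non-null, at most 1024 characters, and matches the pattern. */
    static boolean isValid(String name) {
        return name != null
                && name.length() <= MAX_LENGTH
                && NAME_PATTERN.matcher(name).matches();
    }
}
```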
Parameters: dataSourceName - A user-supplied name or description of the DataSource.
public RedshiftDataSpec getDataSpec()
The data specification of an Amazon Redshift DataSource:
DatabaseInformation -
  DatabaseName - Name of the Amazon Redshift database.
  ClusterIdentifier - Unique ID for the Amazon Redshift cluster.
DatabaseCredentials - AWS Identity and Access Management (IAM) credentials that are used to connect to the Amazon Redshift database.
SelectSqlQuery - Query that is used to retrieve the observation data for the DataSource.
S3StagingLocation - Amazon Simple Storage Service (Amazon S3) location for staging Amazon Redshift data. The data retrieved from Amazon Redshift using SelectSqlQuery is stored in this location.
DataSchemaUri - Amazon S3 location of the DataSchema.
DataSchema - A JSON string representing the schema. This is not required if DataSchemaUri is specified.
DataRearrangement - A JSON string representing the splitting requirement of a DataSource.
Sample - "{\"randomSeed\":\"some-random-seed\", \"splitting\":{\"percentBegin\":10,\"percentEnd\":60}}"
Returns: The data specification of an Amazon Redshift DataSource, with the fields described above.
public void setDataSpec(RedshiftDataSpec dataSpec)
The data specification of an Amazon Redshift DataSource:
DatabaseInformation -
  DatabaseName - Name of the Amazon Redshift database.
  ClusterIdentifier - Unique ID for the Amazon Redshift cluster.
DatabaseCredentials - AWS Identity and Access Management (IAM) credentials that are used to connect to the Amazon Redshift database.
SelectSqlQuery - Query that is used to retrieve the observation data for the DataSource.
S3StagingLocation - Amazon Simple Storage Service (Amazon S3) location for staging Amazon Redshift data. The data retrieved from Amazon Redshift using SelectSqlQuery is stored in this location.
DataSchemaUri - Amazon S3 location of the DataSchema.
DataSchema - A JSON string representing the schema. This is not required if DataSchemaUri is specified.
DataRearrangement - A JSON string representing the splitting requirement of a DataSource.
Sample - "{\"randomSeed\":\"some-random-seed\", \"splitting\":{\"percentBegin\":10,\"percentEnd\":60}}"
Parameters: dataSpec - The data specification of an Amazon Redshift DataSource, with the fields described above.
public CreateDataSourceFromRedshiftRequest withDataSpec(RedshiftDataSpec dataSpec)
The data specification of an Amazon Redshift DataSource:
DatabaseInformation -
  DatabaseName - Name of the Amazon Redshift database.
  ClusterIdentifier - Unique ID for the Amazon Redshift cluster.
DatabaseCredentials - AWS Identity and Access Management (IAM) credentials that are used to connect to the Amazon Redshift database.
SelectSqlQuery - Query that is used to retrieve the observation data for the DataSource.
S3StagingLocation - Amazon Simple Storage Service (Amazon S3) location for staging Amazon Redshift data. The data retrieved from Amazon Redshift using SelectSqlQuery is stored in this location.
DataSchemaUri - Amazon S3 location of the DataSchema.
DataSchema - A JSON string representing the schema. This is not required if DataSchemaUri is specified.
DataRearrangement - A JSON string representing the splitting requirement of a DataSource.
Sample - "{\"randomSeed\":\"some-random-seed\", \"splitting\":{\"percentBegin\":10,\"percentEnd\":60}}"
Returns a reference to this object so that method calls can be chained together.
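The DataRearrangement sample in the data specification is a JSON document carried inside a string, which is why its quotes appear escaped. The sketch below builds the same shape from plain values. `DataRearrangementSketch` and `splitting` are hypothetical names; the range check is an assumption about sensible percentages rather than a documented rule, and the seed is assumed to contain no quotes or backslashes because no JSON escaping is applied.

```java
import java.util.Locale;

/** Hypothetical helper that formats a DataRearrangement JSON string like the documented sample. */
final class DataRearrangementSketch {

    /** Builds the JSON text for a split covering percentBegin to percentEnd of the data. */
    static String splitting(String randomSeed, int percentBegin, int percentEnd) {
        // Assumed sanity check, not taken from the reference text.
        if (percentBegin < 0 || percentEnd > 100 || percentBegin >= percentEnd) {
            throw new IllegalArgumentException("expected 0 <= percentBegin < percentEnd <= 100");
        }
        // Locale.ROOT keeps the digits ASCII regardless of the default locale.
        return String.format(Locale.ROOT,
                "{\"randomSeed\":\"%s\",\"splitting\":{\"percentBegin\":%d,\"percentEnd\":%d}}",
                randomSeed, percentBegin, percentEnd);
    }
}
```

Calling `DataRearrangementSketch.splitting("some-random-seed", 10, 60)` yields the sample's values in compact form, without the whitespace after the first comma.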
Parameters: dataSpec - The data specification of an Amazon Redshift DataSource, with the fields described above.
public String getRoleARN()
A fully specified role Amazon Resource Name (ARN). Amazon ML assumes the role on behalf of the user to create the following:
A security group to allow Amazon ML to execute the SelectSqlQuery query on an Amazon Redshift cluster.
An Amazon S3 bucket policy to grant Amazon ML read/write permissions on the S3StagingLocation.
Constraints:
Length: 1 - 100
Returns: A fully specified role Amazon Resource Name (ARN), used as described above.
public void setRoleARN(String roleARN)
A fully specified role Amazon Resource Name (ARN). Amazon ML assumes the role on behalf of the user to create the following:
A security group to allow Amazon ML to execute the SelectSqlQuery query on an Amazon Redshift cluster.
An Amazon S3 bucket policy to grant Amazon ML read/write permissions on the S3StagingLocation.
Constraints:
Length: 1 - 100
Parameters: roleARN - A fully specified role Amazon Resource Name (ARN), used as described above.
public CreateDataSourceFromRedshiftRequest withRoleARN(String roleARN)
A fully specified role Amazon Resource Name (ARN). Amazon ML assumes the role on behalf of the user to create the following:
A security group to allow Amazon ML to execute the SelectSqlQuery query on an Amazon Redshift cluster.
An Amazon S3 bucket policy to grant Amazon ML read/write permissions on the S3StagingLocation.
Returns a reference to this object so that method calls can be chained together.
Constraints:
Length: 1 - 100
Parameters: roleARN - A fully specified role Amazon Resource Name (ARN), used as described above.
public Boolean isComputeStatistics()
The compute statistics for a DataSource. The statistics are generated from the observation data referenced by a DataSource. Amazon ML uses the statistics internally during MLModel training. This parameter must be set to true if the DataSource needs to be used for MLModel training.
Returns: The compute statistics for a DataSource, as described above.
public void setComputeStatistics(Boolean computeStatistics)
The compute statistics for a DataSource. The statistics are generated from the observation data referenced by a DataSource. Amazon ML uses the statistics internally during MLModel training. This parameter must be set to true if the DataSource needs to be used for MLModel training.
Parameters: computeStatistics - The compute statistics for a DataSource, as described above.
public CreateDataSourceFromRedshiftRequest withComputeStatistics(Boolean computeStatistics)
The compute statistics for a DataSource. The statistics are generated from the observation data referenced by a DataSource. Amazon ML uses the statistics internally during MLModel training. This parameter must be set to true if the DataSource needs to be used for MLModel training.
Returns a reference to this object so that method calls can be chained together.
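The difference between the set and with variants is only the return value: set returns void, while with returns the same object so that calls can be chained. The sketch below shows that pattern on a hypothetical stand-in class, `RequestSketch`, which mirrors two of the properties documented here and is not the SDK class itself.

```java
/** Hypothetical stand-in, not the SDK class: mirrors the set/with pairs for two properties. */
final class RequestSketch {
    private String dataSourceId;
    private Boolean computeStatistics;

    String getDataSourceId() { return dataSourceId; }
    Boolean getComputeStatistics() { return computeStatistics; }

    // Plain setters return nothing, so they cannot be chained.
    void setDataSourceId(String dataSourceId) { this.dataSourceId = dataSourceId; }
    void setComputeStatistics(Boolean computeStatistics) { this.computeStatistics = computeStatistics; }

    // The with variants delegate to the setter and return this same instance.
    RequestSketch withDataSourceId(String dataSourceId) {
        setDataSourceId(dataSourceId);
        return this;
    }

    RequestSketch withComputeStatistics(Boolean computeStatistics) {
        setComputeStatistics(computeStatistics);
        return this;
    }
}
```

With this shape, `new RequestSketch().withDataSourceId("ds-example").withComputeStatistics(true)` configures both properties in one expression and evaluates to the object that was created.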
Parameters: computeStatistics - The compute statistics for a DataSource, as described above.
public Boolean getComputeStatistics()
The compute statistics for a DataSource. The statistics are generated from the observation data referenced by a DataSource. Amazon ML uses the statistics internally during MLModel training. This parameter must be set to true if the DataSource needs to be used for MLModel training.
Returns: The compute statistics for a DataSource, as described above.
public String toString()
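Returns a string representation of this object; useful for testing and debugging.
Overrides: toString in class Object
See Also: Object.toString()
public CreateDataSourceFromRedshiftRequest clone()
Creates a shallow clone of this request.
Overrides: clone in class AmazonWebServiceRequest
See Also: Object.clone()
Copyright © 2015. All rights reserved.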