@Generated(value="com.amazonaws:aws-java-sdk-code-generator") public class S3DataSource extends Object implements Serializable, Cloneable, StructuredPojo
Describes the S3 data source.
| Constructor and Description | 
|---|
| S3DataSource() | 
| Modifier and Type | Method and Description | 
|---|---|
| S3DataSource | clone() |
| boolean | equals(Object obj) |
| List<String> | getAttributeNames() - A list of one or more attribute names to use that are found in a specified augmented manifest file. |
| String | getS3DataDistributionType() - If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched for model training, specify FullyReplicated. |
| String | getS3DataType() - If you choose S3Prefix, S3Uri identifies a key name prefix. |
| String | getS3Uri() - Depending on the value specified for the S3DataType, identifies either a key name prefix or a manifest. |
| int | hashCode() |
| void | marshall(ProtocolMarshaller protocolMarshaller) - Marshalls this structured data using the given ProtocolMarshaller. |
| void | setAttributeNames(Collection<String> attributeNames) - A list of one or more attribute names to use that are found in a specified augmented manifest file. |
| void | setS3DataDistributionType(String s3DataDistributionType) - If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched for model training, specify FullyReplicated. |
| void | setS3DataType(String s3DataType) - If you choose S3Prefix, S3Uri identifies a key name prefix. |
| void | setS3Uri(String s3Uri) - Depending on the value specified for the S3DataType, identifies either a key name prefix or a manifest. |
| String | toString() - Returns a string representation of this object. |
| S3DataSource | withAttributeNames(Collection<String> attributeNames) - A list of one or more attribute names to use that are found in a specified augmented manifest file. |
| S3DataSource | withAttributeNames(String... attributeNames) - A list of one or more attribute names to use that are found in a specified augmented manifest file. |
| S3DataSource | withS3DataDistributionType(S3DataDistribution s3DataDistributionType) - If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched for model training, specify FullyReplicated. |
| S3DataSource | withS3DataDistributionType(String s3DataDistributionType) - If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched for model training, specify FullyReplicated. |
| S3DataSource | withS3DataType(S3DataType s3DataType) - If you choose S3Prefix, S3Uri identifies a key name prefix. |
| S3DataSource | withS3DataType(String s3DataType) - If you choose S3Prefix, S3Uri identifies a key name prefix. |
| S3DataSource | withS3Uri(String s3Uri) - Depending on the value specified for the S3DataType, identifies either a key name prefix or a manifest. |
public void setS3DataType(String s3DataType)
 If you choose S3Prefix, S3Uri identifies a key name prefix. Amazon SageMaker uses all
 objects that match the specified key name prefix for model training.
 
 If you choose ManifestFile, S3Uri identifies an object that is a manifest file
 containing a list of object keys that you want Amazon SageMaker to use for model training.
 
 If you choose AugmentedManifestFile, S3Uri identifies an object that is an augmented manifest file
 in JSON lines format. This file contains the data you want to use for model training.
 AugmentedManifestFile can only be used if the Channel's input mode is Pipe.
 
s3DataType - If you choose S3Prefix, S3Uri identifies a key name prefix. Amazon SageMaker
        uses all objects that match the specified key name prefix for model training. 
        
        If you choose ManifestFile, S3Uri identifies an object that is a manifest file
        containing a list of object keys that you want Amazon SageMaker to use for model training.
        
        If you choose AugmentedManifestFile, S3Uri identifies an object that is an augmented manifest
        file in JSON lines format. This file contains the data you want to use for model training.
        AugmentedManifestFile can only be used if the Channel's input mode is Pipe.
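
The constraint described above can be checked before a request is built. The following is a minimal sketch, not part of the AWS SDK: the class `S3DataTypeCheck` and the method `isCompatible` are hypothetical names, and the sketch assumes the only input-mode restriction is the one stated above for AugmentedManifestFile.

```java
import java.util.Objects;

/** Hypothetical pre-flight check; this class is NOT part of the AWS SDK. */
final class S3DataTypeCheck {

    /**
     * Returns true when the given S3DataType value can be used with the given
     * channel input mode, based only on the rule stated in the text above.
     */
    static boolean isCompatible(String s3DataType, String inputMode) {
        Objects.requireNonNull(s3DataType, "s3DataType");
        Objects.requireNonNull(inputMode, "inputMode");
        switch (s3DataType) {
            case "S3Prefix":
            case "ManifestFile":
                // Assumption: the text states no input-mode restriction for these values.
                return true;
            case "AugmentedManifestFile":
                // Stated rule: usable only if the Channel's input mode is Pipe.
                return "Pipe".equals(inputMode);
            default:
                throw new IllegalArgumentException("Unknown S3DataType: " + s3DataType);
        }
    }
}
```

For example, `S3DataTypeCheck.isCompatible("AugmentedManifestFile", "File")` returns false under this sketch, while the same data type with "Pipe" returns true.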
See Also: S3DataType
public String getS3DataType()
 If you choose S3Prefix, S3Uri identifies a key name prefix. Amazon SageMaker uses all
 objects that match the specified key name prefix for model training.
 
 If you choose ManifestFile, S3Uri identifies an object that is a manifest file
 containing a list of object keys that you want Amazon SageMaker to use for model training.
 
 If you choose AugmentedManifestFile, S3Uri identifies an object that is an augmented manifest file
 in JSON lines format. This file contains the data you want to use for model training.
 AugmentedManifestFile can only be used if the Channel's input mode is Pipe.
 
Returns: If you choose S3Prefix, S3Uri identifies a key name prefix. Amazon SageMaker
         uses all objects that match the specified key name prefix for model training. 
         
         If you choose ManifestFile, S3Uri identifies an object that is a manifest file
         containing a list of object keys that you want Amazon SageMaker to use for model training.
         
         If you choose AugmentedManifestFile, S3Uri identifies an object that is an augmented
         manifest file in JSON lines format. This file contains the data you want to use for model training.
         AugmentedManifestFile can only be used if the Channel's input mode is Pipe.
See Also: S3DataType
public S3DataSource withS3DataType(String s3DataType)
 If you choose S3Prefix, S3Uri identifies a key name prefix. Amazon SageMaker uses all
 objects that match the specified key name prefix for model training.
 
 If you choose ManifestFile, S3Uri identifies an object that is a manifest file
 containing a list of object keys that you want Amazon SageMaker to use for model training.
 
 If you choose AugmentedManifestFile, S3Uri identifies an object that is an augmented manifest file
 in JSON lines format. This file contains the data you want to use for model training.
 AugmentedManifestFile can only be used if the Channel's input mode is Pipe.
 
s3DataType - If you choose S3Prefix, S3Uri identifies a key name prefix. Amazon SageMaker
        uses all objects that match the specified key name prefix for model training. 
        
        If you choose ManifestFile, S3Uri identifies an object that is a manifest file
        containing a list of object keys that you want Amazon SageMaker to use for model training.
        
        If you choose AugmentedManifestFile, S3Uri identifies an object that is an augmented manifest
        file in JSON lines format. This file contains the data you want to use for model training.
        AugmentedManifestFile can only be used if the Channel's input mode is Pipe.
See Also: S3DataType
public S3DataSource withS3DataType(S3DataType s3DataType)
 If you choose S3Prefix, S3Uri identifies a key name prefix. Amazon SageMaker uses all
 objects that match the specified key name prefix for model training.
 
 If you choose ManifestFile, S3Uri identifies an object that is a manifest file
 containing a list of object keys that you want Amazon SageMaker to use for model training.
 
 If you choose AugmentedManifestFile, S3Uri identifies an object that is an augmented manifest file
 in JSON lines format. This file contains the data you want to use for model training.
 AugmentedManifestFile can only be used if the Channel's input mode is Pipe.
 
s3DataType - If you choose S3Prefix, S3Uri identifies a key name prefix. Amazon SageMaker
        uses all objects that match the specified key name prefix for model training. 
        
        If you choose ManifestFile, S3Uri identifies an object that is a manifest file
        containing a list of object keys that you want Amazon SageMaker to use for model training.
        
        If you choose AugmentedManifestFile, S3Uri identifies an object that is an augmented manifest
        file in JSON lines format. This file contains the data you want to use for model training.
        AugmentedManifestFile can only be used if the Channel's input mode is Pipe.
See Also: S3DataType
public void setS3Uri(String s3Uri)
 Depending on the value specified for the S3DataType, identifies either a key name prefix or a
 manifest. For example:
 
 A key name prefix might look like this: s3://bucketname/exampleprefix
 
 A manifest might look like this: s3://bucketname/example.manifest
 
 A manifest is an S3 object containing a JSON array. The first element is a prefix, followed by one or more
 suffixes. SageMaker appends each suffix to the prefix to produce the full set of S3Uri values. The prefix
 must be a valid, non-empty S3Uri, so a single manifest cannot reference objects in different S3 buckets.

The following code example shows a valid manifest format:
 [ {"prefix": "s3://customer_bucket/some/prefix/"},
   "relative/path/to/custdata-1",
   "relative/path/custdata-2",
   ...
   "relative/path/custdata-N"
 ]

 This JSON is equivalent to the following S3Uri list:

 s3://customer_bucket/some/prefix/relative/path/to/custdata-1
 s3://customer_bucket/some/prefix/relative/path/custdata-2
 ...
 s3://customer_bucket/some/prefix/relative/path/custdata-N
 
 The complete set of S3Uri in this manifest is the input data for the channel for this data source.
 The object that each S3Uri points to must be readable by the IAM role that Amazon SageMaker uses to
 perform tasks on your behalf.
 
s3Uri - Depending on the value specified for the S3DataType, identifies either a key name prefix or a
        manifest. For example: 
        
        A key name prefix might look like this: s3://bucketname/exampleprefix
        
        A manifest might look like this: s3://bucketname/example.manifest
        
        A manifest is an S3 object containing a JSON array. The first element is a prefix, followed by one or
        more suffixes. SageMaker appends each suffix to the prefix to produce the full set of S3Uri values.
        The prefix must be a valid, non-empty S3Uri, so a single manifest cannot reference objects in
        different S3 buckets.

The following code example shows a valid manifest format:
        [ {"prefix": "s3://customer_bucket/some/prefix/"},
          "relative/path/to/custdata-1",
          "relative/path/custdata-2",
          ...
          "relative/path/custdata-N"
        ]

        This JSON is equivalent to the following S3Uri list:

        s3://customer_bucket/some/prefix/relative/path/to/custdata-1
        s3://customer_bucket/some/prefix/relative/path/custdata-2
        ...
        s3://customer_bucket/some/prefix/relative/path/custdata-N
        
        The complete set of S3Uri in this manifest is the input data for the channel for this data
        source. The object that each S3Uri points to must be readable by the IAM role that Amazon
        SageMaker uses to perform tasks on your behalf.
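
The prefix-plus-suffix expansion described above can be illustrated with a short sketch. This is not SageMaker's implementation: the class `ManifestExpansionSketch` and the method `expand` are hypothetical, and the sketch assumes the manifest has already been parsed into a prefix string and a list of suffix strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of manifest expansion; NOT part of the AWS SDK. */
final class ManifestExpansionSketch {

    /** Appends each suffix to the prefix, as the manifest description states. */
    static List<String> expand(String prefix, List<String> suffixes) {
        if (prefix == null || prefix.isEmpty()) {
            // The text requires the prefix to be a valid, non-empty S3Uri.
            throw new IllegalArgumentException("prefix must be a non-empty S3Uri");
        }
        List<String> uris = new ArrayList<>(suffixes.size());
        for (String suffix : suffixes) {
            uris.add(prefix + suffix);
        }
        return uris;
    }

    public static void main(String[] args) {
        List<String> uris = expand(
                "s3://customer_bucket/some/prefix/",
                Arrays.asList("relative/path/to/custdata-1", "relative/path/custdata-2"));
        // Prints one expanded S3Uri per line.
        uris.forEach(System.out::println);
    }
}
```

Because every expanded value starts with the same prefix, all objects named by one manifest necessarily live in the same bucket.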
        
public String getS3Uri()
 Depending on the value specified for the S3DataType, identifies either a key name prefix or a
 manifest. For example:
 
 A key name prefix might look like this: s3://bucketname/exampleprefix
 
 A manifest might look like this: s3://bucketname/example.manifest
 
 A manifest is an S3 object containing a JSON array. The first element is a prefix, followed by one or more
 suffixes. SageMaker appends each suffix to the prefix to produce the full set of S3Uri values. The prefix
 must be a valid, non-empty S3Uri, so a single manifest cannot reference objects in different S3 buckets.

The following code example shows a valid manifest format:
 [ {"prefix": "s3://customer_bucket/some/prefix/"},
   "relative/path/to/custdata-1",
   "relative/path/custdata-2",
   ...
   "relative/path/custdata-N"
 ]

 This JSON is equivalent to the following S3Uri list:

 s3://customer_bucket/some/prefix/relative/path/to/custdata-1
 s3://customer_bucket/some/prefix/relative/path/custdata-2
 ...
 s3://customer_bucket/some/prefix/relative/path/custdata-N
 
 The complete set of S3Uri in this manifest is the input data for the channel for this data source.
 The object that each S3Uri points to must be readable by the IAM role that Amazon SageMaker uses to
 perform tasks on your behalf.
 
Returns: Depending on the value specified for the S3DataType, identifies either a key name prefix or
         a manifest. For example: 
         
         A key name prefix might look like this: s3://bucketname/exampleprefix
         
         A manifest might look like this: s3://bucketname/example.manifest
         
         A manifest is an S3 object containing a JSON array. The first element is a prefix, followed by one or
         more suffixes. SageMaker appends each suffix to the prefix to produce the full set of S3Uri values.
         The prefix must be a valid, non-empty S3Uri, so a single manifest cannot reference objects in
         different S3 buckets.

The following code example shows a valid manifest format:
         [ {"prefix": "s3://customer_bucket/some/prefix/"},
           "relative/path/to/custdata-1",
           "relative/path/custdata-2",
           ...
           "relative/path/custdata-N"
         ]

         This JSON is equivalent to the following S3Uri list:

         s3://customer_bucket/some/prefix/relative/path/to/custdata-1
         s3://customer_bucket/some/prefix/relative/path/custdata-2
         ...
         s3://customer_bucket/some/prefix/relative/path/custdata-N
         
         The complete set of S3Uri in this manifest is the input data for the channel for this data
         source. The object that each S3Uri points to must be readable by the IAM role that Amazon
         SageMaker uses to perform tasks on your behalf.
         
public S3DataSource withS3Uri(String s3Uri)
 Depending on the value specified for the S3DataType, identifies either a key name prefix or a
 manifest. For example:
 
 A key name prefix might look like this: s3://bucketname/exampleprefix
 
 A manifest might look like this: s3://bucketname/example.manifest
 
 A manifest is an S3 object containing a JSON array. The first element is a prefix, followed by one or more
 suffixes. SageMaker appends each suffix to the prefix to produce the full set of S3Uri values. The prefix
 must be a valid, non-empty S3Uri, so a single manifest cannot reference objects in different S3 buckets.

The following code example shows a valid manifest format:
 [ {"prefix": "s3://customer_bucket/some/prefix/"},
   "relative/path/to/custdata-1",
   "relative/path/custdata-2",
   ...
   "relative/path/custdata-N"
 ]

 This JSON is equivalent to the following S3Uri list:

 s3://customer_bucket/some/prefix/relative/path/to/custdata-1
 s3://customer_bucket/some/prefix/relative/path/custdata-2
 ...
 s3://customer_bucket/some/prefix/relative/path/custdata-N
 
 The complete set of S3Uri in this manifest is the input data for the channel for this data source.
 The object that each S3Uri points to must be readable by the IAM role that Amazon SageMaker uses to
 perform tasks on your behalf.
 
s3Uri - Depending on the value specified for the S3DataType, identifies either a key name prefix or a
        manifest. For example: 
        
        A key name prefix might look like this: s3://bucketname/exampleprefix
        
        A manifest might look like this: s3://bucketname/example.manifest
        
        A manifest is an S3 object containing a JSON array. The first element is a prefix, followed by one or
        more suffixes. SageMaker appends each suffix to the prefix to produce the full set of S3Uri values.
        The prefix must be a valid, non-empty S3Uri, so a single manifest cannot reference objects in
        different S3 buckets.

The following code example shows a valid manifest format:
        [ {"prefix": "s3://customer_bucket/some/prefix/"},
          "relative/path/to/custdata-1",
          "relative/path/custdata-2",
          ...
          "relative/path/custdata-N"
        ]

        This JSON is equivalent to the following S3Uri list:

        s3://customer_bucket/some/prefix/relative/path/to/custdata-1
        s3://customer_bucket/some/prefix/relative/path/custdata-2
        ...
        s3://customer_bucket/some/prefix/relative/path/custdata-N
        
        The complete set of S3Uri in this manifest is the input data for the channel for this data
        source. The object that each S3Uri points to must be readable by the IAM role that Amazon
        SageMaker uses to perform tasks on your behalf.
        
public void setS3DataDistributionType(String s3DataDistributionType)
 If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched for
 model training, specify FullyReplicated.
 
 If you want Amazon SageMaker to replicate a subset of data on each ML compute instance that is launched for model
 training, specify ShardedByS3Key. If there are n ML compute instances launched for a training
 job, each instance gets approximately 1/n of the number of S3 objects. In this case, model training on
 each machine uses only the subset of training data.
 
Don't choose more ML compute instances for training than available S3 objects. If you do, some nodes won't get any data and you will pay for nodes that aren't getting any training data. This applies in both File and Pipe modes. Keep this in mind when developing algorithms.
 In distributed training, where you use multiple ML compute EC2 instances, you might choose
 ShardedByS3Key. If the algorithm requires copying training data to the ML storage volume (when
 TrainingInputMode is set to File), this copies 1/n of the number of objects.
 
s3DataDistributionType - If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched
        for model training, specify FullyReplicated. 
        
        If you want Amazon SageMaker to replicate a subset of data on each ML compute instance that is launched
        for model training, specify ShardedByS3Key. If there are n ML compute instances
        launched for a training job, each instance gets approximately 1/n of the number of S3 objects. In
        this case, model training on each machine uses only the subset of training data.
        
Don't choose more ML compute instances for training than available S3 objects. If you do, some nodes won't get any data and you will pay for nodes that aren't getting any training data. This applies in both File and Pipe modes. Keep this in mind when developing algorithms.
        In distributed training, where you use multiple ML compute EC2 instances, you might choose
        ShardedByS3Key. If the algorithm requires copying training data to the ML storage volume
        (when TrainingInputMode is set to File), this copies 1/n of the number of
        objects.
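
The "approximately 1/n" split and the warning about idle nodes can be made concrete with a small arithmetic sketch. The exact assignment SageMaker uses for ShardedByS3Key is not specified here, so the even split below is an assumption; the class `ShardingSketch` and its methods are hypothetical and not part of the AWS SDK.

```java
/** Hypothetical arithmetic sketch for ShardedByS3Key; NOT part of the AWS SDK. */
final class ShardingSketch {

    /**
     * Returns how many S3 objects each instance would receive under an
     * assumed even split (the real assignment may differ).
     */
    static int[] objectsPerInstance(int objectCount, int instanceCount) {
        if (instanceCount <= 0) {
            throw new IllegalArgumentException("instanceCount must be positive");
        }
        if (objectCount < 0) {
            throw new IllegalArgumentException("objectCount must not be negative");
        }
        int[] counts = new int[instanceCount];
        int base = objectCount / instanceCount;
        int remainder = objectCount % instanceCount;
        for (int i = 0; i < instanceCount; i++) {
            // The first `remainder` instances take one extra object.
            counts[i] = base + (i < remainder ? 1 : 0);
        }
        return counts;
    }

    /** Counts instances that would receive no objects at all. */
    static int idleInstances(int objectCount, int instanceCount) {
        int idle = 0;
        for (int count : objectsPerInstance(objectCount, instanceCount)) {
            if (count == 0) {
                idle++;
            }
        }
        return idle;
    }
}
```

Under this assumption, 10 objects across 4 instances gives per-instance counts of 3, 3, 2, 2, while 3 objects across 5 instances leaves 2 instances with no data, which is the situation the note above warns against.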
See Also: S3DataDistribution
public String getS3DataDistributionType()
 If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched for
 model training, specify FullyReplicated.
 
 If you want Amazon SageMaker to replicate a subset of data on each ML compute instance that is launched for model
 training, specify ShardedByS3Key. If there are n ML compute instances launched for a training
 job, each instance gets approximately 1/n of the number of S3 objects. In this case, model training on
 each machine uses only the subset of training data.
 
Don't choose more ML compute instances for training than available S3 objects. If you do, some nodes won't get any data and you will pay for nodes that aren't getting any training data. This applies in both File and Pipe modes. Keep this in mind when developing algorithms.
 In distributed training, where you use multiple ML compute EC2 instances, you might choose
 ShardedByS3Key. If the algorithm requires copying training data to the ML storage volume (when
 TrainingInputMode is set to File), this copies 1/n of the number of objects.
 
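Returns: If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched
         for model training, specify FullyReplicated.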
         
         If you want Amazon SageMaker to replicate a subset of data on each ML compute instance that is launched
         for model training, specify ShardedByS3Key. If there are n ML compute instances
         launched for a training job, each instance gets approximately 1/n of the number of S3 objects. In
         this case, model training on each machine uses only the subset of training data.
         
Don't choose more ML compute instances for training than available S3 objects. If you do, some nodes won't get any data and you will pay for nodes that aren't getting any training data. This applies in both File and Pipe modes. Keep this in mind when developing algorithms.
         In distributed training, where you use multiple ML compute EC2 instances, you might choose
         ShardedByS3Key. If the algorithm requires copying training data to the ML storage volume
         (when TrainingInputMode is set to File), this copies 1/n of the number
         of objects.
See Also: S3DataDistribution
public S3DataSource withS3DataDistributionType(String s3DataDistributionType)
 If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched for
 model training, specify FullyReplicated.
 
 If you want Amazon SageMaker to replicate a subset of data on each ML compute instance that is launched for model
 training, specify ShardedByS3Key. If there are n ML compute instances launched for a training
 job, each instance gets approximately 1/n of the number of S3 objects. In this case, model training on
 each machine uses only the subset of training data.
 
Don't choose more ML compute instances for training than available S3 objects. If you do, some nodes won't get any data and you will pay for nodes that aren't getting any training data. This applies in both File and Pipe modes. Keep this in mind when developing algorithms.
 In distributed training, where you use multiple ML compute EC2 instances, you might choose
 ShardedByS3Key. If the algorithm requires copying training data to the ML storage volume (when
 TrainingInputMode is set to File), this copies 1/n of the number of objects.
 
s3DataDistributionType - If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched
        for model training, specify FullyReplicated. 
        
        If you want Amazon SageMaker to replicate a subset of data on each ML compute instance that is launched
        for model training, specify ShardedByS3Key. If there are n ML compute instances
        launched for a training job, each instance gets approximately 1/n of the number of S3 objects. In
        this case, model training on each machine uses only the subset of training data.
        
Don't choose more ML compute instances for training than available S3 objects. If you do, some nodes won't get any data and you will pay for nodes that aren't getting any training data. This applies in both File and Pipe modes. Keep this in mind when developing algorithms.
        In distributed training, where you use multiple ML compute EC2 instances, you might choose
        ShardedByS3Key. If the algorithm requires copying training data to the ML storage volume
        (when TrainingInputMode is set to File), this copies 1/n of the number of
        objects.
See Also: S3DataDistribution
public S3DataSource withS3DataDistributionType(S3DataDistribution s3DataDistributionType)
 If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched for
 model training, specify FullyReplicated.
 
 If you want Amazon SageMaker to replicate a subset of data on each ML compute instance that is launched for model
 training, specify ShardedByS3Key. If there are n ML compute instances launched for a training
 job, each instance gets approximately 1/n of the number of S3 objects. In this case, model training on
 each machine uses only the subset of training data.
 
Don't choose more ML compute instances for training than available S3 objects. If you do, some nodes won't get any data and you will pay for nodes that aren't getting any training data. This applies in both File and Pipe modes. Keep this in mind when developing algorithms.
 In distributed training, where you use multiple ML compute EC2 instances, you might choose
 ShardedByS3Key. If the algorithm requires copying training data to the ML storage volume (when
 TrainingInputMode is set to File), this copies 1/n of the number of objects.
 
s3DataDistributionType - If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched
        for model training, specify FullyReplicated. 
        
        If you want Amazon SageMaker to replicate a subset of data on each ML compute instance that is launched
        for model training, specify ShardedByS3Key. If there are n ML compute instances
        launched for a training job, each instance gets approximately 1/n of the number of S3 objects. In
        this case, model training on each machine uses only the subset of training data.
        
Don't choose more ML compute instances for training than available S3 objects. If you do, some nodes won't get any data and you will pay for nodes that aren't getting any training data. This applies in both File and Pipe modes. Keep this in mind when developing algorithms.
        In distributed training, where you use multiple ML compute EC2 instances, you might choose
        ShardedByS3Key. If the algorithm requires copying training data to the ML storage volume
        (when TrainingInputMode is set to File), this copies 1/n of the number of
        objects.
See Also: S3DataDistribution
public List<String> getAttributeNames()
A list of one or more attribute names to use that are found in a specified augmented manifest file.
public void setAttributeNames(Collection<String> attributeNames)
A list of one or more attribute names to use that are found in a specified augmented manifest file.
attributeNames - A list of one or more attribute names to use that are found in a specified augmented manifest file.
public S3DataSource withAttributeNames(String... attributeNames)
A list of one or more attribute names to use that are found in a specified augmented manifest file.
 NOTE: This method appends the values to the existing list (if any). Use
 setAttributeNames(java.util.Collection) or withAttributeNames(java.util.Collection) if you want
 to override the existing values.
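
The difference between appending and overriding noted above can be shown with a stand-in class. This sketch does not use the AWS SDK: `AttributeNamesSketch` is a hypothetical class that only mimics the documented behavior (the varargs overload appends, the Collection overload replaces), and its internals are an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

/** Hypothetical stand-in that mimics the documented semantics; NOT the SDK class. */
final class AttributeNamesSketch {
    private List<String> attributeNames;

    /** Varargs overload: appends to the existing list, if any. */
    AttributeNamesSketch withAttributeNames(String... names) {
        if (attributeNames == null) {
            attributeNames = new ArrayList<>(names.length);
        }
        attributeNames.addAll(Arrays.asList(names));
        return this;
    }

    /** Collection overload: overrides the existing values. */
    AttributeNamesSketch withAttributeNames(Collection<String> names) {
        attributeNames = (names == null) ? null : new ArrayList<>(names);
        return this;
    }

    List<String> getAttributeNames() {
        return attributeNames;
    }
}
```

Calling the varargs overload twice with "source" and then "label" leaves both names in the list, whereas a later call with a collection containing only "label" leaves just that one name.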
 
attributeNames - A list of one or more attribute names to use that are found in a specified augmented manifest file.
public S3DataSource withAttributeNames(Collection<String> attributeNames)
A list of one or more attribute names to use that are found in a specified augmented manifest file.
attributeNames - A list of one or more attribute names to use that are found in a specified augmented manifest file.
public String toString()
Overrides: toString in class Object
See Also: Object.toString()
public S3DataSource clone()
public void marshall(ProtocolMarshaller protocolMarshaller)
Marshalls this structured data using the given ProtocolMarshaller.
Specified by: marshall in interface StructuredPojo
protocolMarshaller - Implementation of ProtocolMarshaller used to marshall this object's data.