com.krux.hyperion

activity

package activity

Type Members

  1. case class CopyActivity extends PipelineActivity with Product with Serializable

    Copies data from one data node to another.

    Note

    It seems that both the input and the output format need to be CsvDataFormat for this copy to work properly, and that it must be a specific variant of CSV. For more information, see:

    http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-object-copyactivity.html

    In our experience, TsvDataFormat is hard to use for both import and export, especially in tasks involving RedshiftCopyActivity. As a rule of thumb, always use the default CsvDataFormat for tasks that involve both exporting to S3 and copying to Redshift.

  2. case class DeleteS3PathActivity extends PipelineActivity with Product with Serializable

    Activity to recursively delete files in an S3 path.

  3. trait EmrActivity extends PipelineActivity

    The base trait for activities that run on an Amazon EMR cluster

  4. trait GoogleStorageActivity extends PipelineActivity

  5. case class GoogleStorageDownloadActivity extends GoogleStorageActivity with Product with Serializable

    Google Storage Download activity

  6. case class GoogleStorageUploadActivity extends GoogleStorageActivity with Product with Serializable

    Google Storage Upload activity

  7. case class HiveActivity extends PipelineActivity with Product with Serializable

    Runs a Hive query on an Amazon EMR cluster. HiveActivity makes it easier to set up an Amazon EMR activity and automatically creates Hive tables based on input data coming in from either Amazon S3 or Amazon RDS. All you need to specify is the HiveQL to run on the source data. AWS Data Pipeline automatically creates Hive tables with ${input1}, ${input2}, etc. based on the input fields in the HiveActivity object. For S3 inputs, the dataFormat field is used to create the Hive column names. For MySQL (RDS) inputs, the column names for the SQL query are used to create the Hive column names.

  8. case class HiveCopyActivity extends PipelineActivity with Product with Serializable

    Runs a Hive query on an Amazon EMR cluster. HiveCopyActivity makes it easier to copy data between Amazon S3 and DynamoDB. HiveCopyActivity accepts a HiveQL statement to filter input data from Amazon S3 or DynamoDB at the column and row level.

  9. case class JarActivity extends PipelineActivity with Product with Serializable

    Shell command activity that runs a given Jar

  10. case class MapReduceActivity extends EmrActivity with Product with Serializable

    Runs MapReduce steps on an Amazon EMR cluster.

  11. case class MapReduceStep extends Product with Serializable

    A MapReduce step that runs on a MapReduce cluster.

  12. case class PigActivity extends PipelineActivity with Product with Serializable

    PigActivity provides native support for Pig scripts in AWS Data Pipeline without the requirement to use ShellCommandActivity or EmrActivity. In addition, PigActivity supports data staging. When the stage field is set to true, AWS Data Pipeline stages the input data as a schema in Pig without additional code from the user.

  13. trait PipelineActivity extends PipelineObject

    The activity trait. All activities should mix in this trait.

  14. case class RedshiftCopyActivity extends PipelineActivity with Product with Serializable

    Copies data directly from DynamoDB or Amazon S3 to Amazon Redshift. You can load data into a new table, or easily merge data into an existing table.

  15. trait RedshiftCopyOption extends AnyRef

  16. case class RedshiftUnloadActivity extends PipelineActivity with Product with Serializable

    Unloads the result of the given SQL script from Redshift to the given s3Path.

  17. trait RedshiftUnloadOption extends AnyRef

  18. trait RunnableObject extends AnyRef

    Runtime references of runnable objects.

  19. class S3DistCpActivity extends EmrActivity

  20. case class ShellCommandActivity extends PipelineActivity with Product with Serializable

    Runs a command or script

  21. case class SparkActivity extends EmrActivity with Product with Serializable

    Runs Spark steps on a given Spark cluster with Amazon EMR.

  22. case class SparkStep extends Product with Serializable

    A Spark step that runs on a Spark cluster.

  23. case class SqlActivity extends PipelineActivity with Product with Serializable

    Runs an SQL query on a Redshift cluster. If the query writes out to a table that does not exist, a new table with that name is created.
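
The inheritance relationships in the type members listed above (every activity mixes in PipelineActivity, and some do so through EmrActivity) can be illustrated with a small, self-contained sketch. This is not Hyperion's actual code: the constructor parameters (id, command, steps) and the `category` helper are hypothetical, chosen only to show how a shared base trait lets callers treat activities uniformly.

```scala
// Simplified, hypothetical model of the hierarchy in the listing above.
// The fields and the `category` helper are illustrative assumptions,
// not Hyperion's real definitions.
object ActivityHierarchySketch {

  // Stand-in for Hyperion's PipelineObject.
  trait PipelineObject { def id: String }

  // The activity trait: every activity mixes this in.
  trait PipelineActivity extends PipelineObject

  // Base trait for activities that run on an Amazon EMR cluster.
  trait EmrActivity extends PipelineActivity

  final case class ShellCommandActivity(id: String, command: String)
    extends PipelineActivity

  final case class MapReduceActivity(id: String, steps: List[String])
    extends EmrActivity

  final case class SparkActivity(id: String, steps: List[String])
    extends EmrActivity

  // Reports which base trait an activity reaches PipelineActivity through.
  // The more specific EmrActivity case must come first.
  def category(activity: PipelineActivity): String = activity match {
    case _: EmrActivity => "EmrActivity"
    case _              => "PipelineActivity"
  }

  def main(args: Array[String]): Unit = {
    val activities: List[PipelineActivity] = List(
      ShellCommandActivity("shell-1", "echo hello"),
      MapReduceActivity("mr-1", List("step-a")),
      SparkActivity("spark-1", List("step-b"))
    )
    activities.foreach(a => println(s"${a.id}: ${category(a)}"))
  }
}
```

Because the case classes share PipelineActivity as a common supertype, a list of mixed activities can be handled by one function, while the intermediate EmrActivity trait still allows EMR-specific handling where needed.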

Value Members

  1. object CopyActivity extends RunnableObject with Serializable

  2. object DeleteS3PathActivity extends RunnableObject with Serializable

  3. object GoogleStorageDownloadActivity extends Serializable

  4. object GoogleStorageUploadActivity extends RunnableObject with Serializable

  5. object HiveActivity extends RunnableObject with Serializable

  6. object HiveCopyActivity extends RunnableObject with Serializable

  7. object JarActivity extends RunnableObject with Serializable

  8. object MapReduceActivity extends RunnableObject with Serializable

  9. object MapReduceStep extends Serializable

  10. object PigActivity extends RunnableObject with Serializable

  11. object RedshiftCopyActivity extends Enumeration with RunnableObject

  12. object RedshiftCopyOption

  13. object RedshiftUnloadActivity extends RunnableObject with Serializable

  14. object RedshiftUnloadOption

  15. object S3DistCpActivity

  16. object ShellCommandActivity extends RunnableObject with Serializable

  17. object SparkActivity extends RunnableObject with Serializable

  18. object SparkStep extends Serializable

  19. object SqlActivity extends RunnableObject with Serializable
