Class

com.ebiznext.comet.workflow

IngestionWorkflow

Related Doc: package workflow


class IngestionWorkflow extends StrictLogging

The whole workflow works as follows:

Linear Supertypes
StrictLogging, AnyRef, Any

Instance Constructors

  1. new IngestionWorkflow(storageHandler: StorageHandler, schemaHandler: SchemaHandler, launchHandler: LaunchHandler)(implicit settings: Settings)


    storageHandler

    : Minimum set of features required for the underlying filesystem

    schemaHandler

    : Schema interface

    launchHandler

    : Cron Manager interface

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def atlas(config: AtlasConfig): Boolean

  6. def autoJob(config: TransformConfig): Boolean

    Runs each task of a job in succession.

    config

    : job name as defined in the YML file, and the parameters to pass to its SQL statements.

  7. def bqload(config: BigQueryLoadConfig, maybeSchema: Option[Schema] = None): Try[JobResult]

  8. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  9. val domains: List[Domain]

  10. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  11. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  12. def esLoad(config: ESLoadConfig): Try[JobResult]

  13. def esload(job: AutoJobDesc, task: AutoTaskDesc): Boolean

  14. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  15. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  16. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  17. def infer(config: InferSchemaConfig): Try[Unit]

  18. def ingest(domain: Domain, schema: Schema, ingestingPath: List[Path], options: Map[String, String]): Try[JobResult]

  19. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  20. def jdbcload(config: ConnectionLoadConfig): Try[JobResult]

  21. def load(config: LoadConfig): Boolean


    Ingest the file (called by the cron manager at ingestion time for a specific dataset).

  22. def loadLanding(): Unit


    Move the files from the landing area to the pending area.

    Files are loaded one domain at a time. Each domain has its own directory, specified in the "directory" key of the domain YML file.

    Compressed files are uncompressed if a corresponding ack file exists. They are recognized by their extension, which should be one of .tgz, .zip or .gz. Raw files should also have a corresponding ack file. Before the files are moved to the pending area, the ack files are deleted.

    To import files without an ack file, specify an empty "ack" key (that is, ack:"") in the domain YML file. "ack" is the default ack extension searched for, but you may specify a different one in the domain YML file.
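
    The landing rules described for loadLanding can be sketched with a few pure functions. This is only an illustration, not the library's implementation: the object and method names are hypothetical, and it assumes that an ack file shares the data file's base name (for example sales.zip paired with sales.ack), which the documentation does not state.

    ```scala
    // Hypothetical helper names; not part of IngestionWorkflow's API.
    object LandingSketch {
      // Extensions the documentation lists for compressed files.
      val compressedExtensions: List[String] = List(".tgz", ".zip", ".gz")

      def isCompressed(fileName: String): Boolean =
        compressedExtensions.exists(ext => fileName.endsWith(ext))

      // Strip the last extension: "sales.zip" becomes "sales".
      def baseName(fileName: String): String = {
        val dot = fileName.lastIndexOf('.')
        if (dot > 0) fileName.substring(0, dot) else fileName
      }

      // Files eligible to move to the pending area.
      // An empty ackExt means no ack file is required (the ack:"" case).
      // Assumption: the ack file is named <base name>.<ackExt>.
      def readyToMove(landing: Set[String], ackExt: String): Set[String] =
        if (ackExt.isEmpty) landing
        else {
          val suffix = "." + ackExt
          val acked = landing.filter(_.endsWith(suffix)).map(_.stripSuffix(suffix))
          // Ack files themselves are never moved; they are dropped here.
          landing.filter(f => !f.endsWith(suffix) && acked.contains(baseName(f)))
        }
    }
    ```

    With a landing area containing a.csv, a.ack and b.zip, only a.csv qualifies under the default "ack" extension, because b.zip has no ack file yet.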

  23. def loadPending(config: WatchConfig = WatchConfig()): Boolean


    Split files into resolved and unresolved datasets. A file is unresolved if no corresponding schema is found. Schema matching is based on the dataset filename pattern.

    config

    : includes: load pending datasets of these domains only. excludes: do not load datasets of these domains. If both lists are empty, all domains are included.
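
    The domain filtering and the resolved/unresolved split described for loadPending can be modelled as below. This is a minimal sketch with hypothetical names, using plain strings for domains and file names. The documentation does not say what happens when both lists are non-empty; giving includes precedence is an assumption made here.

    ```scala
    import java.util.regex.Pattern

    // Hypothetical helper names; not part of IngestionWorkflow's API.
    object PendingSketch {
      // Keep the domains named in `includes`, otherwise drop those in `excludes`.
      // Both lists empty: every domain is kept, as the documentation states.
      // Assumption: `includes` wins when both lists are non-empty.
      def selectDomains(all: List[String],
                        includes: List[String],
                        excludes: List[String]): List[String] =
        if (includes.nonEmpty) all.filter(d => includes.contains(d))
        else all.filterNot(d => excludes.contains(d))

      // Split file names into resolved (file -> schema name) and unresolved
      // (no schema pattern matched the file name).
      def split(files: List[String],
                schemas: Map[String, Pattern]): (Map[String, String], List[String]) = {
        val matched = files.map { f =>
          f -> schemas.collectFirst { case (name, p) if p.matcher(f).matches() => name }
        }
        val resolved = matched.collect { case (f, Some(name)) => f -> name }.toMap
        val unresolved = matched.collect { case (f, None) => f }
        (resolved, unresolved)
      }
    }
    ```

    A file such as sales-2020.csv resolves against a schema whose pattern is sales-.*\.csv, while a file matching no pattern ends up in the unresolved list.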

  24. val logger: Logger

    Attributes
    protected
    Definition Classes
    StrictLogging
  25. def metric(cliConfig: MetricsConfig): Try[JobResult]


    Runs the metrics job.

    cliConfig

    : Client's configuration for metrics computing

  26. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  27. final def notify(): Unit

    Definition Classes
    AnyRef
  28. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  29. def setNullableStateOfColumn(df: DataFrame, nullable: Boolean): DataFrame


    Set the nullable property of a column.

    df

    : source DataFrame

    nullable

    : the flag to set, such that the column is either nullable or not
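
    The idea behind setNullableStateOfColumn can be illustrated without Spark by rebuilding a schema with a new flag. The Field case class below is a hypothetical stand-in for a schema field, not a Spark type. Because the signature takes no column name, the sketch assumes the flag is applied to every column; the documentation does not confirm this.

    ```scala
    // Hypothetical model; the real method works on a Spark DataFrame.
    object NullableSketch {
      // Simplified stand-in for one column of a schema.
      final case class Field(name: String, dataType: String, nullable: Boolean)

      // Return a new schema in which every field carries the requested flag.
      // The input schema is left untouched.
      def setNullable(schema: List[Field], nullable: Boolean): List[Field] =
        schema.map(_.copy(nullable = nullable))
    }
    ```

    Returning a new schema rather than mutating the existing one mirrors the method's signature, which takes a DataFrame and returns a DataFrame.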

  30. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  31. def toString(): String

    Definition Classes
    AnyRef → Any
  32. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  33. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  34. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
