com.vbounyasit.bigdata

SparkApplication

Related Docs: object SparkApplication | package bigdata

abstract class SparkApplication[U, V] extends SparkSessionProvider with ETL[U, V] with LoggerProvider

An abstract class representing a submitted Spark application.

Linear Supertypes

LoggerProvider, ETL[U, V], SparkSessionProvider, AnyRef, Any

Instance Constructors

  1. new SparkApplication()

Abstract Value Members

  1. abstract val configDefinition: ConfigDefinition

    The configuration files definition

  2. abstract def executionPlans(implicit spark: SparkSession): Map[String, ExecutionConfig]

    The defined execution plans.

    spark

    An implicit Spark session

    returns

    A JobName/ExecutionConfig Map

  3. abstract def load(dataFrame: DataFrame, database: String, table: String, optionalJobParameters: OptionalJobParameters[U, V]): Unit

    Saves the resulting DataFrame to disk.

    dataFrame

    The resulting DataFrame

    database

    The output database name

    table

    The output table name (the job name)

    optionalJobParameters

    An OptionalJobParameters object containing any custom arguments/application files defined through the application

    Definition Classes
    ETL
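Taken together, the three abstract members above are what a concrete application must provide. The sketch below shows one possible subclass. The stub classes stand in for the framework types named in this page (ConfigDefinition, ExecutionConfig, OptionalJobParameters, SparkSession, DataFrame) so the snippet is self-contained; MyApp, "my_job", and the stub fields are illustrative assumptions, not part of the library.

```scala
// Stub stand-ins for the com.vbounyasit.bigdata types named in the docs,
// so this sketch compiles on its own. In a real project, import the
// framework types instead of declaring these.
case class ConfigDefinition(configFiles: Seq[String])
case class ExecutionConfig(jobName: String)
case class OptionalJobParameters[U, V](arguments: Option[U], applicationFiles: Option[V])
class SparkSession
class DataFrame

// Simplified shape of the documented abstract members.
abstract class SparkApplication[U, V] {
  val configDefinition: ConfigDefinition
  def executionPlans(implicit spark: SparkSession): Map[String, ExecutionConfig]
  def load(dataFrame: DataFrame, database: String, table: String,
           optionalJobParameters: OptionalJobParameters[U, V]): Unit
}

// A hypothetical concrete application wiring one job name to one plan.
object MyApp extends SparkApplication[Unit, Unit] {
  val configDefinition: ConfigDefinition =
    ConfigDefinition(Seq("sources.conf", "jobs.conf"))

  def executionPlans(implicit spark: SparkSession): Map[String, ExecutionConfig] =
    Map("my_job" -> ExecutionConfig("my_job"))

  def load(dataFrame: DataFrame, database: String, table: String,
           optionalJobParameters: OptionalJobParameters[Unit, Unit]): Unit =
    println(s"saving result to $database.$table")
}
```

The JobName/ExecutionConfig map returned by executionPlans is what lets one application binary serve several jobs, selected by name at submission time.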

Concrete Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  6. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  7. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  8. def extract(jobName: String, jobSourcesConf: List[JobSource], sourcesConf: SourcesConf, env: String)(implicit spark: SparkSession): Sources

    Extracts data from a provided sources configuration.

    jobName

    The job name

    jobSourcesConf

    The job's input sources configuration

    sourcesConf

    The available input sources configuration

    env

    The environment from which to extract the input sources

    spark

    An implicit Spark session

    returns

    A Map of sourceName/SourcePipeline entries containing the extracted sources

    Definition Classes
    SparkApplication → ETL
  9. def finalize(): Unit
    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  10. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
  11. final def getSparkSession(sparkParamsConf: SparkParamsConf): SparkSession
    Definition Classes
    SparkSessionProvider
  12. def hashCode(): Int
    Definition Classes
    AnyRef → Any
  13. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  14. def loadExecutionData(args: Array[String]): ExecutionData

    Loads the set of parameters needed for the ETL operation: config file loading, argument parsing, execution parameter creation, etc.

    args

    The list of arguments to parse

    returns

    An ExecutionData object containing all the required parameters

    Attributes
    protected
    Definition Classes
    SparkApplication → ETL
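A typical entry point first builds the ExecutionData from the command-line arguments, then hands it to the ETL runner. The sketch below is a hypothetical illustration of that flow: ExecutionData, loadExecutionData, and runETL are simplified stand-ins for the framework members documented here, and the "--key value" argument pairing is invented for the example.

```scala
// Simplified stand-in for the framework's ExecutionData.
case class ExecutionData(jobArgs: Map[String, String])

object EntryPoint {
  // Stand-in for loadExecutionData: here we only pair up "--key value"
  // arguments; the real method also loads config files, creates
  // execution parameters, etc.
  def loadExecutionData(args: Array[String]): ExecutionData =
    ExecutionData(args.grouped(2).collect { case Array(k, v) => k -> v }.toMap)

  // Stand-in for runETL: the real method drives the extract/transform/load steps.
  def runETL(executionData: ExecutionData): Unit =
    println(s"running ETL with ${executionData.jobArgs.size} parameter(s)")

  def main(args: Array[String]): Unit =
    runETL(loadExecutionData(args))
}
```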
  15. val logger: Logger
    Definition Classes
    LoggerProvider
  16. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  17. final def notify(): Unit
    Definition Classes
    AnyRef
  18. final def notifyAll(): Unit
    Definition Classes
    AnyRef
  19. def runETL[Config, Argument, ConfigInput, ArgumentInput](executionData: ExecutionData): Unit

    The main method containing the logic for running the ETL job.

    executionData

    The ExecutionData object that will be used

    Definition Classes
    ETL
  20. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  21. def toString(): String
    Definition Classes
    AnyRef → Any
  22. def transform(jobName: String, sources: Sources, executionPlan: ExecutionPlan, outputColumns: Option[Seq[String]], exportDateColumn: Option[String])(implicit spark: SparkSession): DataFrame

    Applies transformations to a given set of sources.

    jobName

    The job name

    sources

    The extracted input sources

    executionPlan

    The execution plan to apply

    outputColumns

    An optional list of columns to select in the resulting DataFrame

    exportDateColumn

    An optional date column name tying the result to its computation date

    spark

    An implicit Spark session

    returns

    The resulting DataFrame

    Definition Classes
    SparkApplication → ETL
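The extract, transform, and load members compose into the overall job flow that runETL drives. The toy sketch below illustrates only that composition; the Sources map, the list-backed Frame, and the doubling "execution plan" are invented stand-ins, not the framework's actual types or logic.

```scala
object EtlFlow {
  type Sources = Map[String, List[Int]] // stand-in for extracted sources
  type Frame   = List[Int]              // stand-in for a DataFrame

  // Extract: produce named sources (hard-coded here for illustration).
  def extract(): Sources = Map("events" -> List(1, 2, 3))

  // Transform: apply an "execution plan" (here, just doubling every value).
  def transform(sources: Sources): Frame =
    sources.values.flatten.map(_ * 2).toList

  // Load: persist the result (printed here instead of written to disk).
  def load(frame: Frame, database: String, table: String): Unit =
    println(s"saved ${frame.size} rows to $database.$table")

  // The composition runETL performs, in miniature: E -> T -> L.
  def run(): Frame = {
    val result = transform(extract())
    load(result, "analytics", "my_job")
    result
  }
}
```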
  23. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  24. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  25. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
