Class

com.ebiznext.comet.job.infer

InferSchemaJob

Related Doc: package infer

class InferSchemaJob extends AnyRef

Infers the schema of a given data path, domain name, and schema name.

Linear Supertypes
AnyRef, Any

Instance Constructors

  1. new InferSchemaJob()(implicit settings: Settings)

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  6. def createDataFrameWithFormat(datasetInit: Dataset[String], path: Path, header: Boolean): DataFrame

    Create the dataframe with its associated format

    datasetInit

    : created dataset without specifying format

    path

    : file path

  7. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  8. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  9. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  10. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  11. def getDomainDirectoryName(path: Path): String

    Get domain directory name

    path

    : file path

    returns

    the domain directory name

  12. def getFormatFile(datasetInit: Dataset[String]): String

    Gets the file format by inspecting the first and the last line of the dataset. mapPartitionsWithIndex is used to retrieve these lines so that they are guaranteed to be the actual first and last lines of the file.

    datasetInit

    : created dataset without specifying format
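
    The first/last-line check described above can be sketched in plain Scala. This is only an illustration under stated assumptions: the helper name guessFormat, the format labels "JSON", "ARRAY_JSON" and "DSV", and the bracket-based rules are hypothetical and are not taken from the documented implementation. The retrieval of the two lines through mapPartitionsWithIndex is not shown; the sketch starts from lines that have already been extracted.

```scala
object FormatSketch {

  /** Hypothetical helper, not the documented getFormatFile.
    * Classifies a file from its first and last line only.
    * The returned labels are assumptions made for this sketch.
    */
  def guessFormat(firstLine: String, lastLine: String): String = {
    val first = firstLine.trim
    val last  = lastLine.trim
    // One JSON object per line: the file opens with '{' and closes with '}'.
    if (first.startsWith("{") && last.endsWith("}")) "JSON"
    // A single JSON array: the file opens with '[' and closes with ']'.
    else if (first.startsWith("[") && last.endsWith("]")) "ARRAY_JSON"
    // Anything else is treated as delimiter-separated values.
    else "DSV"
  }
}
```

    Looking at both ends of the file matters because an array that spans many lines only shows its opening bracket on the first line and its closing bracket on the last one, which is presumably why the method insists on the true first and last lines.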

  13. def getSchemaPattern(path: Path): String

    Get schema pattern

    path

    : file path

    returns

    the schema pattern

  14. def getSeparator(datasetInit: Dataset[String]): String

    Gets the file separator by taking the character that appears most often in 10 lines of the dataset.

    datasetInit

    : created dataset without specifying format

    returns

    the file separator
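
    A minimal sketch of the "most frequent character" idea, written in plain Scala over an in-memory sequence of lines rather than a Dataset[String]. The helper name guessSeparator, the Option return type, and the choice to ignore letters and digits are assumptions made for the example; the documented method returns a String and may filter characters differently.

```scala
object SeparatorSketch {

  /** Hypothetical helper, not the documented getSeparator.
    * Returns the most frequent character among the first `sampleSize`
    * lines, ignoring letters and digits (an assumption of this sketch),
    * or None when no candidate character is found.
    */
  def guessSeparator(lines: Seq[String], sampleSize: Int = 10): Option[String] = {
    // Concatenate the sampled lines and keep only candidate characters.
    val candidates = lines.take(sampleSize).mkString.filterNot(_.isLetterOrDigit)
    if (candidates.isEmpty) None
    else {
      // Count the occurrences of each candidate and keep the most frequent one.
      val counts = candidates.groupBy(identity).map { case (c, occ) => (c, occ.length) }
      Some(counts.maxBy(_._2)._1.toString)
    }
  }
}
```

    For example, SeparatorSketch.guessSeparator(Seq("id;name;age", "1;Ann;30")) picks ";" because it is the only non-alphanumeric character in the sample. Free text with many spaces or quotes would mislead such a heuristic, so a real implementation probably needs a stricter candidate filter.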

  15. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  16. def infer(domainName: String, schemaName: String, dataPath: String, savePath: String, header: Boolean): Try[Unit]

    Infers the schema of the data at dataPath for the given domain name and schema name.

    returns

    a Try[Unit]: a Success if the inference completed, a Failure otherwise

  17. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  18. def name: String

  19. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  20. final def notify(): Unit

    Definition Classes
    AnyRef
  21. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  22. def readFile(path: Path): Dataset[String]

    Read file without specifying the format

    path

    : file path

    returns

    a dataset of strings that contains the file's data

  23. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  24. def toString(): String

    Definition Classes
    AnyRef → Any
  25. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  26. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  27. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
