Class/Object

com.ebiznext.comet.schema.model

Metadata

case class Metadata(mode: Option[Mode] = None, format: Option[Format] = None, encoding: Option[String] = None, multiline: Option[Boolean] = None, array: Option[Boolean] = None, withHeader: Option[Boolean] = None, separator: Option[String] = None, quote: Option[String] = None, escape: Option[String] = None, write: Option[WriteMode] = None, partition: Option[Partition] = None, sink: Option[Sink] = None, ignore: Option[String] = None, clustering: Option[Seq[String]] = None, xml: Option[Map[String, String]] = None) extends Product with Serializable

Specifies schema properties. These properties may be specified at the schema level or at the domain level. Any property not specified at the schema level is taken from the domain level; if it is not specified there either, the default value is returned.
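
The fallback order described above (schema level, then domain level, then default) can be sketched as follows. This is a minimal illustration and not the library's implementation: `MetadataSketch`, `overlay` and `MetadataMergeSketch` are hypothetical names, and only three of the properties are modelled.

```scala
// Hypothetical, simplified stand-in for Metadata: only three properties are modelled.
final case class MetadataSketch(
  separator: Option[String] = None,
  encoding: Option[String] = None,
  withHeader: Option[Boolean] = None
) {
  // A schema-level (child) value wins; otherwise fall back to the domain-level (parent) value.
  private def merge[T](parent: Option[T], child: Option[T]): Option[T] =
    child.orElse(parent)

  // Overlay schema-level metadata on top of this domain-level metadata.
  def overlay(child: MetadataSketch): MetadataSketch =
    MetadataSketch(
      separator = merge(separator, child.separator),
      encoding = merge(encoding, child.encoding),
      withHeader = merge(withHeader, child.withHeader)
    )

  // The documented defaults apply only when neither level sets the property.
  def getSeparator: String = separator.getOrElse(";")
  def getEncoding: String = encoding.getOrElse("UTF-8")
  def isWithHeader: Boolean = withHeader.getOrElse(true)
}

object MetadataMergeSketch {
  val domain: MetadataSketch = MetadataSketch(separator = Some("|"), encoding = Some("ISO-8859-1"))
  val schema: MetadataSketch = MetadataSketch(separator = Some(","))
  val resolved: MetadataSketch = domain.overlay(schema)

  def main(args: Array[String]): Unit = {
    println(resolved.getSeparator) // set at the schema level
    println(resolved.getEncoding)  // inherited from the domain level
    println(resolved.isWithHeader) // set at neither level, so the default is used
  }
}
```

Keeping every property as an `Option` is what makes this layering possible: `None` means "not specified here", which is distinct from explicitly choosing the default value.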

mode

: FILE mode by default. FILE and STREAM are the two accepted values. FILE is currently the only supported mode.

format

: DSV by default. Supported file formats are:

  • DSV : Delimiter-separated values file. The delimiter is specified in the "separator" field.
  • POSITION : Fixed-format file where values are located at an exact position in each line.
  • SIMPLE_JSON : JSON files with top-level fields only. For optimisation purposes, JSON with top-level fields only is distinguished from JSON with nested fields.
  • JSON : Deep JSON file. Use it only when your JSON documents contain subdocuments; otherwise prefer SIMPLE_JSON, which is much faster.
  • XML : XML files.

encoding

: UTF-8 if not specified.

multiline

: Are JSON objects on a single line or spread over multiple lines? Single line by default. false means single line, which is also faster.

array

: Is the JSON stored as a single array of objects? false by default, which means one JSON document per line.

withHeader

: Does the dataset have a header? true by default.

separator

: The value delimiter, ';' by default. It may be a multi-character string starting from Spark 3.

quote

: The string quote character, '"' by default.

escape

: The escape character, '\' by default.

write

: Write mode, APPEND by default

partition

: Partition columns, no partitioning by default

sink

: Should the dataset be indexed in Elasticsearch after ingestion?

ignore

: Pattern to ignore or UDF to apply to ignore some lines

clustering

: List of attributes to use for clustering

xml

: com.databricks.spark.xml options to use (e.g. rowTag).
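
Several of the properties above (separator, quote, escape, withHeader, encoding) have direct counterparts among the options of Spark's CSV reader (`sep`, `quote`, `escape`, `header`, `encoding`). The sketch below shows one way such settings could be resolved to their documented defaults and turned into reader options. The helper `csvOptions` and the object name are hypothetical, and nothing here is taken from the library's own code.

```scala
object CsvReaderOptionsSketch {
  // Hypothetical helper: resolves each optional setting to its documented default
  // and returns the matching Spark CSV reader option names and values.
  def csvOptions(
    separator: Option[String] = None,
    quote: Option[String] = None,
    escape: Option[String] = None,
    withHeader: Option[Boolean] = None,
    encoding: Option[String] = None
  ): Map[String, String] =
    Map(
      "sep"      -> separator.getOrElse(";"),
      "quote"    -> quote.getOrElse("\""),
      "escape"   -> escape.getOrElse("\\"),
      "header"   -> withHeader.getOrElse(true).toString,
      "encoding" -> encoding.getOrElse("UTF-8")
    )

  def main(args: Array[String]): Unit = {
    // Only the separator is overridden; every other option falls back to its default.
    csvOptions(separator = Some("|")).foreach { case (k, v) => println(s"$k=$v") }
  }
}
```

With a Spark session available, a map like this could be handed to `spark.read.options(...)` before calling `csv(path)`.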

Linear Supertypes
Serializable, Serializable, Product, Equals, AnyRef, Any

Instance Constructors

  1. new Metadata(mode: Option[Mode] = None, format: Option[Format] = None, encoding: Option[String] = None, multiline: Option[Boolean] = None, array: Option[Boolean] = None, withHeader: Option[Boolean] = None, separator: Option[String] = None, quote: Option[String] = None, escape: Option[String] = None, write: Option[WriteMode] = None, partition: Option[Partition] = None, sink: Option[Sink] = None, ignore: Option[String] = None, clustering: Option[Seq[String]] = None, xml: Option[Map[String, String]] = None)

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. val array: Option[Boolean]

    : Is the JSON stored as a single array of objects? false by default, which means one JSON document per line.

  5. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  6. def checkValidity(schemaHandler: SchemaHandler): Either[List[String], Boolean]

  7. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  8. val clustering: Option[Seq[String]]

    : List of attributes to use for clustering

  9. val encoding: Option[String]

    : UTF-8 if not specified.

  10. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  11. val escape: Option[String]

    : The escape character, '\' by default.

  12. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  13. val format: Option[Format]

    : DSV by default. Supported file formats are:

    • DSV : Delimiter-separated values file. The delimiter is specified in the "separator" field.
    • POSITION : Fixed-format file where values are located at an exact position in each line.
    • SIMPLE_JSON : JSON files with top-level fields only. For optimisation purposes, JSON with top-level fields only is distinguished from JSON with nested fields.
    • JSON : Deep JSON file. Use it only when your JSON documents contain subdocuments; otherwise prefer SIMPLE_JSON, which is much faster.
    • XML : XML files.
  14. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  15. def getEncoding(): String

  16. def getEscape(): String

  17. def getFormat(): Format

  18. def getMode(): Mode

  19. def getMultiline(): Boolean

  20. def getPartitionAttributes(): List[String]

    Annotations
    @JsonIgnore()
  21. def getQuote(): String

  22. def getSamplingStrategy(): Double

    Annotations
    @JsonIgnore()
  23. def getSeparator(): String

  24. def getSink(): Option[Sink]

  25. def getWrite(): WriteMode

  26. val ignore: Option[String]

    : Pattern to ignore or UDF to apply to ignore some lines

  27. def import(child: Metadata): Metadata

    Merges this metadata with its child. Any property defined at the child level overrides the one defined at this level, which allows a schema to override the domain metadata attributes. Applied to domain-level metadata.

    child

    : Schema-level metadata

    returns

    the metadata resulting from the merge of the schema and the domain metadata.

  28. def isArray(): Boolean

  29. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  30. def isWithHeader(): Boolean

  31. def merge[T](parent: Option[T], child: Option[T]): Option[T]

    Merges a single attribute.

    parent

    : Domain-level metadata attribute

    child

    : Schema-level metadata attribute

    returns

    the schema-level attribute if it is defined, the domain-level attribute otherwise.

    Attributes
    protected
  32. val mode: Option[Mode]

    : FILE mode by default. FILE and STREAM are the two accepted values. FILE is currently the only supported mode.

  33. val multiline: Option[Boolean]

    : Are JSON objects on a single line or spread over multiple lines? Single line by default. false means single line, which is also faster.

  34. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  35. final def notify(): Unit

    Definition Classes
    AnyRef
  36. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  37. val partition: Option[Partition]

    : Partition columns, no partitioning by default

  38. val quote: Option[String]

    : The string quote character, '"' by default.

  39. val separator: Option[String]

    : The value delimiter, ';' by default. It may be a multi-character string starting from Spark 3.

  40. val sink: Option[Sink]

    : Should the dataset be indexed in Elasticsearch after ingestion?

  41. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  42. def toString(): String

    Definition Classes
    Metadata → AnyRef → Any
  43. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  44. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  45. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  46. val withHeader: Option[Boolean]

    : Does the dataset have a header? true by default.

  47. val write: Option[WriteMode]

    : Write mode, APPEND by default

  48. val xml: Option[Map[String, String]]

    : com.databricks.spark.xml options to use (e.g. rowTag).
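
Among the members listed above, `checkValidity` returns an `Either[List[String], Boolean]`. The sketch below shows how a caller might consume a result of that type. It assumes, which this page does not state, that the `Left` side carries error messages and the `Right` side signals success. The `check` function and its single rule are invented for illustration and are not the library's real checks.

```scala
object ValidityResultSketch {
  // Hypothetical stand-in with the same return type as checkValidity.
  // The rule below (no empty separator) is an invented example.
  def check(separator: Option[String]): Either[List[String], Boolean] =
    separator match {
      case Some(s) if s.isEmpty => Left(List("separator must not be empty"))
      case _                    => Right(true)
    }

  // Turn the result into a short human-readable report.
  def report(result: Either[List[String], Boolean]): String =
    result match {
      case Left(errors) => errors.mkString("invalid: ", "; ", "")
      case Right(_)     => "valid"
    }

  def main(args: Array[String]): Unit = {
    println(report(check(Some(";"))))
    println(report(check(Some(""))))
  }
}
```

Returning a list on the `Left` side lets a validator report every problem in one pass instead of stopping at the first one.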
