Class

io.eels.datastream

DataStreamSource

Related Doc: package datastream

class DataStreamSource extends DataStream with Using with Logging

Linear Supertypes
Using, DataStream, Logging, AnyRef, Any

Instance Constructors

  1. new DataStreamSource(source: Source)

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. def ++(other: DataStream): DataStream

    Joins two streams together, such that the elements of the given datastream are appended to the end of this datastream.

    Definition Classes
    DataStream
  4. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  5. def addField(field: Field, defaultValue: Any, errorIfFieldExists: Boolean): DataStream

    Definition Classes
    DataStream
  6. def addField(name: Field, defaultValue: Any): DataStream

    Returns a new DataStream with the given field added at the end. The value of this field for each Row is specified by the default value. The value must be compatible with the field definition. For example, an error will occur if the field has type Int and the default value is 1.3.

    Definition Classes
    DataStream
  7. def addField(field: Field, expression: Expression, errorIfFieldExists: Boolean): DataStream

    Definition Classes
    DataStream
  8. def addField(field: Field, expression: Expression): DataStream

    Definition Classes
    DataStream
  9. def addField(name: String, defaultValue: String, errorIfFieldExists: Boolean): DataStream

    Definition Classes
    DataStream
  10. def addField(name: String, defaultValue: String): DataStream

    Returns a new DataStream with the new field of type String added at the end. The value of this field for each Row is specified by the default value.

    Definition Classes
    DataStream
  11. def addFieldFn(name: String, fn: (Row) ⇒ Any, errorIfFieldExists: Boolean): DataStream

    Definition Classes
    DataStream
  12. def addFieldFn(name: String, fn: (Row) ⇒ Any): DataStream

    Definition Classes
    DataStream
  13. def addFieldFn(field: Field, fn: (Row) ⇒ Any, errorIfFieldExists: Boolean): DataStream

    Definition Classes
    DataStream
  14. def addFieldFn(field: Field, fn: (Row) ⇒ Any): DataStream

    Returns a new DataStream with a new field added at the end. The value for the field is taken from the function which is invoked for each row.

    Definition Classes
    DataStream
  15. def aggregated(): GroupedDataStream

    Definition Classes
    DataStream
  16. def align(_schema: StructType): DataStream

    Accepts a schema and 'aligns' this datastream to match the schema. In this sense, align means the values of each row will be re-ordered to match the schema, and extraneous fields will be dropped. Any field in the schema that is missing from this datastream will cause an exception to be thrown.

    For example, given a DataStream with schema a,b,c, if align is called with schema c,a then the row with values (1,2,3) becomes (3,1).

    If align is called with schema d,a on a DataStream with schema a,b,c, an exception is raised because d is not in the original schema.
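
    The re-ordering described above can be sketched with plain Scala collections. This does not call the eel API: SimpleRow and the align function below are hypothetical stand-ins, and the exception type is an assumption.

```scala
object AlignSketch {
  // Hypothetical stand-in for a row: field names paired with values by position.
  final case class SimpleRow(fields: Vector[String], values: Vector[Any])

  // Reorders the row's values to match `target` and drops fields not listed.
  // Throws if `target` names a field that the row does not have.
  def align(row: SimpleRow, target: Vector[String]): SimpleRow = {
    val aligned = target.map { name =>
      val idx = row.fields.indexOf(name)
      if (idx < 0) throw new NoSuchElementException(s"Field $name is not in the source schema")
      row.values(idx)
    }
    SimpleRow(target, aligned)
  }
}
```

    With fields a,b,c and values (1,2,3), aligning to c,a yields the values (3,1); aligning to d,a throws.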

    Definition Classes
    DataStream
  17. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  18. def cartesian(other: DataStream): DataStream

    Returns a new DataStream which is the result of joining every row in this datastream with every row in the given datastream.

    The given datastream will be materialized before it is used.

    For example, if this datastream has rows [a,b], [c,d] and [e,f] and the given datastream has [1,2] and [3,4] then the result will be [a,b,1,2], [a,b,3,4], [c,d,1,2], [c,d,3,4], [e,f,1,2] and [e,f,3,4].
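
    A minimal sketch of this behaviour using plain Scala iterators rather than the eel API. SimpleRow and cartesian are hypothetical names; the point is that the given side is read fully into memory before the receiver is traversed.

```scala
object CartesianSketch {
  // Hypothetical stand-in for a row: just its values.
  type SimpleRow = Vector[Any]

  def cartesian(receiver: Iterator[SimpleRow], other: Iterator[SimpleRow]): Iterator[SimpleRow] = {
    // The given side is materialized first, as the description states.
    val materialized = other.toVector
    // Each receiver row is paired with every materialized row, in order.
    receiver.flatMap(left => materialized.map(right => left ++ right))
  }
}
```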

    Definition Classes
    DataStream
  19. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  20. def collect: Vector[Row]

    Action which results in all the rows being returned in memory as a Vector.

    Definition Classes
    DataStream
  21. def collectValues: Vector[Seq[Any]]

    Definition Classes
    DataStream
  22. def concat(other: DataStream): DataStream

    Combines two datastreams together such that the fields from this datastream are joined with the fields of the given datastream. For example, if this datastream has fields A,B and the given datastream has fields C,D then the result will have fields A,B,C,D.

    This operation requires an executor, as it must buffer rows to ensure an even distribution.
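
    The field-wise combination can be modelled roughly as follows. This is a sketch with plain collections, not the eel implementation: SimpleStream is hypothetical, and pairing rows by position (stopping at the shorter side) is an assumption the description does not confirm.

```scala
object ConcatSketch {
  // Hypothetical stand-in for a stream: field names plus fully buffered rows.
  final case class SimpleStream(fields: Vector[String], rows: Vector[Vector[Any]])

  // Appends the other stream's fields to this stream's fields and
  // pairs rows by position (assumption: extra rows on either side are dropped).
  def concat(a: SimpleStream, b: SimpleStream): SimpleStream =
    SimpleStream(
      a.fields ++ b.fields,
      a.rows.zip(b.rows).map { case (left, right) => left ++ right }
    )
}
```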

    Definition Classes
    DataStream
  23. def count: Long

    Definition Classes
    DataStream
  24. def drop(n: Int): DataStream

    Definition Classes
    DataStream
  25. def dropField(fieldName: String, caseSensitive: Boolean = true): DataStream

    Definition Classes
    DataStream
  26. def dropFieldIfExists(fieldName: String, caseSensitive: Boolean = true): DataStream

    Definition Classes
    DataStream
  27. def dropFields(regex: Regex): DataStream

    Definition Classes
    DataStream
  28. def dropNullRows(): DataStream

    Definition Classes
    DataStream
  29. def dropWhile(fieldName: String, p: (Any) ⇒ Boolean): DataStream

    Definition Classes
    DataStream
  30. def dropWhile(p: (Row) ⇒ Boolean): DataStream

    Definition Classes
    DataStream
  31. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  32. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  33. def exists(p: (Row) ⇒ Boolean): Boolean

    Definition Classes
    DataStream
  34. def explode(fn: (Row) ⇒ Seq[Row]): DataStream

    Definition Classes
    DataStream
  35. def filter(expression: Equals): DataStream

    Definition Classes
    DataStream
  36. def filter(fieldName: String, p: (Any) ⇒ Boolean): DataStream

    Filters where the given field name matches the given predicate.

    Definition Classes
    DataStream
  37. def filter(f: (Row) ⇒ Boolean): DataStream

    Definition Classes
    DataStream
  38. def filterNot(p: (Row) ⇒ Boolean): DataStream

    Definition Classes
    DataStream
  39. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  40. def find(p: (Row) ⇒ Boolean): Option[Row]

    Definition Classes
    DataStream
  41. def foreach[U](fn: (Row) ⇒ U): DataStream

    Executes a side-effecting function for every row in the stream, returning the same row.

    Definition Classes
    DataStream
  42. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  43. def groupBy(fn: (Row) ⇒ Any): GroupedDataStream

    Definition Classes
    DataStream
  44. def groupBy(fields: Iterable[String]): GroupedDataStream

    Definition Classes
    DataStream
  45. def groupBy(first: String, rest: String*): GroupedDataStream

    Definition Classes
    DataStream
  46. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  47. def head: Row

    Definition Classes
    DataStream
  48. def intersection(stream: DataStream): DataStream

    Definition Classes
    DataStream
  49. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  50. def iterator: Iterator[Row]

    Definition Classes
    DataStream
  51. def join(key: String, other: DataStream): DataStream

    Joins the given datastream to this datastream on the given key column, where the values of the keys are equal as determined by the Scala == operator. Both datastreams must contain the key column.

    The given datastream is fully inflated when this datastream needs to be materialized. For that reason, always use the smaller datastream as the parameter, and the larger datastream as the receiver.
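
    The advice about which side to pass as the parameter follows from the shape of a hash join, sketched below with plain collections. The names are hypothetical and this is not the eel implementation; inner-join semantics (unmatched receiver rows are dropped) are an assumption.

```scala
object JoinSketch {
  // Hypothetical stand-in for a row: field name to value.
  type SimpleRow = Map[String, Any]

  def join(key: String, receiver: Iterator[SimpleRow], other: Vector[SimpleRow]): Iterator[SimpleRow] = {
    // The given side is held entirely in memory, indexed by its key value.
    // Map.apply throws if a row lacks the key column.
    val index: Map[Any, Vector[SimpleRow]] = other.groupBy(row => row(key))
    // The receiver is streamed; keys are compared with ==, via the map lookup.
    receiver.flatMap { row =>
      index.getOrElse(row(key), Vector.empty).map(matched => row ++ matched)
    }
  }
}
```

    Only the given side is buffered, so memory use grows with the size of the parameter and not with the size of the receiver.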

    Definition Classes
    DataStream
  52. def listener(_listener: Listener): DataStream

    Definition Classes
    DataStream
  53. val logger: Logger

    Attributes
    protected
    Definition Classes
    Logging
  54. def map(f: (Row) ⇒ Row): DataStream

    Definition Classes
    DataStream
  55. def mapField(fieldName: String, fn: (Any) ⇒ Any): DataStream

    Definition Classes
    DataStream
  56. def mapFieldIfExists(fieldName: String, fn: (Any) ⇒ Any): DataStream

    Definition Classes
    DataStream
  57. def maxBy[T](fn: (Row) ⇒ T)(implicit ordering: Ordering[T]): Row

    Definition Classes
    DataStream
  58. def minBy[T](fn: (Row) ⇒ T)(implicit ordering: Ordering[T]): Row

    Definition Classes
    DataStream
  59. def multiplex(count: Int): Seq[DataStream]

    Definition Classes
    DataStream
  60. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  61. final def notify(): Unit

    Definition Classes
    AnyRef
  62. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  63. def projection(fields: Seq[String]): DataStream

    Returns a new DataStream which contains the given list of fields from the existing stream.

    Definition Classes
    DataStream
  64. def projection(first: String, rest: String*): DataStream

    Definition Classes
    DataStream
  65. def projectionExpression(expr: String): DataStream

    Definition Classes
    DataStream
  66. def removeField(fieldName: String, caseSensitive: Boolean = true): DataStream

    Definition Classes
    DataStream
  67. def removeFieldIfExists(fieldName: String, caseSensitive: Boolean = true): DataStream

    Definition Classes
    DataStream
  68. def removeFields(regex: Regex): DataStream

    Definition Classes
    DataStream
  69. def renameField(nameFrom: String, nameTo: String): DataStream

    Definition Classes
    DataStream
  70. def replace(from: String, target: Any): DataStream

    Definition Classes
    DataStream
  71. def replace(fieldName: String, from: String, target: Any, errorIfUnknownField: Boolean = true): DataStream

    Definition Classes
    DataStream
  72. def replace(fieldName: String, from: String, target: Any): DataStream

    Definition Classes
    DataStream
  73. def replace(fieldName: String, fn: (Any) ⇒ Any, errorIfUnknownField: Boolean): DataStream

    Definition Classes
    DataStream
  74. def replace(fieldName: String, fn: (Any) ⇒ Any): DataStream

    Definition Classes
    DataStream
  75. def replaceField(name: String, field: Field): DataStream

    Definition Classes
    DataStream
  76. def replaceFieldType(regex: Regex, datatype: DataType): DataStream

    Definition Classes
    DataStream
  77. def replaceFieldType(from: DataType, to: DataType): DataStream

    Definition Classes
    DataStream
  78. def replaceFieldType(fieldName: String, datatype: DataType): DataStream

    Returns the same data but with an updated schema. The field that matches the given name will have its datatype set to the given datatype.

    Definition Classes
    DataStream
  79. def replaceNullValues(defaultValue: String): DataStream

    Definition Classes
    DataStream
  80. def sample(k: Int): DataStream

    Returns a new DataStream where only every k-th row is retained. For example, if k is 2 then, on average, every other row will be returned. If k is 10 then only 10% of rows will be returned. When running concurrently, the rows that are sampled will vary depending on the order in which the workers pull through the rows. Each partition uses its own counter.
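
    For a single partition, counter-based sampling can be sketched as below. This is an illustration with a plain iterator, not the eel implementation; which row of each group of k is kept (here the last one) is an assumption.

```scala
object SampleSketch {
  // Keeps one row out of every k by counting rows as they pass through.
  // Assumption: the k-th, 2k-th, ... rows are the ones retained.
  def sample[A](rows: Iterator[A], k: Int): Iterator[A] = {
    require(k > 0, "k must be positive")
    rows.zipWithIndex.collect { case (row, i) if (i + 1) % k == 0 => row }
  }
}
```

    With several partitions each running its own counter, the overall fraction stays close to 1/k but the exact rows chosen depend on how rows are distributed.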

    Definition Classes
    DataStream
  81. def schema: StructType

    Definition Classes
    DataStreamSource → DataStream
  82. def size: Long

    Definition Classes
    DataStream
  83. def stripCharsFromFieldNames(chars: Seq[Char]): DataStream

    Returns a new DataStream with the same data as this stream, but where the field names have been sanitized by removing any occurrences of the given characters.
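
    The effect on the field names alone can be sketched as follows. The helper is hypothetical and works on a plain list of names, not on an eel schema.

```scala
object StripCharsSketch {
  // Removes every occurrence of the given characters from each field name.
  def stripChars(fieldNames: Seq[String], chars: Seq[Char]): Seq[String] =
    fieldNames.map(name => name.filterNot(c => chars.contains(c)))
}
```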

    Definition Classes
    DataStream
  84. def subscribe(s: Subscriber[Seq[Row]]): Unit

    Definition Classes
    DataStreamSource → DataStream
  85. def substract(stream: DataStream): DataStream

    Definition Classes
    DataStream
  86. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  87. def take(n: Int): DataStream

    Definition Classes
    DataStream
  88. def takeWhile(fieldName: String, p: (Any) ⇒ Boolean): DataStream

    Definition Classes
    DataStream
  89. def takeWhile(p: (Row) ⇒ Boolean): DataStream

    Definition Classes
    DataStream
  90. def tee(schema: StructType, fn: (Row) ⇒ Seq[Row]): (DataStream, DataStream)

    Invoking this method returns two DataStreams. The first is the original datastream which will continue as is. The second is a DataStream which is fed by rows generated from the given function. The function is invoked for each row that passes through this stream.

    Cancellation requests in the tee'd datastream do not propagate back to the original stream.
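
    The relationship between the two returned streams can be modelled with eager collections. This sketch is hypothetical and ignores streaming and cancellation entirely; it only shows that the first result is the unchanged input and the second holds whatever the function emits for each row.

```scala
object TeeSketch {
  // Returns the original rows unchanged, plus the rows that `fn`
  // generates for each input row, in input order.
  def tee[A, B](rows: Vector[A])(fn: A => Seq[B]): (Vector[A], Vector[B]) =
    (rows, rows.flatMap(fn))
}
```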

    Definition Classes
    DataStream
  91. def to(sink: Sink, parallelism: Int): Long

    Definition Classes
    DataStream
  92. def to(sink: Sink): Long

    Definition Classes
    DataStream
  93. def toDataTable: DataTable

    Definition Classes
    DataStream
  94. def toSet: Set[Row]

    Definition Classes
    DataStream
  95. def toString(): String

    Definition Classes
    AnyRef → Any
  96. def toVector: Vector[Row]

    Action which results in all the rows being returned in memory as a Vector. Alias for 'collect'.

    Definition Classes
    DataStream
  97. def union(other: DataStream): DataStream

    Definition Classes
    DataStream
  98. def update(from: String, target: Any): DataStream

    For each row, any values that match "from" will be replaced with "target". This operation applies to all fields for all rows.

    Definition Classes
    DataStream
  99. def update(fieldName: String, from: String, target: Any, errorIfUnknownField: Boolean = true): DataStream

    Replaces any values that match "from" with the value "target". This operation only applies to the field name specified.

    errorIfUnknownField

    If true, throws an exception if the specified field does not exist in the dataset. If set to false, then this operation will be a no-op if the field does not exist.
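
    The interaction between the replacement and the errorIfUnknownField flag can be sketched with plain collections. SimpleStream and this update function are hypothetical stand-ins, not the eel API, and the exception type is an assumption.

```scala
object UpdateSketch {
  // Hypothetical stand-in for a stream: field names plus buffered rows.
  final case class SimpleStream(fields: Vector[String], rows: Vector[Vector[Any]])

  def update(stream: SimpleStream, fieldName: String, from: String, target: Any,
             errorIfUnknownField: Boolean = true): SimpleStream = {
    val idx = stream.fields.indexOf(fieldName)
    if (idx < 0) {
      // Unknown field: either fail, or return the stream untouched (no-op).
      if (errorIfUnknownField) throw new IllegalArgumentException(s"Unknown field $fieldName")
      else stream
    } else {
      // Only the named field is inspected; other cells are left as they are.
      stream.copy(rows = stream.rows.map { row =>
        if (row(idx) == from) row.updated(idx, target) else row
      })
    }
  }
}
```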

    Definition Classes
    DataStream
  100. def update(fieldName: String, from: String, target: Any): DataStream

    Definition Classes
    DataStream
  101. def update(fieldName: String, fn: (Any) ⇒ Any, errorIfUnknownField: Boolean): DataStream

    Definition Classes
    DataStream
  102. def update(fieldName: String, fn: (Any) ⇒ Any): DataStream

    For each row, the given function is applied to the value of the field named fieldName. The result of the function is the new value for that cell.

    Definition Classes
    DataStream
  103. def using[T, U <: AnyRef { def close(): Unit }](closeable: U)(f: (U) ⇒ T): T

    Definition Classes
    Using
  104. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  105. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  106. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  107. def withLowerCaseSchema(): DataStream

    Definition Classes
    DataStream

Deprecated Value Members

  1. def addField(name: String, fn: (Row) ⇒ Any, errorIfFieldExists: Boolean): DataStream

    Definition Classes
    DataStream
    Annotations
    @deprecated
    Deprecated

    (Since version 1.3.0) Use addFieldFn for better type inference

  2. def addField(name: String, fn: (Row) ⇒ Any): DataStream

    Returns a new DataStream with a new field added at the end. The datatype for the field is assumed to be String. The value for the field is taken from the function which is invoked for each row.

    Definition Classes
    DataStream
    Annotations
    @deprecated
    Deprecated

    (Since version 1.3.0) Use addFieldFn for better type inference

  3. def addField(field: Field, fn: (Row) ⇒ Any, errorIfFieldExists: Boolean): DataStream

    Definition Classes
    DataStream
    Annotations
    @deprecated
    Deprecated

    (Since version 1.3.0) use addFieldFn

  4. def addField(field: Field, fn: (Row) ⇒ Any): DataStream

    Definition Classes
    DataStream
    Annotations
    @deprecated
    Deprecated

    (Since version 1.3.0) use addFieldFn

  5. def addFieldIfNotExists(field: Field, defaultValue: Any): DataStream

    Definition Classes
    DataStream
    Annotations
    @deprecated
    Deprecated

    (Since version 1.3.0) use addField with errorIfFieldExists = false

  6. def addFieldIfNotExists(name: String, defaultValue: Any): DataStream

    Definition Classes
    DataStream
    Annotations
    @deprecated
    Deprecated

    (Since version 1.3.0) use addField with errorIfFieldExists = false
