object deserialized
Methods on TypedDataset[T] that go through a full serialization and deserialization of T, and execute outside of the Catalyst runtime.
The correct way to do a projection on a single column is to use the select method as follows:
ds: TypedDataset[(String, String, String)] -> ds.select(ds('_2)).run()
Spark provides an alternative way to obtain the same resulting Dataset, using the map method:
ds: TypedDataset[(String, String, String)] -> ds.deserialized.map(_._2).run()
This second approach is, however, substantially slower than the first one and should be avoided whenever possible. Under the hood, this map deserializes the entire Tuple3 into a full JVM object, calls the apply method of the _._2 closure on it, and serializes the resulting String back to its Catalyst representation.
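The two projections described above can be compared side by side. The following is a minimal sketch, not the library's reference usage: it assumes a frameless version in which TypedDataset.create, select, deserialized.map and collect().run() are available with the default Job effect, and it uses a local SparkSession. The object name ProjectionSketch and the sample rows are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import frameless.TypedDataset
import frameless.syntax._

// Hypothetical helper object, introduced only for this illustration.
object ProjectionSketch {
  // Returns the second column obtained via select and via deserialized.map.
  def run(): (Seq[String], Seq[String]) = {
    // Assumption: a local, in-process SparkSession is good enough for a sketch.
    implicit val spark: SparkSession =
      SparkSession.builder().master("local[*]").appName("projection-sketch").getOrCreate()
    try {
      val ds: TypedDataset[(String, String, String)] =
        TypedDataset.create(Seq(("a", "b", "c"), ("d", "e", "f")))

      // Column projection: expressed as a Catalyst expression on column _2.
      val viaSelect: TypedDataset[String] = ds.select(ds('_2))

      // Closure-based projection: each row is deserialized to a Tuple3,
      // the closure runs on the JVM object, and the String is re-serialized.
      val viaMap: TypedDataset[String] = ds.deserialized.map(_._2)

      (viaSelect.collect().run(), viaMap.collect().run())
    } finally spark.stop()
  }
}
```

Both values hold the same elements; only the execution path differs, which is why select is preferred for plain column projections.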
Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @native()
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef → Any
- def filter(func: (T) => Boolean): TypedDataset[T]
Returns a new TypedDataset that only contains elements where func returns true.
apache/spark
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable])
- def flatMap[U](func: (T) => TraversableOnce[U])(implicit arg0: TypedEncoder[U]): TypedDataset[U]
Returns a new TypedDataset by first applying a function to all elements of this TypedDataset, and then flattening the results.
apache/spark
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- def map[U](func: (T) => U)(implicit arg0: TypedEncoder[U]): TypedDataset[U]
Returns a new TypedDataset that contains the result of applying func to each element.
apache/spark
- def mapPartitions[U](func: (Iterator[T]) => Iterator[U])(implicit arg0: TypedEncoder[U]): TypedDataset[U]
Returns a new TypedDataset that contains the result of applying func to each partition.
apache/spark
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- def reduceOption[F[_]](func: (T, T) => T)(implicit F: SparkDelay[F]): F[Option[T]]
Optionally reduces the elements of this TypedDataset using the specified binary function. The given func must be commutative and associative or the result may be non-deterministic.
Differs from Dataset#reduce by wrapping its result into an Option and an effect-suspending F.
- final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- def toString(): String
- Definition Classes
- AnyRef → Any
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()
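To illustrate the closure-based members listed above (filter, map, flatMap, mapPartitions and reduceOption), here is a short sketch over a small TypedDataset[Int]. It assumes the signatures shown in this listing, the default Job effect for reduceOption and collect, and a local SparkSession; the names DeserializedSketch and Results are hypothetical and exist only for this example.

```scala
import org.apache.spark.sql.SparkSession
import frameless.TypedDataset
import frameless.syntax._

// Hypothetical container object, introduced only for this illustration.
object DeserializedSketch {
  final case class Results(
    evens: Seq[Int],
    doubled: Seq[Int],
    repeated: Seq[Int],
    partitionSums: Seq[Int],
    total: Option[Int]
  )

  def run(): Results = {
    // Assumption: a local, in-process SparkSession is good enough for a sketch.
    implicit val spark: SparkSession =
      SparkSession.builder().master("local[*]").appName("deserialized-sketch").getOrCreate()
    try {
      val ds: TypedDataset[Int] = TypedDataset.create(Seq(1, 2, 3, 4, 5))

      // Keep only the elements for which the predicate returns true.
      val evens = ds.deserialized.filter(_ % 2 == 0)

      // Apply a function to each element.
      val doubled = ds.deserialized.map(_ * 2)

      // Apply a function to each element, then flatten the results.
      val repeated = ds.deserialized.flatMap(i => Seq(i, i))

      // Apply a function to each partition; here, one sum per partition.
      val partitionSums = ds.deserialized.mapPartitions(it => Iterator(it.sum))

      // Addition is commutative and associative, as reduceOption requires.
      val total = ds.deserialized.reduceOption(_ + _).run()

      Results(
        evens.collect().run(),
        doubled.collect().run(),
        repeated.collect().run(),
        partitionSums.collect().run(),
        total
      )
    } finally spark.stop()
  }
}
```

The number of entries in partitionSums depends on how Spark partitions the data, so only their overall sum is stable across runs. Each of these calls pays the full deserialization and serialization cost described at the top of this page, so a column-based alternative is preferable when one exists.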