org.apache.spark.sql.api.java

JavaSchemaRDD

class JavaSchemaRDD extends JavaRDDLike[Row, JavaRDD[Row]] with SchemaRDDLike

An RDD of Row objects that is returned as the result of a Spark SQL query. In addition to standard RDD operations, a JavaSchemaRDD can also be registered as a table in the JavaSQLContext that was used to create it. Registering a JavaSchemaRDD allows its contents to be queried in future SQL statements.

Linear Supertypes
SchemaRDDLike, JavaRDDLike[Row, JavaRDD[Row]], Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new JavaSchemaRDD(sqlContext: SQLContext, baseLogicalPlan: LogicalPlan)

Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  5. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  6. def aggregate[U](zeroValue: U)(seqOp: Function2[U, Row, U], combOp: Function2[U, U, U]): U

    Definition Classes
    JavaRDDLike
  7. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  8. val baseLogicalPlan: LogicalPlan

    Definition Classes
    JavaSchemaRDD → SchemaRDDLike
  9. def cache(): JavaSchemaRDD

    Persist this RDD with the default storage level (MEMORY_ONLY).

  10. def cartesian[U](other: JavaRDDLike[U, _]): JavaPairRDD[Row, U]

    Definition Classes
    JavaRDDLike
  11. def checkpoint(): Unit

    Definition Classes
    JavaRDDLike
  12. val classTag: ClassTag[Row]

    Definition Classes
    JavaSchemaRDD → JavaRDDLike
  13. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  14. def coalesce(numPartitions: Int, shuffle: Boolean = false): JavaSchemaRDD

    Return a new RDD that is reduced into numPartitions partitions.

  15. def collect(): List[Row]

    Definition Classes
    JavaSchemaRDD → JavaRDDLike
  16. def collectAsync(): JavaFutureAction[List[Row]]

    Definition Classes
    JavaRDDLike
  17. def collectPartitions(partitionIds: Array[Int]): Array[List[Row]]

    Definition Classes
    JavaRDDLike
  18. def context: SparkContext

    Definition Classes
    JavaRDDLike
  19. def count(): Long

    Definition Classes
    JavaSchemaRDD → JavaRDDLike
  20. def countApprox(timeout: Long): PartialResult[BoundedDouble]

    Definition Classes
    JavaRDDLike
    Annotations
    @Experimental()
  21. def countApprox(timeout: Long, confidence: Double): PartialResult[BoundedDouble]

    Definition Classes
    JavaRDDLike
    Annotations
    @Experimental()
  22. def countApproxDistinct(relativeSD: Double): Long

    Definition Classes
    JavaRDDLike
  23. def countAsync(): JavaFutureAction[Long]

    Definition Classes
    JavaRDDLike
  24. def countByValue(): Map[Row, Long]

    Definition Classes
    JavaRDDLike
  25. def countByValueApprox(timeout: Long): PartialResult[Map[Row, BoundedDouble]]

    Definition Classes
    JavaRDDLike
  26. def countByValueApprox(timeout: Long, confidence: Double): PartialResult[Map[Row, BoundedDouble]]

    Definition Classes
    JavaRDDLike
  27. def distinct(numPartitions: Int): JavaSchemaRDD

    Return a new RDD containing the distinct elements in this RDD.

  28. def distinct(): JavaSchemaRDD

    Return a new RDD containing the distinct elements in this RDD.

  29. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  30. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  31. def filter(f: Function[Row, Boolean]): JavaSchemaRDD

    Return a new RDD containing only the elements that satisfy a predicate.

  32. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  33. def first(): Row

    Definition Classes
    JavaRDDLike
  34. def flatMap[U](f: FlatMapFunction[Row, U]): JavaRDD[U]

    Definition Classes
    JavaRDDLike
  35. def flatMapToDouble(f: DoubleFlatMapFunction[Row]): JavaDoubleRDD

    Definition Classes
    JavaRDDLike
  36. def flatMapToPair[K2, V2](f: PairFlatMapFunction[Row, K2, V2]): JavaPairRDD[K2, V2]

    Definition Classes
    JavaRDDLike
  37. def fold(zeroValue: Row)(f: Function2[Row, Row, Row]): Row

    Definition Classes
    JavaRDDLike
  38. def foreach(f: VoidFunction[Row]): Unit

    Definition Classes
    JavaRDDLike
  39. def foreachAsync(f: VoidFunction[Row]): JavaFutureAction[Void]

    Definition Classes
    JavaRDDLike
  40. def foreachPartition(f: VoidFunction[Iterator[Row]]): Unit

    Definition Classes
    JavaRDDLike
  41. def foreachPartitionAsync(f: VoidFunction[Iterator[Row]]): JavaFutureAction[Void]

    Definition Classes
    JavaRDDLike
  42. def getCheckpointFile(): Optional[String]

    Definition Classes
    JavaRDDLike
  43. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  44. def getStorageLevel: StorageLevel

    Definition Classes
    JavaRDDLike
  45. def glom(): JavaRDD[List[Row]]

    Definition Classes
    JavaRDDLike
  46. def groupBy[U](f: Function[Row, U], numPartitions: Int): JavaPairRDD[U, Iterable[Row]]

    Definition Classes
    JavaRDDLike
  47. def groupBy[U](f: Function[Row, U]): JavaPairRDD[U, Iterable[Row]]

    Definition Classes
    JavaRDDLike
  48. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  49. def id: Int

    Definition Classes
    JavaRDDLike
  50. def insertInto(tableName: String): Unit

    :: Experimental :: Appends the rows from this RDD to the specified table.

    Definition Classes
    SchemaRDDLike
    Annotations
    @Experimental()
  51. def insertInto(tableName: String, overwrite: Boolean): Unit

    :: Experimental :: Adds the rows from this RDD to the specified table, optionally overwriting the existing data.

    Definition Classes
    SchemaRDDLike
    Annotations
    @Experimental()
  52. def intersection(other: JavaSchemaRDD, numPartitions: Int): JavaSchemaRDD

    Return the intersection of this RDD and another one. The output will not contain any duplicate elements, even if the input RDDs did. Performs a hash partition across the cluster.

    Note that this method performs a shuffle internally.

    numPartitions: how many partitions to use in the resulting RDD.

  53. def intersection(other: JavaSchemaRDD, partitioner: Partitioner): JavaSchemaRDD

    Return the intersection of this RDD and another one. The output will not contain any duplicate elements, even if the input RDDs did.

    Note that this method performs a shuffle internally.

    partitioner: Partitioner to use for the resulting RDD.

  54. def intersection(other: JavaSchemaRDD): JavaSchemaRDD

    Return the intersection of this RDD and another one. The output will not contain any duplicate elements, even if the input RDDs did.

    Note that this method performs a shuffle internally.

  55. def isCheckpointed: Boolean

    Definition Classes
    JavaRDDLike
  56. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  57. def iterator(split: Partition, taskContext: TaskContext): Iterator[Row]

    Definition Classes
    JavaRDDLike
  58. def keyBy[U](f: Function[Row, U]): JavaPairRDD[U, Row]

    Definition Classes
    JavaRDDLike
  59. val logicalPlan: LogicalPlan

    Attributes
    protected[org.apache.spark]
    Definition Classes
    SchemaRDDLike
  60. def map[R](f: Function[Row, R]): JavaRDD[R]

    Definition Classes
    JavaRDDLike
  61. def mapPartitions[U](f: FlatMapFunction[Iterator[Row], U], preservesPartitioning: Boolean): JavaRDD[U]

    Definition Classes
    JavaRDDLike
  62. def mapPartitions[U](f: FlatMapFunction[Iterator[Row], U]): JavaRDD[U]

    Definition Classes
    JavaRDDLike
  63. def mapPartitionsToDouble(f: DoubleFlatMapFunction[Iterator[Row]], preservesPartitioning: Boolean): JavaDoubleRDD

    Definition Classes
    JavaRDDLike
  64. def mapPartitionsToDouble(f: DoubleFlatMapFunction[Iterator[Row]]): JavaDoubleRDD

    Definition Classes
    JavaRDDLike
  65. def mapPartitionsToPair[K2, V2](f: PairFlatMapFunction[Iterator[Row], K2, V2], preservesPartitioning: Boolean): JavaPairRDD[K2, V2]

    Definition Classes
    JavaRDDLike
  66. def mapPartitionsToPair[K2, V2](f: PairFlatMapFunction[Iterator[Row], K2, V2]): JavaPairRDD[K2, V2]

    Definition Classes
    JavaRDDLike
  67. def mapPartitionsWithIndex[R](f: Function2[Integer, Iterator[Row], Iterator[R]], preservesPartitioning: Boolean): JavaRDD[R]

    Definition Classes
    JavaRDDLike
  68. def mapToDouble[R](f: DoubleFunction[Row]): JavaDoubleRDD

    Definition Classes
    JavaRDDLike
  69. def mapToPair[K2, V2](f: PairFunction[Row, K2, V2]): JavaPairRDD[K2, V2]

    Definition Classes
    JavaRDDLike
  70. def max(comp: Comparator[Row]): Row

    Definition Classes
    JavaRDDLike
  71. def min(comp: Comparator[Row]): Row

    Definition Classes
    JavaRDDLike
  72. def name(): String

    Definition Classes
    JavaRDDLike
  73. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  74. final def notify(): Unit

    Definition Classes
    AnyRef
  75. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  76. def partitions: List[Partition]

    Definition Classes
    JavaRDDLike
  77. def persist(newLevel: StorageLevel): JavaSchemaRDD

    Set this RDD's storage level to persist its values across operations after the first time it is computed. This can only be used to assign a new storage level if the RDD does not have a storage level set yet.

  78. def persist(): JavaSchemaRDD

    Persist this RDD with the default storage level (MEMORY_ONLY).

  79. def pipe(command: List[String], env: Map[String, String]): JavaRDD[String]

    Definition Classes
    JavaRDDLike
  80. def pipe(command: List[String]): JavaRDD[String]

    Definition Classes
    JavaRDDLike
  81. def pipe(command: String): JavaRDD[String]

    Definition Classes
    JavaRDDLike
  82. def printSchema(): Unit

    Prints out the schema.

    Definition Classes
    SchemaRDDLike
  83. lazy val queryExecution: QueryExecution

    :: DeveloperApi :: A lazily computed query execution workflow. All other RDD operations are passed through to the RDD that is produced by this workflow. This workflow is produced lazily because invoking the whole query optimization pipeline can be expensive.

    The query execution is considered a Developer API as phases may be added or removed in future releases. This execution is only exposed to provide an interface for inspecting the various phases for debugging purposes. Applications should not depend on particular phases existing or producing any specific output, even for exactly the same query.

    Additionally, the RDD exposed by this execution is not designed for consumption by end users. In particular, it does not contain any schema information, and it reuses Row objects internally. This object reuse improves performance but can make programming against the RDD more difficult. Instead, end users should perform RDD operations on a SchemaRDD directly.

    Definition Classes
    SchemaRDDLike
  84. val rdd: RDD[Row]

    Definition Classes
    JavaSchemaRDD → JavaRDDLike
  85. def reduce(f: Function2[Row, Row, Row]): Row

    Definition Classes
    JavaRDDLike
  86. def registerTempTable(tableName: String): Unit

    Registers this RDD as a temporary table using the given name. The lifetime of this temporary table is tied to the SQLContext that was used to create this SchemaRDD.

    Definition Classes
    SchemaRDDLike
  87. def repartition(numPartitions: Int): JavaSchemaRDD

    Return a new RDD that has exactly numPartitions partitions.

    Can increase or decrease the level of parallelism in this RDD. Internally, this uses a shuffle to redistribute data.

    If you are decreasing the number of partitions in this RDD, consider using coalesce, which can avoid performing a shuffle.

  88. def saveAsObjectFile(path: String): Unit

    Definition Classes
    JavaRDDLike
  89. def saveAsParquetFile(path: String): Unit

    Saves the contents of this SchemaRDD as a parquet file, preserving the schema. Files that are written out using this method can be read back in as a SchemaRDD using the parquetFile function.

    Definition Classes
    SchemaRDDLike
  90. def saveAsTable(tableName: String): Unit

    :: Experimental :: Creates a table from the contents of this SchemaRDD. This will fail if the table already exists.

    Note that this currently only works with SchemaRDDs that are created from a HiveContext as there is no notion of a persisted catalog in a standard SQL context. Instead you can write an RDD out to a parquet file, and then register that file as a table. This "table" can then be the target of an insertInto.

    Definition Classes
    SchemaRDDLike
    Annotations
    @Experimental()
  91. def saveAsTextFile(path: String, codec: Class[_ <: CompressionCodec]): Unit

    Definition Classes
    JavaRDDLike
  92. def saveAsTextFile(path: String): Unit

    Definition Classes
    JavaRDDLike
  93. def schema: StructType

    Returns the schema of this JavaSchemaRDD (represented by a StructType).

  94. val schemaRDD: SchemaRDD

    Returns the underlying Scala SchemaRDD.

  95. def schemaString: String

    Returns the schema as a string in the tree format.

    Definition Classes
    SchemaRDDLike
  96. def setName(name: String): JavaSchemaRDD

    Assign a name to this RDD.

  97. val sqlContext: SQLContext

    Definition Classes
    JavaSchemaRDD → SchemaRDDLike
  98. def subtract(other: JavaSchemaRDD, p: Partitioner): JavaSchemaRDD

    Return an RDD with the elements from this that are not in other.

  99. def subtract(other: JavaSchemaRDD, numPartitions: Int): JavaSchemaRDD

    Return an RDD with the elements from this that are not in other.

  100. def subtract(other: JavaSchemaRDD): JavaSchemaRDD

    Return an RDD with the elements from this that are not in other.

    Uses this RDD's partitioner and partition size because, even if other is huge, the resulting RDD will be no larger than this one.

  101. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  102. def take(num: Int): List[Row]

    Definition Classes
    JavaSchemaRDD → JavaRDDLike
  103. def takeAsync(num: Int): JavaFutureAction[List[Row]]

    Definition Classes
    JavaRDDLike
  104. def takeOrdered(num: Int): List[Row]

    Definition Classes
    JavaRDDLike
  105. def takeOrdered(num: Int, comp: Comparator[Row]): List[Row]

    Definition Classes
    JavaRDDLike
  106. def takeSample(withReplacement: Boolean, num: Int, seed: Long): List[Row]

    Definition Classes
    JavaRDDLike
  107. def takeSample(withReplacement: Boolean, num: Int): List[Row]

    Definition Classes
    JavaRDDLike
  108. def toArray(): List[Row]

    Definition Classes
    JavaRDDLike
    Annotations
    @Deprecated
  109. def toDebugString(): String

    Definition Classes
    JavaRDDLike
  110. def toJSON(): JavaRDD[String]

    Returns a new RDD with each row transformed to a JSON string.

  111. def toLocalIterator(): Iterator[Row]

    Definition Classes
    JavaRDDLike
  112. def toString(): String

    Definition Classes
    JavaSchemaRDD → SchemaRDDLike → AnyRef → Any
  113. def top(num: Int): List[Row]

    Definition Classes
    JavaRDDLike
  114. def top(num: Int, comp: Comparator[Row]): List[Row]

    Definition Classes
    JavaRDDLike
  115. def unpersist(blocking: Boolean = true): JavaSchemaRDD

    Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.

    blocking: whether to block until all blocks are deleted.

    returns: this RDD.

  116. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  117. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  118. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  119. def wrapRDD(rdd: RDD[Row]): JavaRDD[Row]

    Definition Classes
    JavaSchemaRDD → JavaRDDLike
  120. def zip[U](other: JavaRDDLike[U, _]): JavaPairRDD[Row, U]

    Definition Classes
    JavaRDDLike
  121. def zipPartitions[U, V](other: JavaRDDLike[U, _], f: FlatMapFunction2[Iterator[Row], Iterator[U], V]): JavaRDD[V]

    Definition Classes
    JavaRDDLike
  122. def zipWithIndex(): JavaPairRDD[Row, Long]

    Definition Classes
    JavaRDDLike
  123. def zipWithUniqueId(): JavaPairRDD[Row, Long]

    Definition Classes
    JavaRDDLike
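
The set-style members listed above (distinct, intersection, subtract) differ in how they treat duplicate rows. The following is a minimal sketch of those semantics using plain Java collections only; it does not call Spark. The class name SetOpsSketch and its methods are hypothetical names chosen for illustration, and integers stand in for Row objects. It assumes that subtract keeps duplicates from the left-hand side, which matches the behavior of the underlying RDD operation but is not stated in the descriptions above. The element order in the results is an artifact of the sketch; an RDD makes no such ordering promise.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration class; it is not part of the Spark API.
public class SetOpsSketch {

    // Models distinct(): every element appears exactly once in the result.
    public static <T> List<T> distinct(List<T> input) {
        return new ArrayList<>(new LinkedHashSet<>(input));
    }

    // Models intersection(): elements present in both inputs,
    // with duplicates removed even if the inputs contained them.
    public static <T> List<T> intersection(List<T> left, List<T> right) {
        Set<T> result = new LinkedHashSet<>(left);
        result.retainAll(right);
        return new ArrayList<>(result);
    }

    // Models subtract(): elements of left that do not occur in right.
    // Assumption: duplicates in left are kept, so the result is never
    // larger than left.
    public static <T> List<T> subtract(List<T> left, List<T> right) {
        Set<T> exclude = new LinkedHashSet<>(right);
        List<T> result = new ArrayList<>();
        for (T element : left) {
            if (!exclude.contains(element)) {
                result.add(element);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 1, 2, 3);
        List<Integer> b = Arrays.asList(1, 3, 3, 4);
        System.out.println(distinct(a));        // [1, 2, 3]
        System.out.println(intersection(a, b)); // [1, 3]
        System.out.println(subtract(a, b));     // [2]
    }
}
```

In this sketch the result of subtract can never hold more elements than its left-hand input, which is why the no-argument subtract can reuse this RDD's partitioning.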

Deprecated Value Members

  1. def registerAsTable(tableName: String): Unit

    Definition Classes
    SchemaRDDLike
    Annotations
    @deprecated
    Deprecated

    (Since version 1.1) Use registerTempTable instead of registerAsTable.

  2. def splits: List[Partition]

    Definition Classes
    JavaRDDLike
    Annotations
    @deprecated
    Deprecated

    (Since version 1.1.0) Use partitions() instead.
