org.apache.spark.sql.api.java

JavaSchemaRDD

class JavaSchemaRDD extends JavaRDDLike[Row, JavaRDD[Row]] with SchemaRDDLike

An RDD of Row objects that is returned as the result of a Spark SQL query. In addition to standard RDD operations, a JavaSchemaRDD can also be registered as a table in the JavaSQLContext that was used to create it. Registering a JavaSchemaRDD allows its contents to be queried in future SQL statements.

Linear Supertypes
SchemaRDDLike, JavaRDDLike[Row, JavaRDD[Row]], Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new JavaSchemaRDD(sqlContext: SQLContext, baseLogicalPlan: LogicalPlan)

Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  5. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  6. def aggregate[U](zeroValue: U)(seqOp: Function2[U, Row, U], combOp: Function2[U, U, U]): U

    Definition Classes
    JavaRDDLike
  7. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  8. val baseLogicalPlan: LogicalPlan

    Definition Classes
    JavaSchemaRDD → SchemaRDDLike
  9. def cache(): JavaSchemaRDD

    Persist this RDD with the default storage level (MEMORY_ONLY).

  10. def cartesian[U](other: JavaRDDLike[U, _]): JavaPairRDD[Row, U]

    Definition Classes
    JavaRDDLike
  11. def checkpoint(): Unit

    Definition Classes
    JavaRDDLike
  12. val classTag: ClassTag[Row]

    Definition Classes
    JavaSchemaRDD → JavaRDDLike
  13. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  14. def coalesce(numPartitions: Int, shuffle: Boolean = false): JavaSchemaRDD

    Return a new RDD that is reduced into numPartitions partitions.

  15. def collect(): List[Row]

    Definition Classes
    JavaRDDLike
  16. def collectPartitions(partitionIds: Array[Int]): Array[List[Row]]

    Definition Classes
    JavaRDDLike
  17. def context: SparkContext

    Definition Classes
    JavaRDDLike
  18. def count(): Long

    Definition Classes
    JavaRDDLike
  19. def countApprox(timeout: Long): PartialResult[BoundedDouble]

    Definition Classes
    JavaRDDLike
    Annotations
    @Experimental()
  20. def countApprox(timeout: Long, confidence: Double): PartialResult[BoundedDouble]

    Definition Classes
    JavaRDDLike
    Annotations
    @Experimental()
  21. def countApproxDistinct(relativeSD: Double): Long

    Definition Classes
    JavaRDDLike
  22. def countByValue(): Map[Row, Long]

    Definition Classes
    JavaRDDLike
  23. def countByValueApprox(timeout: Long): PartialResult[Map[Row, BoundedDouble]]

    Definition Classes
    JavaRDDLike
  24. def countByValueApprox(timeout: Long, confidence: Double): PartialResult[Map[Row, BoundedDouble]]

    Definition Classes
    JavaRDDLike
  25. def distinct(numPartitions: Int): JavaSchemaRDD

    Return a new RDD containing the distinct elements in this RDD.

  26. def distinct(): JavaSchemaRDD

    Return a new RDD containing the distinct elements in this RDD.

  27. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  28. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  29. def filter(f: Function[Row, Boolean]): JavaSchemaRDD

    Return a new RDD containing only the elements that satisfy a predicate.

  30. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  31. def first(): Row

    Definition Classes
    JavaRDDLike
  32. def flatMap[U](f: FlatMapFunction[Row, U]): JavaRDD[U]

    Definition Classes
    JavaRDDLike
  33. def flatMapToDouble(f: DoubleFlatMapFunction[Row]): JavaDoubleRDD

    Definition Classes
    JavaRDDLike
  34. def flatMapToPair[K2, V2](f: PairFlatMapFunction[Row, K2, V2]): JavaPairRDD[K2, V2]

    Definition Classes
    JavaRDDLike
  35. def fold(zeroValue: Row)(f: Function2[Row, Row, Row]): Row

    Definition Classes
    JavaRDDLike
  36. def foreach(f: VoidFunction[Row]): Unit

    Definition Classes
    JavaRDDLike
  37. def foreachPartition(f: VoidFunction[Iterator[Row]]): Unit

    Definition Classes
    JavaRDDLike
  38. def getCheckpointFile(): Optional[String]

    Definition Classes
    JavaRDDLike
  39. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  40. def getStorageLevel: StorageLevel

    Definition Classes
    JavaRDDLike
  41. def glom(): JavaRDD[List[Row]]

    Definition Classes
    JavaRDDLike
  42. def groupBy[K](f: Function[Row, K], numPartitions: Int): JavaPairRDD[K, Iterable[Row]]

    Definition Classes
    JavaRDDLike
  43. def groupBy[K](f: Function[Row, K]): JavaPairRDD[K, Iterable[Row]]

    Definition Classes
    JavaRDDLike
  44. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  45. def id: Int

    Definition Classes
    JavaRDDLike
  46. def insertInto(tableName: String): Unit

    :: Experimental :: Appends the rows from this RDD to the specified table.

    Definition Classes
    SchemaRDDLike
    Annotations
    @Experimental()
  47. def insertInto(tableName: String, overwrite: Boolean): Unit

    :: Experimental :: Adds the rows from this RDD to the specified table, optionally overwriting the existing data.

    Definition Classes
    SchemaRDDLike
    Annotations
    @Experimental()
  48. def intersection(other: JavaSchemaRDD, numPartitions: Int): JavaSchemaRDD

    Return the intersection of this RDD and another one. The output will not contain any duplicate elements, even if the input RDDs did. Performs a hash partition across the cluster.

    Note that this method performs a shuffle internally.

    numPartitions

    How many partitions to use in the resulting RDD

  49. def intersection(other: JavaSchemaRDD, partitioner: Partitioner): JavaSchemaRDD

    Return the intersection of this RDD and another one. The output will not contain any duplicate elements, even if the input RDDs did.

    Note that this method performs a shuffle internally.

    partitioner

    Partitioner to use for the resulting RDD

  50. def intersection(other: JavaSchemaRDD): JavaSchemaRDD

    Return the intersection of this RDD and another one. The output will not contain any duplicate elements, even if the input RDDs did.

    Note that this method performs a shuffle internally.

  51. def isCheckpointed: Boolean

    Definition Classes
    JavaRDDLike
  52. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  53. def iterator(split: Partition, taskContext: TaskContext): Iterator[Row]

    Definition Classes
    JavaRDDLike
  54. def keyBy[K](f: Function[Row, K]): JavaPairRDD[K, Row]

    Definition Classes
    JavaRDDLike
  55. val logicalPlan: LogicalPlan

    Attributes
    protected[org.apache.spark]
    Definition Classes
    SchemaRDDLike
  56. def map[R](f: Function[Row, R]): JavaRDD[R]

    Definition Classes
    JavaRDDLike
  57. def mapPartitions[U](f: FlatMapFunction[Iterator[Row], U], preservesPartitioning: Boolean): JavaRDD[U]

    Definition Classes
    JavaRDDLike
  58. def mapPartitions[U](f: FlatMapFunction[Iterator[Row], U]): JavaRDD[U]

    Definition Classes
    JavaRDDLike
  59. def mapPartitionsToDouble(f: DoubleFlatMapFunction[Iterator[Row]], preservesPartitioning: Boolean): JavaDoubleRDD

    Definition Classes
    JavaRDDLike
  60. def mapPartitionsToDouble(f: DoubleFlatMapFunction[Iterator[Row]]): JavaDoubleRDD

    Definition Classes
    JavaRDDLike
  61. def mapPartitionsToPair[K2, V2](f: PairFlatMapFunction[Iterator[Row], K2, V2], preservesPartitioning: Boolean): JavaPairRDD[K2, V2]

    Definition Classes
    JavaRDDLike
  62. def mapPartitionsToPair[K2, V2](f: PairFlatMapFunction[Iterator[Row], K2, V2]): JavaPairRDD[K2, V2]

    Definition Classes
    JavaRDDLike
  63. def mapPartitionsWithIndex[R](f: Function2[Integer, Iterator[Row], Iterator[R]], preservesPartitioning: Boolean): JavaRDD[R]

    Definition Classes
    JavaRDDLike
  64. def mapToDouble[R](f: DoubleFunction[Row]): JavaDoubleRDD

    Definition Classes
    JavaRDDLike
  65. def mapToPair[K2, V2](f: PairFunction[Row, K2, V2]): JavaPairRDD[K2, V2]

    Definition Classes
    JavaRDDLike
  66. def max(comp: Comparator[Row]): Row

    Definition Classes
    JavaRDDLike
  67. def min(comp: Comparator[Row]): Row

    Definition Classes
    JavaRDDLike
  68. def name(): String

    Definition Classes
    JavaRDDLike
  69. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  70. final def notify(): Unit

    Definition Classes
    AnyRef
  71. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  72. def persist(newLevel: StorageLevel): JavaSchemaRDD

    Set this RDD's storage level to persist its values across operations after the first time it is computed. This can only be used to assign a new storage level if the RDD does not have a storage level set yet.

  73. def persist(): JavaSchemaRDD

    Persist this RDD with the default storage level (MEMORY_ONLY).

  74. def pipe(command: List[String], env: Map[String, String]): JavaRDD[String]

    Definition Classes
    JavaRDDLike
  75. def pipe(command: List[String]): JavaRDD[String]

    Definition Classes
    JavaRDDLike
  76. def pipe(command: String): JavaRDD[String]

    Definition Classes
    JavaRDDLike
  77. def printSchema(): Unit

    Prints out the schema in the tree format.

    Definition Classes
    SchemaRDDLike
  78. lazy val queryExecution: QueryExecution

    :: DeveloperApi :: A lazily computed query execution workflow. All other RDD operations are passed through to the RDD that is produced by this workflow. This workflow is produced lazily because invoking the whole query optimization pipeline can be expensive.

    The query execution is considered a Developer API as phases may be added or removed in future releases. This execution is only exposed to provide an interface for inspecting the various phases for debugging purposes. Applications should not depend on particular phases existing or producing any specific output, even for exactly the same query.

    Additionally, the RDD exposed by this execution is not designed for consumption by end users. In particular, it does not contain any schema information, and it reuses Row objects internally. This object reuse improves performance, but can make programming against the RDD more difficult. Instead, end users should perform RDD operations on a SchemaRDD directly.

    Definition Classes
    SchemaRDDLike
  79. val rdd: RDD[Row]

    Definition Classes
    JavaSchemaRDD → JavaRDDLike
  80. def reduce(f: Function2[Row, Row, Row]): Row

    Definition Classes
    JavaRDDLike
  81. def registerAsTable(tableName: String): Unit

    Registers this RDD as a temporary table using the given name. The lifetime of this temporary table is tied to the SQLContext that was used to create this SchemaRDD.

    Definition Classes
    SchemaRDDLike
  82. def repartition(numPartitions: Int): JavaSchemaRDD

    Return a new RDD that has exactly numPartitions partitions.

    Can increase or decrease the level of parallelism in this RDD. Internally, this uses a shuffle to redistribute data.

    If you are decreasing the number of partitions in this RDD, consider using coalesce, which can avoid performing a shuffle.

  83. def saveAsObjectFile(path: String): Unit

    Definition Classes
    JavaRDDLike
  84. def saveAsParquetFile(path: String): Unit

    Saves the contents of this SchemaRDD as a parquet file, preserving the schema. Files that are written out using this method can be read back in as a SchemaRDD using the parquetFile function.

    Definition Classes
    SchemaRDDLike
  85. def saveAsTable(tableName: String): Unit

    :: Experimental :: Creates a table from the contents of this SchemaRDD. This will fail if the table already exists.

    Note that this currently only works with SchemaRDDs that are created from a HiveContext as there is no notion of a persisted catalog in a standard SQL context. Instead you can write an RDD out to a parquet file, and then register that file as a table. This "table" can then be the target of an insertInto.

    Definition Classes
    SchemaRDDLike
    Annotations
    @Experimental()
  86. def saveAsTextFile(path: String, codec: Class[_ <: CompressionCodec]): Unit

    Definition Classes
    JavaRDDLike
  87. def saveAsTextFile(path: String): Unit

    Definition Classes
    JavaRDDLike
  88. def schemaString: String

    Returns the output schema in the tree format.

    Definition Classes
    SchemaRDDLike
  89. def setName(name: String): JavaSchemaRDD

    Assign a name to this RDD.

  90. def splits: List[Partition]

    Definition Classes
    JavaRDDLike
  91. val sqlContext: SQLContext

    Definition Classes
    JavaSchemaRDD → SchemaRDDLike
  92. def subtract(other: JavaSchemaRDD, p: Partitioner): JavaSchemaRDD

    Return an RDD with the elements from this that are not in other.

  93. def subtract(other: JavaSchemaRDD, numPartitions: Int): JavaSchemaRDD

    Return an RDD with the elements from this that are not in other.

  94. def subtract(other: JavaSchemaRDD): JavaSchemaRDD

    Return an RDD with the elements from this that are not in other.

    Uses this RDD's partitioner/partition size, because even if other is huge, the resulting RDD will be no larger than this one.

  95. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  96. def take(num: Int): List[Row]

    Definition Classes
    JavaRDDLike
  97. def takeOrdered(num: Int): List[Row]

    Definition Classes
    JavaRDDLike
  98. def takeOrdered(num: Int, comp: Comparator[Row]): List[Row]

    Definition Classes
    JavaRDDLike
  99. def takeSample(withReplacement: Boolean, num: Int, seed: Long): List[Row]

    Definition Classes
    JavaRDDLike
  100. def takeSample(withReplacement: Boolean, num: Int): List[Row]

    Definition Classes
    JavaRDDLike
  101. def toArray(): List[Row]

    Definition Classes
    JavaRDDLike
    Annotations
    @Deprecated
  102. def toDebugString(): String

    Definition Classes
    JavaRDDLike
  103. def toLocalIterator(): Iterator[Row]

    Definition Classes
    JavaRDDLike
  104. def toString(): String

    Definition Classes
    JavaSchemaRDD → SchemaRDDLike → AnyRef → Any
  105. def top(num: Int): List[Row]

    Definition Classes
    JavaRDDLike
  106. def top(num: Int, comp: Comparator[Row]): List[Row]

    Definition Classes
    JavaRDDLike
  107. def unpersist(blocking: Boolean = true): JavaSchemaRDD

    Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.

    blocking

    Whether to block until all blocks are deleted.

    returns

    This RDD.

  108. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  109. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  110. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  111. def wrapRDD(rdd: RDD[Row]): JavaRDD[Row]

    Definition Classes
    JavaSchemaRDD → JavaRDDLike
  112. def zip[U](other: JavaRDDLike[U, _]): JavaPairRDD[Row, U]

    Definition Classes
    JavaRDDLike
  113. def zipPartitions[U, V](other: JavaRDDLike[U, _], f: FlatMapFunction2[Iterator[Row], Iterator[U], V]): JavaRDD[V]

    Definition Classes
    JavaRDDLike
  114. def zipWithIndex(): JavaPairRDD[Row, Long]

    Definition Classes
    JavaRDDLike
  115. def zipWithUniqueId(): JavaPairRDD[Row, Long]

    Definition Classes
    JavaRDDLike
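
The intersection and subtract members listed above describe set-style semantics: intersection returns the elements found in both RDDs with duplicates removed, and subtract returns the elements of this RDD that are not in the other. The following is a minimal plain-Java sketch of those semantics on in-memory lists. It does not use the Spark API. The class name SetOpsSketch and its helper methods are hypothetical, introduced only for illustration. The documentation above does not say how subtract treats duplicates in this RDD, so the sketch keeps them as an assumption. The element ordering shown is a property of the sketch, not something the documentation promises.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Plain-Java model of the documented semantics; NOT the Spark API.
// SetOpsSketch and its methods are hypothetical names used for illustration.
final class SetOpsSketch {

    // Models intersection: elements present in both inputs, duplicates removed.
    static <T> List<T> intersection(List<T> left, List<T> right) {
        Set<T> result = new LinkedHashSet<>(left);   // drops duplicates from left
        result.retainAll(new HashSet<>(right));      // keeps only elements also in right
        return new ArrayList<>(result);
    }

    // Models subtract: elements of left that do not occur in right.
    // Assumption: duplicates in left that survive the filter are kept.
    static <T> List<T> subtract(List<T> left, List<T> right) {
        Set<T> exclude = new HashSet<>(right);
        List<T> result = new ArrayList<>();
        for (T item : left) {
            if (!exclude.contains(item)) {
                result.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("x", "x", "y", "z");
        List<String> b = Arrays.asList("y", "y", "x", "w");
        System.out.println(intersection(a, b)); // prints [x, y]
        System.out.println(subtract(a, b));     // prints [z]
    }
}
```

In both cases the result can never contain more elements than the left-hand input, which matches the reasoning given for subtract reusing this RDD's partitioner/partition size.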
