package org.apache.spark.sql

Type Members

  1. class AnalysisException extends Exception with Serializable

    :: DeveloperApi :: Thrown when a query fails to analyze, usually because the query itself is invalid.

    Annotations
    @DeveloperApi()
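
    As a rough illustration of when this exception surfaces, the sketch below selects a column that does not exist. It assumes a local SparkSession; the object name, the helper analysisError, the application name and the column name no_such_column are all hypothetical and chosen only for this example.

    ```scala
    import org.apache.spark.sql.{AnalysisException, SparkSession}

    object AnalysisFailure {
      // Hypothetical helper: returns the analyzer's message if the query is
      // invalid, or None if analysis succeeds.
      def analysisError(spark: SparkSession): Option[String] =
        try {
          // Dataset operations are analyzed eagerly, so an unresolved column
          // is reported here, before any job runs.
          spark.range(3).select("no_such_column")
          None
        } catch {
          case e: AnalysisException => Some(e.getMessage)
        }

      def main(args: Array[String]): Unit = {
        // Assumption: a local master is available; the names are illustrative.
        val spark = SparkSession.builder().master("local[*]").appName("analysis-sketch").getOrCreate()
        analysisError(spark).foreach(msg => println(s"Query failed to analyze: $msg"))
        spark.stop()
      }
    }
    ```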
  2. trait Encoder[T] extends Serializable

    :: Experimental :: Used to convert a JVM object of type T to and from the internal Spark SQL representation.

    Scala

    Encoders are generally created automatically through implicits from a SparkSession, or can be explicitly created by calling static methods on Encoders.

    import spark.implicits._
    
    val ds = Seq(1, 2, 3).toDS() // implicitly provided (spark.implicits.newIntEncoder)
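
    For the explicit route mentioned above, a minimal sketch might look like the following. It assumes a local SparkSession; the case class Person, the object name, the helper build and the application name are hypothetical and exist only for illustration.

    ```scala
    import org.apache.spark.sql.{Dataset, Encoder, Encoders, SparkSession}

    // Hypothetical case class used only for this example.
    case class Person(name: String, age: Int)

    object ExplicitEncoders {
      // Builds two Datasets by passing encoders explicitly instead of
      // importing spark.implicits._
      def build(spark: SparkSession): (Dataset[Int], Dataset[Person]) = {
        val ints = spark.createDataset(Seq(1, 2, 3))(Encoders.scalaInt)

        // Encoders.product derives an encoder for a case class (a Product type).
        val personEncoder: Encoder[Person] = Encoders.product[Person]
        val people = spark.createDataset(Seq(Person("Ann", 30)))(personEncoder)
        (ints, people)
      }

      def main(args: Array[String]): Unit = {
        // Assumption: a local master is available; the names are illustrative.
        val spark = SparkSession.builder().master("local[*]").appName("encoders-sketch").getOrCreate()
        val (ints, people) = build(spark)
        ints.show()
        people.show()
        spark.stop()
      }
    }
    ```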

    Java

    Encoders are specified by calling static methods on Encoders.

    List<String> data = Arrays.asList("abc", "abc", "xyz");
    Dataset<String> ds = context.createDataset(data, Encoders.STRING());

    Encoders can be composed into tuples:

    Encoder<Tuple2<Integer, String>> encoder2 = Encoders.tuple(Encoders.INT(), Encoders.STRING());
    List<Tuple2<Integer, String>> data2 = Arrays.asList(new Tuple2<>(1, "a"));
    Dataset<Tuple2<Integer, String>> ds2 = context.createDataset(data2, encoder2);

    Or constructed from Java Beans:

    Encoders.bean(MyClass.class);

    Implementation

    • Encoders are not required to be thread-safe and thus they do not need to use locks to guard against concurrent access if they reuse internal buffers to improve performance.
    Annotations
    @Experimental() @implicitNotFound( ... )
    Since

    1.6.0

  3. trait Row extends Serializable

    Represents one row of output from a relational operator. Allows both generic access by ordinal, which incurs boxing overhead for primitives, and native primitive access.

    It is invalid to use the native primitive interface to retrieve a value that is null; instead, a user must check isNullAt before attempting to retrieve a value that might be null.

    To create a new Row, use RowFactory.create() in Java or Row.apply() in Scala.

    A Row object can be constructed by providing field values. Example:

    import org.apache.spark.sql._
    
    // Create a Row from values.
    Row(value1, value2, value3, ...)
    // Create a Row from a Seq of values.
    Row.fromSeq(Seq(value1, value2, ...))

    A value of a row can be accessed either generically by ordinal, which incurs boxing overhead for primitives, or through native primitive access. An example of generic access by ordinal:

    import org.apache.spark.sql._
    
    val row = Row(1, true, "a string", null)
    // row: Row = [1,true,a string,null]
    val firstValue = row(0)
    // firstValue: Any = 1
    val fourthValue = row(3)
    // fourthValue: Any = null

    The native primitive interface must not be used to retrieve a value that is null; instead, a user must check isNullAt before attempting to retrieve a value that might be null. An example of native primitive access:

    // using the row from the previous example.
    val firstValue = row.getInt(0)
    // firstValue: Int = 1
    val isNull = row.isNullAt(3)
    // isNull: Boolean = true
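
    One way to combine the isNullAt check with primitive access is to wrap the two calls in a small helper that returns an Option. The sketch below is illustrative only; the object RowAccess and the helper getIntOption are hypothetical names, not part of the Row API.

    ```scala
    import org.apache.spark.sql.Row

    object RowAccess {
      // Hypothetical helper: checks isNullAt first, so getInt is never
      // called on a null field.
      def getIntOption(row: Row, i: Int): Option[Int] =
        if (row.isNullAt(i)) None else Some(row.getInt(i))

      def main(args: Array[String]): Unit = {
        val row = Row(1, true, "a string", null)
        println(getIntOption(row, 0)) // field 0 holds an Int
        println(getIntOption(row, 3)) // field 3 is null
      }
    }
    ```

    Returning an Option keeps the null check and the primitive read together, at the cost of boxing the result.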

    In Scala, fields in a Row object can be extracted in a pattern match. Example:

    import org.apache.spark.sql._
    
    val pairs = sql("SELECT key, value FROM src").rdd.map {
      case Row(key: Int, value: String) =>
        key -> value
    }
  4. class RowFactory extends AnyRef

Value Members

  1. object Encoders

    :: Experimental :: Methods for creating an Encoder.

    Annotations
    @Experimental()
    Since

    1.6.0

  2. object Row extends Serializable

  3. package catalyst

    Catalyst is a library for manipulating relational query plans. All classes in catalyst are considered an internal API to Spark SQL and are subject to change between minor releases.

  4. package execution

  5. package types

    Contains a type system for attributes produced by relations, including complex types like structs, arrays and maps.
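
    To make the struct, array and map types concrete, the following sketch builds a schema that nests all three. The object name and the field names (id, tags, attrs, address, city, zip) are hypothetical and chosen only for illustration.

    ```scala
    import org.apache.spark.sql.types._

    object SchemaSketch {
      // A schema with a primitive field, an array, a map and a nested struct.
      val schema: StructType = StructType(Seq(
        StructField("id", LongType, nullable = false),
        StructField("tags", ArrayType(StringType)),
        StructField("attrs", MapType(StringType, IntegerType)),
        StructField("address", StructType(Seq(
          StructField("city", StringType),
          StructField("zip", StringType)
        )))
      ))

      def main(args: Array[String]): Unit =
        // Prints the schema as an indented tree.
        println(schema.treeString)
    }
    ```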
