Creates a SparkSession.Builder.
See UnderlyingSparkSession.builder for more information.
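A typical way to obtain a SparkSession is through the builder; a minimal sketch (the application name and master URL below are placeholders):

```scala
import org.apache.spark.sql.SparkSession

// Build (or reuse) a session; getOrCreate returns the active session if one exists.
val spark = SparkSession.builder
  .appName("ExampleApp")   // hypothetical application name
  .master("local[*]")      // run locally on all cores; omit when submitting to a cluster
  .getOrCreate()
```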
Closes the current SparkSession.
Creates a DataFrame from an RDD containing Rows, using the given schema. It is important to make sure that the structure of every Row of the provided RDD matches the provided schema; otherwise, a runtime exception will be thrown. Example:

```scala
import org.apache.spark.sql._
import org.apache.spark.sql.types._

val sparkSession = new org.apache.spark.sql.SparkSession(sc)

val schema = StructType(
  StructField("name", StringType, false) ::
  StructField("age", IntegerType, true) :: Nil)

val people = sc.textFile("examples/src/main/resources/people.txt")
  .map(_.split(","))
  .map(p => Row(p(0), p(1).trim.toInt))

val dataFrame = sparkSession.createDataFrame(people, schema)
dataFrame.printSchema
// root
// |-- name: string (nullable = false)
// |-- age: integer (nullable = true)

dataFrame.createOrReplaceTempView("people")
sparkSession.sql("select name from people").collect.foreach(println)
```
Since: 2.0.0
Creates a DataFrame from a local Seq of Product.
Since: 2.0.0
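For example, a DataFrame can be built directly from a local Seq of case class instances; a sketch assuming an active session `spark` with its implicits imported:

```scala
import spark.implicits._

// Any Product type works here: a case class or a tuple.
case class Person(name: String, age: Long)

val df = spark.createDataFrame(Seq(Person("Alice", 32), Person("Bob", 41)))
// Column names are derived from the case class fields: name, age
df.printSchema()
```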
Creates a DataFrame from an RDD of Product (e.g. case classes, tuples).
Since: 2.0.0
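A sketch of the RDD variant, assuming `spark` is a SparkSession and `sc` its SparkContext:

```scala
import spark.implicits._

// Tuples are Products, so columns default to _1, _2; toDF renames them.
val rdd = sc.parallelize(Seq(("Alice", 32), ("Bob", 41)))
val df = spark.createDataFrame(rdd).toDF("name", "age")
```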
Creates a Dataset from an RDD of a given type. This method requires an encoder (to convert a JVM object of type T to and from the internal Spark SQL representation) that is generally created automatically through implicits from a SparkSession, or can be created explicitly by calling static methods on Encoders.
Since: 2.0.0
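A sketch, assuming `spark` and its SparkContext `sc` are available; the encoder for the case class is derived through `spark.implicits._`:

```scala
import spark.implicits._

case class Event(id: Long, label: String)

val rdd = sc.parallelize(Seq(Event(1L, "start"), Event(2L, "stop")))
val ds = spark.createDataset(rdd)  // Dataset[Event]; encoder supplied implicitly
```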
Creates a Dataset from a local Seq of data of a given type. This method requires an encoder (to convert a JVM object of type T to and from the internal Spark SQL representation) that is generally created automatically through implicits from a SparkSession, or can be created explicitly by calling static methods on Encoders.

```scala
import spark.implicits._

case class Person(name: String, age: Long)

val data = Seq(Person("Michael", 29), Person("Andy", 30), Person("Justin", 19))
val ds = spark.createDataset(data)
ds.show()
// +-------+---+
// |   name|age|
// +-------+---+
// |Michael| 29|
// |   Andy| 30|
// | Justin| 19|
// +-------+---+
```
Since: 2.0.0
Creates a new Dataset of type T containing zero elements.
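A sketch; `emptyDataset` still needs an encoder for T, typically supplied via implicits:

```scala
import spark.implicits._

val empty = spark.emptyDataset[String]
empty.count()  // 0
```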
Creates a DataFrameReader.
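For example, the reader can load files in various formats into a DataFrame (the path below is a placeholder):

```scala
// Read a JSON file; the schema is inferred unless one is provided.
val df = spark.read.json("path/to/people.json")  // placeholder path
```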
Creates a DataStreamReader.
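A sketch of a streaming read with Structured Streaming (the directory below is a placeholder):

```scala
import org.apache.spark.sql.types._

// Streaming file sources require an explicit schema.
val schema = StructType(StructField("name", StringType, true) :: Nil)
val stream = spark.readStream
  .schema(schema)
  .json("path/to/input-dir")  // placeholder directory watched for new files
```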
Executes a SQL query using Spark.
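For example, after registering a temporary view, a query can be run with `sql`; a sketch assuming a DataFrame `df` already exists:

```scala
df.createOrReplaceTempView("people")  // hypothetical view name
val adults = spark.sql("SELECT name FROM people WHERE age >= 18")
adults.show()
```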