com.datastax.spark

connector

package connector

The root package of Cassandra connector for Apache Spark. Offers handy implicit conversions that add Cassandra-specific methods to SparkContext and RDD.

Call cassandraTable method on the SparkContext object to create a CassandraRDD exposing Cassandra tables as Spark RDDs.

Call RDDFunctions saveToCassandra function on any RDD to save distributed collection to a Cassandra table.

Example:

CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1 };
CREATE TABLE test.words (word text PRIMARY KEY, count int);
INSERT INTO test.words(word, count) VALUES ("and", 50);
import com.datastax.spark.connector._

val sparkMasterHost = "127.0.0.1"
val cassandraHost = "127.0.0.1"
val keyspace = "test"
val table = "words"

// Tell Spark the address of one Cassandra node:
val conf = new SparkConf(true).set("spark.cassandra.connection.host", cassandraHost)

// Connect to the Spark cluster:
val sc = new SparkContext("spark://" + sparkMasterHost + ":7077", "example", conf)

// Read the table and print its contents:
val rdd = sc.cassandraTable(keyspace, table)
rdd.toArray().foreach(println)

// Write two rows to the table:
val col = sc.parallelize(Seq(("of", 1200), ("the", "863")))
col.saveToCassandra(keyspace, table)

sc.stop()
Linear Supertypes
AnyRef, Any
Ordering
  1. Alphabetic
  2. By inheritance
Inherited
  1. connector
  2. AnyRef
  3. Any
  1. Hide All
  2. Show all
Learn more about member selection
Visibility
  1. Public
  2. All

Type Members

  1. sealed trait BatchSize extends AnyRef

  2. case class BytesInBatch(batchSize: Int) extends BatchSize with Product with Serializable

  3. final class CassandraRow extends ScalaGettableData with Serializable

    Represents a single row fetched from Cassandra.

  4. case class CassandraRowMetadata(columnNames: IndexedSeq[String], resultSetColumnNames: Option[IndexedSeq[String]] = scala.None, codecs: IndexedSeq[TypeCodec[AnyRef]] = null) extends Product with Serializable

    All CassandraRows shared data

  5. final class CassandraTableScanPairRDDFunctions[K, V] extends Serializable

  6. final class CassandraTableScanRDDFunctions[R] extends Serializable

  7. sealed trait CollectionBehavior extends AnyRef

    Insert behaviors for Collections.

  8. case class CollectionColumnName(columnName: String, alias: Option[String] = scala.None, collectionBehavior: CollectionBehavior = CollectionOverwrite) extends ColumnRef with Product with Serializable

    References a collection column by name with insert instructions

  9. case class ColumnName(columnName: String, alias: Option[String] = scala.None) extends ColumnRef with Product with Serializable

    References a column by name.

  10. implicit final class ColumnNameFunctions extends AnyVal

  11. class ColumnNotFoundException extends Exception

    Thrown when the requested column does not exist in the result set.

  12. sealed trait ColumnRef extends AnyRef

    A column that can be selected from CQL results set by name.

  13. sealed trait ColumnSelector extends AnyRef

  14. class DataFrameFunctions extends Serializable

    Provides Cassandra-specific methods on org.apache.spark.sql.DataFrame

  15. case class FunctionCallRef(columnName: String, actualParams: Seq[Either[ColumnRef, String]] = collection.this.Seq.empty[Nothing], alias: Option[String] = scala.None) extends ColumnRef with Product with Serializable

    References a function call *

  16. trait GettableByIndexData extends Serializable

  17. trait GettableData extends GettableByIndexData

  18. class PairRDDFunctions[K, V] extends Serializable

  19. class RDDFunctions[T] extends WritableToCassandra[T] with Serializable

    Provides Cassandra-specific methods on RDD

  20. case class RowsInBatch(batchSize: Int) extends BatchSize with Product with Serializable

  21. trait ScalaGettableByIndexData extends GettableByIndexData

  22. trait ScalaGettableData extends ScalaGettableByIndexData with GettableData

  23. case class SomeColumns(columns: ColumnRef*) extends ColumnSelector with Product with Serializable

  24. class SparkContextFunctions extends Serializable

    Provides Cassandra-specific methods on SparkContext

  25. case class TTL(columnName: String, alias: Option[String] = scala.None) extends ColumnRef with Product with Serializable

    References TTL of a column.

  26. final case class TupleValue(values: Any*) extends ScalaGettableByIndexData with Product with Serializable

  27. final case class UDTValue(columnNames: IndexedSeq[String], columnValues: IndexedSeq[AnyRef]) extends ScalaGettableData with Product with Serializable

  28. case class WriteTime(columnName: String, alias: Option[String] = scala.None) extends ColumnRef with Product with Serializable

    References write time of a column.

Value Members

  1. object AllColumns extends ColumnSelector with Product with Serializable

  2. object BatchSize

  3. object CassandraRow extends Serializable

  4. object CassandraRowMetadata extends Serializable

  5. object CollectionAppend extends CollectionBehavior with Product with Serializable

  6. object CollectionOverwrite extends CollectionBehavior with Product with Serializable

  7. object CollectionPrepend extends CollectionBehavior with Product with Serializable

  8. object CollectionRemove extends CollectionBehavior with Product with Serializable

  9. object DocUtil

  10. object GettableData extends Serializable

  11. object PartitionKeyColumns extends ColumnSelector with Product with Serializable

  12. object PrimaryKeyColumns extends ColumnSelector with Product with Serializable

  13. object RowCountRef extends ColumnRef with Product with Serializable

    References a row count value returned from SELECT count(*)

  14. object SomeColumns extends Serializable

  15. object TupleValue extends Serializable

  16. object UDTValue extends Serializable

  17. package cql

    Contains a cql.CassandraConnector object which is used to connect to a Cassandra cluster and to send CQL statements to it.

  18. package japi

  19. package mapper

    Provides machinery for mapping Cassandra tables to user defined Scala classes or tuples.

  20. package rdd

    Contains com.datastax.spark.connector.rdd.CassandraTableScanRDD class that is the main entry point for analyzing Cassandra data from Spark.

  21. package streaming

  22. implicit def toCassandraTableScanFunctions[T](rdd: CassandraTableScanRDD[T]): CassandraTableScanRDDFunctions[T]

  23. implicit def toCassandraTableScanRDDPairFunctions[K, V](rdd: CassandraTableScanRDD[(K, V)]): CassandraTableScanPairRDDFunctions[K, V]

  24. implicit def toDataFrameFunctions(dataFrame: DataFrame): DataFrameFunctions

  25. implicit def toNamedColumnRef(columnName: String): ColumnName

  26. implicit def toPairRDDFunctions[K, V](rdd: RDD[(K, V)]): PairRDDFunctions[K, V]

  27. implicit def toRDDFunctions[T](rdd: RDD[T]): RDDFunctions[T]

  28. implicit def toSparkContextFunctions(sc: SparkContext): SparkContextFunctions

  29. package types

    Offers type conversion magic, so you can receive Cassandra column values in a form you like the most.

  30. package util

    Useful stuff that didn't fit elsewhere.

  31. package writer

    Contains components for writing RDDs to Cassandra

Inherited from AnyRef

Inherited from Any

Ungrouped