com.datastax.spark

connector

package connector

The root package of the Cassandra connector for Apache Spark. It offers implicit conversions that add Cassandra-specific methods to SparkContext and RDD.

Call the cassandraTable method on a SparkContext to create a CassandraRDD, which exposes a Cassandra table as a Spark RDD.

Call saveToCassandra, provided by RDDFunctions, on any RDD to save the distributed collection to a Cassandra table.

Example:

-- CQL: create the keyspace and table used below
CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1 };
CREATE TABLE test.words (word text PRIMARY KEY, count int);
INSERT INTO test.words(word, count) VALUES ('and', 50);

// Scala:
import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._

val sparkMasterHost = "127.0.0.1"
val cassandraHost = "127.0.0.1"
val keyspace = "test"
val table = "words"

// Tell Spark the address of one Cassandra node:
val conf = new SparkConf(true).set("spark.cassandra.connection.host", cassandraHost)

// Connect to the Spark cluster:
val sc = new SparkContext("spark://" + sparkMasterHost + ":7077", "example", conf)

// Read the table and print its contents:
val rdd = sc.cassandraTable(keyspace, table)
rdd.collect().foreach(println)

// Write two rows to the table:
val col = sc.parallelize(Seq(("of", 1200), ("the", 863)))
col.saveToCassandra(keyspace, table)

// Shut down the Spark context:
sc.stop()
Linear Supertypes
AnyRef, Any

Type Members

  1. sealed trait BatchSize extends AnyRef
  2. case class BytesInBatch(batchSize: Int) extends BatchSize with Product with Serializable
  3. final class CassandraRow extends ScalaGettableData with Serializable

    Represents a single row fetched from Cassandra. Offers getters to read individual fields by column name or column index. The getters try to convert the value to the desired type whenever possible. Most column types can be converted to a String. For nullable columns, use the getXXXOption getters, which convert nulls to None; the plain getters throw a NullPointerException on a null value.

    All getters throw an exception if column name/index is not found. Column indexes start at 0.

    If the value cannot be converted to the desired type, com.datastax.spark.connector.types.TypeConversionException is thrown.

    Recommended getters for Cassandra types:

    - ascii: getString, getStringOption
    - bigint: getLong, getLongOption
    - blob: getBytes, getBytesOption
    - boolean: getBool, getBoolOption
    - counter: getLong, getLongOption
    - decimal: getDecimal, getDecimalOption
    - double: getDouble, getDoubleOption
    - float: getFloat, getFloatOption
    - inet: getInet, getInetOption
    - int: getInt, getIntOption
    - text: getString, getStringOption
    - timestamp: getDate, getDateOption
    - timeuuid: getUUID, getUUIDOption
    - uuid: getUUID, getUUIDOption
    - varchar: getString, getStringOption
    - varint: getVarInt, getVarIntOption
    - list: getList[T]
    - set: getSet[T]
    - map: getMap[K, V]

    The collection getters getList, getSet and getMap require the item type to be passed explicitly:

    row.getList[String]("a_list")
    row.getList[Int]("a_list")
    row.getMap[Int, String]("a_map")

    The generic get method can automatically convert collections to other collection types. Supported containers:

    - scala.collection.immutable.List
    - scala.collection.immutable.Set
    - scala.collection.immutable.TreeSet
    - scala.collection.immutable.Vector
    - scala.collection.immutable.Map
    - scala.collection.immutable.TreeMap
    - scala.collection.Iterable
    - scala.collection.IndexedSeq
    - java.util.ArrayList
    - java.util.HashSet
    - java.util.HashMap

    Example:

    row.get[List[Int]]("a_list")
    row.get[Vector[Int]]("a_list")
    row.get[java.util.ArrayList[Int]]("a_list")
    row.get[TreeMap[Int, String]]("a_map")

    Timestamps can be converted to other date types by using the generic get method. Supported date types:

    - java.util.Date
    - java.sql.Date
    - org.joda.time.DateTime

  4. case class CassandraRowMetadata(columnNames: IndexedSeq[String], resultSetColumnNames: Option[IndexedSeq[String]] = None, codecs: IndexedSeq[TypeCodec[AnyRef]] = null) extends Product with Serializable

    Data shared by all CassandraRows.

    - columnNames: row column names
    - resultSetColumnNames: column names from the Java driver row result set, without connector aliases
    - codecs: cached Java driver codecs, to avoid registry lookups

  5. final class CassandraTableScanPairRDDFunctions[K, V] extends Serializable
  6. final class CassandraTableScanRDDFunctions[R] extends Serializable
  7. sealed trait CollectionBehavior extends AnyRef

    Insert behaviors for Collections.

  8. case class CollectionColumnName(columnName: String, alias: Option[String] = None, collectionBehavior: CollectionBehavior = CollectionOverwrite) extends ColumnRef with Product with Serializable

    References a collection column by name with insert instructions

  9. case class ColumnName(columnName: String, alias: Option[String] = None) extends ColumnRef with Product with Serializable

    References a column by name.

  10. implicit final class ColumnNameFunctions extends AnyVal
  11. class ColumnNotFoundException extends Exception

    Thrown when the requested column does not exist in the result set.

  12. sealed trait ColumnRef extends AnyRef

    A column that can be selected from a CQL result set by name.

  13. sealed trait ColumnSelector extends AnyRef
  14. class DataFrameFunctions extends Serializable

    Provides Cassandra-specific methods on org.apache.spark.sql.DataFrame

  15. case class FunctionCallRef(columnName: String, actualParams: Seq[Either[ColumnRef, String]] = Seq.empty, alias: Option[String] = None) extends ColumnRef with Product with Serializable

    References a function call.

  16. trait GettableByIndexData extends Serializable
  17. trait GettableData extends GettableByIndexData
  18. class PairRDDFunctions[K, V] extends Serializable
  19. class RDDFunctions[T] extends WritableToCassandra[T] with Serializable

    Provides Cassandra-specific methods on RDD

  20. case class RowsInBatch(batchSize: Int) extends BatchSize with Product with Serializable
  21. trait ScalaGettableByIndexData extends GettableByIndexData
  22. trait ScalaGettableData extends ScalaGettableByIndexData with GettableData
  23. case class SomeColumns(columns: ColumnRef*) extends ColumnSelector with Product with Serializable
  24. class SparkContextFunctions extends Serializable

    Provides Cassandra-specific methods on SparkContext

  25. case class TTL(columnName: String, alias: Option[String] = None) extends ColumnRef with Product with Serializable

    References TTL of a column.

  26. final case class TupleValue(values: Any*) extends ScalaGettableByIndexData with Product with Serializable
  27. final case class UDTValue(columnNames: IndexedSeq[String], columnValues: IndexedSeq[AnyRef]) extends ScalaGettableData with Product with Serializable
  28. case class WriteTime(columnName: String, alias: Option[String] = None) extends ColumnRef with Product with Serializable

    References write time of a column.
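
The CassandraRow getters described above can be tried without a cluster. The following is a minimal sketch, assuming that the CassandraRow companion object offers a fromMap factory in your connector version (the listing above does not show companion members). The object name CassandraRowGettersSketch and the column names are hypothetical and chosen only for illustration.

```scala
import com.datastax.spark.connector._

object CassandraRowGettersSketch {
  // Build a row locally instead of fetching it from Cassandra.
  // Assumption: CassandraRow.fromMap exists in the connector version in use.
  val row: CassandraRow = CassandraRow.fromMap(Map[String, Any](
    "word"   -> "and",
    "count"  -> 50,
    "note"   -> null,            // stands in for a nullable column
    "a_list" -> List(1, 2, 3)
  ))

  // Typed getters by column name.
  val word: String = row.getString("word")
  val count: Int = row.getInt("count")

  // Most column types can be read as a String.
  val countAsText: String = row.getString("count")

  // The Option getter turns a null into None instead of failing.
  val note: Option[String] = row.getStringOption("note")

  // Generic get converts the stored collection to the requested container.
  val asVector: Vector[Int] = row.get[Vector[Int]]("a_list")
  val asList: List[Int] = row.get[List[Int]]("a_list")

  def main(args: Array[String]): Unit =
    println(s"$word: $count ($countAsText), note = $note, list = $asList")
}
```

Reading the nullable column through getStringOption rather than getString is the point of the sketch: the Option getter keeps a missing value explicit in the type.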

Value Members

  1. implicit def toCassandraTableScanFunctions[T](rdd: CassandraTableScanRDD[T]): CassandraTableScanRDDFunctions[T]
  2. implicit def toCassandraTableScanRDDPairFunctions[K, V](rdd: CassandraTableScanRDD[(K, V)]): CassandraTableScanPairRDDFunctions[K, V]
  3. implicit def toDataFrameFunctions(dataFrame: DataFrame): DataFrameFunctions
  4. implicit def toNamedColumnRef(columnName: String): ColumnName
  5. implicit def toPairRDDFunctions[K, V](rdd: RDD[(K, V)]): PairRDDFunctions[K, V]
  6. implicit def toRDDFunctions[T](rdd: RDD[T]): RDDFunctions[T]
  7. implicit def toSparkContextFunctions(sc: SparkContext): SparkContextFunctions
  8. object AllColumns extends ColumnSelector with Product with Serializable
  9. object BatchSize
  10. object CassandraRow extends Serializable
  11. object CassandraRowMetadata extends Serializable
  12. object CollectionAppend extends CollectionBehavior with Product with Serializable
  13. object CollectionOverwrite extends CollectionBehavior with Product with Serializable
  14. object CollectionPrepend extends CollectionBehavior with Product with Serializable
  15. object CollectionRemove extends CollectionBehavior with Product with Serializable
  16. object DocUtil
  17. object GettableData extends Serializable
  18. object PartitionKeyColumns extends ColumnSelector with Product with Serializable
  19. object PrimaryKeyColumns extends ColumnSelector with Product with Serializable
  20. object RowCountRef extends ColumnRef with Product with Serializable

    References a row count value returned from SELECT count(*)

  21. object SomeColumns extends Serializable
  22. object TupleValue extends Serializable
  23. object UDTValue extends Serializable
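
The implicit toNamedColumnRef conversion and the ColumnRef case classes listed above are typically combined into a ColumnSelector. The following is a minimal sketch that uses only the constructors shown in this listing. The object name ColumnSelectorSketch and the column names (word, count, tags) are hypothetical.

```scala
import com.datastax.spark.connector._

object ColumnSelectorSketch {
  // Plain strings become ColumnName values through the implicit toNamedColumnRef.
  val byName: SomeColumns = SomeColumns("word", "count")

  // A selector for reading: a column plus the TTL and write time of another column.
  val readColumns: SomeColumns = SomeColumns(
    ColumnName("word"),
    TTL("count"),
    WriteTime("count")
  )

  // A selector for writing: append to the hypothetical collection column "tags"
  // instead of overwriting it, which is the default behavior.
  val writeColumns: SomeColumns = SomeColumns(
    ColumnName("word"),
    CollectionColumnName("tags", collectionBehavior = CollectionAppend)
  )

  def main(args: Array[String]): Unit = {
    println(byName)
    println(readColumns)
    println(writeColumns)
  }
}
```

Under the assumption that saveToCassandra accepts a ColumnSelector as its third argument, a selector such as writeColumns would be passed as rdd.saveToCassandra(keyspace, table, writeColumns). AllColumns, PartitionKeyColumns and PrimaryKeyColumns are the other ColumnSelector objects in this package.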
