com.spotify.scio

ScioContext

class ScioContext extends AnyRef

Main entry point for Dataflow functionality. A ScioContext represents a Dataflow pipeline and can be used to create SCollections and distributed caches for that pipeline.

Linear Supertypes
AnyRef, Any

Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  5. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  6. def addArtifacts(extraLocalArtifacts: List[String]): Unit

    Add artifacts to be staged in Dataflow. An artifact can be a jar, a text file, etc. NOTE: currently artifacts can only be added before the pipeline object is created.
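
    The ordering constraint in the note can be sketched as follows. This is a minimal illustration, not the library's prescribed usage: the local paths, the bucket name, and the `stageThenRead` helper are hypothetical, and calling `addArtifacts` before any source is created is simply the conservative reading of the note.

    ```scala
    import com.spotify.scio.ScioContext
    import com.spotify.scio.values.SCollection

    object AddArtifactsSketch {
      // Hypothetical helper: register extra local files first, then build the pipeline.
      def stageThenRead(sc: ScioContext): SCollection[String] = {
        // Paths are placeholders for files on the machine that submits the job.
        sc.addArtifacts(List("/tmp/extra-lib.jar", "/tmp/lookup.txt"))
        // Sources are created only after the artifacts have been registered.
        sc.textFile("gs://my-bucket/input.txt")
      }
    }
    ```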

  7. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  8. def avroFile[T](path: String, schema: Schema = null)(implicit arg0: ClassTag[T]): SCollection[T]

    Get an SCollection for an Avro file.
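
    A minimal sketch of reading generic Avro records. It assumes that passing an explicit `Schema` together with `GenericRecord` is the intended way to read records without a generated class, and that the `null` default is meant for generated record classes; neither is stated on this page. The schema, its field names, and the path are made up for illustration.

    ```scala
    import com.spotify.scio.ScioContext
    import com.spotify.scio.values.SCollection
    import org.apache.avro.Schema
    import org.apache.avro.generic.GenericRecord

    object AvroFileSketch {
      // Illustrative schema; the record and field names are assumptions.
      val userSchema: Schema = new Schema.Parser().parse(
        """{"type":"record","name":"User","fields":[
          |  {"name":"id","type":"long"},
          |  {"name":"name","type":"string"}
          |]}""".stripMargin)

      // Read records as GenericRecord using the schema above.
      def readUsers(sc: ScioContext): SCollection[GenericRecord] =
        sc.avroFile[GenericRecord]("gs://my-bucket/users.avro", userSchema)
    }
    ```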

  9. def bigQuerySelect(sqlQuery: String): SCollection[TableRow]

    Get an SCollection for a BigQuery SELECT query.

  10. def bigQueryTable(tableSpec: String): SCollection[TableRow]

    Get an SCollection for a BigQuery table.

  11. def bigQueryTable(table: TableReference): SCollection[TableRow]

    Get an SCollection for a BigQuery table.
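
    The BigQuery inputs above might be used as in the sketch below. It assumes the `project:dataset.table` form for `tableSpec` and the bracketed legacy SQL table syntax in the query; the public `publicdata:samples.shakespeare` table is used only as an example, and the helper names are hypothetical.

    ```scala
    import com.google.api.services.bigquery.model.TableRow
    import com.spotify.scio.ScioContext
    import com.spotify.scio.values.SCollection

    object BigQuerySketch {
      // Read every row of a table, addressed by a table spec string.
      def wholeTable(sc: ScioContext): SCollection[TableRow] =
        sc.bigQueryTable("publicdata:samples.shakespeare")

      // Read the result of a SELECT query and pull one column out of each row.
      def longWords(sc: ScioContext): SCollection[String] =
        sc.bigQuerySelect(
          "SELECT word FROM [publicdata:samples.shakespeare] WHERE LENGTH(word) > 10")
          .map(row => row.get("word").toString)
    }
    ```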

  12. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  13. def close(): ScioResult

    Close the context. No operation can be performed once the context is closed.
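
    A minimal sketch of the context lifecycle, using only members documented on this page. The input path and the `runAndClose` helper are hypothetical, and the pipeline has no output step, so it only illustrates the order of calls.

    ```scala
    import com.spotify.scio.ScioContext

    object LifecycleSketch {
      def runAndClose(sc: ScioContext): Unit = {
        // Build the pipeline while the context is still open.
        sc.textFile("gs://my-bucket/input.txt").map(_.toUpperCase)

        // Close the context; no further operations are allowed afterwards.
        val result = sc.close()
        assert(sc.isClosed)
        println(result)
      }
    }
    ```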

  14. def datastore(datasetId: String, query: Query): SCollection[Entity]

    Get an SCollection for a Datastore query.

  15. def distCache[F](uris: Seq[String])(initFn: (Seq[File]) ⇒ F): DistCache[F]

    Create a new DistCache instance.

    uris

    Google Cloud Storage URIs of the files to be distributed to all workers

    initFn

    function to initialize the distributed files
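
    A sketch of the multi-file overload, assuming each file holds space-separated `key value` lines. The URIs and the `loadLookup` and `label` helpers are hypothetical. Keeping the parsing in a plain function means it can be exercised locally without a pipeline.

    ```scala
    import java.io.File
    import com.spotify.scio.ScioContext
    import com.spotify.scio.values.SCollection

    object MultiFileCacheSketch {
      // Parse "key value" lines from every file into one lookup table.
      def loadLookup(files: Seq[File]): Map[Int, String] =
        files.flatMap { f =>
          val src = scala.io.Source.fromFile(f)
          try src.getLines().map { s =>
            val t = s.split(" ")
            (t(0).toInt, t(1))
          }.toList
          finally src.close()
        }.toMap

      def label(sc: ScioContext, ids: SCollection[Int]): SCollection[String] = {
        // All listed files are distributed to the workers and merged by loadLookup.
        val dc = sc.distCache(Seq("gs://my-bucket/a.txt", "gs://my-bucket/b.txt")) { files =>
          loadLookup(files)
        }
        // Extract the cached value inside a transform.
        ids.map(x => dc().getOrElse(x, "unknown"))
      }
    }
    ```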

  16. def distCache[F](uri: String)(initFn: (File) ⇒ F): DistCache[F]

    Create a new DistCache instance.

    uri

    Google Cloud Storage URI of the file to be distributed to all workers

    initFn

    function to initialize the distributed file

    // Prepare distributed cache as Map[Int, String]
    val dc = sc.distCache("gs://dataflow-samples/samples/misc/months.txt") { f =>
      scala.io.Source.fromFile(f).getLines().map { s =>
        val t = s.split(" ")
        (t(0).toInt, t(1))
      }.toMap
    }
    
    val p: SCollection[Int] = // ...
    // Extract distributed cache inside a transform
    p.map(x => dc().getOrElse(x, "unknown"))
  17. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  18. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  19. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  20. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  21. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  22. def isClosed: Boolean

    Whether the context is closed.

  23. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  24. def maxAccumulator[T](n: String)(implicit at: AccumulatorType[T]): Accumulator[T]

    Create a new Accumulator that keeps track of the maximum value. See SCollection.withAccumulator for examples.

  25. def minAccumulator[T](n: String)(implicit at: AccumulatorType[T]): Accumulator[T]

    Create a new Accumulator that keeps track of the minimum value. See SCollection.withAccumulator for examples.

  26. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  27. final def notify(): Unit

    Definition Classes
    AnyRef
  28. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  29. def objectFile[T](path: String)(implicit arg0: ClassTag[T]): SCollection[T]

    Get an SCollection for an object file.

  30. val options: DataflowPipelineOptions

  31. def parallelize[K, V](elems: Map[K, V])(implicit arg0: ClassTag[K], arg1: ClassTag[V]): SCollection[(K, V)]

    Distribute a local Scala Map to form an SCollection.

  32. def parallelize[T](elems: Iterable[T])(implicit arg0: ClassTag[T]): SCollection[T]

    Distribute a local Scala Iterable to form an SCollection.

  33. def parallelizeTimestamped[T](elems: Iterable[T], timestamps: Iterable[Instant])(implicit arg0: ClassTag[T]): SCollection[T]

    Distribute a local Scala Iterable with timestamps to form an SCollection.

  34. def parallelizeTimestamped[T](elems: Iterable[(T, Instant)])(implicit arg0: ClassTag[T]): SCollection[T]

    Distribute a local Scala Iterable with timestamps to form an SCollection.
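
    The four in-memory overloads can be compared side by side in the sketch below. It assumes `Instant` is `org.joda.time.Instant`, the timestamp type used by the Dataflow SDK; the sample values and the `build` helper are illustrative.

    ```scala
    import com.spotify.scio.ScioContext
    import com.spotify.scio.values.SCollection
    import org.joda.time.Instant

    object ParallelizeSketch {
      def build(sc: ScioContext): Unit = {
        // Plain Iterable of elements.
        val nums: SCollection[Int] = sc.parallelize(Seq(1, 2, 3))

        // A Map becomes a collection of key-value tuples.
        val pairs: SCollection[(String, Int)] = sc.parallelize(Map("a" -> 1, "b" -> 2))

        val t0 = new Instant(0L)
        val t1 = t0.plus(1000L) // one second later

        // Elements paired with their timestamps.
        val stamped1: SCollection[String] =
          sc.parallelizeTimestamped(Seq(("x", t0), ("y", t1)))

        // Elements and timestamps supplied as two parallel Iterables.
        val stamped2: SCollection[String] =
          sc.parallelizeTimestamped(Seq("x", "y"), Seq(t0, t1))

        println(Seq(nums, pairs, stamped1, stamped2))
      }
    }
    ```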

  35. def pipeline: Pipeline

    Dataflow pipeline.

  36. def pubsubSubscription(sub: String, idLabel: String = null, timestampLabel: String = null): SCollection[String]

    Get an SCollection for a Pub/Sub subscription.

  37. def pubsubTopic(topic: String, idLabel: String = null, timestampLabel: String = null): SCollection[String]

    Get an SCollection for a Pub/Sub topic.
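
    A sketch of the two Pub/Sub inputs. It assumes the fully qualified `projects/<project>/topics/<topic>` and `projects/<project>/subscriptions/<subscription>` name forms, and that `idLabel` and `timestampLabel` name message attributes carrying a unique record ID and an event timestamp, as in the underlying Dataflow SDK. The project, topic, subscription, and attribute names are made up.

    ```scala
    import com.spotify.scio.ScioContext
    import com.spotify.scio.values.SCollection

    object PubsubSketch {
      // Read from a topic, naming the attributes used for record IDs and timestamps.
      def fromTopic(sc: ScioContext): SCollection[String] =
        sc.pubsubTopic(
          "projects/my-project/topics/events",
          idLabel = "event_id",
          timestampLabel = "event_ts")

      // Read from an existing subscription with the default (null) labels.
      def fromSubscription(sc: ScioContext): SCollection[String] =
        sc.pubsubSubscription("projects/my-project/subscriptions/events-sub")
    }
    ```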

  38. def setName(name: String): Unit

    Set name for the context.

  39. def sumAccumulator[T](n: String)(implicit at: AccumulatorType[T]): Accumulator[T]

    Create a new Accumulator that keeps track of the sum of values. See SCollection.withAccumulator for examples.
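
    A sketch of declaring the three accumulator kinds (`maxAccumulator`, `minAccumulator`, `sumAccumulator`). It assumes that implicit `AccumulatorType` instances for `Int` and `Long` come into scope with `import com.spotify.scio._`, which this page does not confirm. The accumulator names and the `declare` helper are illustrative; updating the accumulators inside transforms is covered by SCollection.withAccumulator and is not shown here.

    ```scala
    import com.spotify.scio._

    object AccumulatorSketch {
      def declare(sc: ScioContext): Unit = {
        // Assumption: AccumulatorType[Int] and AccumulatorType[Long] are provided implicitly.
        val maxLen = sc.maxAccumulator[Int]("maxLineLength")
        val minLen = sc.minAccumulator[Int]("minLineLength")
        val total = sc.sumAccumulator[Long]("totalChars")

        // These handles are then passed to SCollection.withAccumulator.
        println(Seq(maxLen, minLen, total))
      }
    }
    ```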

  40. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  41. def tableRowJsonFile(path: String): SCollection[TableRow]

    Get an SCollection of TableRow for a JSON file.

  42. def textFile(path: String): SCollection[String]

    Get an SCollection for a text file.

  43. def toString(): String

    Definition Classes
    AnyRef → Any
  44. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  45. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  46. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  47. def wrap[T](p: PCollection[T])(implicit arg0: ClassTag[T]): SCollection[T]

    Wrap a PCollection.
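
    A sketch of wrapping a PCollection built directly with the underlying SDK. It assumes the Dataflow SDK 1.x package names (`com.google.cloud.dataflow.sdk`) and its `TextIO.Read.from` source; the path and the `wrapped` helper are hypothetical.

    ```scala
    import com.google.cloud.dataflow.sdk.io.TextIO
    import com.google.cloud.dataflow.sdk.values.PCollection
    import com.spotify.scio.ScioContext
    import com.spotify.scio.values.SCollection

    object WrapSketch {
      def wrapped(sc: ScioContext): SCollection[String] = {
        // Build a PCollection with the Java SDK on the context's pipeline.
        val raw: PCollection[String] =
          sc.pipeline.apply(TextIO.Read.from("gs://my-bucket/input.txt"))

        // Wrap it so that Scala transforms such as map are available.
        sc.wrap(raw).map(_.trim)
      }
    }
    ```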
