Enhanced version of SparkeyReader that mimics a Map.
Enhanced version of SCollection with Sparkey methods.
Enhanced version of SCollection with Sparkey methods.
Enhanced version of ScioContext with Sparkey methods.
Represents the base URI for a Sparkey index and log file, either on the local or a remote file
system. For remote file systems, basePath should be in the form
'scheme://<bucket>/<path>/<sparkey-prefix>'. For local files, it should be in the form
'/<path>/<sparkey-prefix>'. Note that basePath must not be a folder or GCS bucket, as it is
a base path representing two files: <sparkey-prefix>.spi and <sparkey-prefix>.spl.
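The relationship between a base path and its two files can be illustrated with a small sketch. The object `SparkeyPathSketch` and the helper `sparkeyFiles` are hypothetical names introduced only for illustration and are not part of Scio. The trailing-slash check is a simplifying assumption about what a folder path looks like.

```scala
object SparkeyPathSketch {
  // Hypothetical helper, not a Scio API: derives the index (.spi) and
  // log (.spl) file names that a Sparkey base path stands for.
  def sparkeyFiles(basePath: String): (String, String) = {
    // Assumption: a trailing slash means a folder or bucket, which is
    // not a valid base path.
    require(!basePath.endsWith("/"), s"basePath must not be a folder: $basePath")
    (basePath + ".spi", basePath + ".spl")
  }
}
```

For a local base path such as '/data/lookup' (an illustrative value), this yields '/data/lookup.spi' and '/data/lookup.spl'.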
Main package for Sparkey side input APIs. Import all.
import com.spotify.scio.extra.sparkey._
An SCollection[(String, String)] can be saved to a Sparkey file.
The resulting SCollection[SparkeyUri] can be converted to a side input.
These two steps can also be done in a single call as syntactic sugar.
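The save and convert steps described above might look like the following sketch. It assumes the `asSparkey` and `asSparkeySideInput` methods that this package adds to SCollection, and a Scio version (0.8 or later) in which `sc.run()` is available. The object name, the sample pairs, and the use of the pipeline's temporary location are illustrative choices rather than part of the original documentation.

```scala
import com.spotify.scio.ContextAndArgs
import com.spotify.scio.extra.sparkey._
import com.spotify.scio.values.{SCollection, SideInput}
import com.spotify.sparkey.SparkeyReader

object SparkeyWriteSketch {
  // Illustrative key-value pairs (assumption, not from the source).
  val pairs: Seq[(String, String)] = Seq("a" -> "one", "b" -> "two")

  def main(cmdlineArgs: Array[String]): Unit = {
    // Expects pipeline options such as --tempLocation on the command line.
    val (sc, _) = ContextAndArgs(cmdlineArgs)

    // Step 1: write the pairs to a Sparkey file in a temporary location.
    val uris: SCollection[SparkeyUri] = sc.parallelize(pairs).asSparkey

    // Step 2: convert the resulting SparkeyUri into a side input.
    val side: SideInput[SparkeyReader] = uris.asSparkeySideInput

    // Syntactic sugar: both steps in a single call.
    val sugar: SideInput[SparkeyReader] =
      sc.parallelize(pairs).asSparkeySideInput

    sc.run().waitUntilFinish()
  }
}
```

To write to a specific location instead of a temporary one, a base path in the form described for SparkeyUri can be passed to `asSparkey`.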
An existing Sparkey file can also be converted to a side input directly:
sc.sparkeySideInput("gs:////")
SparkeyReader can be used like a lookup table in a side input operation:
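A minimal sketch of such a lookup follows. It assumes Scio's `withSideInputs` API and relies on the enhanced SparkeyReader from this package for the Map-style `getOrElse` call. The object name, the sample data, and the "unknown" fallback value are illustrative assumptions.

```scala
import com.spotify.scio.ContextAndArgs
import com.spotify.scio.extra.sparkey._
import com.spotify.scio.values.{SCollection, SideInput}
import com.spotify.sparkey.SparkeyReader

object SparkeyLookupSketch {
  // Illustrative data (assumption, not from the source).
  val keys: Seq[String] = Seq("a", "b", "c")
  val pairs: Seq[(String, String)] = Seq("a" -> "one", "b" -> "two")

  def main(cmdlineArgs: Array[String]): Unit = {
    val (sc, _) = ContextAndArgs(cmdlineArgs)

    val mainInput: SCollection[String] = sc.parallelize(keys)

    // Build the lookup table as a Sparkey-backed side input.
    val side: SideInput[SparkeyReader] =
      sc.parallelize(pairs).asSparkeySideInput

    // Look up each element of the main input in the side input,
    // falling back to "unknown" when the key is absent.
    val resolved: SCollection[String] = mainInput
      .withSideInputs(side)
      .map { (x, s) => s(side).getOrElse(x, "unknown") }
      .toSCollection

    sc.run().waitUntilFinish()
  }
}
```

With this sample data, "c" has no entry in the lookup table, so it falls back to "unknown".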