Create a new DistCache instance.
Google Cloud Storage URIs of the files to be distributed to all workers
function to initialize the distributed files
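As a rough illustration of what an initialization function for several files might look like, the sketch below merges the lines of every file into a single Map[Int, String]. The object name MultiFileInit, the function name loadLookup, and the "<number> <name>" line format are assumptions made for this example; local temporary files stand in for the copies that each worker would receive.

```scala
import java.io.{File, PrintWriter}
import scala.io.Source

object MultiFileInit {
  // Hypothetical init function: parse each "<number> <name>" line of every
  // file and merge the results into one lookup map.
  def loadLookup(files: Seq[File]): Map[Int, String] =
    files.flatMap { f =>
      val src = Source.fromFile(f)
      try {
        src.getLines()
          .map(_.trim)
          .filter(_.nonEmpty)
          .map { line =>
            val t = line.split(" ")
            (t(0).toInt, t(1))
          }
          .toList // materialize before the source is closed
      } finally src.close()
    }.toMap

  def main(args: Array[String]): Unit = {
    // Local temp files stand in for the files copied to each worker.
    def tmp(content: String): File = {
      val f = File.createTempFile("lookup", ".txt")
      f.deleteOnExit()
      val w = new PrintWriter(f)
      try w.write(content) finally w.close()
      f
    }
    val lookup = loadLookup(Seq(tmp("1 Jan\n2 Feb\n"), tmp("3 Mar\n")))
    println(lookup.getOrElse(2, "unknown"))
    println(lookup.getOrElse(9, "unknown"))
  }
}
```

A function of this shape would presumably be passed as the initialization argument alongside the sequence of Google Cloud Storage URIs. Materializing the parsed lines before closing the source avoids reading from a closed file, since getLines() is lazy.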
Create a new DistCache instance.
Google Cloud Storage URI of the file to be distributed to all workers
function to initialize the distributed file
// Prepare distributed cache as Map[Int, String]
val dc = sc.distCache("gs://dataflow-samples/samples/misc/months.txt") { f =>
  scala.io.Source.fromFile(f).getLines().map { s =>
    val t = s.split(" ")
    (t(0).toInt, t(1))
  }.toMap
}
val p: SCollection[Int] = // ...
// Extract distributed cache inside a transform
p.map(x => dc().getOrElse(x, "unknown"))
An enhanced ScioContext with distributed cache features.