Caches the data produced by the logical representation of the given Dataset. Unlike RDD.cache(), the default storage level is set to be MEMORY_AND_DISK because recomputing the in-memory columnar representation of the underlying table is expensive.
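The difference in defaults can be observed through the public API. The following is a minimal sketch, assuming Spark 2.1 or later (where Dataset.storageLevel is available) and a local master; the object name DefaultCacheLevels and the method observe are illustrative, not part of Spark.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

// Illustrative helper, not a Spark class.
object DefaultCacheLevels {
  // Returns (Dataset level, RDD level) as reported right after cache() is called on each.
  def observe(spark: SparkSession): (StorageLevel, StorageLevel) = {
    val ds  = spark.range(1000)                          // small Dataset, for illustration only
    val rdd = spark.sparkContext.parallelize(1 to 1000)  // comparable RDD

    ds.cache()   // registers the Dataset with the default Dataset storage level
    rdd.cache()  // marks the RDD with the default RDD storage level

    val levels = (ds.storageLevel, rdd.getStorageLevel)

    // Release both so the sketch leaves nothing cached behind.
    ds.unpersist()
    rdd.unpersist()
    levels
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("default-cache-levels")
      .getOrCreate()
    try {
      val (dsLevel, rddLevel) = observe(spark)
      println(s"Dataset.cache(): $dsLevel")
      println(s"RDD.cache():     $rddLevel")
    } finally {
      spark.stop()
    }
  }
}
```

A different level can be requested explicitly with Dataset.persist(StorageLevel.MEMORY_ONLY) when spilling to disk is not wanted.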
Clears all cached tables.
Invalidates the cache of any data that contains plan. Note that this function may over-invalidate.
Invalidates the cache of any data that contains resourcePath in one or more HadoopFsRelation node(s) as part of its logical plan.
Checks if the cache is empty.
Optionally returns cached data for the given LogicalPlan.
Optionally returns cached data for the given Dataset.
Tries to remove the data for the given Dataset from the cache. This is a no-op if the data is already uncached.
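From user code, this behavior is normally reached through Dataset.unpersist(). The sketch below assumes that unpersist() delegates to the uncache operation described above, which depends on the Spark version; the object name UncacheTwice is illustrative.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

// Illustrative helper, not a Spark class.
object UncacheTwice {
  // Caches a Dataset, uncaches it twice, and returns the storage level reported afterwards.
  def run(spark: SparkSession): StorageLevel = {
    val ds = spark.range(100).cache()
    ds.count()       // an action, so the cached data is actually materialized
    ds.unpersist()   // removes the cached data
    ds.unpersist()   // second call: nothing is cached any more, so nothing happens
    ds.storageLevel  // StorageLevel.NONE once no cached data is found for the Dataset
  }
}
```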
Replaces segments of the given logical plan with cached versions where possible.
Provides support in a SQLContext for caching query results and automatically using these cached results when subsequent queries are executed. Data is cached using byte buffers stored in an InMemoryRelation. This relation is automatically substituted into query plans that return the sameResult as the originally cached query. Internal to Spark SQL.
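The substitution idea can be illustrated with a toy model in plain Scala. This is a sketch only, not Spark's implementation: Plan, Scan, Filter, Cached, and ToyCacheManager are hypothetical stand-ins, sameResult is reduced to structural equality, and Cached merely marks where an InMemoryRelation would be placed.

```scala
// Toy model: none of these types are Spark classes.
sealed trait Plan {
  def children: Seq[Plan]
  def withChildren(newChildren: Seq[Plan]): Plan
  // Stand-in for sameResult; the toy version is plain structural equality.
  def sameResult(other: Plan): Boolean = this == other
}

case class Scan(table: String) extends Plan {
  def children: Seq[Plan] = Nil
  def withChildren(newChildren: Seq[Plan]): Plan = this
}

case class Filter(predicate: String, child: Plan) extends Plan {
  def children: Seq[Plan] = Seq(child)
  def withChildren(newChildren: Seq[Plan]): Plan = copy(child = newChildren.head)
}

// Marks a plan fragment that has been replaced by its cached version.
case class Cached(original: Plan) extends Plan {
  def children: Seq[Plan] = Nil
  def withChildren(newChildren: Seq[Plan]): Plan = this
}

final class ToyCacheManager {
  private var entries = Vector.empty[Plan]

  // Registers a plan unless an equivalent one is already cached.
  def cacheQuery(plan: Plan): Unit =
    if (lookupCachedData(plan).isEmpty) entries :+= plan

  // Optionally returns the cached version of an equivalent plan.
  def lookupCachedData(plan: Plan): Option[Cached] =
    entries.find(_.sameResult(plan)).map(p => Cached(p))

  // Removes the entry if present; does nothing otherwise.
  def uncacheQuery(plan: Plan): Unit =
    entries = entries.filterNot(_.sameResult(plan))

  def clearCache(): Unit = entries = Vector.empty

  def isEmpty: Boolean = entries.isEmpty

  // Walks the plan top-down and swaps in cached fragments where they match.
  def useCachedData(plan: Plan): Plan =
    lookupCachedData(plan).getOrElse(
      plan.withChildren(plan.children.map(useCachedData))
    )
}
```

For example, after caching Filter("x > 1", Scan("t")), calling useCachedData on Filter("y < 5", Filter("x > 1", Scan("t"))) rewrites only the inner fragment and leaves the outer filter in place.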