Class SansaCmdUtils

java.lang.Object
net.sansa_stack.spark.cli.util.SansaCmdUtils

public class SansaCmdUtils extends Object
Utility methods for implementing command-line interface tooling based on Sansa.
  • Field Details

  • Constructor Details

    • SansaCmdUtils

      public SansaCmdUtils()
  • Method Details

    • newDefaultSparkSessionBuilder

      public static org.apache.spark.sql.SparkSession.Builder newDefaultSparkSessionBuilder()
    • configureRowSetWriter

      public static net.sansa_stack.spark.io.rdf.output.RddRowSetWriterFactory configureRowSetWriter(RdfOutputConfig out)
    • configure

      public static <T extends net.sansa_stack.spark.io.rdf.output.RddWriterSettings> T configure(T dst, RdfOutputConfig out)
    • configureRdfWriter

      public static net.sansa_stack.spark.io.rdf.output.RddRdfWriterFactory configureRdfWriter(RdfOutputConfig out)
    • getValidPaths

      public static Set<String> getValidPaths(Collection<String> paths, org.apache.hadoop.conf.Configuration hadoopConf)
      Given a collection of paths, returns those that point to existing locations with respect to the configured file system.
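
The filtering contract described above can be illustrated with a small, self-contained sketch. It is not the Sansa implementation: it checks paths with `java.nio.file` on the local file system as a stand-in for the Hadoop file system resolved from `hadoopConf`, and the class `ValidPathsSketch` and its method `existingPaths` are hypothetical names introduced only for illustration. Returning a `LinkedHashSet` (so that input order is kept) is likewise an assumption of the sketch.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the "keep only existing paths" contract.
// Assumption: the local file system replaces the configured Hadoop file system.
public class ValidPathsSketch {

    /** Returns the subset of the given paths that exist, in iteration order. */
    public static Set<String> existingPaths(Collection<String> paths) {
        Set<String> result = new LinkedHashSet<>();
        for (String path : paths) {
            if (Files.exists(Paths.get(path))) {
                result.add(path);
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // One path that exists and one that does not.
        Path existing = Files.createTempFile("valid-paths-sketch", ".nt");
        String missing = existing + ".missing";
        try {
            // Only the existing path is retained.
            System.out.println(existingPaths(List.of(existing.toString(), missing)));
        } finally {
            Files.delete(existing);
        }
    }
}
```

A caller can compare the returned set with the original collection to report which inputs were dropped.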
    • validatePaths

      public static void validatePaths(Collection<String> paths, org.apache.hadoop.conf.Configuration hadoopConf)
    • createRdfSourceCollection

      public static net.sansa_stack.spark.io.rdf.input.api.RdfSourceCollection createRdfSourceCollection(net.sansa_stack.spark.io.rdf.input.api.RdfSourceFactory rdfSourceFactory, Collection<String> inputs, RdfInputConfig inputConfig)
    • createUnionRdd

      public static <T> org.apache.spark.api.java.JavaRDD<T> createUnionRdd(org.apache.spark.api.java.JavaSparkContext javaSparkContext, Collection<String> inputs, org.aksw.commons.lambda.throwing.ThrowingFunction<String,org.apache.spark.api.java.JavaRDD<T>> mapper)
    • createUnionRdd

      public static <T, X> org.apache.spark.api.java.JavaRDD<T> createUnionRdd(org.apache.spark.api.java.JavaSparkContext javaSparkContext, Collection<X> inputs, Function<? super X,String> inputToPath, org.aksw.commons.lambda.throwing.ThrowingFunction<? super X,org.apache.spark.api.java.JavaRDD<T>> mapper)
      Creates a union RDD from the given collection of input objects only if all paths obtained from the inputs via the 'inputToPath' function are accessible.
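
The all-or-nothing behaviour described above can be sketched without Spark. In this minimal illustration, plain `List`s stand in for `JavaRDD`s, `java.util.function.Function` stands in for `ThrowingFunction`, and accessibility is decided by a caller-supplied `Predicate`. The class `UnionSketch`, the method `unionIfAllAccessible`, and the `isAccessible` parameter are hypothetical. Throwing `IllegalArgumentException` for inaccessible paths is an assumption of the sketch; how Sansa reports that case is not stated here.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical illustration of "validate every path first, then build the union".
// Assumption: Lists replace JavaRDDs so the example runs without Spark.
public class UnionSketch {

    public static <T, X> List<T> unionIfAllAccessible(
            Collection<X> inputs,
            Function<? super X, String> inputToPath,
            Predicate<String> isAccessible,
            Function<? super X, List<T>> mapper) {

        // First pass: collect every path that fails the accessibility check.
        List<String> inaccessible = new ArrayList<>();
        for (X input : inputs) {
            String path = inputToPath.apply(input);
            if (!isAccessible.test(path)) {
                inaccessible.add(path);
            }
        }
        if (!inaccessible.isEmpty()) {
            // Assumed failure mode: nothing is mapped if any path is inaccessible.
            throw new IllegalArgumentException("Inaccessible paths: " + inaccessible);
        }

        // Second pass: map each input and concatenate the results.
        List<T> union = new ArrayList<>();
        for (X input : inputs) {
            union.addAll(mapper.apply(input));
        }
        return union;
    }

    public static void main(String[] args) {
        List<String> inputs = List.of("a", "b");
        List<String> union = UnionSketch.<String, String>unionIfAllAccessible(
                inputs, s -> s, path -> true, s -> List.of(s + "1", s + "2"));
        System.out.println(union);
    }
}
```

Validating all paths before invoking the mapper means no partial result is produced when a single input is missing.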
    • createExecCxtSupplier

      public static Supplier<org.apache.jena.sparql.engine.ExecutionContext> createExecCxtSupplier(org.aksw.jenax.arq.picocli.CmdMixinArq arqConfig)