Class MongoTableProvider

  • All Implemented Interfaces:
    org.apache.spark.sql.connector.catalog.TableProvider, DataSourceRegister

    public final class MongoTableProvider
    extends java.lang.Object
    implements org.apache.spark.sql.connector.catalog.TableProvider, DataSourceRegister
    The MongoDB collection provider.

    Note that TableProvider can only apply data operations to existing tables, such as read, append, delete, and overwrite. It does not support operations that require metadata changes, such as creating or dropping tables.

    The major responsibility of this class is to return a MongoTable for read/write.

    Also registers the short name "mongodb" for use via the services API: spark.read().format("mongodb").load();
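
    A minimal read sketch using this short name (assuming a local deployment and the connector's connection.uri/database/collection read options; the URI, database, and collection values below are placeholders):

        import org.apache.spark.sql.Dataset;
        import org.apache.spark.sql.Row;
        import org.apache.spark.sql.SparkSession;

        public final class MongoShortNameExample {
            public static void main(String[] args) {
                SparkSession spark = SparkSession.builder()
                        .appName("mongo-short-name-example")
                        .master("local[*]")
                        .getOrCreate();

                // "mongodb" is resolved to MongoTableProvider via the services API.
                Dataset<Row> people = spark.read()
                        .format("mongodb")
                        .option("connection.uri", "mongodb://localhost:27017") // placeholder URI
                        .option("database", "test")                            // placeholder database
                        .option("collection", "people")                        // placeholder collection
                        .load();

                people.show();
                spark.stop();
            }
        }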

    • Constructor Summary

      Constructors 
      Constructor Description
      MongoTableProvider()
      Constructs a new instance.
    • Method Summary

      Modifier and Type Method Description
      org.apache.spark.sql.connector.catalog.Table getTable(StructType schema, org.apache.spark.sql.connector.expressions.Transform[] partitioning, java.util.Map<java.lang.String,java.lang.String> properties)
      Return a Table instance with the specified table schema, partitioning and properties to do read/write.
      StructType inferSchema(CaseInsensitiveStringMap options)
      Infer the schema of the table identified by the given options.
      java.lang.String shortName()
      Returns the short name used to register this data source ("mongodb").
      boolean supportsExternalMetadata()
      Returns true if the source can accept external table metadata when getting tables.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
      • Methods inherited from interface org.apache.spark.sql.connector.catalog.TableProvider

        inferPartitioning
    • Constructor Detail

      • MongoTableProvider

        public MongoTableProvider()
        Constructs a new instance.
    • Method Detail

      • inferSchema

        public StructType inferSchema(CaseInsensitiveStringMap options)
        Infer the schema of the table identified by the given options.
        Specified by:
        inferSchema in interface org.apache.spark.sql.connector.catalog.TableProvider
        Parameters:
        options - an immutable case-insensitive string-to-string map that can identify a table, e.g. file path, Kafka topic name, etc.
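
        For illustration: when no schema is passed to the reader, Spark calls inferSchema(options) before planning the scan. A sketch, reusing the placeholder options from the example above:

            // No .schema(...) call, so Spark asks this provider to infer one
            // from the options that identify the collection.
            Dataset<Row> people = spark.read()
                    .format("mongodb")
                    .option("connection.uri", "mongodb://localhost:27017") // placeholder
                    .option("database", "test")
                    .option("collection", "people")
                    .load();

            people.printSchema(); // the StructType returned by inferSchema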
      • getTable

        public org.apache.spark.sql.connector.catalog.Table getTable(StructType schema,
                                                                     org.apache.spark.sql.connector.expressions.Transform[] partitioning,
                                                                     java.util.Map<java.lang.String,java.lang.String> properties)
        Return a Table instance with the specified table schema, partitioning and properties to do read/write. The returned table should report the same schema and partitioning as the specified ones, or Spark may fail the operation.
        Specified by:
        getTable in interface org.apache.spark.sql.connector.catalog.TableProvider
        Parameters:
        schema - The specified table schema.
        partitioning - The specified table partitioning.
        properties - The specified table properties. It's case preserving (contains exactly what users specified) and implementations are free to use it case sensitively or insensitively. It should be able to identify a table, e.g. file path, Kafka topic name, etc.
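
        For illustration: a schema passed to DataFrameReader.schema(...) bypasses inference and is the schema Spark hands to getTable, together with the reader options as properties. The column names below are hypothetical:

            import org.apache.spark.sql.types.DataTypes;
            import org.apache.spark.sql.types.StructType;

            StructType schema = new StructType()
                    .add("name", DataTypes.StringType)
                    .add("age", DataTypes.IntegerType);

            // The returned table must report this same schema, or Spark may
            // fail the operation.
            Dataset<Row> people = spark.read()
                    .format("mongodb")
                    .schema(schema)
                    .option("connection.uri", "mongodb://localhost:27017") // placeholder
                    .option("database", "test")
                    .option("collection", "people")
                    .load();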
      • supportsExternalMetadata

        public boolean supportsExternalMetadata()
        Returns true if the source can accept external table metadata when getting tables. The external table metadata includes:
        1. For a table reader: the user-specified schema from `DataFrameReader`/`DataStreamReader` and the schema/partitioning stored in the Spark catalog.
        2. For a table writer: the schema of the input `DataFrame` of `DataFrameWriter`/`DataStreamWriter`.
        Specified by:
        supportsExternalMetadata in interface org.apache.spark.sql.connector.catalog.TableProvider
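
        For illustration of the writer side: the schema of the input DataFrame is the external metadata accepted here. A sketch with the same placeholder options:

            // The input DataFrame's schema travels to the provider as external
            // table metadata; append is a plain data operation, so no catalog
            // DDL is involved.
            people.write()
                    .format("mongodb")
                    .mode("append")
                    .option("connection.uri", "mongodb://localhost:27017") // placeholder
                    .option("database", "test")
                    .option("collection", "people")
                    .save();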