Mongo Spark Configurations

Defines helper methods for transforming or updating configurations.
Since: 1.0
Configurations for connecting to a specific collection in a database.
Since: 1.0
The Mongo configuration base trait

Defines companion object helper methods for creating MongoConfig instances.
Since: 1.0
Mongo input configurations

Configurations used when reading from MongoDB:

- readPreference.name: the ReadPreference to use.
- readPreference.tagSets: the ReadPreference TagSets to use.
- readConcern.level: the ReadConcern level to use.

The prefix when using sparkConf is spark.mongodb.input. followed by the property name.
Since: 1.0
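As a sketch of how these input properties are typically supplied through sparkConf (the URI and database/collection values below are placeholders, and the property keys follow the spark.mongodb.input. prefix rule above):

```scala
import org.apache.spark.SparkConf

// Each input property is set under the spark.mongodb.input. prefix,
// followed by the property name described above.
val conf = new SparkConf()
  .setAppName("mongo-input-example")
  .set("spark.mongodb.input.uri", "mongodb://127.0.0.1/test.characters")
  .set("spark.mongodb.input.readPreference.name", "secondaryPreferred")
  .set("spark.mongodb.input.readConcern.level", "majority")
```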
Mongo output configurations

Configurations used when writing data from Spark into MongoDB.

The prefix when using sparkConf is spark.mongodb.output. followed by the property name.
Since: 1.0
The ReadConcern configuration used by the ReadConfig

- the optional read concern level. If None, the server's default level will be used.
Since: 1.0
Read Configuration used when reading data from MongoDB

Parameters:
- the database name
- the collection name
- the optional connection string used in the creation of this configuration
- a positive integer sample size to draw from the collection when inferring the schema
- the class name of the partitioner used to create partitions
- the configuration options for the partitioner
- the local threshold in milliseconds used when choosing among multiple MongoDB servers to send a request. Only servers whose ping time is less than or equal to the fastest server's ping time plus the local threshold will be chosen.
- the readPreference configuration
- the readConcern configuration
- the aggregation configuration
- true to register SQL helper functions
- true to detect MapTypes when inferring the schema
- the minimum number of keys before a document can be inferred as a MapType
- true to include and push down null and exists filters into the pipeline when using SQL
- true to push down filters and projections into the pipeline when using SQL
- the size of the pool to take a sample from, used when there is no $sample support or when there is a pushed-down aggregation
- the optional size for the internal batches used within the cursor
Since: 1.0
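A minimal sketch of building a ReadConfig from a plain options map and loading a DataFrame with it. The URI, database, and collection values are placeholder assumptions; parameters left unset (partitioner, batch size, and so on) fall back to their defaults.

```scala
import com.mongodb.spark.MongoSpark
import com.mongodb.spark.config.ReadConfig
import org.apache.spark.sql.SparkSession

// Sketch: construct a ReadConfig from string options and read a DataFrame.
// Only a few of the parameters listed above are set; the rest use defaults.
def loadCharacters(spark: SparkSession) = {
  val readConfig = ReadConfig(Map(
    "uri"        -> "mongodb://127.0.0.1/",
    "database"   -> "test",
    "collection" -> "characters",
    "sampleSize" -> "1000"
  ))
  MongoSpark.load(spark, readConfig)
}
```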
The ReadPreference configuration used by the ReadConfig

Parameters:
- the read preference name
- an optional string of tagSets
Since: 1.0
The WriteConcern configuration used by the WriteConfig

Parameters:
- the optional w integer value
- the optional w string value
- the optional journal value
- the optional timeout value
Since: 1.0
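The write concern values above are usually supplied as string options when building a WriteConfig. The writeConcern.* key names below are assumed from the connector's documented option naming; the URI and collection values are placeholders.

```scala
import com.mongodb.spark.config.WriteConfig

// Sketch: express the w / journal / wTimeout values as write options.
val writeConfig = WriteConfig(Map(
  "uri"                     -> "mongodb://127.0.0.1/",
  "database"                -> "test",
  "collection"              -> "characters",
  "writeConcern.w"          -> "majority",
  "writeConcern.journal"    -> "true",
  "writeConcern.wTimeoutMS" -> "5000"
))
```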
Write Configuration for writes to MongoDB

Parameters:
- the database name
- the collection name
- the optional connection string used in the creation of this configuration
- replaces the whole document when saving a Dataset that contains an _id field; if false, only updates / sets the fields declared in the Dataset
- the maxBatchSize when performing a bulk update/insert. Defaults to 512.
- the local threshold in milliseconds used when choosing among multiple MongoDB servers to send a request. Only servers whose ping time is less than or equal to the fastest server's ping time plus the local threshold will be chosen.
- the write concern configuration
- an optional shardKey in extended JSON form: "{key: 1, key2: 1}". Used when upserting Datasets in sharded clusters.
- if true, forces the writes to be inserts, even if a Dataset contains an _id field. Defaults to false.
- configures the bulk operation ordered property. Defaults to true.
- whether the data contains extended BSON types; any Datasets that contain structs following the extended BSON types will automatically be converted into native BSON types. For example, the following _id field would be converted into an ObjectId: {_id: {oid: "000000000000000000000000"}}
Since: 1.0
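The write parameters above can be sketched as an options map plus a save call. The URI and collection values are placeholders, and only a few parameters are set explicitly.

```scala
import com.mongodb.spark.MongoSpark
import com.mongodb.spark.config.WriteConfig
import org.apache.spark.sql.Dataset

// Sketch: save a Dataset with document replacement disabled, so only the
// fields declared in the Dataset are updated rather than the whole document.
def saveCharacters(ds: Dataset[_]): Unit = {
  val writeConfig = WriteConfig(Map(
    "uri"             -> "mongodb://127.0.0.1/",
    "database"        -> "test",
    "collection"      -> "characters",
    "replaceDocument" -> "false",
    "maxBatchSize"    -> "512"
  ))
  MongoSpark.save(ds, writeConfig)
}
```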
The AggregationConfig companion object.
Since: 2.3
This object was generated by sbt-buildinfo.
The ReadConcernConfig companion object
Since: 1.0
The ReadConfig companion object

Configurations used when reading from MongoDB:

- readPreference.name: the ReadPreference to use.
- readPreference.tagSets: the ReadPreference TagSets to use.
- readConcern.level: the ReadConcern level to use.

The prefix when using sparkConf is spark.mongodb.input. followed by the property name.
Since: 1.0
The ReadPreferenceConfig companion object
Since: 1.0
The WriteConcernConfig companion object
Since: 1.0
The WriteConfig companion object
Since: 1.0
The aggregation configuration

Parameters:
- the optional collation config
- the optional hint document in extended JSON format
- the optional aggregation pipeline, either a list of documents in JSON syntax or a single document in JSON syntax
- enables writing to temporary files
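A sketch of supplying the pipeline and hint described above as read options. The "pipeline" and "hint" key names are assumed from the parameter list; the match stage and index hint shown are illustrative placeholders.

```scala
import com.mongodb.spark.config.ReadConfig

// Sketch: pass an aggregation pipeline (a single document in JSON syntax)
// and a hint document (extended JSON) alongside the collection options.
val readConfig = ReadConfig(Map(
  "uri"        -> "mongodb://127.0.0.1/",
  "database"   -> "test",
  "collection" -> "characters",
  "pipeline"   -> """{ "$match": { "age": { "$gte": 18 } } }""",
  "hint"       -> """{ "age": 1 }"""
))
```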