Base datastore model.
A builder able to create instances of IndexModel.
A model for grouping of topics.
The name field specifies the name of the model, which is used as the unique identifier for the model in the models database.
The topicNameField field specifies the field whose contents will be used as the name of the topic to which the message will be sent when writing to Kafka. The field must be of type string. The original field will be left as-is, so your schema must handle it (or you can use valueFieldsNames).
The topicModelNames field contains the names of the topic models that constitute this grouping of topics.
The topic models that constitute this grouping of topics must:
- consist of at least one topic model
- be all different models
- refer to different topics
- use the same settings for everything but partitions and replicas
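The constraints above can be checked with plain Scala. The following sketch uses a simplified, hypothetical stand-in for the topic model (only the fields relevant to the constraints are included; the real WASP model has more, and topicDataType here stands in for "everything but partitions and replicas"):

```scala
// Hypothetical, simplified stand-in for WASP's topic model.
case class TopicModel(name: String, topic: String, partitions: Int, replicas: Int, topicDataType: String)

// Validate the grouping constraints listed above.
def validateGrouping(models: Seq[TopicModel]): Either[String, Unit] =
  if (models.isEmpty)
    Left("the grouping must consist of at least one topic model")
  else if (models.map(_.name).distinct.size != models.size)
    Left("the topic models must all be different models")
  else if (models.map(_.topic).distinct.size != models.size)
    Left("the topic models must refer to different topics")
  else if (models.map(_.topicDataType).distinct.size != 1)
    Left("the topic models must use the same settings apart from partitions and replicas")
  else
    Right(())
```

The partitions and replicas fields are deliberately ignored by the last check, since they are allowed to differ.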
A model for a pipegraph, a processing pipeline abstraction.
name of the pipegraph
description of the pipegraph
owner of the pipegraph
whether the pipegraph is from the WASP system
time of creation of the pipegraph
components describing processing built on Spark Legacy Streaming
components describing processing built on Spark Structured Streaming
components describing processing built on Akka actors
dashboard of the pipegraph
DataSource class. The fields must be the same as the ones inside the MongoDB document associated with this model.
A named model for data stored as files on a raw datastore (eg HDFS).
The uri is augmented with time information if timed is true. For writers this means whether to use uri as-is or create timed namespaces (eg for HDFS, a subdirectory) inside it; for readers, whether to read from uri as-is or from the most recent timed namespace inside it.
The schema is a json-encoded DataFrame schema, that is, a StructType. See DataType.fromJson and DataType.json.
The options control the underlying Spark DataFrameWriter/Reader in the writers/readers using an instance of this model.
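As an illustration, a schema for two columns (a non-nullable string id and a nullable timestamp) in the JSON form produced by StructType.json, and accepted back by DataType.fromJson, looks roughly like this:

```json
{
  "type": "struct",
  "fields": [
    { "name": "id", "type": "string",    "nullable": false, "metadata": {} },
    { "name": "ts", "type": "timestamp", "nullable": true,  "metadata": {} }
  ]
}
```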
the name of the datastore
the uri where the data files reside
whether the uri must be augmented with time information
the schema of the data
the options for the datastore
Options for a raw datastore.
The saveMode specifies the behaviour when saving if the output uri already exists.
The format specifies the data format to use.
The extraOptions allows specifying any writer-specific options accepted by DataFrameReader/Writer.option.
The partitionBy allows specifying columns used to partition the data, by using different directories for different values.
specifies the behaviour when the output uri exists
specifies the format to use
extra options for the underlying writer
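A minimal sketch of how such options might look as a Scala case class; the field names follow the descriptions above, but the real WASP class may differ in names and types:

```scala
// Hypothetical mirror of the raw datastore options described above.
case class RawOptions(
  saveMode: String,                          // behaviour when the output uri already exists
  format: String,                            // data format to use
  extraOptions: Option[Map[String, String]], // passed through to DataFrameReader/Writer.option
  partitionBy: Option[List[String]]          // columns to partition the output by
)

// Example: append parquet files, partitioned by day, with snappy compression.
val options = RawOptions(
  saveMode = "append",
  format = "parquet",
  extraOptions = Some(Map("compression" -> "snappy")),
  partitionBy = Some(List("day"))
)
```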
A model for a reader, composed of a name, a datastoreModelName defining the datastore, a datastoreProduct defining the datastore software product to use, and any additional options needed to configure the reader.
Class representing a SqlSource model
The name of the SqlSource model
The name of the connection to use. N.B. it must be present in the jdbc-subConfig.
The name of the table
optional - Partition info (column, lowerBound, upperBound)
optional - Number of partitions
optional - Fetch size
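These fields map onto Spark's standard JDBC reader options (partitionColumn, lowerBound, upperBound, numPartitions, fetchsize). The sketch below uses a simplified, hypothetical shape for the model and builds the option map a JDBC DataFrameReader would receive:

```scala
// Simplified stand-ins for the SqlSource model described above.
case class PartitioningInfo(partitionColumn: String, lowerBound: String, upperBound: String)
case class SqlSourceModel(
  name: String,
  connectionName: String,
  dbtable: String,
  partitioningInfo: Option[PartitioningInfo],
  numPartitions: Option[Int],
  fetchSize: Option[Int]
)

// Build the option map for a JDBC read from this model.
def jdbcOptions(m: SqlSourceModel): Map[String, String] = {
  val base = Map("dbtable" -> m.dbtable)
  val part = m.partitioningInfo.map { p =>
    Map(
      "partitionColumn" -> p.partitionColumn,
      "lowerBound"      -> p.lowerBound,
      "upperBound"      -> p.upperBound
    )
  }.getOrElse(Map.empty)
  val num   = m.numPartitions.map(n => Map("numPartitions" -> n.toString)).getOrElse(Map.empty)
  val fetch = m.fetchSize.map(f => Map("fetchsize" -> f.toString)).getOrElse(Map.empty)
  base ++ part ++ num ++ fetch
}
```

Note that all three partitioning values must be provided together for Spark to parallelize the read; that is why they travel as one optional group.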
A streaming processing component that leverages Spark's Structured Streaming API.
unique name of the processing component
group of which the processing component is part
list of inputs for static datasets
streaming output
machine learning models to be used in the processing
strategy model that defines the processing
trigger interval to use, in milliseconds
has no effect at all
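A hypothetical case class mirroring the fields listed above; the real WASP model may differ in field names and types, and is shown here only to make the shape of the component concrete:

```scala
// Hypothetical sketch of the structured-streaming component's fields.
case class StructuredStreamingETLModel(
  name: String,                   // unique name of the processing component
  group: String,                  // group of which the component is part
  staticInputs: List[String],     // inputs for static datasets
  streamingOutput: String,        // streaming output
  mlModels: List[String],         // machine learning models used in the processing
  strategy: Option[String],       // strategy model that defines the processing
  triggerIntervalMs: Option[Long] // trigger interval to use, in milliseconds
)

val etl = StructuredStreamingETLModel(
  name = "enrich-events",
  group = "default",
  staticInputs = List("users-static"),
  streamingOutput = "enriched-events-topic",
  mlModels = Nil,
  strategy = Some("com.example.EnrichStrategy"),
  triggerIntervalMs = Some(5000L)
)
```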
A model for a topic, that is, a message queue of some sort. Right now this means just Kafka topics.
The name field specifies the name of the topic, and doubles as the unique identifier for the model in the models database.
The creationTime marks the time at which the model was generated.
The partitions and replicas fields are used to configure the topic when the framework creates it.
The topicDataType field specifies the format to use when encoding/decoding data to/from messages.
The keyFieldName field allows you to optionally specify a field whose contents will be used as a message key when writing to Kafka. The field must be of type string or binary. The original field will be left as-is, so your schema must handle it (or you can use valueFieldsNames).
The headersFieldName field allows you to optionally specify a field whose contents will be used as message headers when writing to Kafka. The field must contain an array of non-null objects which must have a non-null field headerKey of type string and a field headerValue of type binary. The original field will be left as-is, so your schema must handle it (or you can use valueFieldsNames).
The valueFieldsNames field allows you to specify a list of field names used to filter the fields that get passed to the value encoding; with this you can filter out fields that you don't need in the value, obviating the need to handle them in the schema. This is especially useful when specifying the keyFieldName or headersFieldName. For the avro and json topic data types this field is optional; for the plaintext and binary topic data types it is mandatory, and the list must contain a single value field name with the proper type (string for plaintext, binary for binary).
The schema field contains the schema to use when encoding the value.
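The key/value splitting described above can be illustrated with plain Scala maps. This is a sketch of the semantics only, not WASP's actual encoder; the record is a stand-in for an Avro/JSON record:

```scala
// Given keyFieldName and valueFieldsNames, split a record into key and value parts.
// When valueFieldsNames is given, only those fields reach the value encoding;
// otherwise the whole record (key field included) goes into the value.
def splitRecord(
    record: Map[String, Any],
    keyFieldName: Option[String],
    valueFieldsNames: Option[Seq[String]]
): (Option[Any], Map[String, Any]) = {
  val key = keyFieldName.flatMap(record.get)
  val value = valueFieldsNames match {
    case Some(names) => record.filter { case (k, _) => names.contains(k) }
    case None        => record
  }
  (key, value)
}

val record: Map[String, Any] = Map(
  "id"      -> "42",    // used as the Kafka message key
  "payload" -> "hello",
  "ts"      -> 1700000000L
)

val (key, value) = splitRecord(record, Some("id"), Some(Seq("payload", "ts")))
```

With valueFieldsNames set, the id field is excluded from the value, so the value schema never needs to mention it.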
A model for a writer, composed of a name, a datastoreModelName defining the datastore, a datastoreProduct defining the datastore software product to use, and any additional options needed to configure the writer.
(Since version 2.8.0)
Companion object of IndexModelBuilder, contains the syntax.
import IndexModelBuilder._ when you want to construct an IndexModel.
A builder able to create instances of IndexModel.
The current Stage of the builder.
The kind of DataStore whose index is being built.
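The "current Stage" idea is the staged (phantom-type) builder technique: a type parameter tracks which pieces have been provided, so building an incomplete model fails at compile time rather than at runtime. The following is a generic sketch of the technique, not IndexModelBuilder's actual syntax:

```scala
// Phantom types tracking the builder's stage.
sealed trait Stage
sealed trait Empty extends Stage
sealed trait Named extends Stage

final case class IndexBuilder[S <: Stage](name: Option[String], fields: List[String]) {
  // Providing a name advances the builder to the Named stage.
  def named(n: String): IndexBuilder[Named] = IndexBuilder[Named](Some(n), fields)
  def field(f: String): IndexBuilder[S]     = copy(fields = fields :+ f)
}

object IndexBuilder {
  def start: IndexBuilder[Empty] = IndexBuilder[Empty](None, Nil)

  // build is only available once the builder has reached the Named stage,
  // so an unnamed index cannot be built: it fails to compile, not at runtime.
  implicit class CompleteOps(val b: IndexBuilder[Named]) extends AnyVal {
    def build: (String, List[String]) = (b.name.get, b.fields)
  }
}

val idx = IndexBuilder.start.named("users-index").field("id").field("email").build
```

This is why the companion object says to import IndexModelBuilder._: the staged syntax (here, the implicit CompleteOps) lives there.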