io.smartdatalake.workflow.dataobject
Creates the read schema based on a given write schema. Normally these are the same, but some DataObjects can remove and add columns on read (e.g. KafkaTopicDataObject, SparkFileDataObject). In these cases we have to break the DataFrame lineage and create a dummy DataFrame in the init phase.
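The idea can be sketched as follows. This is a simplified, hypothetical illustration: the `Field`/`Schema` types and the added column names are stand-ins chosen for the example, not the actual SDL or Spark API.

```scala
// Hypothetical, simplified schema types for illustration only.
case class Field(name: String, dataType: String)
case class Schema(fields: Seq[Field]) {
  def add(f: Field): Schema = Schema(fields :+ f)
}

// A Kafka-like DataObject adds technical columns (e.g. key, timestamp) on
// read, so the read schema differs from the write schema; this is why the
// DataFrame lineage must be broken with a dummy DataFrame in the init phase.
def createReadSchema(writeSchema: Schema): Schema =
  writeSchema
    .add(Field("key", "string"))
    .add(Field("timestamp", "timestamp"))

val writeSchema = Schema(Seq(Field("value", "string")))
val readSchema  = createReadSchema(writeSchema)
assert(readSchema.fields.map(_.name) == Seq("value", "key", "timestamp"))
```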
Returns the factory that can parse this type (that is, type CO).
Typically, implementations of this method should return the companion object of the implementing class. The companion object in turn should implement FromConfigFactory.
the factory (object) for this class.
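The companion-object pattern described above can be sketched as below. This is a hedged, self-contained stand-in: the `FromConfigFactory` trait is simplified (the real SDL trait takes a Typesafe `Config` and an implicit `InstanceRegistry`), and `CsvFileDataObject` is a hypothetical class name used only for illustration.

```scala
// Simplified stand-in for SDL's FromConfigFactory; a plain Map replaces
// the real com.typesafe.config.Config for the sake of the example.
trait FromConfigFactory[CO] {
  def fromConfig(config: Map[String, String]): CO
}

case class CsvFileDataObject(id: String, path: String) {
  // Typically returns the companion object, which implements FromConfigFactory.
  def factory: FromConfigFactory[CsvFileDataObject] = CsvFileDataObject
}

// The companion object implements the factory and parses the configuration.
object CsvFileDataObject extends FromConfigFactory[CsvFileDataObject] {
  override def fromConfig(config: Map[String, String]): CsvFileDataObject =
    CsvFileDataObject(config("id"), config("path"))
}

val parsed = CsvFileDataObject.fromConfig(Map("id" -> "csv1", "path" -> "/data/in"))
assert(parsed == CsvFileDataObject("csv1", "/data/in"))
```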
Handle class cast exceptions when getting objects from the instance registry.
SparkSession to use
DataFrame including all Actions in the instanceRegistry, used for exporting the metadata
Configure a housekeeping mode to e.g. clean up, archive and compact partitions. Default is None.
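As an illustrative sketch only, a housekeeping mode is set in the DataObject's configuration; the type name `PartitionRetentionMode` and its `retentionCondition` attribute shown here are assumptions that should be checked against the SDL reference documentation:

```hocon
# Hypothetical sketch, not a verified configuration.
my-data-object {
  type = HiveTableDataObject
  housekeepingMode = {
    type = PartitionRetentionMode
    retentionCondition = "..."
  }
}
```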
A unique identifier for this instance.
Additional metadata for the DataObject.
Exports a util DataFrame that contains properties and metadata extracted from all io.smartdatalake.workflow.action.Actions that are registered in the current InstanceRegistry.
Alternatively, it can export the properties and metadata of all io.smartdatalake.workflow.action.Actions defined in config files. For this, the "config" option has to be set to the location of the configuration.
Example:

  dataObjects = {
    ...
    actions-exporter {
      type = ActionsExporterDataObject
      config = path/to/myconfiguration.conf
    }
    ...
  }
The config value can point to a configuration file or a directory containing configuration files.
Refer to ConfigLoader.loadConfigFromFilesystem() for details about the configuration loading.