io.smartdatalake.workflow.dataobject
partition layout defines how partition values can be extracted from the path. Use "%<colname>%" as token to extract the value for a partition column. With "%<colname:regex>%" a regex can be given to limit search. This is especially useful if there is no char to delimit the last token from the rest of the path or also between two tokens.
Overwrite or Append new data.
create empty partition
create empty partition
Create empty partitions for partition values not yet existing
Create empty partitions for partition values not yet existing
Delete all data.
Delete all data. This is used to implement SaveMode.Overwrite.
Delete given files.
Delete given files. This is used to cleanup files after they are processed.
Delete given partitions.
Delete given partitions. This is used to cleanup partitions after they are processed.
Extract partition values from a given file path
Extract partition values from a given file path
Returns the factory that can parse this type (that is, type CO
).
Returns the factory that can parse this type (that is, type CO
).
Typically, implementations of this method should return the companion object of the implementing class. The companion object in turn should implement FromConfigFactory.
the factory (object) for this class.
Definition of fileName.
Definition of fileName. Default is an asterix to match everything. This is concatenated with the partition layout to search for files.
Handle class cast exception when getting objects from instance registry
Handle class cast exception when getting objects from instance registry
List files for given partition values
List files for given partition values
List of partition values to be filtered. If empty all files in root path of DataObject will be listed.
List of FileRefs
get partition values formatted by partition layout
get partition values formatted by partition layout
prepare paths to be searched
prepare paths to be searched
A unique identifier for this instance.
A unique identifier for this instance.
List partitions on data object's root path
List partitions on data object's root path
Additional metadata for the DataObject
Additional metadata for the DataObject
partition layout defines how partition values can be extracted from the path.
partition layout defines how partition values can be extracted from the path. Use "%<colname>%" as token to extract the value for a partition column. With "%<colname:regex>%" a regex can be given to limit search. This is especially useful if there is no char to delimit the last token from the rest of the path or also between two tokens.
Definition of partition columns
Definition of partition columns
The root path of the files that are handled by this DataObject.
The root path of the files that are handled by this DataObject.
Runs operations after reading from DataObject
Runs operations after reading from DataObject
Runs operations after writing to DataObject
Runs operations after writing to DataObject
Runs operations before reading from DataObject
Runs operations before reading from DataObject
Runs operations before writing to DataObject
Runs operations before writing to DataObject
Prepare & test DataObject's prerequisits
Prepare & test DataObject's prerequisits
This runs during the "prepare" operation of the DAG.
Overwrite or Append new data.
Overwrite or Append new data.
default separator for paths
default separator for paths
Given some FileRefs for another DataObject, translate the paths to the root path of this DataObject
Given some FileRefs for another DataObject, translate the paths to the root path of this DataObject
Connects to SFtp files Needs java library "com.hieronymus % sshj % 0.21.1" The following authentication mechanisms are supported -> public/private-key: private key must be saved in ~/.ssh, public key must be registered on server. -> user/pwd authentication: user and password is taken from two variables set as parameters. These variables could come from clear text (CLEAR), a file (FILE) or an environment variable (ENV)
partition layout defines how partition values can be extracted from the path. Use "%<colname>%" as token to extract the value for a partition column. With "%<colname:regex>%" a regex can be given to limit search. This is especially useful if there is no char to delimit the last token from the rest of the path or also between two tokens.
Overwrite or Append new data.