io.smartdatalake.workflow.dataobject
list of partitions with list of possible values for every entry
definition of partitions in query string. Use %<partitionColName>% as placeholder for partition column value in layout.
Same as getResponse, but returns response as InputStream
It should be possible to define the partition to read as a query string, but this is not yet implemented.
is ignored for webservices
implicit Spark session
OutputStream that writes to the webservice once it is closed
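The "write on close" behaviour can be pictured with a small buffer-then-send stream. This is only a sketch under assumptions: `SendOnCloseOutputStream` and its `send` callback are hypothetical names, not part of the library, and the callback stands in for the actual webservice call.

```scala
import java.io.ByteArrayOutputStream

// Hypothetical helper: buffers all written bytes and hands them to `send`
// exactly once, when the stream is closed.
class SendOnCloseOutputStream(send: Array[Byte] => Unit) extends ByteArrayOutputStream {
  private var closed = false

  override def close(): Unit = {
    if (!closed) {
      closed = true
      send(toByteArray) // stand-in for the call that posts the data to the webservice
    }
    super.close()
  }
}

object SendOnCloseDemo {
  // Records what was "sent" so the behaviour can be inspected without a network.
  def run(): (Int, String) = {
    var sent = Vector.empty[Array[Byte]]
    val out = new SendOnCloseOutputStream(bytes => sent = sent :+ bytes)
    out.write("hello".getBytes("UTF-8")) // nothing is sent yet
    out.close()                          // the buffered bytes are sent here
    out.close()                          // a second close does not send again
    (sent.size, new String(sent.head, "UTF-8"))
  }
}
```

Buffering the whole body keeps the sketch simple; a streaming upload would need a different design.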
Delete all data. This is used to implement SaveMode.Overwrite.
Delete given files. This is used to cleanup files after they are processed.
Definition of partitions that are expected to exist. This is used to validate that partitions being read exist and actually return data. Define a Spark SQL expression that is evaluated against a PartitionValues instance and returns true or false. Example: "elements['yourColName'] > 2017"
true if partition is expected to exist.
Extract partition values from a given file path
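One way to picture the extraction is to turn a layout with `%name%` placeholders into a regular expression and match it against the path. The sketch below is an assumption about how such a step could look, not the library's implementation; `PartitionExtractor`, the placeholder pattern, and the choice of `&` and `/` as value delimiters are all hypothetical.

```scala
import java.util.regex.Pattern
import scala.util.matching.Regex

object PartitionExtractor {
  // Assumed placeholder syntax: %columnName%
  private val placeholder: Regex = "%([A-Za-z0-9_]+)%".r

  // Returns the partition values found in `path`, or None if the layout does not match.
  def extract(layout: String, path: String): Option[Map[String, String]] = {
    val names = placeholder.findAllMatchIn(layout).map(_.group(1)).toList
    val sb = new StringBuilder
    var last = 0
    for (m <- placeholder.findAllMatchIn(layout)) {
      sb.append(Pattern.quote(layout.substring(last, m.start))) // literal part of the layout
      sb.append("([^&/]+)")                                     // one capture group per placeholder
      last = m.end
    }
    sb.append(Pattern.quote(layout.substring(last)))
    val matcher = Pattern.compile(sb.toString).matcher(path)
    if (matcher.find())
      Some(names.zipWithIndex.map { case (n, i) => n -> matcher.group(i + 1) }.toMap)
    else None
  }
}
```

For example, `PartitionExtractor.extract("year=%year%&month=%month%", "https://example.org/data?year=2020&month=03")` yields the values for `year` and `month`, while a path that does not contain the layout yields `None`.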
Returns the factory that can parse this type (that is, type CO).
Typically, implementations of this method should return the companion object of the implementing class. The companion object in turn should implement FromConfigFactory.
the factory (object) for this class.
Definition of fileName. Default is an asterisk to match everything. This is concatenated with the partition layout to search for files.
Get Access Token through Keycloak
Handle class cast exception when getting objects from instance registry
For WebserviceFileDataObject, every partition is mapped to one FileRef
List of partition values to be filtered. If empty, all files in the root path of the DataObject will be listed.
List of FileRefs
get partition values formatted by partition layout
Method for subclasses to override the base path for this DataObject. This is needed, for instance, if pathPrefix is defined in a connection.
Calls the webservice and returns the response as a string. Two call methods are supported: with client-id / client-secret, the call is made with a Bearer token, including automatic refresh of the token if necessary; otherwise a normal call is made with an optional custom header and user/password.
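The choice between these call methods can be sketched as a mapping from an authentication mode to HTTP request headers. Everything named here (`AuthMode`, `BearerToken`, `BasicAuth`, `CustomHeader`, `NoAuth`, `AuthHeaders`) is hypothetical and only illustrates the idea; obtaining and refreshing the token is deliberately left out.

```scala
import java.nio.charset.StandardCharsets
import java.util.Base64

// Hypothetical model of the authentication options; not the library's own types.
sealed trait AuthMode
final case class BearerToken(token: String) extends AuthMode
final case class BasicAuth(user: String, password: String) extends AuthMode
final case class CustomHeader(name: String, value: String) extends AuthMode
case object NoAuth extends AuthMode

object AuthHeaders {
  // Builds the request headers that correspond to the chosen authentication mode.
  def headersFor(auth: AuthMode): Map[String, String] = auth match {
    case BearerToken(token) =>
      Map("Authorization" -> s"Bearer $token")
    case BasicAuth(user, password) =>
      val raw = s"$user:$password".getBytes(StandardCharsets.UTF_8)
      Map("Authorization" -> s"Basic ${Base64.getEncoder.encodeToString(raw)}")
    case CustomHeader(name, value) =>
      Map(name -> value)
    case NoAuth =>
      Map.empty
  }
}
```

Modelling the modes as a sealed trait lets the compiler check that every mode is handled, which is one reasonable design for mutually exclusive options.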
Response as Array[Byte]
prepare paths to be searched
A unique identifier for this instance.
Initializes the webservice according to given parameters
URL to call
Connection timeout in milliseconds
Read timeout in milliseconds
custom authentication header
client-id for OAuth2
client-secret for OAuth2
user for basic authentication
password for basic authentication
token for direct call with token
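As an illustration of how the URL and the two timeouts might be applied, the following sketch configures a plain `java.net.HttpURLConnection`. The `WebserviceParams` case class and `ConnectionSetup` object are hypothetical names introduced for this example; the library may use a different HTTP client.

```scala
import java.net.{HttpURLConnection, URI}

// Hypothetical container mirroring some of the parameters listed above.
final case class WebserviceParams(
  url: String,
  connectionTimeoutMs: Int,
  readTimeoutMs: Int,
  authHeader: Option[(String, String)] = None
)

object ConnectionSetup {
  // Creates the connection object and applies the settings.
  // No network traffic happens until the connection is actually used.
  def configure(p: WebserviceParams): HttpURLConnection = {
    val conn = URI.create(p.url).toURL.openConnection().asInstanceOf[HttpURLConnection]
    conn.setConnectTimeout(p.connectionTimeoutMs) // milliseconds
    conn.setReadTimeout(p.readTimeoutMs)          // milliseconds
    p.authHeader.foreach { case (name, value) => conn.setRequestProperty(name, value) }
    conn
  }
}
```

A call such as `ConnectionSetup.configure(WebserviceParams("http://localhost:8080/data", 1000, 5000))` returns a connection whose connect and read timeouts are set but which has not yet contacted the server.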
List partition values defined for this web service. Note that this is a fixed list.
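Because the possible values are fixed in configuration, the partitions can be enumerated without calling the webservice. The sketch below assumes that every combination of the configured values counts as a partition; that reading, and the names `PartitionDef` and `PartitionEnumerator`, are assumptions made for illustration only.

```scala
// Hypothetical stand-in for a partition definition: a column name and its allowed values.
final case class PartitionDef(name: String, values: Seq[String])

object PartitionEnumerator {
  // Builds every combination of values across all partition columns (cartesian product).
  def combinations(defs: Seq[PartitionDef]): Seq[Map[String, String]] =
    defs.foldLeft(Seq(Map.empty[String, String])) { (acc, d) =>
      for (partial <- acc; v <- d.values) yield partial + (d.name -> v)
    }
}
```

With two columns of two values each, this produces four combinations. With no definitions at all it produces a single empty combination, which a caller may want to treat as "unpartitioned".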
Additional metadata for the DataObject
list of partitions with list of possible values for every entry
definition of partitions in query string. Use %<partitionColName>% as placeholder for partition column value in layout.
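A minimal sketch of the placeholder substitution, assuming the layout is a plain string and each `%<partitionColName>%` is replaced by the URL-encoded partition value. The `PartitionLayout` object and its `render` method are hypothetical and not part of the library.

```scala
import java.net.URLEncoder

object PartitionLayout {
  // Replaces each %name% placeholder in the layout with the matching partition value.
  // Values are URL-encoded so they are safe to use inside a query string.
  def render(layout: String, values: Map[String, String]): String =
    values.foldLeft(layout) { case (acc, (name, value)) =>
      acc.replace(s"%${name}%", URLEncoder.encode(value, "UTF-8"))
    }
}
```

For instance, rendering the layout `?year=%year%&month=%month%` with `year` set to `2020` and `month` set to `03` gives `?year=2020&month=03`. Placeholders without a matching value are left untouched in this sketch.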
Definition of partition columns
No root path needed for Webservice. It can be included in webserviceOptions.url.
Calls webservice POST method with binary data as body
URL to call
POST body as byte array; the content type is determined by Tika
Prepare & test DataObject's prerequisites
This runs during the "prepare" operation of the DAG.
The initialized webservice
Overwrite or Append new data. When writing partitioned data, this applies only to partitions concerned.
default separator for paths
Given some FileRefs for another DataObject, translate the paths to the root path of this DataObject
DataObject to call a webservice and return the response as an InputStream. This is implemented as a FileRefDataObject because the response is treated as file content. FileRefDataObjects support partitioned data. For a WebserviceFileDataObject, partitions are mapped to query parameters to build the query string. All possible query parameter values must be given in the configuration.