Table configuration used for RDBM extraction
The name of the table
Optionally, the primary key columns for this table (not needed if the RDBMExtractor implementation is capable of obtaining this information itself)
Optionally, the last updated column for this table (not needed if the RDBMExtractor implementation is capable of obtaining this information itself)
Optionally, the maximum number of rows to read per Dataset partition for this table. This number is used to generate predicates that are passed to org.apache.spark.sql.SparkSession.read.jdbc. If it is not set, the DataFrame will have only one partition, which could cause memory issues when extracting large tables. Be careful not to create too many partitions in parallel on a large cluster; otherwise Spark might overwhelm your external database system. You can also limit the maximum number of JDBC connections opened by limiting the number of executors for your application.
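The way the maximum-rows setting turns into partitions can be sketched as follows. This is a minimal, hypothetical illustration (the object and method names are illustrative, not part of the actual API), assuming a numeric primary key whose values are roughly dense: the key range is split into chunks of at most maxRowsPerPartition keys, and each chunk becomes one predicate.

```scala
// Hypothetical sketch: generate one SQL predicate per contiguous primary key
// range, each covering at most maxRowsPerPartition key values. The resulting
// array would be passed to
// org.apache.spark.sql.SparkSession.read.jdbc(url, table, predicates, props),
// which creates one DataFrame partition (and one JDBC connection) per predicate.
object PredicateSketch {
  def rangePredicates(pkColumn: String,
                      minPk: Long,
                      maxPk: Long,
                      maxRowsPerPartition: Long): Seq[String] =
    (minPk to maxPk by maxRowsPerPartition).map { start =>
      val end = math.min(start + maxRowsPerPartition - 1, maxPk)
      s"$pkColumn BETWEEN $start AND $end"
    }
}
```

For example, a table with keys 1 to 250 and maxRowsPerPartition = 100 yields three predicates, hence three partitions read over three JDBC connections.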