Type Parameters:
K - Key type
V - Value type

public abstract class AbstractBigQueryInputFormat<K,V> extends org.apache.hadoop.mapreduce.InputFormat<K,V> implements DelegateRecordReaderFactory<K,V>
Modifier and Type | Field and Description
---|---
static String | EXTERNAL_TABLE_TYPE: The keyword for the type of a BigQuery table stored externally.
static HadoopConfigurationProperty<Class<?>> | INPUT_FORMAT_CLASS: Configuration key for the InputFormat class name.
Constructor and Description
---
AbstractBigQueryInputFormat()
Modifier and Type | Method and Description
---|---
static void | cleanupJob(BigQueryHelper bigQueryHelper, org.apache.hadoop.conf.Configuration config): Similar to cleanupJob(Configuration, JobID), but allows specifying the Bigquery instance to use.
static void | cleanupJob(org.apache.hadoop.conf.Configuration configuration, org.apache.hadoop.mapreduce.JobID jobId): Cleans up relevant temporary resources associated with a job which used the GsonBigQueryInputFormat; this should be called explicitly after the completion of the entire job.
org.apache.hadoop.mapreduce.RecordReader<K,V> | createRecordReader(org.apache.hadoop.mapreduce.InputSplit inputSplit, org.apache.hadoop.conf.Configuration configuration)
org.apache.hadoop.mapreduce.RecordReader<K,V> | createRecordReader(org.apache.hadoop.mapreduce.InputSplit inputSplit, org.apache.hadoop.mapreduce.TaskAttemptContext taskAttemptContext)
protected com.google.api.services.bigquery.Bigquery | getBigQuery(org.apache.hadoop.conf.Configuration config): Helper method to override for testing.
protected BigQueryHelper | getBigQueryHelper(org.apache.hadoop.conf.Configuration config): Helper method to override for testing.
abstract ExportFileFormat | getExportFileFormat(): Get the ExportFileFormat that this input format supports.
protected static ExportFileFormat | getExportFileFormat(Class<? extends AbstractBigQueryInputFormat<?,?>> clazz)
protected static ExportFileFormat | getExportFileFormat(org.apache.hadoop.conf.Configuration configuration)
List<org.apache.hadoop.mapreduce.InputSplit> | getSplits(org.apache.hadoop.mapreduce.JobContext context)
static void | setInputTable(org.apache.hadoop.conf.Configuration configuration, String projectId, String datasetId, String tableId): Configure the BigQuery input table for a job.
static void | setInputTable(org.apache.hadoop.conf.Configuration configuration, com.google.api.services.bigquery.model.TableReference tableReference): Configure the BigQuery input table for a job.
static void | setTemporaryCloudStorageDirectory(org.apache.hadoop.conf.Configuration configuration, String path): Configure a directory to which we will export BigQuery data.
Methods inherited from class java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface DelegateRecordReaderFactory:
createDelegateRecordReader
public static final HadoopConfigurationProperty<Class<?>> INPUT_FORMAT_CLASS

Configuration key for the InputFormat class name.

public static final String EXTERNAL_TABLE_TYPE

The keyword for the type of a BigQuery table stored externally.
public static void setInputTable(org.apache.hadoop.conf.Configuration configuration, String projectId, String datasetId, String tableId) throws IOException

Configure the BigQuery input table for a job.

Throws:
IOException
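Conceptually, the setInputTable overloads record the table coordinates in the job Configuration so the input format can find the table later. A rough, self-contained sketch of that pattern in plain Java, with a Map standing in for Hadoop's Configuration; the key names here are assumptions for illustration, not the connector's documented constants:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative stand-in (not the connector's code) for how a
// setInputTable-style helper writes table coordinates into configuration.
public class InputTableConfigSketch {
  // Hypothetical key names, chosen for illustration only.
  static final String PROJECT_ID_KEY = "mapred.bq.input.project.id";
  static final String DATASET_ID_KEY = "mapred.bq.input.dataset.id";
  static final String TABLE_ID_KEY = "mapred.bq.input.table.id";

  // A plain Map standing in for org.apache.hadoop.conf.Configuration.
  final Map<String, String> conf = new HashMap<>();

  // Mirrors setInputTable(Configuration, String, String, String):
  // record which BigQuery table the job should read.
  public void setInputTable(String projectId, String datasetId, String tableId) {
    conf.put(PROJECT_ID_KEY, projectId);
    conf.put(DATASET_ID_KEY, datasetId);
    conf.put(TABLE_ID_KEY, tableId);
  }

  public String get(String key) {
    return conf.get(key);
  }
}
```

The TableReference overload would perform the same writes, pulling the three IDs out of the TableReference object instead of separate arguments.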
public static void setInputTable(org.apache.hadoop.conf.Configuration configuration, com.google.api.services.bigquery.model.TableReference tableReference) throws IOException

Configure the BigQuery input table for a job.

Throws:
IOException
public static void setTemporaryCloudStorageDirectory(org.apache.hadoop.conf.Configuration configuration, String path)

Configure a directory to which we will export BigQuery data.
public abstract ExportFileFormat getExportFileFormat()

Get the ExportFileFormat that this input format supports.

protected static ExportFileFormat getExportFileFormat(org.apache.hadoop.conf.Configuration configuration)

protected static ExportFileFormat getExportFileFormat(Class<? extends AbstractBigQueryInputFormat<?,?>> clazz)
public List<org.apache.hadoop.mapreduce.InputSplit> getSplits(org.apache.hadoop.mapreduce.JobContext context) throws IOException, InterruptedException

Specified by:
getSplits in class org.apache.hadoop.mapreduce.InputFormat<K,V>

Throws:
IOException
InterruptedException
public org.apache.hadoop.mapreduce.RecordReader<K,V> createRecordReader(org.apache.hadoop.mapreduce.InputSplit inputSplit, org.apache.hadoop.mapreduce.TaskAttemptContext taskAttemptContext) throws IOException, InterruptedException

Specified by:
createRecordReader in class org.apache.hadoop.mapreduce.InputFormat<K,V>

Throws:
IOException
InterruptedException
public org.apache.hadoop.mapreduce.RecordReader<K,V> createRecordReader(org.apache.hadoop.mapreduce.InputSplit inputSplit, org.apache.hadoop.conf.Configuration configuration) throws IOException, InterruptedException

Throws:
IOException
InterruptedException
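Because AbstractBigQueryInputFormat implements DelegateRecordReaderFactory<K,V>, createRecordReader can hand off to a reader appropriate for the export file format of the concrete subclass. A self-contained analog of that dispatch pattern; the enum values and reader classes below are simplified assumptions, not the connector's actual types:

```java
// Illustrative, self-contained analog (not the connector's code) of how an
// input format can pick a delegate record reader based on the export format.
public class DelegateReaderSketch {
  // Simplified stand-in for the connector's ExportFileFormat enum.
  enum ExportFileFormat { LINE_DELIMITED_JSON, AVRO }

  // Simplified stand-in for a Hadoop RecordReader.
  interface RecordReader {
    String describe();
  }

  static class JsonReader implements RecordReader {
    public String describe() { return "reads newline-delimited JSON"; }
  }

  static class AvroReader implements RecordReader {
    public String describe() { return "reads Avro containers"; }
  }

  // Analogous in spirit to createDelegateRecordReader: choose the reader
  // that matches the format the subclass declared via getExportFileFormat().
  static RecordReader createDelegateRecordReader(ExportFileFormat format) {
    switch (format) {
      case LINE_DELIMITED_JSON:
        return new JsonReader();
      case AVRO:
        return new AvroReader();
      default:
        throw new IllegalArgumentException("Unsupported format: " + format);
    }
  }
}
```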
public static void cleanupJob(org.apache.hadoop.conf.Configuration configuration, org.apache.hadoop.mapreduce.JobID jobId) throws IOException

Cleans up relevant temporary resources associated with a job which used the GsonBigQueryInputFormat; this should be called explicitly after the completion of the entire job.

Throws:
IOException
public static void cleanupJob(BigQueryHelper bigQueryHelper, org.apache.hadoop.conf.Configuration config) throws IOException

Similar to cleanupJob(Configuration, JobID), but allows specifying the Bigquery instance to use.

Parameters:
bigQueryHelper - The Bigquery API-client helper instance to use.
config - The job Configuration object which contains settings such as whether sharded export was enabled, which GCS directory the export was performed in, etc.

Throws:
IOException
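Conceptually, cleanupJob removes the temporary resources the job created, such as the GCS directory the export was performed in. A self-contained sketch of just the directory-cleanup step, using the local filesystem as a stand-in for GCS (the real method works against GCS and the BigQuery API, not java.nio):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Illustrative stand-in (not the connector's code): recursively delete the
// temporary export directory left behind by a completed job.
public class CleanupJobSketch {
  static void cleanupExportDirectory(Path exportDir) throws IOException {
    if (!Files.exists(exportDir)) {
      return; // Nothing to clean up.
    }
    // Walk the tree and delete children before parents (reverse depth order).
    try (Stream<Path> walk = Files.walk(exportDir)) {
      walk.sorted(Comparator.reverseOrder())
          .forEach(p -> {
            try {
              Files.delete(p);
            } catch (IOException e) {
              throw new RuntimeException(e);
            }
          });
    }
  }
}
```

As the class description notes, this cleanup should be invoked explicitly after the entire job completes, typically in a finally block around job execution, so temporary exports are removed even when the job fails.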
protected com.google.api.services.bigquery.Bigquery getBigQuery(org.apache.hadoop.conf.Configuration config) throws GeneralSecurityException, IOException

Helper method to override for testing.

Throws:
IOException - on IO Error.
GeneralSecurityException - on security exception.

protected BigQueryHelper getBigQueryHelper(org.apache.hadoop.conf.Configuration config) throws GeneralSecurityException, IOException

Helper method to override for testing.

Throws:
GeneralSecurityException
IOException
Copyright © 2020. All rights reserved.