public class BigQueryHelper extends Object
Modifier and Type | Field and Description |
---|---|
static int | BIGQUERY_JOB_ID_MAX_LENGTH |
static String | BIGQUERY_JOB_ID_PATTERN |
Constructor and Description |
---|
BigQueryHelper(com.google.api.services.bigquery.Bigquery service) |
Modifier and Type | Method and Description |
---|---|
void | checkJobIdEquality(com.google.api.services.bigquery.model.Job expected, com.google.api.services.bigquery.model.Job actual) - Helper to check for a non-null Job.getJobReference().getJobId() and equality of the getJobId() between expected and actual, using Preconditions.checkState. |
com.google.api.services.bigquery.model.JobReference | createJobReference(String projectId, String jobIdPrefix, String location) - Creates a new JobReference with a unique jobId generated from jobIdPrefix plus a randomly generated UUID String. |
void | exportBigQueryToGcs(String projectId, com.google.api.services.bigquery.model.TableReference tableRef, List<String> gcsPaths, boolean awaitCompletion) - Exports a BigQuery table to GCS, polling for completion before returning if awaitCompletion is true. |
com.google.api.services.bigquery.Bigquery | getRawBigquery() - Returns the underlying Bigquery instance used for communicating with the BigQuery API. |
com.google.api.services.bigquery.model.Table | getTable(com.google.api.services.bigquery.model.TableReference tableRef) - Gets the specified table resource by table ID. |
void | importFederatedFromGcs(String projectId, com.google.api.services.bigquery.model.TableReference tableRef, com.google.api.services.bigquery.model.TableSchema schema, BigQueryFileFormat sourceFormat, List<String> gcsPaths) - Performs a federated import on data from GCS into BigQuery via a table insert. |
void | importFromGcs(String projectId, com.google.api.services.bigquery.model.TableReference tableRef, com.google.api.services.bigquery.model.TableSchema schema, com.google.api.services.bigquery.model.TimePartitioning timePartitioning, String kmsKeyName, BigQueryFileFormat sourceFormat, String createDisposition, String writeDisposition, List<String> gcsPaths, boolean awaitCompletion) - Imports data from GCS into BigQuery via a load job. |
com.google.api.services.bigquery.model.Job | insertJobOrFetchDuplicate(String projectId, com.google.api.services.bigquery.model.Job job) - Tries to run jobs().insert(...) with the provided projectId and job; under normal operation the call returns a Job, which this method then returns. |
boolean | tableExists(com.google.api.services.bigquery.model.TableReference tableRef) - Returns true if the table exists, or false if not. |
public static final String BIGQUERY_JOB_ID_PATTERN
public static final int BIGQUERY_JOB_ID_MAX_LENGTH
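As an illustration of how these two constants might relate to createJobReference, here is a minimal sketch that builds a job ID from a prefix plus a random UUID and checks it against a pattern and a maximum length. The class JobIdSketch, the "-" separator, and both constant values are assumptions made for this example: the values follow BigQuery's published job ID rules (letters, digits, underscores and dashes, at most 1,024 characters) and are not read from BigQueryHelper.

```java
import java.util.UUID;
import java.util.regex.Pattern;

/** Illustrative only; this is not the BigQueryHelper implementation. */
final class JobIdSketch {
  // Assumed values based on BigQuery's public job ID rules, not on this class.
  static final Pattern ASSUMED_JOB_ID_PATTERN = Pattern.compile("[a-zA-Z0-9_-]+");
  static final int ASSUMED_JOB_ID_MAX_LENGTH = 1024;

  /** Builds "<prefix>-<uuid>" and rejects IDs that break the assumed rules. */
  static String newJobId(String jobIdPrefix) {
    // The "-" separator is a choice made for this sketch.
    String jobId = jobIdPrefix + "-" + UUID.randomUUID();
    if (jobId.length() > ASSUMED_JOB_ID_MAX_LENGTH) {
      throw new IllegalArgumentException("Job ID too long: " + jobId.length());
    }
    if (!ASSUMED_JOB_ID_PATTERN.matcher(jobId).matches()) {
      throw new IllegalArgumentException("Job ID has invalid characters: " + jobId);
    }
    return jobId;
  }
}
```

Calling JobIdSketch.newJobId("export_job") twice should yield two different IDs that share the same prefix, which is what makes a retried submission distinguishable from a new one.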
public BigQueryHelper(com.google.api.services.bigquery.Bigquery service)
public com.google.api.services.bigquery.Bigquery getRawBigquery()
public void importFederatedFromGcs(String projectId, com.google.api.services.bigquery.model.TableReference tableRef, @Nullable com.google.api.services.bigquery.model.TableSchema schema, BigQueryFileFormat sourceFormat, List<String> gcsPaths) throws IOException
Parameters:
projectId - the project on whose behalf to perform the load.
tableRef - the reference to the destination table.
schema - the schema of the source data, used to populate the destination table.
sourceFormat - the file format of the source data.
gcsPaths - the location of the source data in GCS.
Throws:
IOException
public void importFromGcs(String projectId, com.google.api.services.bigquery.model.TableReference tableRef, @Nullable com.google.api.services.bigquery.model.TableSchema schema, @Nullable com.google.api.services.bigquery.model.TimePartitioning timePartitioning, @Nullable String kmsKeyName, BigQueryFileFormat sourceFormat, String createDisposition, String writeDisposition, List<String> gcsPaths, boolean awaitCompletion) throws IOException, InterruptedException
Parameters:
projectId - the project on whose behalf to perform the load.
tableRef - the reference to the destination table.
schema - the schema of the source data, used to populate the destination table.
timePartitioning - the time partitioning to apply to the destination table.
kmsKeyName - the Cloud KMS encryption key used to protect the output table.
sourceFormat - the file format of the source data.
createDisposition - the create disposition of the output table.
writeDisposition - the write disposition of the output table.
gcsPaths - the location of the source data in GCS.
awaitCompletion - if true, block and poll until the job completes; otherwise return as soon as the job has been successfully dispatched.
Throws:
IOException
InterruptedException - if interrupted while waiting for job completion.
public void exportBigQueryToGcs(String projectId, com.google.api.services.bigquery.model.TableReference tableRef, List<String> gcsPaths, boolean awaitCompletion) throws IOException, InterruptedException
Parameters:
projectId - the project on whose behalf to perform the export.
tableRef - the table to export.
gcsPaths - the GCS paths to export to.
awaitCompletion - if true, block and poll until the job completes; otherwise return as soon as the job has been successfully dispatched.
Throws:
IOException - on IO error.
InterruptedException - on interrupt.
public boolean tableExists(com.google.api.services.bigquery.model.TableReference tableRef) throws IOException
Throws:
IOException
public com.google.api.services.bigquery.model.Table getTable(com.google.api.services.bigquery.model.TableReference tableRef) throws IOException
Parameters:
tableRef - the BigQuery table reference.
Throws:
IOException
public com.google.api.services.bigquery.model.JobReference createJobReference(String projectId, String jobIdPrefix, @Nullable String location)
Creates a new JobReference with a unique jobId generated from jobIdPrefix plus a randomly generated UUID String.
public void checkJobIdEquality(com.google.api.services.bigquery.model.Job expected, com.google.api.services.bigquery.model.Job actual)
Helper to check for a non-null Job.getJobReference().getJobId() and equality of the getJobId() between expected and actual, using Preconditions.checkState.
public com.google.api.services.bigquery.model.Job insertJobOrFetchDuplicate(String projectId, com.google.api.services.bigquery.model.Job job) throws IOException
Tries to run jobs().insert(...) with the provided projectId and job; under normal operation the call returns a Job, which this method then returns. If an exception is thrown and its cause was "409 conflict", a separate jobs().get(...) request is issued and the result of that fetch is returned instead. Other exceptions propagate out as normal.
Throws:
IOException
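The insert-then-fetch-on-conflict behavior described for insertJobOrFetchDuplicate can be sketched without the BigQuery client. Everything below is hypothetical and exists only for illustration: the JobsApi interface stands in for the jobs().insert(...) and jobs().get(...) calls, HttpStatusException stands in for whatever exception carries the HTTP status, and jobs are reduced to plain strings. It is not the library's code.

```java
import java.io.IOException;

/** Illustrative only; this is not the BigQueryHelper implementation. */
final class InsertOrFetchSketch {
  /** Hypothetical stand-in for the jobs().insert(...) and jobs().get(...) calls. */
  interface JobsApi {
    String insert(String projectId, String jobId) throws IOException;

    String get(String projectId, String jobId) throws IOException;
  }

  /** Hypothetical exception that carries an HTTP status code. */
  static final class HttpStatusException extends IOException {
    final int statusCode;

    HttpStatusException(int statusCode, String message) {
      super(message);
      this.statusCode = statusCode;
    }
  }

  /** Inserts the job; on a 409 conflict, fetches the already-existing job instead. */
  static String insertOrFetchDuplicate(JobsApi api, String projectId, String jobId)
      throws IOException {
    try {
      return api.insert(projectId, jobId);
    } catch (HttpStatusException e) {
      if (e.statusCode == 409) {
        // A job with this ID already exists, so return it rather than failing.
        return api.get(projectId, jobId);
      }
      // Any other status propagates unchanged.
      throw e;
    }
  }
}
```

The fallback only makes sense when the caller picks the job ID up front, as createJobReference does: a retried insert with the same ID then resolves to the job that the first attempt created instead of starting a second one.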
Copyright © 2020. All rights reserved.