Trait

com.coxautodata.waimak.storage

AuditTable

Related Doc: package storage

trait AuditTable extends AnyRef

Main abstraction for an audit table, which a client application must use to store records with a timestamp. It hides all details of the physical storage, so that client apps can use various file systems (e.g. HDFS, ADLS, S3, local) or key-value stores (e.g. HBase).

This abstraction can also produce a snapshot of the data, de-duplicated on the primary key and accurate as of a specified moment in time.

It also surfaces custom attributes initialised during table creation, so that client applications do not need to store the relevant metadata in separate storage. This also simplifies backup, restore and sharing of data between environments.

Some storage layers can be quite inefficient at storing many appends across multiple files, and storage optimisation (compaction) should not interfere with the normal operation of the application. The application should therefore be able to control when compaction takes place.

An instance of AuditTable represents a functional state: once data has been modified through an instance, do not use that instance again.

There are 2 types of operations on the table:

  1. data extraction - operations that do not modify the state of the table, so the same instance of AuditTable can be used for multiple data extraction operations;
  2. data mutators - adding data to the table and optimising storage. These lead to a new state of the underlying storage, and the same instance of AuditTable cannot be used for data mutators again.
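
The functional-state rule above can be illustrated with a minimal sketch. It does not use the real AuditTable or Spark: ToyTable, its in-memory records and FunctionalStateDemo are hypothetical stand-ins, and only the shape of the return types (a Try carrying the new state) follows the signatures documented on this page.

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical, simplified stand-in for AuditTable: records are plain strings
// held in memory, with no Spark and no storage layer.
final class ToyTable private (val records: Vector[String]) {
  private var used = false

  // Lets exactly one mutator succeed per instance; later calls fail.
  private def mutateOnce[A](next: => A): Try[A] =
    if (used) Failure(new IllegalStateException("instance already used by a mutator"))
    else { used = true; Success(next) }

  // Data extraction: never changes state, safe to call repeatedly.
  def all: Vector[String] = records

  // Data mutator: returns the new state and the number of appended records.
  def append(rows: Seq[String]): Try[(ToyTable, Long)] =
    mutateOnce((new ToyTable(records ++ rows), rows.size.toLong))

  // Data mutator: returns a new state holding the same records.
  def compact(): Try[ToyTable] = mutateOnce(new ToyTable(records))
}

object ToyTable {
  def empty: ToyTable = new ToyTable(Vector.empty)
}

object FunctionalStateDemo {
  val t0: ToyTable = ToyTable.empty

  // Thread each new state into the next mutator; the caller decides when to compact.
  val finalState: Try[ToyTable] = for {
    appended  <- t0.append(Seq("a", "b"))
    compacted <- appended._1.compact()
  } yield compacted

  // t0 was already used by a mutator above, so this second call fails.
  val reuse: Try[(ToyTable, Long)] = t0.append(Seq("c"))
}
```

In this sketch the stale instance t0 can still serve data extraction (t0.all), while any further mutator call on it yields a Failure.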

Created by Alexei Perelighin on 2018/03/03

Linear Supertypes
AnyRef, Any

Abstract Value Members

  1. abstract def allBetween(from: Option[Timestamp], to: Option[Timestamp]): Option[Dataset[_]]

    Returns all records between the given timestamps.

    returns

    None if there is no data in the storage layer

  2. abstract def append(ds: Dataset[_], lastUpdated: Column, appendTS: Timestamp): Try[(AuditTable, Long)]

    Appends a new set of records to the audit table.

    Fails when called a second time on the same instance.

    ds

    records to append

    lastUpdated

    column that yields the java.sql.Timestamp used for de-duplication on the primary keys

    appendTS

    timestamp of when the append happened. It is not used for de-duplication

    returns

    (new state of the AuditTable, count of appended records) or error

  3. abstract def compact(compactTS: Timestamp): Try[AuditTable]

    Requests optimisation of the storage layer.

    Fails when called a second time on the same instance.

    compactTS

    timestamp of when the compaction is requested; it is not used for any filtering of the data

    returns

    new state of the AuditTable

  4. abstract def getLatestTimestamp(): Option[Timestamp]

    Returns the latest timestamp of the records stored in the audit table.

  5. abstract def initNewTable(): Try[AuditTable]

    Initializes the audit table in the storage layer. It also persists all of the metadata (name, primary keys, custom meta) to the storage layer.

    returns

    new state of the table or error

  6. abstract def meta: Map[String, String]

    Custom attributes assigned by the client application during table creation.

  7. abstract def regions: Seq[AuditTableRegionInfo]

  8. abstract def snapshot(ts: Timestamp): Option[Dataset[_]]

    Generates a snapshot that contains only the latest records as of the given timestamp. De-duplication happens on the primary keys.

    ts

    use records that are closest to this timestamp

    returns

    None if there is no data in the storage layer

  9. abstract def tableName: String

    Name of the table.
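
The read semantics of allBetween and snapshot described above can be sketched on plain Scala collections. This is an illustration only, not the library's implementation: Rec and ReadSketch are hypothetical names, the records live in a Seq rather than a Dataset, and the treatment of bounds (a missing bound is unbounded, both bounds inclusive, snapshot keeps records not newer than ts) is an assumption that this page does not confirm.

```scala
import java.sql.Timestamp

// Hypothetical record shape: `id` stands in for the primary key and
// `lastUpdated` for the column passed to append.
final case class Rec(id: Int, value: String, lastUpdated: Timestamp)

object ReadSketch {
  // Assumption: a missing bound means "unbounded" and both bounds are inclusive.
  private def inRange(ts: Timestamp, from: Option[Timestamp], to: Option[Timestamp]): Boolean =
    from.forall(f => ts.getTime >= f.getTime) && to.forall(t => ts.getTime <= t.getTime)

  // All records whose timestamp falls within the bounds; None when nothing is stored.
  def allBetween(records: Seq[Rec], from: Option[Timestamp], to: Option[Timestamp]): Option[Seq[Rec]] =
    if (records.isEmpty) None
    else Some(records.filter(r => inRange(r.lastUpdated, from, to)))

  // Per primary key, the most recent record that is not newer than `ts`;
  // None when nothing is stored. Results are sorted by id for stable output.
  def snapshot(records: Seq[Rec], ts: Timestamp): Option[Seq[Rec]] =
    if (records.isEmpty) None
    else Some(
      records
        .filter(r => r.lastUpdated.getTime <= ts.getTime)
        .groupBy(_.id)
        .values
        .map(_.maxBy(_.lastUpdated.getTime))
        .toSeq
        .sortBy(_.id)
    )
}
```

Note that the append timestamp plays no part here: in line with the description of append, only the lastUpdated value drives de-duplication.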

Concrete Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  6. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  7. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  8. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  9. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  10. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  11. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  12. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  13. final def notify(): Unit

    Definition Classes
    AnyRef
  14. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  15. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  16. def toString(): String

    Definition Classes
    AnyRef → Any
  17. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  18. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  19. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
