Object

com.twitter.summingbird.example

Serialization

Related Doc: package example

object Serialization

Serialization is often the most important (and hairy) configuration issue for any system that needs to store its data over the long term. Summingbird controls serialization through the "Injection" interface.

By maintaining identical Injections from K and V to Array[Byte], one can guarantee that data written one day will be readable the next. This isn't the case with serialization engines like Kryo, where serialization format depends on unstable parameters, like the serializer registration order for the given Kryo instance.
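
As a minimal sketch of what an Injection looks like in use, the snippet below round-trips a String through Bijection's library-supplied UTF-8 injection. The object and method names are illustrative and not part of Summingbird, and the sketch assumes a Bijection release in which invert returns a scala.util.Try.

```scala
import com.twitter.bijection.Injection
import scala.util.Try

// Illustrative only: InjectionRoundTrip and roundTrip are hypothetical names.
object InjectionRoundTrip {
  // Library-supplied String <-> Array[Byte] injection (UTF-8 encoding).
  val stringToBytes: Injection[String, Array[Byte]] = Injection.utf8

  // apply always succeeds; invert can fail on bytes this injection did not
  // produce, so its result is wrapped in a Try (assumed Bijection version).
  def roundTrip(s: String): Try[String] =
    stringToBytes.invert(stringToBytes(s))
}
```

The byte layout here depends only on the injection's definition, not on any registration order, so the same injection can read the data back later.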

Linear Supertypes
AnyRef, Any

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  6. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  7. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  8. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  9. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  10. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  11. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  12. implicit def kInjection[T](implicit arg0: Codec[T]): Injection[(T, BatchID), Array[Byte]]

    Summingbird's implementation of the batch/realtime merge requires that the Storm-based workflow store (K, BatchID) -> V pairs, while the Hadoop-based workflow stores K -> (BatchID, V) pairs.

    kInjection and vInj use Bijection's "Bufferable" object to generate injections that take (T, BatchID) and (BatchID, V), respectively, to bytes.

    For true production applications, I'd suggest defining a Thrift or Protobuf "pair" structure that can safely store these pairs over the long term.

  13. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  14. final def notify(): Unit

    Definition Classes
    AnyRef
  15. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  16. implicit val statusCodec: Injection[Status, String]

    This Injection converts the twitter4j.Status objects that Storm and Scalding will process into Strings.

  17. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  18. implicit val toBytes: Injection[Status, Array[Byte]]

    We can chain the Status <-> String injection above (statusCodec) with the library-supplied String <-> Array[Byte] injection to generate a full serializer for Status objects, of type Injection[Status, Array[Byte]]. Our Storm and Scalding sources can now pull in this injection using Scala's implicit resolution and properly register the serializer.

  19. def toString(): String

    Definition Classes
    AnyRef → Any
  20. implicit def vInj[V](implicit arg0: Codec[V]): Injection[(BatchID, V), Array[Byte]]

  21. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  22. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  23. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
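
The chaining described for statusCodec and toBytes can be sketched roughly as follows. This is not the source of Serialization: Tweet is a hypothetical stand-in for twitter4j.Status, the "id:text" string format is invented for illustration, and the sketch assumes a Bijection release in which invert returns a scala.util.Try.

```scala
import com.twitter.bijection.Injection

// Illustrative only: ChainedInjection, Tweet and the "id:text" format are
// hypothetical and do not appear in Summingbird.
object ChainedInjection {
  final case class Tweet(id: Long, text: String)

  // Tweet <-> String. buildCatchInvert turns an exception thrown while
  // parsing (bad number, missing separator) into a failed inversion.
  implicit val tweetToString: Injection[Tweet, String] =
    Injection.buildCatchInvert[Tweet, String](t => s"${t.id}:${t.text}") { s =>
      val Array(id, text) = s.split(":", 2)
      Tweet(id.toLong, text)
    }

  // Chain Tweet <-> String with the library's implicit
  // String <-> Array[Byte] injection to get a byte-level serializer.
  implicit val tweetToBytes: Injection[Tweet, Array[Byte]] =
    Injection.connect[Tweet, String, Array[Byte]]
}
```

Any code that asks for an implicit Injection[Tweet, Array[Byte]] with ChainedInjection's members imported would then pick up tweetToBytes through implicit resolution.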

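The two pair layouts described under kInjection can be sketched in the same spirit. Long stands in for BatchID and String for the key and value types. The sketch assumes that Bijection's Bufferable companion supplies implicit instances for String, Long and Tuple2, and that invert returns a scala.util.Try. The real kInjection and vInj are built from a Codec[T] and may differ in detail.

```scala
import com.twitter.bijection.{ Bufferable, Injection }

// Illustrative only: PairInjections and its members are hypothetical names,
// and Long is a stand-in for Summingbird's BatchID.
object PairInjections {
  // (key, batch) -> bytes: the key layout used by the Storm-based workflow,
  // which stores (K, BatchID) -> V pairs.
  val keyAndBatch: Injection[(String, Long), Array[Byte]] =
    Bufferable.injectionOf[(String, Long)]

  // (batch, value) -> bytes: the value layout used by the Hadoop-based
  // workflow, which stores K -> (BatchID, V) pairs.
  val batchAndValue: Injection[(Long, String), Array[Byte]] =
    Bufferable.injectionOf[(Long, String)]
}
```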