com.snowplowanalytics.snowplow.enrich.common.utils

ConversionUtils

object ConversionUtils

General-purpose utils to help the ETL process along.

Linear Supertypes
AnyRef, Any

Type Members

  1. case class UriComponents(scheme: String, host: String, port: Integer, path: Option[String], query: Option[String], fragment: Option[String]) extends Product with Serializable

    Simple case class wrapper around the components of a URI.

Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  5. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  6. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  7. def booleanToJByte(bool: Boolean): Byte

    Helper to convert a Boolean value to a Byte. Does not require any validation.

    bool

    The Boolean to convert into a Byte

    returns

    0 if false, 1 if true

  8. def byteToBoolean(b: Byte): Validation[String, Boolean]

    Helper to convert a Byte value (1 or 0) into a Boolean.

    b

    The Byte to turn into a Boolean

    returns

    the Boolean value of b, or an error message if b is not 0 or 1 - all boxed in a Scalaz Validation
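
A minimal sketch of the two Boolean/Byte helpers described above. To stay self-contained it uses the standard library's `Either[String, Boolean]` as a stand-in for the Scalaz `Validation[String, Boolean]`; the object name and the error message wording are assumptions, not taken from the library.

```scala
object BooleanByteSketch {
  // Total function: every Boolean maps to a Byte, so no validation is needed.
  def booleanToJByte(bool: Boolean): Byte =
    if (bool) 1.toByte else 0.toByte

  // Only 0 and 1 are valid inputs; anything else becomes an error message.
  // Either stands in for Scalaz's Validation here (an assumption of this sketch).
  def byteToBoolean(b: Byte): Either[String, Boolean] =
    if (b == 0) Right(false)
    else if (b == 1) Right(true)
    else Left(s"Cannot convert byte [$b] to boolean, only 1 or 0.")
}
```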

  9. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  10. def decodeBase64Url(field: String, str: String): Validation[String, String]

    Decodes a URL-safe Base64 string.

    For details on the Base 64 Encoding with URL and Filename Safe Alphabet see:

    http://tools.ietf.org/html/rfc4648#page-7

    field

    The name of the field

    str

    The encoded string to be decoded

    returns

    a Scalaz Validation, wrapping either an error String or the decoded String

  11. val decodeString: (String, String, String) ⇒ ValidatedString

    Decodes a String in the specified encoding, also removing:

    * Newlines, because they will break Hive
    * Tabs, because they will break non-Hive targets (e.g. Infobright)

    IMPLDIFF: note that this version, unlike the Hive serde version, does not call cleanUri. This is because we cannot assume that str is a URI which needs 'cleaning'.

    TODO: simplify this when we move to a more robust output format (e.g. Avro) - as then no need to remove line breaks, tabs etc

    returns

    a Scalaz Validation, wrapping either an error String or the decoded String
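
The decode-then-clean behaviour described above can be sketched roughly as follows. The argument order (encoding, field name, string), the use of `java.net.URLDecoder`, the error message, and `Either` in place of `ValidatedString` are all assumptions made for illustration.

```scala
import java.net.URLDecoder
import scala.util.control.NonFatal

object DecodeStringSketch {
  // Assumed argument order: (encoding, field name, encoded string).
  // Either[String, String] stands in for the Scalaz-based ValidatedString.
  val decodeString: (String, String, String) => Either[String, String] =
    (enc, field, str) =>
      try {
        val decoded = URLDecoder.decode(str, enc)
        // Drop carriage returns, newlines and tabs so that the value
        // cannot break a line- or tab-delimited output format.
        Right(decoded.replaceAll("[\\r\\n\\t]", ""))
      } catch {
        case NonFatal(e) =>
          Left(s"Field [$field]: could not decode [$str] using [$enc]: ${e.getMessage}")
      }
}
```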

  12. def doubleDecode(field: String, str: String): ValidatedString

    Decode double-encoded percents, then percent decode

    field

    The name of the field

    str

    The String to decode

    returns

    a Scalaz Validation, wrapping either an error String or the decoded String

  13. def encodeBase64Url(str: String): String

    Encodes a URL-safe Base64 string.

    For details on the Base 64 Encoding with URL and Filename Safe Alphabet see:

    http://tools.ietf.org/html/rfc4648#page-7

    str

    The string to be encoded

    returns

    the string encoded in URL-safe Base64
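
As an illustration of the URL-safe alphabet from RFC 4648 that both Base64 helpers refer to, the sketch below round-trips a string with the JDK's `java.util.Base64`. Using the JDK codec, UTF-8, and `Either` instead of a Scalaz Validation are assumptions of this sketch; the library may rely on a different codec.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.util.Base64

object Base64UrlSketch {
  // The URL-safe alphabet uses '-' and '_' where standard Base64 uses '+' and '/'.
  def encodeBase64Url(str: String): String =
    Base64.getUrlEncoder.encodeToString(str.getBytes(UTF_8))

  // Either stands in for Scalaz's Validation (an assumption of this sketch).
  def decodeBase64Url(field: String, str: String): Either[String, String] =
    try Right(new String(Base64.getUrlDecoder.decode(str), UTF_8))
    catch {
      case e: IllegalArgumentException =>
        Left(s"Field [$field]: could not Base64-decode [$str]: ${e.getMessage}")
    }
}
```

For example, `">>>"` encodes to `"Pj4-"` here, whereas the standard alphabet would give `"Pj4+"`.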

  14. def encodeString(enc: String, str: String): String

    Encodes a string in the specified encoding

    enc

    The encoding to be used

    str

    The string which needs to be URLEncoded

    returns

    a URL encoded string
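
A rough equivalent using the JDK's `java.net.URLEncoder`, which is assumed here rather than confirmed by this page. Note that this encoder produces form-style output, so a space becomes `+`.

```scala
import java.net.URLEncoder

object EncodeStringSketch {
  // Delegates to the JDK encoder; enc is a charset name such as "UTF-8".
  def encodeString(enc: String, str: String): String =
    URLEncoder.encode(str, enc)
}
```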

  15. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  16. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  17. def explodeUri(uri: URI): UriComponents

    Explodes a URI into its 6 component pieces. Simple code, but we use it in multiple places.

    uri

    The URI to explode into its constituent pieces

    returns

    The 6 components in a UriComponents case class
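
A sketch of how the six components could be read off a `java.net.URI`. The case class mirrors the `UriComponents` signature listed under Type Members; substituting 80 when the URI carries no port is an assumption of this sketch, not something this page states.

```scala
import java.net.URI

object ExplodeUriSketch {
  final case class UriComponents(
    scheme: String,
    host: String,
    port: Integer,
    path: Option[String],
    query: Option[String],
    fragment: Option[String]
  )

  def explodeUri(uri: URI): UriComponents = {
    // java.net.URI reports a missing port as -1.
    // Falling back to 80 is an assumption made for illustration.
    val port = if (uri.getPort == -1) 80 else uri.getPort
    UriComponents(
      scheme   = uri.getScheme,
      host     = uri.getHost,
      port     = Integer.valueOf(port),
      path     = Option(uri.getRawPath),     // null-safe: absent parts become None
      query    = Option(uri.getRawQuery),
      fragment = Option(uri.getRawFragment)
    )
  }
}
```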

  18. def extractQuerystring(uri: URI, encoding: String): Validation[String, Map[String, String]]

    Attempt to extract the querystring from a URI as a map

    uri

    URI containing the querystring

    encoding

    Encoding of the URI
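
A standard-library sketch of turning a querystring into a map. The splitting rules (pairs separated by `&`, key and value separated by the first `=`, a missing value mapped to the empty string), the error message, and `Either` in place of a Scalaz Validation are assumptions; the library may delegate to a dedicated parser instead.

```scala
import java.net.{URI, URLDecoder}
import scala.util.control.NonFatal

object QuerystringSketch {
  def extractQuerystring(uri: URI, encoding: String): Either[String, Map[String, String]] =
    try {
      val pairs = Option(uri.getRawQuery).toList // no querystring yields an empty map
        .flatMap(_.split("&").toList)
        .filter(_.nonEmpty)
        .map { kv =>
          val parts = kv.split("=", 2) // split on the first '=' only
          val key   = URLDecoder.decode(parts(0), encoding)
          val value = if (parts.length > 1) URLDecoder.decode(parts(1), encoding) else ""
          key -> value
        }
      Right(pairs.toMap)
    } catch {
      case NonFatal(e) => Left(s"Could not parse uri [$uri]: ${e.getMessage}")
    }
}
```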

  19. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  20. def fixTabsNewlines(str: String): Option[String]

    Replaces tabs with four spaces and removes newlines altogether.

    Useful to prepare user-created strings for fragile storage formats like TSV.

    str

    The String to fix

    returns

    The String with tabs and newlines fixed.
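
One way this could look, as a sketch. The page does not say when the result is `None`; returning `None` for a null input or an empty result is an assumption here.

```scala
object FixTabsNewlinesSketch {
  def fixTabsNewlines(str: String): Option[String] =
    Option(str)                       // null becomes None (assumption)
      .map(_.replace("\t", "    ")    // each tab becomes four spaces
            .replaceAll("[\\r\\n]", "")) // newlines are removed altogether
      .filter(_.nonEmpty)             // empty result becomes None (assumption)
}
```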

  21. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  22. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  23. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  24. def makeTsvSafe(str: String): String

    Quick helper to make sure our Strings are TSV-safe, i.e. don't include tabs, special characters, newlines etc.

    str

    The string we want to make safe

    returns

    a safe String

  25. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  26. final def notify(): Unit

    Definition Classes
    AnyRef
  27. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  28. def singleEncodePcts(str: String): String

    On 17th August 2013, Amazon made an unannounced change to their CloudFront log format - they went from always encoding % characters, to only encoding % characters which were not previously encoded. For a full discussion of this see:

    https://forums.aws.amazon.com/thread.jspa?threadID=134017&tstart=0#

    On 14th September 2013, Amazon rolled out a further fix, from which point onwards all fields, including the referer and useragent, would have %s double-encoded.

    This causes issues, because the ETL process expects referers and useragents to be only single-encoded.

    This function turns a double-encoded percent (%) into a single-encoded one.

    Examples:

    1. "page=Celestial%25Tarot" - no change (only single encoded)
    2. "page=Dreaming%2520Way%2520Tarot" -> "page=Dreaming%20Way%20Tarot"
    3. "loading 30%2525 complete" -> "loading 30%25 complete"

    Limitation of this approach: %2588 is ambiguous. It could be either:

    a) a double-escaped caret "ˆ" (%2588 -> %88 -> ^), or
    b) a single-escaped "%88" (%2588 -> %88)

    This code assumes it's a).

    str

    The String which potentially has double-encoded %s

    returns

    the String with %s now single-encoded
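
The behaviour in the examples above can be reproduced with a single regular expression: an encoded percent (`%25`) followed by two hex digits is collapsed to `%` plus those digits. This is a sketch consistent with the listed examples; the exact pattern used by the library is an assumption.

```scala
object SingleEncodePctsSketch {
  // "%25" followed by two hex digits is treated as a double-encoded escape,
  // so "%2520" becomes "%20". "%25Ta" is left alone because "Ta" is not hex.
  def singleEncodePcts(str: String): String =
    str.replaceAll("%25([0-9a-fA-F][0-9a-fA-F])", "%$1")
}
```

Under this rule `%2588` always becomes `%88`, which is the interpretation a) described above.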

  29. def stringToBoolean(str: String): Validation[String, Boolean]

    Converts a String of value "1" or "0" to true or false respectively.

    str

    The String to convert

    returns

    True for "1", false for "0", or an error message for any other value, all boxed in a Scalaz Validation

  30. val stringToBooleanlikeJByte: (String, String) ⇒ Validation[String, Byte]

    Extract a Java Byte representing 1 or 0 only from a String, or error.

    returns

    a Scalaz Validation, being either a Failure String or a Success Byte
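
The two String-based Boolean helpers can be sketched together. `Either` replaces the Scalaz Validation, a Scala `Byte` replaces the Java Byte, and the `(field, str)` argument order and error messages are assumptions based on the other members on this page.

```scala
object BooleanlikeSketch {
  // "1" and "0" are the only accepted inputs; null falls through to the error case.
  def stringToBoolean(str: String): Either[String, Boolean] =
    str match {
      case "1" => Right(true)
      case "0" => Right(false)
      case _   => Left(s"Cannot convert [$str] to boolean, only 1 or 0.")
    }

  // Assumed argument order: (field name, value).
  val stringToBooleanlikeJByte: (String, String) => Either[String, Byte] =
    (field, str) =>
      str match {
        case "1" => Right(1.toByte)
        case "0" => Right(0.toByte)
        case _   => Left(s"Field [$field]: cannot convert [$str] to Boolean-like Byte")
      }
}
```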

  31. val stringToDoublelike: (String, String) ⇒ ValidatedString

    Convert a String to a String containing a Redshift-compatible Double.

    Necessary because Redshift does not support all Java Double syntaxes e.g. "3.4028235E38"

    Note that this code does NOT check that the value will fit within a Redshift Double - meaning Redshift may silently round this number on load.

    returns

    a Scalaz Validation, being either a Failure String or a Success String
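
One way to avoid exponent notation such as "3.4028235E38" is to parse the input as a `java.math.BigDecimal` and print it in plain form. The sketch below takes that approach as an assumption; the `(field, str)` argument order, the error message, and `Either` in place of `ValidatedString` are also assumptions.

```scala
import java.math.{BigDecimal => JBigDecimal}
import scala.util.Try

object DoublelikeSketch {
  val stringToDoublelike: (String, String) => Either[String, String] =
    (field, str) =>
      // toPlainString never emits an exponent, so "1E3" is rendered as "1000".
      // No range check is performed, matching the caveat above.
      Try(new JBigDecimal(str).toPlainString).toOption
        .toRight(s"Field [$field]: cannot convert [$str] to Double-like String")
}
```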

  32. val stringToJInteger: (String, String) ⇒ Validation[String, Integer]

    Extract a Java Integer from a String, or error.

    returns

    a Scalaz Validation, being either a Failure String or a Success JInt

  33. def stringToMaybeDouble(field: String, str: String): Validation[String, Option[Double]]

    Convert a String to a Double

    field

    The name of the field we are validating. To use in our error message

    str

    The String which we hope contains a Double

    returns

    a Scalaz Validation, being either a Failure String or a Success Option-boxed Double

  34. def stringToUri(uri: String, useNetaporter: Boolean = false): Validation[String, Option[URI]]

    A wrapper around Java's URI.create().

    Exceptions thrown by URI.create():

    1. NullPointerException if uri is null
    2. IllegalArgumentException if uri violates RFC 2396

    uri

    The URI string to convert

    useNetaporter

    Whether to use the com.netaporter.uri library

    returns

    an Option-boxed URI object, or an error message, all wrapped in a Validation
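
A sketch of wrapping `URI.create()` so that both exceptions listed above are captured in the return value. Mapping a null input to a successful `None`, the error message wording, and `Either` in place of a Scalaz Validation are assumptions; the `useNetaporter` branch is omitted.

```scala
import java.net.URI

object StringToUriSketch {
  def stringToUri(uri: String): Either[String, Option[URI]] =
    try Right(Some(URI.create(uri)))
    catch {
      // Assumption: a null input is treated as "no URI" rather than an error.
      case _: NullPointerException => Right(None)
      case e: IllegalArgumentException =>
        Left(s"Provided URI string [$uri] violates RFC 2396: [${e.getMessage}]")
    }
}
```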

  35. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  36. def toString(): String

    Definition Classes
    AnyRef → Any
  37. def truncate(str: String, length: Int): String

    Truncates a String - useful for making sure Strings can't overflow a database field.

    str

    The String to truncate

    length

    The maximum length of the String to keep

    returns

    the truncated String
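
A minimal sketch. Passing a null input through unchanged is an assumption of this example.

```scala
object TruncateSketch {
  // take() is safe when length exceeds the String's size.
  def truncate(str: String, length: Int): String =
    if (str == null) null else str.take(length)
}
```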

  38. val validateInteger: (String, String) ⇒ ValidatedString

    returns

    a Scalaz ValidatedString containing either the original String on Success, or an error String on Failure.

  39. val validateUuid: (String, String) ⇒ ValidatedString

    Validates that the given field contains a valid UUID.

    returns

    a Scalaz ValidatedString containing either the original String on Success, or an error String on Failure.
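
A possible shape for the UUID check, using `java.util.UUID`. Because `UUID.fromString` accepts some non-canonical inputs, the sketch also compares the parsed value's canonical form with the lower-cased input. The `(field, str)` argument order, the error message, and `Either` in place of `ValidatedString` are assumptions.

```scala
import java.util.UUID
import scala.util.Try

object ValidateUuidSketch {
  val validateUuid: (String, String) => Either[String, String] =
    (field, str) =>
      Try(UUID.fromString(str)).toOption
        // Reject inputs that parse leniently but are not in canonical form.
        .filter(u => str.toLowerCase == u.toString)
        .map(_ => str) // on success, return the original String
        .toRight(s"Field [$field]: [$str] is not a valid UUID")
}
```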

  40. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  41. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  42. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
