Object

com.snowplowanalytics.snowplow.enrich.common.utils

ConversionUtils

object ConversionUtils

General-purpose utils to help the ETL process along.

Linear Supertypes
AnyRef, Any

Type Members

  1. case class UriComponents(scheme: String, host: String, port: Integer, path: Option[String], query: Option[String], fragment: Option[String]) extends Product with Serializable

    Simple case class wrapper around the components of a URI.

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def booleanToJByte(bool: Boolean): Byte

    Helper to convert a Boolean value to a Byte. Does not require any validation.

    bool

    The Boolean to convert into a Byte

    returns

    0 if false, 1 if true

  6. def byteToBoolean(b: Byte): Validation[String, Boolean]

    Helper to convert a Byte value (1 or 0) into a Boolean.

    b

    The Byte to turn into a Boolean

    returns

    the Boolean value of b, or an error message if b is not 0 or 1 - all boxed in a Scalaz Validation
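
    The two Byte/Boolean helpers above can be sketched as follows. This is a minimal illustration, not the library's code: the object name and the error message are hypothetical, and Scala's standard Either stands in for the Scalaz Validation (Left for Failure, Right for Success).

```scala
// Sketch only: Either[String, _] stands in for the Scalaz Validation of the real API.
object ByteBooleanSketch {
  // true becomes 1, false becomes 0; nothing can fail, so no wrapper is needed.
  def booleanToByte(bool: Boolean): Byte =
    if (bool) 1.toByte else 0.toByte

  // Only 0 and 1 are accepted; any other value produces an error message.
  def byteToBoolean(b: Byte): Either[String, Boolean] =
    if (b == 0) Right(false)
    else if (b == 1) Right(true)
    else Left(s"Cannot convert byte [$b] to boolean, only 1 or 0.") // hypothetical wording
}
```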

  7. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  8. def decodeBase64Url(field: String, str: String): Validation[String, String]

    Decodes a URL-safe Base64 string.

    For details on the Base 64 Encoding with URL and Filename Safe Alphabet see:

    http://tools.ietf.org/html/rfc4648#page-7

    field

    The name of the field

    str

    The encoded string to be decoded

    returns

    a Scalaz Validation, wrapping either an error String or the decoded String
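
    As a rough sketch of the behaviour described above, the decoding could look like the code below. It assumes the JDK's java.util.Base64 URL-safe decoder and UTF-8 text, and uses Either in place of the Scalaz Validation; the object name and error wording are hypothetical, and the real implementation may use a different Base64 library.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.util.Base64
import scala.util.{Failure, Success, Try}

// Sketch only: Either[String, String] stands in for Validation[String, String].
object Base64DecodeSketch {
  def decodeBase64Url(field: String, str: String): Either[String, String] =
    // The URL-safe decoder uses '-' and '_' instead of '+' and '/' (RFC 4648).
    Try(new String(Base64.getUrlDecoder().decode(str), UTF_8)) match {
      case Success(decoded) => Right(decoded)
      case Failure(e) =>
        // The field name is only used to build the error message (hypothetical wording).
        Left(s"Field [$field]: could not Base64-decode [$str]: ${e.getMessage}")
    }
}
```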

  9. val decodeString: (String, String, String) ⇒ ValidatedString

    Decodes a String in the specified encoding, also removing:

    • Newlines - because they will break Hive
    • Tabs - because they will break non-Hive targets (e.g. Infobright)

    IMPLDIFF: note that this version, unlike the Hive serde version, does not call cleanUri. This is because we cannot assume that str is a URI which needs 'cleaning'.

    TODO: simplify this when we move to a more robust output format (e.g. Avro) - as then no need to remove line breaks, tabs etc

    returns

    a Scalaz Validation, wrapping either an error String or the decoded String
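
    A minimal sketch of the decode-then-strip behaviour described above, written as a plain method for readability. The parameter order (encoding, field name, value) is an assumption, since this page does not name the three String arguments; java.net.URLDecoder, the Either return type and the error wording are likewise illustrative choices rather than the library's own.

```scala
import java.net.URLDecoder
import scala.util.{Failure, Success, Try}

// Sketch only: parameter order and error text are assumptions.
object DecodeStringSketch {
  def decodeString(enc: String, field: String, str: String): Either[String, String] =
    Try(URLDecoder.decode(str, enc)) match {
      // Strip carriage returns, newlines and tabs after decoding.
      case Success(decoded) => Right(decoded.replaceAll("[\\r\\n\\t]", ""))
      case Failure(e) =>
        Left(s"Field [$field]: could not decode [$str] using [$enc]: ${e.getMessage}")
    }
}
```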

  10. def doubleDecode(field: String, str: String): ValidatedString

    Decodes double-encoded percents, then percent-decodes the result.

    field

    The name of the field

    str

    The String to decode

    returns

    a Scalaz Validation, wrapping either an error String or the decoded String

  11. def encodeBase64Url(str: String): String

    Encodes a URL-safe Base64 string.

    For details on the Base 64 Encoding with URL and Filename Safe Alphabet see:

    http://tools.ietf.org/html/rfc4648#page-7

    str

    The string to be encoded

    returns

    the string encoded in URL-safe Base64
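
    For illustration, URL-safe encoding can be sketched with the JDK's java.util.Base64. This is an assumption about one possible implementation: the object name is hypothetical, UTF-8 is assumed for the input bytes, and details such as padding may differ in the real method.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.util.Base64

// Sketch only: uses the JDK encoder, which keeps '=' padding.
object Base64EncodeSketch {
  def encodeBase64Url(str: String): String =
    // The URL-safe alphabet replaces '+' with '-' and '/' with '_'.
    Base64.getUrlEncoder().encodeToString(str.getBytes(UTF_8))
}
```

    With the standard alphabet ">>>" would encode to "Pj4+"; the URL-safe alphabet yields "Pj4-", which needs no further escaping in a querystring.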

  12. def encodeString(enc: String, str: String): String

    Encodes a string in the specified encoding

    enc

    The encoding to be used

    str

    The string which needs to be URLEncoded

    returns

    a URL encoded string

  13. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  14. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  15. def explodeUri(uri: URI): UriComponents

    Explodes a URI into its 6 component pieces. Simple code, but we use it in multiple places.

    uri

    The URI to explode into its constituent pieces

    returns

    The 6 components in a UriComponents case class
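
    The explosion described above can be sketched as below. UriParts is a hypothetical stand-in for UriComponents (with a Scala Int port), and two choices are assumptions rather than documented behaviour: a missing port falls back to 80, and empty components are reported as None.

```scala
import java.net.URI

// Sketch only: UriParts mirrors the shape of UriComponents but is not the real class.
object ExplodeUriSketch {
  final case class UriParts(
    scheme: String,
    host: String,
    port: Int,
    path: Option[String],
    query: Option[String],
    fragment: Option[String]
  )

  def explodeUri(uri: URI): UriParts = {
    // java.net.URI reports an absent port as -1; defaulting to 80 is an assumption.
    val port = if (uri.getPort == -1) 80 else uri.getPort
    // Treat null or empty components as absent.
    def opt(s: String): Option[String] = Option(s).filter(_.nonEmpty)
    UriParts(uri.getScheme, uri.getHost, port,
      opt(uri.getRawPath), opt(uri.getRawQuery), opt(uri.getRawFragment))
  }
}
```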

  16. def extractQuerystring(uri: URI, encoding: String): Validation[String, Map[String, String]]

    Attempt to extract the querystring from a URI as a map

    uri

    URI containing the querystring

    encoding

    Encoding of the URI
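
    A minimal sketch of querystring extraction using only the JDK, assuming pairs are separated by "&" and that a key without "=" maps to an empty value. The real method may rely on a dedicated parser; the object name, the Either return type and the error wording here are illustrative.

```scala
import java.net.{URI, URLDecoder}
import scala.util.Try

// Sketch only: Either[String, Map[String, String]] stands in for the Scalaz Validation.
object QuerystringSketch {
  def extractQuerystring(uri: URI, encoding: String): Either[String, Map[String, String]] =
    Try {
      Option(uri.getRawQuery).filter(_.nonEmpty).toList
        .flatMap(_.split("&").toList)
        .filter(_.nonEmpty)
        .map { pair =>
          // Split on the first '=' only, so values may themselves contain '='.
          val idx = pair.indexOf('=')
          val (k, v) =
            if (idx < 0) (pair, "") else (pair.substring(0, idx), pair.substring(idx + 1))
          URLDecoder.decode(k, encoding) -> URLDecoder.decode(v, encoding)
        }
        .toMap
    }.toEither.left.map(e => s"Could not parse querystring of [$uri]: ${e.getMessage}")
}
```

    Because the result is a Map, a repeated key keeps only its last value in this sketch.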

  17. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  18. def fixTabsNewlines(str: String): Option[String]

    Replaces tabs with four spaces and removes newlines altogether.

    Useful to prepare user-created strings for fragile storage formats like TSV.

    str

    The String to fix

    returns

    The String with tabs and newlines fixed.
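
    The replacement rule above can be sketched as follows. The page does not say when the Option is empty; returning None for a null input or an empty result is an assumption of this sketch, as is the object name.

```scala
// Sketch only: the None cases are an assumption, not documented behaviour.
object FixTabsNewlinesSketch {
  def fixTabsNewlines(str: String): Option[String] =
    Option(str)                                                   // null becomes None
      .map(_.replace("\t", "    ").replaceAll("[\\r\\n]", ""))    // tabs to 4 spaces, drop newlines
      .filter(_.nonEmpty)                                         // empty result becomes None
}
```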

  19. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  20. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  21. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  22. def makeTsvSafe(str: String): String

    Quick helper to make sure our Strings are TSV-safe, i.e. don't include tabs, special characters, newlines etc.

    str

    The string we want to make safe

    returns

    a safe String

  23. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  24. final def notify(): Unit

    Definition Classes
    AnyRef
  25. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  26. def singleEncodePcts(str: String): String

    On 17th August 2013, Amazon made an unannounced change to their CloudFront log format - they went from always encoding % characters, to only encoding % characters which were not previously encoded. For a full discussion of this see:

    https://forums.aws.amazon.com/thread.jspa?threadID=134017&tstart=0#

    On 14th September 2013, Amazon rolled out a further fix, from which point onwards all fields, including the referer and useragent, would have %s double-encoded.

    This causes issues, because the ETL process expects referers and useragents to be only single-encoded.

    This function turns a double-encoded percent (%) into a single-encoded one.

    Examples:

    1. "page=Celestial%25Tarot" - no change (only single-encoded)
    2. "page=Dreaming%2520Way%2520Tarot" -> "page=Dreaming%20Way%20Tarot"
    3. "loading 30%2525 complete" -> "loading 30%25 complete"

    Limitation of this approach: %2588 is ambiguous. Is it:

    a) A double-escaped caret "ˆ" (%2588 -> %88 -> ^), or:
    b) A single-escaped "%88" (%2588 -> %88)

    This code assumes it's a).

    str

    The String which potentially has double-encoded %s

    returns

    the String with %s now single-encoded
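
    One way to reproduce the three examples above is a single regular-expression replacement: "%25" followed by two hex digits is collapsed to "%" plus those digits. This is a sketch consistent with the documented examples, not necessarily the library's exact code; the object name is hypothetical.

```scala
// Sketch only: collapses a double-encoded percent into a single-encoded one.
object SingleEncodePctsSketch {
  def singleEncodePcts(str: String): String =
    // "%25" + two hex digits means the original "%XX" was encoded a second time.
    str.replaceAll("%25([0-9a-fA-F]{2})", "%$1")
}
```

    In "page=Celestial%25Tarot" the characters after "%25" are "Ta", which is not a pair of hex digits, so the string is left alone. "%2588" becomes "%88", matching interpretation a) of the ambiguity.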

  27. val stringToBoolean: (String, String) ⇒ Validation[String, Boolean]

    Converts a String of value "1" or "0" to true or false respectively.

    returns

    True for "1", false for "0", or an error message for any other value, all boxed in a Scalaz Validation

  28. val stringToBooleanlikeJByte: (String, String) ⇒ Validation[String, Byte]

    Extract a Java Byte representing 1 or 0 only from a String, or error.

    returns

    a Scalaz Validation, being either a Failure String or a Success Byte

  29. val stringToDouble: (String, String) ⇒ Validation[String, Double]

    Converts a String to a Double. Takes a field name and a string value and returns a validated Double.

  30. val stringToDoublelike: (String, String) ⇒ ValidatedString

    Convert a String to a String containing a Redshift-compatible Double.

    Necessary because Redshift does not support all Java Double syntaxes e.g. "3.4028235E38"

    Note that this code does NOT check that the value will fit within a Redshift Double - meaning Redshift may silently round this number on load.

    returns

    a Scalaz Validation, being either a Failure String or a Success String
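
    To illustrate the conversion described above, the sketch below rewrites scientific notation as a plain decimal string with java.math.BigDecimal. The argument order (field name, then value) is an assumption, as are the object name, the Either return type and the error wording; like the documented method, it makes no attempt to check that the value fits a Redshift Double.

```scala
import scala.util.Try

// Sketch only: a function value with the same shape as the documented val.
object DoublelikeSketch {
  val stringToDoublelike: (String, String) => Either[String, String] = (field, str) =>
    // toPlainString never uses an exponent, so "3.4028235E38" is written out in full.
    Try(new java.math.BigDecimal(str).toPlainString).toEither.left.map(_ =>
      s"Field [$field]: cannot convert [$str] to Double-like String")
}
```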

  31. val stringToJInteger: (String, String) ⇒ Validation[String, Integer]

    Extract a Java Integer from a String, or error.

    returns

    a Scalaz Validation, being either a Failure String or a Success JInt

  32. def stringToMaybeDouble(field: String, str: String): Validation[String, Option[Double]]

    Convert a String to a Double

    field

    The name of the field we are validating. To use in our error message

    str

    The String which we hope contains a Double

    returns

    a Scalaz Validation, being either a Failure String or a Success Option[Double]

  33. val stringToTwoDecimals: (String, String) ⇒ Validation[String, Double]

    Converts a String to a Double with two decimal places. Used to honor schemas with multipleOf 0.01. Takes a field name and a string value and returns a validated Double.
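
    A possible sketch of the two-decimal conversion, using scala.math.BigDecimal to fix the scale before converting to Double. The rounding mode (HALF_EVEN) is an assumption, since the page does not state one; the argument order, object name, Either return type and error wording are also illustrative.

```scala
import scala.util.Try

// Sketch only: the rounding mode is an assumption.
object TwoDecimalsSketch {
  val stringToTwoDecimals: (String, String) => Either[String, Double] = (field, str) =>
    // Parse as an exact decimal, round to 2 places, then convert to Double.
    Try(BigDecimal(str).setScale(2, BigDecimal.RoundingMode.HALF_EVEN).toDouble)
      .toEither.left.map(_ =>
        s"Field [$field]: cannot convert [$str] to Double with two decimal places")
}
```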

  34. def stringToUri(uri: String): Validation[String, Option[URI]]

    Parses a string to create a URI. Parsing is relaxed, i.e. even if a URL is not correctly percent-encoded or not RFC 3986-compliant, it can be parsed.

    uri

    String containing the URI to parse.

    returns

    Validation wrapping the result of the parsing:

    • Success with the parsed URI if there was no error or with None if the input was null.
    • Failure with the error message if something went wrong.
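
    The three outcomes listed above can be sketched as below. The "relaxed" step shown here (retrying with spaces percent-encoded) is only one hypothetical form of leniency and is not claimed to match the real parser; the object name, the Either return type and the error wording are also assumptions.

```scala
import java.net.URI
import scala.util.Try

// Sketch only: Either[String, Option[URI]] stands in for the Scalaz Validation.
object StringToUriSketch {
  def stringToUri(uri: String): Either[String, Option[URI]] =
    Option(uri) match {
      case None => Right(None) // null input is a Success with None
      case Some(s) =>
        Try(URI.create(s))
          // Hypothetical relaxation: retry once with spaces percent-encoded.
          .orElse(Try(URI.create(s.replace(" ", "%20"))))
          .toEither
          .map(u => Option(u))
          .left.map(e => s"Provided URI string [$s] could not be parsed: ${e.getMessage}")
    }
}
```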
  35. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  36. def toString(): String

    Definition Classes
    AnyRef → Any
  37. def truncate(str: String, length: Int): String

    Truncates a String - useful for making sure Strings can't overflow a database field.

    str

    The String to truncate

    length

    The maximum length of the String to keep

    returns

    the truncated String

  38. val validateInteger: (String, String) ⇒ ValidatedString

    returns

    a Scalaz ValidatedString containing either the original String on Success, or an error String on Failure.

  39. val validateUuid: (String, String) ⇒ ValidatedString

    Validates that the given field contains a valid UUID.

    returns

    a Scalaz ValidatedString containing either the original String on Success, or an error String on Failure.
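
    A minimal sketch of UUID validation that returns the original String on success, as described above. The regular expression, the argument order (field name, then value), the object name and the error wording are assumptions; Either again stands in for the Scalaz ValidatedString.

```scala
// Sketch only: checks the canonical 8-4-4-4-12 hex layout.
object ValidateUuidSketch {
  private val UuidRegex =
    "^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$".r

  val validateUuid: (String, String) => Either[String, String] = (field, str) =>
    // Option(str) guards against null before the regex is applied.
    Option(str).filter(s => UuidRegex.pattern.matcher(s).matches()) match {
      case Some(s) => Right(s)
      case None    => Left(s"Field [$field]: [$str] is not a valid UUID")
    }
}
```

    A regex is used in this sketch because java.util.UUID.fromString is lenient and accepts shortened forms such as "1-1-1-1-1".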

  40. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  41. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  42. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
