Object

org.spark.anonymizer

Anonymizer


object Anonymizer extends Serializable

Anonymizes selected columns in a dataframe while preserving format.

To anonymize selected columns in a dataframe:

import org.spark.Anonymizer.Extensions

val df = input_df.anonymize(p => Array("col1", "col2").contains(p))

To anonymize all columns in a dataframe:

val df = input_df.anonymize()

To anonymize all columns in a dataframe except one:

val df = input_df.anonymize(p => p != "id")

To anonymize a single column:

import org.spark.Anonymizer.Extensions

df.withColumn("anonymized_col1", Anonymizer.AnonymizeStringUdf($"col1"))
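
The column filter passed to anonymize is an ordinary String => Boolean predicate, so it can be built and checked without a Spark session. The following is a minimal sketch, assuming top-level column names only (how nested column paths are rendered is not stated on this page). ColumnFilters, only, and allExcept are hypothetical helper names and are not part of the library.

```scala
object ColumnFilters {
  // Hypothetical helper: accept only the listed column names.
  def only(names: String*): String => Boolean = {
    val selected = names.toSet
    p => selected.contains(p)
  }

  // Hypothetical helper: accept every column name except the listed ones.
  def allExcept(names: String*): String => Boolean = {
    val excluded = names.toSet
    p => !excluded.contains(p)
  }
}
```

Under these assumptions, input_df.anonymize(ColumnFilters.only("col1", "col2")) would be equivalent to the first example above, and input_df.anonymize(ColumnFilters.allExcept("id")) to the third.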

Linear Supertypes
Serializable, Serializable, AnyRef, Any

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. val AnonymizeByteUdf: UserDefinedFunction

    UDF to anonymize a Byte while preserving its number of digits.

  5. val AnonymizeDateUdf: UserDefinedFunction

    UDF to anonymize a Date.

  6. val AnonymizeDecimalUdf: UserDefinedFunction

    UDF to anonymize a Decimal.

  7. val AnonymizeDoubleUdf: UserDefinedFunction

    UDF to anonymize a Double.

  8. val AnonymizeFloatUdf: UserDefinedFunction

    UDF to anonymize a Float.

  9. val AnonymizeIntegerUdf: UserDefinedFunction

    UDF to anonymize an Integer while preserving its number of digits.

  10. val AnonymizeJsonStringUdf: UserDefinedFunction

    UDF to anonymize a JSON string while preserving property names.

  11. val AnonymizeLongUdf: UserDefinedFunction

    UDF to anonymize a Long while preserving its number of digits.

  12. val AnonymizeShortUdf: UserDefinedFunction

    UDF to anonymize a Short while preserving its number of digits.

  13. val AnonymizeStringUdf: UserDefinedFunction

    UDF to anonymize a string while preserving its format.

  14. val AnonymizeTimestampUdf: UserDefinedFunction

    UDF to anonymize a Timestamp.

  15. val AsciiLowerLetters: List[Char]

    Attributes
    protected[this]
  16. val AsciiUpperLetters: List[Char]

    Attributes
    protected[this]
  17. val MinNumber: Int

  18. val Numbers: Inclusive[Char]

    Attributes
    protected[this]
  19. val UtfLetters: List[Char]

    Attributes
    protected[this]
  20. def anonymize(df: DataFrame, columnPathFilter: (String) ⇒ Boolean = p => true): DataFrame

    Anonymize selected fields in a dataframe.

  21. def anonymizeDouble(d: Option[Double]): Option[Double]

    Function to anonymize a Double.

  22. def anonymizeJson(c: Column, f: StructField): Column

    Attributes
    protected[this]
  23. def anonymizeJsonAst(jsValue: JsValue): JsValue

    Function to anonymize a JsValue (a spray-json JSON AST document; see https://javadoc.io/static/io.spray/spray-json_2.12/1.3.5/spray/json/JsValue.html).

    Attributes
    protected
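
Anonymizing a JSON AST while preserving property names amounts to a recursive walk that rewrites leaf values and leaves object keys untouched. The sketch below illustrates that idea on a tiny stand-in AST so that it runs without spray-json. The types in JsonAstSketch and the scramble helper are hypothetical, and the per-character replacement rule is an assumption rather than the library's actual behavior.

```scala
import scala.util.Random

object JsonAstSketch {
  // Tiny stand-in for a JSON AST; the library itself operates on spray-json's JsValue.
  sealed trait Json
  final case class JObject(fields: Map[String, Json]) extends Json
  final case class JArray(items: Vector[Json]) extends Json
  final case class JString(value: String) extends Json
  final case class JNumber(value: BigDecimal) extends Json
  final case class JBool(value: Boolean) extends Json
  case object JNull extends Json

  // Assumed rule: digits become random digits, letters become random
  // lowercase ASCII letters, every other character is kept.
  private def scramble(s: String, rnd: Random): String = s.map {
    case c if c.isDigit  => ('0' + rnd.nextInt(10)).toChar
    case c if c.isLetter => ('a' + rnd.nextInt(26)).toChar
    case c               => c
  }

  def anonymize(json: Json, rnd: Random = new Random()): Json = json match {
    // Keys are copied as-is; only the values are rewritten.
    case JObject(fields) => JObject(fields.map { case (k, v) => k -> anonymize(v, rnd) })
    case JArray(items)   => JArray(items.map(anonymize(_, rnd)))
    case JString(v)      => JString(scramble(v, rnd))
    // toPlainString contains only digits, '-' and '.', so the result still parses.
    case JNumber(v)      => JNumber(BigDecimal(scramble(v.bigDecimal.toPlainString, rnd)))
    case JBool(_) | JNull => json
  }
}
```

Booleans and nulls are passed through unchanged in this sketch; whether the library does the same is not stated on this page.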
  24. def anonymizeJsonString(jsonString: Option[String]): Option[String]

    Function to anonymize a JSON string while preserving property names.

  25. def anonymizeLong(l: Option[Long]): Option[Long]

    Function to anonymize a Long while preserving its number of digits.
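
One way to preserve the number of digits is to draw the replacement from the range of values that have the same digit count. The sketch below does this under two assumptions that this page does not confirm: the sign of the input is kept, and a single-digit input may map to any value from 0 to 9. DigitPreservingSketch is a hypothetical name, and the role of MinNumber in the real implementation is not reflected here.

```scala
import scala.util.Random

object DigitPreservingSketch {
  // Replace a Long with a random value that has the same sign (assumption)
  // and the same number of decimal digits.
  def anonymizeLong(l: Option[Long], rnd: Random = new Random()): Option[Long] =
    l.map { value =>
      // BigInt avoids overflow when taking the absolute value of Long.MinValue.
      val magnitude = BigInt(value).abs
      val n = magnitude.toString.length
      val lo = if (n == 1) BigInt(0) else BigInt(10).pow(n - 1)
      // Cap the upper bound so 19-digit results still fit in a Long.
      val hi = (BigInt(10).pow(n) - 1).min(BigInt(Long.MaxValue))
      val span = hi - lo + 1
      // 128 random bits reduced modulo the span; the modulo bias is negligible here.
      val replacement = (lo + (BigInt(128, rnd) mod span)).toLong
      if (value < 0) -replacement else replacement
    }
}
```

With these assumptions, a five-digit input such as 12345 always yields a value between 10000 and 99999, and None is returned unchanged.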

  26. def anonymizeString(s: Option[String]): Option[String]

    Anonymize a string while preserving its format.
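
Preserving the format of a string can be read as replacing each character with a random character of the same class. The sketch below is one possible reading of that idea, not the library's implementation: FormatPreservingSketch is a hypothetical name, and as a simplification it passes non-ASCII letters through unchanged, although the UtfLetters member suggests the library may also replace them.

```scala
import scala.util.Random

object FormatPreservingSketch {
  private val lower  = ('a' to 'z').toVector
  private val upper  = ('A' to 'Z').toVector
  private val digits = ('0' to '9').toVector

  private def pick(chars: Vector[Char], rnd: Random): Char =
    chars(rnd.nextInt(chars.length))

  // Replace each ASCII letter or digit with a random character of the same class;
  // punctuation, whitespace and non-ASCII characters are kept as-is.
  def anonymizeString(s: Option[String], rnd: Random = new Random()): Option[String] =
    s.map(_.map {
      case c if c >= 'a' && c <= 'z' => pick(lower, rnd)
      case c if c >= 'A' && c <= 'Z' => pick(upper, rnd)
      case c if c >= '0' && c <= '9' => pick(digits, rnd)
      case c                         => c
    })
}
```

Under this rule an input such as "Ab-12 xyZ" keeps its length, its hyphen and space, and the positions of upper-case letters, lower-case letters and digits.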

  27. def anonymizeTimestamp(ts: Option[Timestamp]): Option[Timestamp]

    Function to anonymize a Timestamp.

  28. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  29. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  30. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  31. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  32. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  33. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  34. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  35. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  36. def mutate(df: DataFrame, columnPathFilter: (String) ⇒ Boolean): DataFrame

    Update all columns of a dataframe.

    Attributes
    protected[this]
  37. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  38. final def notify(): Unit

    Definition Classes
    AnyRef
  39. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  40. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  41. def toString(): String

    Definition Classes
    AnyRef → Any
  42. def traverse(schema: StructType, columnPathFilter: (String) ⇒ Boolean, path: String = ""): Array[Column]

    Traverse all columns of a dataframe schema and execute anonymization functions based on column data types.

    Attributes
    protected[this]
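
The traversal described above can be pictured as a depth-first walk over the schema that builds a path for each column and picks an anonymizer by data type. The sketch below uses a tiny stand-in schema model so that it runs without Spark, and returns (path, anonymizer name) pairs instead of Column expressions. The types in TraverseSketch are hypothetical, and joining nested names with "." is an assumption, since the path format is not documented on this page.

```scala
object TraverseSketch {
  // Tiny stand-in for Spark's schema types, for illustration only.
  sealed trait DataType
  case object StringType extends DataType
  case object LongType extends DataType
  final case class StructType(fields: List[Field]) extends DataType
  final case class Field(name: String, dataType: DataType)

  // Walk the schema depth-first. For each leaf column accepted by the filter,
  // return its path and the name of the anonymizer chosen for its type.
  def traverse(
      schema: StructType,
      columnPathFilter: String => Boolean,
      path: String = ""
  ): List[(String, String)] =
    schema.fields.flatMap { f =>
      // Assumed path format: parent and child names joined with a dot.
      val fullPath = if (path.isEmpty) f.name else s"$path.${f.name}"
      f.dataType match {
        case nested: StructType               => traverse(nested, columnPathFilter, fullPath)
        case _ if !columnPathFilter(fullPath) => Nil
        case StringType                       => List(fullPath -> "anonymizeString")
        case LongType                         => List(fullPath -> "anonymizeLong")
      }
    }
}
```

In this sketch, columns rejected by the filter are simply omitted from the result; the real method returns an Array[Column] and presumably passes such columns through unchanged.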
  43. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  44. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  45. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
