UDF to anonymize a Byte while preserving its number of digits.
UDF to anonymize a Date.
UDF to anonymize a Decimal.
UDF to anonymize a Double.
UDF to anonymize a Float.
UDF to anonymize an Integer while preserving its number of digits.
UDF to anonymize a JSON string while preserving property names.
UDF to anonymize a Long while preserving its number of digits.
UDF to anonymize a Short while preserving its number of digits.
UDF to anonymize a string while preserving its format.
UDF to anonymize a Timestamp.
Anonymize selected fields in a dataframe.
Function to anonymize a Double.
Function to anonymize a JsValue (JSON AST document, see https://javadoc.io/static/io.spray/spray-json_2.12/1.3.5/spray/json/JsValue.html)
Function to anonymize a JSON string while preserving property names.
Function to anonymize a Long while preserving its number of digits.
Function to anonymize a string while preserving its format.
Function to anonymize a Timestamp.
Update all columns of a dataframe.
Traverse all columns of a dataframe schema and execute anonymization functions based on column data types.
Anonymizes selected columns in a dataframe while preserving format.
To anonymize selected columns in a dataframe:
import org.spark.Anonymizer.Extensions
val df = input_df.anonymize((p => Array("col1", "col2").contains(p)))
To anonymize all columns in a dataframe:
val df = input_df.anonymize()
To anonymize all columns in a dataframe except one:
val df = input_df.anonymize((p => p != "id"))
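The argument passed to anonymize in the examples above is a predicate on the column name: it returns true for every column that should be anonymized. The following is a minimal sketch of that selection logic in plain Scala, without Spark. The object ColumnFilterSketch and the helper selectColumns are hypothetical names used only for illustration. They are not part of the library, and the default of selecting every column is an assumption based on the anonymize() call above.

```scala
object ColumnFilterSketch {
  // Hypothetical helper (not part of the library): returns the names of the
  // columns for which the predicate holds. The default predicate selects every
  // column, mirroring the no-argument anonymize() call (an assumption).
  def selectColumns(columns: Seq[String],
                    filter: String => Boolean = _ => true): Seq[String] =
    columns.filter(filter)

  def main(args: Array[String]): Unit = {
    val columns = Seq("id", "col1", "col2", "col3")

    // Selected columns only.
    println(selectColumns(columns, p => Array("col1", "col2").contains(p)))
    // prints List(col1, col2)

    // All columns.
    println(selectColumns(columns))
    // prints List(id, col1, col2, col3)

    // All columns except one.
    println(selectColumns(columns, p => p != "id"))
    // prints List(col1, col2, col3)
  }
}
```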
To anonymize a single column:
import org.spark.Anonymizer.Extensions
df.withColumn("anonymized_col1", Anonymizer.AnonymizeStringUdf($"col1"))
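To illustrate what "preserving its format" and "preserving its number of digits" can mean in practice, here is a minimal sketch in plain Scala. It is not the library's implementation. The object FormatPreservingSketch and its methods anonymizeString and anonymizeLong are hypothetical, and the replacement rules (digit for digit, letter for letter of the same case, other characters unchanged) are an assumption about one reasonable reading of the descriptions above.

```scala
import scala.util.Random

object FormatPreservingSketch {
  // Hypothetical sketch: replace each digit with a random ASCII digit and each
  // letter with a random ASCII letter of the same case. All other characters
  // (separators, spaces, punctuation) are left unchanged, so the shape of the
  // value is preserved.
  def anonymizeString(value: String, rng: Random = new Random()): String =
    value.map {
      case c if c.isDigit => ('0' + rng.nextInt(10)).toChar
      case c if c.isUpper => ('A' + rng.nextInt(26)).toChar
      case c if c.isLower => ('a' + rng.nextInt(26)).toChar
      case c              => c
    }

  // Hypothetical sketch: produce a random Long with the same sign and the same
  // number of digits as the input.
  def anonymizeLong(value: Long, rng: Random = new Random()): Long = {
    val digits = value.toString.stripPrefix("-")
    // A 19-digit result must start with 1..8 to be sure it fits in a Long.
    val maxFirst = if (digits.length == 19) 8 else 9
    // The leading digit is never 0, so the digit count cannot shrink.
    val head = 1 + rng.nextInt(maxFirst)
    val tail = Seq.fill(digits.length - 1)(rng.nextInt(10)).mkString
    val result = (head.toString + tail).toLong
    if (value < 0) -result else result
  }

  def main(args: Array[String]): Unit = {
    // Output is random; only the shape of each value is preserved.
    println(anonymizeString("AB-1234 xyz"))
    println(anonymizeLong(-987654321L))
  }
}
```

Passing a seeded Random makes the output reproducible, which is convenient in tests. A sketch like this does not map equal inputs to equal outputs, so it would not keep join keys consistent across columns or tables.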