org.spark
Anonymizes selected columns in a dataframe while preserving format.
To anonymize selected columns in a dataframe:
import org.spark.Anonymizer.Extensions
val df = input_df.anonymize(p => Array("col1", "col2").contains(p))
To anonymize all columns in a dataframe:
val df = input_df.anonymize()
To anonymize all columns in a dataframe except one:
val df = input_df.anonymize(p => p != "id")
To anonymize a single column:
df.withColumn("anonymized_col1", Anonymizer.AnonymizeStringUdf($"col1"))
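What "preserving format" means is not spelled out above. The sketch below shows one possible reading, in which each digit is replaced by a digit, each letter by a letter of the same case, and every other character is left alone. FormatPreservingSketch, anonymize, scramble and the seed parameter are hypothetical names chosen for illustration; this is not the library's implementation and its actual behavior may differ.

```scala
import scala.util.Random

// Hypothetical illustration only; not the code behind Anonymizer.AnonymizeStringUdf.
object FormatPreservingSketch {

  // Replace one character with a random character of the same class.
  private def scramble(c: Char, rng: Random): Char =
    if (c.isDigit) ('0' + rng.nextInt(10)).toChar      // digit -> digit
    else if (c.isUpper) ('A' + rng.nextInt(26)).toChar // upper-case -> upper-case
    else if (c.isLower) ('a' + rng.nextInt(26)).toChar // lower-case -> lower-case
    else c                                             // separators stay in place

  // Assumption: the generator is seeded from the input, so equal inputs
  // always produce equal outputs.
  def anonymize(value: String, seed: Long = 42L): String = {
    val rng = new Random(seed ^ value.hashCode.toLong)
    value.map(c => scramble(c, rng))
  }
}
```

Under these assumptions, a value such as "AB-1234 xy" keeps its length, its hyphen and its space, while the letters and digits change. In Spark, a function of this shape could be wrapped with org.apache.spark.sql.functions.udf and applied through withColumn, as in the single-column example above.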