NormalizeVariantsTransformer

Implements DataFrameTransformer. Transforms an input DataFrame of variants into an output DataFrame of normalized variants, where normalization is as defined by vt normalize or bcftools norm.

The path to the reference genome .fasta file must be provided through the reference_genome_path option. The .fasta file must be accompanied by a .fai index file in the same folder.
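
As a minimal sketch of the file layout described above, the following helper checks that a reference .fasta and its index are both present before the path is passed as reference_genome_path. The object name ReferenceCheck and its methods are hypothetical. The index file name (the .fasta file name with .fai appended, as samtools faidx writes it) is an assumption, not something this page states.

```scala
import java.nio.file.{Files, Path, Paths}

// Hypothetical helper, not part of Glow.
object ReferenceCheck {
  // Assumption: the index sits next to the .fasta and is named "<fasta file name>.fai".
  def indexPathFor(fasta: Path): Path =
    fasta.resolveSibling(fasta.getFileName.toString + ".fai")

  // Returns the files that are missing; an empty result means both are present.
  def missingFiles(fastaPath: String): Seq[Path] = {
    val fasta = Paths.get(fastaPath)
    Seq(fasta, indexPathFor(fasta)).filterNot(p => Files.isRegularFile(p))
  }
}
```

Running such a check before calling the transformer lets a missing index be reported once, up front, rather than as a failure during normalization.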

The transformer output columns can be controlled by the replace_columns option:

If the replace_columns option is false, the transformer leaves the original start, end, referenceAllele, and alternateAlleles columns untouched. Instead, it adds a StructType column called normalizationResult to the DataFrame. The first four fields of this column are the normalized start, end, referenceAllele, and alternateAlleles. The fifth field is the normalizationStatus StructType, which contains the following subfields:

changed: A boolean field indicating whether the variant data was changed as a result of normalization.
errorMessage: An error message if the attempt to normalize the row hit an error, or null if no error occurred. On error, the changed field is set to false and the first four fields in normalizationResult are null.

If the replace_columns option is true (default), the transformer replaces the original start, end, referenceAllele, and alternateAlleles columns with the normalized values when they have changed. Otherwise (no change or an error), the original start, end, referenceAllele, and alternateAlleles columns are left untouched. A StructType normalizationStatus column with the same subfields as above is added to the DataFrame.
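
The two output shapes can be illustrated with a small plain-Scala model of a single row. This is only a sketch of the behavior described above, not Glow's implementation: the names ReplaceColumnsModel, Variant, Outcome, replaced, and appended are hypothetical, Option stands in for a nullable column, and the Long type for start and end is an assumption.

```scala
// Illustrative model only; none of these names exist in Glow.
object ReplaceColumnsModel {
  final case class Variant(
      start: Long, // assumed type
      end: Long,   // assumed type
      referenceAllele: String,
      alternateAlleles: Seq[String])

  final case class NormalizationStatus(changed: Boolean, errorMessage: Option[String])

  // Outcome of normalizing one row: Left(error message) or Right(normalized variant).
  type Outcome = Either[String, Variant]

  def status(original: Variant, outcome: Outcome): NormalizationStatus = outcome match {
    case Left(msg) => NormalizationStatus(changed = false, errorMessage = Some(msg))
    case Right(v)  => NormalizationStatus(changed = v != original, errorMessage = None)
  }

  // replace_columns = true: the four columns are overwritten only when they changed.
  def replaced(original: Variant, outcome: Outcome): (Variant, NormalizationStatus) = {
    val st = status(original, outcome)
    val columns = outcome match {
      case Right(v) if st.changed => v
      case _                      => original
    }
    (columns, st)
  }

  // replace_columns = false: the originals are kept and the normalized fields are
  // reported separately (None models the four null fields on error).
  def appended(
      original: Variant,
      outcome: Outcome): (Variant, Option[Variant], NormalizationStatus) =
    (original, outcome.toOption, status(original, outcome))
}
```

In both modes the normalizationStatus value is the same; only the place where the normalized fields land differs.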

Linear Supertypes

HlsEventRecorder, HlsUsageLogging, GlowLogging, LazyLogging, LazyLogging, Logging, DataFrameTransformer, Named, AnyRef, Any

Instance Constructors

new NormalizeVariantsTransformer()

Value Members

final def !=(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def ##(): Int

Definition Classes
AnyRef → Any
final def ==(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def asInstanceOf[T0]: T0

Definition Classes
Any
def backwardCompatibleTransform(df: DataFrame, refGenomePathString: Option[String], replaceColumns: Boolean, modeOption: Option[String]): DataFrame

This function exists for backward compatibility with the previous API, in which the normalizer could act in different modes. The default mode normalized the variants without splitting multiallelic ones. The "mode" option could be used to change this behavior. Setting "mode" to "split" only splits multiallelic variants and skips normalization. Setting "mode" to "split_and_normalize" splits multiallelic variants and then normalizes the split variants, which is equivalent to using the split_multiallelics transformer followed by the normalize_variants transformer.
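
The mapping from legacy modes to transformers can be sketched as a small lookup. This is a hedged illustration of the description above, not the method's actual body: the object LegacyMode and the function steps are hypothetical, and how Glow treats any mode value other than the two named here is not documented on this page, so the sketch simply reports it as unrecognized.

```scala
// Illustrative model only; not Glow's implementation.
object LegacyMode {
  // Returns the transformer names a legacy mode corresponds to, in application order.
  def steps(modeOption: Option[String]): Either[String, Seq[String]] = modeOption match {
    // Default: normalize without splitting multiallelic variants.
    case None => Right(Seq("normalize_variants"))
    // Split only, skip normalization.
    case Some("split") => Right(Seq("split_multiallelics"))
    // Split first, then normalize the split variants.
    case Some("split_and_normalize") =>
      Right(Seq("split_multiallelics", "normalize_variants"))
    // Assumption: anything else is flagged; the real behavior is not stated here.
    case Some(other) => Left(s"unrecognized mode: $other")
  }
}
```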
def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( ... )
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def equals(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
def hashCode(): Int

Definition Classes
AnyRef → Any
def hlsJsonBuilder(options: Map[String, Any]): String

Attributes
protected
Definition Classes
HlsUsageLogging
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
lazy val logger: Logger

Attributes
protected
Definition Classes
LazyLogging → Logging
def name: String

Definition Classes
NormalizeVariantsTransformer → Named
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
final def notifyAll(): Unit

Definition Classes
AnyRef
def recordHlsEvent(tag: String, options: Map[String, Any] = Map.empty): Unit

Definition Classes
HlsEventRecorder
def recordHlsUsage(metric: MetricDefinition, tags: Map[TagDefinition, String] = Map.empty, blob: String = null): Unit

Attributes
protected
Definition Classes
HlsUsageLogging
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
def toString(): String

Definition Classes
AnyRef → Any
def transform(df: DataFrame, options: Map[String, String]): DataFrame

Definition Classes
NormalizeVariantsTransformer → DataFrameTransformer
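
Because transform takes its options as a Map[String, String], the two options described at the top of this page are passed as strings. A minimal sketch of building that map follows; the helper NormalizeOptions.build is hypothetical, and encoding the boolean as "true" or "false" is an assumption about how the option is parsed.

```scala
// Hypothetical helper, not part of Glow.
object NormalizeOptions {
  def build(referenceGenomePath: String, replaceColumns: Boolean = true): Map[String, String] =
    Map(
      "reference_genome_path" -> referenceGenomePath,
      // Assumption: the boolean option is given as the string "true" or "false".
      "replace_columns" -> replaceColumns.toString
    )
}

// With Spark and Glow on the classpath, and df a DataFrame of variants
// (the path below is a placeholder):
//   val normalized =
//     new NormalizeVariantsTransformer().transform(df, NormalizeOptions.build("/data/ref.fasta"))
```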
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )

class NormalizeVariantsTransformer extends DataFrameTransformer with HlsEventRecorder
