Class

com.salesforce.op.stages.impl.feature

DecisionTreeNumericMapBucketizer

class DecisionTreeNumericMapBucketizer[N, I2 <: OPMap[N]] extends BinaryEstimator[RealNN, I2, OPVector] with DecisionTreeNumericBucketizerParams with VectorizerDefaults with TrackInvalidParam with TrackNullsParam with NumericBucketizerMetadata with MapPivotParams with CleanTextMapFun with AllowLabelAsInput[OPVector]

Smart bucketizer for numeric map values based on a Decision Tree classifier.

N

numeric feature type value

I2

numeric map feature type

Linear Supertypes
AllowLabelAsInput[OPVector], CleanTextMapFun, CleanTextFun, MapPivotParams, NumericBucketizerMetadata, TrackNullsParam, TrackInvalidParam, VectorizerDefaults, DecisionTreeNumericBucketizerParams, BinaryEstimator[RealNN, I2, OPVector], OpPipelineStage2[RealNN, I2, OPVector], HasIn2, HasIn1, OpPipelineStage[OPVector], OpPipelineStageBase, MLWritable, OpPipelineStageParams, InputParams, Estimator[BinaryModel[RealNN, I2, OPVector]], PipelineStage, Logging, Params, Serializable, Serializable, Identifiable, AnyRef, Any

Instance Constructors

  1. new DecisionTreeNumericMapBucketizer(operationName: String = "dtNumMapBuck", uid: String = ...)(implicit tti2: scala.reflect.api.JavaUniverse.TypeTag[I2], ttiv2: scala.reflect.api.JavaUniverse.TypeTag[Map[String, N]], nev: Numeric[N])


    operationName

    unique name of the operation this stage performs

    uid

    uid for instance

    tti2

    type tag for numeric feature type

    ttiv2

    type tag for numeric feature value type

    nev

    numeric evidence for feature type value
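
The implicit nev parameter is what lets a single class handle maps of any numeric value type N. As a minimal sketch of that idea, using only the Scala standard library: given Numeric[N] evidence, the values of a Map[String, N] can be converted to Double before being passed to code that works on doubles. The NumericMapSketch object and its toDoubleMap helper are hypothetical names for illustration and are not part of the library.

```scala
object NumericMapSketch {
  // Convert every value of a numeric map to Double using the implicit Numeric evidence.
  // Hypothetical helper: it illustrates the role of `nev`, not the library's own code.
  def toDoubleMap[N](m: Map[String, N])(implicit nev: Numeric[N]): Map[String, Double] =
    m.map { case (key, value) => key -> nev.toDouble(value) }
}
```

The same helper accepts Int, Long, or Double valued maps, because the compiler supplies the matching Numeric instance at the call site.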

Type Members

  1. final type InputFeatures = (FeatureLike[RealNN], FeatureLike[I2])

    Definition Classes
    OpPipelineStage2 → OpPipelineStage → InputParams
  2. final type OutputFeatures = FeatureLike[OPVector]

    Definition Classes
    OpPipelineStage → OpPipelineStageBase
  3. case class Splits(shouldSplit: Boolean, splits: Array[Double], bucketLabels: Array[String]) extends Product with Serializable

    Computed splits

    shouldSplit

    whether the feature should be split

    splits

    computed split values

    bucketLabels

    bucket labels

    Definition Classes
    DecisionTreeNumericBucketizerParams
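
To make the three fields of Splits concrete, here is a minimal, self-contained sketch rather than the library's implementation. It assumes the split array is bracketed by negative and positive infinity, so that n + 1 split points define n left-closed, right-open buckets. The SplitsSketch case class, the helper methods, and the "[a-b)" label format are all hypothetical and chosen only for illustration.

```scala
// Hypothetical stand-in mirroring the fields of Splits; not the library class.
final case class SplitsSketch(shouldSplit: Boolean, splits: Array[Double], bucketLabels: Array[String])

object SplitsExample {
  // Build a SplitsSketch from interior thresholds (assumed sorted ascending).
  def fromThresholds(thresholds: Seq[Double]): SplitsSketch = {
    val points = (Double.NegativeInfinity +: thresholds :+ Double.PositiveInfinity).toArray
    // One label per adjacent pair of split points; the label format is an assumption.
    val labels = points.sliding(2).map { case Array(lo, hi) => s"[$lo-$hi)" }.toArray
    SplitsSketch(shouldSplit = thresholds.nonEmpty, splits = points, bucketLabels = labels)
  }

  // Index of the bucket [splits(i), splits(i + 1)) containing value, or -1 if none does.
  def bucketIndex(s: SplitsSketch, value: Double): Int =
    s.splits.sliding(2).indexWhere { case Array(lo, hi) => value >= lo && value < hi }
}
```

Under these assumptions, two thresholds produce four split points and three bucket labels, and an empty threshold list corresponds to shouldSplit being false.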

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def $[T](param: Param[T]): T

    Attributes
    protected
    Definition Classes
    Params
  4. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  5. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  6. final val blackListKeys: StringArrayParam

    Definition Classes
    MapPivotParams
  7. implicit def booleanToDouble(v: Boolean): Double

    Definition Classes
    VectorizerDefaults
  8. final def checkInputLength(features: Array[_]): Boolean

    Definition Classes
    OpPipelineStage2 → InputParams
  9. final def checkSerializable: Try[Unit]

    Definition Classes
    BinaryEstimator → OpPipelineStageBase
  10. final val cleanKeys: BooleanParam

    Definition Classes
    MapPivotParams
  11. def cleanMap[V](m: Map[String, V], shouldCleanKey: Boolean, shouldCleanValue: Boolean): Map[String, V]

    Definition Classes
    CleanTextMapFun
  12. def cleanTextFn(s: String, shouldClean: Boolean): String

    Definition Classes
    CleanTextFun
  13. final def clear(param: Param[_]): DecisionTreeNumericMapBucketizer.this.type

    Definition Classes
    Params
  14. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  15. def computeSplits(data: Dataset[(Double, Double)], featureName: String): Splits

    Compute splits using DecisionTreeClassifier

    data

    input dataset of (label, feature) tuples

    featureName

    feature name

    returns

    computed Splits

    Attributes
    protected
    Definition Classes
    DecisionTreeNumericBucketizerParams
  16. val convertI1: FeatureTypeSparkConverter[RealNN]

    Definition Classes
    BinaryEstimator
  17. val convertI2: FeatureTypeSparkConverter[I2]

    Definition Classes
    BinaryEstimator
  18. final def copy(extra: ParamMap): DecisionTreeNumericMapBucketizer.this.type

    Definition Classes
    OpPipelineStageBase → Params
  19. def copyValues[T <: Params](to: T, extra: ParamMap): T

    Attributes
    protected
    Definition Classes
    Params
  20. final def defaultCopy[T <: Params](extra: ParamMap): T

    Attributes
    protected
    Definition Classes
    Params
  21. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  22. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  23. def explainParam(param: Param[_]): String

    Definition Classes
    Params
  24. def explainParams(): String

    Definition Classes
    Params
  25. final def extractParamMap(): ParamMap

    Definition Classes
    Params
  26. final def extractParamMap(extra: ParamMap): ParamMap

    Definition Classes
    Params
  27. def filterKeys[V](m: Map[String, V], shouldCleanKey: Boolean, shouldCleanValue: Boolean): Map[String, V]

    Attributes
    protected
    Definition Classes
    MapPivotParams
  28. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  29. def fit(dataset: Dataset[_]): BinaryModel[RealNN, I2, OPVector]

    Definition Classes
    BinaryEstimator → Estimator
  30. def fit(dataset: Dataset[_], paramMaps: Array[ParamMap]): Seq[BinaryModel[RealNN, I2, OPVector]]

    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" )
  31. def fit(dataset: Dataset[_], paramMap: ParamMap): BinaryModel[RealNN, I2, OPVector]

    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" )
  32. def fit(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): BinaryModel[RealNN, I2, OPVector]

    Definition Classes
    Estimator
    Annotations
    @Since( "2.0.0" ) @varargs()
  33. def fitFn(dataset: Dataset[(Option[Double], Map[String, N])]): BinaryModel[RealNN, I2, OPVector]

    Definition Classes
    DecisionTreeNumericMapBucketizer → BinaryEstimator
  34. final def get[T](param: Param[T]): Option[T]

    Definition Classes
    Params
  35. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  36. final def getDefault[T](param: Param[T]): Option[T]

    Definition Classes
    Params
  37. final def getImpurity: String


  38. final def getInputFeature[T <: FeatureType](i: Int): Option[FeatureLike[T]]

    Definition Classes
    InputParams
  39. final def getInputFeatures(): Array[OPFeature]

    Definition Classes
    InputParams
  40. final def getInputSchema(): StructType

    Definition Classes
    OpPipelineStageParams
  41. final def getMaxBins: Int


  42. final def getMaxDepth: Int


  43. final def getMetadata(): Metadata

    Definition Classes
    OpPipelineStageParams
  44. final def getMinInfoGain: Double


  45. final def getMinInstancesPerNode: Int


  46. final def getOrDefault[T](param: Param[T]): T

    Definition Classes
    Params
  47. def getOutput(): FeatureLike[OPVector]

    Definition Classes
    OpPipelineStage2 → OpPipelineStageBase
  48. final def getOutputFeatureName: String

    Definition Classes
    OpPipelineStage
  49. def getParam(paramName: String): Param[Any]

    Definition Classes
    Params
  50. final def getTransientFeature(i: Int): Option[TransientFeature]

    Definition Classes
    InputParams
  51. final def getTransientFeatures(): Array[TransientFeature]

    Definition Classes
    InputParams
  52. final def hasDefault[T](param: Param[T]): Boolean

    Definition Classes
    Params
  53. def hasParam(paramName: String): Boolean

    Definition Classes
    Params
  54. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  55. implicit val i1Encoder: Encoder[features.types.RealNN.Value]

    Definition Classes
    BinaryEstimator
  56. implicit val i2Encoder: Encoder[I2.Value]

    Definition Classes
    BinaryEstimator
  57. final val impurity: Param[String]

    Criterion used for information gain calculation (case-insensitive). Supported: "entropy" and "gini". (default = gini)

    Definition Classes
    DecisionTreeNumericBucketizerParams
  58. final def in1: TransientFeature

    Attributes
    protected
    Definition Classes
    HasIn1
  59. final def in2: TransientFeature

    Attributes
    protected
    Definition Classes
    HasIn2
  60. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  61. def initializeLogIfNecessary(isInterpreter: Boolean): Unit

    Attributes
    protected
    Definition Classes
    Logging
  62. final def inputAsArray(in: InputFeatures): Array[OPFeature]

    Definition Classes
    OpPipelineStage2 → InputParams
  63. final def isDefined(param: Param[_]): Boolean

    Definition Classes
    Params
  64. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  65. final def isSet(param: Param[_]): Boolean

    Definition Classes
    Params
  66. def isTraceEnabled(): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  67. def log: Logger

    Attributes
    protected
    Definition Classes
    Logging
  68. def logDebug(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  69. def logDebug(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  70. def logError(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  71. def logError(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  72. def logInfo(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  73. def logInfo(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  74. def logName: String

    Attributes
    protected
    Definition Classes
    Logging
  75. def logTrace(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  76. def logTrace(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  77. def logWarning(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  78. def logWarning(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  79. def makeVectorColumnMetadata(input: TransientFeature, bucketLabels: Array[String], grouping: Option[String], trackInvalid: Boolean, trackNulls: Boolean): Array[OpVectorColumnMetadata]

    Attributes
    protected
    Definition Classes
    NumericBucketizerMetadata
  80. def makeVectorMetadata(input: TransientFeature, bucketLabels: Array[String], trackInvalid: Boolean, trackNulls: Boolean): OpVectorMetadata

    Attributes
    protected
    Definition Classes
    NumericBucketizerMetadata
  81. final val maxBins: IntParam

    Maximum number of bins. Must be >= 2 and >= number of categories in any categorical feature. (default = 32)

    Definition Classes
    DecisionTreeNumericBucketizerParams
  82. final val maxDepth: IntParam

    Maximum depth of the tree (>= 0). E.g., depth 0 means 1 leaf node; depth 1 means 1 internal node + 2 leaf nodes. (default = 5)

    Definition Classes
    DecisionTreeNumericBucketizerParams
  83. final val minInfoGain: DoubleParam

    Minimum information gain for a split to be considered at a tree node. Should be >= 0.0. (default = 0.0)

    Definition Classes
    DecisionTreeNumericBucketizerParams
  84. final val minInstancesPerNode: IntParam

    Minimum number of instances each child must have after split. If a split causes the left or right child to have fewer than minInstancesPerNode, the split will be discarded as invalid. Should be >= 1. (default = 1)

    Definition Classes
    DecisionTreeNumericBucketizerParams
  85. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  86. implicit val nev: Numeric[N]


    numeric evidence for feature type value

  87. final def notify(): Unit

    Definition Classes
    AnyRef
  88. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  89. def onGetMetadata(): Unit

    Attributes
    protected
    Definition Classes
    OpPipelineStageParams
  90. def onSetInput(): Unit

    Definition Classes
    VectorizerDefaults → OpPipelineStageBase
  91. val operationName: String

    unique name of the operation this stage performs

    Definition Classes
    BinaryEstimator → OpPipelineStageBase
  92. final def outputAsArray(out: OutputFeatures): Array[OPFeature]

    Definition Classes
    OpPipelineStage → OpPipelineStageBase
  93. def outputFeatureUid: String

    Attributes
    protected[com.salesforce.op]
    Definition Classes
    OpPipelineStage2 → OpPipelineStage
  94. def outputIsResponse: Boolean

    Definition Classes
    AllowLabelAsInput → OpPipelineStage
  95. def outputVectorMeta: OpVectorMetadata

    Get the metadata describing the output vector

    This does not trigger onGetMetadata()

    returns

    Metadata of output vector

    Attributes
    protected
    Definition Classes
    VectorizerDefaults
  96. lazy val params: Array[Param[_]]

    Definition Classes
    Params
  97. def save(path: String): Unit

    Definition Classes
    MLWritable
    Annotations
    @Since( "1.6.0" ) @throws( ... )
  98. final def set(paramPair: ParamPair[_]): DecisionTreeNumericMapBucketizer.this.type

    Attributes
    protected
    Definition Classes
    Params
  99. final def set(param: String, value: Any): DecisionTreeNumericMapBucketizer.this.type

    Attributes
    protected
    Definition Classes
    Params
  100. final def set[T](param: Param[T], value: T): DecisionTreeNumericMapBucketizer.this.type

    Definition Classes
    Params
  101. final def setBlackListKeys(keys: Array[String]): DecisionTreeNumericMapBucketizer.this.type

    Definition Classes
    MapPivotParams
  102. def setCleanKeys(clean: Boolean): DecisionTreeNumericMapBucketizer.this.type

    Definition Classes
    MapPivotParams
  103. final def setDefault(paramPairs: ParamPair[_]*): DecisionTreeNumericMapBucketizer.this.type

    Attributes
    protected
    Definition Classes
    Params
  104. final def setDefault[T](param: Param[T], value: T): DecisionTreeNumericMapBucketizer.this.type

    Attributes
    protected
    Definition Classes
    Params
  105. final def setImpurity(value: Impurity): DecisionTreeNumericMapBucketizer.this.type


  106. final def setInput(features: InputFeatures): DecisionTreeNumericMapBucketizer.this.type

    Definition Classes
    OpPipelineStageBase
  107. final def setInputFeatures[S <: OPFeature](features: Array[S]): DecisionTreeNumericMapBucketizer.this.type

    Attributes
    protected
    Definition Classes
    InputParams
  108. def setMaxBins(value: Int): DecisionTreeNumericMapBucketizer.this.type


  109. def setMaxDepth(value: Int): DecisionTreeNumericMapBucketizer.this.type


  110. final def setMetadata(m: Metadata): DecisionTreeNumericMapBucketizer.this.type

    Definition Classes
    OpPipelineStageParams
  111. def setMinInfoGain(value: Double): DecisionTreeNumericMapBucketizer.this.type


  112. def setMinInstancesPerNode(value: Int): DecisionTreeNumericMapBucketizer.this.type


  113. def setOutputFeatureName(name: String): DecisionTreeNumericMapBucketizer.this.type

    Definition Classes
    OpPipelineStage
  114. def setTrackInvalid(v: Boolean): DecisionTreeNumericMapBucketizer.this.type

    Option to keep track of invalid values

    Definition Classes
    TrackInvalidParam
  115. def setTrackNulls(v: Boolean): DecisionTreeNumericMapBucketizer.this.type

    Option to keep track of values that were missing

    Definition Classes
    TrackNullsParam
  116. final def setWhiteListKeys(keys: Array[String]): DecisionTreeNumericMapBucketizer.this.type

    Definition Classes
    MapPivotParams
  117. final def stageName: String

    Definition Classes
    OpPipelineStageBase
  118. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  119. def toString(): String

    Definition Classes
    Identifiable → AnyRef → Any
  120. final val trackInvalid: BooleanParam

    Definition Classes
    TrackInvalidParam
  121. final val trackNulls: BooleanParam

    Definition Classes
    TrackNullsParam
  122. final def transformSchema(schema: StructType): StructType

    Definition Classes
    OpPipelineStageBase
  123. def transformSchema(schema: StructType, logging: Boolean): StructType

    Attributes
    protected
    Definition Classes
    PipelineStage
    Annotations
    @DeveloperApi()
  124. implicit val tti1: scala.reflect.api.JavaUniverse.TypeTag[RealNN]

    Definition Classes
    BinaryEstimator
  125. implicit val tti2: scala.reflect.api.JavaUniverse.TypeTag[I2]

    type tag for numeric feature type

    Definition Classes
    BinaryEstimator
  126. implicit val ttiv1: scala.reflect.api.JavaUniverse.TypeTag[features.types.RealNN.Value]

    Definition Classes
    BinaryEstimator
  127. implicit val ttiv2: scala.reflect.api.JavaUniverse.TypeTag[I2.Value]

    type tag for numeric feature value type

    Definition Classes
    BinaryEstimator
  128. implicit val tto: scala.reflect.api.JavaUniverse.TypeTag[OPVector]

    Definition Classes
    BinaryEstimator → OpPipelineStage2
  129. implicit val ttov: scala.reflect.api.JavaUniverse.TypeTag[Value]

    Definition Classes
    BinaryEstimator → OpPipelineStage2
  130. implicit val tupleEncoder: Encoder[(features.types.RealNN.Value, I2.Value)]

    Definition Classes
    BinaryEstimator
  131. val uid: String

    uid for instance

    Definition Classes
    BinaryEstimator → Identifiable
  132. def vectorMetadataFromInputFeatures: OpVectorMetadata

    Compute the output vector metadata only from the input features. Vectorizers use this to derive the full vector, including pivot columns or indicator features.

    returns

    Vector metadata from input features

    Attributes
    protected
    Definition Classes
    VectorizerDefaults
  133. def vectorMetadataWithNullIndicators: OpVectorMetadata

    Attributes
    protected
    Definition Classes
    VectorizerDefaults
  134. def vectorOutputName: String

    Get the name of the output vector

    returns

    Output vector name as a string

    Attributes
    protected
    Definition Classes
    VectorizerDefaults
  135. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  136. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  137. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  138. final val whiteListKeys: StringArrayParam

    Definition Classes
    MapPivotParams
  139. final def write: MLWriter

    Definition Classes
    OpPipelineStageBase → MLWritable
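
The impurity, maxDepth, minInfoGain and minInstancesPerNode parameters listed above all constrain how the underlying decision tree chooses split points. The sketch below illustrates what each one controls, using standard textbook definitions and only the Scala standard library. TreeParamsSketch and its methods are hypothetical helpers, and the validity check paraphrases the parameter descriptions rather than reproducing Spark's code.

```scala
object TreeParamsSketch {
  // Class proportions from per-class label counts (assumes at least one positive count).
  private def proportions(counts: Seq[Long]): Seq[Double] = {
    val total = counts.sum.toDouble
    counts.filter(_ > 0).map(_ / total)
  }

  // "gini": 1 - sum(p_i^2); zero for a pure node.
  def gini(counts: Seq[Long]): Double =
    1.0 - proportions(counts).map(p => p * p).sum

  // "entropy" in bits: -sum(p_i * log2(p_i)); zero for a pure node.
  def entropy(counts: Seq[Long]): Double =
    -proportions(counts).map(p => p * math.log(p) / math.log(2)).sum

  // Information gain: parent impurity minus the size-weighted impurity of both children.
  def infoGain(impurity: Seq[Long] => Double, left: Seq[Long], right: Seq[Long]): Double = {
    val parent = left.zipAll(right, 0L, 0L).map { case (l, r) => l + r }
    val n = parent.sum.toDouble
    impurity(parent) - (left.sum / n) * impurity(left) - (right.sum / n) * impurity(right)
  }

  // maxDepth: a binary tree of depth d has at most 2^d leaves, hence at most 2^d buckets.
  def maxLeaves(maxDepth: Int): Long = {
    require(maxDepth >= 0, "maxDepth must be >= 0")
    1L << maxDepth
  }

  // minInstancesPerNode and minInfoGain: a split is discarded unless both children are
  // large enough and the gain reaches the threshold.
  def isValidSplit(left: Seq[Long], right: Seq[Long], gain: Double,
                   minInstancesPerNode: Int = 1, minInfoGain: Double = 0.0): Boolean =
    left.sum >= minInstancesPerNode && right.sum >= minInstancesPerNode && gain >= minInfoGain
}
```

With the default maxDepth of 5, maxLeaves gives 32, so depth alone allows at most 32 buckets per map key.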
